What is Ethereum Whisper? A Detailed Guide
Web 3.0 is the decentralized evolution of the World Wide Web. The concept aims to replace centralized web applications with so-called decentralized applications (DApps), which run on a trusted peer-to-peer (P2P) network without central points of control or failure. In the Ethereum ecosystem, Web 3.0 rests on three pillars. One of them is the Ethereum Whisper protocol, a secure and decentralized messaging protocol intended to support the emergence of DApps and, by extension, Web 3.0.
The first pillar is smart contract technology, which is run on the Ethereum blockchain as a trusted immutable backend. With smart contracts, the code of the decentralized application is executed on top of a trusted P2P protocol, instead of a web server. The second pillar, decentralized storage, can be found in the form of Swarm. This allows the off-chain parts of DApps, such as web interfaces and larger pieces of data, to be stored in a decentralized manner, eliminating the need for centralized file storage or databases.
The third element of the Web 3.0 vision involves privacy-focused secure messaging. There are a number of situations in which DApps need to communicate through a message bus outside the context of blockchain transactions. Message buses allow applications or users to exchange messages point-to-point or in a broadcast fashion. Traditionally, this role has been filled by centralized message servers. Reasons for DApps to keep communication off-chain include:
- Privacy.
- Temporary limits for the validity of a message (a time-to-live property).
- The cost of on-chain transactions.
In Ethereum, the Whisper protocol is designed to take on the role of a secure off-chain message bus.
The Whisper Protocol
Ethereum Whisper is designed as a flexible and secure messaging protocol that protects user privacy. The protocol follows a “darkness” principle, meaning that it hides message content as well as sender and receiver details from observers, so this information cannot be recovered through packet analysis. This principle is akin to the Tor project's effort to provide anonymous web browsing.
Messages are encrypted by default, either asymmetrically or symmetrically. Asymmetric encryption uses public keys for encryption and private keys for decryption, and is used for one-to-one communication. Symmetric encryption uses a single key for both encryption and decryption, and facilitates one-to-many messages. A participant receives a message only if they can decrypt it. Thus, the owner of a private key receives the messages destined only for them, while one-to-many communications can be received by anyone in possession of the correct symmetric key. The strong link with Ethereum means that all participants already have public/private key pairs, which makes this fully encrypted model possible.
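The "received if it can be decrypted" rule can be illustrated with a small sketch. This is not Whisper's actual cryptography: the toy cipher below (a SHA-256 keystream plus an HMAC tag, with the same key used for both) is an assumption chosen only so that the example runs with the Python standard library, and the function names `seal`, `try_open` and `receive` are hypothetical. What it shows is the pattern of trial decryption: a node tries each key it holds and keeps the message only if one of them fits.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a toy keystream by hashing key, nonce and a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate a payload under a shared symmetric key."""
    nonce = os.urandom(12)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def try_open(key: bytes, sealed: bytes):
    """Return the plaintext if the key fits, otherwise None."""
    if len(sealed) < 12 + 32:
        return None
    nonce, ciphertext, tag = sealed[:12], sealed[12:-32], sealed[-32:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # wrong key (or tampered message): not for this node
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


def receive(known_keys, sealed: bytes):
    """Try every key the node holds; the first one that verifies wins."""
    for key in known_keys:
        plaintext = try_open(key, sealed)
        if plaintext is not None:
            return plaintext
    return None
```

A node holding the group key recovers the payload, while a node holding only unrelated keys gets `None` and learns nothing about the content or the intended recipients.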
Protocol Details
The Ethereum Whisper protocol implementation builds on top of the RLPx transport protocol that Ethereum uses internally for communication between nodes. While the protocol has been designed for relatively low latency (under 5 seconds), it is not suitable for real-time communication. Due to the underlying broadcast nature of the protocol, Whisper also has bandwidth limitations. The maximum size of a message is capped at 64 KB, although most messages are much smaller in practice.
Whisper messages also have a time-to-live (TTL) associated with them, meaning that they expire after a certain time. A TTL property ensures messages are only valid for the specified time and will not be received after the timeout. TTL is useful in a number of situations, for example, when broadcasting a temporary price offer in trading.
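As a minimal sketch of how a TTL translates into an expiry check, assuming the expiry is simply the send time plus the TTL in seconds (the function names `compute_expiry` and `is_expired` are hypothetical and not part of any Whisper API):

```python
import time
from typing import Optional


def compute_expiry(ttl_seconds: int, now: Optional[int] = None) -> int:
    """Return the UNIX timestamp at which a message sent now expires."""
    sent_at = int(time.time()) if now is None else now
    return sent_at + ttl_seconds


def is_expired(expiry: int, now: Optional[int] = None) -> bool:
    """A node would drop a message once the current time passes its expiry."""
    current = int(time.time()) if now is None else now
    return current > expiry
```

For the price-offer example, an offer sent with a TTL of 60 seconds would be discarded by any node that sees it more than a minute after it was sent.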
To deter spam, nodes must execute a proof-of-work algorithm before sending a message. The amount of work required scales with the size and the TTL of the message.
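The relationship between work, size and TTL can be sketched as follows. The formula used here, 2 to the power of the number of leading zero bits in the hash, divided by size times TTL, is modeled on the definition in the Whisper v6 specification (EIP-627) and should be treated as an assumption to verify against the client in use. The sketch also hashes only the payload and a nonce, and uses SHA3-256 from the Python standard library as a stand-in for the Keccak-256 hash used in Ethereum, so its numbers will not match a real node.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a hash digest."""
    count = 0
    for byte in digest:
        if byte == 0:
            count += 8
            continue
        count += 8 - byte.bit_length()
        break
    return count


def pow_value(zero_bits: int, size: int, ttl: int) -> float:
    """Assumed PoW score: expected hash attempts divided by size * TTL."""
    if size <= 0 or ttl <= 0:
        raise ValueError("size and ttl must be positive")
    return (2 ** zero_bits) / (size * ttl)


def mine(payload: bytes, ttl: int, target: float, max_tries: int = 1_000_000):
    """Search for a nonce whose hash meets the target score, or return None."""
    size = len(payload)
    for nonce in range(max_tries):
        # Stand-in hash; a real node hashes the encoded envelope with Keccak-256.
        digest = hashlib.sha3_256(payload + nonce.to_bytes(8, "big")).digest()
        if pow_value(leading_zero_bits(digest), size, ttl) >= target:
            return nonce
    return None
```

Under this formula, doubling either the message size or the TTL requires one more leading zero bit to reach the same score, which doubles the expected number of hash attempts.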
Messages within Whisper are encoded in an envelope with the following fields:
- Version: Version number of the protocol. This is used to identify the decryption format.
- Expiry: The expiry time of the message in the form of a UNIX timestamp.
- TTL: The time to live in seconds.
- Topic: Arbitrary data field that can be used as an indication of whether a message is “of interest” to a node.
- AESNonce: A unique number that is used by symmetric encryption in combination with the key.
- Data: The encrypted payload of the message.
- EnvNonce: A number used by the proof of work algorithm.
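The fields listed above could be modeled as a simple record. This is an illustrative sketch only: the Python types are assumptions, the wire encoding is not shown, and the `sent_at` helper is hypothetical, relying on the assumption that the expiry equals the send time plus the TTL.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    version: int      # protocol version, identifies the decryption format
    expiry: int       # UNIX timestamp after which the message is invalid
    ttl: int          # time to live in seconds
    topic: bytes      # hint used to decide whether to attempt decryption
    aes_nonce: bytes  # nonce used by symmetric encryption with the key
    data: bytes       # encrypted payload
    env_nonce: int    # nonce found by the proof-of-work search

    def sent_at(self) -> int:
        """Approximate send time, assuming expiry = send time + TTL."""
        return self.expiry - self.ttl
```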
One aspect to note with this design is that the topic field is not a traditional identifier, such as a “subject line”. Instead, it can be fed into a so-called Bloom filter to give a probabilistic indication of whether a node should attempt to decrypt a message. This is needed because the darkness principle prevents nodes from knowing whether they are a recipient of a message until they have tried decrypting it with the keys at their disposal.
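A minimal sketch of topic matching with a Bloom filter follows. The parameters (a 512-bit filter, 4-byte topics, three bits set per topic) and the bit-selection rule are loosely modeled on the Whisper v6 specification and should be read as assumptions for illustration; the function names are hypothetical.

```python
BLOOM_BITS = 512  # assumed filter size in bits


def topic_to_bloom(topic: bytes) -> int:
    """Map a 4-byte topic to a filter with (up to) three bits set."""
    if len(topic) != 4:
        raise ValueError("topic must be exactly 4 bytes")
    bloom = 0
    for i in range(3):
        index = topic[i]
        # The low bits of the last byte extend each index into the upper half.
        if topic[3] & (1 << i):
            index += 256
        bloom |= 1 << index
    return bloom


def build_filter(topics) -> int:
    """Combine the topics a node cares about into one filter."""
    bloom = 0
    for topic in topics:
        bloom |= topic_to_bloom(topic)
    return bloom


def may_be_interesting(node_filter: int, topic: bytes) -> bool:
    """True if every bit of the topic is set in the node's filter."""
    bits = topic_to_bloom(topic)
    return (node_filter & bits) == bits
```

The check is probabilistic in one direction only. A topic the node subscribed to always matches, but an unrelated topic can match too: in this sketch the topics `01 02 03 00` and `03 02 01 00` set the same three bits. Such false positives cost the node a wasted decryption attempt, and they also make it harder for an observer to tell which topics the node really cares about.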
Running Ethereum Whisper
Ethereum Whisper is implemented in the Go Ethereum client (Geth). However, it is disabled by default and needs to be activated with the following command line flag:
- geth --shh
As the protocol is meant to be used internally by DApps rather than by end users, it has to be accessed programmatically through the RPC API. JavaScript bindings are included in the web3.js library.
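For readers not using web3.js, the RPC API can also be reached with plain JSON-RPC over HTTP. The sketch below assumes a local node that exposes the Whisper API over HTTP; the URL passed in is up to the caller, and the method and parameter names (`shh_newSymKey`, `shh_post`, `symKeyID`, `powTarget`, `powTime`) are taken from the Geth Whisper RPC documentation as best recalled, so they should be checked against the client version in use. The helper names `build_request`, `call` and `post_example` are hypothetical.

```python
import json
import urllib.request


def build_request(method: str, params: list, request_id: int = 1) -> bytes:
    """Encode a JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(body).encode("utf-8")


def call(url: str, method: str, params: list):
    """Send one JSON-RPC request and return its result field."""
    request = urllib.request.Request(
        url,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        reply = json.loads(response.read())
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def post_example(url: str) -> None:
    """Generate a symmetric key and post one message (needs a running node)."""
    key_id = call(url, "shh_newSymKey", [])
    message = {
        "symKeyID": key_id,
        "topic": "0x01020304",                 # 4-byte topic, hex encoded
        "payload": "0x" + b"hello".hex(),      # hex-encoded payload
        "ttl": 60,                             # seconds
        "powTarget": 0.2,                      # assumed minimum PoW score
        "powTime": 2,                          # seconds to spend on PoW
    }
    call(url, "shh_post", [message])
```

Only `build_request` runs without a node; `post_example` is meant to be called with the address of a node that has Whisper enabled and its RPC interface reachable.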
The main alternative to Geth, the Parity Ethereum client, also implements Whisper.