Bitmask: A Perfectly Anonymous, Naturally Scalable Peer-to-Peer Messaging System
Tommy Anderson
tommy
[email protected]www.bitmask.mev0.1.0 - September 6, 2013
Abstract. We propose a trustless peer-to-peer messaging system that allows peers to securely send and receive anonymous, plausibly deniable messages, without exposing one’s true identity. Through the combined defensive efforts of a minimum spanning tree of non- malicious peers, the system is shown to be resilient to attempts made by malicious peers to disrupt both the activity and anonymity of the system. Additionally, peers are responsible for self-regulating and maintaining the health of the system, in a way that is both scalable and fair to all peers.
1 Introduction
Peer-to-peer (P2P) messaging as it currently exists lacks a proper trustless solution that is both anonymous and scalable. A system known as Bitmessage attempts to be such a solution, however its approach to anonymizing the identity and activities of peers prevents the system from achieving the scalability needed for mainstream adoption. This section will analyze the shortcomings of Bitmessage, as well as introduce a new solution titled Bitmask.
1.1 Bitmessage
The private information retrieval (PIR) approach that Bitmessage takes to anonymize the activities of peers is to have each peer download and forward every message within the system’s network. This is done to obfuscate both the addresses that belong to the peer and those that the peer is in communication with. Bandwidth limitations render such a system impractical for use on a mobile device. More generally speaking, usage is limited strictly to those who are willing to store and transfer large amounts of data over a given period of time.
Bitmessage attempts to scale by allowing peers to split up into separate bandwidth partitions of the network known as streams. Streams however limit a peer’s anonymity to a function of how many peers are in a specific stream, rather than the entire network.
Additionally, it has been shown that various timing attacks exist that can be used to break anonymity. These timing attacks result from the exploitation of Bitmessage network protocol packets known as acknowledgment messages. Lack of link layer encryption also makes it possible for anonymity to be broken by internet service providers (ISPs).
1.2 Bitmask
To remedy these shortcomings and provide a system that is both scalable and anonymous, a new system titled Bitmask is proposed with the following features. These feature descriptions give a high level overview of the system, and will be explained in greater detail within the sections to follow.
Note: symmetric-key encryption, the usage of public and private keys to sign and encrypt messages, is used in a manner similar to Bitmessage, including the concept of addresses to shorten public keys.
1.2.1 Perfect Forward Secrecy
Instead of always sending encrypted messages to the same address for a particular contact, a peer will generate a new public/private key pair before sending a new message, and include the new public key at the end of the message. The one-time address associated with this public key is known as an ephemeral address; an address to be used more than once is known as non-ephemeral. When the recipient reads this message, it will know which new ephemeral address the sender is listening on for a reply, and send a reply accordingly. If the most recent private key for one of the two peers communicating is ever compromised, all previous messages in the conversation are safe as long as the previous private keys were discarded, thus achieving perfect forward secrecy.
1.2.2 Plausible Deniability
By including one’s ephemeral private key at the end of every encrypted message and then discarding that private key, one can deny having written a message if accused outside of the network by the recipient. This is because the recipient had the means to forge the message by possessing the private key at the end. Because all peers never use the same address twice, there is no need for identity-theft concern when releasing the private key, as there is no identity to protect.
1.2.3 Link Layer Encryption
By using transport layer security (TLS) to communicate with and send packets over the network to peers, link layer encryption will be provided, and ISPs will be unable to read the contents of packets.
1.2.4 Advertisement Packets
To preserve the anonymity that comes with Bitmessage’s PIR scheme of everyone gets every- thing without actually forwarding and receiving every message, advertisement (ad) packets are used. All messages are advertised to the network as a special ad packet before ever being transferred between peers. This ad packet contains the recipient address of the message being advertised, as well as the time of expiration for how long the peer passing along this ad is willing to respond to requests from other peers to send them the message.
1.2.5 Randomized Caching
Instead of using streams and Bitmessage’s PIR scheme of everyone gets everything, Bitmask uses a concept called random caching. After learning from ad packets what messages are available to download from peers, a peer will randomly and regularly ask to download cer- tain messages from certain peers. The peer will then cache these messages for an agreed upon amount of time, and transfer them to other peers if requested. The peer’s activity is obfuscated by the fact that no peer can tell which messages are of interest to the peer, and the fact that messages sent by the peer are introduced into the network as an ad whose origin of broadcast is undetermined.
1.2.6 Bandwidth for Anti-Flood
Bandwidth metrics are used for protecting the network against flood attacks, instead of Bitmessage’s CPU based proof of work (POW). Natural bounds on system growth are de- fined by self-verifiable and trustless rules between peers, such as bandwidth agreements and statistically allowable thresholds for message transfer and dropping. Malicious peers must follow these rules or risk being dropped by the non-malicious peers who have set them. As a result, malicious peers can only flood in proportion to how much network traffic they are facilitating, which in effect helps the network by adding additional noise.
1.2.7 Permission Packets
In order to prevent spam, all ad packets for messages contain the recipient address instead of the public key. Thus, special permission packets are needed to obtain a public key for non-ephemeral addresses shared outside of the network, in order to initiate a conversation and start sharing ephemeral public keys. Such a permission packet includes the recipient’s address as well as an ephemeral public key for the sender. The recipient can then reply to that public key with its own ephemeral address as an encrypted message (signed by the address requested in the permission packet in order to prevent false/spammed replys), thus allowing the permission requester to then send its first message.
1.2.8 Trustless Identity
In order to prevent attempts to spam permission packets by altering an original packet and rebroadcasting it with a new ephemeral public key, the first message that gets sent after permission has been granted must be a special identity profile message. The peer who is responding to the request can then read and subsequently decide whether or not to initiate a conversation with whoever the sender claims to be (ie: take an accept or reject type of action).
Note: this identity is a claim, it doesn’t guarantee the peer is who he says he is
1.2.9 Low-Bandwidth Accessibility
Peers such as those on mobile phones can request to only receive ad packets that advertise the availability of messages being sent to a specified range of addresses. As long as these peers provide some constant amount of bandwidth to frequently send and receive messages, anonymity is preserved.
2 P2P Systems
P2P based systems like Bitmask and Bitmessage are made up of a decentralized and dis- tributed network of peers. Such peers take on both the role of consumer and producer, each providing some portion of individual resources to the network. Bitmask is designed to be a trustless P2P system, in which network protocol never dictates the trusting of peers to achieve proper functionality. For example, making the assumption that peers are caching all messages transferred to them without some form of verification or accountability would fail to result in a network that is trustless.
2.1 Minimum Spanning Tree of Trust
The only assumption that Bitmask makes with regards to the trusting of peers, is the as- sumption that a minimum spanning tree of non-malicious peers is present within the network. In other words, each peer must be connected to at least one non-malicious peer. If a peer sends messages out over time and never receives a response, he is only connected to mali- cious peers and the assumption is broken. As such, we can philosophically guarantee the validity of this assumption in the sense that the system is only useful if peers can use it to communicate with each other. Thus, as long as peers find Bitmask to be useful, the trustless nature of the system is preserved.
This assumption in combination with various methods of verification will be used to detect the presence of malicious peers within the network, as described in the sections to follow. Non-malicious peers will work together to set mutually-beneficial terms, but at the same time constantly verify the actions of each other in a trustless manner, in order to maintain the functionality of the network.
3 System Components
Central to the use of Bitmask is the usage of addresses for initiating and carrying out communication, in addition to two types of network packets; global packets for broadcasting to every peer within the network, and local packets for cooperating with specific peers.
3.1 Addresses
All addresses are of a form similar to the base 58 encoded addresses used by Bitmessage and the crypto-currency Bitcoin, in order to compress a peer’s public key and make it more user- friendly with regards to copying and pasting. Additionally, addresses help prevent spam by hashing the associated public key and preventing network observers from being able to scrape an ad packet and send an encrypted spam message to that public key. The checksum length is chosen to be the same as for Bitmessage. As a side-note, the shorter this checksum is, the greater the probability that collisions are possible when getting a response to a permission packet (ie: getting two responses back that are signed by different public keys that hash to the same address, which can be interpreted as two public keys claiming responsibility for the same permission packet).
3.2 Global Packets
The following types of packets are relayed from peer to peer until they’ve traversed the entire network.
3.2.1 Permission Request
This packet is needed for getting the public key of a non-ephemeral address and for subse- quently starting a conversation, as previously described.
3.2.2 Advertisement
This packet is needed to introduce the availability of a message within the network. When relaying this packet to another peer, both peers must agree on a time of expiration for how long the receiving peer is willing to answer requests from other peers to transfer the message being advertised. Determination of this time will be discussed in the section to come on network behavior.
3.2.3 Message
This packet contains encrypted content, such as an acknowledgment for receiving a previous message, a trustless identity profile message following a successful permission request, or a plain text message. It is sent when a peer requests it from another peer, sometime between the moment the receiving peer learns of the message’s presence from an advertisement packet, and the time of expiration for that advertisement.
3.3 Local Packets
The following types of packets are used for communication between two peers.
3.3.1 Bandwidth Agreement
Upon establishing a connection, two peers use this packet to enter into an agreement to provide an amount of message storage space for each other. The peer that offers the lower of the two amounts determines the amount agreed to.
3.3.2 Message Request
A vector of messages are requested in each packet of this type, based on which messages are available for request from the peer in question. Included in this request is the proposed time of storage for each message that a peer is requesting, so that the peer sending the files knows the duration of time that he has to spend verifying that the peer is caching the messages as promised. The sending peer will then begin to transfer the requested messages, either immediately if it currently has a message cached, or with a delay if the peer needs to request a message from one of its peers.
To determine how much time a peer will offer to store a message from another peer, a function with an input of the current amount of unused bandwidth minus the size of the file in question will be used (ie: how much space will remain after caching this message). This function will be exponential, where the smaller the amount of currently available space there is, the shorter a peer will be willing to hold on to messages. This curve will be adjusted depending on network activity, for example making it steeper in the presence of high traffic.
3.3.3 Checkup
Upon establishing a connection, two peers use this packet to enter into an agreement to checkup on each other over a regular interval of time, in order to verify that both peers are holding messages for as long as they promised to. In order to prove that caching is occurring, a random nonce is sent by one peer to the other. The receiving peer then computes and returns a checksum for each file plus this nonce, to prove that each file is currently stored.
Failure to respond within an adequate amount of time or to provide the correct answer results in the blacklisting of that peer. Each checkup packet will contain a list of files that in combination have a large total file size, to guarantee that the peer isn’t simply re-downloading the files from other peers.
3.3.4 Message Query
This packet is used to check if a peer is currently storing any messages for specific addresses, and if so, to transfer them. It sacrifices anonymity, but is useful for querying peers that are serving as storage services by keeping messages cached after their ads expires, for an addi- tional amount of time. Anonymity can be preserved by querying a large range of addresses. This packet should only be used by peers who have been offline and are reconnecting to the network.
4 Network Behavior
The following is an overview of interactions between peers.
4.1 Reciprocal Bandwidth
Because of the minimum spanning tree assumption, a peer must continually relay all ad packets within the network or risk being classified as a malicious peer, because all ads will eventually make their way through the minimum spanning tree to the recipient peer. So if a peer receives an ad for a message that it’s the recipient of but never receives the same ad from peers attempting to drop ad packets, those malicious peers can be detected and blacklisted.
Therefore peers must use their bandwidth for the good of the network, in addition to any “junk” (ie: messages to addresses not owned by any peer) they might want to introduce. The more bandwidth that a peer agrees to, the greater the chance that other peers will randomly cache files that the peer is attempting to send, because these peers will be requesting messages faster. Additionally, the more that peers cache messages, the quicker those messages will get to the intended recipient; it’s to every peer’s benefit to create a strong network pipeline.
If a peer is offline when a message is sent, that peer must attempt to query the message from a storage peer. Otherwise, the only other way to retrieve the message is to wait for the sender to either to resend the message, or to send a new permission packet to restart the conversation on a new ephemeral address.
4.2 Reciprocal Advertising
Peers must agree every so often on a constant time to advertise each others ads, because any messages a peer would try to send will then be reciprocally advertised for an equal amount of time. When a peer advertises a message to another peer, the recieving peer agrees to an expiration time no greater than the time of expiration the advertising peer previously agreed to for the same ad, and no more than the constant time they’re currently in agreement on. Thus, the longer a peer is willing to advertise messages, the longer its messages can be advertised within the network.
If a peer frequently fails to obtain and/or send a message after having been requested to do so, that peer can be blacklisted. The fact that a peer may no longer have access to a file due to advertisement expirations or disconnects in its path to obtain the file will be taken into account when determining such a threshold for blacklisting.
5 Anonymity
The following are various thoughts regarding the anonymity of Bitmask.
• Random caching based PIR and ephemeral addresses keeps interests masked; peers appear to simply be facilitating traffic
• By waiting a random amount of time before responding to permission requests, peers will be unable to detect if a non-ephemeral address belongs to a certain peer
• Unless completely connected to malicious peers (which would violate the minimum spanning tree assumption), no peer can prove the origin of an ad
• Because every message has a mutually agreed upon time of expiration based on ads, messages that are cached will be deleted upon expiration so that peers can’t abuse query requests to see if another peer is holding messages longer than they should be, thus giving away that the peer has interest in it (either sent or read it); deletion can be turned off if serving as a storage peer
• Link layer encryption protects against ISP’s from figuring out which ads originate from certain peers
• Message size increments and padding are not needed due to regular bandwidth fre- quency (ie: constantly pushing and pulling a set amount of data)
6 Spam
Public keys are never directly broadcasted within the network, so no peer can encrypt and send spam messages. The only possibility for spam is with regards to permission packets. Having a trustless identity profile message be the first message after permission has been granted gives a peer the means to accept or reject starting a conversation with the requesting peer. The more fields this profile has, the harder it will be for spam peers to generate a meaningful profile that a Bitmask client can choose to mark as spam, and flat out reject the conversation. A client can also allow users to whitelist certain addresses, and ignore permission requests signed by any other addresses.
7 Flooding
Because of the minimum spanning tree assumption, malicious peers must facilitate all normal traffic or risk being caught dropping messages, and blacklisted as a result. Thus a malicious node must support the current network, in addition to any packets it wishes to introduce. As a result of all peers establishing bandwidth requirements amongst themselves, they naturally limit how much data a malicious peer can push into the network, which in the end will only help peers gain more anonymity from each other. Additionally, peers will agree amongst themselves to have bandwidth caps for ad and permission packets to protect against network flood.