Cryptographic “Heads and Tails” and privacy-preserving blockchains

Sometime, the technology behind the Blockchain appears to be out of reach for the common man. Yet, I have a passion in trying to explain the “hard concepts” in simple terms and in this one I will touch upon the idea behind privacy-preserving blockchains like ZCash or Monero.

To do that, we will talk about Heads and Tails.

Heads and Tails over the phone

Suppose you want to play “Heads and Tails” with one of friends that’s living oversea. Also, let’s say that you live in the 80s with no possibilities to have a video conference with your friends whatsoever.

You then arrange to play over the phone: your friends will toss the coin and you’ll do the bet. So, things go like this:

  1. You tell your friend your bet.
  2. Your friend toss the coin and tells you the outcome.

Of course, this very simple protocol is subject to be abused, namely by your friend, since he will have the chance to tell you whichever outcome is beneficial for him, i.e. the opposite of your bet.

The only way it can work is that you trust your friend. Still, you have no concrete ways to conclude that he’s actually cheating or not.

Adding some crypto

In a way, cryptography is all about minimizing (or possibly eliminating) trust from a certain “mechanism”, it being the transmission of a message, a protocol etc.

In this specific case, we want to minimize the need to trust our friend. So we need to find way to forbid him to cheat or, at least, to detect when he does it, at least with a certain probability.

In order to do this, we will make use of a commitment scheme. In simple terms, a commitment scheme is a way for a party to hide a message from the sight of the other parties with no possibility for him to change it later on. It’s pretty much like putting the message in a sealed envelope and giving the envelope to the other party.

A commitment scheme is made up of two parts:

  1. Hiding. The message is hidden using a function f. The easiest function is a one-way hash function. So, the commitment c becomes H(m). c is then sent by the committer to the other party, called the verifier.
  2. Opening or Revealing. The committer sends the message m to the verifier, possibly along with some other information I, to convince him that the message m was indeed the one hidden in c. In the case of a one-way hash function I is simply empty and the verifier just recalculate c = H(m).

With this in mind, let’s see the revised protocol:

  1. Your friend picks a random number n (a nonce).
  2. Your friend then tosses the coin with outcome r.
  3. Your friend hashes n and r together, forming a commitment c = H(n | r) which is sent to you.
  4. Now you make the bet and tell your friend about it.
  5. Your friend sends n and r to you to convince that c = H(n | r).

In this scheme, none of the two parties can cheat: Your friend cannot change the commit after it was sent to you and you don’t learn anything about the outcome r from the commitment c.

Since this scheme cannot be cheated, we actually removed any need of trust between the two parties. It’s important to note that the nonce n is needed for this to work otherwise you just try to hash the two possible outcomes, H(head) and H(tail), and then it would be straightforward for you to learn which of the two outcomes is hidden in c.

Notice that the scheme can also be played as follows:

  1. You pick a nonce n and make a bet b, hashing the two together as c = H(n | b) and then sending it to your friend.
  2. Your friend then tosses the coin with outcome r, which is communicated to you.
  3. You send your friend n and b to convince that c = H(n | b).

This works as well but this time you would be the committer.

“Heads and Tails”: solo version

Let’s say now that you and your friend decide to play N times the game of Heads and Tails. Using the above protocol, there would be a need to perform 3N communications between you and your friend, which is not ideal.

On top of that, now suppose that there is a third party, the payer. Basically, after the game is run N times, you go to the payer in order to get the reward which would of course depend on how many rounds you won.

But, then, there is a problem. How can the payer be sure that you didn’t collude with your friend to split the reward? You cannot just present it the script of the above protocol to the payer.

In general, this means that the protocol above cannot be publicly verified.

In order to make the protocol work under these settings, we need to change it a bit. Basically we would play a solo version of the protocol where you toss the coin by yourself and present the results to anyone interested in seeing it (e.g. your friend and the payer).

But how can one be sure that you played honestly when you are, at the same time, the one who’s betting and the one who’s tossing the coin?

It turns out that this is indeed possible, exploiting the fact that a one-way hash function is kind of splitting a sequence of the events in a “before” and an “after”: If d = H(m) then m must have been chosen before we knew d since H() is one-way and you cannot reverse it (i.e. you cannot reverse time).

Fiat and Shamir showed that this property can fruitfully be used to turn interactive protocols into non-interactive publicly verifiable protocols.

Here’s how it works:

  1. Everybody agrees on a random value x which is called a public coin. This step is really important otherwise you could use a previous sequence of bets that you know are favorable for you.
  2. You now choose N bets, one for each round. Let’s represent these bets as a sequence b of N bits, where a 0 in the k-th position means that your bet is a head for the k-th round (and 1 for the tail, of course).
  3. Now you calculate r = H(x | b). These represent the results at each round: If the k-th bit of r is 0, it means that the outcome for the k-th round would be a head (and 1 for tail, like before).

Now you go to the verifier and the payer and present them b and r. They would simply recalculate r = H(x | b) and, in case it matches, it means that you played honestly and can get your reward.

Why this protocol works?

Since H() is one-way, you cannot “predict” r before “tossing” the hash from b.

Technically speaking, a hash function is approximating a random oracle. In simple terms, a random oracle is a function which is always returning the same output given the same input. But the oracle will pick the output randomly the first time it sees it, giving the same result from that moment on. So, again, you cannot know the randomly picked number before asking the random oracle.

In other terms, finding the b which is giving a pre-defined r would be as hard as inverting the hash function. Put it in another way, this would be equal to solving this equation:

 H(b) \oplus{} b = 0

which requires to invert H to be solved for b.

But there is a caveat to this.

The verifier and the payer can only be sure that you played honestly with a certain probability. It means that there is still a chance for you to cheat the protocol. But this chance can be made negligible if N is sufficiently large.

For the specific case of Heads and Tails, this caveat works as follows.

Since H() is much like a uniformly distributed random variable, on average, you would win 50% of the rounds, sometime more, sometime less. Can you increase this probability?

Let’s suppose that the sequence of bets is b = b0 | b’, i.e. we are specifically looking at the first bet. We can then calculate r = r0 | r’ = H(x | b0 | b’), looking specifically at the first result r0. Regardless of the b0 and r0, since H() is uniformly distributed, on average, the outcomes for b’ would be still 50% favorable to us.

For the first bet b0 the following may happen:

  1. For any b0, we always lose if r0 ≠ b0.
  2. For any b0, we always win if r0 = b0.
  3. b0 = r0 (we win) if either b0 = 0 and r0 = 0 or b0 = 1 and r0 = 1

In case of 2, we just pick head or tail and we are sure to win. In case of 3, we pick the bet where we win. For 1, well, we try to flip another bet :-).

It turns out that we have increased the chance to win the first bet from 50% to 75% and, on average, the winning rounds above 50%. How much? Well, it depends on how large is N.

We can also extend this strategy to other bets but then we must check 2^n hashes and, with n sufficiently large, this would quickly become as impractical as reversing H(). In order to stay on the safe side N should be twice the length of the digests.

Bottom line: we can work out a version of the protocol that’s publicly verifiable and it’s not interactive, but it comes at a cost of requiring that N must be very large.

Enter the Blockchain

What this solo “Heads and Tails” has to do with blockchains?

In privacy-preserving blockchains, i.e. those where transaction content is kept confidential, one must prove to the miners that a transaction is not violating the rules set out for the blockchain, i.e. basically not spending more money than those found in the wallet, without revealing anything of the underlying content of the transaction.

In order to do this, a protocol very similar to the one described above for Heads and Tails is applied. Specifically, we will refer to the second version of the commitment-based interactive protocol.

So, first of all, there will be the creator of a transaction T which will be “encrypted” so nobody can look into it. This is similar to picking head or tail and committing to it via hashing in our “Heads and Tails” interactive protocol.

Let’s now imagine that we have a miner who needs to validate the transaction T. Since it’s encrypted, he cannot check that it’s correct directly. So, the creator of the transaction engages a dialogue with the miner in order to convince him that T is, indeed, valid. Hence, the miner starts a series of questions to which the creator replies.

The most basic question that the miner could ask is “can you show me the content of the transaction?”, which is similar to having the better sending the bet to the verifier in the interactive Heads and Tails protocol.

Of course, the creator of T will never answer such a question since he doesn’t want to reveal the content of the transaction. This represents an important difference compared to the simple Heads and Tails protocol since the miner must not learn anything about the content of the transaction. It’s said that the dialogue is run in zero-knowledge.

So, the miner asks a different kind of questions which are able to signal to him whether the creator knows and that is correct, up to a certain probability, let’s say 50%.

But the miner wants to be very sure that creator of T was not just lucky in answering correctly to the question asked. So he asks another question. The probability of correctly answering both questions without knowing the content of T and convincing that T is correct drops now to 25%.

Hence, the miner will go on with asking questions until this percentage drops to an acceptable amount. In practice, the miner randomly generates a number that is used to build the next question. In the end, the miner will ask Nquestions after which he’s convinced that the creator knows the content of Tand that T is correct, exactly like tossing the coin N times in the non-interactive version of Heads and Tails convinces the verifier that the better didn’t cheat.

The script of this dialogue between the miner and the creator of the transaction is then a sort of “proof” that T is correct, but this proof only works for that specific miner.

Therefore, such an interactive protocol is not suitable in a blockchain context where transactions need to be proved correctly to all the miners: If we had to do it really interactively, the creator would have to talk to each miner separately, with a unique dialogue, to convince each of them that the transaction is correct. This is, of course, impossible to do, considering also the asynchronous context of a blockchain.

So, Fiat-Shamir heuristic can be used to turn the dialogue between the miner and the creator of T to a non-interactive publicly-verifiable proof that the transaction is correct, “simply” by replacing the random number generator used to build the questions asked by the miner with a one-way hash function.

As we saw, this trick works but it turns out to be heavier than in traditional blockchains since proofs of correctness are, in fact, much longer.

Therefore, much of the research in this field is devoted to producing proofs that are as short as possible and to making the validation algorithm of such proofs as quick as possible.

Practical auditable elections with deferred parallel consensus

Voting Software

Recently, a few friends of mine have come back to me after reading my previous articles about elections on the Blockchain (part 1 and part 2 ). They were wondering how feasible it would be to implement a voting scheme that works in the real world.

Some of the requirements I got from them are the following:

  1. The voting platform must scale on the number of voters.
  2. The whole voting process must support short-lived elections, i.e. a few hours or, in some extreme cases, even minutes (think of a poll in a TV show).
  3. Minimal to no trust should be put in any of the party involved in the voting process.
  4. It must not be possible to trace the ballots back to the actual individual voters.
  5. Honest voters must be ensured that their vote will be taken into account during the final tallying.

As you can see, these requirements are pretty challenging: while the “no trust” requirement would make us lean towards a Blockchain-based solution, the possibility to scale to thousands of transactions per seconds seems to rule it out.

As a first attempt, we will go then for a hybrid solution based, both on a centralized setup (for scalability) and a public Blockchain (to keep integrity and auditability) as well as on a private Blockchain to reach consensus on the final results. To design the solution, I took inspiration from the Google’s Certification Transparency Initiative and this paper by Liu et al which introduced the concept of Linkable Spontaneous Anonymous Group (LSAG) signatures.

For the time being, we won’t take into account the problem of vote-selling or vote-coercion.

Electronic voting: still a Chimera

A huge caveat to this article. Full electronic voting has still a lot of road to do. Many problems found a solutions but issues are still standing out there.

One of the trickiest is software integrity: How do we ensure that the software you are using for voting (e.g. an app on a smartphones or a tablet in the polling station) has not been tampered with by an attacker? Marking a ballot with an “X” is a lot easier to verify than checking that an encrypted vote is actually the one you intended to cast. On this topic, I would suggest reading a nice essay by Ronald L. Rivest.

Linkable Spontaneous Anonymous Group (LSAG) signatures

A cornerstone of the solution described in this article are LSAG signatures. In practical terms, a LSAG signature allows to hide the signer’s identity among a group of other identities. A verifier will not be able to identify the actual signer but be sure that the signature has been performed by one of the group members. In addition, an LSAG signature scheme allows a verifier to “link” signatures that have been generated by the same identity which, nevertheless, remains anonymous (or pseudonymous). A LSAG signature has this form:

\left(s,G,L\right)

where s is the signature of message m, G is the group of identities among which the actual signer is concealed, L is the “signer’s link” which is uniquely bound to the signer without revealing her/his real identity, i.e. it cannot be changed by the signer but the signer’s identity cannot be (computationally) derived from L. In this article we will refer to the LSAG as described in Monero’s Confidential Transactions paper by Noether. Specifically, L will not depend on G.

Crypto-currencies and e-voting: a difference in scope

Double spending and double voting seem to be pretty much the same problem: you don’t want a party to spend the same coin twice, the same way you don’t want a voter to cast two or more (possibly contradictory) ballots.

Nevertheless, there are two important differences:

  • Commutativity. There are no specific ordering requirements for the votes expressed during an elections. In the end, the final result doesn’t depend on whether vote A arrived before vote B (or viceversa).
  • Independence. Casting a ballot by a voter is done in isolation: the ballot cast by a voter is not influenced by the ballots previously sent by any other voter. On the other way around, in the context of payments, a coin can be spent only if previously received by another party (or mined).

Somewhat, between payments and voting, there is a difference in scope. Spending is a global-scope activity: everybody must be informed of a party spending a coin to avoid double spending. This holds true also in non blockchain-based financial systems: the global scope is usually maintained in a few core systems, like TARGET2 in the Euro area.

Conversely, voting is basically local-scope: casting a vote doesn’t have any dependency on the other castings performed by the community, the only requirement being that the tallying authority must be able to rule out double votes, either by prevention or by detection.

This difference in scope allows an election to be split into independent sub-elections. By the way, this is not different from what’s happening in the real world where voters express their ballot into different and independent polling stations. Also, the final aggregation does not depend on the order in which votes are received: they are thrown into a voting box where order is not preserved.

On top of that, we will see that the local-scope of voting scheme allows to defer the consensus phase at a later stage, just before the tallying.

A practical scheme

Environment

In the environment where the elections take place the following components are available:

  • The Authoritative Node. This is a server that is publicly accessible and is designated in this role by the party that is starting the election, the Polling Authority.
  • Witness Nodes. These are a number of servers that are publicly accessible. The Witness Nodes is an open set and any party is invited to join.
  • A public Blockchain. All the parties have complete and unrestricted access to a public, non-permissioned Blockchain (e.g. Ethereum). The messages sent to the Blockchain cannot be tampered with by any of the parties involved in the election process. Also, the Blockchain is acting like a provable public clock with time provided in the form of the block number and proof as the block digests.
  • The Polling Authority. This party designates the Authoritative Node and ensures the registration of Voters and the Witness Nodes. Also, the Polling Authority transitions the election process into the different phases and, finally, declares the outcome of the election. The public key of the Polling Authority is known to all the parties.

We also assume that all communications between the parties happens in an anonymous fashion, e.g. using the Tor mixing network.

Registration phase

First of all, the Polling Authority publishes in the Authoritative Node a random number, the PollID, that uniquely identifies the current election, along with a public key, the Poll Public Key . Also, the Polling Authority publishes the list of the possible votes that can be selected by the voters for the current election. For this article we will assume a referendum-like election, i.e. the voter can choose only between two mutually-exclusive votes, indicated as 0 and 1. The PollID, the Poll Public Key and the choices are signed by the Polling Authority.

In this phase, Voters go to the Polling Authority and present their identity along with a public key in order to be registered as voters. The Polling Authority has the right to reject a Voter if she/he is not eligible to vote in the current election. Upon successful registration, the Polling Authority releases a signed Registration Receipt to the Voter containing:

  • The Voter’s identity.
  • The Voter’s public key.
  • The PollID.
  • The current time from the public Blockchain.

Also, in this phase, the Witness Nodes are registered with the same mechanism: an identity and a public key is provided by the party managing the Witness Node and a signed Registration Receipt is released with the same elements as above. For the Witness Nodes, a publicly accessible URI is provided which will be later used during the voting phase by the Voters.

After the registration phase ends, the Polling Authority publishes the list of Voters and the list of Witness Nodes in the Authoritative Node, both in terms of identity and public key. For the Witness Node, the voting URI is also published.

Merkle Tree Hashes of the list of Voters and the Witness Nodes are published to the Blockchain, along with the PollID and the possible votes. These digests are signed by the Polling Authority. For now on, Merkle Trees contain leaves represented by the pair (n,L) where n is the position of the node L in the list.

In case a Voter or a Witness Node is not appearing in the published lists, she/he can use the Registration Receipt to object. Also, the honest Polling Authority can always provide a membership proof for a Voter or Witness Node that can be publicly verified in order to counteract false claims of registration rejections.

After this phase and at a convenient time, the Polling Authority declares the opening of the election. This transition is signaled to the public Blockchain.

Voting phase

A Voter encodes a message containing his voting preference following these steps:

  1. The Voter accesses the Authoritative Node to pick randomly N identities from the public list of Voters. For each identity, the information received from the Authoritative Node is the public key and a membership proof based on the Merkle Tree Hash published in the public Blockchain by the Polling Authority that the identities are actually in the list of Voters. We call this set the Signature Group. Notice that, by using membership proofs, the voter avoids downloading the whole list of voters just to pick some identities randomly. In fact, the voter inquiries the Authoritative Node (via a proper API) and can be ensured that the identities retrieved are part of the list of Voters. In addition, this information must be publicly available, i.e. there must be no need to provide an authenticated identity to perform queries to the Authoritative Node. In this way, we avoid identity fingerprinting by the composition of the group G in the LSAG. Finally, the Authoritative Node must return exactly those identity requested by the voter: recall that the leaves of the corresponding Merkle Tree contains, indeed, the position of the leaf in the tree.
  2. The Voter selects a number of Witness Nodes from the list published in the Authoritative Node. Like in the previous step, the Voter receives the public keys and the membership proof that the selected Witness Nodes are registered as such in the Authoritative Node, consistently with the Merkle Tree Hashes published in the Blockchain.
  3. The Voter gathers the possible votes that are available for this election from the Authoritative Node along with the Poll Public Key.
  4. The information gathered in 1, 2 and 3 are checked for authenticity against the Polling Authority’s public key. Information gathered in 1 and 2 are also verified for integrity using the Merkle Tree Hashes published in the public Blockchain.
  5. The Voter pick one of the votes, encrypts it with the Poll Public Key along with a random nonce (to ensure uniqueness of the encrypted vote) and prepares an LSAG signature using the Signature Group as defined in 1. These information, together, form the ballot.
  6. The Voter sends her/his ballot to the Authoritative Node and to all the Witness Nodes selected in 2.
  7. The Voter receives back a signed Ballot Receipt which acknowledges that the ballot was received by the Authoritative/Witness Node. Among the other information in the receipt, the Voter receives the ID of the block in which his ballot was inserted and the position k in the sequence of ballot in the block (see later).

When the Authoritative/Witness Node receives the ballot from the Voter:

  1. It checks the eligibility of the identities in the Signature Group by looking up the list of voters. Alternatively, the voter can provide the membership proofs received previously by the Authoritative Node.
  2. It checks the validity of the LSAG signature.
  3. It checks that no previous ballot was received from the same voter by looking up the signer’s link in the LSAG signature in the cast ballots. Notice that the anonymity of the voter is still preserved even if two such ballots are found.
  4. It inserts the ballot in the current partial block of ballots.
  5. Prepares and signs a Ballot Receipt with the following information:
    • the current time (taken from the public Blockchain) when the vote was received by the Authoritative/Witness Node,
    • the unique ID of the current partial block,
    • the position k in the block in which the ballot is inserted,
    • a signed digest obtained by hashing these information along with the ballot and the LSAG signature.
  6. Upon receiving the Ballot Receipt, the Voter checks the authenticity, the integrity and the freshness of the receipt.
  7. The Authoritative/Witness Node finalizes the current partial block of ballots by sending the Merkle Tree Hash to the Blockchain, along with the block ID and the identity of the Authoritative/Witness Node. The Voter can request a membership proof of her/his vote from the Authoritative Node and the Witness Nodes. This request is performed by specifying the index k provided in step 5. The anonymity of the Voter is preserved since this information can be publicly inquired.

After a certain amount of time the Election Authority declares that the election is over by signaling this in the Blockchain.

Consensus phase

In this phase, the Authoritative Node and the Witness Nodes cooperates to reach consensus on the complete list of the ballots received for the election. In principle, the Authoritative Node should already have all the ballots expressed by the voters Nevertheless, we cannot exclude a malicious behavior of the Authoritative Node, considering that the Polling Authority knows the votes contained in the ballots received. The Authoritative Node cannot, of course, forge the ballots nor change their content but it may still be able to omit a ballot.

Let’s then suppose that a Voter sends her/his ballot to the Authoritative Node. The Voter receives back the Ballot Receipt and, once the Merkle Tree Hash of the block containing the node has been published to the public Blockchain, the Voter inquiry the Authoritative Node to check that in position k of block N her/his vote has been actually included.

What happens if the Authoritative Node returns a membership proof that doesn’t match the Ballot Receipt? The Voter may be willing to raise an objection, maybe going to higher level authorities. But, doing so, the Voter must be obliged to disclose her/his identity in reference to that particular ballot, something that the Voter is unwilling to do. The Authoritative Node may then play with this fact to actually try and omit that ballot.

To avoid this situation, the Witness Nodes come to play. If a ballot is received by multiple Witness Nodes, we hope that at least one of them notices that the vote has been omitted, trying and reinserting it in the final set of tallied ballots: this way, the Voter doesn’t need to disclose her/his identity. Notice that a Witness Node doesn’t know the Poll Private Key and cannot decrypt the votes. In addition, we assume that the Witness Nodes have different and conflicting interests. For example, the Witness Nodes can be run by the different parties running in an election and by the press that is following the event.

One may also argue that receiving the Ballot Receipt and the subsequent membership proof is pointless with this scheme. In a sense, this is true but they can be used by the Voter to sequentially select the Witness Nodes until one of those confirms that the ballot has been received. Basically, the Voter retrieves the first Witness Node and sends the ballot to it. After the Witness Node publishes the hash of the corresponding block to the public Blockchain, the Voter check that her/his ballot has actually been inserted in the block. If it is, the Voter stops. If not, the Voter select another Witness Node and tries again.

Let’s see how the consensus is reached on the list of ballots. The consensus algorithm works as follows:

  • Each ballots is assigned to a group. The group the ballot is assigned to is determined by taking the n least significant bits of the digest produced from the ballot using a cryptographic one-way hashing function H, i.e. LSBn(H(ballot)).
  • For each group of ballots, a private/permissioned proof-of-authority Blockchain is setup among the Authoritative Nodes and the Witness Nodes. The blocks are built as follows:
    • The genesis blockchain block is proposed by the Authoritative Node. The genesis block contains all the ballots in the group along with: 1) the ballot hash H, 2) the block ID N of the block published to the public Blockchain by the Authoritative Node that contains the ballot, 3) the position k of the ballot in block N.
    • The subsequent blocks are appended by the Witness Nodes and contain the ballots that are missing from the previous blocks (if any) and belongs to the current group, along with the H, N and k for the Witness Node. These latter information is also provided for all the ballots in the previous blocks.
    • As soon as all the Witness Nodes appended a block or when a certain point in time has been reached, the Authoritative Node append a closing block to the Blockchain. The content of the closing block is just the current block digest in the public Blockchain.

Practically, there would be a number of private/permissioned blockchains running in parallel. Once consensus has been reached in all blockchains, the ballots received are then used by the Authoritative/Witness Nodes to calculate the Merkle Tree Hashes for each block generated by the Authoritative Node and the Witness Nodes and comparing the hashes with the ones published in the public Blockchain.

Tallying phase

The Authoritative Node activates the tallying by publishing the Poll Private Key in the public Blockchain. At this point, the Witness Nodes and the Authoritative Node decrypt the votes in the ballots and start the tallying. At the end, the Authoritative Node declares the final winner. If this declaration is in contrast with the set of ballots agreed upon by the Authoritative Nodes and the Witness Nodes during the consensus phase, any Witness Node can always publicly object.

That’s it!

Next steps

This scheme will be analyzed in terms of attack scenarios in a later article. In addition, variations of the scheme will be presented.