Sometime, the technology behind the Blockchain appears to be out of reach for the common man. Yet, I have a passion in trying to explain the “hard concepts” in simple terms and in this one I will touch upon the idea behind privacy-preserving blockchains like ZCash or Monero.
To do that, we will talk about Heads and Tails.
Heads and Tails over the phone
Suppose you want to play “Heads and Tails” with one of friends that’s living oversea. Also, let’s say that you live in the 80s with no possibilities to have a video conference with your friends whatsoever.
You then arrange to play over the phone: your friends will toss the coin and you’ll do the bet. So, things go like this:
- You tell your friend your bet.
- Your friend toss the coin and tells you the outcome.
Of course, this very simple protocol is subject to be abused, namely by your friend, since he will have the chance to tell you whichever outcome is beneficial for him, i.e. the opposite of your bet.
The only way it can work is that you trust your friend. Still, you have no concrete ways to conclude that he’s actually cheating or not.
Adding some crypto
In a way, cryptography is all about minimizing (or possibly eliminating) trust from a certain “mechanism”, it being the transmission of a message, a protocol etc.
In this specific case, we want to minimize the need to trust our friend. So we need to find way to forbid him to cheat or, at least, to detect when he does it, at least with a certain probability.
In order to do this, we will make use of a commitment scheme. In simple terms, a commitment scheme is a way for a party to hide a message from the sight of the other parties with no possibility for him to change it later on. It’s pretty much like putting the message in a sealed envelope and giving the envelope to the other party.
A commitment scheme is made up of two parts:
- Hiding. The message is hidden using a function f. The easiest function is a one-way hash function. So, the commitment c becomes H(m). c is then sent by the committer to the other party, called the verifier.
- Opening or Revealing. The committer sends the message m to the verifier, possibly along with some other information I, to convince him that the message m was indeed the one hidden in c. In the case of a one-way hash function I is simply empty and the verifier just recalculate c = H(m).
With this in mind, let’s see the revised protocol:
- Your friend picks a random number n (a nonce).
- Your friend then tosses the coin with outcome r.
- Your friend hashes n and r together, forming a commitment c = H(n | r) which is sent to you.
- Now you make the bet and tell your friend about it.
- Your friend sends n and r to you to convince that c = H(n | r).
In this scheme, none of the two parties can cheat: Your friend cannot change the commit after it was sent to you and you don’t learn anything about the outcome r from the commitment c.
Since this scheme cannot be cheated, we actually removed any need of trust between the two parties. It’s important to note that the nonce n is needed for this to work otherwise you just try to hash the two possible outcomes, H(head) and H(tail), and then it would be straightforward for you to learn which of the two outcomes is hidden in c.
Notice that the scheme can also be played as follows:
- You pick a nonce n and make a bet b, hashing the two together as c = H(n | b) and then sending it to your friend.
- Your friend then tosses the coin with outcome r, which is communicated to you.
- You send your friend n and b to convince that c = H(n | b).
This works as well but this time you would be the committer.
“Heads and Tails”: solo version
Let’s say now that you and your friend decide to play N times the game of Heads and Tails. Using the above protocol, there would be a need to perform 3N communications between you and your friend, which is not ideal.
On top of that, now suppose that there is a third party, the payer. Basically, after the game is run N times, you go to the payer in order to get the reward which would of course depend on how many rounds you won.
But, then, there is a problem. How can the payer be sure that you didn’t collude with your friend to split the reward? You cannot just present it the script of the above protocol to the payer.
In general, this means that the protocol above cannot be publicly verified.
In order to make the protocol work under these settings, we need to change it a bit. Basically we would play a solo version of the protocol where you toss the coin by yourself and present the results to anyone interested in seeing it (e.g. your friend and the payer).
But how can one be sure that you played honestly when you are, at the same time, the one who’s betting and the one who’s tossing the coin?
It turns out that this is indeed possible, exploiting the fact that a one-way hash function is kind of splitting a sequence of the events in a “before” and an “after”: If d = H(m) then m must have been chosen before we knew d since H() is one-way and you cannot reverse it (i.e. you cannot reverse time).
Fiat and Shamir showed that this property can fruitfully be used to turn interactive protocols into non-interactive publicly verifiable protocols.
Here’s how it works:
- Everybody agrees on a random value x which is called a public coin. This step is really important otherwise you could use a previous sequence of bets that you know are favorable for you.
- You now choose N bets, one for each round. Let’s represent these bets as a sequence b of N bits, where a 0 in the k-th position means that your bet is a head for the k-th round (and 1 for the tail, of course).
- Now you calculate r = H(x | b). These represent the results at each round: If the k-th bit of r is 0, it means that the outcome for the k-th round would be a head (and 1 for tail, like before).
Now you go to the verifier and the payer and present them b and r. They would simply recalculate r = H(x | b) and, in case it matches, it means that you played honestly and can get your reward.
Why this protocol works?
Since H() is one-way, you cannot “predict” r before “tossing” the hash from b.
Technically speaking, a hash function is approximating a random oracle. In simple terms, a random oracle is a function which is always returning the same output given the same input. But the oracle will pick the output randomly the first time it sees it, giving the same result from that moment on. So, again, you cannot know the randomly picked number before asking the random oracle.
In other terms, finding the b which is giving a pre-defined r would be as hard as inverting the hash function. Put it in another way, this would be equal to solving this equation:
which requires to invert H to be solved for b.
But there is a caveat to this.
The verifier and the payer can only be sure that you played honestly with a certain probability. It means that there is still a chance for you to cheat the protocol. But this chance can be made negligible if N is sufficiently large.
For the specific case of Heads and Tails, this caveat works as follows.
Since H() is much like a uniformly distributed random variable, on average, you would win 50% of the rounds, sometime more, sometime less. Can you increase this probability?
Let’s suppose that the sequence of bets is b = b0 | b’, i.e. we are specifically looking at the first bet. We can then calculate r = r0 | r’ = H(x | b0 | b’), looking specifically at the first result r0. Regardless of the b0 and r0, since H() is uniformly distributed, on average, the outcomes for b’ would be still 50% favorable to us.
For the first bet b0 the following may happen:
- For any b0, we always lose if r0 ≠ b0.
- For any b0, we always win if r0 = b0.
- b0 = r0 (we win) if either b0 = 0 and r0 = 0 or b0 = 1 and r0 = 1
In case of 2, we just pick head or tail and we are sure to win. In case of 3, we pick the bet where we win. For 1, well, we try to flip another bet :-).
It turns out that we have increased the chance to win the first bet from 50% to 75% and, on average, the winning rounds above 50%. How much? Well, it depends on how large is N.
We can also extend this strategy to other bets but then we must check 2^n hashes and, with n sufficiently large, this would quickly become as impractical as reversing H(). In order to stay on the safe side N should be twice the length of the digests.
Bottom line: we can work out a version of the protocol that’s publicly verifiable and it’s not interactive, but it comes at a cost of requiring that N must be very large.
Enter the Blockchain
What this solo “Heads and Tails” has to do with blockchains?
In privacy-preserving blockchains, i.e. those where transaction content is kept confidential, one must prove to the miners that a transaction is not violating the rules set out for the blockchain, i.e. basically not spending more money than those found in the wallet, without revealing anything of the underlying content of the transaction.
In order to do this, a protocol very similar to the one described above for Heads and Tails is applied. Specifically, we will refer to the second version of the commitment-based interactive protocol.
So, first of all, there will be the creator of a transaction T which will be “encrypted” so nobody can look into it. This is similar to picking head or tail and committing to it via hashing in our “Heads and Tails” interactive protocol.
Let’s now imagine that we have a miner who needs to validate the transaction T. Since it’s encrypted, he cannot check that it’s correct directly. So, the creator of the transaction engages a dialogue with the miner in order to convince him that T is, indeed, valid. Hence, the miner starts a series of questions to which the creator replies.
The most basic question that the miner could ask is “can you show me the content of the transaction?”, which is similar to having the better sending the bet to the verifier in the interactive Heads and Tails protocol.
Of course, the creator of T will never answer such a question since he doesn’t want to reveal the content of the transaction. This represents an important difference compared to the simple Heads and Tails protocol since the miner must not learn anything about the content of the transaction. It’s said that the dialogue is run in zero-knowledge.
So, the miner asks a different kind of questions which are able to signal to him whether the creator knows T and that T is correct, up to a certain probability, let’s say 50%.
But the miner wants to be very sure that creator of T was not just lucky in answering correctly to the question asked. So he asks another question. The probability of correctly answering both questions without knowing the content of T and convincing that T is correct drops now to 25%.
Hence, the miner will go on with asking questions until this percentage drops to an acceptable amount. In practice, the miner randomly generates a number that is used to build the next question. In the end, the miner will ask Nquestions after which he’s convinced that the creator knows the content of Tand that T is correct, exactly like tossing the coin N times in the non-interactive version of Heads and Tails convinces the verifier that the better didn’t cheat.
The script of this dialogue between the miner and the creator of the transaction is then a sort of “proof” that T is correct, but this proof only works for that specific miner.
Therefore, such an interactive protocol is not suitable in a blockchain context where transactions need to be proved correctly to all the miners: If we had to do it really interactively, the creator would have to talk to each miner separately, with a unique dialogue, to convince each of them that the transaction is correct. This is, of course, impossible to do, considering also the asynchronous context of a blockchain.
So, Fiat-Shamir heuristic can be used to turn the dialogue between the miner and the creator of T to a non-interactive publicly-verifiable proof that the transaction is correct, “simply” by replacing the random number generator used to build the questions asked by the miner with a one-way hash function.
As we saw, this trick works but it turns out to be heavier than in traditional blockchains since proofs of correctness are, in fact, much longer.
Therefore, much of the research in this field is devoted to producing proofs that are as short as possible and to making the validation algorithm of such proofs as quick as possible.