nakamoto

Nakamoto consensus

This document will attempt to describe the theoretical aspect of new way to reach a consensus on the next step called Nakamoto consensus. As we’ll see below, Nakamoto takes a slightly different and somewhat counterintuitive approach. We’ll start by describing how voting works and follow with an explanation of its relation to energy. The goal is to understand that unlike the previous approach to consensus, we end up with an incredibly simple design where we reach consensus just by looking up at the sky and finding the largest energy ball. There’s no need to communicate with anyone about the voting process and individual voting preferences, the only requirement is that we look at the same sky.

Voting

It seems every blockchain consensus must have some kind of voting in it because at least one person must express an opinion on which chain to extend and this is a form of a vote. Nakamoto consensus is no exception here and comes with the assumption that the majority of the voting citizens are honest. In the previous article, the citizens were a resource that lived inside the system i.e. the coins. Who are the voting citizens in Nakamoto consensus? In Nakamoto, there are no predefined citizens. Rather than having these defined in advance, a voting citizen is a computation that lives outside of the system. Interestingly, if we define a vote as selecting one option out of many, certain computations can cast a vote too. In Nakamoto consensus, a computation casts a vote by committing to a choice with a hash. Every hash attempt points at a specific block with hashPrevBlock which, by our voting definition, acts as a vote from which step we want to march forward from. This allows us to vote on what we believe our last position is from which we want to make a new step. Since a general computation can be performed by anyone, this means that the set of potential voters is unbounded. To make a new step we simply have to prove we collected enough votes to move forward and describe the next step we’re going to make.

Example of miner voting

Consider the following two simplified hash attempts:

# Miner 1 computes the following hash attempt
newStep = sha256(
    prevStep=000000000000000000044142f90385fa7bf93cee2b406205ef3cfe118f66a9d0,
    newData=6ed4ec2272678103df71336332890900030d663004a59fc236eaa8c2594375b6,
    nonce=0104620d
)

# Miner 2 computes the following hash attempt
newStep = sha256(
    prevStep=000000000000000000044142f90385fa7bf93cee2b406205ef3cfe118f66a9d0,
    newData=14748a85136e8860cba2bcff8a87cdd18d2d9fc252cbb2e3c18b160276d067ea,
    nonce=aed0d47d
)

The two hash attempts share one value, the prevStep (hashPrevBlock in Bitcoin). This field selects a single block out of many possible choices, which is exactly our vote definition. This means these two one-way computations act as two votes for step 000000000000000000044142f90385fa7bf93cee2b406205ef3cfe118f66a9d0. Each attempt suggests a different newData that describes the actual body of the data we add to the chain (hashMerkleRoot in Bitcoin). If the vote (labeled newStep above) starts with enough zeros, it serves as a proof we collected enough votes and is thus also the next step we take.

Counting votes

We now know a voter is a one-way computation. But this voter actually does two things at once. It not only casts a vote for a step (hashPrevBlock), but also suggests a possible next step to take (hashMerkleRoot). Since the result of a one-way computation is a random sequence of zeros and ones, we can define the rareness score for each vote by looking at the leading zeros. On average we will get 1 leading zero in 2 attempts, 2 zeros in 4 attempts, 4 zeros in 16 attempts, and so on. If we find a vote that is so rare that it must have taken roughly N votes to find it, then this can be used as a proof that we did in fact make roughly N such computations, or in other words, we collected roughly N votes to continue from this step (hashPrevBlock). Anyone can verify the output of this computation is a sequence of leading zeros and we know it couldn’t have been gamed because the result of the computation is provably random. We can also verify the suggested next step (hashMerkleRoot) the vote describes follows our rules. This proof is what we call a PoW (Proof of Work) which, once found, is shared with everyone else. We have solved how we collect and verify votes, but we have a slight issue. Since the number of computations one can perform is not limited, it seems we’re prone to a Sybil attack because we can create as many voting citizens as we like by running the computation on a step multiple times. There are more problems. Nothing prevents us from getting two different valid proofs the votes were collected that suggest a different step forward leading to a fork in the path we walk. Since both steps forward are valid and thus possible to make, we may come to a situation where people have a different view of reality. To have a consensus on a common reality, we have to find a way for everyone to agree on the same steps.

Convergence to a common reality

How do we make people see the same reality? To solve this, we create the simplest rule possible which says “build on the step whose history has the most collected votes in total”. Now even if we disagree for a moment on what the reality is, since voting citizen production won’t be equally fast on these competing realities, one of them will have to get more votes than the other and people will eventually agree at which step the global society is located. It’s a simple objective measure that solves the problem of diverging realities (or forks as they call them) by making everyone eventually converge on the same reality. This chance of divergence at the tip of the chain is the reason why we sometimes want to wait for the chain to make a few more steps so that we’re sure the step we care about is not a part of the reality that will get replaced once we all converge. There could be financial incentives to secretly collect votes for a few sequential steps and later present a new reality that takes a different path. In the case of Bitcoin, this could potentially revert a transaction that was made and thus allow someone to double-spend some coins. This is why we need most of the votes to be honest by following the rule we specified. Luckily, there are incentives in place not to perform these kind of attacks. Someone being able to produce this many votes must have a specialized hardware whose only job is to create citizens that vote - they’re highly optimized vote printers. It’s in their best interest to keep the network running smoothly because if such an attack happened, the network would likely lose some of its value which would hurt their hardware investment and profits. If the majority of votes are honest, an adversary won’t be able to get more votes to overtake the chain of steps by definition. Not only is this system still prone to Sybil attacks, Nakamoto is essentially consensus through a computational Sybil attack because we’re sort of creating ephemeral citizens that vote by performing these computations. It’s an open competition where different realities compete in obtaining/printing as many votes as they can.

This kind of consensus might seem like a relatively obscure and bad way to reach consensus, but it comes with some unique properties:

Everything in Nakamoto consensus is achieved purely through independent computations without interaction. This asocial way of voting may be what makes it more resilient than other consensus mechanisms because you can’t capture the citizens. Even if you manage to stop the majority of the vote producers, new ones could just as easily get spun up. Once the PoW (rare vote) is found, the proof simply appears in the sky where anyone can see it and decide whether they take it as a part of reality or not. The fact that parties don’t communicate and that the citizens are undefined allows us to add/remove vote-producing machines (often specialized computers) on the fly.

NOTE: The only way to have voting on such a massive scale as Bitcoin (270 exa votes per second) is if we somehow avoid sharing all the votes and counting them one by one. This is only possible if our voting system has very strong privacy. The counting method we described above does exactly that and is a form of private probabilistic counting that doesn’t require us to keep or share a vote unless it starts with a specific number of leading zeros.

How many votes do we need to collect?

We still haven’t discussed how we decide what the number of required votes for the next step is. If this number was a constant, we could have a problem of either making a step faster than we could propagate it along the network or too slow. For this reason, Bitcoin balances this with a difficulty adjustment algorithm whose job is to check if we’re doing steps too slow or too fast and adjust the required number of votes accordingly. Bitcoin specifically targets the steps to be approximately 10 minutes apart. This sounds like a long time, but it’s much faster and easier to transfer than gold. It also makes very little difference if the time was 15 seconds or 10 minutes as this doesn’t really impact the throughput of the system, but we’ll leave this as an exercise for the reader to figure out why.

From voting to energy balls and security

Every vote we print is a computation and every computation takes some physical resources to executed namely time, space and energy. Since Nakamoto voting is a computational Sybil attack on an extremely large scale, it means we’re using energy proportionally to the extent of the Sybil attack. The proof of collected votes thus also proves energy was spent meaning every step we agreed on required substantial physical energy to be created. One way to visualize this is to think of these proofs as balls of energy that can be joined together to form a larger energy ball in a similar way two water droplets can be combined. The reality, which is our last step, is then defined by the largest energy ball we see in the sky. The most important part is that the energy transmitted into a new step protects all the previous steps meaning we’re continuously adding security/weight to the path we walked. And this never stops. If an attacker wanted to revert a transaction that was done in a step that happened a year ago, they would need to create more than a year of energy to revert that step and change the course of action which sounds near impossible. Even if such an attack happened, the attack would replace an existing energy ball with a larger one. This means you’d need even more energy to perform the same attack on these steps again and even more than before for the previous steps. The reality and the past continuously keeps hardening and can always make progress because you can always direct energy to make a new step.

NOTE: Another simple visualization is to think of blocks as having a physical weight. If we stack them vertically one on top of the other, it’s clear that every block added on top of the block we care about makes it harder to move our block because of all the weight added on top of it.

Verification of common reality

As long as people are looking at the same sky, all the have to do is follow the largest energy ball. We only have to be able to communicate (PoW, step) pairs and not communication on how these came to existence. If for whatever reason we don’t communicate these, people will be looking at different skies and thus will see different energy balls and have a different view of reality - the 51% attack is dangerous precisely because the attacker hides new steps for long enough. This can go on for some time, but once the two skies get connected, there will be only a single objective reality which is the largest energy ball in the joint sky. As already mentioned, there are no trust assumptions involving people. Unlike humans, energy is an incorruptible resource that can’t be bribed or go insane which makes it trustless.

Physics and security

Now that we understand the traditional and Nakamoto consensus, we can see some differences. One notable difference is that the set of voters in the Nakamoto consensus is unbounded and permissionless - one doesn’t need to own a coin to have a right to cast a vote. However, there is an even more important difference. In the traditional consensus, the resource that represents a voter lives within the system and is not tied to our physical world. The logic of computer programs usually has no significant connection to the physical world other than being executed there. Proof of Stake consensus follows this tradition, where the state and its history defines who has the right to vote at a certain time through some logic rules.

On the other hand, Nakamoto consensus takes a drastically different approach. Instead of listening to specific identities, it listens to short mathematical proofs that demonstrate real-world energy was dedicated to make the next step. This is particularly interesting because by following verifiable statements about energy usage in the real world, our system inherits security from the laws of physics, which may be one of the few constrained systems we have no idea how to exploit. This makes Nakamoto consensus an open and trustless system, as transactions are secured by mathematics (Elliptic Curve Cryptography) and the order of transactions (consensus) is secured by physics through a high energy cost. We can now see that hashes solve two problems simultaneously. They serve as both a voting mechanism and a security parameter for consensus and should not be considered a waste of computational resources. Nakamoto consensus presents a novel and exciting approach by using real-world energy as a resource to reach and physically secure global consensus, making it a worthwhile experiment.

Summary

We managed to reach consensus on the next step without needing to publicly share our opinion about the step or having to trust other people regarding statements about steps. And we managed to secure the order of transactions with physical energy. We simply look in the sky and a bigger energy ball appears that describes the step. This approach is simpler and more resilient, but it introduces the concept of probabilistic finality of the step where we can never really know with 100% certainty that the reality we look at is the real one. This probability however becomes exceedingly small as time passes. I think both approaches to consensus can be valuable in their own ways because they come with different properties, but I have no doubt we want to have the most resilient variant for a global store of value.

PoS doesn’t make PoW obsolete. Their security models are fundamentally so different that it makes little sense to even compare them. Keep calm and look up.

sky