[This text was originally written for academic purposes in May 2021. It was updated in March 2023 to cover Ethereum’s move from PoW to PoS.]
Since the launch of the first system using blockchain technology, in 2009, to the present day, blockchain has attracted a high level of attention and investment. In this article, we start by reviewing what this technology is and how it came about. In section 2, we propose to review the main characteristics of this technology, its architecture, and operation, starting from the pioneer Bitcoin, which remains a great technological reference in the sector. In section 3, we analyze the Proof-of-Work (PoW) consensus mechanism, central to the Bitcoin architecture. In section 4, we look at the specificities of other blockchains, focusing on the alternative Proof-of-Stake (PoS) mechanism and its implementation in Ethereum.
1. Origins
The desire to create digital currencies is not new. Boneh (in a16z, 2021) traces it back to the work of cryptographers in the 1980s. The big challenge was to create a system that would guarantee the authenticity of a digital artifact, in this case one symbolizing a digital currency, thus ensuring that it was not spent twice (or more). This problem became harder if some level of anonymity was to be maintained, as is the case with physical fiat currency (cash, coins, and notes).
This challenge was solved by Satoshi Nakamoto (a pseudonym; the author’s real identity has never been revealed), whose Bitcoin network launched in 2009. In the whitepaper that introduced the Bitcoin protocol (Nakamoto, 2008), he proposed an innovative technical solution. Boneh identifies Nakamoto’s great innovation as the ability to “achieve open consensus with a honest majority”. By open consensus, he means that it is not a fixed set of participants who write the blocks, but anyone who wants to join (and has the necessary technology to run the code). Honest majority is defined by Boneh as the mere need to assume that the majority of participants in the network do not want to destroy the system. In a peer-to-peer network like this, the main threat to that assumption is a Sybil attack, in which an individual or group of individuals assumes a much larger number of identities to try to gain disproportionately large influence.
2. Blockchain Architecture
Bitcoin pioneered the implementation of blockchain technology and, as of 2021, remains the protocol that has attracted the most traction, both in investment and in computational power. It has also passed the test of time: the system has been in operation since 2009 without any compromising failures. Together, these facts make its architecture the great reference for this technology.
We follow the exposition proposed by Zheng et al. (2017) to present the architecture of Bitcoin. In simplified terms, a blockchain is a sequence of blocks containing the complete list of transaction records performed in that protocol, schematically represented in Figure 1. We are thus facing a distributed ledger whose fundamental characteristics are decentralization, persistence, anonymity, and auditability. The combination of these characteristics makes the system robust and tamper-resistant, as a large number of agents validate each block of the chain.
Figure 1: Example of a blockchain containing several blocks. Retrieved from Zheng et al., 2017.
Each block has the general form represented in Figure 2, and contains the information related to a set of transactions, having in its header the following elements:
- Block version – which set of validation rules to follow;
- A unique identifier (Merkle tree root hash, in the case of Bitcoin) – the hash value of all transactions in the block;
- Timestamp – the current time, expressed in seconds since 1 January 1970, in universal time;
- nBits – the target threshold of a valid block hash;
- Nonce – a four-byte field, starting usually at 0 and increasing with each hash calculation;
- The identifier of the previous block (parent block hash), which originated it.
Figure 2: The structure of a block. Retrieved from Zheng et al., 2017.
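The header fields listed above can be made concrete with a short sketch. It is not taken from Zheng et al. (2017): the 80-byte little-endian layout, the use of SHA-256 applied twice, and the field values of Bitcoin's first (genesis) block are publicly documented details added here for illustration, and the function names are our own.

```python
import hashlib
import struct


def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes headers and transactions with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes: list) -> bytes:
    """Fold a list of 32-byte transaction hashes into a single root hash."""
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2:  # odd count: the last hash is paired with itself
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


def serialize_header(version, prev_hash, root, timestamp, n_bits, nonce) -> bytes:
    """Pack the six header fields into the 80 bytes that get hashed."""
    return (struct.pack("<I", version) + prev_hash + root
            + struct.pack("<III", timestamp, n_bits, nonce))


def block_hash(header: bytes) -> str:
    """Block hash in the byte-reversed hexadecimal form used for display."""
    return double_sha256(header)[::-1].hex()


def bits_to_target(n_bits: int) -> int:
    """Expand the compact nBits field (assumes an exponent of at least 3)."""
    exponent, mantissa = n_bits >> 24, n_bits & 0x00FFFFFF
    return mantissa * 256 ** (exponent - 3)


# Field values of the genesis block (an addition to the source text).
genesis_tx = bytes.fromhex(
    "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b")[::-1]
header = serialize_header(
    version=1,
    prev_hash=bytes(32),             # the first block has no parent
    root=merkle_root([genesis_tx]),  # one transaction: the root is its hash
    timestamp=1231006505,            # 2009-01-03 18:15:05 UTC
    n_bits=0x1D00FFFF,
    nonce=2083236893,
)
print(len(header))         # 80
print(block_hash(header))  # 000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
print(int(block_hash(header), 16) <= bits_to_target(0x1D00FFFF))  # True
```

Changing any field, including a single transaction hash under the Merkle root, changes the resulting block hash, which is what ties the header to the block body.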
In the body of a block, there is a transaction counter and the transactions it registers. The maximum number of transactions per block depends on the block size and the size of each transaction. To validate the authenticity of each transaction, asymmetric cryptography is used, on which the digital signatures that validate transactions are based. In blockchain, an elliptic curve digital signature algorithm (ECDSA) is typically used, which involves three elements:
- A private key – a 256-bit integer, which in Bitcoin authorizes the spending of funds;
- A public key – a number that corresponds to the private key but does not need to be secret, used to determine that a signature is genuine. It can be compressed (33 bytes: a prefix of 0x02 or 0x03 followed by a 256-bit integer) or uncompressed (65 bytes: a prefix of 0x04 followed by two 256-bit integers);
- A signature – a number that proves that the signing operation took place, computed from a hash of the object being signed and the private key, and serialized in 71, 72, or 73 bytes.
Each user has a pair of public and private keys. Following the example of Zheng et al. (2017), if the user Alice decides to send a message to the user Bob, in the signing phase Alice uses her private key to produce a signature over her data and sends Bob both the signature and the original data. In the verification phase, Bob checks the signature against Alice’s public key, thus verifying whether the data was tampered with.
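To make the roles of the private key, public key, and signature more tangible, the following is a minimal educational sketch of ECDSA over secp256k1, the curve used by Bitcoin. Several simplifications are our own assumptions rather than Bitcoin's actual behavior: the message is hashed once with SHA-256, the per-signature nonce is drawn at random, and the signature is kept as a raw pair (r, s) instead of the serialized 71 to 73 byte form. It is not suitable for protecting real funds.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the field F_P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def scalar_mult(k, point):
    """Double-and-add multiplication of a curve point by an integer."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def serialize_pubkey(point, compressed=True):
    """33-byte compressed or 65-byte uncompressed public key encoding."""
    x, y = point
    if compressed:
        return (b"\x03" if y & 1 else b"\x02") + x.to_bytes(32, "big")
    return b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")


def sign(private_key, message):
    """Return a raw (r, s) signature over SHA-256(message)."""
    z = int.from_bytes(hashlib.sha256(message).digest(), "big")
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh random nonce per signature
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message, signature):
    """Check a signature using only the public key and the message."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    z = int.from_bytes(hashlib.sha256(message).digest(), "big")
    w = pow(s, -1, N)
    point = point_add(scalar_mult(z * w % N, G),
                      scalar_mult(r * w % N, public_key))
    return point is not None and point[0] % N == r


alice_private = secrets.randbelow(N - 1) + 1
alice_public = scalar_mult(alice_private, G)
message = b"Alice pays Bob 1 BTC"
signature = sign(alice_private, message)

print(len(serialize_pubkey(alice_public)))                    # 33
print(len(serialize_pubkey(alice_public, compressed=False)))  # 65
print(verify(alice_public, message, signature))               # True
print(verify(alice_public, b"Alice pays Bob 9 BTC", signature))  # False
```

Bob needs only the message, the signature, and Alice's public key; the private key never leaves Alice.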
The function that makes each block unique and relates it to the previous one is the hash function, a one-way mathematical algorithm that takes an input (in this case, the block header, which includes the parent block hash) and transforms it into a fixed-size output (Härdle et al., 2020). The use of hashes to time-stamp and link records was originally proposed by Haber & Stornetta (1990) and allows documents to be certified securely, reliably, and privately, in a tamper-evident manner. This mechanism should not be confused with encryption: in encryption, a file is encrypted with one key and decrypted with another, whereas a hash function has no decryption step. Furthermore, in a good hashing algorithm, it is computationally infeasible to find two input values that produce the same hash output value (Härdle et al., 2020).
The Bitcoin protocol, as well as many other blockchain protocols, uses the SHA-256 (Secure Hash Algorithm) cryptographic hashing algorithm. This algorithm accepts an input with a maximum size of 2^64-1 bits and produces an output of 256 bits, typically represented in hexadecimal form. It is noteworthy that even a small change in the input value (such as a comma or a white space) results in a completely different output hash. To illustrate the hashing function, we conducted a test using the “digest” package in R. We hashed the expression “Bitcoin” and then made a slight change to the input expression, to “bitcoin” (lowercase b). As shown in Figure 3, the two results are indeed very different. We also tested it in different sessions of the RStudio software using the same expressions and verified that the output of the hash function is always the same for identical inputs.
Figure 3: Use of the “digest” function in the R package to hash the expressions “SCC 20-21” and “SCC 20/21” using RStudio software.
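The same experiment can be reproduced with Python's standard hashlib module. This is a sketch rather than a replication of the R session: it hashes the raw UTF-8 bytes of each expression, so the digests it prints may differ from those in Figure 3, depending on how the “digest” function was called.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 digest of a string's UTF-8 bytes, as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Number of bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


upper = sha256_hex("Bitcoin")
lower = sha256_hex("bitcoin")

print(upper)
print(lower)
print(differing_bits(upper, lower), "of 256 bits differ")
print(sha256_hex("Bitcoin") == upper)  # True: identical input, identical digest
```

For a well-behaved hash function, roughly half of the 256 output bits are expected to flip after a one-character change to the input.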
Thus, when a transaction is submitted to a blockchain, it enters a “waitlist” of pending transactions. Nodes that continuously monitor the network pick transactions from this list and add them to the block that will be appended to the end of the blockchain. This new block is inextricably linked to the previous block through the cryptographic hash function, thereby linking new transactions to the most recent transactions recognized in the blockchain. Nothing in this new block can contradict the immediately preceding block, and so on back through the entire blockchain (Eddy Lazzarin, in a16z, 2021).
In the case of Bitcoin, users are incentivized to dedicate computational power to block validation through the issuance of new bitcoins (in a decreasing amount over time, as stipulated in the protocol). Nakamoto (2008) draws an analogy between this work, which consumes computation time and energy, and that of gold miners, who “spend resources to add gold to circulation,” hence the expression “mining” cryptocurrencies. With each new block created, the first recorded transaction is, by convention, the one that assigns the bitcoin created with that block to the block’s creator.
One of the biggest challenges of blockchain is how to reach consensus among the nodes that make up the distributed and open network about which new blocks should be recognized as part of the chain, in the absence of a central authority to validate them. This problem is known as the Byzantine Generals Problem. In Bitcoin, this problem is solved through the PoW mechanism, analyzed in section 3.
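The linking of blocks through hashes can be sketched with a toy chain. The block layout (a dictionary serialized as JSON) and the function names are illustrative assumptions of ours, far simpler than Bitcoin's binary format.

```python
import hashlib
import json


def hash_block(block: dict) -> str:
    """Hash a block's full contents, including the hash of its parent."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Append a block that commits to the hash of the current last block."""
    prev_hash = hash_block(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain),
                  "prev_hash": prev_hash,
                  "transactions": transactions})


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(chain[i]["prev_hash"] == hash_block(chain[i - 1])
               for i in range(1, len(chain)))


chain = []
append_block(chain, ["new coins -> Alice: 50"])
append_block(chain, ["Alice -> Bob: 10"])
append_block(chain, ["Bob -> Carol: 4"])
print(is_valid(chain))  # True

chain[1]["transactions"][0] = "Alice -> Bob: 1000"  # attempt to rewrite history
print(is_valid(chain))  # False: block 2 no longer matches the hash of block 1
```

Because each block commits to the hash of its parent, altering an old transaction invalidates every block that follows it.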
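The decreasing issuance can be illustrated with a short sketch. The concrete figures (an initial reward of 50 bitcoins, halved every 210,000 blocks, counted in units of one hundred-millionth of a bitcoin) are well-known parameters of the Bitcoin protocol but are not stated in the text above, so they are added here as context.

```python
COIN = 100_000_000          # smallest units (satoshis) per bitcoin
HALVING_INTERVAL = 210_000  # number of blocks between reward halvings
INITIAL_SUBSIDY = 50 * COIN


def block_subsidy(height: int) -> int:
    """New coins, in satoshis, created by the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY >> halvings  # integer halving, rounding down


def total_supply() -> int:
    """Sum of all subsidies ever issued under this schedule, in satoshis."""
    total, halvings = 0, 0
    while (INITIAL_SUBSIDY >> halvings) > 0:
        total += HALVING_INTERVAL * (INITIAL_SUBSIDY >> halvings)
        halvings += 1
    return total


print(block_subsidy(0) / COIN)        # 50.0
print(block_subsidy(210_000) / COIN)  # 25.0
print(block_subsidy(840_000) / COIN)  # 3.125
print(total_supply() / COIN)          # slightly below 21 million
```

Since the reward is halved with integer arithmetic, it eventually reaches zero, which bounds the total supply just under 21 million bitcoins.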
3. Proof-of-Work
According to Nakamoto (2008), PoW involves finding a value that, when run through the SHA-256 hash function, produces an output hash that starts with a certain number of zero bits. The average work required is exponential in the number of zero bits required, and the result can be verified by executing a single hash. In a timestamp network such as this blockchain, PoW is implemented by incrementing the nonce in the block until a value is found that produces the required zero bits in the hash of that block. When a block is produced and starts to be propagated through the network, the different nodes that make up the network perform the validation test and, if the block passes, start working on the next block.
Of course, validation can produce divergent results, either because the network is under attack or because two valid blocks are created at nearly the same time. The solution to these problems lies with the nodes of the network itself: new blocks are propagated through the network, and each node accepts or rejects them. Thus, the correct ledger will always be the longest chain of blocks, i.e., the one that required the most effort and was accepted by the highest number of nodes. Nakamoto (2008) summarizes the requirements for running the network in six steps:
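The nonce search described by Nakamoto can be sketched in a few lines. This toy version makes its own simplifying assumptions: it hashes arbitrary bytes with a single SHA-256 pass and appends an 8-byte nonce, whereas Bitcoin hashes the 80-byte header twice and uses a 4-byte nonce.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mine(block_data: bytes, difficulty_bits: int):
    """Increment the nonce until the hash has enough leading zero bits."""
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce, digest.hex()
        nonce += 1


def check_proof(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs a single hash, however long the search took."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty_bits


data = b"block 1: Alice -> Bob: 10"
for bits in (8, 12, 16):
    nonce, digest_hex = mine(data, bits)
    print(bits, nonce, digest_hex[:16], check_proof(data, nonce, bits))
```

Each additional zero bit doubles the expected number of attempts (about 2^bits on average), while verification remains a single hash.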
- New transactions are broadcast to all nodes;
- Each node collects new transactions into a block;
- Each node works on solving a PoW for its block;
- When a node finds a PoW, it broadcasts it to all other nodes;
- Nodes accept the block only if all transactions in it are valid (and there is no double spending problem);
- Nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash.
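The rule that nodes converge on the chain that required the most effort can be sketched as a simple choice between competing chains. The representation below is an illustrative assumption: each block records only the number of leading zero bits it had to satisfy, and the work behind a block is approximated as 2 raised to that number.

```python
def chain_work(chain: list) -> int:
    """Approximate total hashing effort behind a chain of blocks."""
    return sum(2 ** block["difficulty_bits"] for block in chain)


def choose_chain(candidates: list) -> list:
    """Pick the candidate chain that represents the most accumulated work."""
    return max(candidates, key=chain_work)


chain_a = [{"difficulty_bits": 16}] * 3  # three blocks
chain_b = [{"difficulty_bits": 16}] * 4  # four blocks at the same difficulty
chain_c = [{"difficulty_bits": 10}] * 5  # longer, but far less work per block

print(choose_chain([chain_a, chain_b]) is chain_b)  # True: longer chain wins
print(choose_chain([chain_a, chain_c]) is chain_a)  # True: more work wins
```

When all blocks share the same difficulty, the chain with the most work is simply the longest one, which matches the description above.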
The major criticism of this mechanism is the amount of energy and computational power expended in the PoW process. This is because most blockchain protocols, including Bitcoin, are programmed to increase the hashing difficulty as more hash power (computational capacity to execute the hash function) is added to the network. With the increasing interest in this technology and the existing financial incentives, the mining task has become extremely difficult. Initially, it was done using ordinary CPUs; as difficulty increased, GPUs were used to increase efficiency. But as networks grew, GPUs in turn became insufficient, leading to the development of application-specific integrated circuits (ASICs) designed specifically to increase mining efficiency. The growth of cryptocurrencies and of interest in mining has drawn strong criticism over their ecological footprint, due to the enormous amount of electricity expended in this process, estimated to be higher than the annual consumption of some developed economies. Other solutions have been proposed to address this problem; we explore them in section 4, particularly the PoS mechanism.
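The way difficulty follows hash power can be illustrated with a sketch of a periodic adjustment. The parameters (a 10-minute block time, an adjustment every 2016 blocks, and a change limited to a factor of four per adjustment) are those commonly documented for Bitcoin and are added here as assumptions; the sketch omits the protocol's cap on the easiest permitted target.

```python
TARGET_BLOCK_TIME = 10 * 60  # intended seconds between blocks
RETARGET_INTERVAL = 2016     # blocks between difficulty adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_TIME * RETARGET_INTERVAL  # two weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last interval actually took.

    A lower target means fewer acceptable hashes, i.e. higher difficulty.
    """
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    return old_target * clamped // EXPECTED_TIMESPAN


old_target = 1_000_000
one_week = EXPECTED_TIMESPAN // 2

print(retarget(old_target, EXPECTED_TIMESPAN))      # 1000000: on schedule
print(retarget(old_target, one_week))               # 500000: twice as hard
print(retarget(old_target, EXPECTED_TIMESPAN * 2))  # 2000000: twice as easy
```

If added hash power makes blocks arrive in half the expected time, the target halves, so miners must on average perform twice as many hashes per block.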
4. Proof-of-Stake
PoS is an energy-saving alternative to PoW (Zheng et al., 2017), in which interested users have to prove ownership of a certain amount of the currency, on the premise that individuals holding larger amounts of the currency have little incentive to attack the network. It should be noted that, because their role differs from that of PoW miners, the block producers in PoS are referred to as validators. The first cryptocurrency to introduce this concept was Peercoin (King & Nadal, 2012), which proposed a mixed mining mechanism of PoW and PoS and introduced the additional concept of “coin age”, where larger and older coin sets have a greater likelihood of mining the next block (Zheng et al., 2017).
Ethereum, the second-largest protocol in market value, completed a long transition from PoW to PoS with the Merge, in September 2022. The Ethereum Foundation (2021) presents PoS as a consensus mechanism for blockchain that requires users to commit a minimum of 32 ethers (the Ethereum token) to become validators on the network. This amount is locked in the network for a certain period of time and cannot be withdrawn, which makes the analogy of a stake securing the blockchain pertinent. The creation of new blocks is randomly assigned to one of the available validators on the network, with all validators responsible for verifying and confirming the blocks that they did not create but that are added to the blockchain. Incentives are provided through rewards for the creation of new blocks (as in PoW) as well as for the verification and confirmation of blocks.
With the Merge, Ethereum fully transitioned to PoS: the network is now secured by a large pool of validators who stake their ether, instead of miners who use energy-intensive computations to solve mathematical puzzles. The Merge is the name given to the process of combining Ethereum’s original PoW chain with the new PoS system into a single PoS system. Validators are now responsible for verifying transactions and creating new blocks in the Ethereum blockchain, replacing the miners who previously performed these tasks.
The advantages of this mechanism include greater energy efficiency, as the energy and computational costs of PoW discussed in section 3 are avoided; lower barriers to entry, as expensive top-of-the-line equipment (such as ASICs) is not necessary; greater immunity to centralization, as PoS is expected to attract more nodes to the network; and easier adoption of other potential upgrades to the blockchain, including sharding (dividing the database to reduce computational work, with the expectation that, in the Ethereum context, it will reduce network congestion and increase transaction processing capacity through the creation of new parallel chains, called shards) (Ethereum Foundation, 2021).
The focus in this section was on the presentation of the PoS mechanism in Ethereum. However, there are other protocols that use this mechanism, such as Cardano, Polkadot, and Tezos, among many others, each with its own specificities of implementation. There is also a significant variant of PoS, called Delegated Proof-of-Stake (DPoS), in which nodes elect representatives to generate and validate blocks. Zheng et al. (2017) illustrate the difference between PoS and DPoS with the analogy that DPoS is the equivalent of representative democracy, in contrast to the direct democracy of PoS.
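The assignment of duties among validators can be sketched as follows. This is a toy model and not Ethereum's actual selection algorithm: the seeded pseudo-random generator stands in for the protocol's randomness, the validator names are hypothetical, and rewards, penalties, and committees are ignored.

```python
import random

MIN_STAKE = 32  # ethers required to become a validator, as stated above


def assign_duties(validators: dict, seed: int):
    """Pick one block proposer at random; all other validators attest."""
    pool = sorted(name for name, stake in validators.items()
                  if stake >= MIN_STAKE)
    proposer = random.Random(seed).choice(pool)  # stand-in for protocol randomness
    attesters = [name for name in pool if name != proposer]
    return proposer, attesters


validators = {"alice": 32, "bob": 32, "carol": 32, "dave": 12}

for slot in range(3):
    proposer, attesters = assign_duties(validators, seed=slot)
    print(slot, proposer, attesters)
```

Here "dave" never takes part because his stake is below the minimum, while the other three validators alternate between proposing and attesting.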
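The representative-democracy analogy can be sketched as a stake-weighted election. The details are assumptions for illustration, since DPoS protocols differ in their rules: here each holder casts one vote weighted by stake, the top candidates win the available seats, and the elected delegates then produce blocks in rotation. All names are hypothetical.

```python
def elect_delegates(votes: dict, stakes: dict, seats: int) -> list:
    """Tally stake-weighted votes and return the top candidates."""
    tally = {}
    for voter, candidate in votes.items():
        tally[candidate] = tally.get(candidate, 0) + stakes[voter]
    ranked = sorted(tally, key=lambda name: (-tally[name], name))
    return ranked[:seats]


def producer_for_slot(delegates: list, slot: int) -> str:
    """Elected delegates take turns producing blocks."""
    return delegates[slot % len(delegates)]


stakes = {"ana": 10, "ben": 50, "cy": 5, "dee": 20}
votes = {"ana": "node1", "ben": "node2", "cy": "node1", "dee": "node3"}

delegates = elect_delegates(votes, stakes, seats=2)
print(delegates)                                            # ['node2', 'node3']
print([producer_for_slot(delegates, s) for s in range(4)])  # ['node2', 'node3', 'node2', 'node3']
```

In plain PoS every staker may be called on to validate directly, whereas here stakers only choose who validates on their behalf.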
In addition to PoW and PoS, there are many other consensus mechanisms, illustrating the variety of concepts in the world of cryptocurrencies.
Conclusion
Despite criticism and much skepticism towards cryptocurrencies (more accurately called crypto assets), it is difficult to imagine a future in which they disappear. The technology they bring makes it possible to solve various problems that information technologies could not previously solve, such as decentralized digital authentication and decentralized, incorruptible record-keeping among parties without trust relationships. Their applications are numerous, from digital currencies and payment systems to smart contracts and the certification of ownership of digital files (with NFTs). In this brief essay, we analyzed the blockchain architecture, starting from the paradigmatic example of Bitcoin, and delved into the specifics of the technology, notably one of its central concepts, the hash function. We also analyzed existing consensus mechanisms, which ensure the integrity and validity of blockchains, namely the PoW and PoS mechanisms, focusing for the latter on its implementation in Ethereum. Finally, we covered Ethereum’s Merge, which completed the transition from PoW to PoS and marked a significant milestone in the evolution of blockchain technology.
References
a16z. (2021, April 17). Crypto, an oral essay (Ep. 633). In The a16z Podcast.
Borges, A., Rodrigues, A., & Rodrigues, R. (2007). Elementos de contabilidade geral (24th ed.). Lisboa: Áreas Editora.
Ethereum Foundation. (2021). The Eth2 upgrades.
Haber, S., & Stornetta, W. S. (1990, August). How to time-stamp a digital document. In Conference on the Theory and Application of Cryptography (pp. 437-455). Springer, Berlin, Heidelberg.
Härdle, W. K., Harvey, C. R., & Reule, R. C. (2020). Understanding Cryptocurrencies. Journal of Financial Econometrics, Volume 18, Issue 2, Spring 2020, Pages 181–208.
King, S., & Nadal, S. (2012). PPCoin: Peer-to-peer crypto-currency with proof-of-stake. Self-published paper, August 19.
Nakamoto, S. (2008). Bitcoin: A peer-to-peer electronic cash system.
Zheng, Z., Xie, S., Dai, H., Chen, X., & Wang, H. (2017, June). An overview of blockchain technology: Architecture, consensus, and future trends. In 2017 IEEE international congress on big data (BigData congress) (pp. 557-564). IEEE.