What is Blockchain and How Does it Work?
This post is a basic walk-through of the non-technical and technical fundamentals of the blockchain technology that is used in cryptocurrencies, with a focus on Bitcoin’s blockchain. It is blockchain explained simply.
The blockchain can be thought of as a distributed ledger system. The key terms here are distributed and ledger: distributed being the opposite of centralized, or held in one location, and ledger being a continuous record of events, usually transactions. A register of students arriving at and leaving school is another example of a ledger. I’m going to start with an overview and then go into detail.
Blockchain Definition / What is Blockchain?
For a blockchain definition in layman’s terms: a blockchain consists of data organized in successive blocks, one after the other in chronological order, to form a chain. A block is just a collection of data recorded in a standardized format that is shared by all the blocks. The blockchain is non-physical and can be thought of as a database: you can expect to find the same fields, for example date or amount, within each block, but the data attached to those fields changes, capturing what happens from one moment to the next.
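To make “same fields, different data” concrete, here is a minimal Python sketch. The field names (timestamp, records, previous_block_id) are my own illustrative assumptions and not Bitcoin’s actual block format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    # Every block carries the same fields; only the values differ.
    timestamp: str                    # when the block was recorded
    records: List[str]                # the data captured in this block
    previous_block_id: Optional[int]  # position of the block before it


def build_chain(batches: List[List[str]]) -> List[Block]:
    """Append one block per batch of records, in chronological order."""
    chain: List[Block] = []
    for i, batch in enumerate(batches):
        previous = i - 1 if i > 0 else None  # the first block has no parent
        chain.append(Block(timestamp=f"day-{i}", records=batch,
                           previous_block_id=previous))
    return chain


# Hypothetical ledger of students arriving at and leaving school.
chain = build_chain([["Ann arrives"], ["Ann leaves", "Bob arrives"]])
```

Each block points back at the one before it, which is what turns a pile of records into a chain.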
Before getting into what is actually in the blocks and what happens when a new block is added, let’s look at why it’s distributed.
A Blockchain is a Distributed Ledger
It is distributed because, by design, the information is not owned or updated by one person or one central group. The way the participants agree on what the ledger says is called distributed consensus. There is a network of people participating, which anyone can join through a connection point called a node, usually a computer. The information can be updated by anyone on the network who has the right resources to do so and can be verified by all the nodes. There are ways to verify that the information recorded is accurate, and they make it almost impossible for those updating the blockchain to change it in their favor, far harder than with the centralized ledgers that have existed in human history.
One thing to note is that a blockchain is implemented in software, and the programming language used can differ from one blockchain to another, as can the kind of content stored within it. The blockchain design itself just specifies what kind of information is recorded and how it should be written. So Bitcoin, arguably the most popular use of a blockchain, is just a blockchain being used as a cryptocurrency. Bitcoin’s blockchain doesn’t have the same content as another’s, say Ethereum’s, is not necessarily written in the same language, and does not serve the same purpose. Regardless, there can be commonalities between different blockchains, such as the basic structure and the methods used to secure the information on it.
That’s really all you need to know for a basic understanding: a blockchain is decentralized information stored in blocks, and it is continually updated over time. But I’m going to go into more detail, using Bitcoin as the example.
How is the Blockchain Used in Bitcoin?
Bitcoin is a distributed monetary system, or a “peer-to-peer electronic cash system” as its mysterious creator Satoshi Nakamoto called it. It’s a way to define and store value, move that value from person to person, accurately track that movement, and make sure the right person is doing the moving, as well as to mine, that is add to, the already existing amount. The blockchain itself is the mechanism by which this is done.
There are no physical coins, just information. But, for lack of a better term, from the moment a coin is first created, one can track its location as belonging to one person, and then its movement as bits of it or all of it are moved from person to person. So the ledger or blockchain is actually tracking who spends how much, with whom, and when. Each transaction has inputs and outputs that say how much was received and how much was spent. The unspent amount is still seen as remaining at the address, because it can be calculated that a certain amount was received and a certain amount has not yet been spent, and so can still be spent.
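Here is a rough sketch of how an unspent amount can be calculated from inputs and outputs alone. The dictionary layout and the names (unspent_outputs, balance, "alice", "bob") are assumptions made for illustration; real Bitcoin transactions are binary structures with scripts and signatures.

```python
from typing import Dict, List, Tuple

OutputRef = Tuple[str, int]  # (transaction id, output index)


def unspent_outputs(transactions: List[dict]) -> Dict[OutputRef, Tuple[str, int]]:
    """Replay the history in order and keep only outputs nobody has spent."""
    unspent: Dict[OutputRef, Tuple[str, int]] = {}
    for tx in transactions:
        for ref in tx["inputs"]:
            unspent.pop(ref, None)  # an input consumes an earlier output
        for index, (address, amount) in enumerate(tx["outputs"]):
            unspent[(tx["id"], index)] = (address, amount)
    return unspent


def balance(transactions: List[dict], address: str) -> int:
    """Sum the unspent outputs that belong to one address."""
    return sum(amount for owner, amount in unspent_outputs(transactions).values()
               if owner == address)


history = [
    # Newly created coins have no inputs.
    {"id": "t0", "inputs": [], "outputs": [("alice", 50)]},
    # Alice spends her 50: 20 go to Bob and 30 come back to her as change.
    {"id": "t1", "inputs": [("t0", 0)], "outputs": [("bob", 20), ("alice", 30)]},
]
```

With this history, balance(history, "alice") gives 30 and balance(history, "bob") gives 20, without any account balance ever being stored.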
Instead of a person, an address is used, in this case a string of characters, which may or may not be linkable to an individual. The address is derived from a key (strictly it is a hash of a key, so calling it a key is a simplification), one of a key pair used in public-key, or asymmetric, cryptography. The address that is recorded on the blockchain as having sent or received funds is the public address that someone can share, but only the owner of the other key in the pair, the private key or private address, has the ability to spend the funds once received. For more information on public-key cryptography, read or watch ‘What is Cryptography?’ or read up some more on your own.
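The idea that an address is a hash of a key can be sketched in a few lines. This is a toy under loud assumptions: random bytes stand in for a public key (real Bitcoin derives it from the private key with elliptic-curve math), and two rounds of SHA-256 stand in for Bitcoin’s actual hash combination and address encoding. The name toy_address is hypothetical.

```python
import hashlib
import secrets


def toy_address(public_key: bytes) -> str:
    """Hash the public key and keep a short, shareable fingerprint of it.

    Simplified stand-in: NOT Bitcoin's real address derivation.
    """
    digest = hashlib.sha256(public_key).digest()
    return hashlib.sha256(digest).hexdigest()[:40]


# Random bytes used as a placeholder for a real public key.
stand_in_public_key = secrets.token_bytes(33)
address = toy_address(stand_in_public_key)
```

The same key always gives the same address, but the address alone does not reveal the key it came from.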
The blockchain just records how the funds are moving. When someone unlocks their wallet with their private key and decides to send funds to someone else’s address, derived from that person’s public key, the transaction information is broadcast to the rest of the network. There are special nodes on the network, called miners, who are able to write that transaction to the blockchain. The transaction is not complete until this is done. The first step is to verify that the sender has the funds to be spent, which is possible because the history of all transactions on the blockchain can be checked and unspent amounts calculated. The second step is to record the new transaction on the blockchain, together with others broadcast around the same time, by compiling them into a block and then adding that block to the blockchain. Miners do both of these steps.
Miners Organize the Blockchain
Since anyone can decide to be a miner and participate in the blockchain, there has to be a system to decide which miner gets to write the next block. This is called the consensus algorithm, which is proof of work in the case of Bitcoin, as explained in the post. First of all, at this point, Bitcoin miners need specialized computers called ASICs (Application-Specific Integrated Circuits), built specifically for mining crypto, to have a realistic chance of writing to the blockchain. Back in the day, one could have used a regular old CPU, but things got more difficult over time, as I’ll explain. This is because miners have to compete for the right to add the next set of transactions to the blockchain. They do this by performing a calculation that takes a very long time: finding a hash that begins with a certain number of zeros.
A hash is the fixed-length output produced from an input of arbitrary size. Blockchains use a cryptographic hash function, which, among other features, means one can’t figure out the original information from looking at the hash, and changing even one character in the original information drastically changes the resulting hash. Bitcoin specifically uses SHA-256 (Secure Hash Algorithm).
Different mining computers have different hash rates, but since many miners are working on this problem at the same time, the difficulty can be set so that, on average, only one miner at a time is likely to find a valid hash and so be able to write the new block. It also decreases the chances of one miner being able to win successively and so write information in their favor. Miners are basically just trying arbitrary extra characters, that is searching for a nonce (a number used once), added to the new set of transactions to be put in the new block, until they find the combination that results in a hash with the correct number of zeros at the beginning.
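The nonce search can be sketched as a simple loop. This is a simplified illustration, not Bitcoin’s actual procedure: real miners hash a binary block header twice with SHA-256 and compare the result against a numeric target, whereas this sketch hashes a text string once and counts leading zero hex digits. The names mine and sha256_hex and the sample data are assumptions.

```python
import hashlib
from typing import Tuple


def sha256_hex(data: str) -> str:
    """Return the SHA-256 hash of a string as 64 hex characters."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def mine(block_data: str, zeros: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with `zeros` zero digits."""
    prefix = "0" * zeros
    nonce = 0
    while True:
        digest = sha256_hex(f"{block_data}{nonce}")
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Changing a single character gives a completely different hash.
hash_a = sha256_hex("pay Bob 5")
hash_b = sha256_hex("pay Bob 6")

# Four hex zeros take tens of thousands of tries on average.
nonce, digest = mine("prev-hash|pay Bob 5", zeros=4)
```

Each extra zero digit required multiplies the expected number of tries by 16, which is why finding the nonce is slow while checking it takes a single hash.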
Each block generally has multiple transactions in it. The information in one block consists of a hash of the previous block (kept in the block’s header), the current transactions being written, and the extra characters that are combined with those two in order to come up with a hash that has the required number of zeros. The time it takes for all the different miners on the blockchain network to compete, and for one miner to win, is about 10 minutes on average. So a new block is added roughly every 10 minutes. The difficulty, loosely the number of zeros needed, is adjusted periodically (in Bitcoin, every 2016 blocks) in order to keep that time at about 10 minutes. This is necessary because more miners with faster computers join the network over time, so the hashing problem has to get harder to maintain that 10-minute time frame; if miners leave, it gets easier again. This 10-minute time frame and method of competing for the right to add a block is how Bitcoin does it, although other blockchains can use it too. Earning the right to add a block by calculating hashes is called proof-of-work mining.
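The adjustment that keeps blocks about 10 minutes apart can be sketched as a single proportion. The 2016-block interval and the factor-of-four limit are Bitcoin’s rules but are not stated in this article; the function name retarget_difficulty and the use of a plain difficulty number (rather than Bitcoin’s numeric target) are simplifying assumptions.

```python
TARGET_BLOCK_SECONDS = 10 * 60  # the 10-minute goal
RETARGET_INTERVAL = 2016        # blocks between adjustments in Bitcoin
EXPECTED_SECONDS = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # about two weeks


def retarget_difficulty(old_difficulty: float, actual_seconds: float) -> float:
    """Scale difficulty by how fast the last interval was actually mined."""
    # Bitcoin limits a single adjustment to a factor of four either way.
    clamped = min(max(actual_seconds, EXPECTED_SECONDS / 4), EXPECTED_SECONDS * 4)
    # Blocks found too quickly raise the difficulty; too slowly lowers it.
    return old_difficulty * EXPECTED_SECONDS / clamped
```

For example, if the last 2016 blocks took one week instead of two, retarget_difficulty(100, EXPECTED_SECONDS / 2) doubles the difficulty to 200.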
How Cryptography Secures the Blockchain, Explained
Even though it takes a very long time to calculate the hash, it does not take long for other miners to verify that the hash is correct. This continuous process of each new block being compressed to form its own hash, and that hash then being merged with new transaction information to form the next block hash, results in a chain of hashes. Within each block, the transactions themselves are summarized by what is called a merkle tree, in which pairs of hashes are hashed together repeatedly until a single root hash remains. Although one can look back in time to view every transaction that ever occurred on the blockchain, the continuous compiling of the previous hash into the next results in data integrity, where one can quickly check that the most recent block information is true. Because each new hash in effect combines all previous hashes, one cannot lie about what happened before, like saying you never spent money you did spend, as it would change the whole blockchain.
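A merkle tree boils a list of items down to one root hash by hashing pairs repeatedly. Below is a small sketch under stated assumptions: it uses a single round of SHA-256 where Bitcoin uses two, and the function names are mine. Duplicating the last hash when a level has an odd count mirrors what Bitcoin does.

```python
import hashlib
from typing import List


def sha256_bytes(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: List[bytes]) -> str:
    """Hash items pairwise, level by level, until one root hash remains."""
    if not items:
        raise ValueError("need at least one item")
    level = [sha256_bytes(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256_bytes(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0].hex()
```

Changing any single transaction changes the root, so one short hash is enough to vouch for every transaction in the block.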
Continuing our explanation of how this works in Bitcoin’s blockchain, I mentioned that the block itself contains a hash of the previous block, the new transactions, and then the extra information (the nonce) needed to find the right hash. Because each new block has the last block’s hash in it, and the hash changes if anything in the original data is changed, any tampering with an earlier block breaks the link to every block after it and is immediately detectable.
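Here is a minimal sketch of why tampering shows up. The block layout (a dict with prev_hash, transactions and nonce) and the helper names are illustrative assumptions, and the nonce is left at a fixed value because proof of work is not the point of this example.

```python
import hashlib
from typing import Dict, List


def block_hash(block: Dict[str, str]) -> str:
    """Hash everything in the block, including the previous block's hash."""
    payload = f'{block["prev_hash"]}|{block["transactions"]}|{block["nonce"]}'
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_chain(batches: List[str]) -> List[Dict[str, str]]:
    """Build blocks that each store the hash of the block before them."""
    chain: List[Dict[str, str]] = []
    prev = "0" * 64  # placeholder parent for the very first block
    for batch in batches:
        block = {"prev_hash": prev, "transactions": batch, "nonce": "0"}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_valid(chain: List[Dict[str, str]]) -> bool:
    """Check that every stored prev_hash matches the recomputed hash."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain = make_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
tampered = [dict(block) for block in chain]
tampered[0]["transactions"] = "A pays B 500"  # rewrite history in block 0
```

Here is_valid(chain) is True while is_valid(tampered) is False, because block 1 still stores the hash of the original block 0.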
How Miners Organize the Blockchain, Explained
The important thing to note here is that having miners compete for the right to add the next block of new transaction information prevents what is called a double-spend attack, that is someone spending the same funds twice and trying to lie to the network. Because all the miners receive new transaction broadcasts, they all begin to compile new blocks whenever they receive those transactions, and then take some time to figure out the right nonce to add. The first one to win the proof of work broadcasts their version of the blockchain with their new block, and that is accepted because other miners can see that the transactions in it were viable, by checking what’s unspent at each address, and that the right nonce was found. At this point, miners begin to compile the next block using this newly accepted blockchain, referencing the hash of the newly accepted block. If two miners somehow solve this at the same time, which has a very low probability, two versions of the blockchain are created with different transaction ordering, and both are broadcast. Each miner begins working to add the next block based on whichever one they received first. The longest version of the blockchain is the one that is accepted as valid.
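The rule for choosing between competing versions can be sketched as follows. This is a simplification: chains are modeled as lists of block labels and compared by block count, while Bitcoin actually compares the total proof of work behind each chain. The name choose_chain is hypothetical.

```python
from typing import List


def choose_chain(current: List[str], candidates: List[List[str]]) -> List[str]:
    """Keep the current chain unless a strictly longer one has been received."""
    best = current
    for candidate in candidates:
        if len(candidate) > len(best):
            best = candidate
    return best
```

On a tie the miner stays on the version it received first, so choose_chain(["g", "a1"], [["g", "b1"]]) keeps ["g", "a1"], but it switches as soon as the other branch grows to ["g", "b1", "b2"].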
If someone were to attempt a double-spend, sending a transaction and then trying to spend the same funds again, they would have to write a new block that spends those funds elsewhere and have it replace the block containing the initial spend. If that worked, the network would reject the first transaction, which would hurt the original receiver. But this bad actor would have to do this faster than other miners can write the blockchain. This would be very difficult, because they would need the computing power to outcompete all the other miners and produce a version of the blockchain that is longer than everyone else’s. They have to write all the new blocks on top of their deceitful block, since every block references the one prior, and their chain has to be longer than everyone else’s to be accepted by the rest of the network.
Because other miners were already working on the new block when the first transaction was sent to the first receiver, and they have been adding to it since, the other miners have the hash of the block with the correct information included in all subsequent blocks. And they, statistically speaking, write blocks faster than the bad actor could. That person can’t just slip in a second transaction that robs the initial receiver, because they have to write that block plus all the following blocks until their chain is longer than the one the other miners on the network are already creating. The odds of a dishonest person or miner being able to write this second transaction to the blockchain and create the longest chain faster than everyone else are very low, as worked out in the original idea for Bitcoin.
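The original Bitcoin paper (the first link under Learn more) puts numbers on these odds. The sketch below is a Python translation of the calculation given there, where q is the attacker’s share of the hashing power and z is the number of blocks the honest chain has added since the transaction. It carries the paper’s modeling assumptions, so treat the results as estimates.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Chance that an attacker with hash share q ever catches up from z blocks behind."""
    p = 1.0 - q              # honest share of the hashing power
    lam = z * (q / p)        # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        # Poisson probability that the attacker has mined exactly k blocks.
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i
        # Subtract the cases where the attacker fails to close the remaining gap.
        total -= poisson * (1 - (q / p) ** (z - k))
    return total
```

With 10% of the hashing power, the chance of catching up is about 20% after one block and falls below 0.1% after five, which is why each additional block makes a payment safer.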
Preventing the Double-Spend Attack
This is why it’s suggested that, after sending or receiving funds, one wait for a certain number of confirmations (that is, new blocks being added), so that a dishonest person’s probability of catching up and building a new chain in which your funds are spent again is negligibly low. There is the issue of mining pools being able to combine computing power and so gaining the ability to carry out double-spend attacks. The benchmark is controlling more than 50% of the hashing power on the network, commonly called a 51% attack. At this point in Bitcoin, this is only solved by mining pools deciding to limit themselves. Another method of attack is to cripple other nodes, which shifts the share of mining power to the remaining miners, who may then hold the majority of it and so be able to keep writing the version of the blockchain that is likely to end up the longest. Other blockchains, such as Particl, or Ethereum with its planned Casper upgrade, run on proof of stake instead to provide distributed consensus.
Apart from verifying that transactions are valid and ordering them, miners also create new bitcoins in the first transaction that is added to the block. This is called the block reward and is an incentive for miners. The block reward is set to decrease every few years until it is no more. In the future, when there are no more block rewards, miners will be paid through transaction fees alone, and simple economics means they will favor the participants willing to pay higher fees for their transactions.
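The shrinking block reward follows a simple halving rule. The specific numbers here (an initial reward of 50 bitcoins, halved every 210,000 blocks, counted in units of one hundred-millionth of a bitcoin) are Bitcoin’s parameters but are not given in this article; the function names are my own.

```python
HALVING_INTERVAL = 210_000           # blocks between halvings, roughly four years
INITIAL_REWARD = 50 * 100_000_000    # 50 bitcoins in the smallest unit (satoshis)


def block_reward(height: int) -> int:
    """New coins created by the block at this height, in satoshis."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings  # integer halving, eventually reaching 0


def total_supply() -> int:
    """Add up the rewards of every era until the reward reaches zero."""
    total = 0
    era = 0
    while block_reward(era * HALVING_INTERVAL) > 0:
        total += block_reward(era * HALVING_INTERVAL) * HALVING_INTERVAL
        era += 1
    return total
```

Because the reward is halved with whole-number arithmetic, it eventually reaches zero, and total_supply() comes out just under 21 million bitcoins.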
Bitcoin uses Cryptography But is Not Private
As a side note, this hashing to write blocks, and the use of the public and private key pair for sender verification, is the only cryptography that actually exists in bitcoin, and data is not actually encrypted on the blockchain. One cannot undo hashes to see the data that went into them, but the actual transactions are recorded in the open: anyone can look back at the transaction history in each block, and none of it is hidden. The “cryptographic” security of the bitcoin blockchain is in the fact that only the person with the right private key can move their funds, that the address is not necessarily linked to a person, that distributed consensus is needed to verify transactions, and that the data (in a merkle-tree structure) cannot be altered after it is written. Other blockchains are emerging, such as the privacy coins Monero, Dash, Zcoin, and Particl, that provide much more privacy. There are other blockchain solutions that encrypt the data itself before it is written to the blockchain. Bitcoin is working on second-layer solutions for privacy that use privacy technologies, as explained in The Privacy Coin Guide.
Summary of ‘What is Blockchain?’
To recap, a blockchain acts as a public ledger, recording information (transactions, in the case of bitcoin) in a time-based manner, using a decentralized network to update it. It is immutable, meaning that once the data is there it cannot be changed. It is publicly verifiable and doesn’t rely on one institution to update or validate it. It is secure in that it uses cryptography, in the form of a public and private key system, to ensure that only the right persons can move funds. The major issue it solves is that it removes the need to trust a centralized institution and discourages fraud by one. It does not only have to be used as an electronic payment system as in cryptocurrencies; any database with records can use a blockchain as its underlying technology.
I have to add that this is not a complete explanation, because covering all the intricacies would take a long time, since adding a detail means explaining it too. I hope this was useful. Leave any comments or questions below. I’d love to read them. I’ll be talking about the implications of blockchain technology in the future.
You can watch the video version of this article here:
Learn more:
https://bitcoin.org/bitcoin.pdf
How Bitcoin Works Under the Hood by Curious Inventor https://www.youtube.com/watch?v=Lx9zgZCMqXE
http://www.michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/