Blockchain 101

In 2009 Satoshi Nakamoto released the first decentralised cryptocurrency – Bitcoin. It was the first implementation of blockchain technology.

Blockchain is mainly associated with cryptocurrency. Yet the specific blockchain type used for this relatively new financial use case is just the tip of the iceberg. Recent approaches lean towards enterprise usage. Applying blockchain to Supply Chain Management (SCM) makes it possible to track the successive states of a product or service in the business flow. Business partners can join the blockchain network to read information from the ledger history or to write information to it.

Is blockchain Bitcoin?

Blockchain can be compared to a distributed database in which each chunk of information is immutable. In the case of Bitcoin, it is the database that tracks the owners and the amount of the cryptocurrency they hold, which can essentially be called a ledger or a transaction log. The fundamental concept is based on a secure chain of blocks that cannot be broken without detection.

The blockchain entry is like a library card: one knows where to find the book, that it exists and that someone wrote it, and it is possible to check how it changed owners, but not what the book contains.

[Figure: How blocks are connected in a blockchain]

The whole blockchain is stored continuously and chronologically. Its data structure is similar to a linked list with hash pointers. Data saved in a block cannot be falsified or changed without detection, thanks to the cryptography used: each block stores the hash of its predecessor, so any modification breaks the chain.
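The linked-list-with-hash-pointers idea can be illustrated with a minimal sketch. The block layout (`index`, `data`, `prev_hash`) and the helper names are assumptions made for this example only, not the format of any real blockchain.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's content deterministically (hypothetical field layout)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain, data):
    """Append a block that stores the hash of its parent."""
    prev_hash = block_hash(chain[-1]) if chain else None  # first block has no parent
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_valid(chain):
    """Check that every block still points at the actual hash of its parent."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
for payload in ["first block", "Alice pays Bob 5", "Bob pays Carol 2"]:
    append_block(chain, payload)

print(is_valid(chain))                   # True
chain[1]["data"] = "Alice pays Bob 500"  # tamper with history
print(is_valid(chain))                   # False
```

Changing the data of any block changes its hash, so the pointer stored in the following block no longer matches and the tampering becomes visible.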

Cryptography

Every block contains data (a list of transactions) and two unique hashes (fingerprints). Transactions use digital signatures to authenticate the initiator of the operation, e.g. Bitcoin uses ECDSA. A digital signature scheme relies on a key pair:

  • a public key,
  • a private key.

Both keys are fixed in length. A message is signed with the private key, and the signature can be verified with the public key.

The two hashes held in a block are a hash that fingerprints the block's own content and the hash of the previous block header, i.e. its parent. The only exception is the first block: a genesis block does not have a parent. Both hashes are part of a block header, which usually stores information about:

  • a creation date (timestamp),
  • a version,
  • Merkle Root (the hash summarising the block's transactions),
  • the previous hash.

The Merkle Root makes it possible to detect tampering with the transactions stored inside a block. A special tree-based data structure (Merkle Tree) stores the hashes of all transactions as its leaves. Each parent in this binary tree is a hash pointer that combines the hashes of its children, up to the root hash called the Merkle Root. Proving that a certain piece of data is part of a Merkle Tree has a relatively efficient complexity of O(log(n)). The Merkle Root is a single hash value that represents the entire state of the database: any change to the underlying data changes the root, and the root can be recomputed efficiently.
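As a rough illustration, the sketch below builds a Merkle Root and a logarithmic-size membership proof. It is a simplification under stated assumptions: it uses a single SHA-256 per node and duplicates the last hash on odd levels, and the function names are hypothetical. Real implementations differ in such details.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def pad(level):
    """Assumption: duplicate the last hash when a level has an odd size."""
    return level + [level[-1]] if len(level) % 2 == 1 else level


def parents(level):
    """Combine each pair of children into the hash of their parent."""
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(transactions):
    """Return the Merkle Root of a non-empty list of byte strings."""
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        level = parents(pad(level))
    return level[0]


def merkle_proof(transactions, index):
    """Collect the sibling hashes needed to rebuild the root from one leaf."""
    level = [sha256(tx) for tx in transactions]
    proof = []
    while len(level) > 1:
        level = pad(level)
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = parents(level)
        index //= 2
    return proof


def verify_proof(tx, proof, root):
    """Recompute the path from a leaf to the root using only the proof."""
    current = sha256(tx)
    for sibling, sibling_is_left in proof:
        current = sha256(sibling + current) if sibling_is_left else sha256(current + sibling)
    return current == root


txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
print(len(proof))                          # 3 sibling hashes for 5 transactions
print(verify_proof(b"tx-c", proof, root))  # True
print(verify_proof(b"tx-x", proof, root))  # False
```

The verifier never sees the other transactions, only one sibling hash per tree level, which is where the O(log(n)) cost comes from.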

 

Hashes are values produced by a hash function: a mathematical function that takes an input of any length and produces a fixed-size output. The desirable security properties of such a function are:

  • collision-free – it should be infeasible to find two different inputs that produce the same hash,
  • hiding – it should be infeasible to deduce the input from the output.

No existing hash function has been proven to be collision-free, so what truly matters is the chance of a collision. Due to the birthday paradox, the probability of a collision does not grow linearly with the number of inputs: the number of inputs needed before a collision becomes likely is approximately proportional to the square root of the number of all possible outputs. For example, for a 256-bit hash, 2¹³⁰ randomly chosen inputs give a more than 99% chance of a collision. SHA-256 is a hashing algorithm for which no collision has been found so far, and Bitcoin's implementation uses it.
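To get a feeling for these numbers, the sketch below evaluates the usual birthday-bound approximation p ≈ 1 - exp(-n² / 2N), where n is the number of random inputs and N the number of possible outputs. The function name is hypothetical, and the formula is an approximation for an ideal hash function rather than an exact result.

```python
import math


def collision_probability(num_inputs: int, output_bits: int) -> float:
    """Birthday-bound approximation: p = 1 - exp(-n^2 / (2 * N)), N = 2^bits."""
    exponent = num_inputs * num_inputs / 2 ** (output_bits + 1)
    return -math.expm1(-exponent)  # numerically stable 1 - exp(-exponent)


# A 256-bit hash such as SHA-256:
print(collision_probability(2 ** 64, 256))   # vanishingly small
print(collision_probability(2 ** 128, 256))  # about 0.39
print(collision_probability(2 ** 130, 256))  # above 0.99
```

The jump from a negligible probability to near certainty happens around 2¹²⁸ inputs, i.e. the square root of the 2²⁵⁶ possible outputs.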

Nodes

Apart from the blockchain itself, a blockchain system is built by its participants. The network architecture is peer-to-peer (P2P), a model mainly associated with torrents. Such a network type supports scalability and decentralisation. Any electronic device on the network is known as a node; likewise, every device connected to the Internet, such as a desktop computer, mobile phone or router, is a node of that network. Depending on its type, a node can perform different operations.

There is no single copy of the blockchain. Its distributed nature means that a copy exists on every node. When a new block is propagated across the network, every participant should update and synchronise their copy of the chain.

A partial node, also called a lightweight (light) node, can be identified with a regular user who only wants to communicate with the blockchain. It can be any device with the proper software. A frequent simplified explanation of how a blockchain works says that every node has to store the whole ledger history. That statement is true only for full nodes, which store all of the blocks present in a chain; a light node only has to download, maintain and verify block headers. An additional responsibility of a full node is verifying the entire blockchain (validating hashes) whenever a new block is added.

Transactions

A single transaction can either:

  • contain information about a business transaction (transfer of ownership),
  • or store code that can be executed by a blockchain, i.e. a smart contract.

Before they are actually written to a blockchain, transactions are pooled in the network. A transaction pool is a temporary place for storing incoming transactions until they are successfully added to the blockchain.

A miner picks transactions from the memory pool (mempool) to include in their block. A consensus mechanism is responsible for gathering a list of transactions, writing them to a block and attaching the block to a blockchain. It requires all nodes to reach agreement on one version of the transaction history. Before a new block of transactions is added, it is validated, i.e. the hashes present in the chain are checked for consistency with each other. In cryptocurrency, the two most common consensus mechanisms are proof-of-work (PoW) and proof-of-stake (PoS).
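How a miner might pick transactions from the mempool can be sketched as follows. The selection policy (highest fee per byte first), the `Transaction` fields and the block size limit are assumptions for illustration; the text above does not prescribe a particular strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    size: int  # bytes the transaction would occupy in a block
    fee: int   # fee offered to the miner, in the smallest currency unit


def select_transactions(mempool, max_block_size):
    """Greedily fill a block with the best-paying transactions per byte."""
    chosen, used = [], 0
    for tx in sorted(mempool, key=lambda t: t.fee / t.size, reverse=True):
        if used + tx.size <= max_block_size:
            chosen.append(tx)
            used += tx.size
    return chosen


mempool = [
    Transaction("a", size=250, fee=500),  # 2.0 per byte
    Transaction("b", size=400, fee=400),  # 1.0 per byte
    Transaction("c", size=300, fee=900),  # 3.0 per byte
    Transaction("d", size=500, fee=250),  # 0.5 per byte
]
block = select_transactions(mempool, max_block_size=1000)
print([tx.tx_id for tx in block])  # ['c', 'a', 'b']
```

Transaction "d" stays in the pool because it no longer fits, and it can be considered again for a later block.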

PoW is used in Bitcoin and was used in Ethereum until its switch to PoS in 2022. In this protocol, one can become a miner by running mining software on a full node with enough computing power. Nowadays, mining consumes huge amounts of electricity and requires an investment in hardware, such as a stack of CPUs and GPUs or specialised mining devices. A miner validates transactions and mines a new block by solving a computationally heavy puzzle, and is rewarded with the transaction fees each time. Bitcoin's incentive model additionally grants newly created bitcoins for every mined block; this reward started at 50 bitcoins and halves roughly every four years. Mining is another example of cryptography use, because a miner has to generate a hash with a value lower than or equal to the target.
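The puzzle can be sketched as a brute-force search for a nonce. This is a simplified model under stated assumptions: a single SHA-256 over hypothetical header bytes plus an 8-byte nonce, with the target expressed as a number of leading zero bits. Bitcoin's actual header format and difficulty encoding are more involved.

```python
import hashlib


def hash_value(header: bytes, nonce: int) -> int:
    """Interpret SHA-256(header + nonce) as a 256-bit integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def target_for(difficulty_bits: int) -> int:
    """Largest acceptable hash value: the top `difficulty_bits` bits are zero."""
    return (1 << (256 - difficulty_bits)) - 1


def mine(header: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the hash is lower than or equal to the target."""
    target = target_for(difficulty_bits)
    nonce = 0
    while hash_value(header, nonce) > target:
        nonce += 1
    return nonce


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes a single hash, unlike finding one."""
    return hash_value(header, nonce) <= target_for(difficulty_bits)


header = b"previous-hash|merkle-root|timestamp"  # placeholder header bytes
nonce = mine(header, difficulty_bits=16)         # about 65,000 attempts on average
print(nonce, verify(header, nonce, 16))
```

Each additional difficulty bit doubles the expected number of attempts, while verification stays equally cheap. That asymmetry is what makes the work provable.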

 

Apart from PoW and PoS, many recent publications focus on improving the area by introducing other protocols, such as proof-of-delivery (PoD), proof-of-majority (PoM), proof-of-property (PoP) and proof-of-authentication (PoA), or variations of existing ones, such as non-interactive-proofs-of-proof-of-work (NIPoPoWs) and CloudPoS.

Types of blockchain

Blockchain is mainly known for not having any central authority or trusted third party, thanks to the use of a P2P communication model. It is trust-free: even though blockchain participants do not trust each other, they can rely on the system itself.

This statement holds only for public (sometimes referred to as open) permissionless blockchains like Bitcoin, where every node is equal. In these blockchains, PoW is the most common consensus mechanism, and it makes breaking immutability very expensive. The history can still be rewritten by whoever controls more than 50% of the hash power (a 51% attack), so an entity that invests the necessary resources can undermine the chain.

[Figure: Four types of blockchain: properties and examples]

Public blockchains are open, i.e. anyone can join the network and propose to add data (transactions) to blocks or to add blocks to the blockchain. Compared to permissions in an operating system, anyone can write. What is more, a node's identity is pseudonymous or anonymous. In cryptocurrency, a wallet address, which is derived from the user's public key (it is its hash), is the only characteristic tied to user identity. In the Bitcoin network, the hashing algorithm used is SHA-256 (combined with RIPEMD-160 when deriving addresses). A user can have any number of addresses thanks to the decentralised identity management scheme. The greatest challenge in public blockchains is maintaining privacy.

The contrary type, which solves the confidentiality problem, is a private blockchain, usually used within financial institutions. The authority that owns a private blockchain limits access to it based on a user's identity. The blockchain owner controls the number of nodes that validate or add (write) blocks.

Private and public blockchains differ in how users are authenticated. Another distinction can be made between permissioned and permissionless blockchains, which concerns authorization and defines accessibility.

 

In a permissioned blockchain, an authority at the network level limits the actions that a user can take. Such a solution reduces security risks, offers a more scalable system and makes it possible to add blocks faster, as it relies on trusted nodes. Transactions are visible only to the network's participants and are therefore not transparent. Sample implementations are Hyperledger (private permissioned) and Ripple (public permissioned).

As opposed to permissioned ones, public blockchains, which arose first, are most commonly permissionless (or non-permissioned), since there is no authority. Their transaction history is transparent, i.e. visible to any network participant, and exposed in real time. Every transaction that has ever occurred on a public permissionless blockchain can be read using blockchain explorers, which are tools that read data from a particular blockchain. A sample blockchain explorer for Bitcoin and Ethereum is Block Explorer. Nodes can read transaction data with no restrictions, and anyone can download the protocol and become a blockchain participant who validates transactions.

In spite of this, permissionless blockchains with smart contracts, like Ethereum, can disallow certain actions and restrict them to a contract's owner only. The logic deployed on the chain is not taken into account in the definition of permissioned and permissionless.

Smart contracts are pieces of program code stored on the blockchain, and they can be compared to stored procedures in an RDBMS. Only some blockchains (e.g. Bitcoin, Ethereum, Ripple) implement them as a built-in feature. They can define permissions.

A sample smart contract can be: the company gets money when the customer receives a product, e.g. a car.

With smart contracts, it is possible to automate processes by getting rid of the middleman, who would be a car dealer in the example. Smart contracts remove the need to trust an intermediary and make the process faster. They are written in a blockchain-specific language, e.g. Solidity in Ethereum.
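The car sale example can be modelled in a few lines. The sketch below is a plain Python simulation of the contract's rules, not Solidity and not code that runs on any blockchain. The class name, its methods and the rule that the buyer confirms delivery are assumptions made for illustration.

```python
class CarSaleEscrow:
    """Toy escrow: the buyer's payment reaches the seller only after delivery."""

    def __init__(self, seller: str, buyer: str, price: int):
        self.seller = seller
        self.buyer = buyer
        self.price = price
        self.locked = 0          # funds held by the contract
        self.seller_balance = 0  # funds released to the seller

    def deposit(self, caller: str, amount: int) -> None:
        if caller != self.buyer:
            raise PermissionError("only the buyer may deposit")
        if amount != self.price or self.locked:
            raise ValueError("exactly one deposit of the agreed price is expected")
        self.locked = amount

    def confirm_delivery(self, caller: str) -> None:
        if caller != self.buyer:
            raise PermissionError("only the buyer may confirm delivery")
        if not self.locked:
            raise RuntimeError("no payment is locked in the contract")
        self.seller_balance += self.locked  # release the payment
        self.locked = 0


contract = CarSaleEscrow(seller="company", buyer="customer", price=20_000)
contract.deposit("customer", 20_000)
try:
    contract.confirm_delivery("company")  # the seller cannot release the funds
except PermissionError as error:
    print("rejected:", error)
contract.confirm_delivery("customer")
print(contract.seller_balance)  # 20000
```

The permission checks play the role described earlier: the contract's own logic restricts who may perform an action, independently of whether the underlying blockchain is permissioned.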

Summary

With the growth of interest in blockchain technology from companies and institutions, a need has arisen for privacy and for the ability to decide who can participate in a blockchain-based network. Since Bitcoin and other cryptocurrencies use a completely public network, private and regulated solutions have been proposed. Hence new terms were introduced: permissioned as opposed to permissionless, as well as the private and public dichotomy.

About

Java software engineer, DevOps enthusiast. Enjoys developing her own and university projects on GitHub. Spends her spare time actively, rock climbing and pole dancing.