Blockchain is transaction database shared by all nodes participating in system based on Bitcoin protocol. The full copy of blockchain contains every transaction ever executed in the currency.With this information, one can find out how much value belonged to each address at any point in history.
A block is like a data container of transaction, and it consists of a block header and a list of transactions. The block header is used to calculate the block hash. Specifically, it can be calculated by double-hashing(SHA256) the block header. The capacity of the block header is about 80 byte, whereas one transaction is approximately 250 byte. And since there are 500 or more transactions included in one block, it reaches about 125,000 byte. That is, the transaction has a capacity of 1000 times or more of the block header!
The block header is composed of the following contents.
①Version(4byte):Verification rule of block
②Previous block(32byte):A thing to connect with Previous block and next block
③Time stamp(4byte), difficulty(4byte), nonce(4byte):involved to mining
④Merkle tree(32byte):A data structure used for collecting transactions in block efficiently
The block hash is not included in block's data, when block is released to network or stored in the storage as part of block chain.It is calculated at each node when it arrives at the node from the network.It is stored in the database of each node as a part of the block's metadata, and it is used for retrieving and indexing blocks.
Derivation of block hash
Calculate block hash by referring to this site.
1. BlockVersion and PreviousBlocHash and BlockMerkleRoot convert to Little-Endian
2. BlockTime convert to Unix Time → Convert to hexadecimal → Conver to Little-Endian
3. BlockBits and BlockNonce convert to hexadecimal → Conver to Little-Endian
4. Connect 1, 2, 3 in order
5. Calcurate double hash value with SHA256 to ④
6. Convert this value to Big-Endian
import hashlib import binascii import re #little edian function def endian(src): scr = ''.join(re.split('(..)',src)[1::2][::-1]) return scr #block data(from blockchain info) block_version = "00000004" previous_block_hash = "0000000000000000005629ef6b683f8f6301c7e6f8e796e7c58702a079db14e8" block_merkle_root = "efb8011cb97b5f1599b2e18f200188f1b8207da2884392672f92ac7985534eeb" block_time = "2016-01-30 13:23:09" block_bits = "403253488" block_nonce = "1448681410" #convert little endian block_version = endian(block_version) previous_block_hash = endian(previous_block_hash) block_merkle_root = endian(block_merkle_root) #convert GMT to unix from datetime import datetime block_time = int((datetime.strptime(block_time, "%Y-%m-%d %H:%M:%S")-datetime(1970,1,1)).total_seconds()) block_time = endian(format(block_time,"x")) #convert decimal to hexadecimal block_bits = endian('%x' % int(block_bits)) block_nonce = endian('%x' % int(block_nonce)) #block header block_data = block_version + previous_block_hash + block_merkle_root + block_time + block_bits + block_nonce #calcurate hash = hashlib.sha256(hashlib.sha256(binascii.a2b_hex(block_data)).digest()).digest() hash = endian(binascii.b2a_hex(hash).decode('utf-8')) print(hash)
"Block Height" means the hight of stacked of block.For example,when block hight is "29500", it expressing that the block processed at 29500th from beginning.The block height does not include in part of the data of block.So,the node don't judge place of block in the blockchain from "block height"."Block height" is saved as metadata in data base.
First block of the blockchain is called "Genesis block" and it was made in 2009.As the word "Genesis" implies, it eventually return to genesis block even if any block.Every block contains a hash of the previous block. By This structure,it has the effect of creating a chain of blocks from the genesis block to the current block. This ensures the safety of the block chain.
Concatenation of blocks
Full node client maintains local copy of block chain that continues from genesis block.The client is periodically updated every time new block is found.When the nodes receive block from Bitcoin network, node verifies block and connects to the blockchain already held.
The fork is phenomenon where different blocks are found by different minors almost at the same time and the chain branches.However, since longer chain is adopted and shorter one is discarded, it is solved when one child block becomes part of the blockchain at certain timing.Generally, the more times network approve it, the more reliable it is. When more than six times are approved,it would be almost certainly.
Part of the block contains Merkle tree information.Merkle tree contains an overview of all transactions in the block.The tree is also called "binary hash tree", it can efficiently compile large data and verify the whole data."Tree" represents data structure with leaf, with upper direction being root and lower direction being leaf.
Function of Merkle tree
① Merkle tree is used to compile all the transactions included in the block
② Create digital fingerprint of the entire transaction (hashed digital information)
③ It helps to confirm efficiently whether a transaction is included in the block
Calcurate of Merkle tree
1. If there is a hash of two transactions in the block
2. Convert each hash value to Little Endian
3. Connect the hash value of 2 directly
4. Convert 3 to a binary string and double hash the SHA 256
5. Convert 4 to Big Endian
※If there are no hash pairs, duplicate ｍｙself.
import hashlib import re import binascii #edian function def endian(src): scr = ''.join(re.split('(..)',src)[1::2][::-1]) return scr transaction_hash1 = "e909043fa259d7a36c4e868cb0e4fc2ed589384c1766705d0c4e132117fc207d" transaction_hash2 = "e6d51744d77bd69a00acc42c322b2aeeec80ee6fcce7bb371fb0239ad2865fd2" #convert to little endian transaction_hash1 = endian(transaction_hash1) transaction_hash2 = endian(transaction_hash2) #connect to hash1 and hash2 transaction_hash = transaction_hash1 + transaction_hash2 #double hash with SHA256 hash = hashlib.sha256(hashlib.sha256(binascii.a2b_hex(transaction_hash)).digest()).digest() hash = endian(binascii.b2a_hex(hash).decode('utf-8')) print(hash)
Merkle tree computes one hash value from a pair of leaf nodes,and continue calculation with pairs until one hash value. Last remaining value is called "Merkle route".One block has several hundreds to several thousands of transactions, but eventually it becomes 32byte hash on Merkle route.Regardless of how many transactions,it does not change.
In order to prove whether a particular transaction is included in a block, it is enough to create hashes of log 2 (N) . As shown in the figure below.
Effect of Meekle root
The block capacity is only 4 KB in 16 transactions, but in 65,535 transactions the block capacity is 16 MB (16,000,000 bytes) .In the case of Merkle path,the block capacity is 128 bytes in 16 transactions,in 65,535 transactions the block capacity is 512 bytes .It can save 1 / 31,250 times the capacity by using Merkle path.
※The code is used in this article,it have posted on GitHub