Bitcoin blog

Essence of Bitcoin is innovation that may be related to remittance and asset holding rights and liberation and freedom


Blockchain is transaction database shared by all nodes participating in system based on Bitcoin protocol. The full copy of blockchain contains every transaction ever executed in the currency.With this information, one can find out how much value belonged to each address at any point in history.

Block structure
A block is like a data container of transaction, and it consists of a block header and a list of transactions. The block header is used to calculate the block hash. Specifically, it can be calculated by double-hashing(SHA256) the block header. The capacity of the block header is about 80 byte, whereas one transaction is approximately 250 byte. And since there are 500 or more transactions included in one block, it reaches about 125,000 byte. That is, the transaction has a capacity of 1000 times or more of the block header!


Block header
The block header is composed of the following contents.
①Version(4byte):Verification rule of block
②Previous block(32byte):A thing to connect with Previous block and next block
③Time stamp(4byte), difficulty(4byte), nonce(4byte):involved to mining
④Merkle tree(32byte):A data structure used for collecting transactions in block efficiently

Block hash
The block hash is not included in block's data, when block is released to network or stored in the storage as part of block chain.It is calculated at each node when it arrives at the node from the network.It is stored in the database of each node as a part of the block's metadata, and it is used for retrieving and indexing blocks.

Derivation of block hash
Calculate block hash by referring to this site.


1. BlockVersion and PreviousBlocHash and BlockMerkleRoot convert to Little-Endian
2. BlockTime convert to Unix Time → Convert to hexadecimal → Conver to Little-Endian
3. BlockBits and BlockNonce convert to hexadecimal → Conver to Little-Endian
4. Connect 1, 2, 3 in order
5. Calcurate double hash value with SHA256 to ④
6. Convert this value to Big-Endian

import hashlib
import binascii
import re

#little edian function
def endian(src):
    scr = ''.join(re.split('(..)',src)[1::2][::-1])
    return scr

#block data(from blockchain info)
block_version = "00000004"
previous_block_hash = "0000000000000000005629ef6b683f8f6301c7e6f8e796e7c58702a079db14e8"
block_merkle_root = "efb8011cb97b5f1599b2e18f200188f1b8207da2884392672f92ac7985534eeb"
block_time = "2016-01-30 13:23:09"
block_bits = "403253488"
block_nonce = "1448681410"

#convert little endian
block_version       = endian(block_version)
previous_block_hash = endian(previous_block_hash)
block_merkle_root   = endian(block_merkle_root)

#convert GMT to unix
from datetime import datetime
block_time = int((datetime.strptime(block_time, "%Y-%m-%d %H:%M:%S")-datetime(1970,1,1)).total_seconds())
block_time = endian(format(block_time,"x"))

#convert decimal to hexadecimal
block_bits = endian('%x' % int(block_bits))
block_nonce = endian('%x' % int(block_nonce))

#block header
block_data = block_version + previous_block_hash + block_merkle_root + block_time + block_bits + block_nonce

hash = hashlib.sha256(hashlib.sha256(binascii.a2b_hex(block_data)).digest()).digest()
hash = endian(binascii.b2a_hex(hash).decode('utf-8'))

Block height
"Block Height" means the hight of stacked of block.For example,when block hight is "29500", it expressing that the block processed at 29500th from beginning.The block height does not include in part of the data of block.So,the node don't judge place of block in the blockchain from "block height"."Block height" is saved as metadata in data base.

Genesis block
First block of the blockchain is called "Genesis block" and it was made in 2009.As the word "Genesis" implies, it eventually return to genesis block even if any block.Every block contains a hash of the previous block. By This structure,it has the effect of creating a chain of blocks from the genesis block to the current block. This ensures the safety of the block chain.


Concatenation of blocks
Full node client maintains local copy of block chain that continues from genesis block.The client is periodically updated every time new block is found.When the nodes receive block from Bitcoin network, node verifies block and connects to the blockchain already held.

The fork is phenomenon where different blocks are found by different minors almost at the same time and the chain branches.However, since longer chain is adopted and shorter one is discarded, it is solved when one child block becomes part of the blockchain at certain timing.Generally, the more times network approve it, the more reliable it is. When more than six times are approved,it would be almost certainly.

Merkle tree
Part of the block contains Merkle tree information.Merkle tree contains an overview of all transactions in the block.The tree is also called "binary hash tree", it can efficiently compile large data and verify the whole data."Tree" represents data structure with leaf, with upper direction being root and lower direction being leaf.

Function of Merkle tree
① Merkle tree is used to compile all the transactions included in the block
② Create digital fingerprint of the entire transaction (hashed digital information)
③ It helps to confirm efficiently whether a transaction is included in the block


Calcurate of Merkle tree
1. If there is a hash of two transactions in the block
2. Convert each hash value to Little Endian
3. Connect the hash value of 2 directly
4. Convert 3 to a binary string and double hash the SHA 256
5. Convert 4 to Big Endian
※If there are no hash pairs, duplicate myself.

import hashlib
import re
import binascii

#edian function
def endian(src):
    scr = ''.join(re.split('(..)',src)[1::2][::-1])
    return scr

transaction_hash1 = "e909043fa259d7a36c4e868cb0e4fc2ed589384c1766705d0c4e132117fc207d"
transaction_hash2 = "e6d51744d77bd69a00acc42c322b2aeeec80ee6fcce7bb371fb0239ad2865fd2"

#convert to little endian
transaction_hash1 = endian(transaction_hash1)
transaction_hash2 = endian(transaction_hash2)

#connect to hash1 and hash2
transaction_hash = transaction_hash1 + transaction_hash2

#double hash with SHA256
hash = hashlib.sha256(hashlib.sha256(binascii.a2b_hex(transaction_hash)).digest()).digest()
hash = endian(binascii.b2a_hex(hash).decode('utf-8'))

Merkle root
Merkle tree computes one hash value from a pair of leaf nodes,and continue calculation with pairs until one hash value. Last remaining value is called "Merkle route".One block has several hundreds to several thousands of transactions, but eventually it becomes 32byte hash on Merkle route.Regardless of how many transactions,it does not change.
In order to prove whether a particular transaction is included in a block, it is enough to create hashes of log 2 (N) . As shown in the figure below.


Effect of Meekle root
The block capacity is only 4 KB in 16 transactions, but in 65,535 transactions the block capacity is 16 MB (16,000,000 bytes) .In the case of Merkle path,the block capacity is 128 bytes in 16 transactions,in 65,535 transactions the block capacity is 512 bytes .It can save 1 / 31,250 times the capacity by using Merkle path.


※The code is used in this article,it have posted on GitHub

Remove all ads


Bitcoin is builded as peer-to-peer(P2P) network on internet. P2P network is essentially decentralize and open. This network is formed by connecting mutually equal nodes. we calling computer terminal participating in the network as “node”, and Bitcoin network means that entire note running the P2P protocol. This node can be divided into several roles according below.

Node type
The Bitcoin node is collection of functions such as routing, block chain database, mining, and wallet.

User wallet has two types of full node and SPV node.
Mining nodes are competing to create new blocks.
Full blockchain
It refers to full node with a complete blockchain.
Routing node
All nodes must have routing function in order to join the network.

Extended network type
It shows the main type on the extended Bitcoin network.

Join the network
When new node launch, the node must find another node to join Bitcoin network. And this node has to find and connect at least one node. Geographical location of other nodes is irrelevant. Becouse topology of Bitcoin network isn't desided in reration to geography. The node can be randomly chosen.

①The node establish TCP conection(use port 8333)
TCP (Transmission Control Protocol):It is a standard protocol used to realize reliable communication in the Internet.

②Stand up to New node(It find and connect at least one node)

③Contact DNS server (DNS seed provide IP address of Bitcoin node)
DNS(Domain Name System):It tells IP address based on hostname.

④ As soon as establishing a connection, the node sending“version message”
version message:It include basic Identification information

⑤The peer node returns " verack " to acknowledge the connection and establish it
verack:It inform that trying to connect

⑥When the node establish one or more connections, New node sends "addr message" to the adjacent node
addr message: It include IP address

⑦Adjacent nodes communicate "addr message" to another adjacent nodes one after another

⑧Also, when send "getaddr" to adjacent node
getaddr:It get to return IP address of other peer

⑨The node connects to plurality of different peers

⑩node must always keep finding new node

Full node working
Full node is node managing a complete blockchain including all transactions.

①Full node sends to “version message”
version message:It is included to“Bestheight”that shows current blockchain hight of node

②By looking at "version messages", the nodes know how many peers the peer holds and can compare with our own blockchains

③Then the peer node exchanges "getblocks message" containing the top block hash (fingerprint) of each other local blockchain

④The node can compare with the hash of other node and identify which block the other node wants

⑤Identify the first 500 blocks to share with other nodes
"inv message" include in this infomation

⑥Send “inv message” to other node

⑦Pick a hash of missing blocks and request "getdata messages" to be sent full block data

※The node keeps track of how many blocks per peer are still in the "transmitting" state that has not yet been sent.
※It continues checking so that it does not exceed the maximum number of blocks during transmission for one peer.

Simplified payment verification
As Bitcoin grows in usage the datasize amount needed to download blocks and transaction broadcasts increases. All notes can't hold a full blockchain. The data size problem occer when running Bitcoin wallet on low spec machin like smartphone. Convenience will be lost unless irrelevant transactions are discarded and block chains are not synchronized quickly. The SPV is used so that network processes can be executed without holding a full blockchain.

Solution method
The SPV node downloads only the block header and does not download transaction. It only validates all blockchains and transactions associated with this SPV node. So the SPV node can't verify that UTXO is not in use.
Instead, the SPV node uses merkle path to tie the transaction to block containing this transaction. The SPV node can check existence of the transaction in the block by requesting the stead node for merkle path certification and by verifying the proof of work in blockchain.

※The SPV node can prove that the transaction exists reliably, but it can not verify whether there is a transaction like dual use of the same UTXO.So the SPV node needs to be connected randomly to several nodes.

What is Bloom filter?
With the protocol extension proposed in BIP0037,SPV client can efficiently download only transactions related to myself address by using "Bloom filter".Bloom filter operates at a very high speed and is a data structure that can probabilistically determine whether an element is included in a set.
"Stochastic" means that sometimes to say "Although do not included,it contains",but it does not say "despite being included in it, not included" .
※there is a false positive but false There is no negative

How does Bloom filter work?
①Initialize bloom filter with empty state

②Create list of all the addresses the wallet has

③Create a search pattern for each Bitcoin address

④These search patterns are passed through a hash function to obtain a number between 1 and N

⑤Find bits in the bit string corresponding to this number and set "1"

⑥It will do this further by the number of the pattern(already remains 1 stands place where 1)
※Accuracy decreases as the number of patterns increases.Conversely, the larger the size (N) of the bit string and the number (M) of hash functions, the more accurate it becomes.Using larger bit strings and more hash functions makes more accurate.

⑦bloom fliter send to peer

⑧The peer uses sent bloom filter to find out which transaction matches the search pattern

⑨Use bloom filter to check pattern "X" exists.
If the place where the bit pattern is "1" is the same "1" in the bit string of the bloom filter,the bit pattern may be included in the bloom filter.

⑩Conversely, to check that a pattern is not included in the bloom filter, make sure that any one of the bit strings of the corresponding bloom filter is 0.

Bloom Filter and Inventory
1.SPV node send "filterload(include bloom filter)" to peer

2.Peer chack this bloom filter

3.Peer send "transaction(only matching bloom filter)" to SPV node

4.SPV node send "getdata message" to peer

5.Peer send "merkleblock message" to SPV node
※Block header and transaction correspond to merkle path

6.Peer send "tx message" to SPV node
※Contains transactions matching "bloom filter"

Remove all ads


Transaction is encoded data stracture that value transfer between participants in Bitcoin sistem.
Individual transaction add into blockchain as global ledger.
Anyone can see each transaction in the blockchain.
Basic component of transaction is the unused transaction output(UTXO).
UTXO is chunks of undividable Bitcoin locked to a specific owner.

Transaction life cycle
Transaction generate→ Transaction signature → Broadcast to network → Network verification → Propagation to network → Mining node verfication → Recode to blockchain

Transaction construction
The transaction contains several fields as shown in the following transaction structure.


→Specifies which rules this transaction follows

Input values(1byte):
→How many input are included

Previous block hash(32byte):
→Pointer to the transaction containing the UTXO to be spent

Previous index(4byte):
→The index number of the UTXO to be spent; first one is 0

Script bytes length(1byte):
→Unlocking-Script length in bytes

→A script that fulfills the conditions of the UTXO locking script

Terminal symbols(4byte):
→Currently disabled Tx-replacement feature, set to "0xFFFFFFFF"

Output values(1byte):
→How many output are included

Amount invested(8byte):
Bitcoin value in satoshis

Script byte length(1byte):
→Locking-Script length in bytes

→A script defining the conditions needed to spend the output

Lock time(4byte):
→A Unix timestampn or block number

Locktime = 0: Propagated immediately

Lock time ≠ 0> 500,000,000: Block height (meaning that this transaction is not included in the block chain in the block before this)

Lock time ≠ 0 <500,000,000: Unix time (meaning this transaction is not valid before this time)

Raw transaction example

Transaction type
Bitcoin uses a scripting system for transactions.
Forth-like, Script is simple, stack-based, and processed from left to right.
To prevent attacks on the Bitcoin network by not allowing complicated processing.
There are five standard transaction scripts for Bitcoin.

The scripte is used vast majority of transactions on the bitcoin network.The P2PKH designate Bitcoin address as payee.An output locked by a P2PKH script can be unlocked (spent) by presenting a public key and a digital signature created by the corresponding private key.

unlocking script(scriptSig)

<Signature><Public Key>

locking script(scriptPublicKey)


locking script exemple

unlocking script + locing script

<Signature><Public Key> OP_DUP OP_HASH160<public Key Hash> OP_EQUAL OP_CHECKSIG


P2PK is a simplified form of P2PKH,P2PK is not safer than P2PKH and generally don't use now.

locking script

<Public key A><OP_CHECKSIG>

unlocking script

<Signature from Private Key A>

unlocking script + locking script

<Signature from Private Key A><Public key A>OP_CHECKSIG

Multi-signature scripts must set N public keys and recorded in the script and at least M number of signatures must provide when release. This is also known as an M-of-N scheme, where N is the total number of keys and M is the threshold of signatures required for validation.

locking script

2<Public Key A><Public Key B><Public Key C>3 OP_CHECKMULTISIG

unlocking script

OP_0<Signature B><Signature C> 

unlocking script + locking script

OP_0<Signature B><Signature C>2<Public Key A><Public Key B><Public Key C>3 OP_CHECKMULTISIG

④data output(OP_RETURN)
OP_RETURN can add 80 bytes of data unrelated to payment to the transaction.OP_RETURN don't have "unlock script" for use to output.This will not be an available UTXO,output is assigned with 0BTE.If Script verifcation software find to OP_RETURN,Stops execution of the verification script immediately and invalidates the transaction

The Multisig use multiple PubKey at the time of remittance ,Locking Script has a problem that the amount of data becomes large.In P2SH, PubKey data is separated from the Locking Script at the time of payment,and that Public key is included in "Redeem Script" and the problem was solved by referring to data.

When P2MS used

When P2SH used

P2MS script

Actually P2MS data is long like this!

Even such a long script become short when expressing with SHA 160(20byte)

In other word,P2SH can be expressed simply as follows

When using UTXO created with P2SH
Redeem Script is checked against Locking Script whether this hash matches

Unlocking Script will be executed to release of Redeem Script when Redeem Script matches

※P2SH address is encoded Hash value of Redeem script,Version Prefix is "5" and P2SH address is beginning "3"

Transaction Signature
There are various methods among digital signatures,but Elliptic Curve Digital Signature Algorithm (ECDSA) is used in Bitcoin.The ECDSA offers Digital Signature Algorithm (DSA) which is used ECC.This cryptographic algorithm ensure that funds can only be spent by their rightful owners.

When Alice send message(m) and signature(u, v) to Bob who can be verified data has not been tampered by third parties.

1.In advance Alice generate private key(d) and publishing public key(Q)

Q = d * G ...① ※Prime numbers(I) and Base point(G) known to Bob

2.Generate rundam numbers(r),and r times moved point R(coordinate xU,yU) calculate from base point(G)

r * G = R =(xU, yU)

3.Find the hash value H of message m

4.Calculate signature value(u, v)

u = xU ( mod l) 
v = (H + u * d) / r (mod l) ...② ※The remainder u dividing xU to l(prime number)

5.The signature(u, v) send to Bob

6.Point P calcurated using public key(Q) and parameter(prime number l, base point G) and signature value(u, v)

P  = (xV, yV) = H/ v*G + u /v*Q ...③

7.If "xV" and "xU mod l" are equal,they has not been tampered

Why is this confirming the signature?

1.Let's verify using ③ formula

P =  (xV, yV) = H/ v*G + u /v*Q

From ①

	= H/v * G + u/v * dG
	= 1/v (H + ud)G

From ②

	= r / (H + ud)  * (H + ud)G
	=rG = R

When the signature value (u, v), the hash value H, and the public key Q are correct.Since the point P and the point R are equal, the signature value (u) of the point R is equal to the x coordinate (xV) of the point P.Therefore, the signature is correct.

Transaction signature flow


priv = sha256('some big long brainwallet password')
pub = privtopub(priv)
addr = pubtoaddr(pub)
h = history(addr)
outs = [{'value': 90000, 'address': '16iw1MQ1sy1DtRPYw3ao1bCamoyBJtRB4t'}]
tx = mktx(h,outs)

h = history(addr)
→The history() retrieves all transaction outputs (TXO) that have been "received" by the specified bitcoin address. This includes both unspent and spent TXOs.if marked with "spend",meaning that have been spent.

tx = mktx(h,outs)
→The tx create frame of transaction that include as shown in the figure above


def sign(tx, i, priv, hashcode=SIGHASH_ALL):
    pub = privkey_to_pubkey(priv)
    address = pubkey_to_address(pub)
    #temporary staging to Public key hash 
    #signature_from:add scriptSig + previous output index
    signing_tx = signature_form(tx, int(i), '76a914' + b58check_to_hex(address) + '88ac', hashcode)

def sign()
→This function generate signatur use private key

signing_tx = signature_form(tx, int(i),
'76a914' + b58check_to_hex(address) + '88ac', hashcode)

→A normal sendtoaddress transaction will start with 76a914 and end with 88ac.and 76a914 means "OP_DUP OP_HASH160 PUSH_20BYTES" and 88ac is "OP_EQUALVERIFY OP_CHECKSIG".
→Temporarily put to ScriptPubKey and Script bytes-long


def deterministic_generate_k(msghash, priv):
    v = b'\x01' * 32
    k = b'\x00' * 32
    priv = encode_privkey(priv, 'bin')
    msghash = encode(hash_to_int(msghash), 256, 32)
    k =, v+b'\x00'+priv+msghash, hashlib.sha256).digest()
    v =, v, hashlib.sha256).digest()
    k =, v+b'\x01'+priv+msghash, hashlib.sha256).digest()
    v =, v, hashlib.sha256).digest()
    return decode(, v, hashlib.sha256).digest(), 256)

def ecdsa_raw_sign(msghash, priv)
→RFC6979 is a deterministic digital signature generation procedure.If ECDSA implementation bad,private keys have leaked.Deterministic signatures retain the cryptographic security features associated with digital signatures but can be more easily implemented in various environments, since they do not need access to a source of high-quality randomness.Therefore, it must implement deterministic signature schemes such as the scheme described in RFC6979.

※Section 3.2 of RFC6979 is described the process of generating nonce.


def ecdsa_raw_sign(msghash, priv):
    z = hash_to_int(msghash)
    k = deterministic_generate_k(msghash, priv)
    r, y = fast_multiply(G, k)
    s = (z + r*decode_privkey(priv)) * inv(k, N) % N
    v = 27+((y % 2) ^ (0 if s * 2 < N else 1))
    s = s if s * 2 < N else N - s
    return v, r, s

r, y = fast_multiply(G, k)
→Use ECC to find the point R
→r * G = R =(xU, yU) ※Point R is r times moved point R
→u = xU(mod l) u is signature value(In above code it is "r")

s = (z + r*decode_privkey(priv)) * inv(k, N) % N
→The other signature value calculated from "v = (H + u * d) / r ( mod l)"

v = 27 +( (y % 2) ^ (0 if s * 2 < N else 1) )
→It returns 27 when V is an even number, and 28 in case of an odd number

s = s if s * 2 < N else N - s
→The number of prime points on the plane is limited to order N.
→N is parameters of the "secp256k1" used in Bitcoin

def der_encode_sig(v, r, s):
    b1 = str(binascii.hexlify(encode(r, 256)), 'utf-8')
    b2 = str(binascii.hexlify(encode(s, 256)), 'utf-8')
    left = '02'+encode(len(b1)//2, 16, 2)+b1
    right = '02'+encode(len(b2)//2, 16, 2)+b2
    return '30'+encode(len(left+right)//2, 16, 2)+left+right

def ecdsa_tx_sign(tx, priv, hashcode=SIGHASH_ALL):
    rawsig = ecdsa_raw_sign(binascii.unhexlify(txhash(tx, hashcode)), priv)
    return der_encode_sig(*rawsig)+encode(hashcode, 16, 2)

def der_encode_sig(v, r, s)
→DER encoded signature has the following form
0x30 b1 0x02 b2 r 0x02 b3 s
DER is a set of rules for transforming a data structure into a sequence of bytes, and back.


def sign(tx, i, priv, hashcode=SIGHASH_ALL):
    pub = privkey_to_pubkey(priv)
    address = pubkey_to_address(pub)
    signing_tx = signature_form(tx, int(i), 
          '76a914' + b58check_to_hex(address) + '88ac', hashcode)
    sig = ecdsa_tx_sign(signing_tx, priv, hashcode)
    txobj = deserialize(tx)
    txobj["ins"][int(i)]["script"] = serialize_script([sig, pub])
    return serialize(txobj)  

txobj = deserialize(tx)
→Restore the signed tx to the original data format

txobj["ins"][int(i)]["script"] = serialize_script([sig, pub])
→The signature value and public key put in ScriptSig of input

※The code is used in this article,it have posted on GitHub

Remove all ads