Blockchain technical details
- Author: Harvey
General
- The idea of this post is to elucidate the core functionality of a blockchain protocol. It touches on some technical issues of a typical blockchain system, but everything here runs in synchronous (i.e. sequential) order.
- In short, the program first generates two users and one miner. It then constructs a genesis block, which allocates some money to one user (bmm1). The two users then initialize three transactions between them. Finally, the miner collects the transactions and mines a new block.
Accounts: Private & Public Key and Address
(see ./src/account_management)
- The concept of private and public keys stems from cryptography. In general, there are two use-cases:
- One can encrypt a message using a public key, and the private key can be used to decrypt it.
- One can use the private key to sign a message, and the public key can be used to verify the signature (this is the use-case for a typical blockchain protocol).
- A private-public key pair can be generated using a standard scheme (e.g. RSA or ECDSA). Effectively, it is a mathematical design that produces two different but mathematically related numbers. They have the property that it is easy to derive the public one from the private one, while it is practically impossible to do the reverse.
- From a financial transaction perspective, private key == bank password; public key == bank account number; address == bank card number.
- However, in the blockchain case, anyone who knows the private key (bank password) can calculate the public key (bank account number).
- An address is a hashed (i.e. masked) form of the public key. It is effectively an identifier used to send money to, and receive money from, others.
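As a minimal sketch of the signing use-case, the Go program below generates an ECDSA key pair, derives an address by hashing the public key, and then signs and verifies a message. It is not the repo's code: the function names (newKeyPair, addressOf, sign, verify), the P-256 curve, and SHA-256 as the address hash are assumptions chosen so that the example runs with the standard library alone. Ethereum-style systems use the secp256k1 curve and Keccak256 instead.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"encoding/hex"
	"fmt"
)

// newKeyPair generates a private key; the public key is derived from it.
// P-256 is an assumption made here, not the curve used by the repo.
func newKeyPair() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// addressOf hashes the serialized public key into a hex identifier.
// This address scheme is hypothetical and only illustrates "address = hash(public key)".
func addressOf(pub *ecdsa.PublicKey) (string, error) {
	raw, err := x509.MarshalPKIXPublicKey(pub)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// sign signs the SHA-256 digest of msg with the private key.
func sign(priv *ecdsa.PrivateKey, msg string) ([]byte, error) {
	digest := sha256.Sum256([]byte(msg))
	return ecdsa.SignASN1(rand.Reader, priv, digest[:])
}

// verify checks the signature using only the public key.
func verify(pub *ecdsa.PublicKey, msg string, sig []byte) bool {
	digest := sha256.Sum256([]byte(msg))
	return ecdsa.VerifyASN1(pub, digest[:], sig)
}

func main() {
	priv, err := newKeyPair()
	if err != nil {
		panic(err)
	}
	addr, err := addressOf(&priv.PublicKey)
	if err != nil {
		panic(err)
	}
	sig, err := sign(priv, "pay bmm2 10")
	if err != nil {
		panic(err)
	}
	fmt.Println("address:", addr)
	fmt.Println("original message verifies:", verify(&priv.PublicKey, "pay bmm2 10", sig))
	fmt.Println("tampered message verifies:", verify(&priv.PublicKey, "pay bmm2 99", sig))
}
```

Only the holder of the private key can produce a signature that verifies, while anyone holding the public key can check it.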
Transactions (Tx)
(see ./src/transaction_management.go)
type Transaction struct {
	From string `json:"From"`
	To string `json:"To"`
	Amount float64 `json:"Amount"`
	Fee float64 `json:"Fee"`
	TimeStamp int64 `json:"TimeStamp"`
	TxHash string `json:"TxHash"`
	Message string `json:"Message"`
	Signature []byte `json:"Signature"`
	Accepted bool `json:"Accepted"`
}
- A blockchain transaction is similar to a standard bank transfer.
- A hashstring is the first layer of the generic transaction validity protocol. In this repo, the hashstring is calculated as follows:
- We first generate a string by attaching From, To, Amount, Message and TimeStamp together.
- e.g. If From = "a", To="b", Amount=312, Message="aha", TimeStamp=52442, then hashstring="ab312aha52442".
- Here the TimeStamp is the number of nanoseconds elapsed since 00:00:00 UTC on Jan. 1st, 1970 (Unix time).
- Then we use a hash function (SHA256) to convert the hashstring into a fixed-length hash.
- A useful property of the hash function is that even a tiny change in the hashstring produces a completely different hash.
- Once a transaction is initialized, the user may sign the transaction using the private key. This is the second layer of the generic transaction validity protocol.
- Here we use Keccak256 (the hash function used by Ethereum) to hash the hashstring generated above once more.
- Then we sign the transaction. The signature is itself a sequence of bytes (stored in Signature), typically printed as a string of letters and digits.
- After the transaction is signed, it is sent to a pool where all pending transactions are waiting for the miners to pick up.
type MEMPool struct {
	pendingTransactions []Transaction
}
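The hashstring construction described in this section can be sketched as follows. The helper names hashString and txHash are hypothetical, and the number formatting (an Amount of 312 printed as "312") is an assumption chosen to reproduce the example given earlier. The repo may format the fields differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// hashString concatenates From, To, Amount, Message and TimeStamp in that order.
func hashString(from, to string, amount float64, message string, timeStamp int64) string {
	return from + to +
		strconv.FormatFloat(amount, 'f', -1, 64) + // shortest decimal form, e.g. 312 -> "312"
		message +
		strconv.FormatInt(timeStamp, 10)
}

// txHash returns the hex-encoded SHA-256 digest of the hash string.
func txHash(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

func main() {
	s := hashString("a", "b", 312, "aha", 52442)
	fmt.Println(s) // ab312aha52442
	fmt.Println(txHash(s))

	// Changing the amount by one unit yields an unrelated digest.
	fmt.Println(txHash(hashString("a", "b", 313, "aha", 52442)))
}
```

One caveat of plain concatenation: it is ambiguous, because From="ab", To="c" and From="a", To="bc" produce the same hashstring. A production design would typically add separators or use a structured encoding.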
Blocks
(see ./src/blockchain_management)
type Block struct {
	Index int `json:"Index"`
	PreviousBlockAddress string `json:"PreviousBlockAddress"`
	TimeStamp int64 `json:"TimeStamp"`
	MinerAddress string `json:"MinerAddress"`
	Nonce int `json:"Nonce"`
	Difficulty int `json:"Difficulty"`
	BlockHash string `json:"BlockHash"`
	SelectedTransactionList []Transaction `json:"SelectedTransactionList"`
}
- A block is simply a collection of transactions selected by the miner (in SelectedTransactionList), along with some information.
- Index is an integer that provides a human-readable way to identify where the block sits in the blockchain.
- In a Bitcoin system, it is referred to as block height.
- You will often hear about the number of confirmations, meaning how many blocks have been mined/confirmed after the block to which the transaction belongs.
- Each miner initializes their own new block and starts mining. The TimeStamp records when the first miner successfully mines/confirms the block.
- The rest of the information is explained separately in the following section.
BlockChain
(see ./src/blockchain_management)
type BlockChain struct {
	Chain []Block `json:"Chain"`
}
- A blockchain is a chain of blocks. Each confirmed block is pseudo-connected to the next via BlockHash and PreviousBlockAddress, i.e. BlockHash in block 1 == PreviousBlockAddress in block 2.
- A block can only be included on the blockchain if it is mined/confirmed.
- The first block of a blockchain is called the genesis block. In this case, it is hardcoded into the blockchain (see func CreateGenesisBlock).
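To make the link between BlockHash and PreviousBlockAddress concrete, here is a small sketch that walks a chain and checks that every block points at the hash of its predecessor. The Block type is trimmed to the fields needed for the check, and linksAreConsistent is a hypothetical helper, not a function from the repo.

```go
package main

import "fmt"

// Block is trimmed to the fields needed for the link check.
type Block struct {
	Index                int
	PreviousBlockAddress string
	BlockHash            string
}

// linksAreConsistent reports whether each block's PreviousBlockAddress
// equals the BlockHash of the block right before it.
func linksAreConsistent(chain []Block) bool {
	for i := 1; i < len(chain); i++ {
		if chain[i].PreviousBlockAddress != chain[i-1].BlockHash {
			return false
		}
	}
	return true
}

func main() {
	chain := []Block{
		{Index: 0, PreviousBlockAddress: "", BlockHash: "aa"}, // genesis block
		{Index: 1, PreviousBlockAddress: "aa", BlockHash: "bb"},
		{Index: 2, PreviousBlockAddress: "bb", BlockHash: "cc"},
	}
	fmt.Println(linksAreConsistent(chain)) // true

	// Rewriting an earlier block changes its hash and breaks the link to its successor.
	chain[1].BlockHash = "tampered"
	fmt.Println(linksAreConsistent(chain)) // false
}
```

This is why altering an old block is detectable: its hash changes, and the next block no longer points at it.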
Miner and Mining
(see ./src/blockchain_management)
- The process of mining involves the following steps:
- Miners select transactions in the memory pool based on fee rankings.
- Miners verify the validity of each transaction based on two requirements:
- If From address has enough balance to cover the Amount plus Fee.
- If the signature can be verified.
- Miners compete against each other on speed to solve a mathematical problem.
- Transaction signature verification is not trivial. To verify the validity of a transaction, the miner has to know the public key (recall the two use-cases in the Accounts section above).
- A naive way: the From address attaches the raw public key to the transaction. For privacy reasons, users do not wish to do that.
- A more sophisticated way: given the TxHash and Signature, it is possible to recover the public key. We can then hash the recovered public key to regenerate the From address, and check whether it matches the From string recorded in the original transaction.
- In this repo, the miner accepts two transactions (bmm1 sends to bmm2) and rejects the third one (bmm2 sends to bmm1) because bmm2 has 0 balance.
- As mentioned above, mining is a process of solving a problem. In this repo, the process is designed as follows:
- We first generate a string, which consists of TimeStamp, the TxHash of all selected transactions, PreviousBlockAddress and Nonce.
- Note that TimeStamp changes on its own as time passes, Nonce is the integer the miner searches for in order to solve the problem, and all other strings are static.
- We then use SHA256 to hash this long string to generate the BlockHash.
- The objective is to find a Nonce such that the first n characters of the resulting BlockHash match the pattern set by the difficulty level n.
- For instance, if the difficulty level is set to 2, then a successful BlockHash must start with "01"; if the difficulty level is 3, then a successful BlockHash must start with "012".
- Increasing the difficulty level requires the miners to perform many more computations (for a hex-encoded hash, each additional required character multiplies the expected number of attempts by 16).
- Once a Nonce is found such that the problem is solved, the miner declares a successful mining of the block and sends the block to the blockchain.
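The mining steps above can be sketched in a few dozen lines of Go. This is an illustration under stated assumptions, not the repo's implementation: the names Tx, selectTxs, blockHash and mine are hypothetical, signature verification is left out, the TimeStamp is held fixed while searching so that the run is reproducible, and the target prefix for difficulties above 3 is assumed to continue the "01", "012" pattern through the hexadecimal digits.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Tx is trimmed to the fields used in this sketch.
type Tx struct {
	From   string
	Amount float64
	Fee    float64
	TxHash string
}

// selectTxs ranks pending transactions by fee (highest first) and keeps up to
// limit transactions whose sender can cover Amount plus Fee.
// Signature verification is omitted here.
func selectTxs(pending []Tx, balances map[string]float64, limit int) []Tx {
	ranked := append([]Tx(nil), pending...) // copy so the caller's slice keeps its order
	sort.SliceStable(ranked, func(i, j int) bool { return ranked[i].Fee > ranked[j].Fee })

	left := make(map[string]float64, len(balances)) // working copy of the balances
	for k, v := range balances {
		left[k] = v
	}
	var picked []Tx
	for _, tx := range ranked {
		if len(picked) == limit {
			break
		}
		if cost := tx.Amount + tx.Fee; left[tx.From] >= cost {
			left[tx.From] -= cost // prevents spending the same funds twice in one block
			picked = append(picked, tx)
		}
	}
	return picked
}

// blockHash hashes TimeStamp, every TxHash, PreviousBlockAddress and Nonce.
func blockHash(timeStamp int64, txs []Tx, prev string, nonce int) string {
	var sb strings.Builder
	sb.WriteString(strconv.FormatInt(timeStamp, 10))
	for _, tx := range txs {
		sb.WriteString(tx.TxHash)
	}
	sb.WriteString(prev)
	sb.WriteString(strconv.Itoa(nonce))
	sum := sha256.Sum256([]byte(sb.String()))
	return hex.EncodeToString(sum[:])
}

// mine tries nonces 0, 1, 2, ... until the hash starts with the target prefix.
// difficulty must be between 0 and 16; large values take a very long time.
func mine(timeStamp int64, txs []Tx, prev string, difficulty int) (int, string) {
	target := "0123456789abcdef"[:difficulty] // "01" for 2, "012" for 3 (assumed pattern beyond that)
	for nonce := 0; ; nonce++ {
		if h := blockHash(timeStamp, txs, prev, nonce); strings.HasPrefix(h, target) {
			return nonce, h
		}
	}
}

func main() {
	pending := []Tx{
		{From: "bmm1", Amount: 10, Fee: 1, TxHash: "h1"},
		{From: "bmm2", Amount: 5, Fee: 3, TxHash: "h2"}, // rejected: bmm2 has no funds
		{From: "bmm1", Amount: 20, Fee: 2, TxHash: "h3"},
	}
	balances := map[string]float64{"bmm1": 100, "bmm2": 0}

	txs := selectTxs(pending, balances, 10)
	nonce, hash := mine(1700000000, txs, "prev-hash", 2)
	fmt.Println("selected:", len(txs), "nonce:", nonce, "hash:", hash)
}
```

The search is pure trial and error: nothing about a failed nonce helps predict the next one, which is why the expected work grows so quickly with the difficulty level.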
Notes on the differences and missing features
- The most important feature this repo lacks is asynchrony.
- A full-size blockchain application is like a chat app.
- The central system listens to all users, distributes work to miners and broadcasts messages to all parties involved, all at (roughly) the same time.
- To implement this feature, a networking layer such as WebSocket or a REST API is required.
- The other missing feature is the distributed-ledger nature of a blockchain system, which is related to the points above. Once the first miner declares successful mining, it immediately distributes the mined block to all miners for validation. The newly mined block is only included once enough miners acknowledge the victory.
- Transaction fee: the receiver bears the fee.
- Block size: I limit the number of selected Tx instead of the block size (easier to code). In the BTC protocol, the block size was originally limited to 1 MB (since SegWit, the limit is expressed as 4 million weight units); in Ethereum, the block size is subject to how many units of gas can be spent per block.
- Coinbase transaction/Block mining rewards
- In a BTC system, there is no explicit concept of sender or receiver; instead, a transaction contains inputs and outputs. A BTC coinbase transaction contains a single input. This input is not used, and it contains 32 zero bytes in place of the previous tx.
- The ETH system employs a state system (i.e., if A transfers some ETH to B, the balance state shifts). There is no coinbase transaction, so there is no transaction for the mining reward. The miner who successfully mines the block simply gets a reward (say, 2 ETH), and the state changes to reflect the balance increment in the miner's account.
- In this repo, I've created a coinbase account and allocated some money to it at the beginning. In each block, the first transaction is a transfer from the coinbase account to the miner's account. My hunch is that it is easier to understand this way (as if there were a central bank, which of course goes against the concept of decentralized finance).
- Building a full system requires much more than this mini example. In this repo I build the minimum, with the intention of understanding the core concepts.