Incremental DB
The incremental DB feature uses AWS Simple Storage Service (S3) to give miners and seed nodes an efficient way to fetch the blockchain data they need to join the network.
Background
Prior to this feature, the basic design involved uploading or syncing the entire persistence to an AWS S3 bucket at every Tx epoch. New nodes would then fetch the entire persistence from that bucket.
This would have been alright for all existing LevelDB databases, with the exception of the state database. This is because running aws-cli sync on state results in uploading all files in the database, which is time-consuming and not bandwidth efficient.
Uploading the state LevelDB at every Tx epoch is thus a bottleneck, and so incremental DB was introduced as the solution.
Note
It is practically possible that all files in stateDB get updated at every Tx epoch, if transactions in that particular epoch changed the states of addresses that somehow update TrieDB across all the files in the state LevelDB.
Implementation
Two scripts make up the building blocks for incremental DB.
Upload Incremental DB Script
The script uploadIncrDB.py runs on a lookup node managed by Zilliqa Research. It performs the following steps:
- Add Lock file to S3 bucket incremental
- Perform sync between the local persistence folder (i.e., within this lookup node) and incremental\persistence on AWS S3 every Tx epoch. More specifically, syncing is done according to different criteria based on the Tx epoch number. These are the possibilities:
  - At script startup
    - Clear both buckets, i.e., incremental and statedelta
    - Sync entire persistence (i.e., everything that exists in the folder, including state, stateroot, txBlocks, txnBodies, txnBodiesTmp, microblock, etc.) to bucket incremental
    - Clear all state deltas from bucket statedelta
  - At every Tx epoch whose number is a multiple of (INCRDB_DSNUMS_WITH_STATEDELTAS * NUM_FINAL_BLOCK_PER_POW), i.e., once every INCRDB_DSNUMS_WITH_STATEDELTAS DS epochs (where INCRDB_DSNUMS_WITH_STATEDELTAS and NUM_FINAL_BLOCK_PER_POW are constants from the constants.xml file)
    - Sync entire persistence to bucket incremental
    - Clear all state deltas from bucket statedelta
  - At all other Tx epochs
    - Sync entire persistence (excluding state, stateroot, contractCode, contractStateData, contractStateIndex) to bucket incremental
    - For the first Tx block within the DS epoch (e.g., 100, 200, 300, …), we don't need to upload state delta differences. Instead, the complete stateDelta LevelDB (composed as a tarball, e.g., stateDelta_100.tar.gz) is uploaded to S3 bucket statedelta
    - For other Tx blocks, we upload the state delta differences (composed as tarballs, e.g., stateDelta_101.tar.gz, stateDelta_102.tar.gz, ..., stateDelta_199.tar.gz) to S3 bucket statedelta
- Remove Lock file from S3 bucket incremental
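The epoch-based choice described in the steps above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code of uploadIncrDB.py: the names plan_upload and UploadPlan are hypothetical, and the full-sync condition is assumed to mean that the Tx epoch number is a multiple of INCRDB_DSNUMS_WITH_STATEDELTAS * NUM_FINAL_BLOCK_PER_POW.

```python
from dataclasses import dataclass

# Sub-folders left out of the sync on ordinary Tx epochs (from the list above).
STATE_DIRS = ("state", "stateroot", "contractCode",
              "contractStateData", "contractStateIndex")


@dataclass
class UploadPlan:  # hypothetical container, for illustration only
    sync_excludes: tuple = ()         # persistence sub-folders skipped by the sync
    clear_state_deltas: bool = False  # clear bucket statedelta in this round?
    state_delta_tarball: str = ""     # tarball uploaded to statedelta, if any
    full_state_delta: bool = False    # True: complete stateDelta DB, False: diff


def plan_upload(tx_epoch, dsnums_with_statedeltas, num_final_block_per_pow,
                at_startup=False):
    """Describe what one upload round would do for the given Tx epoch."""
    # Assumed meaning of the full-sync condition: a multiple of the product.
    full_sync_interval = dsnums_with_statedeltas * num_final_block_per_pow
    if at_startup or tx_epoch % full_sync_interval == 0:
        # Full persistence sync; earlier state deltas are cleared.
        return UploadPlan(clear_state_deltas=True)

    # Ordinary epoch: skip the state folders and upload a state delta tarball.
    first_in_ds_epoch = tx_epoch % num_final_block_per_pow == 0
    return UploadPlan(
        sync_excludes=STATE_DIRS,
        state_delta_tarball=f"stateDelta_{tx_epoch}.tar.gz",
        full_state_delta=first_in_ds_epoch,
    )
```

For example, with NUM_FINAL_BLOCK_PER_POW = 100 and an assumed INCRDB_DSNUMS_WITH_STATEDELTAS = 10, plan_upload(1000, 10, 100) selects a full sync, plan_upload(100, 10, 100) selects the complete stateDelta tarball, and plan_upload(101, 10, 100) selects a state delta difference.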
Download Incremental DB Script
The script downloadIncrDB.py is executed upon startup by every miner or seed node to get the latest blockchain data. It performs the following steps:
- Check if Lock file exists. Wait until no Lock file is found
- Clear existing local persistence folder, then download entire persistence data (except microblocks and txBodies for miner nodes) from S3 bucket incremental
- Check if Lock file has appeared after executing the previous step. If yes, return to the first step
- Clear existing local StateDeltasFromS3 folder, then download all state deltas from S3 bucket statedelta to StateDeltasFromS3 folder
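The lock check in these steps is a check, download, re-check loop. The sketch below shows that control flow only; it is not the code of downloadIncrDB.py. The function name and its three callable parameters are hypothetical stand-ins for the actual S3 operations, so no particular S3 client API is assumed.

```python
import time


def download_with_lock_check(lock_exists, download_persistence,
                             download_state_deltas, poll_seconds=1.0,
                             sleep=time.sleep):
    """Run the download steps, restarting if an upload began midway.

    lock_exists, download_persistence and download_state_deltas are
    caller-supplied callables (hypothetical placeholders for the S3 calls).
    Returns the number of persistence download attempts that were made.
    """
    attempts = 0
    while True:
        # Step 1: wait until no Lock file is present in the bucket.
        while lock_exists():
            sleep(poll_seconds)
        # Step 2: clear the local folder and download persistence.
        download_persistence()
        attempts += 1
        # Step 3: if a Lock file appeared meanwhile, the uploader was writing
        # during the download, so go back to step 1.
        if not lock_exists():
            break
    # Step 4: download all state deltas into StateDeltasFromS3.
    download_state_deltas()
    return attempts
```

Re-checking the lock after the download is what guards against fetching a persistence snapshot that the uploader modified partway through.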
Incremental DB Usage by a Joining Node
- Node uses the downloadIncrDB.py script to populate its persistence folder from S3 bucket incremental
- Node uses the same script to populate its StateDeltasFromS3 folder with all the state deltas from S3 bucket statedelta
- Node loads the contents of persistence and initiates syncup. At this point, node has a base state of, say, X
- Node then recreates the latest state using the state deltas in StateDeltasFromS3 (e.g., stateDelta_101.tar.gz, stateDelta_102.tar.gz, ..., stateDelta_199.tar.gz, stateDelta_200.tar.gz, stateDelta_201.tar.gz, ...)
- Using these files, the final state Y is computed as Y = X + x1 + x2 + ... + x99 + x100 + x101 + x102 + ..., where x1, x2, ... are the state deltas applied in Tx block order
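The computation of Y from X is a left fold over the state deltas in Tx block order. The sketch below illustrates that idea under stated assumptions: the function names are hypothetical, apply_delta is a caller-supplied placeholder for whatever the node does to apply one delta, and the file name pattern is taken from the examples above. The block number is parsed as an integer because plain string sorting would place stateDelta_100.tar.gz before stateDelta_99.tar.gz.

```python
import re

# File name pattern taken from the examples, e.g. stateDelta_101.tar.gz.
_DELTA_NAME = re.compile(r"^stateDelta_(\d+)\.tar\.gz$")


def ordered_state_deltas(filenames):
    """Return state delta tarball names sorted by Tx block number."""
    numbered = []
    for name in filenames:
        match = _DELTA_NAME.match(name)
        if match:  # ignore files that are not state delta tarballs
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def rebuild_state(base_state, deltas, apply_delta):
    """Fold the deltas onto the base state: Y = X + x1 + x2 + ..."""
    state = base_state
    for delta in deltas:
        state = apply_delta(state, delta)
    return state
```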
More information on new node joining can be found in the Rejoin Mechanism page.