Dataset. Bitcoin is the biggest digital currency based on so-called blockchain technology. A blockchain is a shared ledger (transaction log) consisting of blocks. New blocks can be appended to the bitcoin blockchain only after computationally intensive mining. The size of the current bitcoin blockchain is 350 GB (compressed), and we downloaded it for you to analyze.
B2: BlockChain Evolution. The transactions of the bitcoin blockchain form a temporal graph, as each transaction has a timestamp. We would like to learn about the evolution of blockchain usage over time (users, account balances, transactions, mined blocks, etc.). An intermediate step will be splitting the blockchain into multiple time-based graph snapshots.
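To make the snapshot step concrete, here is a minimal sketch in plain Python. It is not the project's implementation. The toy edge list, the flattening of each transaction into a single (timestamp, sender, receiver, amount) tuple, the helper names, and the choice of cumulative snapshots (snapshot k holds everything from the first k years) are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def ts(year, month=1, day=1):
    """Unix timestamp (UTC) for midnight on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


# Hypothetical toy data: (unix_timestamp, sender, receiver, amount_btc).
# Real bitcoin transactions have several inputs and outputs; reducing each
# one to a single sender/receiver edge is a simplifying assumption.
TRANSACTIONS = [
    (ts(2009, 2, 1), "a", "b", 50.0),
    (ts(2009, 8, 1), "b", "c", 10.0),
    (ts(2010, 3, 1), "c", "d", 5.0),
    (ts(2011, 6, 1), "a", "d", 1.0),
]


def cumulative_snapshots(transactions, start_year, num_years):
    """Map k -> edges with a timestamp before the start of year start_year + k."""
    snapshots = {}
    for k in range(1, num_years + 1):
        cutoff = ts(start_year + k)
        snapshots[k] = [t for t in transactions if t[0] < cutoff]
    return snapshots


def summarize(edges):
    """Count the distinct accounts (nodes) and transactions (edges) in a snapshot."""
    nodes = {n for _, s, r, _ in edges for n in (s, r)}
    return {"nodes": len(nodes), "edges": len(edges)}


SNAPSHOTS = cumulative_snapshots(TRANSACTIONS, start_year=2009, num_years=3)
for k, edges in SNAPSHOTS.items():
    print(k, summarize(edges))
```

Each snapshot is an ordinary edge list, so the same per-snapshot statistics can be computed for every k and compared across time. At the scale of the full blockchain, the same filter-by-timestamp idea would have to run on a distributed engine rather than in memory.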
Summary. Spark SQL and GraphX were used on top of the cryptoledger library to take {1,2,3,4,5,6,7}-year snapshots of the bitcoin blockchain and run a range of graph analysis algorithms on them. The project comes with an interactive visualization that shows the growth of the bitcoin blockchain and its changing network characteristics over time.
Data Curiosity: *** Paper Writing: *** Technical difficulties mastered: **** Visualization coolness: ****