
Topic: Problems with Tick-Level Data from Bitcoincharts.com (Read 1954 times)

Hi all,

I've been doing some research using transaction-level data provided by Bitcoincharts (http://api.bitcoincharts.com/v1/csv/) but have run into a problem. Trades that occur within the same second have identical Unix timestamps, so it's unclear which trade came first within that second. This becomes an issue when I compare datasets downloaded at different times. I downloaded the Bitstamp data in March, for instance, and again yesterday, and the order of transactions differs between the two datasets.

For instance, in the dataset through March 25, 2014, the trades with Unix timestamp 1395399739 have their prices listed in this order:
 
576.01, 575.85, 577, 576.02, 576.99, 576.01
 
But in the dataset through June 10, 2014, the same trades have their prices listed in this order:
 
576.01, 576.02, 575.85, 576.01, 577, 576.99
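
One way to confirm that the two listings hold the same trades regardless of order is to compare them as multisets. Here is a minimal sketch in Python using only the prices quoted above. The trade amounts aren't shown in this post, so they are left out, and the names `march`, `june` and `same_trades` are just illustrative:

```python
from collections import Counter

# Prices for Unix timestamp 1395399739, in the order each download lists them.
march = [576.01, 575.85, 577, 576.02, 576.99, 576.01]
june = [576.01, 576.02, 575.85, 576.01, 577, 576.99]

def same_trades(a, b):
    """True when both lists hold the same prices with the same multiplicities."""
    return Counter(a) == Counter(b)

print(same_trades(march, june))  # compares contents, ignoring order
print(march == june)             # compares contents and order
```

With real data it would probably be safer to compare (price, amount) pairs rather than prices alone, since two different trades can share a price.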

All I've done is extract the .gz file and import the delimited CSV into Stata. I haven't sorted the data or made any other changes, so I don't think the change in order is a result of anything on my end.

Clearly they are the same group of trades, but is there any way to verify the correct order? Otherwise it seems I have to collapse the data to one observation per second.
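
For reference, collapsing by second could look something like the sketch below. It assumes each row is a (timestamp, price, amount) triple, which is how I understand the Bitcoincharts CSV columns to be laid out. The function name `collapse_by_second` and the placeholder amounts of 1.0 are made up for illustration. It only computes statistics that don't depend on the order within a second (count, low, high, volume, VWAP) and deliberately skips open and close, since those are exactly what the ambiguous ordering would change.

```python
import math
from collections import defaultdict

def collapse_by_second(trades):
    """Aggregate (timestamp, price, amount) triples into one record per second.

    Only order-independent statistics are computed, so the result does not
    depend on how trades are ordered within a second.
    """
    groups = defaultdict(list)
    for ts, price, amount in trades:
        groups[int(ts)].append((float(price), float(amount)))

    bars = []
    for ts in sorted(groups):
        prices = [p for p, _ in groups[ts]]
        # math.fsum returns the correctly rounded sum, so the totals do not
        # vary with the order in which the trades are added up.
        volume = math.fsum(a for _, a in groups[ts])
        notional = math.fsum(p * a for p, a in groups[ts])
        bars.append({
            "timestamp": ts,
            "n_trades": len(prices),
            "low": min(prices),
            "high": max(prices),
            "volume": volume,
            "vwap": notional / volume if volume else None,
        })
    return bars

# Prices are from this post; the amounts of 1.0 are placeholders, not real data.
march = [(1395399739, p, 1.0) for p in (576.01, 575.85, 577, 576.02, 576.99, 576.01)]
june = [(1395399739, p, 1.0) for p in (576.01, 576.02, 575.85, 576.01, 577, 576.99)]

print(collapse_by_second(march) == collapse_by_second(june))
```

The obvious cost is that anything relying on the sequence of trades inside a second is lost.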

Thanks in advance for any insight!

EDIT: I've compared the data from bitcoincharts.com and SierraChart.com, and they are inconsistent. The order of trades within the same second differs, even though both sources contain the same observations in every second. What use is tick-level data compared to second-level data if no one knows the right order?!
