Topic: [0.2 BTC] Python coder, need solve issue (Read 2112 times)

legendary
Activity: 1512
Merit: 1028
legendary
Activity: 1974
Merit: 1029
It looks like the easiest way would be to get the entire history CSV files here first:

http://api.bitcoincharts.com/v1/csv/

These are updated daily (but might not be permanent).

As I said in the sierrachart bridge thread, I'm keeping a kind of mirror of all mtgoxusd data in my dropbox, with data segregated in daily files, e.g. https://dl.dropboxusercontent.com/u/24587684/mtgox-daily/2013/04/23. It goes back to 2010/08/17. It's automatically updated every 90 minutes, although since the process goes through my laptop, some updates will be skipped while I'm on the road.
hero member
Activity: 924
Merit: 501
lock the thread lock the thread :)

help design the next mt gox lucif....
full member
Activity: 124
Merit: 100
Now a maximum of 20000 trades will be returned
The API description says 2000, with a 15 min delay, which is going to be a challenge to work around, or maybe not.
legendary
Activity: 1512
Merit: 1028
Also, a big oops in the "accepted" code above: you can't just sort the data, because the lines are sorted as text, so trades that happen in the same second get reordered (effectively randomized) by price and then by trade amount. Also, you are sorting again and again for EVERY line that is read.

Raw data format:
Code:
1366740776,136.120010000000,0.010000000000
1366740774,136.120010000000,0.200000000000
1366740773,136.690000000000,0.843387830000
1366740773,136.600000000000,0.914900000000
1366740771,136.111650000000,0.020000000000
1366740770,136.100010000000,0.072978190000
1366740770,136.100010000000,0.020000000000
1366740770,136.120000000000,0.020000000000
1366740769,136.120000000000,0.020000000000
1366740769,136.120000000000,0.020000000000
1366740769,136.100010000000,0.020000000000
1366740769,136.600000000000,0.085100000000
1366740769,136.100010000000,0.020000000000
1366740768,136.100010000000,0.020000000000

fucked up data after that code:
>>> print list1
Code:
['1366740768,136.100010000000,0.020000000000',
'1366740769,136.100010000000,0.020000000000',
'1366740769,136.100010000000,0.020000000000',
'1366740769,136.120000000000,0.020000000000',
'1366740769,136.120000000000,0.020000000000',
'1366740769,136.600000000000,0.085100000000',
'1366740770,136.100010000000,0.020000000000',
'1366740770,136.100010000000,0.072978190000',
'1366740770,136.120000000000,0.020000000000',
'1366740771,136.111650000000,0.020000000000',
'1366740773,136.600000000000,0.914900000000',
'1366740773,136.690000000000,0.843387830000',
'1366740774,136.120010000000,0.200000000000',
'1366740776,136.120010000000,0.010000000000']

Reversing the data is correct, after the whole chunk has been read.

list.reverse()

    Reverse the elements of the list, in place.
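
To illustrate the difference, here is a minimal sketch using four lines copied from the raw data above. It is written for Python 3 (the feed itself is Python 2) and only shows how sorting and reversing treat trades that share a second. Note that list.reverse() returns None, so it has to be called on its own line before the loop rather than inside the for statement.

```python
# Four lines copied from the raw data above (newest first).
raw = """1366740770,136.100010000000,0.072978190000
1366740770,136.100010000000,0.020000000000
1366740770,136.120000000000,0.020000000000
1366740769,136.120000000000,0.020000000000"""

lines = [ln for ln in raw.split("\n") if ln]

# Text sort: ties within a second are broken by the price and amount strings.
by_sort = sorted(lines)

# Plain reversal: oldest first, with the feed's own order kept inside a second.
by_reverse = list(reversed(lines))

# In-place reversal gives the same order, but the call itself returns None.
result = lines.reverse()
```

Here by_sort and by_reverse both start with the 1366740769 trade, but the three trades at 1366740770 come out in a different order in each.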

hero member
Activity: 924
Merit: 501
* Viceroy bows to lucif for his graciousness saying
"Thank you for the tip, sir!"
legendary
Activity: 1512
Merit: 1028
It looks like the easiest way would be to get the entire history CSV files here first:

http://api.bitcoincharts.com/v1/csv/

These are updated daily (but might not be permanent). These could be saved to the hard drive in the py directory, and could be resumed or updated using http range requests if download is interrupted or if bridge is started later (only the tail should grow bytes). Only newly-seen trades need be written to the SCIDs while this is downloading. Then the history API could work forward from there in small re-sorted time chunks to catch up SCID from the last timestamp to current, throwing a "retry a smaller chunk" error if this gets >19000 trades.
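
A rough sketch of the resume idea, in Python 3 with only the standard library. The function names and the local file path are made up for illustration, and it assumes the server honours Range requests (answering 206) and that the CSV only ever grows at the tail, as described above. I have not checked either assumption against the real server.

```python
import os
import shutil
import urllib.error
import urllib.request


def build_resume_request(url, local_path):
    """Return a Request that asks only for the bytes the local file lacks."""
    have = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    req = urllib.request.Request(url)
    if have > 0:
        # Open-ended range: everything from byte offset `have` onward.
        req.add_header("Range", "bytes=%d-" % have)
    return req


def resume_download(url, local_path):
    """Append new bytes to local_path, or fetch it whole if needed."""
    req = build_resume_request(url, local_path)
    try:
        resp = urllib.request.urlopen(req)
    except urllib.error.HTTPError as err:
        if err.code == 416:  # requested range not satisfiable: nothing new
            return
        raise
    # 206 means the range was honoured, so append; 200 means a full body.
    mode = "ab" if resp.status == 206 else "wb"
    with resp, open(local_path, mode) as out:
        shutil.copyfileobj(resp, out)
```

If the server ignores the Range header and sends the whole file, the sketch simply overwrites the local copy, so the saved CSV stays consistent either way.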
sr. member
Activity: 462
Merit: 250
Clown prophet
hehe. congrats
hero member
Activity: 924
Merit: 501
* Viceroy opens eyes wide and turns to look while pushing open hand with palm up toward lucif while grinning and reluctantly agreeing with poster
legendary
Activity: 1512
Merit: 1028
The solution is not that simple.

The old chart API is simply mapped to the new API URL, which is why it doesn't work the same. The old API URL should have been killed to let you know why it doesn't work.

Was:
Code:
http://bitcoincharts.com/t/trades.csv?start=from_timestamp&end=99999999999999&symbol=mtgoxUSD

Now:
Code:
http://api.bitcoincharts.com/v1/trades.csv?start=1366740861&end=1366740980&symbol=mtgoxUSD

Requesting the "whole" history with a range 0-99999999999999 previously would retrieve all 187MB of trade data (just for mtgox) in chronological order (basically effecting a DDOS when people used this). A "restart" of sierrachartfeed would request data starting at the "last seen" timestamp to 99999999999999.

Now a maximum of 20000 trades will be returned, in reverse chronological order (newest to oldest). Requesting an end time of 999999999999 will always return the last 20000 trades (about a day's worth).

The solution is to download chunks of time ranges. You cannot request by trade number, only by a date range. This presents a challenge: you must request appropriate time ranges, then reassemble and sort them all into chronological order before writing to SCID; you must not request a range that may exceed 20000 trades (or you must refine the range if it does); and you must deal with duplicate results if your method produces them. You also cannot assume that, if you get all 20000 possible trades and the last trade you received was at time 1366740001, you can continue with another request ending at 1366740000: more trades may share that 1366740001 timestamp.
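
One possible way to sketch the chunking, in Python 3. The fetch callable, the function names and the one-hour starting window are all made up for illustration and are not part of sierrachartfeed. It also assumes that both ends of the requested range are inclusive, which I have not verified. Because the windows cover whole seconds and never overlap, a window that hits the limit is thrown away and retried at half the width instead of being continued from the last timestamp seen.

```python
# Base URL as quoted in the post above.
TRADES_URL = "http://api.bitcoincharts.com/v1/trades.csv"


def trades_url(symbol, lo, hi):
    """Build the request URL for one time window (query format as above)."""
    return "%s?start=%d&end=%d&symbol=%s" % (TRADES_URL, lo, hi, symbol)


def fetch_range(fetch, start, end, limit=20000, span=3600):
    """Return all trade lines with start <= timestamp <= end, oldest first.

    fetch(lo, hi) is a hypothetical callable that returns the CSV lines for
    one window, newest first, cut off at `limit` lines. Both window bounds
    are ASSUMED to be inclusive.
    """
    trades = []
    lo = start
    while lo <= end:
        hi = min(lo + span - 1, end)
        chunk = fetch(lo, hi)
        if len(chunk) >= limit:
            # The window may have been truncated, so it cannot be trusted.
            if hi == lo:
                raise RuntimeError("second %d alone reaches the limit" % lo)
            span = (hi - lo + 1) // 2  # retry with a window half as wide
            continue
        # Whole-second, non-overlapping windows: no second is split across
        # requests and no trade is fetched twice.
        trades.extend(reversed(chunk))  # flip newest-first to oldest-first
        lo = hi + 1
    return trades
```

In real use, fetch would download trades_url(symbol, lo, hi) and split the body into non-empty lines. The sketch never widens the window again after shrinking it, which keeps it short but means more requests than strictly necessary.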

Anyway, getting the data requires some intelligence to request and retrieve an appropriate amount of data each time; then you must put all 187 MB back in order and get it as current as possible before handing off to the live stream. I would have already billed .2 BTC equivalent for my time checking this out.
hero member
Activity: 924
Merit: 501
Never earned a tip before, thanks!  ;-)


144RtpxYKbigosiqTyXNwzwtH6Z7s96xUx


(lock thread else hungry coders will come ask for more)
sr. member
Activity: 462
Merit: 250
Clown prophet
Viceroy should get something for pointing Lucif to sort function  :)
Okay, okay. Give me an address. You earned 0.1 BTC.
sr. member
Activity: 462
Merit: 250
Clown prophet
Okay, I updated sierrachart... Lazy programmers. I have to do everything myself =)

A bit crappy solution, but it works.

https://github.com/pentarh/sierrachartfeed
hero member
Activity: 924
Merit: 501
Viceroy should get something for pointing Lucif to sort function  :)

sr. member
Activity: 462
Merit: 250
Clown prophet
April 23, 2013, 11:44:24 AM
#9
None is right. Lucif got 0.2 bounty. Hehe.

Quote
--- a/sierrachartfeed.py
+++ b/sierrachartfeed.py
@@ -29,7 +29,9 @@ def bitcoincharts_history(symbol, from_timestamp, volume_precision, log=False):
     url = '%s?start=%s&end=99999999999999&symbol=%s' % (BITCOINCHARTS_TRADES_URL, from_timestamp, symbol)
     #print url
     req = urllib2.Request(url)
-    for line in urllib2.urlopen(req).read().split('\n'):
+    list1 = urllib2.urlopen(req).read().split('\n')
+    list1.sort()
+    for line in list1:
         if not line:
             continue
        
legendary
Activity: 1400
Merit: 1000
April 23, 2013, 11:39:54 AM
#8
def bitcoincharts_history(symbol, from_timestamp, volume_precision, log=False):
    url = '%s?start=%s&end=99999999999999&symbol=%s' % (BITCOINCHARTS_TRADES_URL, from_timestamp, symbol)
    #print url
    req = urllib2.Request(url)
    for line in urllib2.urlopen(req).read().split('\n').reverse():
[EDIT]for line in urllib2.urlopen(req).read().split('\n').sort():

or, if that does not work, try adding parentheses ( ... )

for line in (urllib2.urlopen(req).read().split('\n')).reverse():
legendary
Activity: 1400
Merit: 1000
April 23, 2013, 11:27:28 AM
#7
def bitcoincharts_history(symbol, from_timestamp, volume_precision, log=False):
    url = '%s?start=%s&end=99999999999999&symbol=%s' % (BITCOINCHARTS_TRADES_URL, from_timestamp, symbol)
    #print url
    req = urllib2.Request(url)
    for line in urllib2.urlopen(req).read().split('\n').reverse():
sr. member
Activity: 462
Merit: 250
Clown prophet
April 23, 2013, 11:17:21 AM
#6
Okay, I forked this feed
https://github.com/pentarh/sierrachartfeed

Submit a patch to this repository, and if it works, you get 0.2 BTC.

I repeat: in the function bitcoincharts_history, the data must be in guaranteed ascending order BEFORE updating the SCID.
sr. member
Activity: 462
Merit: 250
Clown prophet
April 23, 2013, 11:14:23 AM
#5
You can completely rewrite all of https://github.com/slush0/sierrachartfeed in Perl if you like
hero member
Activity: 924
Merit: 501
April 23, 2013, 11:11:12 AM
#4
And why do you need it in Python? Can't Perl work?
(solution should be ... you just go to a different website address for the data pull, no?)

Step 1, I read API for Mt Gox Data
http://bitcoincharts.com/about/markets-api/

http://api.bitcoincharts.com/v1/markets.json

MUHUHAHAHAHAHA!
