Pages:
Author

Topic: New method to pull MtGox trade data - page 2. (Read 18586 times)

legendary
Activity: 1106
Merit: 1026
May 30, 2013, 06:22:48 PM
#19
1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need:
Date,Time,Price,Volume (in Satoshis)
yyyy-mm-dd,hh:mm:ss,65000000000

Step 1: Go to the sign up website (picture)

Code:
https://bigquery.cloud.google.com/

Step 2: Create project.. (picture)

Step 3: Enable "BigQuery API" (picture)

Step 4: If you agree, accept the ToS (picture)

Step 5: Compose a query (picture)

Code:
SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time, price/100000 as price, amount
FROM [mt-gox:mtgox.trades]
WHERE Currency__ = "USD"
AND date < "2013-05-19 17:30:00"
ORDER BY date DESC, time DESC LIMIT 16000;

Step 6: Edit the query for your needs and after processing you should be able to download the data as csv.

Code:
date,time,price,amount
2013-05-19,17:29:46,120.88,18501496
2013-05-19,17:29:45,120.83,84148500
2013-05-19,17:29:42,120.83,50000000
2013-05-19,17:28:57,120.12501,1138895
...

If you wish, delete the row with the date. This will return the newst transactions (or the newst before the 23rd of may).

The maximum number of results is limited to 16000, otherwise it would contain too many rows for direct download, but you can repeat step 5.
sr. member
Activity: 287
Merit: 250
May 30, 2013, 03:29:18 PM
#18
The query language for BigQuery is pretty similar to SQL. You can just ask bigQuery to return the date data in the format you want:

For everyone who needs timestamp instead of a date string:

Code:
SELECT PARSE_UTC_USEC(date) as timestamp FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1

This will return:

Code:
Row timestamp
1 1359647088000000

For everyone who needs the date and the time data in separated columns:

Code:
SELECT DATE(TIMESTAMP(date)) as date, TIME(TIMESTAMP(date)) as time FROM [mt-gox:mtgox.trades] WHERE price = 2122000 LIMIT 1

This will return:

Code:
Row date time
1 2013-01-31 15:44:48

Edit, here is a list of stuff you can do related to date/time.
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 30, 2013, 12:34:18 PM
#17
What data format do you need? I'm happy to help. Smiley

1. At the moment I only need BTCUSD trade tick data (no need for Bid and / or Ask data). For a CSV file (I will later import this data into my application) I need:
Date,Time,Price,Volume (in Satoshis)
yyyy-mm-dd,hh:mm:ss,65000000000

2. In future I will need / want / beg for a mechanism / application that will update the CSV file with missing historical data and will update real-time data from MtGox. I will later map this CSV file (ASCII mapping) with my charting application to dynamically connect the constantly updating BTCUSD file to this charting application.
legendary
Activity: 1106
Merit: 1026
May 30, 2013, 09:24:04 AM
#16
Wow, the query speed is impressive. Great! Thank you for sharing.

Edit:

This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

The data given is already enough for all those purposes. What data format do you need? I'm happy to help. Smiley


@nitrous: Currently the date has the data type string. I suggest to change it to timestamp. String operations are slow, take much more storage and something like DAYOFWEEK(date) doesn't work.
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 29, 2013, 07:24:56 PM
#15
Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free.
So I spotted this thread right on time.

Of course I always welcome donations Tongue but most of the bitcoin work I do/am doing is open-source.
I have no problems with donations. Actually I got interested in Bitcoin because I wanted to make a donation (to a philosopher). But I like programmers too.

Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.

Fantastic Smiley
sr. member
Activity: 246
Merit: 250
May 29, 2013, 07:10:39 PM
#14
I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users.

Fantastic  Smiley

On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?

I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows.

Well I'm already creating the underlying mechanism of the app for my own personal use, and I am more than happy to do the little bit of extra work to make it cross-platform for free. Of course I always welcome donations Tongue but most of the bitcoin work I do/am doing is open-source. Hopefully I can release a self-contained python app which won't need you to already have installed python so that it's as intuitive as possible to use.
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 29, 2013, 07:03:03 PM
#13
I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users.

Fantastic  Smiley

On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

This would be for MtGox to decide if they are inclined to incur this additional cost. There are tens of thousand of third party charting (I am not talking about toys like MT4) software's users: forex and futures and stock traders like myself who, in the future when Bitcoin goes mainstream, would potentially be excluded from being able to load MtGox data into their apps, if MtGox decides not to to avail two additional fields to collect.

I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?

I do not know how much would creating this application cost. I hope crowd financing might work (with a little help from MtGox). Yes, I am using Windows.
sr. member
Activity: 246
Merit: 250
May 29, 2013, 06:50:53 PM
#12
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work?

I will ask him when I next catch him, if what you're saying is true then it would make a lot of sense to include all three to accommodate all users. On the other hand, Google I believe charges for the amount of data uploaded, so that may factor into their decision as it may seem a bit redundant.

I don't think google BigQuery makes it very easy for non-programmers to actually download bulk data from them, and I haven't had a chance to read up on using it programmatically yet. Once I have though, it would be really easy for me to make a CSV as you asked for though, and perhaps I could even make a small application to keep a local database updated from the BQ servers. What platform are you on? I assume from your screenshot it is Windows?
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 29, 2013, 06:11:43 PM
#11
Just look, there is an application called Sierra Chart (I think some trials were made with this app to accept MtGox data) that has Date and Time separated for intraday data in CSV format http://www.sierrachart.com/index.php?l=doc/doc_TextCSVDataFormat.html#ExampleIntraday
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 29, 2013, 06:00:25 PM
#10
The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

Just a thought: could 2 additional (to already existing Date) columns / fields to collect be added to the API: Date_Only and Time_Only (derived from Date)? Would this be something that requires a lot of work?
sr. member
Activity: 378
Merit: 250
Born to chew bubble gum and kick ass
May 29, 2013, 05:20:47 PM
#9
    Does the application not allow any way of doing this directly?

    Unfortunately not.

    • You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).

    I wouldn't have the courage to bother him  Smiley

    • When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
    I have no programming experience. Out of curiosity, would it be possible somehow get the data into a CSV file in my computer that would be updated real-time? I would not like to bother you with this. I only want to know if this is feasible.

    • You could try doing a SQL query along the lines of:
    Code:
    SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10;
      Anything more complicated than Word or Excel overwhelms my circuits  Cheesy[/list]
      newbie
      Activity: 42
      Merit: 0
      May 29, 2013, 05:18:08 PM
      #8
      Thanks, this is great info!
      sr. member
      Activity: 246
      Merit: 250
      May 29, 2013, 05:02:48 PM
      #7
      The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

      Thank you very much for the information.

      Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged.

      Does the application not allow any way of doing this directly?

      If not, you have a few different options:

      • You can ask MagicalTux and see if he'll consider it for when he begins regular updates (although that would be 1 or 2 weeks away before you see any actual changes).
      • When I am able to write a program to download the entire database, I will release the code. If you have any programming experience, it should be relatively simple to then modify the data to your needs before input into your application. If not, I may be able to convert it to any format you need, however I'm not too sure of the logistics of then giving you the database.
      • You could try doing a SQL query along the lines of:
      Code:
      SELECT LEFT(Date,10),RIGHT(Date,8) FROM [mt-gox:mtgox.trades] LIMIT 10;
        N.B. though, that BigQuery may not let you download all the data at once using this method.[/li]
      sr. member
      Activity: 378
      Merit: 250
      Born to chew bubble gum and kick ass
      May 29, 2013, 04:10:04 PM
      #6
      The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".

      Thank you very much for the information.

      Is it possible (if it's not too late) to separate Date from Time? Most widely used trading and charting applications for retail traders (Multicharts or Ninjatrader and others) will not be able to use the data in the format where date and time are merged.

      sr. member
      Activity: 246
      Merit: 250
      May 29, 2013, 03:53:42 PM
      #5

      DateThe date of the trade in ISO format.str


      Great  Grin because my charting app does not accept Unixtime.

      Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing?

      The dates are formatted as strings and include the time, e.g. "2013-03-12 23:52:16". So the format is "yyyy-mm-dd hh:mm:ss".
      Here's a sample of the database:

      Money_Trade__PrimaryCurrency__Bid_User_Rest_App__Ask_User_Rest_App__TypePropertiesItemAmountPriceDate
      1363132336180895trueUSDe7b6384e-3981-42e7-b3b7-7719c4dc74a5nullbidmarketBTC2227667644350002013-03-12 23:52:16
      1363132369821802trueUSDnullnullbidlimitBTC817406644350002013-03-12 23:52:49
      1363132372700876trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC6425087344343512013-03-12 23:52:52
      1363132372786101trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC817406644339102013-03-12 23:52:52
      1363132386679941trueUSDnulle75e9f56-4b37-4440-9b10-b56fc6a47ee1asklimitBTC16000000044147172013-03-12 23:53:06
      sr. member
      Activity: 378
      Merit: 250
      Born to chew bubble gum and kick ass
      May 29, 2013, 03:29:25 PM
      #4

      DateThe date of the trade in ISO format.str


      Great  Grin because my charting app does not accept Unixtime.

      Okay, we have Date like ''2013-05-28''. Isn't Time like ''hhmmss'' missing?
      sr. member
      Activity: 287
      Merit: 250
      May 29, 2013, 03:02:57 AM
      #3
      This is really sweet, thanks!
      vip
      Activity: 608
      Merit: 501
      -
      May 29, 2013, 01:52:56 AM
      #2
      Just for info, automatic update of these data is scheduled to start this week or next week
      sr. member
      Activity: 246
      Merit: 250
      May 29, 2013, 01:41:52 AM
      #1
      Hi there,

      As I've mentioned before, the extra load caused by people downloading the entire MtGox trade history from the API is unsustainable, and recently I published my own dump to reduce the load on the servers. However, since the 23rd of May, MagicalTux has uploaded a full dump to Google's BigQuery service. This database contains trades for all currencies and contains all information available by the API (and more). The service isn't yet fully integrated, however soon it will start being updated automatically every 10mins-1hr. Until then, you can still access all data before the 23rd of May, and it is the recommended method.

      To gain access to it, follow these instructions:

      • First, you need to sign up for Google's BigQuery - see here for instructions on doing so.
      • Now that you are signed up, go to the BigQuery console: https://bigquery.cloud.google.com
      • Now each time you go to the console, you will need to add the MtGox data set -- click on the drop down array next to API Project on the lefthand sidebar.
      • Go to 'Switch to project' > 'Display project...'
      • In the resulting popup box (Add Project), enter the Project ID, 'mt-gox'.
      • In the sidebar, mt-gox:mtgox should appear. Click on it to reveal the tables, which currently consists of 'trades'.

      Now you can perform SQL queries on this using [mt-gox:mtgox.trades] for the table name, e.g.:
      Code:
      select * from [mt-gox:mtgox.trades] where Currency__='GBP';

      The fields mean the following:
      FieldDescriptionType
      Money_Trade__This is the tid value, a unique id for the order. The first trades used (almost) sequential integers up to 218868, the majority of trades however came after this (~6 million) and use a Unix microstamp instead.int
      PrimaryWhether the trade happened in the currency specified (see Currency__). Some trades may appear on multiple tickers, so to ignore duplicates you may want to `select where primary is true'.bool
      Currency__The symbol of the currency against which BTC was bought or sold.str
      Bid_User_Rest_App__App ID of the bidding party (the person buying BTC)*str
      Ask_User_Rest_App__App ID of the asking party (the person selling BTC)*str
      Typebid or ask (bid means to buy BTC by selling the specified currency, ask means to buy the specified currency by selling BTC)str
      PropertiesA comma separated list of properties of the order, e.g. limit,market,mixed_currencystr
      ItemThe thing being bought or sold, always BTC.str
      AmountThe volume of the item (BTC) being bought/sold in satoshis (1 BTC = 100,000,000 satoshis).int
      PriceThe price of each item in the currency specified. For most currencies, this is given in units of 0.00001 of the currency, so $12.3456 would be 1,234,560. Some currencies, currently JPY and SEK use units of 0.001 instead.int
      DateThe date of the trade in ISO format.str

      * These fields represent the app ID when an app is being used to make the trade. Currently, these are null for all but 4% of trades. There are two fields because there are two parties to every trade, one who is buying the BTC (the bidder) and another who is selling it (the asker). For most of these 4% app trades, only one of the parties is an app, however there are also ~3500 trades for which both parties are apps and both of these fields are non-null.

      I plan on writing a python program soon to download a dump of this data straight from Google to a local sqlite3 database. Once you are able to obtain this data, feel free to then keep it up-to-date using the API methods (see here for info from my own documentation to do this using the v2 api). MagicalTux has confirmed that this usage is permissible, as the load on the API is minimised this way.

      Thanks,
      nitrous
      Pages:
      Jump to: