
Topic: [BEGINNER WORKSHOP]: bitcoincharts + postgres = cool sql queries (Read 5743 times)

hero member
Activity: 784
Merit: 1000
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

this thread could be interesting for you: https://bitcointalksearch.org/topic/distribution-of-bitcoin-wealth-by-owner-316297

Yeah, that's cool, thanks.
donator
Activity: 2772
Merit: 1019
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

this thread could be interesting for you: https://bitcointalksearch.org/topic/distribution-of-bitcoin-wealth-by-owner-316297
donator
Activity: 2772
Merit: 1019
api url has changed to http://api.bitcoincharts.com/v1/csv/mtgoxUSD.csv, fixed op, ignore previous post.
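
For anyone scripting the full download against the new url, here is a minimal sketch. The function names csv_url and fetch_csv are made up for illustration, and it is only an assumption that symbols other than mtgoxUSD follow the same url pattern.

```shell
#!/bin/bash
# Base of the CSV url given in the post above.
API_BASE="http://api.bitcoincharts.com/v1/csv"

# Build the CSV url for a symbol (hypothetical helper).
# Assumption: other symbols use the same pattern as mtgoxUSD.
function csv_url() {
  echo "${API_BASE}/$1.csv"
}

# Download the full trade history for a symbol into <symbol>.csv.
# -s: silent, -f: fail on HTTP errors instead of saving the error page.
function fetch_csv() {
  curl -s -f -o "$1.csv" "$(csv_url "$1")"
}
```

Calling fetch_csv mtgoxUSD would write mtgoxUSD.csv into the current directory, which can then be loaded with \copy as in the op.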
donator
Activity: 2772
Merit: 1019
EDIT: ignore this post, see next post

tcatm has limited the trade api. max 20000 trades can be downloaded per request.

here's a bash script that updates the local trades table

Code:
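#!/bin/bash
# Incrementally extends the local "trades" table from the bitcoincharts trades API.

db="mtgox"
user="postgres"
symbol="mtgoxUSD"
# -q: quiet, -t: rows only, -A: unaligned output so results carry no padding
PSQL="psql -q -t -A -U ${user} ${db}"
DL="curl -s"

# Run one statement through psql; its output ends up in $rc.
function sql() {
  rc=$(echo "$1" | $PSQL)
}

function extend() {
  # staging table for each downloaded chunk
  sql "drop table if exists import;"
  sql "create table import (id serial, unixtime int, price numeric(32,10), volume numeric(32,8), type smallint);"

  len=$(( 4 * 60 * 60 ))  # chunk length in seconds (4 hours)
  while true; do
    # continue from the newest trade already in the table
    sql "select max(unixtime) from trades;"
    start=$rc
    if [ -z "$start" ]; then
      echo "trades table is empty, do the initial import first"
      exit 1
    fi
    start_human=$(date -d "@$start")
    echo "--- $start_human ----------------------------------------------------------"
    end=$(( start + len ))
    echo "start ($start) + len ($len) = end ($end)"

    # download; the first line is expected to hold the newest trade, the last line the oldest
    $DL "http://bitcoincharts.com/t/trades.csv?symbol=$symbol&start=$start&end=$end" > trades.csv
    end_file=$(head -n 1 trades.csv | cut -d , -f 1)
    start_file=$(tail -n 1 trades.csv | cut -d , -f 1)
    if [ -z "$start_file" ]; then
      echo "download returned no trades, exiting"
      exit 1
    fi
    echo "start_file ($start_file) - start ($start) = $(( start_file - start ))"
    echo "end_file ($end_file) - end ($end) = $(( end_file - end ))"
    if [ "$start_file" -eq "$start" ]; then
      echo "start times match, updating trades table"
      # put into import table
      sql "delete from import;"
      sql "\copy import(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;"
      # replace everything from $start on, so the overlapping second is not duplicated
      sql "delete from trades where unixtime >= $start;"
      # "order by id desc" reverses the file order, so the oldest trades are inserted first
      sql "insert into trades (unixtime, t, price, volume) select unixtime, TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second', price, volume from import order by id desc;"
      if [ "$start_file" -eq "$end_file" ]; then
        echo "end detected, sleeping for 10 minutes,...."
        sleep 10m
      fi
    else
      echo "start_file != start, exiting, check code"
      exit 1
    fi
  done
}

extend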

it's quite fresh (use accordingly), currently in the process of syncing my table. It uses 4-hour blocks.

don't know if it quits nicely yet when done ;)

hoping someone can use it.

EDIT: added sync-end-detection. will sleep 10 minutes, then continue to sync.

EDIT2:

hmm, it stops prematurely. I think something is weird with the bitcoincharts api:

neither this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&start=1365709116&end=1365710916
nor this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&end=1365710916

deliver any data (except one trade in the first case) while I think they should.

while this: http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&start=1365709116

delivers the most recent trade data (20000 most recent trades) ignoring "start" (at least that is consistent with the api docs).

the api docs: http://bitcoincharts.com/about/markets-api/ say I should only use the "end" parameter. That doesn't work, though (http://bitcoincharts.com/t/trades.csv?symbol=mtgoxUSD&end=1365710916 doesn't deliver)

ideas?
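
One thing a 4-hour window cannot guarantee is that a chunk stays under the 20000-trade limit mentioned at the top of this post. A minimal sketch of a check for that, assuming a capped response simply contains 20000 rows; chunk_hit_cap is a made-up helper name and not part of the script above.

```shell
#!/bin/bash
# Per-request limit quoted in this post.
MAX_TRADES=20000

# Hypothetical helper: succeeds (exit status 0) when the CSV holds at least
# MAX_TRADES rows, i.e. the response was probably cut off at the limit.
function chunk_hit_cap() {
  local file="$1"
  local rows
  # awk counts the last line even if it lacks a trailing newline
  rows=$(awk 'END { print NR }' "$file")
  [ "$rows" -ge "$MAX_TRADES" ]
}
```

In the download loop this could be used as "if chunk_hit_cap trades.csv; then len=$(( len / 2 )); fi" to retry the same start time with a shorter window.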
hero member
Activity: 784
Merit: 1000
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

Yeah, I'm generally interested. Just need a couple of quiet hours and the muse to kiss me. Been thinking about this.

this http://statistics.ecdsa.org/ would tell us the info if it was up-to-date. Maybe we should pester ThomasV (?).


Yeah, I have been using this, it's great but a bit outdated.
donator
Activity: 2772
Merit: 1019
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(

Yeah, I'm generally interested. Just need a couple of quiet hours and the muse to kiss me. Been thinking about this.

this http://statistics.ecdsa.org/ would tell us the info if it was up-to-date. Maybe we should pester ThomasV (?).
hero member
Activity: 784
Merit: 1000
Wonder if you would be interested in estimating early adopters' reserves again? I have not finished downloading the blockchain yet. :(
donator
Activity: 2772
Merit: 1019
the params in that datatype are precision and scale: precision is the total number of digits (ie including decimals), scale is the number of decimals

with the current bitcoin spec 16,8 is enough to store the largest theoretically possible transaction amount of 21 million, and the smallest value of one satoshi.

The datatype precision and scale only affect the stored value; aggregate functions operate correctly regardless of the precision and scale of the datatype (within the limits of the architecture).

eg the max value that can be stored in the field is 99999999.99999999; if you sum this with 0.00000001 then the result is 100000000 (ie no overflow occurs)

Good to know. Thanks. Changed the op to use 16,8
legendary
Activity: 2576
Merit: 1087
the params in that datatype are precision and scale: precision is the total number of digits (ie including decimals), scale is the number of decimals

with the current bitcoin spec 16,8 is enough to store the largest theoretically possible transaction amount of 21 million, and the smallest value of one satoshi.

The datatype precision and scale only affect the stored value; aggregate functions operate correctly regardless of the precision and scale of the datatype (within the limits of the architecture).

eg the max value that can be stored in the field is 99999999.99999999; if you sum this with 0.00000001 then the result is 100000000 (ie no overflow occurs)
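
The digit count behind 16,8 can be checked with plain shell arithmetic. This is only a sketch of the reasoning above (21 million coins, 8 decimal places per coin); the variable names are made up.

```shell
#!/bin/bash
# Largest amount: the 21 million coin limit; smallest unit: one satoshi,
# with 100000000 satoshi per coin (8 decimal places).
max_btc=21000000
satoshi_per_btc=100000000
max_satoshi=$(( max_btc * satoshi_per_btc ))

# Digits before the decimal point, digits after it, and the total.
int_digits=${#max_btc}
scale=$(( ${#satoshi_per_btc} - 1 ))
total_digits=${#max_satoshi}

echo "integer digits: $int_digits, decimals: $scale, total: $total_digits"
# prints: integer digits: 8, decimals: 8, total: 16
```

So 8 digits before the decimal point plus 8 after it gives a total of 16, which is where the 16,8 comes from.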
donator
Activity: 2772
Merit: 1019
EDIT4: anyone knowledgeable with postgres have a suggestion what datatype to best use for the monetary values?

NUMERIC(16,8) should do it




In case postgres uses that datatype for aggregate functions, 8 places before the decimal point will be enough.
legendary
Activity: 2576
Merit: 1087
EDIT4: anyone knowledgeable with postgres have a suggestion what datatype to best use for the monetary values?

NUMERIC(16,8) should do it


legendary
Activity: 1176
Merit: 1010
Borsche

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoincharts.com api takes a start-time (unix time), which should be set to the result of "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (whether or not it overwrites data or how that can be configured)

Ok so that's easy then, as "copy from" would append data to the table without touching existing rows.
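
Since the copy only appends, any rows in the downloaded file that are already in the table would end up duplicated. A minimal sketch of trimming the CSV before loading it, assuming the unixtime,price,volume column order used with \copy in this thread; rows_from is a made-up helper name.

```shell
#!/bin/bash
# Hypothetical helper: print only the CSV rows whose first field (unixtime)
# is at or after the given cutoff. The "+ 0" forces a numeric comparison.
function rows_from() {
  local cutoff="$1" file="$2"
  awk -F, -v cutoff="$cutoff" '$1 + 0 >= cutoff + 0' "$file"
}
```

With the cutoff set to the result of "select max(unixtime) from trades;", the output of rows_from "$cutoff" trades.csv still contains the trades from that last second, so those rows would have to be deleted from the table (unixtime >= cutoff) before the trimmed file is appended.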
donator
Activity: 2772
Merit: 1019
Molecular, what programs did you use for trading? Also, could you give me your donation address please?

used? you mean traidor.py?

1JANa7gQ2VE7Wkv3o4917ECy8NWYtUx5F5
hero member
Activity: 784
Merit: 1000
Thanks, useful. How do you fetch updates, or do you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoincharts.com api takes a start-time (unix time), which should be set to the result of "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (whether or not it overwrites data or how that can be configured)

How big is the CSV file? Also I don't recommend people updating their databases too frequently, bitcoincharts is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.

Molecular, what programs did you use for trading? Also, could you give me your donation address please?
donator
Activity: 2772
Merit: 1019
Thanks, useful. How do you fetch updates, or do you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoincharts.com api takes a start-time (unix time), which should be set to the result of "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (whether or not it overwrites data or how that can be configured)

How big is the CSV file? Also I don't recommend people updating their databases too frequently, bitcoincharts is already under quite a bit of load and went 503 from time to time.

142 MB

tcatm recently told me at ccc it wasn't a problem for the server. I'm all for figuring out a way to do updates, though.
hero member
Activity: 784
Merit: 1000
Thanks, useful. How do you fetch updates, or do you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoincharts.com api takes a start-time (unix time), which should be set to the result of "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (whether or not it overwrites data or how that can be configured)

How big is the CSV file? Also I don't recommend people updating their databases too frequently, bitcoincharts is already under quite a bit of load and went 503 from time to time.
donator
Activity: 2772
Merit: 1019
Thanks, useful. How do you fetch updates, or do you recreate the full table every time?

I usually only ever do this like once a month.

If someone makes a script or something, please share. I'd do it for money because my own need for it is low ;)

The bitcoincharts.com api takes a start-time (unix time), which should be set to the result of "select max(unixtime) from trades;". I'm not sure how the \copy command behaves (whether or not it overwrites data or how that can be configured)

legendary
Activity: 1176
Merit: 1010
Borsche
Thanks, useful. How do you fetch updates, or do you recreate the full table every time?
hero member
Activity: 784
Merit: 1000
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2: corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgeable with postgres have a suggestion what datatype to best use for the monetary values?

Thanks, I guessed this would be a floating point problem. :)
donator
Activity: 2772
Merit: 1019
Sorry, I think there is something wrong with the 2010 volume in USD...
Hmm? You do realize BTC was worth nothing at that time?

Never below $0.01 after Gox opened, $0.00001 something is outright impossible

yes, you're correct. something wrong... might also affect the other years... checking.

EDIT: oh goddamnit, I accidentally used an old script for this. The datatypes are wrong, everything was imported as integer with 0 decimal places.

I'm sorry. I will fix this... will take a while.

EDIT2: corrected OP. you can fix things by doing this:

Code:
mtgox=# drop table trades;
DROP TABLE
mtgox=# create table trades (id serial, unixtime int, t timestamp, price numeric(32,10), volume numeric(32,8));
NOTICE:  CREATE TABLE will create implicit sequence "trades_id_seq" for serial column "trades.id"
CREATE TABLE
mtgox=# \copy trades(unixtime,price,volume) from 'trades.csv' delimiters ',' csv;
mtgox=# update trades set t = TIMESTAMP 'epoch' + unixtime * INTERVAL '1 second';
UPDATE 3563178
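
To spot-check the t values that the update statement fills in, the same unixtime conversion can be done in the shell. A minimal sketch, assuming GNU date (for the "@seconds" syntax); epoch_to_utc is a made-up name.

```shell
#!/bin/bash
# Convert a unix timestamp to a UTC "YYYY-MM-DD HH:MM:SS" string.
# Relies on GNU date's "@seconds" input format.
function epoch_to_utc() {
  date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 0   # prints 1970-01-01 00:00:00
```

Picking a few rows with "select unixtime, t from trades limit 5;" and running each unixtime through epoch_to_utc should give the same wall-clock value as t, since TIMESTAMP 'epoch' plus the interval is a UTC time without time zone.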

EDIT3: fixed my other 2 posts containing queries. The values for the years >=2011 changed only "slightly".

EDIT4: anyone knowledgeable with postgres have a suggestion what datatype to best use for the monetary values?