
Topic: Your bets for 2014 - page 2. (Read 6867 times)

legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:44:08 AM
#67
Guys, come on :) You are talking with a high-load architect and Unix sysadmin :)

Who doesn't believe that optimal search is log(n)? ::)
legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:43:35 AM
#66
These are my 7200 RPM Seagate's stats while running Bitcoin. Regular disks have lower rates.

Quote
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda              50,67     0,00  122,00    2,00   690,67     6,67    11,25     1,02    8,27    8,09   19,17   7,47  92,60
sda              51,33     5,67  124,33   22,67   702,67   105,33    10,99     1,18    8,00    7,88    8,62   6,32  92,97
sda             100,33     2,67  123,33   29,33   894,67   113,33    13,21     1,14    7,48    8,05    5,10   6,10  93,17
sda              51,00     0,00  111,33   35,67   664,00   132,00    10,83     1,13    7,66    8,52    4,96   6,35  93,37

As you can see, it's about 100 requests per second with the heads almost fully busy.

And this is a cached sequential read, while caching still helps because there isn't much data in the DB yet. The first column shows how many requests were merged.

For the third time, quit using the old code.  We know it's shitty.  The new code is already available and barely touches the disk.
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 04:43:17 AM
#65
Guys, come on :) You are talking with a high-load architect and Unix sysadmin :)
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 04:40:17 AM
#64
These are my 7200 RPM Seagate's stats while running Bitcoin. Regular disks have lower rates.

Quote
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda              50,67     0,00  122,00    2,00   690,67     6,67    11,25     1,02    8,27    8,09   19,17   7,47  92,60
sda              51,33     5,67  124,33   22,67   702,67   105,33    10,99     1,18    8,00    7,88    8,62   6,32  92,97
sda             100,33     2,67  123,33   29,33   894,67   113,33    13,21     1,14    7,48    8,05    5,10   6,10  93,17
sda              51,00     0,00  111,33   35,67   664,00   132,00    10,83     1,13    7,66    8,52    4,96   6,35  93,37

As you can see, it's about 100 requests per second with the heads almost fully busy.

And this is a cached sequential read, while caching still helps because there isn't much data in the DB yet. The first column shows how many requests were merged.
legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:38:06 AM
#63
Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.
O(log u) - proof pls.

No need for proof, it's common knowledge.   Just look on the top right.

Not the best example... binary search trees are only log(n) on average, with a worst case of n. B-trees are log(n) even in the worst case.
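
As an illustration of the log(n) point, a minimal Python sketch (not from any poster; the sorted list of fake txids merely stands in for an index such as a B-tree):

Code:
# Binary search over a sorted index does at most ceil(log2(n)) comparisons,
# so even n = 10**12 entries need only ~40 comparisons per lookup.
import bisect
import math

# Stand-in for an unspent-output index keyed by (fake) txid.
txids = sorted("tx%08d" % i for i in range(1_000_000))

def lookup(txid):
    """Return True if txid is in the index, using O(log n) comparisons."""
    i = bisect.bisect_left(txids, txid)
    return i < len(txids) and txids[i] == txid

print(lookup("tx00012345"))          # True
print(math.ceil(math.log2(10**12)))  # 40 comparisons suffice for a trillion entries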
legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:36:33 AM
#62
Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.  Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second.  Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially.  Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity.  And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels.

Your move.
O(log u) - proof pls.

AFAIK, a spinning drive delivers about 100 IOPS.

http://en.wikipedia.org/wiki/B-tree

And I'll give you the point on 100 IOPS for low-end consumer drives. IOPS is a better measure than seeks. Good drives can get you close to 200, and SSDs blow the 2000 figure I used earlier out of the water: http://en.wikipedia.org/wiki/IOPS#Examples
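
To put numbers on this exchange, a back-of-the-envelope sketch in Python (the 1 trillion outputs, 16 tx/s, 3 inputs per transaction and the IOPS values are the thread's rough figures; counting every tree level as a disk read is deliberately pessimistic, since the upper levels of a B-tree would normally stay cached in RAM):

Code:
# Worst-case random reads per second for verifying inputs against a huge index.
import math

unspent           = 10**12  # 1 trillion unspent outputs (the thread's extreme case)
tx_per_second     = 16      # ~10k transactions per 10-minute block
inputs_per_tx     = 3       # assumed in the thread
levels_per_lookup = math.ceil(math.log2(unspent))  # ~40 for a binary structure

reads_per_second = tx_per_second * inputs_per_tx * levels_per_lookup
print(reads_per_second)              # ~1920 random reads/s

hdd_iops, ssd_iops = 150, 10_000     # rough consumer HDD vs. commodity SSD figures
print(reads_per_second / hdd_iops)   # ~13x over what a spinning disk sustains
print(reads_per_second / ssd_iops)   # a small fraction of an SSD's budget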
legendary
Activity: 1288
Merit: 1080
January 09, 2013, 04:35:06 AM
#61
Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.
O(log u) - proof pls.

No need for proof, it's common knowledge.   Just look on the top right.
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 04:27:57 AM
#60
Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.  Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second.  Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially.  Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity.  And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels.

Your move.
O(log u) - proof pls.

AFAIK, a spinning drive delivers about 100 IOPS.
legendary
Activity: 1288
Merit: 1080
January 09, 2013, 04:25:57 AM
#59
Each transaction can have many inputs. And every input needs to be found in a huge data set.

And again, it is not a descending process.

Even if your transaction has a hundred inputs, you'll have to verify that each of these inputs points to the output of a valid transaction, but you won't go any further, for the same reason described above. There is no exponential growth in the number of required checks.

Also, the number of inputs and outputs is more or less proportional to the size of the transactions, or the average size of the block. I've already calculated that for 10k transactions per block it was about 8 kB per second, IIRC. However you look at it, processing 8 kB of data per second does not seem like such a tough task for a computer.
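
For reference, the ~8 kB/s figure can be reconstructed like this (the ~500-byte average transaction size is an assumption of mine, not stated in the thread):

Code:
# Rough reconstruction of the ~8 kB/s estimate.
tx_per_block  = 10_000
avg_tx_bytes  = 500     # assumed average transaction size
block_seconds = 600     # one block every ~10 minutes

print(tx_per_block * avg_tx_bytes / block_seconds / 1000)  # ~8.3 kB/s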

legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:16:25 AM
#58
PS.  Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily.  But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.
Looks like this is getting outside the speculation discussion, but I will continue this chess game.

10k transactions per block is about 16 transactions per second.

Each transaction can have many inputs. And every input needs to be found in a huge data set.

Your move.

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.  Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second.  Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially.  Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity.  And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels.  Oh, and we're ignoring the cache in ram that will hold many of the transactions needed.

Your move.
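
A minimal sketch of that last point about a cache in RAM (illustrative Python only, not any client's code; load_from_disk is a hypothetical placeholder for the real on-disk lookup):

Code:
# Bounded in-RAM cache in front of a disk-backed unspent-output store,
# so recently referenced outputs are served without touching the disk.
from collections import OrderedDict

class UTXOCache:
    def __init__(self, capacity, load_from_disk):
        self.capacity = capacity
        self.load_from_disk = load_from_disk   # hypothetical disk lookup
        self.cache = OrderedDict()             # outpoint -> output, LRU order

    def get(self, outpoint):
        if outpoint in self.cache:
            self.cache.move_to_end(outpoint)   # mark as recently used
            return self.cache[outpoint]
        value = self.load_from_disk(outpoint)  # only a miss hits the disk
        self.cache[outpoint] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)     # evict least recently used
        return value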
legendary
Activity: 1288
Merit: 1080
January 09, 2013, 04:14:08 AM
#57

It's a cumulative graph and it looks quite linear.  What produces a linear cumulative graph?  A constant rate!

Also, those are "destroyed" bitcoins, so they represent transactions that could be "forgotten".  If this has any relevance to this thread (which is not obvious), it really does not seem to prove your point at all.
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 04:08:00 AM
#56
PS.  Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily.  But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.
Looks like this is getting outside the speculation discussion, but I will continue this chess game.

10k transactions per block is about 16 transactions per second.

Each transaction can have many inputs. And every input needs to be found in a huge data set.

Your move.
legendary
Activity: 1904
Merit: 1002
January 09, 2013, 04:06:15 AM
#55

Well, duh, it goes up.

There are 10 million bitcoin days created every day!
sr. member
Activity: 462
Merit: 250
Clown prophet
#54
[post body was an image: apparently the cumulative chart of "bitcoin days destroyed" discussed in the replies above]
legendary
Activity: 1288
Merit: 1080
January 09, 2013, 03:50:28 AM
#53
Unspent outputs need to be verified against spent ones. And the spent set grows toward infinity.

It's not a descending process.

Coinbase transaction A is spent by transaction B which is spent by transaction C.

During database loading, you verify each block and you say:

- OK, A is good,
- (you process more blocks)
- oh, B spends A, so B is good and I can forget about A,
- (you keep processing more blocks)
- oh, C spends B, so C is good and I can forget about B.

Now if a transaction D comes along and spends C, you don't have to go all the way back to A, because you just need to remember that C is good.

All you have to do is put C (and not A or B) in an index or something, so that you don't have to redo this every time you run the client.

PS.  Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily.  But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.
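
A minimal sketch of the A/B/C example above (illustrative Python only, not the client's code, and it ignores forks exactly as the PS warns): each block removes the outputs it spends from the set and adds the new ones, so only C-like unspent outputs remain indexed.

Code:
# Maintain the unspent-output set while scanning blocks in order.
# Spent outputs are "forgotten" (dropped from the index) as soon as they are used.
def apply_block(unspent, block):
    """unspent: dict (txid, vout) -> output; block: list of transactions."""
    for tx in block:
        for outpoint in tx["inputs"]:            # each input must reference an unspent output
            if outpoint not in unspent:
                raise ValueError("%s is spent or unknown" % (outpoint,))
            del unspent[outpoint]                # forget it: it can never be spent again
        for vout, output in enumerate(tx["outputs"]):
            unspent[(tx["txid"], vout)] = output # new outputs become spendable

# A is a coinbase, B spends A, C spends B; only C's output stays in the index.
utxo = {}
apply_block(utxo, [{"txid": "A", "inputs": [], "outputs": ["50 BTC"]}])
apply_block(utxo, [{"txid": "B", "inputs": [("A", 0)], "outputs": ["50 BTC"]}])
apply_block(utxo, [{"txid": "C", "inputs": [("B", 0)], "outputs": ["50 BTC"]}])
print(utxo)                                      # {('C', 0): '50 BTC'}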
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 03:29:49 AM
#52
Don't know? Then don't tell. Search for connectinputs() in the source code.
legendary
Activity: 1193
Merit: 1003
9.9.2012: I predict that single digits... <- FAIL
January 09, 2013, 03:24:23 AM
#51
Unspent outputs need to be verified against spent ones.

Why? I believe a transaction only needs to be verified against the unspent set.
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 03:21:07 AM
#50
Unspent outputs need to be verified against spent ones. And the spent set grows toward infinity.

Btw, disk stalls also show up during syncs (writes), but I think that's just ugly code.
sr. member
Activity: 462
Merit: 250
Clown prophet
January 09, 2013, 03:15:50 AM
#49
Slowpokes reserve here.

You will be doing fine with this as long as your RAM is enough to cache the disk data. When it is not, big problems are coming.
full member
Activity: 150
Merit: 100
January 09, 2013, 03:12:30 AM
#48
Which, as I and others have mentioned, is resolved in v0.8. The current RAM required for the unspent set is barely 150 MB. We are a long way from having to worry about it exceeding RAM availability. Even if it does, 10 minutes is plenty of time to page it in from disk.
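
As a quick sanity check on that last sentence (arithmetic only, using the post's own figures):

Code:
# Paging the whole unspent set from disk within one block interval is cheap.
set_size_mb   = 150
block_seconds = 600
print(set_size_mb * 1000.0 / block_seconds, "kB/s")  # 250 kB/s, trivial for any disk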

Off the top of my head, I can already think of some easy ways to optimise this.

1) Only store the most recent unspent outputs in RAM (temporal locality).
This is based on the observation that recent outputs are more likely to be spent soon, versus those which have been dormant for many years.
This will automatically optimise out lost/savings/cold-storage wallets (which is huge) by leaving them on disk.

2) Do an x-pass process to verify the transactions in the block (a rough sketch follows just below). Basically, x is:

total unspent-output DB size / your RAM

for i = 0 .. x-1:
    load partition(i) into RAM, then verify the transactions whose inputs fall in that partition
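
A rough sketch of that multi-pass idea (illustrative Python only; load_partition and partition_of are hypothetical placeholders for however the unspent-output DB would actually be sliced):

Code:
# Verify a block's inputs against a UTXO DB that is larger than RAM by
# loading one RAM-sized partition at a time (x = ceil(DB size / RAM) passes).
def verify_block_in_passes(block_inputs, db_size_bytes, ram_bytes,
                           load_partition, partition_of):
    passes = -(-db_size_bytes // ram_bytes)          # ceiling division
    unresolved = set(block_inputs)
    for i in range(passes):
        partition = load_partition(i)                # one RAM-sized slice of the UTXO set
        for outpoint in [p for p in unresolved if partition_of(p, passes) == i]:
            if outpoint not in partition:
                raise ValueError("%s not found in unspent set" % (outpoint,))
            unresolved.discard(outpoint)
    return not unresolved                            # True once every input was matched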

There are probably many other ways you could optimise this, but I thought of these two in less than a minute.
There are many very experienced and intelligent people involved in the Bitcoin ecosystem; this is the least of my worries.