Your bets for 2014 - page 2. | Bitcointalksearch.org

notme

legendary

Activity: 1904

Merit: 1002

Quote from: lucif on January 09, 2013, 04:43:17 AM

Guys, cumon

You are talking with highload architect and unix sysadmin

Who doesn't believe that optimal search is log(n) Roll Eyes

notme

legendary

Activity: 1904

Merit: 1002

Quote from: lucif on January 09, 2013, 04:40:17 AM

This is my 7200 seagate stats under running bitcoin. Regular disks have lower rates.

Quote

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 50,67 0,00 122,00 2,00 690,67 6,67 11,25 1,02 8,27 8,09 19,17 7,47 92,60
sda 51,33 5,67 124,33 22,67 702,67 105,33 10,99 1,18 8,00 7,88 8,62 6,32 92,97
sda 100,33 2,67 123,33 29,33 894,67 113,33 13,21 1,14 7,48 8,05 5,10 6,10 93,17
sda 51,00 0,00 111,33 35,67 664,00 132,00 10,83 1,13 7,66 8,52 4,96 6,35 93,37

As you can see, its about 100 requests per second with almost busy heads.

And this is cached sequental read, when caches are available as not so much data in DB. And first column displays how many requests were merged.

For the third time, quit using the old code. We know it's shitty. The new code is already available and barely touches the disk.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Guys, cumon

You are talking with highload architect and unix sysadmin

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

This is my 7200 seagate stats under running bitcoin. Regular disks have lower rates.

Quote

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 50,67 0,00 122,00 2,00 690,67 6,67 11,25 1,02 8,27 8,09 19,17 7,47 92,60
sda 51,33 5,67 124,33 22,67 702,67 105,33 10,99 1,18 8,00 7,88 8,62 6,32 92,97
sda 100,33 2,67 123,33 29,33 894,67 113,33 13,21 1,14 7,48 8,05 5,10 6,10 93,17
sda 51,00 0,00 111,33 35,67 664,00 132,00 10,83 1,13 7,66 8,52 4,96 6,35 93,37

As you can see, its about 100 requests per second with almost busy heads.

And this is cached sequental read, when caches are available as not so much data in DB. And first column displays how many requests were merged.

notme

legendary

Activity: 1904

Merit: 1002

Quote from: grondilu on January 09, 2013, 04:35:06 AM

Quote from: lucif on January 09, 2013, 04:27:57 AM

Quote from: notme on January 09, 2013, 04:16:25 AM

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.

O(log u) - proof pls.

No need for proof, it's common knowledge. Just look on the top right.

Not the best example... Binary search trees are only average case log(n), worst case n. Btrees are worst case log(n).

notme

legendary

Activity: 1904

Merit: 1002

Quote from: lucif on January 09, 2013, 04:27:57 AM

Quote from: notme on January 09, 2013, 04:16:25 AM

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions. Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second. Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially. Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity. And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels.

Your move.

O(log u) - proof pls.

afaik, spinning drive has about 100 iops performance.

http://en.wikipedia.org/wiki/B-tree

And I'll give you the point on 100 iops for low end consumer drives. IOPS is a better measure than seeks. Good drives can get you close to 200 and SSDs can blow the 2000 I calculated around out of the water: http://en.wikipedia.org/wiki/IOPS#Examples

grondilu

legendary

Activity: 1288

Merit: 1080

Quote from: lucif on January 09, 2013, 04:27:57 AM

Quote from: notme on January 09, 2013, 04:16:25 AM

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions.

O(log u) - proof pls.

No need for proof, it's common knowledge. Just look on the top right.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Quote from: notme on January 09, 2013, 04:16:25 AM

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions. Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second. Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially. Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity. And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels.

Your move.

O(log u) - proof pls.

afaik, spinning drive has about 100 iops performance.

grondilu

legendary

Activity: 1288

Merit: 1080

Quote from: lucif on January 09, 2013, 04:08:00 AM

Each trans can have many inputs. And every input needs to be found in huge data set.

And again, it is not a descending process.

Even if your transaction has an hundred inputs, you'll have to verify that each of these input point to an output of a valid transaction, but you won't go further, for the same reason as described above. There is no exponential growth of the number of required checks.

Also, the number of inputs and outputs is more or less proportional to the size of the transactions, or the average size of the block. I've already calculated that for 10k transaction par block it was about 8kB per second, iirc. However you look at it, processing 8kB of data per second does not seem like such a tough task for a computer.

notme

legendary

Activity: 1904

Merit: 1002

Quote from: lucif on January 09, 2013, 04:08:00 AM

Quote from: grondilu on January 09, 2013, 03:50:28 AM

PS. Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily. But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.

Looks like its out of speculation discussion, but i will continue this chess game.

10k trans per block is a 16 trans per second.

Each trans can have many inputs. And every input needs to be found in huge data set.

Your move.

Plenty of data structures can look up individual transactions in O(log u) where u is the number of unspent transactions. Even using the lowest base of two, if there are u= 1 trillion unspent transactions, we are looking at about 40*16 lookups per second. Even a spinning disk can easily hit 2000 seeks per second, and a SSD would handle that many reads trivially. Only if we assume 1 trillion unspent transactions, a suboptimal data structure, 16 transactions a second and 3 inputs per transaction do we move to SSD as a necessity. And that's assuming spinning drives don't improve in the decade minimum it takes to get to these levels. Oh, and we're ignoring the cache in ram that will hold many of the transactions needed.

Your move.

grondilu

legendary

Activity: 1288

Merit: 1080

Quote from: lucif on January 09, 2013, 04:00:20 AM

Yes, but...

http://blockchain.info/charts/bitcoin-days-destroyed-cumulative

It's a cumulative graph and it looks quite linear. What looks linear in a cumulative graph? A constant!

Also those are "destroyed" bitcoins. Therefore they represent transactions that could be "forgotten". So if this is of any relation with this thread (which is not obvious), it really does not seem to prove your point at all.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Quote from: grondilu on January 09, 2013, 03:50:28 AM

PS. Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily. But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.

Looks like its out of speculation discussion, but i will continue this chess game.

10k trans per block is a 16 trans per second.

Each trans can have many inputs. And every input needs to be found in huge data set.

Your move.

notme

legendary

Activity: 1904

Merit: 1002

Quote from: lucif on January 09, 2013, 04:00:20 AM

Yes, but...

http://blockchain.info/charts/bitcoin-days-destroyed-cumulative

well duh it goes up

there are 10 million bitcoin days created every day!

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Yes, but...

http://blockchain.info/charts/bitcoin-days-destroyed-cumulative

grondilu

legendary

Activity: 1288

Merit: 1080

Quote from: lucif on January 09, 2013, 03:21:07 AM

Unspent needs to be verified against spent. And spent are growing to infinity.

It's not a descending process.

Coinbase transaction A is spent by transaction B which is spent by transaction C.

During database loading, you verify each block and you say:

- ok A is good,
- (you process more blocks)
- oh B spends A so B is good and I can forget about A
- (you keep processing more blocks)
- oh C spends B so C is good and I can forget about B.

Now if a transaction D comes and spends C you don't have to go all the way to A because you just need to remember that C is good.

All you have to do is to put C (and not A nor B) in an index or something so that you don't have to do this everytime you run the client.

PS. Of course I oversimplify here because there can be chain forking which requires that you don't forget spent transactions too easily. But even so, "forget" is a bit too strong a word since it just means you don't put it in RAM anymore. It's still on disk.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Don't know - don't tell. Search for connectinputs() in source code.

Kupsi

legendary

Activity: 1193

Merit: 1003

9.9.2012: I predict that single digits... <- FAIL

Quote from: lucif on January 09, 2013, 03:21:07 AM

Unspent needs to be verified against spent.

Why? I believe a transaction only needs to be verified against unspent.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Unspent needs to be verified against spent. And spent are growing to infinity.

Btw, disk stunning persists also at syncs (write), but I think its an ugly code.

lucif

sr. member

Activity: 462

Merit: 250

Clown prophet

Slowpokes reserve here.

You will be doing well with this as long as your RAM is enough to cache disk data. When it is not - big problems comming.

Nagato

full member

Activity: 150

Merit: 100

Which as i and others have mentioned is resolved in v0.8. The current RAM required for the unspent set is barely 150MB. We are a long way from having to worry about it exceeding RAM availability. Even if it does, 10minutes is plenty of time to page it from disk.

Off the top of my head, i can already think of some easy ways to optimise this.

1) Only store the most recent unspent outputs in RAM.(Temporal Locality)
This is based on the observation that recent outputs are more likely to be spent again versus those which have been dormant for many years.
This will automatically optimise out lost/saving/cold storage wallets(Which is huge) by leaving them on disk.

2) Do a x pass process to verify the transactions in the block.
Basically x is
Total unspent output DB size / Your RAM

For i = 0 : X
Load each partition(i) in RAM, verify transactions which are in partition

There are probably many other ways you could optimise this but i thought of these 2 in less than a minute.
There are many very experienced and intelligent people involved in the Bitcoin ecosystem, this is the least of my worries.

Topic: Your bets for 2014 - page 2. (Read 6888 times)