
Topic: PetaFLOPS and how it relates to Bitcoin

member
Activity: 104
Merit: 10
June 18, 2013, 01:48:09 PM
#28
Still, it is strange they didn't use GPUs in the supercomputer to boost the more computationally intensive portions.

The top two current supercomputers do, in fact, use accelerator co-processors: http://www.top500.org/blog/lists/2013/06/press-release/

The Tianhe-2 (~34 PetaFLOPS) uses 48,000 Intel Xeon Phi co-processors, with 63 cores each.  The Phi is designed similarly to a graphics card, but instead of the cores being optimized for the graphics pipeline, they are optimized for general-purpose math.  Titan (~17.5 PetaFLOPS) uses 256K Nvidia K20x cards, which use the same GK-110 graphics chip as their top-end consumer card.  The only real difference is that the scientific card uses error-correcting memory and the consumer card doesn't, because dropping a few bits here and there while rendering a video game doesn't matter, but in scientific problems it does.

Titan has 22,000 (not 256K) GK-110s. At best those would produce about 500 MH/s each, so Titan could generate about 11 TH/s if they chose to use it for that. I think they are too busy simulating nuclear bomb blasts to worry about Bitcoin. If they were really worried, it would be cheaper for them to do something else to mess with it...
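
Spelling out that arithmetic as a quick sketch (22,000 cards at ~500 MH/s each are the rough figures above, not official specs):

Code:
# Rough estimate of Titan's hypothetical mining throughput, using the
# figures above: 22,000 GK-110s at ~500 MH/s each (optimistic).
gk110_count = 22_000
mhash_per_card = 500                                 # MH/s per card

total_thash = gk110_count * mhash_per_card / 1e6     # 1 TH/s = 1e6 MH/s
print(f"Titan as a miner: ~{total_thash:.0f} TH/s")  # ~11 TH/s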
newbie
Activity: 24
Merit: 0
Still, it is strange they didn't use GPUs in the supercomputer to boost the more computationally intensive portions.

The top two current supercomputers do, in fact, use accelerator co-processors: http://www.top500.org/blog/lists/2013/06/press-release/

The Tianhe-2 (~34 PetaFLOPS) uses 48,000 Intel Xeon Phi co-processors, with 63 cores each.  The Phi is designed similarly to a graphics card, but instead of the cores being optimized for the graphics pipeline, they are optimized for general-purpose math.  Titan (~17.5 PetaFLOPS) uses 256K Nvidia K20x cards, which use the same GK-110 graphics chip as their top-end consumer card.  The only real difference is that the scientific card uses error-correcting memory and the consumer card doesn't, because dropping a few bits here and there while rendering a video game doesn't matter, but in scientific problems it does.
kjj
legendary
Activity: 1302
Merit: 1026
March 17, 2012, 11:19:50 PM
#26
Why in the world are so many people so insistent on this meaningless comparison?  Floating point and integer operations aren't even done on the same hardware, which is easy to overlook since both units have been routinely packaged inside the same die for the last couple of decades.
mrb
legendary
Activity: 1512
Merit: 1028
March 17, 2012, 10:06:46 PM
#25
That's not what's at issue. FLOPS is not, in fact, FLoating Point Operations Per Second but the numeric result of running a very specific, standardized benchmark from the LINPACK codes solving a large, dense system of linear equations. It is a benchmark that is meaningful for most scientific and technical disciplines, but says *nothing* about the ability to crank out SHA-256 hashes.

FLOPS is floating point operations per second. LINPACK FLOPS are meaningful only in the context of the TOP500 supercomputer ranking, because that organization decided to use this benchmark to establish the ranking. A supercomputer might achieve "x" FLOPS on linear equations with LINPACK, "y" FLOPS when doing protein folding, and "z" FLOPS when doing some other work. The "x" FLOPS value of the LINPACK benchmark is no more significant than "y" or "z".

All are a fraction of "t", the theoretical peak FLOPS of the hardware (which can often be approached within 1-2% with a useless loop of multiply-add instructions). This theoretical peak can be used to predict the performance of SHA256-based Bitcoin mining, because mining performance scales linearly with the peak theoretical integer performance of a chip, which is itself directly related to its peak theoretical FLOPS by a fixed ratio. For example, there is exactly a 1:4 ratio between the number of double precision floating point and integer instructions that an HD 69xx series GPU can execute.
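
A sketch of this estimation chain (the 1:4 ratio is the HD 69xx figure above; the integer-ops-per-hash constant is a placeholder assumption, since the exact count depends on the implementation):

Code:
# Sketch of the estimation method above: theoretical peak FLOPS -> peak
# integer throughput (via a fixed hardware ratio) -> SHA-256 hash rate.
def estimate_hashrate(peak_dp_flops, int_per_dp=4, int_ops_per_hash=3000):
    """peak_dp_flops:    theoretical peak double-precision FLOPS of the chip
    int_per_dp:       integer instructions per DP instruction (4, per the
                      1:4 DP:int ratio cited above for HD 69xx GPUs)
    int_ops_per_hash: assumed integer ops per double SHA-256 (hypothetical)"""
    peak_int_ops = peak_dp_flops * int_per_dp
    return peak_int_ops / int_ops_per_hash

# Hypothetical chip with 0.5 TFLOPS peak double precision:
print(f"~{estimate_hashrate(0.5e12) / 1e6:.0f} MH/s under these assumptions")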
legendary
Activity: 905
Merit: 1012
March 17, 2012, 07:19:43 PM
#24
Comparisons between integer and floating point operations are meaningless.  Not just inaccurate, or merely approximate, but not even in the same universe.

I disagree with that. You can always emulate floating point operations on top of integer operations, or integer operations on top of floating point operations. This places an upper and lower bound on the speed ratio between floating point and integer operations, which is basically independent of the hardware you are using. I don't really know how big the gap is between the upper and lower bound, but at least I can say from experience that emulated floating point can be relatively fast(*).

That's not what's at issue. FLOPS is not, in fact, FLoating Point Operations Per Second but the numeric result of running a very specific, standardized benchmark from the LINPACK codes solving a large, dense system of linear equations. It is a benchmark that is meaningful for most scientific and technical disciplines, but says *nothing* about the ability to crank out SHA-256 hashes.

LINPACK, which is what measures FLOPS, is a test of general-purpose scientific/technical capability. Bitcoin is highly, HIGHLY specialized. Using a measurement of one to derive the other could be off by orders of magnitude, not even in the same ballpark. It's a totally meaningless, apples-to-oranges comparison.
donator
Activity: 1218
Merit: 1079
Gerald Davis
March 17, 2012, 08:54:05 AM
#23
That seems to be an explanation, although it seems weird that a hobbyist can already build a more powerful (in terms of FLOPS) machine out of an affordable number of GPUs. Maybe raw processing speed is not the most expensive factor in supercomputers anymore? That weather forecast thing also has a huge amount of RAM, and probably some high-speed inter-computer connections as well.

That is it.  A hobbyist GPU "supercomputer" made up of mining rigs would only be useful for problems which have no inter-node dependencies, low bandwidth needs, and no storage requirements.  Essentially completely useless for anything other than things like Bitcoin hashing or password cracking.

Most problems in the world are complex, and weather modeling isn't hashing an incrementing nonce value: each of the nodes is doing work which depends on other nodes.  A mining rig would simply fail at that.  Still, it is strange they didn't use GPUs in the supercomputer to boost the more computationally intensive portions.
Quote
I used an exchange rate of about 4EUR/BTC, so your value would correspond to about 160 EUR/day.

If you are correct, then how is that possible? Wouldn't they be able to have more FLOPS than they report even by doing emulated floating point on the Xeons?

Or does bitcoinwatch.com use an unrealistic estimation of the FLOPS / hashes ratio? I see in an earlier post that they use 1 INTOP = 2 FLOP. So it assumes integer is actually slower than floating point? But even if it were 4 INTOP = 1 FLOP for floating point emulation, you'd still only lose a factor of 8, which is less than needed to explain the ratio between 12 EUR/day and 160 EUR/day.

(*) I did some speed measurements on an ARM 7 without floating point unit and a Pentium 3 or 4 with floating point unit. On the ARM, emulated floating point operations (addition and multiplication) were actually faster w.r.t. integer operations than the hardware-accelerated floating point w.r.t. integer on the Pentium (TfloatARM / TintARM < TfloatIntel / TintIntel). I did not use things like SSE on the Pentium.


There is no universal ratio.  The ratio they used is relevant only for AMD 5000-series GPUs.  On any other hardware it is completely off, and not by 10% or 50% but by 1,000% or 20,000%.  It is really only a guesstimate and shouldn't be taken as a serious value.  Nobody rates supercomputers in INTOPS.  If they did, one could simply post the INTOPS of the network and be done.

The asterisk which should be next to the PetaFLOPS numbers should read something like:
"(if the entire network consisted of x000 AMD HD 5870 GPUs; single precision only, double precision would be 1/4th the stated speed)".
cjp
full member
Activity: 210
Merit: 124
March 17, 2012, 05:53:11 AM
#22
1) 58.2 TFLOPS is pathetically weak for a supercomputer.  A 5970 graphics card has about 4.6 TFLOPS.  So it is roughly equal to the floating point throughput of ~12 5970 GPUs (4x that if it is double precision).

3) The largest supercomputer in the world is 11,280 TFLOPS.  http://i.top500.org/system/177232

That seems to be an explanation, although it seems weird that a hobbyist can already build a more powerful (in terms of FLOPS) machine out of an affordable number of GPUs. Maybe raw processing speed is not the most expensive factor in supercomputers anymore? That weather forecast thing also has a huge amount of RAM, and probably some high-speed inter-computer connections as well.

Comparisons between integer and floating point operations are meaningless.  Not just inaccurate, or merely approximate, but not even in the same universe.

I disagree with that. You can always emulate floating point operations on top of integer operations, or integer operations on top of floating point operations. This places an upper and lower bound on the speed ratio between floating point and integer operations, which is basically independent of the hardware you are using. I don't really know how big the gap is between the upper and lower bound, but at least I can say from experience that emulated floating point can be relatively fast(*).

Just the Xeon CPUs in that Dutch thing are capable of finding a block every 30 hours or so, or about 40 BTC per day.  Presumably, that is more than 12 euros.
I used an exchange rate of about 4 EUR/BTC, so your value would correspond to about 160 EUR/day.

If you are correct, then how is that possible? Wouldn't they be able to have more FLOPS than they report even by doing emulated floating point on the Xeons?

Or does bitcoinwatch.com use an unrealistic estimation of the FLOPS / hashes ratio? I see in an earlier post that they use 1 INTOP = 2 FLOP. So it assumes integer is actually slower than floating point? But even if it were 4 INTOP = 1 FLOP for floating point emulation, you'd still only lose a factor of 8, which is less than needed to explain the ratio between 12 EUR/day and 160 EUR/day.

(*) I did some speed measurements on an ARM 7 without floating point unit and a Pentium 3 or 4 with floating point unit. On the ARM, emulated floating point operations (addition and multiplication) were actually faster w.r.t. integer operations than the hardware-accelerated floating point w.r.t. integer on the Pentium (TfloatARM / TintARM < TfloatIntel / TintIntel). I did not use things like SSE on the Pentium.
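
Putting numbers on that bound argument, using the figures above:

Code:
# The conversion-ratio bound above, in numbers: bitcoinwatch's optimistic
# ratio and a pessimistic software-emulation ratio differ by a factor of 8,
# while the observed income gap is ~13x, so the ratio alone can't explain it.
bitcoinwatch_flop_per_intop = 2   # bitcoinwatch counts 1 INTOP as 2 FLOP
emulated_intop_per_flop = 4       # pessimistic: 4 INTOPs to emulate 1 FLOP

ratio_gap = bitcoinwatch_flop_per_intop * emulated_intop_per_flop  # = 8
income_gap = 160 / 12                                              # ~13.3x

print(f"ratio gap: {ratio_gap}x, observed income gap: {income_gap:.1f}x")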
kjj
legendary
Activity: 1302
Merit: 1026
March 16, 2012, 02:13:15 PM
#21
Comparisons between integer and floating point operations are meaningless.  Not just inaccurate, or merely approximate, but not even in the same universe.

Just the Xeon CPUs in that Dutch thing are capable of finding a block every 30 hours or so, or about 40 BTC per day.  Presumably, that is more than 12 euros.
donator
Activity: 1218
Merit: 1079
Gerald Davis
March 16, 2012, 02:08:39 PM
#20
1) 58.2 TFLOPS is pathetically weak for a supercomputer.  A 5970 graphics card has about 4.6 TFLOPS.  So it is roughly equal to the floating point throughput of ~12 5970 GPUs (4x that if it is double precision).

2) Supercomputers aren't optimized for Bitcoin mining.  They have lots of very expensive parts: terabytes of RAM, petabytes of storage, redundant backups, high-speed interconnects.  All that adds up to hundreds of millions of dollars and produces 0.0 hashes.

3) The largest supercomputer in the world is 11,280 TFLOPS.  http://i.top500.org/system/177232

4) That ratio is just a guesstimate.  Bitcoin mining uses integer math; TFLOPS are a measure of floating point math.  Saying the network is 137.91 PFLOPS isn't exactly accurate.  It is 0 PFLOPS; however, it likely has computing hardware that, combined, is roughly equal to ~100 PFLOPS.
cjp
full member
Activity: 210
Merit: 124
March 16, 2012, 12:41:44 PM
#19
Check out this page (warning: Dutch language!):
http://www.knmi.nl/cms/nieuws/nieuwsbericht/_rp_column1-1_elementId/1_105587

Basically it says that the Dutch weather forecast agency now has a new supercomputer delivering 58.2 TFLOPS, which makes it one of the fastest supercomputers in the country.

I made a small calculation of how much income it could make with BTC mining, and my result was approximately 12 EUR / day (at current hash rate and exchange rate).

I am a bit disappointed by the low number. How can it possibly be so low? I can't imagine you could break even at 12 EUR/day with a computer like that, so apparently many miners manage to be A LOT more efficient.

And I don't really buy the story that it is much more efficient to do highly parallelized computations on GPUs. AFAIK, weather calculations are highly parallelizable too, so if GPUs were so much better, I'd expect the weather forecast computer to use GPUs as well. In fact, I'd expect this computer of theirs to contain a GPU-style architecture (although I haven't verified it yet).

Maybe the whole ratio between hashes/s and FLOPS is wrong? I used the value from bitcoinwatch.com, which currently says:
137.91 PetaFLOPS
10.86 Thash/s
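
For reference, here is the calculation spelled out (using the bitcoinwatch.com figures above, a 50 BTC block reward, and ~4 EUR/BTC):

Code:
# Income estimate for a 58.2 TFLOPS machine, using bitcoinwatch.com's
# implied FLOPS-per-hash ratio (an estimate, not a measured value).
network_flops = 137.91e15      # bitcoinwatch's FLOPS-equivalent figure
network_hashrate = 10.86e12    # hashes/s
machine_flops = 58.2e12        # the KNMI supercomputer

flops_per_hash = network_flops / network_hashrate   # ~12,700
machine_hashrate = machine_flops / flops_per_hash   # ~4.6 GH/s
share = machine_hashrate / network_hashrate         # fraction of the network

btc_per_day = share * 144 * 50   # ~144 blocks/day at 50 BTC each
print(f"~{btc_per_day:.1f} BTC/day, ~{btc_per_day * 4:.0f} EUR/day at 4 EUR/BTC")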
full member
Activity: 154
Merit: 100
November 03, 2011, 05:19:18 PM
#18
you would find this interesting:

https://bitcointalksearch.org/topic/m.565787

I also ran this by Gavin and he agrees completely with ethotheipi's analysis.  Gavin thinks we have nothing to worry about for 10 years, and then we can change the system to handle any threats.

Wonderful, that's a whole lot of other info to incorporate, thank you very much!
legendary
Activity: 1764
Merit: 1002
November 03, 2011, 05:14:48 PM
#17
you would find this interesting:

https://bitcointalksearch.org/topic/m.565787

I also ran this by Gavin and he agrees completely with ethotheipi's analysis.  Gavin thinks we have nothing to worry about for 10 years, and then we can change the system to handle any threats.
full member
Activity: 154
Merit: 100
November 03, 2011, 01:35:43 PM
#16
Okay, wonderful.

Thanks Gabi and Death for helping me wrap my head around these issues!
donator
Activity: 1218
Merit: 1079
Gerald Davis
November 03, 2011, 01:33:47 PM
#15
It's winter, you want cold? Open a window. Also, if we're speaking about rich people, I expect they already have enough space for the hardware.
OK, 1 million is not enough, but 10 million is.

Um, I don't think you understand how much heat 7.3 MW is.  It is enough to melt steel.  Natural convection would be insufficient to transfer that heat "out the window".  The cards near the center of the "farm" would burn up.  There is a good chance the entire building would catch fire.  There is a reason why datacenters are air-conditioned year-round, even in, say, Finland, and they don't even pull 1 MW.

I would say maybe $20M would be enough when you consider hardware, building space, cooling, labor, power distribution, etc.  Granted, it can be done, but it is hardly "cheap".
donator
Activity: 1218
Merit: 1079
Gerald Davis
November 03, 2011, 01:31:05 PM
#14
Nope, you're right.

Also, at say 1.5 MH/W, it would require a 7.3 MW electrical connection to power that hashing farm, and roughly 35% more to cool it, so ~10 megawatts.  At $0.10 per kWh that would be ~$24,000 per day in electrical costs.  If racked up on 4-high shelves with 6 GPUs per motherboard and 2 feet of space between shelves, it would still take roughly 8,000 sq ft of warehouse space.

So yeah, it can be done, but the guesstimate of $1M isn't even close.
Okay, just a few more details and I think I'll have enough to compile all of this:

Any idea on how much the labor would cost to organize something like that?

Those are harder to guesstimate, but here is a stab.

Labor wouldn't be too much.  Likely you would have a core team of, say, a dozen Linux admins who would manage the farm and write a custom distro which links to a management server.  For the physical work you are talking about 6,000 computers which need to be assembled, booted, and tested.  Even at an hour apiece that's only 6,000 man-hours.  50 people could likely do that in a month, including all the other ancillary work like building racking and power distribution.  Maybe $200K in labor for a month.

Finding a building that can handle 7.3 MW of heat and 10 MW of electrical load is a little more tough.  Worst case scenario, you're looking at a custom AC install, and something that can handle 7.3 MW of heat isn't going to be cheap; easily in the million-plus range.  Most mains, even in light industrial/warehouse buildings, only handle 300 A @ 208 V three-phase, which is ~180 kW, so 10 MW is serious power.  No idea how hard it would be to find and get that kind of setup installed, or how many places can even handle that kind of localized load on the local distribution grid.

Of course, you could likely break it up into, say, 10 teams/buildings of 4,000 GPUs each.

Quote
Also, how much time does it take to reorganize the blockchain before you put it out in the wild? (I don't even know if I'm using the right words there, sorry)
There is no cost-effective way to work backwards unless you have significantly more than 51% of the hashing power, so that you can "catch up".  Assuming the attacker starts at the current block, they are pretty much guaranteed to have the longest chain after a dozen blocks or so, so the attack itself would happen very quickly.  The attacker would keep their bad chain private until it was sufficiently longer than the good chain to ensure the good chain can never catch up.  Then they would release it.  The warning sign would be a split blockchain with the split portion growing very rapidly.  However, at that point it is too late.
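
For context, the gambler's-ruin analysis from the Bitcoin whitepaper gives the probability that an attacker ever closes a z-block gap; a quick sketch:

Code:
# Probability that an attacker with fraction q of the hash power ever
# catches up from z blocks behind (gambler's ruin, per the Bitcoin paper).
def catch_up_probability(q: float, z: int) -> float:
    p = 1.0 - q                 # honest network's share
    if q >= p:                  # at 50%+ the attacker eventually catches up
        return 1.0
    return (q / p) ** z

for q in (0.10, 0.30, 0.45):
    print(q, [round(catch_up_probability(q, z), 6) for z in (1, 6, 12)])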

Quote
Lastly, I know the 'usual' confirmation window is 6; is 120 'absolutely' confirmed, i.e. no risk of a 51% attack?  Or does it depend on the depth of the reorganized blockchain?

No, even 120 blocks isn't guaranteed confirmed.  Technically, a 51% attack could have started >120 blocks ago, with the attacker holding a longer "bad chain" in private.  If they released it right now it would quickly replicate through the network and replace all blocks going back to the start of the split.  The only thing that is guaranteed is a checkpoint: hardcoding a block number and hash into the client, so that the client will reject any blocks prior to that point which don't match the hardcoded hash.  As a result, any transaction prior to the checkpoint can't be replaced.  This is possible in the Bitcoin network, but I don't know if it has ever been done.
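
A minimal sketch of that checkpoint idea (the height and hash below are made-up placeholders, not the actual client code):

Code:
# Hardcoded checkpoints as described above: a block arriving at a
# checkpointed height is rejected unless it matches the hardcoded hash.
# Height and hash are hypothetical, for illustration only.
CHECKPOINTS = {
    100_000: "0000000000000000000000000000000000000000000000000000000000abc123",
}

def passes_checkpoints(height: int, block_hash: str) -> bool:
    expected = CHECKPOINTS.get(height)
    return expected is None or expected == block_hash

# A competing chain whose block at a checkpointed height differs is rejected,
# so no reorg can ever replace transactions buried below a checkpoint.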
legendary
Activity: 1148
Merit: 1008
If you want to walk on water, get out of the boat
November 03, 2011, 01:22:23 PM
#13
It's winter, you want cold? Open a window. Also, if we're speaking about rich people, I expect they already have enough space for the hardware.
OK, 1 million is not enough, but 10 million is.
full member
Activity: 154
Merit: 100
November 03, 2011, 01:12:23 PM
#12
Nope, you're right.

Also, at say 1.5 MH/W, it would require a 7.3 MW electrical connection to power that hashing farm, and roughly 35% more to cool it, so ~10 megawatts.  At $0.10 per kWh that would be ~$24,000 per day in electrical costs.  If racked up on 4-high shelves with 6 GPUs per motherboard and 2 feet of space between shelves, it would still take roughly 8,000 sq ft of warehouse space.

So yeah, it can be done, but the guesstimate of $1M isn't even close.

Okay, just a few more details and I think I'll have enough to compile all of this:

Any idea on how much the labor would cost to organize something like that?

Also, how much time does it take to reorganize the blockchain before you put it out in the wild? (I don't even know if I'm using the right words there, sorry)

Lastly, I know the 'usual' confirmation window is 6; is 120 'absolutely' confirmed, i.e. no risk of a 51% attack?  Or does it depend on the depth of the reorganized blockchain?
donator
Activity: 1218
Merit: 1079
Gerald Davis
November 03, 2011, 01:01:18 PM
#11
Nope, you're right.

Also, at say 1.5 MH/W, it would require a 7.3 MW electrical connection to power that hashing farm, and roughly 35% more to cool it, so ~10 megawatts.  At $0.10 per kWh that would be ~$24,000 per day in electrical costs.  If racked up on 4-high shelves with 6 GPUs per motherboard and 2 feet of space between shelves, it would still take roughly 8,000 sq ft of warehouse space.

So yeah, it can be done, but the guesstimate of $1M isn't even close.
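
Spelling out those electricity numbers (all inputs are the estimates above):

Code:
# Recomputing the power and electricity cost from the figures above.
hashing_mw = 7.3              # MW drawn by the hashing farm (estimate)
cooling_overhead = 0.35       # ~35% extra power for cooling
price_per_kwh = 0.10          # USD per kWh

total_mw = hashing_mw * (1 + cooling_overhead)       # ~9.9 MW, i.e. ~10 MW
daily_cost = total_mw * 1000 * 24 * price_per_kwh    # kW x hours x $/kWh
print(f"~{total_mw:.1f} MW total, ~${daily_cost:,.0f}/day in electricity")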
legendary
Activity: 1148
Merit: 1008
If you want to walk on water, get out of the boat
November 03, 2011, 12:58:29 PM
#10
Oh well, OK, 7 million. It's still less than 10.

It's almost nothing for any rich guy or a government...
full member
Activity: 154
Merit: 100
November 03, 2011, 12:46:10 PM
#9
Can someone do a 51% attack without spending too much? Damn, YES! You only need like $1 million or so to buy enough GPUs to have 106 PetaFLOPS.

From my brief stint with the calculator: if the average GPU does ~200 MH/s (mid-grade ATI), and if there is ~7 TH/s on the network, that's a total of 35,000 GPUs.  At ~$200 per GPU, that's $7,000,000 in GPUs alone.  Let's say, for funsies, double that for the extra hardware you'd need.  Then there is housing, cooling, electricity, etc.... so (very roughly) let's say ~$18M total to gain a 51% foothold.

Have I messed up my 0's somewhere along the way?
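
For checking, the same estimate in code (using the assumptions above):

Code:
# The GPU-count estimate above, recomputed from the stated assumptions.
network_hashrate = 7e12       # hashes/s on the network (~7 TH/s)
hash_per_gpu = 200e6          # hashes/s per mid-grade ATI card (~200 MH/s)
price_per_gpu = 200           # USD

gpus_needed = network_hashrate / hash_per_gpu   # 35,000 cards
gpu_cost = gpus_needed * price_per_gpu          # $7,000,000 in cards
hardware_total = gpu_cost * 2                   # doubled for other hardware

# The post rounds up to ~$18M once housing, cooling, etc. are added.
print(f"{gpus_needed:,.0f} GPUs, ${gpu_cost:,.0f} in cards, "
      f"~${hardware_total:,.0f} in hardware")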