
Topic: BitFury 110GH/s per rack? - page 2. (Read 11047 times)

donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
May 27, 2012, 03:55:13 PM
#46
Wow - someone has out-eldentyrelled eldentyrell...
That's what I thought as well when I saw this...

Hey, at least I get to share with zhoutong the dubious honor of having had my name verbified by the forum users.  I guess that's something! Wink

Does anyone know if bitfury's design stores the SHA-256 constants in BRAMs or has them spread over through the SLICEs?

Just guessing, but he probably daisy-chains the hashers in each clock region, runs them one step out-of-phase with each other, and has a single BRAM feed them k-values which get passed along from one hasher to the next, bucket-brigade style.  My very first design -- which was bit-serial (really bad idea!) -- worked that way.
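
A rough sketch of that bucket-brigade idea, in Python rather than HDL. It only illustrates the guess above and is not bitfury's actual design. The function name is made up for illustration, and round indices stand in for the real 32-bit k-values to keep it short.

```python
ROUNDS = 64  # SHA-256 uses one constant K[t] for each of its 64 rounds


def simulate_k_chain(num_hashers, clocks):
    """Simulate a chain of registers fed by a single constant ROM.

    Returns a list with one entry per clock; entry t lists the round index
    whose constant each hasher sees at clock t (None until the chain fills).
    Round indices stand in for the real 32-bit K values (an assumption made
    to keep the sketch short).
    """
    regs = [None] * num_hashers
    history = []
    for t in range(clocks):
        # The ROM is read at address t mod 64; every register then takes
        # the value its upstream neighbour held on the previous clock.
        regs = [t % ROUNDS] + regs[:-1]
        history.append(list(regs))
    return history


if __name__ == "__main__":
    trace = simulate_k_chain(num_hashers=4, clocks=70)
    # Hasher i runs i rounds behind hasher 0, so the constant that reaches
    # it after i hops is the one its current round needs.
    for t in (3, 10, 69):
        print(t, trace[t])
```

Once the chain has filled, hasher i always sees the constant for round (t - i) mod 64, so one memory read per clock serves every hasher in the chain.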

There are actually several possibilities - one possibility is that the bitstream reads the Device DNA code (its serial number),

The DNA register is just a shift register; it's trivial to swap it out for an SRL32 in fpga_editor.

By the way, these guys have documented the bitstream format and made tools that turn .bit files back into .ncd files -- even (completely illegible) .v files in some cases.

It is too bad that the chip manufacturer implemented only AES; if they had implemented some public/private key infrastructure in silicon, with a Xilinx certificate, it would be much simpler.

I don't think Xilinx wants the liability that comes with being a certificate authority -- especially one whose certificates can't be revoked because they're burned into millions of dollars worth of silicon.  You can bootstrap similar schemes yourself on Virtex-6 and above; see section 6 of this paper.

Even with e-fuse, there is less protection compared to SRAM + battery for the AES key.

I wouldn't trust them if I were you.  It's completely trivial to extract the AES key from Xilinx devices, even Spartan-6.  Even battery-backed ram.  Only one power-on is required, and the equipment isn't expensive (if you rent it it's downright cheap).  They didn't fix this problem until Virtex-6.

About remote activation - it is quite possible.

Quite prescient of you -- stay tuned.  But I'm not sure how well this would work for you -- with a highly-rolled design it's easy for an attacker to tell the difference between countermeasure circuits and the actual hashers -- just look for the pattern and chop out anything irregular.  Once you've got it down to a few hundred slices it's easy to figure out where the inputs and outputs are and what they mean. Replicate that block, stitch it back together and the game's over.

On the other hand, you guys make your own hardware -- that's a big advantage when it comes to anti-piracy measures.  You might be better off looking into ways to leverage that, like a tamper-proof housing around the spartan that erases the bitstream if breached and extra circuits to thwart power-analysis attacks.  People are also less likely to try to reverse engineer your work if they have to take apart and possibly damage a box they've paid $100,000 for!
hero member
Activity: 592
Merit: 501
We will stand and fight.
May 23, 2012, 01:56:20 AM
#45
On their website, they state:
Then after choosing serial round design it was very challenging to fit it exactly into 240 slices (8 x 32 area). As you see in snapshot image on the left, magenta color shows exactly two SHA256 rounds location. These double-SHA256 with round and round expanders and additional control logic fits into 240 slices. This took another month of development. Fitting in 240 slices was important to obtain good fill of XC6SLX150 right part.

I hate to break the news, but 8 x 32 is not 240. It is 256. At least, where I grew up.  Roll Eyes

So, what did they really do? Fit two rounds of SHA-256 into 240 slices, including control logic? I find that hard to believe.
Or fit two rounds of SHA-256 into 256 slices - I find that slightly easier to believe, but it still would be a major achievement.

I believe it, because we did exactly the same: two 64-cycle SHA-256 cores in an 8 x 32 area, including control logic, and the timing report is well over 300MHz.
The coding work is easy (maybe less than 50 lines), but writing the UCF files took a month, and there are still some small bugs now.  Smiley

My point was, that 8 x 32 is not 240. It is 256.
If you can fit this in only 240 slices, then maybe 16 x 15 is a better geometry, since 16 x 15 really is 240.
Or am I missing the point here?

I don't think the exact number is really important. Our cores use 256 slices, but only 64 clocks.

Or am I missing the point here?
Could it be related to the fact that a Bitcoin hash only needs 61 rounds instead of 64?

With special optimization of the arithmetic and by setting up pre-processors (inside the FPGA, of course), it can save 3-4 rounds of calculation.

I find it amusing people are comparing a non released product to something that is already operating and outputting real world numbers.

i think it's very close to us.
sr. member
Activity: 472
Merit: 250
May 22, 2012, 10:10:12 PM
#44
I find it amusing people are comparing a non released product to something that is already operating and outputting real world numbers.
sr. member
Activity: 448
Merit: 250
May 22, 2012, 08:17:24 PM
#43
Or am I missing the point here?
Could it be related to the fact that a Bitcoin hash only needs 61 rounds instead of 64?

No. Completely unrelated.

A guy or gal named Valery just responded to a personal email of mine with the clarification that they are indeed talking about 8 x 30 slices,
the same number a count on their screen shot comes up with.
In other words, 8 x 32 was a typo, which they have corrected on their website by now.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
May 22, 2012, 08:11:26 PM
#42
Or am I missing the point here?
Could it be related to the fact that a Bitcoin hash only needs 61 rounds instead of 64?
legendary
Activity: 1778
Merit: 1008
May 22, 2012, 06:12:52 PM
#41
My point was, that 8 x 32 is not 240. It is 256.
If you can fit this in only 240 slices, then maybe 16 x 15 is a better geometry, since 16 x 15 really is 240.
Or am I missing the point here?

42? (i have nothing useful to say... sorry.)
sr. member
Activity: 448
Merit: 250
May 22, 2012, 05:49:13 PM
#40
On their website, they state:
Then after choosing serial round design it was very challenging to fit it exactly into 240 slices (8 x 32 area). As you see in snapshot image on the left, magenta color shows exactly two SHA256 rounds location. These double-SHA256 with round and round expanders and additional control logic fits into 240 slices. This took another month of development. Fitting in 240 slices was important to obtain good fill of XC6SLX150 right part.

I hate to break the news, but 8 x 32 is not 240. It is 256. At least, where I grew up.  Roll Eyes

So, what did they really do? Fit two rounds of SHA-256 into 240 slices, including control logic? I find that hard to believe.
Or fit two rounds of SHA-256 into 256 slices - I find that slightly easier to believe, but it still would be a major achievement.

I believe it, because we did exactly the same: two 64-cycle SHA-256 cores in an 8 x 32 area, including control logic, and the timing report is well over 300MHz.
The coding work is easy (maybe less than 50 lines), but writing the UCF files took a month, and there are still some small bugs now.  Smiley

My point was, that 8 x 32 is not 240. It is 256.
If you can fit this in only 240 slices, then maybe 16 x 15 is a better geometry, since 16 x 15 really is 240.
Or am I missing the point here?
hero member
Activity: 592
Merit: 501
We will stand and fight.
May 22, 2012, 03:50:04 PM
#39
Any word yet on whether the low-end -7 series will have a metal heatspreader?  I would imagine you could get 20 to 30 MHz more out of the Spartans if it weren't like trying to pull that heat through the low-conductivity plastic package.

By our review, it's not only heat. Smiley
Some other limits showed up once we solved the overheating.
How were you solving the overheat? Liquid nitrogen? Grin

at present no comment, but
much easier than you think...  Grin
hero member
Activity: 592
Merit: 501
We will stand and fight.
May 22, 2012, 03:48:25 PM
#38
On their website, they state:
Then after choosing serial round design it was very challenging to fit it exactly into 240 slices (8 x 32 area). As you see in snapshot image on the left, magenta color shows exactly two SHA256 rounds location. These double-SHA256 with round and round expanders and additional control logic fits into 240 slices. This took another month of development. Fitting in 240 slices was important to obtain good fill of XC6SLX150 right part.

I hate to break the news, but 8 x 32 is not 240. It is 256. At least, where I grew up.  Roll Eyes

So, what did they really do? Fit two rounds of SHA-256 into 240 slices, including control logic? I find that hard to believe.
Or fit two rounds of SHA-256 into 256 slices - I find that slightly easier to believe, but it still would be a major achievement.

I believe it, because we did exactly the same: two 64-cycle SHA-256 cores in an 8 x 32 area, including control logic, and the timing report is well over 300MHz.
The coding work is easy (maybe less than 50 lines), but writing the UCF files took a month, and there are still some small bugs now.  Smiley
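
For a feel of what those numbers would mean in hash rate, here is a back-of-the-envelope sketch. It assumes (this is not stated in the post) that one 256-slice block finishes one complete Bitcoin double SHA-256 every 64 clocks. The 300 MH/s target is the per-FPGA figure quoted elsewhere in this thread for BitFury's bitstream. Function names are illustrative only.

```python
import math


def block_hashrate_mhs(clock_mhz, clocks_per_hash=64):
    """MH/s of one block that finishes one hash every clocks_per_hash clocks."""
    return clock_mhz / clocks_per_hash


def blocks_for_target(target_mhs, clock_mhz, clocks_per_hash=64):
    """Smallest number of such blocks needed to reach target_mhs."""
    per_block = block_hashrate_mhs(clock_mhz, clocks_per_hash)
    return math.ceil(target_mhs / per_block)


if __name__ == "__main__":
    per_block = block_hashrate_mhs(300.0)      # assumed 300 MHz clock
    blocks = blocks_for_target(300.0, 300.0)   # assumed 300 MH/s per FPGA
    # 256 slices per block is the figure given in this thread.
    print(per_block, blocks, blocks * 256)
```

Under that assumption each block gives a little under 5 MH/s at 300 MHz, so the per-FPGA figure depends almost entirely on how many blocks can be placed and routed at speed.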
sr. member
Activity: 448
Merit: 250
May 22, 2012, 03:34:09 PM
#37
On their website, they state:
Then after choosing serial round design it was very challenging to fit it exactly into 240 slices (8 x 32 area). As you see in snapshot image on the left, magenta color shows exactly two SHA256 rounds location. These double-SHA256 with round and round expanders and additional control logic fits into 240 slices. This took another month of development. Fitting in 240 slices was important to obtain good fill of XC6SLX150 right part.

I hate to break the news, but 8 x 32 is not 240. It is 256. At least, where I grew up.  Roll Eyes

So, what did they really do? Fit two rounds of SHA-256 into 240 slices, including control logic? I find that hard to believe.
Or fit two rounds of SHA-256 into 256 slices - I find that slightly easier to believe, but it still would be a major achievement.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
May 22, 2012, 03:10:09 PM
#36
Any word yet on whether the low-end -7 series will have a metal heatspreader?  I would imagine you could get 20 to 30 MHz more out of the Spartans if it weren't like trying to pull that heat through the low-conductivity plastic package.

By our review, it's not only heat. Smiley
Some other limits showed up once we solved the overheating.
How were you solving the overheat? Liquid nitrogen? Grin
hero member
Activity: 592
Merit: 501
We will stand and fight.
May 22, 2012, 03:06:56 PM
#35
This thread is boring, it's just speculation on a very uncompetitive product.
Although the parts may not be very modern, the ideas and things surrounding it are very relevant and fun to discuss. Theoretically if designed correctly, it should be possible to replace all the little FPGA modules with 28nm models when those come along, or even specially designed ASICs.

28nm is still very far away.

On the -7 series the routing resources are far different from Spartan-6, so this design cannot simply be transplanted to the -7 series, but the architecture is still useful.
Any word yet on whether the low-end -7 series will have a metal heatspreader?  I would imagine you could get 20 to 30 MHz more out of the Spartans if it weren't like trying to pull that heat through the low-conductivity plastic package.

By our review, it's not only heat. Smiley
Some other limits showed up once we solved the overheating.
legendary
Activity: 1274
Merit: 1004
May 22, 2012, 03:02:08 PM
#34
I wouldn't say this is uncompetitive at all. If they built it for US$90K and it gets 110GH/s, it's the cheapest of the LX150 options.

No, Enterpoint's Cairnsmore1 pre-order prices are cheaper. It will do at the very least 800 MH/s with an average-performing bitstream (more likely 850 MH/s) at $640. This is 1.25 MH/s/$ (more likely 1.33 MH/s/$). BitFury is more expensive at 1.22 MH/s/$.

I guess one could say BitFury is cheaper on a technicality: Cairnsmore1 has not shipped yet.

The Cairnsmore are cheaper, but that's a time limited special offer (at least according to them). At the regular price of US$1280, they'd be more expensive.
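
The price comparison above can be checked with a few lines of Python. This sketch uses only the prices and hash rates quoted in this thread; the labels and the helper name are illustrative.

```python
def mhs_per_dollar(hashrate_mhs, price_usd):
    """Hash rate bought per dollar, in MH/s/$."""
    return hashrate_mhs / price_usd


# (hash rate in MH/s, price in US$), all figures as quoted in the thread
OFFERS = {
    "BitFury rack, 110 GH/s at $90K": (110_000, 90_000),
    "Cairnsmore1 pre-order, 800 MH/s at $640": (800, 640),
    "Cairnsmore1 pre-order, 850 MH/s at $640": (850, 640),
    "Cairnsmore1 regular price, 800 MH/s at $1280": (800, 1280),
}

if __name__ == "__main__":
    # Best value first.
    ranked = sorted(OFFERS.items(),
                    key=lambda item: mhs_per_dollar(*item[1]),
                    reverse=True)
    for label, (mhs, usd) in ranked:
        print(f"{mhs_per_dollar(mhs, usd):.3f} MH/s/$  {label}")
```

At the pre-order price the Cairnsmore1 comes out slightly ahead of the rack; at the regular price it falls to about half the rack's MH/s/$.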
mrb
legendary
Activity: 1512
Merit: 1028
May 22, 2012, 02:57:23 PM
#33
I wouldn't say this is uncompetitive at all. If they built it for US$90K and it gets 110GH/s, it's the cheapest of the LX150 options.

No, Enterpoint's Cairnsmore1 pre-order prices are cheaper. It will do at the very least 800 MH/s with an average-performing bitstream (more likely 850 MH/s) at $640. This is 1.25 MH/s/$ (more likely 1.33 MH/s/$). BitFury is more expensive at 1.22 MH/s/$.

I guess one could say BitFury is cheaper on a technicality: Cairnsmore1 has not shipped yet.
legendary
Activity: 1274
Merit: 1004
May 22, 2012, 02:49:19 PM
#32
The woman, my first love had a tattoo of a butterfly a bit like this, so I am attracted to anything to do with butterflies and because of this, Butterfly Labs have won my heart as soon as I heard about them.


Did she also have an ass like that?
sr. member
Activity: 336
Merit: 250
May 22, 2012, 02:40:34 PM
#31
The woman, my first love had a tattoo of a butterfly a bit like this, so I am attracted to anything to do with butterflies and because of this, Butterfly Labs have won my heart as soon as I heard about them.

legendary
Activity: 1274
Merit: 1004
May 22, 2012, 02:19:29 PM
#30
This thread is boring, it's just speculation on a very uncompetitive product.
Someone's being a little extra pissy this morning. I didn't see the sticky indicating that no FPGAs could be discussed unless it beats the BFL offerings.

I wouldn't say this is uncompetitive at all. If they built it for US$90K and it gets 110GH/s, it's the cheapest of the LX150 options. The power consumption is pretty high, but I'd be interested to see the actual breakdown. 110GH/s÷360 = 305MH/s, and they claim their 300MH/s bitstream consumes 12W which seems reasonable compared to other LX150s. For 360 FPGAs, that's 4.32kW. The remainder at 2.7kW seems a little high for the microcontroller backplanes, Atom boards and inefficiencies in the power supplies.

As for the price, it's moot since they haven't sold any at that price as far as anyone here knows. For that matter, BFL isn't selling their minirigs either. At this point they're just taking people's money as an interest free loan for an indeterminate amount of time.
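
The power arithmetic in this post, written out as a small Python sketch. The inputs are the figures from the post itself (110 GH/s, 360 FPGAs, 12 W each, 2.7 kW remainder). The rack total is only what those figures imply, and the function name is illustrative.

```python
def power_breakdown(total_ghs, num_fpgas, watts_per_fpga, overhead_watts):
    """Split a rack's power draw into FPGA power and everything else."""
    fpga_watts = num_fpgas * watts_per_fpga
    total_watts = fpga_watts + overhead_watts
    return {
        "per_fpga_mhs": total_ghs * 1000 / num_fpgas,
        "fpga_watts": fpga_watts,
        "total_watts": total_watts,
        "overhead_share": overhead_watts / total_watts,
        "watts_per_mhs": total_watts / (total_ghs * 1000),
    }


if __name__ == "__main__":
    result = power_breakdown(total_ghs=110, num_fpgas=360,
                             watts_per_fpga=12, overhead_watts=2700)
    for name, value in result.items():
        print(f"{name}: {value:.4g}")
```

With these inputs the non-FPGA share works out to well over a third of the implied total, which is why the remainder looks high.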
donator
Activity: 1218
Merit: 1079
Gerald Davis
May 22, 2012, 02:15:51 PM
#29
This thread is boring, it's just speculation on a very uncompetitive product.
Although the parts may not be very modern, the ideas and things surrounding it are very relevant and fun to discuss. Theoretically if designed correctly, it should be possible to replace all the little FPGA modules with 28nm models when those come along, or even specially designed ASICs.

28nm is still very far away.

On the -7 series the routing resources are far different from Spartan-6, so this design cannot simply be transplanted to the -7 series, but the architecture is still useful.
Any word yet on whether the low-end -7 series will have a metal heatspreader?  I would imagine you could get 20 to 30 MHz more out of the Spartans if it weren't like trying to pull that heat through the low-conductivity plastic package.
hero member
Activity: 592
Merit: 501
We will stand and fight.
May 22, 2012, 01:03:19 PM
#28
This thread is boring, it's just speculation on a very uncompetitive product.
Although the parts may not be very modern, the ideas and things surrounding it are very relevant and fun to discuss. Theoretically if designed correctly, it should be possible to replace all the little FPGA modules with 28nm models when those come along, or even specially designed ASICs.

28nm is still very far away.

On the -7 series the routing resources are far different from Spartan-6, so this design cannot simply be transplanted to the -7 series, but the architecture is still useful.
hero member
Activity: 812
Merit: 1001
-
May 22, 2012, 12:57:50 PM
#27
This thread is boring, it's just speculation on a very uncompetitive product.
Although the parts may not be very modern, the ideas and things surrounding it are very relevant and fun to discuss. Theoretically if designed correctly, it should be possible to replace all the little FPGA modules with 28nm models when those come along, or even specially designed ASICs.

Indeed, specialise in heating/cooling systems like that, make the FPGA/ASIC's modules, well, modular, publish specs, get me on the board of directors, lol. You got yourself a killer niche here.

I actually want to buy myself a few of those rigs minus FPGA modules.