
Topic: BFL SC Die Guestimation/Speculation (Read 4129 times)

member
Activity: 70
Merit: 10
December 23, 2013, 10:08:55 PM
#28
I've also seen something like 654,000 logic gates quoted for a full SHA256 pipeline, which is, of course, much faster than a design that has just the 13.5K-gate functional unit taking 65 cycles.
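
For a rough sense of that trade-off, here is a back-of-the-envelope sketch in Python. The gate counts and the 65-cycle figure are the ones quoted above. The assumption that a full pipeline emits one SHA256 result per clock once it is filled is mine, and gate count is only a crude stand-in for die area.

```python
# Figures quoted in the post.
PIPELINE_GATES = 654_000   # fully unrolled SHA256 pipeline
UNIT_GATES = 13_500        # single iterative functional unit
UNIT_CYCLES = 65           # clocks per result for the iterative unit

# Assumption (not from the post): a filled pipeline emits one result per clock.
PIPELINE_CYCLES = 1


def results_per_clock(gate_budget, gates_per_engine, cycles_per_result):
    """Results per clock that fit into a given gate budget."""
    engines = gate_budget / gates_per_engine
    return engines / cycles_per_result


budget = PIPELINE_GATES  # compare both styles at the same gate count
pipelined = results_per_clock(budget, PIPELINE_GATES, PIPELINE_CYCLES)
iterative = results_per_clock(budget, UNIT_GATES, UNIT_CYCLES)

print(f"pipelined: {pipelined:.2f} results/clock")
print(f"iterative: {iterative:.2f} results/clock")
print(f"pipeline advantage per gate: {pipelined / iterative:.2f}x")
```

Under these assumptions the same gate budget holds about 48 iterative units delivering roughly 0.75 results per clock, against 1 per clock for the pipeline.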
legendary
Activity: 1274
Merit: 1004
September 24, 2012, 12:31:42 PM
#27
With a standard cell ASIC, default structures are used. Say you want to place an inverter; it might end up looking something like this (in Cadence Virtuoso):



If you want to easily chain together different structures, it works out very nicely. It doesn't mean it's always the most efficient way to lay out a design though. With a full custom ASIC, once you have your schematic you can move away from standard cells and lay out the design in the most efficient way for both size and speed by moving around the location of the transistors and metal layers.

For SHA this would be much less expensive than for a comparably sized chip, since you really only have to optimize one hashing engine. All the rest can basically be repeats of your single design.
legendary
Activity: 966
Merit: 1000
September 24, 2012, 11:51:56 AM
#26

Full Custom huh.

Reading some on the different ASIC types here:
http://electronics.stackexchange.com/questions/7042/how-much-does-it-cost-to-have-a-custom-asic-made

They say that the pros of a Full Custom ASIC are the same as with a standard-cell ASIC, but more so.  A Full Custom can deliver even better performance, smaller die size, reduced power consumption, [presumably] lower cost per unit to manufacture, etc.

The cons are also the same as a standard-cell ASIC, but more so.  Design cost and effort required is even higher, and "Odds of screwing something up is much higher".

If BFL is indeed going for a Full Custom design before anyone has even done a [simpler] standard-cell design, I would call this a "shoot the moon" approach.

http://en.wikipedia.org/wiki/Hearts#Shooting_the_moon
mem
hero member
Activity: 644
Merit: 501
Herp Derp PTY LTD
September 23, 2012, 11:47:55 PM
#25
Quote
By the way, BFL doesn't use the phrase "full custom" to mean the same thing it means in the industry.

We don't?  Please elaborate. (I'm serious, I'm not being snarky.  If we/I am using it incorrectly, then I would like to use the proper term.)



"Full of Shit" is the term I use to describe you and your companies claims frequently Inaba, feel free to borrow it.

Honestly, as you work for and defend a predatory con artist who scammed senior citizens out of a sum totaling over 26 million USD, I think shit is the nicest thing you could be full of.
hero member
Activity: 504
Merit: 500
September 23, 2012, 07:59:35 PM
#24
If Altera HardCopy is used it will be on 28nm, with a maximum of 11.5M gates, or a maximum hash rate of 12.35 GH/s at 1 GHz, but these run at 400-700 MHz typically.

If this really is the case, the power usage will not be much less than that of the corresponding FPGA.  Altera themselves state, "Average of 50% performance improvement over corresponding FPGA, average of 40% less power consumption compared to corresponding FPGA."  Thus, from a hash/s/W standpoint, the ASIC would be about 2.5x the corresponding FPGA (1.5x the performance at 0.6x the power).  A by-hand design like that of CAST's ASIC would be the only ASIC able to really deliver the kind of power consumption BFL has been hinting at.

That is their older, outdated HardCopy process.

See the new HardCopy info starting here:
http://www.altera.com/devices/asic/asic-index.html
hero member
Activity: 568
Merit: 500
September 23, 2012, 07:10:23 PM
#23

http://www.youtube.com/watch?v=bT-smMzg54k&feature=relmfu

At 0:48: "full-custom ASICs, etc."

 

Quote
By the way, BFL doesn't use the phrase "full custom" to mean the same thing it means in the industry.

We don't?  Please elaborate. (I'm serious, I'm not being snarky.  If we/I am using it incorrectly, then I would like to use the proper term.)

Standard-cell ASICs and synthesis-flow ASICs are not considered full-custom chips.

The phrase "fully custom" is a BFL-ism that sounds a lot like "truthiness" :)  In fact the third Google hit for "fully custom asic" on the entire interweb is BFL, which ought to be a hint that it is a contortion of the usual industry terminology...
hero member
Activity: 686
Merit: 564
September 23, 2012, 10:19:35 AM
#21
Standard-cell ASICs and synthesis-flow ASICs are not considered full-custom chips.

The phrase "fully custom" is a BFL-ism that sounds a lot like "truthiness" :)  In fact the third Google hit for "fully custom asic" on the entire interweb is BFL, which ought to be a hint that it is a contortion of the usual industry terminology...
I've pointed this out to Inaba too. He also claimed that the fact bASIC had said they were using cell-based ASICs meant that they were using structured ASICs and wouldn't be able to compete with BFL's chips on power efficiency. I'm fairly sure that's wrong?
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
September 22, 2012, 10:35:53 PM
#20

Quote
By the way, BFL doesn't use the phrase "full custom" to mean the same thing it means in the industry.

We don't?  Please elaborate. (I'm serious, I'm not being snarky.  If we/I am using it incorrectly, then I would like to use the proper term.)

Standard-cell ASICs and synthesis-flow ASICs are not considered full-custom chips.

The phrase "fully custom" is a BFL-ism that sounds a lot like "truthiness" :)  In fact the third Google hit for "fully custom asic" on the entire interweb is BFL, which ought to be a hint that it is a contortion of the usual industry terminology...
legendary
Activity: 1260
Merit: 1000
September 22, 2012, 07:53:28 PM
#19
Quote
By the way, BFL doesn't use the phrase "full custom" to mean the same thing it means in the industry.

We don't?  Please elaborate. (I'm serious, I'm not being snarky.  If we/I am using it incorrectly, then I would like to use the proper term.)

donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
September 22, 2012, 12:55:41 PM
#18
SHA256 hashing requires about 13,500 logic gates per circuit or 27,000 transistors.

Transistor/gate counts aren't really useful anymore.  There's a saying in the industry "you pay for the wires, we throw in the transistors for free".  Interconnect is everything.  Unfortunately people are seduced by the fact that transistors come in discrete units so you can count how many of them you have.  Interconnect costs are more subtle.

Transistor/gate counts are really only useful if you're comparing standard-cell designs pushed through the same toolchain.  Otherwise the choice of synthesis tool matters way more than the gate count.

It's utterly pointless to compare a standard-cell design to a full-custom design using transistor count.  Even between full-custom designs it's normal to see a 4x variation in area based on the foresight of the architect and the skill of the layout designer.  By the way, BFL doesn't use the phrase "full custom" to mean the same thing it means in the industry.

Also keep in mind that unlike FPGA gates, VLSI gates come in all different sizes.  There are "strong" NAND gates that are 64x (or more) as large as the weakest NAND gates, yet they still count as one gate or four transistors!
member
Activity: 112
Merit: 10
September 12, 2012, 05:08:49 AM
#17
No. It's more like $50,000 per lot (=50 wafers).

At least this is what I can remember from working for a semiconductor manufacturer (ASICs). But that's 20 years ago ... I'm an old fart now.

Values like these and the retail values of GPU/CPU dies on these technologies are where I'm pulling it from.  From a $2000 150mm wafer you would get ~200 100mm^2 dies, of which 70-80% may be usable or a yield of 150 usable dies.  This is $13 each to the company producing the ASICs, who would presumably need to test them for fidelity and mark them up before sending them off to BFL/whoever, plus assembly, R&D and whatever other setup overhead there is.  It is presumed that BFL would be ordering these wafers by the hundreds.

See more here:
http://smithsonianchips.si.edu/ice/cd/CEICM/SECTION7.pdf
http://www.overclockers.com/forums/showthread.php?t=550542

Since when does cost of wafer = cost of device? Wafers are cheap compared to other pieces required to fabricate devices.
legendary
Activity: 1270
Merit: 1000
September 12, 2012, 04:47:39 AM
#16
While testing should in fact be trivial, that says nothing about the yield, which depends on how well the fab can produce the chips. There could be 'dust' particles that produce defective dies, or the metallization might not be homogeneous across the wafer, etc. As far as I know, dies near the center of a wafer tend to be of 'better' quality.
legendary
Activity: 2128
Merit: 1073
September 11, 2012, 06:24:55 PM
#15
of which 70-80% may be usable or a yield of 150 usable dies.
The manufacturing yield on a Bitcoin mining chip will be either 0% or 100%. The structure is so repetitive and the failure modes are inconsequential: coin mining is essentially buying lottery tickets really fast. On top of that, almost every SHA256 implementation is essentially self-testing: it either always works or never works, so the post-manufacturing tests are trivial.
mrb
legendary
Activity: 1512
Merit: 1028
September 11, 2012, 03:51:06 PM
#14
tacotime: I think the power claims made by BFL are absolutely plausible. 700 Mhash/Joule is doable at 65nm, check the math here: https://bitcointalksearch.org/topic/best-demonstrated-efficiency-1290-mhashjoule-95762

Based on the 130 nm technology in the paper there (as far as I can tell the only real experimental data) and the clock rates they've given, you'd be looking at 6 GH/s with a TDP of ~90 W (100mm^2 die) considering how much space the hashing unit in the study takes up on the die.  That'd be 66.7 MH/s/W.  At 65 nm you're moving to maybe three times the efficiency (real life examples: AMD K8 vs. early Core2Duo), or 200 MH/s/W.  Hence, you should NOT be able to achieve 700 MH/s/W without moving to 32 nm or below (even then it's likely below 700 MH/s/W).

The problem is that you start your calculations from non-optimal numbers ("66.7 Mh/J").

Virginia Tech 130nm simulations estimated 75 Mh/J (13.42 mJ/Gbits); real chips came very, very close: 73 Mh/J (13.76 mJ/Gbits). Simulations predict such accurate numbers because SHA-256 has a very predictable gate toggle rate.
Bitfountain 130nm simulations estimate 122 Mh/J; therefore real chips are very likely to achieve the same.

Then, even based on your very conservative estimate of a 3x efficiency gain when moving from 130nm to 65nm, Bitfountain's numbers should translate to 122 x 3 = 366 Mh/J, which is already in the rough (~2x) ballpark of BFL's inferred claim of 700 Mh/J...
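
The scaling argument is simple enough to check in a few lines of Python. This is only a sketch: the 122 Mh/J simulation figure, the 3x node gain and the 700 Mh/J claim are the numbers from this thread, and treating the node gain as one flat multiplier is the same simplification made above.

```python
def scale_efficiency(mh_per_joule, node_gain):
    """Project efficiency to a smaller process node with a flat multiplier."""
    return mh_per_joule * node_gain


BITFOUNTAIN_130NM = 122.0   # Mh/J, simulated at 130nm (from the thread)
GAIN_130_TO_65 = 3.0        # tacotime's conservative 130nm -> 65nm estimate
BFL_INFERRED_CLAIM = 700.0  # Mh/J, inferred from BFL's announcements

projected_65nm = scale_efficiency(BITFOUNTAIN_130NM, GAIN_130_TO_65)
remaining_gap = BFL_INFERRED_CLAIM / projected_65nm

print(f"projected at 65nm: {projected_65nm:.0f} Mh/J")
print(f"gap to the 700 Mh/J claim: {remaining_gap:.2f}x")
```

The remaining factor of a little under 2x is what would have to come from a better design or a more optimistic node gain.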
legendary
Activity: 1484
Merit: 1005
September 11, 2012, 02:49:39 PM
#13
No. It's more like $50,000 per lot (=50 wafers).

At least this is what I can remember from working for a semiconductor manufacturer (ASICs). But that's 20 years ago ... I'm an old fart now.

Values like these and the retail values of GPU/CPU dies on these technologies are where I'm pulling it from.  From a $2000 150mm wafer you would get ~200 100mm^2 dies, of which 70-80% may be usable or a yield of 150 usable dies.  This is $13 each to the company producing the ASICs, who would presumably need to test them for fidelity and mark them up before sending them off to BFL/whoever, plus assembly, R&D and whatever other setup overhead there is.  It is presumed that BFL would be ordering these wafers by the hundreds.
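
The per-die arithmetic can be sketched in Python as follows. The wafer price, die size and yield are the figures from this post. The area-only die count is my own added check: it ignores edge loss and scribe lines, so it is an upper bound and not a real dies-per-wafer formula.

```python
import math


def gross_dies_upper_bound(wafer_diameter_mm, die_area_mm2):
    """Wafer area divided by die area; ignores edge loss and scribe lines."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    return math.floor(wafer_area / die_area_mm2)


def cost_per_good_die(wafer_cost, dies_per_wafer, yield_fraction):
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * yield_fraction)


WAFER_COST = 2000.0   # USD, from the post
DIE_AREA = 100.0      # mm^2, from the post
YIELD = 0.75          # middle of the 70-80% range in the post

# The post's own figures: ~200 dies per wafer, ~150 of them usable.
post_estimate = cost_per_good_die(WAFER_COST, 200, YIELD)

# Same calculation using the area-only bound for a 150mm wafer.
bound = gross_dies_upper_bound(150, DIE_AREA)
bounded_estimate = cost_per_good_die(WAFER_COST, bound, YIELD)

print(f"post's figures: ${post_estimate:.2f} per good die")
print(f"area bound: {bound} dies, ${bounded_estimate:.2f} per good die")
```

The area-only bound for a 150mm wafer comes out at 176 dies, so ~200 is slightly optimistic, but the cost per good die stays in the same $13-15 range either way.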

See more here:
http://smithsonianchips.si.edu/ice/cd/CEICM/SECTION7.pdf
http://www.overclockers.com/forums/showthread.php?t=550542
hero member
Activity: 1162
Merit: 500
September 11, 2012, 01:45:35 PM
#12
... If you produce one wafer it's several million dollars. ...

Setup cost is high - but not in the millions.

Quote
If you produce a million wafers it's more like $5 per wafer.

No. It's more like $50,000 per lot (=50 wafers).

At least this is what I can remember from working for a semiconductor manufacturer (ASICs). But that's 20 years ago ... I'm an old fart now.
legendary
Activity: 980
Merit: 1008
September 11, 2012, 12:22:23 PM
#11
Nice thread. Good to see some more educated guesses.

Cost to produce a 100mm^2 die on 45 nm technology that gets an estimated 15-30 GH/s at ~200 MHz is also probably $100-200.  The likely reason the chips in that study couldn't be clocked higher is their enormous power consumption and heat dissipation.
Where do you get this figure? As far as I can gather (disclaimer: not a hardware guy either), price per wafer makes no sense. If you produce one wafer it's several million dollars. If you produce a million wafers it's more like $5 per wafer. I.e., the marginal cost of churning out the chips is tiny compared to the NRE costs.
legendary
Activity: 1484
Merit: 1005
September 11, 2012, 09:46:38 AM
#10
tacotime: I think the power claims made by BFL are absolutely plausible. 700 Mhash/Joule is doable at 65nm, check the math here: https://bitcointalksearch.org/topic/best-demonstrated-efficiency-1290-mhashjoule-95762


Based on the 130 nm technology in the paper there (as far as I can tell the only real experimental data) and the clock rates they've given, you'd be looking at 6 GH/s with a TDP of ~90 W (100mm^2 die) considering how much space the hashing unit in the study takes up on the die.  That'd be 66.7 MH/s/W.  At 65 nm you're moving to maybe three times the efficiency (real life examples: AMD K8 vs. early Core2Duo), or 200 MH/s/W.  Hence, you should NOT be able to achieve 700 MH/s/W without moving to 32 nm or below (even then it's likely below 700 MH/s/W).
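
As a quick check of those numbers, here is a small Python sketch. The hash rate, TDP and 3x node gain are the figures from this post; the helper name is made up for illustration. Note that MH/s per watt and MH per joule are the same unit.

```python
def mh_per_joule(hash_rate_hs, power_w):
    """Efficiency in MH/J (equivalently MH/s per watt)."""
    return hash_rate_hs / power_w / 1e6


HASH_RATE_130NM = 6e9   # H/s, estimated for a 100mm^2 die at 130nm
TDP_130NM = 90.0        # W, estimated
GAIN_130_TO_65 = 3.0    # assumed efficiency gain from the node shrink
TARGET = 700.0          # MH/J, the figure attributed to BFL

eff_130nm = mh_per_joule(HASH_RATE_130NM, TDP_130NM)
eff_65nm = eff_130nm * GAIN_130_TO_65
gain_needed = TARGET / eff_130nm

print(f"130nm: {eff_130nm:.1f} MH/J")
print(f"65nm:  {eff_65nm:.0f} MH/J")
print(f"gain needed over 130nm to reach the target: {gain_needed:.1f}x")
```

Starting from this baseline, 700 MH/J needs about a 10.5x improvement over the 130 nm estimate, which is why the choice of baseline matters so much to the conclusion.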

As I've said above, even Altera themselves have stated that the ASICs produced from their FPGAs are not more efficient than the FPGAs by an order of magnitude.  It is more likely that BFL's SC mining ASICs will perform somewhere in the vicinity of 100-200 MH/s/W.

Why do you think that BFL hasn't been talking about power consumption up to now?  Probably because they know it's unlikely they'll live up to the hype of their rumours.

Cost to produce a 100mm^2 die on 45 nm technology that gets an estimated 15-30 GH/s at ~200 MHz is also probably $100-200.  The likely reason the chips in that study couldn't be clocked higher is their enormous power consumption and heat dissipation.
mrb
legendary
Activity: 1512
Merit: 1028
September 10, 2012, 10:43:10 PM
#9
tacotime: I think the power claims made by BFL are absolutely plausible. 700 Mhash/Joule is doable at 65nm, check the math here: https://bitcointalksearch.org/topic/best-demonstrated-efficiency-1290-mhashjoule-95762