Best demonstrated efficiency: 1290 Mhash/Joule - page 4.

mrb

legendary

Activity: 1512

Merit: 1028

My point, and rjk's point, is this: what makes you think the authors of that paper are the world's best ASIC designers? They are not. They are students and professors. The bleeding edge of ASIC research happens in the professional world (at TSMC, Intel, etc), not in the academic world.

The authors did not need to be excellent ASIC designers to conduct this research. They merely tried to make an average design, and that's all they needed to fairly compare the efficiency of different hash functions. This was all they needed to reach their research goal.

That team achieved 71 Mh/J at 130nm, using standard-cell tech. The true best ASIC designers would have achieved higher than that, using full-custom tech rather than standard-cell, and would have demonstrated it on a smaller process node like 45nm.

PS: the Virginia Tech researchers did not even do the VHDL design themselves; they implemented the one from GMU: https://cryptography.gmu.edu/athena/index.php?id=source_codes It looks like it is https://cryptography.gmu.edu/athena/sources/2011_10_01/basic/SHA-2_basic.zip -> any half-decent ASIC designer should be able to take it, implement it in 45nm standard-cell tech, and get 700 Mh/J

Coinoisseur

sr. member

Activity: 336

Merit: 250

Using a 2012 research chip design? If they pull that off then they should just become a chip design firm because it'll mean they have some of the best engineers in the world.

mrb

legendary

Activity: 1512

Merit: 1028

I have explained many times I think they will do 700 Mh/J, not 1750 Mh/J. Read this thread.

Coinoisseur

sr. member

Activity: 336

Merit: 250

Which still brings us back to:

Quote from: eldentyrell on July 26, 2012, 03:28:51 AM

Let me get this straight: BFL is claiming 1,750 MH/J and you are trying to say that is plausible based on some paper you found that demonstrated 71 MH/J?

Seriously?

And keep in mind that's Intel showing that there is no free performance bonus when aiming for power reduction. Are we seriously going to armchair ref that BFL is on par with Intel in terms of engineering and chip production?

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: Coinoisseur on July 28, 2012, 08:46:19 PM

http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Atom+Z510+%40+1.10GHz

http://ark.intel.com/products/35469/Intel-Atom-Processor-Z510-%28512K-Cache-1_10-GHz-400-MHz-FSB%29

http://ark.intel.com/products/31855/Intel-Pentium-III-Processor---S-1_00-GHz-512K-Cache-133-MHz-FSB

Z510 is a bit slower than mobile P3 1GHz, 130nm->45nm no performance increase but 16.5% of the power use. Keep in mind this is Intel the biggest chip foundry in the world.

The round "2 W" number quoted for the Z510 is likely Intel rounding up.
Compare instead the (faster) 1.3 W Z600 which I linked above.
130nm->45nm predicts a reduction of the power to 12% (1/8th), and the Z600 reduces it to 11%, hence proving my point.
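
To make that comparison easier to check, here is a minimal Python sketch of the arithmetic. It assumes ideal area scaling (power proportional to the square of the feature size) and derives the Pentium III power from the quoted post's own figures (2 W for the Z510 being 16.5% of it); that derived wattage and the names in the code are illustrative assumptions, not Intel specifications.

```python
def ideal_power_fraction(old_nm: float, new_nm: float) -> float:
    """Fraction of the original power left after a shrink, assuming power ~ feature size squared."""
    return (new_nm / old_nm) ** 2

Z510_WATTS = 2.0             # figure used in the quoted post
Z510_FRACTION_OF_P3 = 0.165  # "16.5% of the power use" from the quoted post
P3_WATTS = Z510_WATTS / Z510_FRACTION_OF_P3  # assumption: derived, about 12.1 W
Z600_WATTS = 1.3             # figure given in this post

predicted = ideal_power_fraction(130, 45)
measured = Z600_WATTS / P3_WATTS

print(f"predicted: {predicted:.1%}")  # predicted: 12.0%
print(f"measured:  {measured:.1%}")   # measured:  10.7%
```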

Coinoisseur

sr. member

Activity: 336

Merit: 250

http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Atom+Z510+%40+1.10GHz

http://ark.intel.com/products/35469/Intel-Atom-Processor-Z510-%28512K-Cache-1_10-GHz-400-MHz-FSB%29

http://ark.intel.com/products/31855/Intel-Pentium-III-Processor---S-1_00-GHz-512K-Cache-133-MHz-FSB

Z510 is a bit slower than mobile P3 1GHz, 130nm->45nm no performance increase but 16.5% of the power use. Keep in mind this is Intel the biggest chip foundry in the world.

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: Coinoisseur on July 28, 2012, 08:57:38 AM

Power doesn't usually turn out ideally on these die shrinks though. Otherwise instead of Atom cpus Intel would have a sub Watt Pentium 3 1GHz die at 32nm.

...and they do:

0.65W Atom Z500 with twice the cache of the Pentium III, more instructions supported (SSE2, SSE3), twice the threads, not 1GHz but close: 800MHz (because it is not 32nm, but 45nm): http://ark.intel.com/products/35472

Or look at this one:
1.4W Atom Z600 with twice the cache, SSE2, SSE3, twice the threads, and 1.2GHz: http://ark.intel.com/products/49656

What do you think Atom CPUs are? They are built upon the Pentium III/M design. Yeah they don't support OOO execution for whatever reason (artificial market segmentation between Atom and higher-end CPUs, or making TDP room for supporting things Intel deemed more important such as 512kB cache, SSE2, SSE3, etc), but they pretty much prove that a sub-Watt (at 32nm) or ~1.5W (at 45nm) Pentium III core is possible.

Gomeler

hero member

Activity: 697

Merit: 500

Quote from: Coinoisseur on July 28, 2012, 08:57:38 AM

Power doesn't usually turn out ideally on these die shrinks though. Otherwise instead of Atom cpus Intel would have a sub Watt Pentium 3 1GHz die at 32nm.

I love when I get to dust off the cobwebs..

Pentium M was the mobile processor that all current out-of-order Intel desktop/server chips are based on. Pentium M was a Pentium 3 core with some spruced up I/O. Pentium M led to Core Duo (Yonah), which were holywtf awesome at the time, which then led to Core 2 Duo (Conroe), which then led to the 45nm shrink (Penryn). That then led to the architecture improvement that was Nehalem, which led to the 32nm shrink that was Westmere (with the awkward Clarkdale phase), which led to the architecture improvement that was Sandy Bridge, and then the 22nm shrink that is Ivy Bridge.

If Intel wanted, they could sell a sub-watt single core Ivy Bridge chip. But instead they chose to butcher their core, remove the out of order components, and trick the market into needing a second core that costs them next to nothing to manufacture. You have to give them credit as they effectively established a tablet-like market with the netbooks based on early Atom processors.

Coinoisseur

sr. member

Activity: 336

Merit: 250

Power doesn't usually turn out ideally on these die shrinks though. Otherwise instead of Atom cpus Intel would have a sub Watt Pentium 3 1GHz die at 32nm.

Lethos

sr. member

Activity: 476

Merit: 250

Keep it Simple. Every Bit Matters.

Quote from: mrb on July 28, 2012, 03:55:36 AM

Quote from: Lethos on July 27, 2012, 04:38:00 PM

I really think many are over estimating what a 45nm ASIC is capable of.

This is math. Power consumption varies with the square of the feature size. So when comparing a chip designed at 45nm to a 130nm version of it running at the same frequency and same voltage, there should be an 8x better power efficiency: (130/45)**2 = 8.3

Ask your dad, he will tell you that for 2 identical designs, power consumption will vary proportionally to the transistor junction area.

I'm aware of the mathematics; the math of scaling does work like that. But it's not the only math that affects the final outcome.

However, you have also made the convenient assumption that it will use two USB ports for power, giving it twice as much power, for a max of 5 watts. That is a bit of a stretch, and it is why the math does not add up to me for it to do 3.5 Gh/s at 2.5W, which is what I stated.
2.5W is also something an ASIC could easily run on; it would not need 5W to work fully. 5W might allow it to go a bit higher, but I still think it would be off by a bit.
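
For reference, a small Python sketch of what the two power budgets imply for efficiency. It only divides hashrate by power; the function name is made up for illustration, and the 1000 Mhash/s case is the estimate from my earlier post, not a measured figure.

```python
def efficiency_mh_per_joule(mhash_per_s: float, watts: float) -> float:
    """Mhash/s divided by watts gives Mhash per joule."""
    return mhash_per_s / watts

# Claimed 3.5 Gh/s on a single USB port (2.5 W) versus two ports (5 W)
print(efficiency_mh_per_joule(3500, 2.5))  # 1400.0
print(efficiency_mh_per_joule(3500, 5.0))  # 700.0

# The 1000 Mhash/s estimate at 2.5 W
print(efficiency_mh_per_joule(1000, 2.5))  # 400.0
```

So the claimed hashrate only lines up with a 700 Mh/J chip if the device is allowed the full 5 W.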

Their FPGA to ASIC conversion and how efficiently they move that over will matter the most, since few are doing the same sort of double hashing, so it's not like they can just copy or modify someone else's design.

Coinoisseur

sr. member

Activity: 336

Merit: 250

It's not anywhere near double but the costs of masking and producing are still there, unless they plan on disabling lots and lots of blocks off one larger die (this is what Intel and AMD do for some of their chips).

Quote from: mrb on July 28, 2012, 03:35:52 AM

Quote from: lame.duck on July 27, 2012, 03:01:37 AM

Quote from: mrb on July 26, 2012, 10:32:21 PM

Best guess:
50 chips of 20 Ghash/s each in the Mini Rig
2 chips of 20 Ghash/s each in the Single
1 "small" chip of 3.5 Ghash/s in the Jalapeno which has roughly 1/6th the die size (therefore 1/6th the performance) of the other chips

Wouldn't this require 2 separate mask sets etc. which would produce 2 times NRE cost?

Do you think that Intel having, say, 5 different combinations of core counts, cache size, etc for their Sandy Bridge processors, means that they incurred 5x the NRE costs to develop them?

No.

They take pre-designed logic blocks (cores, cache, etc) and can mix and match them relatively easily to produce a die with specific characteristics. The few cases where different SKUs are built on the same design (eg. a 3-core CPU made from a 4-core die) allow processor manufacturers to keep a stock of the same die, and "brand" them on-the-fly to match market demand (so that they don't get stuck with unsold 4-core inventory and production capacity when the markets buy 3-core).

For the same reason, BFL designing 2 different dies will not double their NRE cost. It makes sense for a Bitcoin ASIC to be made of the same hashing logic block duplicated dozens/hundreds of times across the die (see the "sea-of-tiny hashers" design made by bitfury -- the same applies to FPGAs). Therefore there is almost zero engineering effort and cost in taking a working die with, say, 60 hashing blocks, and deciding to produce a smaller die with only 10 hashing blocks.

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: Lethos on July 27, 2012, 04:38:00 PM

I really think many are over estimating what a 45nm ASIC is capable of.

This is math. Power consumption varies with the square of the feature size. So when comparing a chip designed at 45nm to a 130nm version of it running at the same frequency and same voltage, there should be an 8x better power efficiency: (130/45)**2 = 8.3

Ask your dad, he will tell you that for 2 identical designs, power consumption will vary proportionally to the transistor junction area.
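
As a rough check of that arithmetic, here is a minimal Python sketch of ideal area scaling applied to the 71 Mh/J figure quoted in this thread for the 130nm research design. It assumes the same design, frequency and voltage at both nodes, and the function names are made up for illustration; a real shrink may well land elsewhere.

```python
def area_scaling_factor(old_nm: float, new_nm: float) -> float:
    """Ideal efficiency gain from a shrink, assuming power ~ feature size squared."""
    return (old_nm / new_nm) ** 2

def projected_efficiency(mh_per_joule: float, old_nm: float, new_nm: float) -> float:
    """Scale a measured efficiency to a new node under the ideal-scaling assumption."""
    return mh_per_joule * area_scaling_factor(old_nm, new_nm)

print(f"{area_scaling_factor(130, 45):.2f}")       # 8.35
print(f"{projected_efficiency(71, 130, 45):.0f}")  # 593
```

Area scaling alone gives roughly 593 Mh/J from that starting point; anything beyond that would have to come from a better design than the research chip.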

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: lame.duck on July 27, 2012, 03:01:37 AM

Quote from: mrb on July 26, 2012, 10:32:21 PM

Best guess:
50 chips of 20 Ghash/s each in the Mini Rig
2 chips of 20 Ghash/s each in the Single
1 "small" chip of 3.5 Ghash/s in the Jalapeno which has roughly 1/6th the die size (therefore 1/6th the performance) of the other chips

Wouldn't this require 2 separate mask sets etc. which would produce 2 times NRE cost?

Do you think that Intel having, say, 5 different combinations of core counts, cache size, etc for their Sandy Bridge processors, means that they incurred 5x the NRE costs to develop them?

No.

They take pre-designed logic blocks (cores, cache, etc) and can mix and match them relatively easily to produce a die with specific characteristics. The few cases where different SKUs are built on the same design (eg. a 3-core CPU made from a 4-core die) allow processor manufacturers to keep a stock of the same die, and "brand" them on-the-fly to match market demand (so that they don't get stuck with unsold 4-core inventory and production capacity when the markets buy 3-core).

For the same reason, BFL designing 2 different dies will not double their NRE cost. It makes sense for a Bitcoin ASIC to be made of the same hashing logic block duplicated dozens/hundreds of times across the die (see the "sea-of-tiny hashers" design made by bitfury -- the same applies to FPGAs). Therefore there is almost zero engineering effort and cost in taking a working die with, say, 60 hashing blocks, and deciding to produce a smaller die with only 10 hashing blocks.

Lethos

sr. member

Activity: 476

Merit: 250

Keep it Simple. Every Bit Matters.

I really think many are over estimating what a 45nm ASIC is capable of. It certainly won't do what many are suggesting aka BFL stated numbers.
A 22nm or at a push a 28nm might, but it will massively depend on their design, which is far from perfect. All the FPGAs on the market, including BFL's, still have not reached their ceiling, so I don't expect them to be pushing an ASIC to its limit either.

I don't say this flippantly. My dad and I (him far more than I) have done a lot of encryption-based projects involving FPGAs and ASICs. He has always focused on hardware, myself more on software.
ASICs are fantastic chips if you can afford the startup costs; however, FPGAs have advanced to the point that they are fast enough to be used as the first choice instead of ASICs.

The original poster brings up accurate statistics. I'm sure they can make a USB-powered ASIC; I've used both FPGAs and ASICs that run on that sort of wattage before. They can do a lot, especially the ASICs, but they can't do a double-SHA-256 as frequently as they say.
I estimate the 2.5 watt (coffee warmer) will likely only do 1000 Mhash/s, a long way off their 3500 Mhash/s statement.

MrTeal

legendary

Activity: 1274

Merit: 1004

Quote from: SgtSpike on July 27, 2012, 01:11:21 PM

Quote from: Coinoisseur on July 27, 2012, 05:37:43 AM

It's plausible they will achieve perfect MHz, die scaling, and process shrinking improvements on a university research design (Published in March of this year no less)? That their identity-less VC backer will pony up the money to develop a full ASIC design on 45nm. We're talking R&D on 45nm, multiple wafer tests would be expected. Oh yeah this is plausible...

Quote from: mrb on July 26, 2012, 10:24:42 PM

Quote from: Coinoisseur on July 26, 2012, 03:38:28 PM

28nm theoretical Mh/W have been tossed around. Very rosy to think BFL has a VC source willing to pony up 10s of millions up front for that kind of chip development.

BFL's claims are plausible at 45nm, not 28nm. See post above.

The university design is likely imperfect. I bet engineers with years of experience can make it happen with even greater efficiency.

And yeah, you can bet your bottom that any VC with any sort of investment fortitude would be willing to put down whatever dollars are required to finish this after seeing BFL take in millions of dollars without even having a product to show for it yet.

Source?

SgtSpike

legendary

Activity: 1400

Merit: 1005

Quote from: Coinoisseur on July 27, 2012, 05:37:43 AM

It's plausible they will achieve perfect MHz, die scaling, and process shrinking improvements on a university research design (Published in March of this year no less)? That their identity-less VC backer will pony up the money to develop a full ASIC design on 45nm. We're talking R&D on 45nm, multiple wafer tests would be expected. Oh yeah this is plausible...

Quote from: mrb on July 26, 2012, 10:24:42 PM

Quote from: Coinoisseur on July 26, 2012, 03:38:28 PM

28nm theoretical Mh/W have been tossed around. Very rosy to think BFL has a VC source willing to pony up 10s of millions up front for that kind of chip development.

BFL's claims are plausible at 45nm, not 28nm. See post above.

The university design is likely imperfect. I bet engineers with years of experience can make it happen with even greater efficiency.

And yeah, you can bet your bottom that any VC with any sort of investment fortitude would be willing to put down whatever dollars are required to finish this after seeing BFL take in millions of dollars without even having a product to show for it yet.

Coinoisseur

sr. member

Activity: 336

Merit: 250

It's plausible they will achieve perfect MHz, die scaling, and process shrinking improvements on a university research design (Published in March of this year no less)? That their identity-less VC backer will pony up the money to develop a full ASIC design on 45nm. We're talking R&D on 45nm, multiple wafer tests would be expected. Oh yeah this is plausible...

Quote from: mrb on July 26, 2012, 10:24:42 PM

Quote from: Coinoisseur on July 26, 2012, 03:38:28 PM

28nm theoretical Mh/W have been tossed around. Very rosy to think BFL has a VC source willing to pony up 10s of millions up front for that kind of chip development.

BFL's claims are plausible at 45nm, not 28nm. See post above.

lame.duck

legendary

Activity: 1270

Merit: 1000

Quote from: mrb on July 26, 2012, 10:32:21 PM

Best guess:
50 chips of 20 Ghash/s each in the Mini Rig
2 chips of 20 Ghash/s each in the Single
1 "small" chip of 3.5 Ghash/s in the Jalapeno which has roughly 1/6th the die size (therefore 1/6th the performance) of the other chips

Wouldn't this require 2 separate mask sets etc. which would produce 2 times NRE cost?

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: pieppiep on July 26, 2012, 04:09:15 AM

So you should add to the question the minimum hashrate for a chip.
But you don't know yet how many chips BFL is using in the new products.

Best guess:
50 chips of 20 Ghash/s each in the Mini Rig
2 chips of 20 Ghash/s each in the Single
1 "small" chip of 3.5 Ghash/s in the Jalapeno which has roughly 1/6th the die size (therefore 1/6th the performance) of the other chips
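
The guess above works out as follows in a short Python sketch. The chip counts and per-chip hashrates are the guesses from this post, not confirmed BFL specifications, and the helper name is made up for illustration.

```python
CHIP_GHASH = 20.0       # guessed hashrate of one full-size chip, in Ghash/s
SMALL_CHIP_GHASH = 3.5  # guessed hashrate of the small Jalapeno chip, in Ghash/s

def product_ghash(chips: int, ghash_per_chip: float) -> float:
    """Total hashrate of a product built from identical chips."""
    return chips * ghash_per_chip

mini_rig = product_ghash(50, CHIP_GHASH)
single = product_ghash(2, CHIP_GHASH)
jalapeno = product_ghash(1, SMALL_CHIP_GHASH)

# Small-chip performance relative to a full-size chip (1/6 would be about 0.167)
die_ratio = SMALL_CHIP_GHASH / CHIP_GHASH

print(mini_rig, single, jalapeno)  # 1000.0 40.0 3.5
print(f"{die_ratio:.3f}")          # 0.175
```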

mrb

legendary

Activity: 1512

Merit: 1028

Quote from: Coinoisseur on July 26, 2012, 03:38:28 PM

28nm theoretical Mh/W have been tossed around. Very rosy to think BFL has a VC source willing to pony up 10s of millions up front for that kind of chip development.

BFL's claims are plausible at 45nm, not 28nm. See post above.

Topic: Best demonstrated efficiency: 1290 Mhash/Joule - page 4. (Read 20634 times)