Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 30. (Read 432966 times)

hero member
Activity: 686
Merit: 564
www.asicminer.net

Too good to be true?
They also claim to have reached 300MHash/sec on the OGD1 FPGA board, so I'm treating everything they say with just a slight pinch of salt for now...
member
Activity: 98
Merit: 10
www.asicminer.net

Too good to be true?

Maybe yes, maybe not so yes! The numbers I threw out are based on making the ASIC in the context of a start-up tiger team in Southern CA, with engineering costs and wages typical for the area. Doing it in China changes the salary numbers. If it's done by a company that already has access to the ASIC tools, that helps, too (e.g., I have access to all of the software needed to do ASIC design at work, but I'm just not in a position to use company resources on this kind of thing without getting fired). Maybe these folks are in a position to get better pricing on the first silicon (like, maybe a cousin in just the right position at a semiconductor foundry can slip the job into a shuttle run?). Maybe some guy who owns an ASIC design house in China happens to be a Bitcoin enthusiast, and decided to pursue the ASIC design with resources he already had? Maybe some mid-level manager decided to put his team on the job without getting approval from higher up? Maybe it's all just a scam?

In any case, thanks for finding that! I think it's worth watching to see if it's real.
member
Activity: 99
Merit: 10
member
Activity: 98
Merit: 10
Thank you, and I'm glad that I could contribute something to the thread!

Now, please don't take my negativism about making an ASIC miner the wrong way. If Bitcoin does succeed in becoming a viable world currency as we hope that it will, then I think that a time will come when it will make sense to make mining ASICs. I don't think that time is quite yet, and I just wanted to help folks understand what "make an ASIC" really means. I think it would be a fun project to work on, but I don't have a couple million bucks to spend on it just now. :)

One thing that would help reduce risk in the eyes of any potential ASIC development investor would be if the same hardware could also perform other useful functions with an exploitable market, such as cracking passwords or generating pseudorandom porn. This may not be the right thread to brainstorm in this topic, but I am curious about what other useful tasks could be performed by hardware that is designed to mine efficiently, whether it's hard coded into an ASIC with just the right kind of configurability, or programmed into an FPGA.
full member
Activity: 158
Merit: 100
Something personal goes here.
hero member
Activity: 560
Merit: 517
Very well written and insightful post, NF6X. Thank you for contributing your knowledge!
member
Activity: 98
Merit: 10
Who guessed the guy with the South Park picture knew so much?

My mind is a warehouse of useless trivia. :)
full member
Activity: 210
Merit: 100
Who guessed the guy with the South Park picture knew so much?
member
Activity: 98
Merit: 10
The time it takes to create a production-ready ASIC is not constrained by processing power; it's not something that you can speed up by throwing more computers at it.

The one year benchmark is a rough estimate of the total time needed for an ASIC of low to moderate complexity, designed by a small team of experienced engineers (say, around five of them, with various different specializations), based on my experience in the industry, both directly working on ASIC designs and in other roles related to development, test, support, etc. It's hard to explain where all of that time and money goes to somebody without direct experience in the field, but I'll try to throw out examples of some of the major costs and time-drains. This list is nowhere near exhaustive... bringing an ASIC design from concept through production readiness is a terribly complex task which involves direct action by many dozens of people over a long period of time.

First, the engineers. Experienced ASIC engineers don't come cheap. They'll expect six-figure salaries, and if you don't want to give them that, then they'll go work for somebody else. They also won't just go work on a one-off project like this without some strong incentive, if it means leaving a large, profitable company with good compensation, benefits and stability. For an ASIC comparable to a scaled-up version of the FPGA designs discussed in this thread, I'd estimate that 3-5 engineers would work on it directly. One would be dedicated to physical design (place and route, managing the foundry libraries for the other engineers, etc.). One would be dedicated to production and testing. Then one to three would work on the design itself. So, there are 3-5 expensive, skilled, experienced people who each expect to make north of a hundred grand a year plus benefits (and they will earn that pay with long hours and a lot of difficult work). If you want them to work as contractors and then go away, rather than staying at your company for several years, then double their pay.

Then, there's a mask set for each release to the chip foundry. If your team is very skilled, very careful, and very lucky, they might come up with a production-ready part on the first try, but it's safer to assume at least two mask sets. Guess what: a full mask set for a fairly recent process will cost around a quarter million dollars. Prices can be lower, particularly if you share the wafer with other jobs (usually called a "shuttle run" in the industry), but it's still Real Money.

OK, those engineers are expensive already, but they can't do anything without tools. If you haven't been exposed to the semiconductor industry before, you might be utterly stunned by the cost of ASIC design software. That team of 3-5 engineers will probably need over $50k per year for their software licenses. They'll also need computers to run that software on. And somewhere to work. Heating and air conditioning are nice, but it's amazing what horrible conditions people will put up with if there is enough money in it. Don't skimp on coffee, soft drinks and snacks, though, because that investment really pays off in increased productivity.

The team will probably spend three or four months on the design of a chip that's basically a scaled-up version of the FPGA designs discussed in this thread, including design, simulation, place and route, timing closure, packaging, bond-out, and all of the other stuff needed. Once they release the design to the chip foundry (called "taping out"), it'll take a couple of months before the first silicon arrives. You need to keep paying the engineers while you're waiting if you want them to be around to debug the chip when it arrives. Of course, folks aren't just sitting on their hands while the chips are being made; there's also the production test program that needs to be developed.

Once the first silicon arrives, you'll be spending some quality time in the lab debugging the chip, and hopefully finding workarounds for any bugs that let you avoid an expensive and time-consuming design spin. Remember, the ASIC equivalent of typing "make" costs a quarter million bucks and takes a couple of months. Presumably, you had the foresight to design any needed PCBs for the chip bring-up effort in the lab. You'll probably be amazed by the cost of the production test board and its socket.

Oh, yeah, you'll also need some expensive test equipment for the lab work. We're talking about a GHz chip here, so a $100 logic analyzer from SparkFun won't cut it. Luckily, the test equipment can be rented instead of bought, saving you tens of thousands of dollars (but still costing you thousands of dollars).

After you spend all of this time, if you have done everything right, you now have a working chip design. Now you get to start manufacturing the chips and trying to sell them. And dealing with customer support. And production issues. And testing. And returns. By the time all is said and done, it's not worth starting the job unless you expect to sell several million dollars' worth of chips to the manufacturers who make products that use them.

I'm sure I've missed a lot of details, but I hope that this has helped explain a little bit of what's involved in making an ASIC.
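To put rough numbers on the costs listed above, here is a minimal back-of-the-envelope sketch. It only uses the figures quoted in this post (3-5 engineers at about $100k each, about $250k per mask set, about $50k per year for software, roughly one year of work). The class name `AsicBudget` and its fields are made up for illustration, and benefits, office space, lab equipment and test boards are deliberately left out, so treat the result as a floor rather than a quote.

```python
from dataclasses import dataclass


@dataclass
class AsicBudget:
    """Rough first-year cost model; all defaults are the post's ballpark figures."""
    engineers: int
    salary_usd: float = 100_000           # "north of a hundred grand a year"
    mask_sets: int = 2                    # "safer to assume at least two mask sets"
    mask_set_usd: float = 250_000         # "around a quarter million dollars"
    software_usd_per_year: float = 50_000
    years: float = 1.0
    contractor_multiplier: float = 1.0    # use 2.0 for "double their pay"

    def total(self) -> float:
        people = (self.engineers * self.salary_usd
                  * self.contractor_multiplier * self.years)
        tools = self.software_usd_per_year * self.years
        masks = self.mask_sets * self.mask_set_usd
        return people + tools + masks


if __name__ == "__main__":
    print(AsicBudget(engineers=3, mask_sets=1).total())               # lucky case
    print(AsicBudget(engineers=5).total())                            # employees
    print(AsicBudget(engineers=5, contractor_multiplier=2.0).total()) # contractors
```

With five engineers and two mask sets this lands a little over a million dollars before any of the omitted costs, which is in line with the "million or two" figure mentioned earlier in the thread.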
hero member
Activity: 560
Merit: 517
Quote
Why is ASIC so slow to do?
It's slow because of logistics, not necessarily because the software is slow. When you're dropping millions of dollars on something, you want it done right the first time, so you take your time triple-checking everything and finding the right people and factory (including negotiations).

That said, there are cheaper and quicker options. As pointed out earlier in this thread, there are services (e.g., MOSIS) where you can share a wafer with a bunch of other people. This is for test runs, and I think they do runs every two months or less. But obviously the cost per chip is very high. I think when I ran the numbers with someone else, it worked out to maybe getting chips for $2 per MHash/s, with a cost of $52,000 USD for 40 chips. That seems reasonable for a first test run, but I don't have $52,000 sitting around, and it would need a custom PCB designed.

That would be a stepping stone towards full scale production, if the demand and investment are made available. But it isn't a viable option on its own, because you can build FPGA solutions for around $2 per MHash/s :P
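For anyone who wants to check the shared-wafer numbers, a small sketch of the arithmetic follows. It takes only the three figures quoted above ($52,000, 40 chips, $2 per MHash/s) and works backwards to the per-chip cost and the per-chip hash rate those figures imply. The function names are made up for illustration, and the implied hash rate is derived here, not something stated in the thread.

```python
def cost_per_chip_usd(total_cost_usd: float, chips: int) -> float:
    """Cost of one chip from a shared-wafer test run."""
    return total_cost_usd / chips


def implied_mhash_per_chip(total_cost_usd: float, chips: int,
                           usd_per_mhash: float) -> float:
    """Hash rate each chip must reach to hit the quoted $/MHash/s figure."""
    return cost_per_chip_usd(total_cost_usd, chips) / usd_per_mhash


if __name__ == "__main__":
    print(cost_per_chip_usd(52_000, 40))              # 1300.0 USD per chip
    print(implied_mhash_per_chip(52_000, 40, 2.0))    # 650.0 MHash/s per chip
```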
legendary
Activity: 2940
Merit: 1090
Why is ASIC so slow to do? If the routing compiles and so on take a lot of hours, aren't there ways to process such problems in parallel or something? Maybe we could all share the CPU power of our mining rigs to help speed up such computations?

-MarkM-
member
Activity: 98
Merit: 10
Regarding ideas about an ASIC miner, I believe that it would be very feasible on a purely technical basis. The only snag is that it would take on the order of a million dollars and a year of work up front to assemble a small ASIC team, design the chip, fabricate the first silicon in a fairly recent process, debug it, spin it once if necessary, develop production test, and get that first prototype ASIC-based mining platform out to the market in sample quantities.

My whole career has been in the semiconductor industry so far. I could certainly assemble a design team to do this. All I would need is a million or two dollars and some expectation of an ROI greater than I'd get by simply putting the money in a savings account. Anybody want to fund a small start-up in southern California to make mining hardware? :/

member
Activity: 99
Merit: 10
:o

Pretty slick there. Nice going.
member
Activity: 70
Merit: 10
One cycle per nonce, fully unrolled. So 100MHz = 100MHash/s. And in case it isn't clear, that is a full Bitcoin Hash; two passes of SHA-256 every clock cycle.
Wow. That's quite impressive. Now we need to make an ASIC that runs at 1GHz with 12 fully-unrolled miners on it. Then we need to put four of them on a card.


Best I have been able to do so far on Stratix IV 530 was 8 unrolled cores at 175MHz. There is still room for improvement though.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
You are talking about bitcoind <-> script, right? Please tell me if I'm wrong.
I'm talking about script <-> FPGA; the bottleneck is here, I think.
I'm talking about bitcoind<->script, yes. But script<->FPGA is also no problem. A work unit is about 300 bytes. Even a simple I2C bus could handle 1,000 work units a second. JTAG is even faster.

The obvious optimization would be to build the two pieces of code together to avoid having to serialize/deserialize the work units. So you'd just be generating the work unit and writing it to a high-speed serial port. Probably the best way to do it is to have the ASIC read work units as it needs them, and use the flow control in the serial port to control the flow rate. You could use a hardware FIFO if needed. You would probably want a way to clear the buffer when a new block was found to avoid wasted work, but it might be so fast you're better off just letting it drain in the fraction of a second that would take. ;)
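The FIFO-with-flow-control idea can be modelled in a few lines of software. This is only a sketch of the behaviour described above (bounded buffer, the chip pulls work when it needs it, optional flush on a new block); the class `WorkFifo` and its method names are hypothetical, and a real design would do this in hardware on the serial link.

```python
from collections import deque


class WorkFifo:
    """Bounded FIFO between the host script and the hashing chip (hypothetical)."""

    def __init__(self, depth: int):
        self.depth = depth
        self._units = deque()

    def ready(self) -> bool:
        # Flow control: the host may send only while there is room.
        return len(self._units) < self.depth

    def push(self, work_unit: bytes) -> bool:
        if not self.ready():
            return False  # host has to wait, like a deasserted flow-control line
        self._units.append(work_unit)
        return True

    def pop(self):
        # The chip pulls the next unit when it runs out of nonces.
        return self._units.popleft() if self._units else None

    def flush(self) -> int:
        # Called when a new block is found: queued work is now stale.
        dropped = len(self._units)
        self._units.clear()
        return dropped
```

Whether `flush` is worth having depends on the depth: with only a couple of entries queued, letting the buffer drain costs a fraction of a second, as noted above.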
legendary
Activity: 1050
Merit: 1000
You are WRONG!
Hmm. If we did that, there would be a bottleneck. There would be a need to call getwork every 1/3 s for every chip, right?
I load-tested my getwork optimization patches with a script that does 1,000 getwork queries. The script takes about 1/10th of a second.

You need one work unit per 2^32 hashes. We can easily generate 10,000 work units a second without even doing any serious optimization. That would sustain 43THash/s on my lowly Core 2 Quad.

You are talking about bitcoind <-> script, right? Please tell me if I'm wrong.
I'm talking about script <-> FPGA; the bottleneck is here, I think.
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
Hmm. If we did that, there would be a bottleneck. There would be a need to call getwork every 1/3 s for every chip, right?
I load-tested my getwork optimization patches with a script that does 1,000 getwork queries. The script takes about 1/10th of a second.

You need one work unit per 2^32 hashes. We can easily generate 10,000 work units a second without even doing any serious optimization. That would sustain 43THash/s on my lowly Core 2 Quad.
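The 43 THash/s figure follows directly from the size of the nonce field. A minimal sketch of that arithmetic, assuming (as stated above) that one work unit covers the full 32-bit nonce range; the function names are made up for illustration:

```python
NONCES_PER_WORK_UNIT = 2 ** 32  # one work unit covers the whole 32-bit nonce range


def sustainable_hash_rate(work_units_per_second: float) -> float:
    """Hashes per second that a given work-unit supply can keep busy."""
    return work_units_per_second * NONCES_PER_WORK_UNIT


def work_units_needed(hashes_per_second: float) -> float:
    """Work units per second needed to feed a given hash rate."""
    return hashes_per_second / NONCES_PER_WORK_UNIT


if __name__ == "__main__":
    # 10,000 work units a second, expressed in THash/s (about 43).
    print(sustainable_hash_rate(10_000) / 1e12)
```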
legendary
Activity: 1050
Merit: 1000
You are WRONG!
One cycle per nonce, fully unrolled. So 100MHz = 100MHash/s. And in case it isn't clear, that is a full Bitcoin Hash; two passes of SHA-256 every clock cycle.
Wow. That's quite impressive. Now we need to make an ASIC that runs at 1GHz with 12 fully-unrolled miners on it. Then we need to put four of them on a card.

Hmm. If we did that, there would be a bottleneck. There would be a need to call getwork every 1/3 s for every chip, right?
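The 1/3 s figure can be checked quickly. This sketch assumes the hypothetical chip from the quote (1 GHz, 12 fully unrolled miners, one nonce per miner per clock) and that one getwork result covers the full 32-bit nonce range; the function name is made up for illustration.

```python
NONCE_SPACE = 2 ** 32  # nonces covered by one getwork result


def seconds_per_work_unit(clock_hz: float, miners: int) -> float:
    """Time for a chip to exhaust one work unit at one nonce per miner per clock."""
    return NONCE_SPACE / (clock_hz * miners)


if __name__ == "__main__":
    # 1 GHz with 12 unrolled miners: roughly 0.36 s per work unit.
    print(seconds_per_work_unit(1e9, 12))
```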
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
One cycle per nonce, fully unrolled. So 100MHz = 100MHash/s. And in case it isn't clear, that is a full Bitcoin Hash; two passes of SHA-256 every clock cycle.
Wow. That's quite impressive. Now we need to make an ASIC that runs at 1GHz with 12 fully-unrolled miners on it. Then we need to put four of them on a card.
hero member
Activity: 560
Merit: 517
Quote
So is there any overhead loss when you "reduce the number of workers?" Or is it linear? half the workers, half the hash rate?
There is overhead, because when fully unrolled it can take advantage of several optimizations specific to each round of SHA-256 with respect to Bitcoin. When you roll it up, you lose all those optimizations; each round calculator/worker needs to be generalized. Like the assembly line example, if each worker only does one specific thing, they can be very efficient at that one thing. But if they need to do different things at different times then they lose that specialization advantage.

Quote
Awesome. One more question -- when fully unrolled, how many clock cycles per nonce tested?
One cycle per nonce, fully unrolled. So 100MHz = 100MHash/s. And in case it isn't clear, that is a full Bitcoin Hash; two passes of SHA-256 every clock cycle.
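For reference, here is what "a full Bitcoin Hash" means in software terms: SHA-256 applied twice to the 80-byte block header, with the nonce in the last four bytes. This is a plain Python sketch using `hashlib`, not the FPGA design itself; the function names and the `header76` parameter (the header without its nonce) are made up for illustration. The loop tests one nonce per iteration, which is the work the unrolled pipeline finishes once per clock.

```python
import hashlib
import struct


def bitcoin_hash(header: bytes) -> bytes:
    """Two passes of SHA-256 over the block header."""
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def scan_nonces(header76: bytes, target: int, start: int, count: int):
    """Try `count` nonces; return the first one whose hash meets the target."""
    for nonce in range(start, start + count):
        header = header76 + struct.pack("<I", nonce)  # nonce is little-endian
        # The hash is compared to the target as a little-endian integer.
        if int.from_bytes(bitcoin_hash(header), "little") <= target:
            return nonce
    return None


def hash_rate(clock_hz: float, cycles_per_nonce: int = 1) -> float:
    """Fully unrolled means one nonce per clock, so 100 MHz gives 100 MHash/s."""
    return clock_hz / cycles_per_nonce
```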