Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 47.

r2k-in-the-vortex

newbie

Activity: 19

Merit: 0

I wouldn't recommend buying an FPGA board if you only want to mine bitcoins, but if you want to learn FPGA development I guess you already know what you want.
Also, pretty decent programming skills and a good understanding of electronics are prerequisites.

nelisky

legendary

Activity: 1540

Merit: 1002

So before I decide to go fpga happy and shell out a few grand, any advice on Altera vs Xilinx vs whoever else? And what numbers / features should one look at in the specs when deciding what to get? LUT count is obvious, but what else?

r2k-in-the-vortex

newbie

Activity: 19

Merit: 0

Hi

I have also been wanting to get into FPGA development for ages, and this bitcoin mining idea gave me the final push to acquire an FPGA dev board. I will get my Basys2 PCB with a Spartan 3E 250K gate device on Thursday.
Of course I can't put a fully pipelined implementation in it, as that is about 3X larger than the FPGA can fit, but I imagine I could fit a looped implementation into it easily.

So thanks for releasing the code, it will be a good reference.

I don't know if I'll get the actual implementation working on it, but among other projects this will certainly be one idea I'll try.

interfect

full member

Activity: 141

Merit: 100

I'd like to second the request for the small, slow version. I have access to a bunch of DE2 boards (not DE2-115s), which have Cyclone IIs with (I think) 33000 LEs on them. Something that would fit on those would be nice to have, just so I have an excuse to play with them.

On a related note, anyone know how to restore the firmware on a DE2 board?

mimarob

full member

Activity: 354

Merit: 103

Great!

Now there is a reference implementation, so I don't have to threaten people with my crappy fpga code anymore!

fpgaminer

hero member

Activity: 560

Merit: 517

Quote

I did not look at the code but maybe you can clarify to me how this particular approach scales;

As mentioned, it scales linearly, but only in integer multiples. So an 80K device can get 80MH/s. A 160K device can get 160MH/s. A 240K device can get 240MH/s. But no in-between. At least, not without a different design.

Note that the design in the repo is not optimized and so uses something like 90K LEs. An optimized design that fits a device with at least 80K LEs will be released once I've finished it.

Quote

Say I use one of these http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=138&No=501 instead, what kind of performance could one expect? I'm assuming you can easily put multiple fully unrolled code paths for parallel execution.

Correct. The EP4SGX230 flavor would get at least 160MH/s. The EP4SGX530 would get at least 480 MH/s.

I say "at least" because as far as I understand the Stratix series of devices have better timing than the Cyclone series, and so will support a much faster clock. If they are, for example, twice as fast then you can expect 960MH/s out of the EP4SGX530. However, I don't know for sure what clock rate they can achieve with the mining core.
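The scaling rule described in these answers can be put into a few lines of arithmetic. This is only a rough sketch: the 80K LEs per core and 80 MH/s per core come from the posts above, while the function name, the nominal LE counts used for the two Stratix parts, and the `clock_factor` parameter are assumptions made for illustration.

```python
def estimate_mhs(logic_elements: int,
                 core_les: int = 80_000,
                 mhs_per_core: float = 80.0,
                 clock_factor: float = 1.0) -> float:
    """Estimate hash rate in MH/s for a device with the given LE count.

    Only whole cores count, so the result moves in integer multiples of
    mhs_per_core. clock_factor is a hypothetical multiplier for devices
    that can run the same core at a faster clock.
    """
    cores = logic_elements // core_les  # partial cores are not usable
    return cores * mhs_per_core * clock_factor


# Nominal LE counts assumed from the part names, not from a datasheet.
for name, les in [("EP4SGX230", 230_000), ("EP4SGX530", 530_000)]:
    print(name, estimate_mhs(les), "MH/s at the same clock")
```

A 120K device gives the same estimate as an 80K one, which is the "no in-between" point: the leftover 40K LEs cannot hold another full core of this design.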

Quote

In the FPGA though, you have to implement something to communicate with bitcoind and, if using multiple devices, communicate with each one of them, right?

The FPGA requires a controller, and so is really just a dumb processor like the GPU. It performs the hashing algorithm, and that's about it. Like a GPU, there is a small memory space inside the FPGA that a controller must write the work to (through some external interface like SPI), and a memory space where results (valid hashes) must be read from.

The controller gives the FPGA a 256-bit Midstate, and 512-bit Data (which are acquired through a getwork request from bitcoind or a mining pool). The FPGA then proceeds to process all 2^32 variations and return any nonces that result in a valid hash. In that sense, it's exactly like a GPU where you give it data, and tell it to run 2^32 instances of the kernel.
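For anyone who wants to see that search in software terms, here is a minimal sketch of the nonce scan. It is not the FPGA design: Python's `hashlib` does not expose the midstate, so this version rehashes the whole 80-byte block header each time, whereas the midstate lets the hardware skip the first 64-byte chunk. The function names, the all-zero header prefix, and the easy target in the usage lines are made up for illustration.

```python
import hashlib
import struct


def header_hash_int(prefix76: bytes, nonce: int) -> int:
    """Double SHA-256 of the 80-byte header, read as a little-endian integer."""
    header = prefix76 + struct.pack("<I", nonce)  # nonce is the last 4 bytes
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little")


def scan_nonces(prefix76: bytes, target: int, start: int = 0,
                count: int = 2**32) -> list:
    """Return every nonce in [start, start + count) whose hash is <= target.

    The default count covers all 2^32 nonces, which is far too slow in
    pure Python; pass a small count when experimenting.
    """
    if len(prefix76) != 76:
        raise ValueError("expected the first 76 bytes of a block header")
    end = min(start + count, 2**32)
    return [n for n in range(start, end)
            if header_hash_int(prefix76, n) <= target]


# Hypothetical work unit: an all-zero header prefix and a very easy target.
found = scan_nonces(bytes(76), target=1 << 248, start=0, count=1000)
print(len(found), "nonces met the easy target")
```

The controller's job is then just to hand over new work and collect whatever nonces come back.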

The controller can be anything. A microcontroller like an Arduino, a microprocessor like an ARM, or even an entire PC like the one you're reading this post on.

Quote

How would one scale this to multiple FPGAs? Some communication between devices would be needed, or will there be a full TCP/IP stack communicating with bitcoin on each one?

As said there are many approaches. One approach is to have the controller make a getwork request for each FPGA, so each FPGA gets its own data to work on and cycle through 2^32 times. This has the benefit of scaling easily, and not requiring traces on the board between the FPGAs (which would need to support high frequency data transfers). The FPGAs can just be put on a single bus, like I2C or SPI, and controlled by a single microcontroller or microprocessor (possibly embedded in one of the FPGAs).
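The one-getwork-per-FPGA approach could look roughly like this on the controller side. Everything here is a stand-in: `Work`, `BusDevice`, `dispatch` and the `get_work` callable are hypothetical names, and a real controller would write the work over I2C or SPI and fetch it from bitcoind or a pool rather than generate it locally.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Work:
    midstate: bytes  # 256-bit midstate
    data: bytes      # 512-bit data block


class BusDevice:
    """Stand-in for one FPGA on a shared bus; load() models the bus write."""

    def __init__(self, address: int) -> None:
        self.address = address
        self.work: Optional[Work] = None

    def load(self, work: Work) -> None:
        self.work = work


def dispatch(devices: List[BusDevice],
             get_work: Callable[[], Work]) -> Dict[int, Work]:
    """Request a separate work unit for each device and load it."""
    assigned = {}
    for dev in devices:
        work = get_work()  # one getwork request per FPGA
        dev.load(work)
        assigned[dev.address] = work
    return assigned
```

Because each device gets its own work, the FPGAs never need to talk to each other, and adding another one is just another address on the bus.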

Tukotih

member

Activity: 70

Merit: 10

Quote from: allinvain on May 21, 2011, 10:46:44 PM

Quote from: bulanula on May 21, 2011, 09:21:38 PM

Quote from: allinvain on May 21, 2011, 07:38:36 PM

Off topic perhaps, but I'm wondering if ATI hasn't already built a mining farm of their own.

Lol they have

Whaaaat?

Can you post some proof plz.

Lol, I think he is joking.

The funny thing is that they could just sell them off later as a retail "brand new" GPU. They are already testing every device carefully so why not test them with some bitcoin mining? I believe that would be insanely profitable for them. I mean, how many 6990's do you think they test every hour right now? The tests might be quick, expensive and advanced though...

nathanrees19

full member

Activity: 196

Merit: 100

Quote from: nelisky on May 22, 2011, 07:17:53 AM

I did not look at the code but maybe you can clarify to me how this particular approach scales;

It should scale linearly.

Quote from: nelisky on May 22, 2011, 07:17:53 AM

How would one scale this to multiple FPGAs? Some communication between devices would be needed, or will there be a full TCP/IP stack communicating with bitcoin on each one?

There are many approaches. You could have one master block (with a basic TCP/IP stack) that distributes the work amongst all the other FPGAs if you wanted to eliminate the PC entirely.

Quote from: nelisky on May 22, 2011, 07:17:53 AM

I guess my confusion comes from my PC coding mindset. In the GPU implementation we just create the "unit" calculator and the GPU knows nothing about communicating to and from the bitcoin protocol, it just has an algorithm and a memory space to work with.

If I'm correctly understanding the current implementation, that's pretty much what it does. A script running on a PC handles the communication, and hands the work to the FPGA over the USB interface.

nelisky

legendary

Activity: 1540

Merit: 1002

I did not look at the code but maybe you can clarify to me how this particular approach scales;

Say I use one of these http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=English&CategoryNo=138&No=501 instead, what kind of performance could one expect? I'm assuming you can easily put multiple fully unrolled code paths for parallel execution.

How would one scale this to multiple FPGAs? Some communication between devices would be needed, or will there be a full TCP/IP stack communicating with bitcoin on each one?

I guess my confusion comes from my PC coding mindset. In the GPU implementation we just create the "unit" calculator and the GPU knows nothing about communicating to and from the bitcoin protocol, it just has an algorithm and a memory space to work with. In the FPGA though, you have to implement something to communicate with bitcoind and, if using multiple devices, communicate with each one of them, right?

Sorry for the dense questions, just trying to understand this.

nathanrees19

full member

Activity: 196

Merit: 100

Quote from: fpgaminer on May 20, 2011, 05:42:41 PM

Quote

A Spartan 3E 250K has 5508 logic cells. If I understand correctly, this is not enough for a design that needs 90K LUTs, though I'm still learning about all of this.

I'll try to put my serial design online as well. That fits into 3K or 4K I think, although it's obviously much slower (64 cycles per hash). It's a nice toy for people to play with if they have smaller boards.

That would be very much appreciated. Many low-end boards could fit half a dozen of these, and you could pack some into the remaining space on a higher end board to approach 95-100% LUT utilisation.
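A back-of-the-envelope way to size this up, assuming the figures quoted above (3K to 4K per serial core, 64 cycles per hash). The function names and the 50 MHz clock in the example are guesses for illustration, and logic cells are not exactly the same unit as LUTs or LEs, so treat the result as a rough bound at best.

```python
def serial_cores_that_fit(device_cells: int, core_cells: int = 4_000) -> int:
    """How many whole serial cores fit, using the pessimistic 4K figure."""
    return device_cells // core_cells


def serial_mhs(cores: int, clock_mhz: float,
               cycles_per_hash: int = 64) -> float:
    """Hash rate in MH/s when each core finishes one hash every N cycles."""
    return cores * clock_mhz / cycles_per_hash


# Spartan 3E 250K with 5508 logic cells, at a hypothetical 50 MHz clock.
cores = serial_cores_that_fit(5508)
print(cores, "core(s),", serial_mhs(cores, clock_mhz=50.0), "MH/s")
```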

allinvain

legendary

Activity: 3080

Merit: 1083

Quote from: bulanula on May 21, 2011, 09:21:38 PM

Quote from: allinvain on May 21, 2011, 07:38:36 PM

Off topic perhaps, but I'm wondering if ATI hasn't already built a mining farm of their own.

Lol they have

Whaaaat?

Can you post some proof plz.

bulanula

hero member

Activity: 518

Merit: 500

Quote from: allinvain on May 21, 2011, 07:38:36 PM

Off topic perhaps, but I'm wondering if ATI hasn't already built a mining farm of their own.

Lol they have

allinvain

legendary

Activity: 3080

Merit: 1083

Off topic perhaps, but I'm wondering if ATI hasn't already built a mining farm of their own.

eturnerx

member

Activity: 84

Merit: 10

Quote from: teknohog on May 21, 2011, 02:18:31 PM

There must be cheaper and simpler boards with the same or equivalent FPGA. If you buy this board just for mining, all those nice I/O ports and displays are going to waste.

I saw some cheap boards that were basically RS232+PMOD+FPGA boards - just trying to find them again. Add another $25 for an Ethernet PMOD and you've got a standalone miner. You could probably chain a few of them together to use just the one Ethernet PMOD.
I think the closest thing I've seen to what you guys want is: http://www.copacobana.org/ - it's basically nothing but the minimum to run a butt load of FPGAs. Economies of scale much?
I think one of the things that'll drive FPGAs to better economies per MHash/s is good PCB design. You're right - once the design is proven in a development board it makes more sense to invest in custom PCBs for mass rollout.

k

sr. member

Activity: 451

Merit: 250

Quote from: allinvain on May 21, 2011, 04:30:13 PM

Quote from: Basiley on May 21, 2011, 05:30:39 AM

Quote from: allinvain on May 20, 2011, 09:04:52 PM

This breaks my heart, open source FPGA mining code, yet the hardware is prohibitively expensive for the performance that it gives. For $600 I can buy nearly 4 5870s!

Nowadays FPGAs have low density per package (read: small chips, fewer switches per package), are made on obsolete wafer processes (0.13 in the "best case", more usually 0.35), which means they run at laughable frequencies, and are produced in small quantities, primarily for IC designers, supercomputer manufacturers and governments.
When/if the market grows and/or someone invests in it (without which it's unlikely to happen), something may change.

I hope that happens, or it could also be the case that video cards will just get more and more powerful and they'll always dominate FPGAs. Where FPGAs excel is performance per watt, but not per $... If video cards double or triple their hashing performance but remain within the same thermal (PCIE 2.0 spec) envelope, then GPU mining will still have a future.

Let's just hope that future ATI cards will remain excellent performers in doing straight integer work. But I have some doubts, as games are more floating point heavy and naturally graphics card manufacturers are going to optimize their architecture to perform better while gaming, not mining.

Imagine though if ATI caught wind of this whole mining phenomenon and they'd produce a Cayman core optimized for hashing!

I was thinking about this and about how much mining hardware is continually being added to the network, and whether it would be on a GPU manufacturer's radar.

based on the numbers I have seen ~620000 Mhash/s was added over the period of the last difficulty increase. I think that took ~9 days. With some very rough top of the head numbers, that's ~1000 AMD cards (just assuming an average of 620 Mhash/card here for round numbers) say in 9 days. If that was constant it would imply ~40000 new AMD cards/year. How many sales would that translate into for AMD? A large % would probably have been cards people already owned.
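Spelled out, the rough estimate goes like this. The inputs are the top-of-the-head figures from the paragraph above; the function name is just for illustration.

```python
def cards_per_year(added_mhash: float, mhash_per_card: float,
                   period_days: float) -> float:
    """Extrapolate cards added per year from one difficulty period."""
    cards_in_period = added_mhash / mhash_per_card
    return cards_in_period * 365 / period_days


# ~620000 Mhash/s added in ~9 days, assuming ~620 Mhash/s per card.
print(round(cards_per_year(620_000, 620, 9)))
```

That comes out a little over 40000 cards/year, and only if the growth stayed constant, which it has not.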

But the rate of increase in the network strength has been increasing exponentially.

If the rate of increase continues on this trend, it's not hard to see the number of cards needed to maintain the network strength becoming significant and starting to have an impact on graphics card sales numbers. I found this article from a Google search saying the Q1 sales numbers for add-in boards were ~19m and that gaming enthusiasts account for ~9m graphics card sales/year: http://jonpeddie.com/publications/add-in-board-report/ Just a few more doublings of the network hash rate (not long on the current trajectory) and the number of cards/year approaches a significant % of graphics card sales.

It'll be interesting to see if the incredible growth in everything bitcoin continues on its current explosive path.
But who knows, by then perhaps FPGAs will have taken over and we'll be discussing the imminent arrival of custom made ASICs.
Sorry for going way off topic - moderators feel free to move this post if you want.

jasonk

full member

Activity: 168

Merit: 100

Are there other dev boards that have more FPGAs per board that would make things more cost effective? $300 for 50M hashes is steep. If one could get a board for $500 with 10 FPGAs on it, that would be more worth it.

allinvain

legendary

Activity: 3080

Merit: 1083

Quote from: teknohog on May 21, 2011, 02:18:31 PM

Quote from: fpgaminer on May 19, 2011, 09:33:56 PM

Compatible Board (and only purchase currently required):
Terasic DE2-115 Development Board

There must be cheaper and simpler boards with the same or equivalent FPGA. If you buy this board just for mining, all those nice I/O ports and displays are going to waste.

Hey, let's form a mining group and approach the FPGA manufacturers and tell them that if they can build us a custom, more stripped down version of their hardware, we'd buy in bulk. I think aggregating purchasing power this way can cut costs down for everyone. Just a thought.

allinvain

legendary

Activity: 3080

Merit: 1083

Quote from: Basiley on May 21, 2011, 05:30:39 AM

Quote from: allinvain on May 20, 2011, 09:04:52 PM

This breaks my heart, open source FPGA mining code, yet the hardware is prohibitively expensive for the performance that it gives. For $600 I can buy nearly 4 5870s!

Nowadays FPGAs have low density per package (read: small chips, fewer switches per package), are made on obsolete wafer processes (0.13 in the "best case", more usually 0.35), which means they run at laughable frequencies, and are produced in small quantities, primarily for IC designers, supercomputer manufacturers and governments.
When/if the market grows and/or someone invests in it (without which it's unlikely to happen), something may change.

I hope that happens, or it could also be the case that video cards will just get more and more powerful and they'll always dominate FPGAs. Where FPGAs excel is performance per watt, but not per $... If video cards double or triple their hashing performance but remain within the same thermal (PCIE 2.0 spec) envelope, then GPU mining will still have a future.

Let's just hope that future ATI cards will remain excellent performers in doing straight integer work. But I have some doubts, as games are more floating point heavy and naturally graphics card manufacturers are going to optimize their architecture to perform better while gaming, not mining.

Imagine though if ATI caught wind of this whole mining phenomenon and they'd produce a Cayman core optimized for hashing!

teknohog

sr. member

Activity: 520

Merit: 253


Quote from: fpgaminer on May 19, 2011, 09:33:56 PM

Compatible Board (and only purchase currently required):
Terasic DE2-115 Development Board

There must be cheaper and simpler boards with the same or equivalent FPGA. If you buy this board just for mining, all those nice I/O ports and displays are going to waste.

Basiley

newbie

Activity: 42

Merit: 0

Quote from: allinvain on May 20, 2011, 09:04:52 PM

This breaks my heart, open source FPGA mining code, yet the hardware is prohibitively expensive for the performance that it gives. For $600 I can buy nearly 4 5870s!

Nowadays FPGAs have low density per package (read: small chips, fewer switches per package), are made on obsolete wafer processes (0.13 in the "best case", more usually 0.35), which means they run at laughable frequencies, and are produced in small quantities, primarily for IC designers, supercomputer manufacturers and governments.
When/if the market grows and/or someone invests in it (without which it's unlikely to happen), something may change.

Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 47. (Read 432976 times)