
Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 43. (Read 432921 times)

hero member
Activity: 504
Merit: 500
FPGA Mining LLC
I am probably planning to buy an FPGA. Which one is faster?

The Xilinx Spartan 6 XC6SLX25 FPGA, Speed Grade 3 or a Spartan 3E 500,000 gate FPGA?
Neither is really well suited.
I estimate the XC6SLX25 at about 20MH/s, and the Spartan 3E 500K at <2MH/s.
You'll want to get an XC6SLX150-2, those can do about 190MH/s (at least ArtForz claims to have achieved that).

The next best Spartan3 has 1200K gates vs this 500K, so it might be able to quadruple the units.
Still <5MH/s.

But congratulations on making it work :)
sr. member
Activity: 520
Merit: 253
555
Unfortunately, my Spartan3E 500K has to keep the loop unrolling to a minimum. :(
How fast is it?

I haven't figured it out exactly, but I understand that without any unrolling, it takes something like 66 clock cycles per hash, so at 50 MHz this is a little less than 1 Mhash/s. This would give about 1.5 h between shares in a pool, which is roughly what I see.

However, this is only about 60% utilization; it's frustratingly close to being able to double this. (It would need about 10K vs. my 9K LUTs.) The next best Spartan3 has 1200K gates vs this 500K, so it might be able to quadruple the units.

I think you need a Spartan6 to do any serious mining, but even then you should check the number of logic units, the series has some low-end models as well.

Of course, you can also increase the clock frequency. But, for example, the current code on this chip is limited to about 70 MHz.
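
The figures in this post can be checked with a few lines of arithmetic. This is only a rough sketch: the function names are made up for illustration, and it assumes that a difficulty-1 pool share takes 2^32 hashes on average, which the post does not state.

```python
def hash_rate(clock_hz, cycles_per_hash):
    """Hashes per second for a core that needs a fixed number of clocks per hash."""
    return clock_hz / cycles_per_hash


def seconds_per_share(rate_hps, difficulty=1.0):
    """Average time between shares.

    Assumption: a difficulty-1 share needs 2**32 hashes on average.
    """
    return difficulty * 2**32 / rate_hps


# Numbers quoted in the post: 66 cycles per hash at 50 MHz.
rate = hash_rate(50e6, 66)
hours = seconds_per_share(rate) / 3600

print(f"{rate / 1e6:.2f} MH/s")        # 0.76 MH/s
print(f"{hours:.2f} h between shares")  # 1.57 h between shares

# The same core at the ~70 MHz limit mentioned in the post.
print(f"{hash_rate(70e6, 66) / 1e6:.2f} MH/s")  # 1.06 MH/s
```

Under these assumptions the result agrees with the "a little less than 1 Mhash/s" and "about 1.5 h between shares" estimates.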
newbie
Activity: 20
Merit: 0
I am probably planning to buy an FPGA. Which one is faster?

The Xilinx Spartan 6 XC6SLX25 FPGA, Speed Grade 3 or a Spartan 3E 500,000 gate FPGA?
newbie
Activity: 20
Merit: 0
Phew! After a couple of weeks of learning FPGAs, here is my port of the "Official" FPGA miner to Xilinx chips, using the serial port for communications:

[...]

Unfortunately, my Spartan3E 500K has to keep the loop unrolling to a minimum. :(

How fast is it?
sr. member
Activity: 520
Merit: 253
555
Phew! After a couple of weeks of learning FPGAs, here is my port of the "Official" FPGA miner to Xilinx chips, using the serial port for communications:

http://iki.fi/teknohog/hacks/software/xilinx-serial-miner.zip

I have tried to make only minimal changes to the original Verilog code. The communication could probably use some error checking, but it's a "works for me" first release, with a few accepted shares in a pool.

Unfortunately, my Spartan3E 500K has to keep the loop unrolling to a minimum. :(
hero member
Activity: 686
Merit: 564
1) The last 3 rounds of the second SHA-256 pass are not needed. You only need to check that Round64.H is equal to 0, and the last three rounds do not affect H.

If I'm reading the synthesis messages correctly, I think Quartus II has at least partially noticed this during its optimizations? There's a message about a whole bunch of registers losing all their fanouts during optimization, and the list of which have seems to contain a substantial portion of the last 3 rounds, as well as parts of slightly earlier rounds...
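
The quoted observation can be checked in software. The sketch below is not the miner's code: it is a plain Python SHA-256 compression function (constants derived from the primes so none are typed by hand), and all names in it are made up for illustration. It assumes that "H equal to 0" refers to the last 32-bit word of the final digest, which includes the usual addition of the initial value. The working variable h after round 64 is simply the value e had after round 61, shifted down through f and g, so the last digest word is already known three rounds early.

```python
import hashlib
import math
import struct

M = 0xFFFFFFFF

def first_primes(n):
    primes, c = [], 2
    while len(primes) < n:
        if all(c % p for p in primes):
            primes.append(c)
        c += 1
    return primes

def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = first_primes(64)
# Fractional bits of the square / cube roots of the first primes (SHA-256 constants).
IV = [math.isqrt(p << 64) & M for p in PRIMES[:8]]
K = [icbrt(p << 96) & M for p in PRIMES]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & M

def run_rounds(state, block, rounds):
    """Run only the first `rounds` rounds of the compression function on one block."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & M)
    a, b, c, d, e, f, g, h = state
    for t in range(rounds):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[t] + w[t]) & M
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & M
        a, b, c, d, e, f, g, h = (t1 + t2) & M, a, b, c, (d + t1) & M, e, f, g
    return a, b, c, d, e, f, g, h

def last_word_after_61_rounds(msg32):
    """Last digest word of SHA-256 over a 32-byte message, using 61 rounds only."""
    # A 32-byte input (such as the first-pass digest) pads to a single block.
    block = msg32 + b"\x80" + b"\x00" * 23 + struct.pack(">Q", 256)
    e61 = run_rounds(IV, block, 61)[4]
    return (IV[7] + e61) & M

msg = hashlib.sha256(b"example first-pass input").digest()
full = struct.unpack(">I", hashlib.sha256(msg).digest()[28:])[0]
print(last_word_after_61_rounds(msg) == full)
```

In hardware terms this would mean comparing e after round 61 against `(-IV[7]) & 0xFFFFFFFF` instead of computing the remaining rounds, which may be what the synthesis tool is partly discovering when it prunes those registers.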
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
have you already ported the configurable version to vhdl?
Now I have: http://dl.dropbox.com/u/23683845/fpgaminer-virtex5.zip
You'll need to adjust the line "constant DEPTH : integer := 6;" (2^n pipeline stages) in top.vhd.

If you like my work, please send some coins to 13kwqR7B4WcSAJCYJH1eXQcxG5vVUwKAqY
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
@TheSeven: thanks for providing the vhdl sources. i am currently porting it to an atlys board (which is quite trivial) but the completely unrolled version is not fitting. have you already ported the configurable version to vhdl?
I didn't get around to making this configurable, but I made a smaller one manually earlier today, which is completely untested though.
Hm, I might as well just have a shot at a configurable one right now.

Actually I might offer to make custom-made optimized FPGA images (bit files) for Xilinx FPGAs for an adequate amount of bitcoins, if people want me to...
newbie
Activity: 10
Merit: 0
@TheSeven: thanks for providing the vhdl sources. i am currently porting it to an atlys board (which is quite trivial) but the completely unrolled version is not fitting. have you already ported the configurable version to vhdl?
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
I don't think so, but it probably isn't worth its price.
I have one. The first fully unrolled design didn't fit (too large by a factor of 3), but hopefully one of the newer designs will.
This board will get something around 5MH/s.
newbie
Activity: 56
Merit: 0
I don't think so, but it probably isn't worth its price.
I have one. The first fully unrolled design didn't fit (too large by a factor of 3), but hopefully one of the newer designs will.
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Quote
it reports 48C for the junction temperature. I might just keep it at 50 to be safe.
Altera commercial FPGAs are rated for 85C JT.
Xilinx as well, and they're pretty stable. Voltage range on mine is 0.95-1.05V, temperature range is 0-85°C, but it's still running fine at 0.93V at 90°C, even though that probably hurts life expectancy. So you will probably not want to try this for extended periods of time, but rather try to keep it below 70°C or something.

I don't think so, but it probably isn't worth its price.
newbie
Activity: 17
Merit: 0
hero member
Activity: 560
Merit: 517
Quote
it reports 48C for the junction temperature. I might just keep it at 50 to be safe.
Altera commercial FPGAs are rated for 85C JT.
newbie
Activity: 54
Merit: 0
Well I will certainly double check my math, but you can most certainly compute some of W after the initial 16. Example (0 indexed):

Code:
w[16] = w[0] + s0(w[1]) + w[9] + s1(w[14])

All those values are known and do not change during the course of a work unit. The same applies to w[17] and w[18]. I don't have my notes with me for the rest.
You are probably right.  There is some portion of the total W block which is a constant.  The portions which require re-evaluation though will probably have to do multiple iterations of addition and the S0/S1 functions.  Plus once you get past around W[25] or so, all of the higher up entries will be affected by the nonce (going only off of memory right now).  I'm not sure it's such a big win in area to recompute only parts of the W block, but I haven't looked at it exhaustively.  I would be curious to hear if you have any details about this enhancement.
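
Which entries of W can be precomputed is easy to check mechanically. The sketch below only tracks dependencies through the schedule recurrence w[t] = w[t-16] + s0(w[t-15]) + w[t-7] + s1(w[t-2]). It assumes the nonce sits in w[3] of the block being hashed (the usual position in the second block of a Bitcoin header), and the function name is made up for illustration.

```python
NONCE_WORD = 3  # assumption: the nonce is word 3 of the 16-word block


def nonce_dependent(nonce_word=NONCE_WORD, total=64):
    """Return a list where entry t is True if w[t] depends on the nonce."""
    dep = [i == nonce_word for i in range(16)]
    for t in range(16, total):
        # w[t] = w[t-16] + s0(w[t-15]) + w[t-7] + s1(w[t-2])
        dep.append(dep[t - 16] or dep[t - 15] or dep[t - 7] or dep[t - 2])
    return dep


dep = nonce_dependent()
constant = [t for t, d in enumerate(dep) if not d]
print(constant)
# [0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17]
```

Under that assumption, w[16] and w[17] are fixed for a whole work unit. w[18] picks up the nonce through its s0(w[3]) term, although the rest of its sum, w[2] + w[11] + s1(w[16]), can still be precomputed. From w[18] onward every entry depends on the nonce, because each one uses the entry two positions before it.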
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Check with PowerPlay. Make sure no heatsink and no fan is selected, and the toggle rate is ~65%. See what it says the JT is.

With a 50 MHz clock and the toggle rate manually set to 65%, it reports 48C for the junction temperature. I might just keep it at 50 to be safe.
I've just reworked the cooling of mine. With a big heatsink, I can do 120MH/s with passive cooling on the virtex5. With a smaller heatsink it would need a fan to run at very low RPM.
I'd suggest just sticking a small heatsink on yours and closely monitoring the temperature for a couple of minutes. If it stays at around 60°C, that's perfectly fine.
full member
Activity: 196
Merit: 100
Check with PowerPlay. Make sure no heatsink and no fan is selected, and the toggle rate is ~65%. See what it says the JT is.

With a 50 MHz clock and the toggle rate manually set to 65%, it reports 48C for the junction temperature. I might just keep it at 50 to be safe.
hero member
Activity: 560
Merit: 517
Quote
It worked! There was enough room left to add a blinking LED when it finds the golden nonce. Just.
:D Yay!

Quote
It seems to be reporting an Fmax of 82.31 MHz. Is this actually safe without a heatsink?
It isn't on the C4-115 chip, but for the tiny C4-22 it might be. Check with PowerPlay. Make sure no heatsink and no fan is selected, and the toggle rate is ~65%. See what it says the JT is.

Quote
The computation of future rounds of W
Well I will certainly double check my math, but you can most certainly compute some of W after the initial 16. Example (0 indexed):

Code:
w[16] = w[0] + s0(w[1]) + w[9] + s1(w[14])

All those values are known and do not change during the course of a work unit. The same applies to w[17] and w[18]. I don't have my notes with me for the rest.
full member
Activity: 196
Merit: 100
Try commenting out line 107, the virtual_wire for "NONC". It isn't currently used and should save a few LEs.

It worked! There was enough room left to add a blinking LED when it finds the golden nonce. Just.

It seems to be reporting an Fmax of 82.31 MHz. Is this actually safe without a heatsink?
newbie
Activity: 56
Merit: 0
Wow, cool! I doubt I'd get anywhere close to reasonable hash rates with my nexys 2 board, but I'm glad that someone has done work to make things xilinx-compatible!