
Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 31. (Read 432891 times)

legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
Great question! Yes, that is what it does when it is fully unrolled. You can reduce the amount of unrolling, which, by your analogy, would be like shrinking the number of workers on the assembly line and making the conveyor belt of this SHA computing factory move slower. Taken to the extreme, there could be just one factory worker, banging away on a single nonce for 128 cycles before moving on to the next nonce.
Awesome. One more question -- when fully unrolled, how many clock cycles per nonce tested?
full member
Activity: 210
Merit: 100
So is there any overhead loss when you "reduce the number of workers?" Or is it linear? half the workers, half the hash rate?
hero member
Activity: 560
Merit: 517
This may be a dumb question, but ...

Does the FPGA bitcoin miner use an assembly-line approach? That is, while you're doing SHA round 5 on nonce 1, are you doing SHA round 4 on nonce 2, round 3 on nonce 3, round 2 on nonce 4, round 1 on nonce 5 and preparing nonce 6?

Great question! Yes, that is what it does when it is fully unrolled. You can reduce the amount of unrolling, which, by your analogy, would be like shrinking the number of workers on the assembly line and making the conveyor belt of this SHA computing factory move slower. Taken to the extreme, there could be just one factory worker, banging away on a single nonce for 128 cycles before moving on to the next nonce.

Rolling it up (reducing the number of workers) reduces resource consumption on the FPGA and allows the design to fit on a smaller FPGA. But since the conveyor is moving slower, performance is reduced by the same factor.
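The trade-off above can be put into numbers. This is only a minimal sketch, assuming 128 SHA-256 rounds per nonce (the 128-cycle figure mentioned above) and one round per pipeline stage per clock. The function name `hash_rate` and the `unroll` parameter are made up for illustration and are not taken from the miner's source.

```python
ROUNDS_PER_NONCE = 128  # assumed: matches the "128 cycles" figure for a fully rolled design

def hash_rate(clock_hz, unroll, cores=1):
    """Nonces tested per second for `cores` pipelines with `unroll` round stages each.

    Assumes every stage performs one round per clock, so a new nonce can
    enter the pipeline only every ROUNDS_PER_NONCE / unroll cycles.
    """
    if unroll < 1 or ROUNDS_PER_NONCE % unroll != 0:
        raise ValueError("unroll must divide the round count evenly")
    cycles_per_nonce = ROUNDS_PER_NONCE // unroll
    return cores * clock_hz / cycles_per_nonce

# Fully unrolled down to a single "factory worker", all at 100 MHz.
for unroll in (128, 64, 32, 1):
    print(f"{unroll:3d} stages: {hash_rate(100e6, unroll) / 1e6:8.3f} MH/s")
```

Under these assumptions a fully unrolled core tests one nonce per clock cycle, and halving the number of stages halves the hash rate.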
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
This may be a dumb question, but ...

Does the FPGA bitcoin miner use an assembly-line approach? That is, while you're doing SHA round 5 on nonce 1, are you doing SHA round 4 on nonce 2, round 3 on nonce 3, round 2 on nonce 4, round 1 on nonce 5 and preparing nonce 6?
newbie
Activity: 16
Merit: 0
Hi guys,

I have finished the first version of my C port of TheSeven's pyminer:
fpga-cminer 0.1: http://pastebin.com/JRPsnpeJ

It is not as advanced yet, but it works on systems without Python.
My next step is an I2C interface for multiple boards with small hashing power, as I have plenty of them ;)

b.r
Lazarus
member
Activity: 99
Merit: 10
Is there any way to run the fpgaminer program on a pc without a quartus license?

No, at present you would require at least the Quartus Web Edition. You could use the JTAG access method with Quartus, but this would require some scripting, which is quite complicated to set up if you haven't done that before. The serial solution is much easier.

I downloaded the Web Edition, hoping that I could still get it to work without a license. No dice.
legendary
Activity: 1270
Merit: 1000
Is there any way to run the fpgaminer program on a pc without a quartus license?

No, at present you would require at least the Quartus Web Edition. You could use the JTAG access method with Quartus, but this would require some scripting, which is quite complicated to set up if you haven't done that before. The serial solution is much easier.
member
Activity: 99
Merit: 10
Is there any way to run the fpgaminer program on a pc without a quartus license?
newbie
Activity: 10
Merit: 0
Hi there! I've just gotten out of the newbie area, so I can finally post here too :)

First I want to thank all of you for the great effort you have put into making this FPGA solution for bitcoin mining. I'm new to bitcoin, but I have already calculated and understood that I won't be able to make much money/bitcoins by buying a handful of GPUs. Electricity costs too much for me, and the risk is high that difficulty keeps on growing. So I am looking for alternative / new ways of mining. I'm also an electrical engineer and really interested in FPGA development. But I still have much to learn, so I will take the opportunity to look at this project.

So far I have managed to synthesize the Xilinx VHDL port (I think I have to thank TheSeven for that) for my tiny Avnet Spartan-3A board: http://shop.trenz-electronic.de/catalog/product_info.php?products_id=456
The board has a Xilinx Spartan-3A FPGA, 400K gates, speed grade -4. I had to "modify" the board, since the original TI voltage regulators somehow burnt out and I wasn't able to get replacement parts of the same type. So the current voltage regulator for the core voltage is an LM350 (3A version).

I had great problems getting this project up and running. At first, either ISE wasn't able to map the design onto my Spartan or the serial communication was not working (was the FPGA working at all?). At that time I think I used the Verilog port for Xilinx. The next problem was adjusting the clock rate to fit my board; I had to reduce it to 64 MHz. In the end I had problems getting the Python miner running on Windows.

Here are my statistics for this project, for anyone who is interested in them:
- ISE Version used: 12.3
- FPGA type: xc3s400a,ft256,-4
- Clock rate: 69.34 MHz (I consider this overclocked, since it definitely massively violates timing constraints... 64 MHz looked way more stable)
- Performance: Measuring FPGA performance... 2.167228 MH/s
- pyfpgaminer successfully submitted some shares (so FPGA seems to calculate correctly)
- Synthesis results:

Logic Utilization                  Used    Available   Utilization
Number of Slice Flip Flops         3,948   7,168       55%
Number of occupied Slices          3,435   3,584       95%
Total Number of 4 input LUTs       5,477   7,168       76%
Average Fanout of Non-Clock Nets   2.56

- Path delay example: 23.088ns (11.918ns logic, 11.170ns route; 51.6% logic, 48.4% route)
- Empirical temperatures:
   - FPGA: almost cold
   - Voltage regulator: ... lets see ... OUCH ... it's hot!
- ESD protection: still intact on FPGA after empirical temperature tests...

Next steps for me:
- Fully understand the design, the SHA256 algorithm and the miner code :D
- Try to optimize some bits
- Get a damn big and cheap FPGA from somewhere for further testing - I want those 100+ MHash/s
- Report back with new results
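As a rough cross-check of the statistics above, dividing the reported clock rate by the reported hash rate gives the effective number of clock cycles spent per hash. This is only arithmetic on the two posted figures; the conclusion that the design is partially rolled is an inference, not something confirmed by the source, and `projected_mhs` is a made-up helper.

```python
clock_hz = 69.34e6          # reported clock rate
measured_hps = 2.167228e6   # reported "Measuring FPGA performance" figure

# Effective clock cycles spent per tested nonce.
cycles_per_hash = clock_hz / measured_hps
print(f"about {cycles_per_hash:.1f} clock cycles per hash")

def projected_mhs(clock_mhz, cycles):
    """Hash rate in MH/s for a design needing `cycles` clocks per hash."""
    return clock_mhz / cycles

# Same design at the more stable 64 MHz clock.
print(f"{projected_mhs(64.0, round(cycles_per_hash)):.2f} MH/s at 64 MHz")
```

By the same arithmetic, 100+ MH/s would need a design that tests roughly one nonce per clock at 100+ MHz, which is why a much bigger FPGA is on the wish list.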
member
Activity: 70
Merit: 10
MAC: http://www.altera.com/literature/ug/ug_ethernet.pdf

UDP: http://opencores.org/project,udp_ip__core

Also, the dev kit comes with a Nios II license, the NicheStack TCP/IP stack and a reference design (although Nios will consume precious resources)
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
UDP might make more sense, but I've never dealt with that so far.

According to this tutorial, it looks like easy stuff: http://www.fpga4fun.com/10BASE-T2.html

Hm, this assumes that you're directly driving the pins of the ethernet port.
On this board, we have an ethernet PHY chip located in between (which should make things easier), and probably have an ethernet MAC block on the FPGA (which should make things easier as well, but we'll first need to figure out how to actually use it).
I'm wondering if there is any ethernet FPGA design for this board floating around on the net?
newbie
Activity: 29
Merit: 0
UDP might make more sense, but I've never dealt with that so far.

According to this tutorial, it looks like easy stuff: http://www.fpga4fun.com/10BASE-T2.html
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Damn, that's just about the worst case. There are only two means of accessing this FPGA:
- Ethernet (not exactly trivial on the FPGA side, but in theory this board could mine standalone!)
- A built-in USB blaster (USB to JTAG bridge), which cannot easily be accessed from anything else than the Altera tools.

Hm, there is an FPGA communication package on opencores that 'speaks' UDP and could be used. Of course some packets can go missing, since UDP has no provision for packet loss (I would use a sequence number and send each nonce multiple times) ...

The other way would be the JTAG solution. There is an advanced debug system on opencores that uses JTAG; as far as I understand the docs, it does not require Altera software, but I think it needs some more effort to get running. I did not dig into it, as I am a fan of the serial solution, but on the other hand I am still looking at moving the JTAG communication to an ARM system so I could power off my PC at night (given there are no compilation tasks pending).

The problem with the JTAG solution is that you need to have a way to access the JTAG interface in the first place. And IIRC communicating with those blasters is a PITA.

UDP might make more sense, but I've never dealt with that so far.

Hooking up some level shifters to one of those gigabit transceivers and driving RS232 on them might be another possibility :)
legendary
Activity: 1270
Merit: 1000
Damn, that's just about the worst case. There are only two means of accessing this FPGA:
- Ethernet (not exactly trivial on the FPGA side, but in theory this board could mine standalone!)
- A built-in USB blaster (USB to JTAG bridge), which cannot easily be accessed from anything else than the Altera tools.

Hm, there is an FPGA communication package on opencores that 'speaks' UDP and could be used. Of course some packets can go missing, since UDP has no provision for packet loss (I would use a sequence number and send each nonce multiple times) ...

The other way would be the JTAG solution. There is an advanced debug system on opencores that uses JTAG; as far as I understand the docs, it does not require Altera software, but I think it needs some more effort to get running. I did not dig into it, as I am a fan of the serial solution, but on the other hand I am still looking at moving the JTAG communication to an ARM system so I could power off my PC at night (given there are no compilation tasks pending).
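The sequence-number idea for coping with UDP packet loss could look roughly like the sketch below. The packet layout (32-bit sequence number followed by a 32-bit nonce), the repeat count and all names are assumptions made for illustration; none of this is the opencores core's actual protocol.

```python
import socket
import struct

PACKET = struct.Struct("!II")  # hypothetical layout: 32-bit sequence number, 32-bit nonce
REPEATS = 3                    # assumed redundancy factor

def encode(seq, nonce):
    """Pack one result into an 8-byte datagram payload."""
    return PACKET.pack(seq & 0xFFFFFFFF, nonce & 0xFFFFFFFF)

def send_nonce(sock, addr, seq, nonce, repeats=REPEATS):
    """Send the same datagram several times, since UDP never retransmits."""
    payload = encode(seq, nonce)
    for _ in range(repeats):
        sock.sendto(payload, addr)

class Receiver:
    """Drops the duplicates created by the redundant sends."""

    def __init__(self):
        self.seen = set()  # sequence numbers already handled (grows without bound in this sketch)

    def handle(self, datagram):
        seq, nonce = PACKET.unpack(datagram)
        if seq in self.seen:
            return None    # repeat of a packet we already processed
        self.seen.add(seq)
        return nonce
```

A real sender would call `send_nonce(socket.socket(socket.AF_INET, socket.SOCK_DGRAM), (host, port), seq, nonce)`. Sending every result several times trades a little bandwidth for not needing acknowledgements on the FPGA side.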
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Damn, that's just about the worst case. There are only two means of accessing this FPGA:
- Ethernet (not exactly trivial on the FPGA side, but in theory this board could mine standalone!)
- A built-in USB blaster (USB to JTAG bridge), which cannot easily be accessed from anything else than the Altera tools.

member
Activity: 99
Merit: 10
I'm using this Stratix IV Dev board: DK-SI-4SGX230N  (http://parts.digikey.com/1/parts/1666458-kit-dev-stratix-iv-4sgx230n-c2-dk-si-4sgx230n.html)
No Serial Port.

I'm using this FPGA: EP4SGX230KF40C2

I've been using FPGAMiner's program to communicate with the board and it seems to work fine.

Anything else you need to know?

Thanks.


hero member
Activity: 504
Merit: 500
FPGA Mining LLC

Is the FPGA wrapping around on the keyspace while it's being fed new data? If yes, you'd need to take that into account as well, by measuring the time between the work uploads.

If you care to either rewrite your FPGA design to communicate via a serial port or to write a python interface module for your FPGA's JTAG, you could use my improved python mining code, which should be able to handle 440MH/s at 100% efficiency easily.

I'm not really qualified to do that.  I'm barely able to hang on reading instructions and switching out one or two settings.  I'm at the whims of the rest of you here, until I get a little more experience under my belt.  :-p

Which version of the FPGA design are you using?
Does your board have a serial port?
I might be able to whip up something...
member
Activity: 99
Merit: 10

Is the FPGA wrapping around on the keyspace while it's being fed new data? If yes, you'd need to take that into account as well, by measuring the time between the work uploads.

If you care to either rewrite your FPGA design to communicate via a serial port or to write a python interface module for your FPGA's JTAG, you could use my improved python mining code, which should be able to handle 440MH/s at 100% efficiency easily.

I'm not really qualified to do that.  I'm barely able to hang on reading instructions and switching out one or two settings.  I'm at the whims of the rest of you here, until I get a little more experience under my belt.  :-p
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
440MH/s is the raw underlying hash rate for your configuration, but you need to consider delays associated with presenting the golden nonce, getting new data from the pool etc.  You would never be able to achieve 440MH/s in practice.
Actually, making that run at 100% efficiency is fairly trivial, and my Xilinx design is already doing it. Just upload the next work when the old one is about 80% processed; while you upload it, the FPGA will continue working on the old one and still report any shares it finds during the upload.

440MH/s? What kind of badass FPGA is that?

Can that be implemented into "FPGA Miner"'s program?  I suck at coding and would screw it up.  (I'm learning bit by bit...)

I would guess yes, but I'm not at all familiar with all this Tcl and Altera stuff.

It's a Stratix IV dev board -- the GX230KF40C2 chip.  Hoping for a Stratix V board within the year... Yeah -- it's not cost effective (got the Stratix IV at discount for $2k), but it is pretty slick.  I'm going to try to push it to 260MHz and 2 cores.  But yeah, wasting most of my time talking to the server and asking for blocks is killing me more than anything else.

Way better than mine! It costs $2k as well but only runs 1 fully-unrolled miner at 120MHz (XC5VLX110T-1)

You know of a better way to track ACTUAL mhash/sec so we can see where we need to tweak?

By calculating it, like you already did: 220MHz * 2 fully unrolled miners = 440MH/s. That's the truth. Everything else is just estimates based on the number of shares submitted, which is subject to variance/luck.
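To see why share-based figures are only estimates, here is a minimal sketch. It assumes difficulty-1 shares, each of which takes roughly 2^32 hashes on average to find, and treats share arrivals as a Poisson process, so the relative uncertainty is about 1/sqrt(shares). The function name is made up for illustration.

```python
import math

HASHES_PER_SHARE = 2 ** 32  # approximate expected hashes per difficulty-1 share

def estimate_from_shares(shares, seconds):
    """Return (estimated hashes per second, relative one-sigma uncertainty)."""
    rate = shares * HASHES_PER_SHARE / seconds
    rel_err = 1 / math.sqrt(shares) if shares else float("inf")
    return rate, rel_err

rate, rel_err = estimate_from_shares(shares=100, seconds=1000)
print(f"{rate / 1e6:.1f} MH/s, give or take about {rel_err:.0%}")
```

Even after 100 shares the estimate is still only good to about 10%, whereas clock rate times the number of fully unrolled cores is exact.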

Is the FPGA wrapping around on the keyspace while it's being fed new data? If yes, you'd need to take that into account as well, by measuring the time between the work uploads.

If you care to either rewrite your FPGA design to communicate via a serial port or to write a python interface module for your FPGA's JTAG, you could use my improved python mining code, which should be able to handle 440MH/s at 100% efficiency easily.
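For a feel of the timing involved in the keyspace wrap-around and the "upload at 80%" trick, here is a small sketch. It assumes both cores split a single work item's 32-bit nonce range between them; the thread does not say how the design actually divides the range, and the function names are made up.

```python
NONCE_SPACE = 2 ** 32  # number of nonces in one work item

def sweep_seconds(hash_rate_hps):
    """Time until the nonce range of one work item is exhausted."""
    return NONCE_SPACE / hash_rate_hps

def upload_deadline(hash_rate_hps, fraction=0.8):
    """When to start uploading the next work (80% processed, as suggested above)."""
    return fraction * sweep_seconds(hash_rate_hps)

rate = 220e6 * 2  # 220 MHz * 2 fully unrolled miners
print(f"full sweep: {sweep_seconds(rate):.2f} s, "
      f"upload next work after {upload_deadline(rate):.2f} s")
```

If new work arrives later than the full sweep time, the FPGA wraps around and retests nonces it has already covered, which is the loss the question about measuring the time between work uploads is getting at.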
member
Activity: 99
Merit: 10
440MH/s is the raw underlying hash rate for your configuration, but you need to consider delays associated with presenting the golden nonce, getting new data from the pool etc.  You would never be able to achieve 440MH/s in practice.
Actually, making that run at 100% efficiency is fairly trivial, and my Xilinx design is already doing it. Just upload the next work when the old one is about 80% processed; while you upload it, the FPGA will continue working on the old one and still report any shares it finds during the upload.

440MH/s? What kind of badass FPGA is that?

Can that be implemented into "FPGA Miner"'s program?  I suck at coding and would screw it up.  (I'm learning bit by bit...)

It's a Stratix IV dev board -- the GX230KF40C2 chip.  Hoping for a Stratix V board within the year... Yeah -- it's not cost effective (got the Stratix IV at discount for $2k), but it is pretty slick.  I'm going to try to push it to 260MHz and 2 cores.  But yeah, wasting most of my time talking to the server and asking for blocks is killing me more than anything else.

You know of a better way to track ACTUAL mhash/sec so we can see where we need to tweak?