Topic: Official Open Source FPGA Bitcoin Miner (Last Update: April 14th, 2013) - page 29. (Read 432950 times)

hero member
Activity: 560
Merit: 517
Just a quick progress report. I dug back into my Spartan-6 LX150, updated its code to the version that can roll up, and used a CONFIG_LOOP_LOG2 setting of 5; I just wanted to get something compiled and working :P And after waiting long enough, it did indeed churn out a valid result! So ... progress! I am going to clean up the project and get it into the public repo. From there I will crank down the LOOP_LOG2 setting as low as it will go, and begin adding extra pipelining to see if it will push further.
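For a rough feel of why a LOOP_LOG2 of 5 means "waiting long enough", here is a minimal Python sketch. It assumes that a fully unrolled core (LOOP_LOG2 = 0) produces one hash per clock, that each increment of LOOP_LOG2 halves the throughput, and that a difficulty-1 share takes about 2^32 hashes on average. The function names and the 50 MHz clock in the usage note are hypothetical, picked only for illustration; they are not taken from the repo.

```python
def hash_rate(clock_hz: float, loop_log2: int) -> float:
    """Estimated hashes per second.

    Assumption: the core finishes one hash every 2**loop_log2 clock cycles,
    i.e. one hash per clock when fully unrolled (loop_log2 == 0).
    """
    if loop_log2 < 0:
        raise ValueError("loop_log2 must be non-negative")
    return clock_hz / (1 << loop_log2)


def expected_seconds_per_share(rate_hps: float) -> float:
    """Average time to find one difficulty-1 share.

    Assumption: a difficulty-1 share needs about 2**32 hash attempts on average.
    """
    if rate_hps <= 0:
        raise ValueError("rate_hps must be positive")
    return 2 ** 32 / rate_hps
```

With a hypothetical 50 MHz clock, `hash_rate(50e6, 5)` comes out at about 1.56 MHash/sec, and `expected_seconds_per_share` of that rate is roughly 46 minutes per share on average, versus under a minute and a half at LOOP_LOG2 = 0.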
hero member
Activity: 560
Merit: 517
Has anyone got the current version on github (fpgaminer + makomk modifications) running successfully on an FPGA?
It compiles without problems for an EP4SGX230 device but produces only stales, no shares.
The "original" fpgaminer version runs without problems ...
Are you using projects/DE2_115_makomk_mod/ with CONFIG_LOOP_LOG2 set to something other than 0? It must be set to 0 for the version on my public repo. The version on makomk's repo will work with other settings, but it basically just reverts to the unoptimized version if you do that so there's little point in using it unless CONFIG_LOOP_LOG2 is 0.

You also mentioned in your PM that you're using an "old" mining tcl script. How old? It was updated a few weeks ago to reflect changes in the code.
newbie
Activity: 6
Merit: 0
Has anyone got the current version on github (fpgaminer + makomk modifications) running successfully on an FPGA?
It compiles without problems for an EP4SGX230 device but produces only stales, no shares.
The "original" fpgaminer version runs without problems ...
legendary
Activity: 1270
Merit: 1000
And going OT, could someone produce a bitstream for the EP1C80 for me?
I wasn't even aware there was such a device...

Err... EP1S80F1508C6. Unfortunately, Altera only issues Quartus licenses that are sold through licensed distributors, so I have no possibility of producing a bitstream myself. :(
hero member
Activity: 686
Merit: 564
Great work! On my EP3C25C6 board the device utilisation is reduced by approx. 20% and the Fmax increases from 87 MHz to 103 MHz, but maybe there is still some room. I did several compile runs and got different results each time, and the difference is substantial: 10 MHz or so.
I could verify that it works, by earning shares.

For the EP2C35C8 I am able to change LOOP_LOG2 from 3 to 2 with no change in Fmax. The test on real hardware isn't done yet, but maybe tomorrow.
Glad to hear a success story!

Is the toplevel compatible with the serial toplevel?
Unfortunately, some manual merging of changes is probably going to be required because both this and serial support modify fpgaminer_top.v. Shouldn't be too difficult to do in theory though, especially now that teknohog's merged his serial code with an older version of my modifications.

And going OT, could someone produce a bitstream for the EP1C80 for me?
I wasn't even aware there was such a device...
legendary
Activity: 1270
Merit: 1000
By the way, anyone running with LOOP_LOG2=1, 2 or 3 might find the experimental partial-unroll-opt branch useful - it reduces the resource usage somewhat. Unfortunately it also breaks LOOP_LOG2=4 and greater and hasn't been tested on actual FPGAs.

Great work! On my EP3C25C6 board the device utilisation is reduced by approx. 20% and the Fmax increases from 87 MHz to 103 MHz, but maybe there is still some room. I did several compile runs and got different results each time, and the difference is substantial: 10 MHz or so.
I could verify that it works, by earning shares.

For the EP2C35C8 I am able to change LOOP_LOG2 from 3 to 2 with no change in Fmax. The test on real hardware isn't done yet, but maybe tomorrow.

Is the toplevel compatible with the serial toplevel?

And going OT, could someone produce a bitstream for the EP1C80 for me?
hero member
Activity: 560
Merit: 517
udif and makomk, I do not have donation addresses for you. If you would like your donation address added to the first post with the Contributors list, and in the github project's README.md, please contact me and let me know.
hero member
Activity: 560
Merit: 517
July 17th, 2011 - Code Updates and Minor Cleanup
teknohog's Xilinx Verilog port on the public repo has been updated. teknohog's serial modifications to makomk's code have been added as a separate project. OrphanedGland's port to Stratix devices, using VHDL, has been merged into the public repo. To top it all off, I updated the project's main README.md file, to prominently include a list of contributors and their donation addresses, because they deserve recognition for their hard work. I will modify the first post in this thread to include the same list :)

As it wasn't mentioned before on the first post, I am mentioning here that makomk made improvements to my base Verilog code. These changes improved both the overall performance of the design, and its area consumption, allowing the design to fit on a smaller, cheaper EP4CE75 chip. Great work makomk!
hero member
Activity: 686
Merit: 564
By the way, anyone running with LOOP_LOG2=1, 2 or 3 might find the experimental partial-unroll-opt branch useful - it reduces the resource usage somewhat. Unfortunately it also breaks LOOP_LOG2=4 and greater and hasn't been tested on actual FPGAs.
hero member
Activity: 686
Merit: 564
That's my suspicion as well, however I failed to figure out how to configure this, didn't have time to search thoroughly. Do you have a hint?
"HDL Options" section of the XST configuration, options "Shift register extraction" and "Shift register minimum size". The default is to create shift registers as short as 2 stages, which is a bit daft. (I assume you've already figured out this quirk of ISE, but for some reason you have to select fpgaminer_top in the hierarchy before it'll show you the list of process steps and let you configure them.)

This probably matters most for Spartan-6; other Xilinx FPGAs don't seem to have such annoying limitations on adders.
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Also, have you tried disabling shift register inference or turning up the threshold for it? Those compete for the same resources as adders.
That's my suspicion as well, however I failed to figure out how to configure this, didn't have time to search thoroughly. Do you have a hint?
hero member
Activity: 686
Merit: 564
That might explain it. Since SHA-256 is almost entirely addition logic, that means half of a Spartan-6 chip's CLBs are useless. But I couldn't even get 64 rounds to fit on my LX150. Maybe the CLBs are ordered in such a way that it was forced to route around the useless ones, causing massive routing delays? If so, padding with registers between rounds might fix that (as others have suggested). If those useless CLBs can't be used for anything else, might as well pack 'em with a bunch of registers.

It could also be a combination of poor routing design, and having to route around the useless CLBs.
My personal suspicion is that support for this particular Spartan-6 misfeature in ISE is buggy and badly tested; I'm fairly sure that at one point I was trying to cram on more adders than there were even SLICEM/Ls it could fit them in, and it tried to map them anyway without realising this was impossible. It also doesn't seem to report usage of this particular resource in any useful way. It's possible that their routing design is bad too, of course. I recently upgraded to ISE 13.2 and its behaviour in this department seems to have changed quite a bit from ISE 13.1, though I'm not sure if this is entirely for the better...

Edit: Think I may have been wrong about this; it's a bit under the maximum number of SLICEM/Ls available in theory.

Also, have you tried disabling shift register inference or turning up the threshold for it? Those compete for the same resources as adders.

A great first number! I'm very interested in the Kintex-7 series, and hope they are made available soon. I'll be checking their booth at CES next year for sure :D Think they'll sell me a devkit for Bitcoins? :P
Heheheheheheh. They certainly seem impressive, though I'll wait until someone's actually submitting shares with one, or at least until they're actually available for sale, before I actually believe it ;-).

Edit 2: 200 MHash/sec, though that increased total build time to over an hour. I have a feeling this isn't even pushing the tools slightly yet...
hero member
Activity: 560
Merit: 517
Quote
I also found a way to program the FPGA without Altera tools, using UrJTAG. There is a script included for this.
:D Fantastic! I played with UrJTAG once, but it just barfed on me. Maybe I'll give it another try. Their disclaimer that it might damage the hardware is a bit disconcerting ... but I doubt it's likely.

Quote
Too good to be true?
Unless you have the code, or the physical device in your hand, you shouldn't believe it. This applies to every number someone quotes, including mine. It's like that quote you find in a lot of version control tutorials and guides: if it isn't committed, it doesn't exist.

Quote
By the way - and this is probably what I get for not reading data sheets - I only just realized that the Virtex-5 and Virtex-6 series don't suffer from the annoying "only half the CLBs have fast carry logic" restriction of Spartan-6 FPGAs. Guess that could partly explain why Spartan-6 FPGAs have so much trouble with this design...
That might explain it. Since SHA-256 is almost entirely addition logic, that means half of a Spartan-6 chip's CLBs are useless. But I couldn't even get 64 rounds to fit on my LX150. Maybe the CLBs are ordered in such a way that it was forced to route around the useless ones, causing massive routing delays? If so, padding with registers between rounds might fix that (as others have suggested). If those useless CLBs can't be used for anything else, might as well pack 'em with a bunch of registers.

It could also be a combination of poor routing design, and having to route around the useless CLBs.
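
To put a rough number on the "only half have fast carry logic" point, here is a back-of-the-envelope Python sketch. It assumes that only slices with a carry chain can hold adder bits efficiently, that each such slice handles 4 bits of a carry chain, and that about half of all slices qualify (the fraction quoted in this thread). The function name and its parameters are hypothetical, and the LX150 slice count in the usage note is recalled from the Spartan-6 family documentation, so check it against the datasheet before relying on it.

```python
def max_carry_adders(total_slices: int,
                     carry_fraction: float,
                     adder_bits: int = 32,
                     bits_per_slice: int = 4) -> int:
    """Upper bound on how many adders fit in the carry-capable slices.

    Assumptions (not taken from the tools): every carry-capable slice
    contributes `bits_per_slice` adder bits, and placement and routing are
    ideal. Real designs will fit fewer.
    """
    if not 0.0 <= carry_fraction <= 1.0:
        raise ValueError("carry_fraction must be between 0 and 1")
    carry_slices = int(total_slices * carry_fraction)
    # Ceiling division: slices needed for one adder of `adder_bits` bits.
    slices_per_adder = -(-adder_bits // bits_per_slice)
    return carry_slices // slices_per_adder
```

Taking 23,038 slices for the LX150, `max_carry_adders(23038, 0.5)` gives 1439 32-bit adders at best, against 2879 if every slice had a carry chain. That is only a ceiling; it says nothing about whether the router can actually reach them.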

Quote
It appears ISE has no trouble at all fitting a 150 MHash/sec design on the Kintex-7 XC7K70T either (as in, it reaches that clock without even trying hard) - though note that this is an untested tweaked design and obviously I don't have the FPGA to run it on anyway. Kintex-7 is supposed to basically be a much cheaper 28nm equivalent of the Virtex-6 series.
A great first number! I'm very interested in the Kintex-7 series, and hope they are made available soon. I'll be checking their booth at CES next year for sure :D Think they'll sell me a devkit for Bitcoins? :P
hero member
Activity: 686
Merit: 564
By the way - and this is probably what I get for not reading data sheets - I only just realized that the Virtex-5 and Virtex-6 series don't suffer from the annoying "only half the CLBs have fast carry logic" restriction of Spartan-6 FPGAs. Guess that could partly explain why Spartan-6 FPGAs have so much trouble with this design...

It appears ISE has no trouble at all fitting a 150 MHash/sec design on the Kintex-7 XC7K70T either (as in, it reaches that clock without even trying hard) - though note that this is an untested tweaked design and obviously I don't have the FPGA to run it on anyway. Kintex-7 is supposed to basically be a much cheaper 28nm equivalent of the Virtex-6 series.
sr. member
Activity: 520
Merit: 253
555
Thank you! But unfortunately, my board does not have a serial connection :-/

Ah, I noticed this has already been discussed in some form. I think somebody also mentioned level converters; with one of those you could use any general I/O pins for the serial port. For example, I've used an old Nokia phone cable for this, and there are other simple circuits, for example using the MAX232.

(A lot of embedded systems have serial ports at this 3.3V level. I guess you could have a serial connection with such levels at both ends, without any conversion.)
member
Activity: 99
Merit: 10
Is there any way to run the fpgaminer program on a pc without a quartus license?

Now there is:

https://github.com/teknohog/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/DE2_115_makomk_serial

Basically, my earlier serial port code for Xilinx, adapted into the DE2-115 project. As with the other projects, this runs at 50 MHz by default, but I chose the makomk version so that I could run it at 109 MHz, and it works great :D

I also found a way to program the FPGA without Altera tools, using UrJTAG. There is a script included for this.

I'll also post .sof and .svf files for a quick start, as soon as the 50 MHz versions are ready.

By the way, one reason for this port was the problem of getting tclcurl working. The Quartus II software for Linux is 32-bit only, and while it runs on a 64-bit system, it is tricky to install a suitable library.

Thank you! But unfortunately, my board does not have a serial connection :-/
sr. member
Activity: 520
Merit: 253
555
Is there any way to run the fpgaminer program on a pc without a quartus license?

Now there is:

https://github.com/teknohog/Open-Source-FPGA-Bitcoin-Miner/tree/master/projects/DE2_115_makomk_serial

Basically, my earlier serial port code for Xilinx, adapted into the DE2-115 project. As with the other projects, this runs at 50 MHz by default, but I chose the makomk version so that I could run it at 109 MHz, and it works great :D

I also found a way to program the FPGA without Altera tools, using UrJTAG. There is a script included for this.

I'll also post .sof and .svf files for a quick start, as soon as the 50 MHz versions are ready.

By the way, one reason for this port was the problem of getting tclcurl working. The Quartus II software for Linux is 32-bit only, and while it runs on a 64-bit system, it is tricky to install a suitable library.
legendary
Activity: 2940
Merit: 1090
They claim to be foreigners working in China; that might be meant to explain their English or, if true, might actually explain it.

I have seen better scam-sites than that though, as well as worse. Who will send money first?

-MarkM-
full member
Activity: 210
Merit: 100
I'm skeptical for a couple reasons:
1. Perfect English
2. They claim to have developed it in a few months' timeframe. That's before bitcoin really took off and became valuable. Why would they have started so early? And is it possible to do in just a few months?
3. It's in China
4. No solid evidence released yet. They were supposed to release info a few days ago but nothing
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Oil cooling for a 500MH/s ASIC? Doesn't really sound true.
Even for old sASIC processes this should be doable with passive cooling with a reasonably large heatsink, or with a small fan.