I *thought* the Icarus bitstream could, was reported as able to, produce 400mh/s from these things while using only two of the four FPGAs. Is no one getting that?
It should. The problem with the default icarus bitstream is clock stability: it does hash, but it seems to have clocking problems at the default rate. Anyone with the ability to reflash their boards should be able to flash the default icarus bitstream to these and try connecting. It should work with the first 2 chips only, at about 200MHash/s per chip, provided they can get the clock stable. Ideally we need to modify the icarus bitstream to run from a lower clock source (say 50MHz) and use an internal DCM to clock it up for hashing.
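To make the "clock it up with a DCM" idea a bit more concrete, here's a rough sketch of the arithmetic only. A DCM's synthesized output is the input clock times a multiplier over a divider. The function name and the 2..32 / 1..32 ranges are my assumptions for illustration (check the datasheet for the actual part), and the 200MHz target is just what ~200MHash/s per chip would imply if the core does one hash per clock.

```python
from fractions import Fraction

def dcm_ratios(f_in_mhz, f_out_mhz,
               mult_range=range(2, 33),   # assumed multiplier range, verify for your FPGA
               div_range=range(1, 33)):   # assumed divider range, verify for your FPGA
    """Return every (multiply, divide) pair with f_in * multiply / divide == f_out."""
    target = Fraction(f_out_mhz) / Fraction(f_in_mhz)
    return [(m, d) for m in mult_range for d in div_range
            if Fraction(m, d) == target]

# 50MHz board clock up to a hypothetical 200MHz hashing clock
pairs = dcm_ratios(50, 200)
print(pairs[0])  # simplest pair: multiply by 4, divide by 1
```

This says nothing about whether the design actually meets timing at the resulting clock; it only shows which ratios would get you from one frequency to the other.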
It should be noted that they initially designed the board with the hope that the icarus bitstream would run out of the box on this. The problem was there was a swap on 2 of the tx/rx pins from the first to second fpga in each "Icarus Chain". This means that only one of the chips hashes on the stock icarus bitstream.
Icarus is open source, so several people (myself among them) have been working our butts off to do 2 things:
- Synthesize the icarus source code to a new "Cairnsmore" bitstream, which we've marginally succeeded at (resulting in the 50MHz bitstream out now).
- Design a new bitstream from scratch that should easily beat the icarus bitstream.
The problem with re-synthesizing the icarus bitstream is that:
A) The source is distributed as raw HDL, without the project files, compiler flags, pinouts, and other important bits needed to synthesize it. This requires a LOT of work to get the project ready to "compile", so to speak. Ultimately the open source project is "incomplete" in its existing state.
B) Most of the time synthesis outright fails. P&R on this source is bad, and the tools have a hard time placing it all on the chip. (I believe ngzhang did this by using a combination of Xilinx ISE and Synplify, without a "legit" license for these, which may be legal in his country but not here, and Synplify is expensive.)
C) Each run takes MANY hours, so you can only get a couple attempts in a single day (and yes I know you can fire up a cluster in EC2 and run PAR there, I've done that, on a cluster of about a dozen Double Extra Large instances in EC2, for a day, still didn't help matters so far, and yes it was expensive lol).
D) The code is a bit strange; several things in it are... weird, and they cause problems. One is the boundary between the clock domains of the comms code and the hashing cores: in the default setup this causes a clocking breakdown and the comms core fails, leaving the chip useless until it's restarted. There are also some over-sensitive settings on the hashing clock, which is why we had to underclock it to 50MHz (above that the clock becomes unstable). Since the icarus code runs one hash per clock, that means 50MHash/s per chip. And there are a few other issues. Essentially it needs A LOT of hand holding to get it into a useful state at all.
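For anyone wondering where the numbers come from, here's the back-of-the-envelope math as a tiny sketch. It assumes one hash per clock per chip (as the icarus core does) and four FPGAs per board; the function name is just made up for illustration.

```python
def hashrate_mhs(clock_mhz, chips, hashes_per_clock=1):
    """Rough throughput in MHash/s: clock rate x hashes per clock x number of chips."""
    return clock_mhz * hashes_per_clock * chips

print(hashrate_mhs(50, 1))   # one chip on the current 50MHz bitstream: 50
print(hashrate_mhs(50, 4))   # all four chips at 50MHz: 200
print(hashrate_mhs(200, 2))  # two chips at a stable 200MHz clock: 400
```

The last line is the 400MHash/s figure people were expecting from two chips, which only holds if the hashing clock can actually be kept stable at that rate.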
After getting the basic build done (which was used to produce the 50MHz bitstream), I gave up on porting icarus over and moved back to building my own bitstream from scratch, designed for Cairnsmore. But that's also not a simple task. The reason I focused on icarus initially was the hope that it would only take me a few days to hack together a usable 200MHash bitstream. Unfortunately the level of effort is increasing drastically, and that time is better spent on the fresh bitstream.
At the same time the Enterpoint guys are still working to get the icarus bitstream "hacked" into a working state pulling at least close to 200MHash per chip.
The Tricone bitstream is an option as well, and should be doable on these chips. Hopefully someone drives that forward too, as it would be a good option for people to have (it should pull close to 1GHash/s on these boards).
Lastly, Enterpoint has the skillset to make their own bitstream, but they have said that will be several months away. Right now their developers are all tied up, but their hardware team had cycles (which is why the boards could be made quickly). Once their developers are freed up they can likely bang out a nice bitstream for these boards.
The low performance is not indicative of this board's actual potential. It's simply that the icarus code is painful to work with, and due to one minor oversight (the swapped tx/rx, which is partially due to unclear source and lack of documentation on the icarus project), the icarus bitstream doesn't run out of the box on it. So the current options from a bitstream perspective are a bit lacking.
Overall this was all made clear in the beginning; Enterpoint has been awesome at keeping everyone in the loop during development and shipping. So hopefully either we get a hack working using the icarus based bitstream, or someone releases another bitstream for this board very soon that takes advantage of its real potential.
I don't want to make any promises at all on my bitstream as it's too early to tell, and my time is limited (I have a day job to keep up with too lol).
So right now the way it stands, I managed to re-package and synthesize the icarus code into a usable bitstream, but it had issues. I sent that "complete" project and bitstream (along with ncd files and such) to Enterpoint, which they were able to tweak/adjust to get the current bitstreams in distribution. I'm not doing any further work on the icarus stuff, but hopefully they will make more progress.
I should have posted some of this info earlier, but I've been so busy the little time I have has been spent with blinders on working on the bitstream. So I had no idea this thread had progressed this far lol.
Hopefully this helps clarify some of the current bitstream situation.