
Topic: Algorithmically placed FPGA miner: 255MH/s/chip, supports all known boards - page 42. (Read 119429 times)

hero member
Activity: 556
Merit: 500
I hate to nag, but do you have any plans to release this bitstream or sell it? I am willing to offer my first born child. Thanks.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
I will be happy to do that once you post the other half of the IDCODE readout!

ngzhang has extracted the IDCODE readout (nice job!) and I have paid him the bounty:

  https://bitcointalksearch.org/topic/m.886013
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
I will offer a 10BTC bounty to anybody who posts the JTAG IDCODE readout from the BFL single -- merely to satisfy my curiosity.  There was a JTAG header on the last PCB I saw them post.
Here you go:
https://bitcointalksearch.org/topic/m.870733

You've only posted half of the IDCODE readout (there are two chains).

Send your bitcoins to: 139uZdmLamPy2uifmijbGJBJAfYg4HqUZp

I will be happy to do that once you post the other half of the IDCODE readout!
hero member
Activity: 489
Merit: 500
Immersionist
I will offer a 10BTC bounty to anybody who posts the JTAG IDCODE readout from the BFL single -- merely to satisfy my curiosity.  There was a JTAG header on the last PCB I saw them post.

Here you go:
https://bitcointalksearch.org/topic/m.870733

Send your bitcoins to: 139uZdmLamPy2uifmijbGJBJAfYg4HqUZp

 Grin
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
And you certainly don't have 13A on that pin you measured... Wink

Of course not, because I'm using four of them! (and another five for ground)
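For what it's worth, the parallel-pin arithmetic can be sketched in a few lines. The per-pin contact resistance below is an assumed ballpark figure for a Mini-Fit contact, not a measurement from this thread:

```python
# Rough IR-drop sketch for a Molex Mini-Fit power connector.
# Assumed value (not from the thread): ~1.5 mOhm contact resistance per pin.

PIN_RATING_A = 13.0        # Mini-Fit rated current per pin (from the post above)
R_CONTACT_OHM = 1.5e-3     # assumed per-pin contact resistance

def connector_drop(total_a, n_pins, r_pin=R_CONTACT_OHM):
    """Voltage drop across n_pins identical contacts sharing the current."""
    per_pin = total_a / n_pins
    if per_pin > PIN_RATING_A:
        raise ValueError("per-pin current exceeds connector rating")
    return per_pin * r_pin

# Four supply pins sharing a 50 A quad-board load -> 12.5 A/pin, ~19 mV drop,
# the same order of magnitude as the 15 mV measurement quoted above.
print(round(connector_drop(50.0, 4) * 1000, 1), "mV")
```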

In the end we're talking more like maybe 50mV drop here, which can already cause trouble in applications like this (and generally lowers efficiency).

You have to ask yourself here, which is worse: losing a bit of efficiency or being stuck at (say) 220MH/s while everybody else is getting 270MH/s?

Just trying to be helpful.

BTW, you might want to consider software-controlled voltage.  It isn't hard; my boards have it.  Most high-quality DC-DC converters determine the output voltage based on a resistor.  Stick a digipot in there instead.  If you do the math right you can arrange things so that even if the digipot fails completely (0 ohms resistance) the voltage doesn't cause damage to the FPGAs.  There's a fair bit of room between the maximum operating voltage and the voltage Xilinx says they can handle without being damaged.
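As a rough illustration of that fail-safe divider math: most adjustable regulators obey Vout = Vref x (1 + Rtop/Rbottom), so putting a fixed resistor in series with the digipot caps the worst-case voltage. Every component value below is hypothetical; the only datasheet number is the Spartan-6 absolute-maximum VCCINT, taken here as 1.32V:

```python
# Hypothetical feedback network for software-controlled VCCINT.
# Values chosen so that even a digipot failed dead-short (0 ohms)
# keeps Vout below the FPGA's absolute-maximum core voltage.

VREF = 0.6             # regulator feedback reference voltage (assumed)
R_TOP = 1150.0         # upper feedback resistor, ohms (assumed)
R_FIXED = 1000.0       # fixed resistor in series with the digipot (assumed)
VCCINT_ABS_MAX = 1.32  # Spartan-6 absolute-maximum VCCINT, volts

def vout(r_digipot):
    """Standard adjustable-regulator relation: Vout = Vref * (1 + Rt/Rb)."""
    return VREF * (1.0 + R_TOP / (R_FIXED + r_digipot))

assert vout(0.0) < VCCINT_ABS_MAX  # fail-safe: a shorted digipot is survivable
print(vout(150.0))                 # nominal setting, ~1.20 V
```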

Another alternative is to put a jumper on your board that disables the power supply (they all have "enable" pins these days) and include space for an 8-pin mini-fit connector that isn't soldered down by default.  This gives you an emergency escape, albeit an ugly one, in the event that you underbudgeted for power: add on the connector and use an off-board power supply.

I think it would already be very valuable to us if you could measure the power consumption of whatever you have currently at 100MHz.

I don't.  The numbers would not be representative of the final design.  And I don't want to start rumors about "eldentyrell's design won't work on board XYZ because it can't supply enough power" unless they're actually true (which is highly unlikely).  Having to shift my focus from performance to power prematurely in order to fight these rumors would be an inefficient use of my time.
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
PSU on a separate board is not an option, trace/connector voltage drop quickly becomes unmanageable with that.

Hrm, look, I'm an amateur when it comes to PCB design, but you might want to reconsider that.  What you say is certainly true for the good old IDT headers (the kind you connect ribbon cables to) -- I've seen that myself and got burned by it.  But the Molex Mini-Fit connectors can carry 13A per pin -- they're the same connectors that deliver power to your motherboard and your graphics card.  I measure no more than 15mV drop across the connector on my boards when running at full power.  That's not to say I don't have power distribution headaches -- just that the IR drop across the connector isn't one of them.

The connector itself is only part of the problem here. And you certainly don't have 13A on that pin you measured... Wink
If you have to route power through connectors, this also extends the trace length/decreases the trace width towards the connector. In the end we're talking more like maybe 50mV drop here, which can already cause trouble in applications like this (and generally lowers efficiency). Having the regulator on the FPGA board itself just makes more sense IMHO, for a variety of reasons. Ask the guy who made the design you quoted above; he also got burned by that Tongue

For these 50A that you ask for (for a quad-FPGA board) you just need solid power supply layers.

Actually I think I've been extremely careful to avoid giving a specific power number.  I'm stalling (sorry) until I can measure the new 180mhz design -- which has much shorter wires and 10% fewer registers -- on a high-quality board.

In other news, I've got some used Virtex5-155 boards showing up next week.  Porting the design to them was stupidly easy.  Virtex5 looks like Spartan6 with faster routing, more carry chains, and (most important) without all the idiotic "potholes".  So much more pleasant to have a perfectly regular fabric.  Porting to Artix7 is the same story except that it has "potholes" so I'm not terribly motivated to go through the hassle until I know when the boards will be available.

How about Kintex btw? What kind of fabric is that one using?

I think it would already be very valuable to us if you could measure the power consumption of whatever you have currently at 100MHz. We can compare to the existing designs running at the same frequency and will get at least a ball park estimate of what the requirements for your design are going to be. Sure, this is likely to be optimized over time, but an upper bound would be very helpful Smiley
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
PSU on a separate board is not an option, trace/connector voltage drop quickly becomes unmanageable with that.

Hrm, look, I'm an amateur when it comes to PCB design, but you might want to reconsider that.  What you say is certainly true for the good old IDT headers (the kind you connect ribbon cables to) -- I've seen that myself and got burned by it.  But the Molex Mini-Fit connectors can carry 13A per pin -- they're the same connectors that deliver power to your motherboard and your graphics card.  I measure no more than 15mV drop across the connector on my boards when running at full power.  That's not to say I don't have power distribution headaches -- just that the IR drop across the connector isn't one of them.

For these 50A that you ask for (for a quad-FPGA board) you just need solid power supply layers.

Actually I think I've been extremely careful to avoid giving a specific power number.  I'm stalling (sorry) until I can measure the new 180mhz design -- which has much shorter wires and 10% fewer registers -- on a high-quality board.

In other news, I've got some used Virtex5-155 boards showing up next week.  Porting the design to them was stupidly easy.  Virtex5 looks like Spartan6 with faster routing, more carry chains, and (most important) without all the idiotic "potholes".  So much more pleasant to have a perfectly regular fabric.  Porting to Artix7 is the same story except that it has "potholes" so I'm not terribly motivated to go through the hassle until I know when the boards will be available.
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
The best strategy is to have the Spartan and power supply on separate boards so that you can replace or underpopulate the power supply boards if needed.  This is what I and at least one other person do.  If I find out that I underdesigned the power supply I can just leave a slot empty.

If you can't put them on separate boards, you're going to have to overdesign by a wide margin to be sure you don't get left behind due to running out of power.  Artforz' boards can deliver a whopping 15A of current to each chip, which is so much current that you'll hit insurmountable cooling problems long before you run out of power -- plenty of margin.  Ztex is a far better board designer than I am, but I do feel that he skimped on the power supply by providing only 8A and I was disappointed to see that he hasn't upgraded this on his 4-chip board.

My personal bet was on ~10A per FPGA until now. (And you can probably actually pull 11-12A from most 10A supplies if you need to.)
PSU on a separate board is not an option, trace/connector voltage drop quickly becomes unmanageable with that. For these 50A that you ask for (for a quad-FPGA board) you just need solid power supply layers.
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
How much current on VCCINT does your design currently use, at which clock frequency?

Short answer: I don't know right now.  Measuring that is actually why I shelled out for a non-DIY Spartan6 board, which *just* arrived.  My own homebrew boards are crap and probably leak power all over the place.  However, getting the 180mhz design running is a higher priority right now.

I have not yet put any effort at all into minimizing power consumption, and don't plan on doing so until I feel that further performance improvements are tapped out.  FWIW I am still using SRL16s, which are power hogs, instead of RAM32Ms with LFSR address generators.
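For readers unfamiliar with the trick being hinted at: an LFSR can replace a binary counter as the address generator for a RAM-based shift register, since it only needs to visit every address, not count in order, and it needs no carry chain. A toy Python model (the depth and taps are chosen for illustration, not taken from the actual design):

```python
# Toy model of the RAM32M-plus-LFSR-address-generator idea mentioned
# above as an SRL16 replacement. Illustrative only.

def lfsr5(state):
    """One step of a maximal-length 5-bit Fibonacci LFSR (taps 5 and 3)."""
    bit = ((state >> 4) ^ (state >> 2)) & 1
    return ((state << 1) | bit) & 0x1F

# A maximal 5-bit LFSR visits all 31 non-zero states before repeating,
# so it can sweep the addresses of a 31-deep circular buffer.
seen, s = set(), 1
while s not in seen:
    seen.add(s)
    s = lfsr5(s)
assert len(seen) == 31
```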


It's probably in your own interest to ensure that future boards meet the power requirements of your design.

The best strategy is to have the Spartan and power supply on separate boards so that you can replace or underpopulate the power supply boards if needed.  This is what I and at least one other person do.  If I find out that I underdesigned the power supply I can just leave a slot empty.

If you can't put them on separate boards, you're going to have to overdesign by a wide margin to be sure you don't get left behind due to running out of power.  Artforz' boards can deliver a whopping 15A of current to each chip, which is so much current that you'll hit insurmountable cooling problems long before you run out of power -- plenty of margin.  Ztex is a far better board designer than I am, but I do feel that he skimped on the power supply by providing only 8A and I was disappointed to see that he hasn't upgraded this on his 4-chip board.
sr. member
Activity: 448
Merit: 250
the latest iteration has a design clock rate of 180mhz and meets timing for all of the "ordinary" stages

Drool.

-- i.e. all of them except the funny ones at the very beginning, the very end, and the corner turn (stages 30-31).

Could one, two, or all three of these corner cases be solved by adding a "dummy" stage?
c_k
donator
Activity: 242
Merit: 100
I really look forward to your finished product, it sounds very exciting Smiley
hero member
Activity: 504
Merit: 500
FPGA Mining LLC
Dr. Tyrell, can you please answer an important question for us board designers?

How much current on VCCINT does your design currently use, at which clock frequency?

It's probably in your own interest to ensure that future boards meet the power requirements of your design.
brand new
Activity: 0
Merit: 250
However, unless there are miners out there with tens of thousands of LX150-based boards already running,

I think you are greatly underestimating the number of FPGAs out there that are mining.  The number of people involved in FPGA mining is small, but there are already several very large FPGA farms.  Also, I have pretty good info on Xilinx' pricing curve, which lets me put a lower bound on how many units the major vendors have sold.  Lastly, I've been watching AVNet's inventory data, and they are moving pretty huge quantities of LX150's compared to LX75's -- and I don't know of any other non-scientific-computing product that uses them.
That's really useful information to know. I guess that the early-adopter miners who *aren't* constrained by capital (sadly unlike myself) wouldn't think twice about dropping a few tens of thousands, maybe a hundred k, on efficient mining kit that *doesn't* need the power and heat management resources of a professional datacentre outfit.

But this was always the risk in my business plan. Large amounts of capital could generate enough hashpower to make my return on investment worthless (I get your point about 'stripped out' FPGAs and their resale value, but this isn't necessarily the case for the proper development boards, and there are a few other cryptographic applications... but that's off-topic and probably a dangerous topic to follow).

However, until someone invests in proper ASICs (which few seem to think is likely in the probable lifetime of Bitcoin), the problem that the big money faces is simply physical integration. The major vendors (in the Bitcoin space) are offering single or dual units (I will not speculate on BFL because I don't know what's in the box). Anyone wanting to wade in and wipe out the small miners by building 10,000,000 unit FPGA clusters will have to *build* the cluster, and integrate 10,000,000 individual boards.

This is quite an extreme task and the technical barrier to entry is still fairly high. So I still consider my risk to be worth a shot - the main retail vendors (certainly outside the US) seem to be targeted at the small operations, such as mine. The other options appear to involve buying much more powerful FPGAs with many more LUTs and writing one's own code. It is clear that this is not a simple task and certainly not something a rich mining enthusiast could even *consider* unless he/she was also an FPGA development expert.

Following further out to the extremes, my understanding is that Bitcoin mining only works properly whilst it remains distributed. The entire economy is small enough to be wiped out by your average hedge fund, let alone a hostile government. AFAIK (please correct me on this if wrong), *real* money could be deployed to make every single current miner's hashpower contribution utterly irrelevant. The difficulty algorithm would compensate, but a government agency's resources, or even just an investment bank or unregulated fund, or even one of the well-known private individuals with much-speculated-upon motives, would be able to control the entire money supply.

I'm not talking about 51% attacks and other 'gaming the algorithm' approaches, but about an abnormally large hashpower being able to receive 100% of the block rewards, and hence to control the money supply by dictating how many BTC are available, and at what price. This would reduce Bitcoin to a fiat currency with this 'agent' as the central bank. This is not necessarily the hostile 'BITCOIN MUST DIE' paranoia (as in receiving 100% of block rewards and hoarding them, killing the velocity of money and trust in the integrity of the network); even a sufficiently greedy (and rich) miner in such a position could keep the economy running, though hardly in its original intent (the probability of trust disintegrating, along with the currency, would be high, even if use was widespread).

Now please forgive me if I've completely misunderstood this, but this is how I see it, and have put these scenarios into my 'force majeure' category re: risk analysis, since I have no way of countering *truly* massive additions to the economy's hashpower by one agent.


However, with the hashpower still distributed, clearly a small miner like myself is facing an *imperative* to move to FPGA technology if I wish to continue mining. I'm limited by capital, and the very rich are limited by how quickly they can build out and maintain hundreds of thousands of units (or employ board designers to scale up the integration past 2-chip boards).

If I'm right about the control of money supply, then anyone wishing the Bitcoin economy to succeed must *also* attempt to ensure hashpower remains distributed, and not centrally controlled by one huge entity. This isn't going to be feasible with hobbyists and graphics cards, even to the extremes some of us have gone - not with FPGAs around.


Hence my value algorithm isn't quite as simple as your plain 50%-of-capital; I have several other known unknowns to analyse, which focus on certain critical events and possible timescales. For example, there's no point paying you the 'rational' amount (i.e. the cost of equivalent hashpower) if all capital is lost before payback time; this merely increases the lost capital. Given the 208 MH bitstream is free, your bitstream is comparable to financial leverage. Equally, paying equivalent capital only for an open-source bitstream to *even marginally* close the gap performance-wise within the payback timeframe would be irrational. However, I have absolutely no reason to doubt your claims about how difficult the task would be, so I will take them at face value until I have a serious chat with my old mate from university, who was a VHDL consultant for over a decade before taking a chief engineer post at a large firm I won't disclose. It may be that you're the best; it may be that there are concurrent approaches like yours being taken, but silently, because the experts concerned have a large capital investment in their own hardware and aren't letting anyone know!

There are too many variables to approach this simplistically. I've been doing risk analytics for well over a decade professionally, but sometimes I find *instinct* more valuable than expected (and yeah, I've always done things 'differently'...) - and my gut feeling right now is that given the choice between 100 mining units with the free bitstream running at 208 MH, or 66 mining units plus paying you the equivalent-capital for your special 312 (say) MH bitstream (i.e. total capital expenditure the same, total hashpower the same)... I'd go with more hardware.
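For what it's worth, the break-even arithmetic behind that comparison can be checked directly; the numbers below are the hypotheticals from the paragraph above:

```python
# The hypothetical trade-off above: the same capital buys either
# 100 boards on the free 208 MH/s bitstream, or 66 boards plus a
# paid bitstream. Break-even speed for the paid bitstream:

free_boards, free_rate = 100, 208.0   # MH/s, hypothetical figures
paid_boards = 66

breakeven = free_boards * free_rate / paid_boards
print(round(breakeven, 1), "MH/s")    # slightly above the "312 (say)" guess
```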

Simply because you've shown that the LX150 has plenty of headroom above 208 MH, but the probability of your bitstream becoming significantly faster is limited by one person's work and the fact that you've already taken the next step in optimisation. The probability of the free bitstream becoming significantly faster is multiplied not only by the multiple known efforts, but also by the possibility of a 'eureka' moment being spread out amongst a greater population. Yeah, you can pick holes in this 'analysis'; the biggest hole is that you know the technology and I am not an expert, so your firm belief that nobody could replicate your work in useful time, and that there are no other avenues of optimisation with superior performance, could very probably be true. However, I have a friend to ask, and a gut feeling...


BTW this is more a train of thought from your response. I am not doubting your work and respect your achievements (I see no reason for you to be lying about this, so whilst the discussion is 'academic' for now, I very much doubt you'll disappoint).

What got me fired up was the comment about 'small numbers of people involved in FPGA mining' with 'very large FPGA farms' - which increases the factor in my analysis regarding the centralisation of mining control...

And I guess this is off-topic as a result. Let me know if you want it deleted. Just my thoughts on the matter, if things are as you say - it has quite a large consequence for the viability of small miners, regardless of how up-to-date their technology actually is...

Does any of your inside data suggest that the high volumes of LX150s being sold are spread out in a distribution roughly approaching retail... or are they all going to one customer? Apologies, impossible question to answer since anyone seriously considering controlling the money supply wouldn't be quite so unsubtle. It's late, I'm tired and I've just driven back to England from Switzerland so I'm a bit frazzled...

If a unit fails, and needs re-programming ... If the only way to prevent 'theft' of the bitstream would be to lock the FPGA so it can't be used for other purposes

I'm not enthusiastic about the bitstream encryption route, but I do want to point out that this is just flat-out false.  ADDING a decryption key to a Spartan in no way prevents it from being used for other purposes -- you don't have to use it (in fact, most bitstreams don't!).  The encryption key is stored by writing to eFuses; Xilinx has multi-million dollar customers relying on those fuses.  They are no more likely to fail than the rest of the device (in which case you're screwed anyways).
Many thanks for clarifying this for me. I'm not a hardware guy; I'm having to learn very fast indeed...

legendary
Activity: 938
Merit: 1000
What's a GPU?
This looks awesome. What school do you teach at? It seems like it'd be a good place to learn CS Tongue
donator
Activity: 980
Merit: 1004
felonious vagrancy, personified
Just a ping here since I haven't updated the thread in a while; now that the semester is winding down (I teach CS) I should have enough time to take this across the finish line.

I've come to a much better understanding of how the Spartan6 routing fabric works.  Originally I just focused on packing all the logic into the device and keeping things that communicate nearby.  That's not quite enough: in order to make sure Xilinx's router doesn't do stupid things you have to make sure that things that communicate are aligned either vertically or horizontally.  All the retiming and reorganization needed for this took a long time, but it's paying off: the latest iteration has a design clock rate of 180mhz and meets timing for all of the "ordinary" stages -- i.e. all of them except the funny ones at the very beginning, the very end, and the corner turn (stages 30-31).  I also factored in a lot of changes I'd been putting off that reduce the register count by about 10% (mostly to save routing resources).
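The alignment constraint described above can be pictured as a simple check over placement coordinates; the stage names and (x, y) positions below are invented for illustration and are not from the actual design:

```python
# Sketch of the placement rule described above: communicating stages
# should share a column or a row of the fabric, so the router can use
# straight vertical or horizontal routing between them.

def aligned(a, b):
    """True if two placements share an x (column) or a y (row)."""
    return a[0] == b[0] or a[1] == b[1]

placements = {"stage_07": (10, 4), "stage_08": (10, 20), "stage_09": (33, 20)}
pairs = [("stage_07", "stage_08"), ("stage_08", "stage_09")]

# Every communicating pair should be row- or column-aligned:
bad = [(u, v) for u, v in pairs if not aligned(placements[u], placements[v])]
assert not bad
```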

Lastly, my "non-homebrew" Spartan6 board just arrived so I will be able to post useful power numbers soon.
member
Activity: 70
Merit: 10
The value of your improvements, if they are for real, might be diminished as time goes by. It seems that if this is legit you should have a Kickstarter going real soon.
Also, it may be that a project has been announced, due in May (see the Icarus thread), that will be as fast/efficient/inexpensive as your target.

So you toil away, achieve your goal, then find out you were beaten to market. Ouch.
mrb
legendary
Activity: 1512
Merit: 1027
I certainly feel I was properly compensated.

Today, I would either do it the same way, or do it using the Kickstarter model as suggested in this thread. For those not familiar with it: http://en.wikipedia.org/wiki/Kickstarter
In short: funds are pledged by potential customers. If the funding target is not reached by a certain date, the money is returned to those who pledged it. Otherwise the funds go to the seller, who can finish producing and releasing the product.
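That all-or-nothing mechanic, as a minimal sketch:

```python
# Minimal model of all-or-nothing crowdfunding settlement as
# described above: fund the seller only if the target is met,
# otherwise refund every pledge.

def settle(pledges, target):
    """Return (amount_to_seller, refunds) once the deadline passes."""
    total = sum(pledges.values())
    if total >= target:
        return total, {}            # target met: seller is funded
    return 0, dict(pledges)         # target missed: everyone refunded

funded, refunds = settle({"alice": 60, "bob": 50}, target=100)
assert funded == 110 and refunds == {}
```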
full member
Activity: 281
Merit: 100
Whatever happened to that guy that figured out how to implement BF_INT (or something) on the GPU miners before the rest of the coders did shortly after?

I looked for a few minutes on the forum and did not find it, as if people have completely forgotten.

I am the guy.

What happened to me? Well for a while I was selling my BFI_INT-enabled GPU miner. My strategy to prevent it from being leaked and pirated (I think the only strategy that can work) was to price it quite high (400 BTC which was about $200 at the time), so that the buyers would feel they had a valuable, exclusive product that they would not want to leak. AFAIK it worked and was never leaked. At the time I was 10-15% faster than the other open source miners.

Then the open source ones progressively caught up, so I discounted the price down to 300, then 250 BTC. And eventually I stopped selling it altogether.

I am glad you saw this. I know I would have used your miner if I could have (I was, and still am, pretty small time). I don't know if other people do the same, but I have contributed BTC for every piece of open source software (miners, desktop gadgets, free pools, bounties) I have used for this hobby. Not windfall amounts, but respectable at the time. Would you have done it the same way again? Maybe what you did was the best way to go about it. Do you feel you were properly compensated for the time you put into it? (Perhaps after the rise in BTC value.)

I don’t really care what is decided with this new FPGA approach.  I know eventually people are going to find every way possible to squeeze every bit of performance out of these chips as possible.  I only mentioned the BF_INT thing because the situation sounded familiar.

One thing is certain, this community has people that are very passionate about the related technologies and you can’t really compete long term with people who work for free.
mrb
legendary
Activity: 1512
Merit: 1027
Whatever happened to that guy that figured out how to implement BF_INT (or something) on the GPU miners before the rest of the coders did shortly after?

I looked for a few minutes on the forum and did not find it, as if people have completely forgotten.

I am the guy.

What happened to me? Well for a while I was selling my BFI_INT-enabled GPU miner. My strategy to prevent it from being leaked and pirated (I think the only strategy that can work) was to price it quite high (400 BTC which was about $200 at the time), so that the buyers would feel they had a valuable, exclusive product that they would not want to leak. AFAIK it worked and was never leaked. At the time I was 10-15% faster than the other open source miners.

Then the open source ones progressively caught up, so I discounted the price down to 300, then 250 BTC. And eventually I stopped selling it altogether.
full member
Activity: 281
Merit: 100
Whatever happened to that guy that figured out how to implement BF_INT (or something) on the GPU miners before the rest of the coders did shortly after?

I looked for a few minutes on the forum and did not find it, as if people have completely forgotten.