DIY FPGA Mining rig for any algorithm with fast ROI - page 72.

LTCMAXMYR

hero member

Activity: 609

Merit: 500

DMD,XZC

It's all nonsense. There is no evidence to back up the claimed hashrate and power consumption.

nsummy

full member

Activity: 1179

Merit: 131

Quote from: grendel25 on May 10, 2018, 08:49:20 PM

TLDR: Sounds Neat. Do it if you can afford it. Be aware of cost and risk factor of future support.

I have been mining off-and-on since 2014 and I remember quite well the FPGA interest back in 2014. I'm sure there have been other FPGA efforts before 2014 but my point is to simply share my own opinion.

First, I'll just say this is probably a great opportunity for people who can afford this. Not everyone will be on board to spend $5K to get a system running. That's a rough estimate.. but $4K for the FPGA and then whatever else for ancillary equipment like mobo/ram/psu/etc.

So people saying that this will cause the demise of GPU mining are being short-sighted.

Aside from being expensive, it is technically daunting. We would be relying on a programmer for future firmware updates, and from what I've seen the support is just not on the same scale as the existing support for other mining options.

So I'll just summarize by saying this sounds like an awesome opportunity for diversification of a mining portfolio if you have the money for that. I imagine youtubers like VoskCoin would be jumping all over this.

TLDR: Sounds Neat. Do it if you can afford it. Be aware of cost and risk factor of future support.

This is probably one of the best and most grounded comments I've read on this forum in a long time. The fact is that 95% of miners out there have a very rudimentary understanding of computers, algorithms, and programming. The barrier to entry is minuscule: buy a few GPUs, and there are numerous programs available that are designed to mine on them. I really can't envision a scenario where FPGAs become fully mainstream. As evidenced by all of the posts on here, they are difficult to buy, let alone program. The real profits are always going to go to the people who spend the time and effort to find an edge in the mining game that goes beyond Nvidia, AMD, and Bitmain.

lunobird

full member

Activity: 846

Merit: 115

Everybody wake the f*ck up. GPU mining is dead. You're competing with the big ASIC boys, and if you're competing with 14-year-olds with gamer GPUs who get free electricity from their parents, then your GPU farm is doomed to fail. Expect half of your costs to go to electricity.

The decentralized dream is bullshit. Only serious players can profit and the rest will get roasted. It's a zero-sum game. FPGAs and ASICs are the only game left to compete in at industrial or small-business scale. Every f*cktard gamer will mine, not report gains to the IRS to cover their GPU cost, and whack off to their $1 daily GPU profits and 2-year ROI.

GPUHoarder

member

Activity: 154

Merit: 37

Quote from: senseless on May 10, 2018, 09:19:42 PM

Quote from: GPUHoarder on May 10, 2018, 09:13:04 PM

Quote from: senseless on May 10, 2018, 09:07:22 PM

Quote from: s1gs3gv on May 10, 2018, 08:41:33 PM

Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

Yes. Some of the components hardly fit on a 9P. The CubeHash example I gave a few posts ago, for instance.

Are you saying the algorithm for an individual CubeHash pipeline hardly fits on a 9P? While I haven’t studied CubeHash specifically, it’s claimed to take 200 cycles on a basic CPU, and I can implement a lot of basic CPU cores on a 9P...

A pipeline, sure; lots of pipelines, even. But if you want to unroll it fully and obtain real (1 GH/s+) performance, it would take the entire 9P, and it's not clear that a fully unrolled version would fit at all.

Ahh ok - I completely understand what you’re saying now. I misread it as the individual algorithm taking the whole chip.

The way I’d attack Lyra2Rev2 in hardware is a literal pipeline of chips, sized according to parallelized throughput. It looks like the whole chain is 256-bit hashes, so a 400 Gbps interconnect could handle your 1 GH/s+. For chip-to-chip interconnect on the same board, 3 quads of 32 Gbps transceivers should be sufficient. The Blake/Keccak/Skein stages are probably all on the same chip or a much smaller chip.
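The link sizing in this post can be checked with a few lines of arithmetic. This is a rough sketch, not a design: it assumes exactly one 256-bit intermediate hash crosses each chip-to-chip hop per nonce, no framing or encoding overhead, and four lanes per transceiver quad. The function names are made up for illustration.

```python
# Back-of-the-envelope link sizing for a multi-chip hash pipeline.
# Assumptions (not from any datasheet): one 256-bit intermediate hash
# crosses each chip-to-chip hop per nonce, with no framing overhead.

HASH_BITS = 256        # width of the state handed from one stage to the next
LANES_PER_QUAD = 4     # assumed: a transceiver quad groups four lanes
LANE_GBPS = 32         # raw line rate per lane, as quoted in the post


def required_gbps(hashes_per_second: float, bits_per_hash: int = HASH_BITS) -> float:
    """Bandwidth needed to forward one hash per nonce, in Gbps."""
    return hashes_per_second * bits_per_hash / 1e9


def link_gbps(quads: int) -> int:
    """Raw bandwidth of a link built from the given number of quads, in Gbps."""
    return quads * LANES_PER_QUAD * LANE_GBPS


if __name__ == "__main__":
    need = required_gbps(1e9)   # 1 GH/s target
    have = link_gbps(3)         # three quads between neighbouring chips
    print(f"need {need:.0f} Gbps, have {have} Gbps, headroom {have - need:.0f} Gbps")
```

Under those assumptions 1 GH/s needs 256 Gbps per hop and three quads supply 384 Gbps raw, which is in line with the 400 Gbps figure mentioned above.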

Your 9P is probably $3000; you could buy 2-3x the LUTs for pipelines on smaller chips for that...

All that said, it looks like 1 GH/s is worth about $1000/mo right now, so that is still quite a long payback period if you’re using $12k in hardware.
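For the payback estimate, a simple sketch helps make the assumptions explicit. It uses only the two figures from the post ($12k of hardware, about $1000/mo of revenue) and treats revenue as constant, which mining revenue rarely is. The optional power-cost parameter is a hypothetical extra, not something the post gives a number for.

```python
def payback_months(hardware_cost_usd: float,
                   revenue_per_month_usd: float,
                   power_cost_per_month_usd: float = 0.0) -> float:
    """Months until cumulative net revenue equals the hardware cost.

    Assumes constant revenue and power cost; difficulty changes and coin
    price swings are ignored.
    """
    net = revenue_per_month_usd - power_cost_per_month_usd
    if net <= 0:
        raise ValueError("the rig never pays for itself at these rates")
    return hardware_cost_usd / net


if __name__ == "__main__":
    # Figures from the post: $12k of hardware earning about $1000/mo.
    print(f"{payback_months(12_000, 1_000):.1f} months")
```

With the post's numbers and no power cost, that is a 12-month payback; any electricity cost or drop in revenue pushes it out further.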

senseless

hero member

Activity: 1118

Merit: 541

Quote from: GPUHoarder on May 10, 2018, 09:13:04 PM

Quote from: senseless on May 10, 2018, 09:07:22 PM

Quote from: s1gs3gv on May 10, 2018, 08:41:33 PM

Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

Yes. Some of the components hardly fit on a 9P. The CubeHash example I gave a few posts ago, for instance.

Are you saying the algorithm for an individual CubeHash pipeline hardly fits on a 9P? While I haven’t studied CubeHash specifically, it’s claimed to take 200 cycles on a basic CPU, and I can implement a lot of basic CPU cores on a 9P...

A pipeline, sure; lots of pipelines, even. But if you want to unroll it fully and obtain real (1 GH/s+) performance, it would take the entire 9P, and it's not clear that a fully unrolled version would fit at all.

s1gs3gv

legendary

Activity: 1316

Merit: 1014

ex uno plures

Quote from: GPUHoarder on May 10, 2018, 09:01:43 PM

That’s exactly what I’ve been working on, for reasonable definitions of small and fast.

I think it's a promising avenue for research. It's the one I would choose too. Sinking big bucks into an investment in UltraScale+ FPGA boards and being dependent on one or two VHDL coders who know the subject matter seems ~~risky~~ adventurous.

Revisiting SHA-256 ASIC development history, I can think of at least two companies (CoinTerra, Spondoolies) that failed because they tried to design large-die-area, high-hash-rate chips and were late to market, and at least one company (Bitmain) that succeeded by designing smaller and simpler chips, using lots of them in a miner, and being first to market.

GPUHoarder

member

Activity: 154

Merit: 37

Quote from: senseless on May 10, 2018, 09:07:22 PM

Quote from: s1gs3gv on May 10, 2018, 08:41:33 PM

Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

Yes. Some of the components hardly fit on a 9P. The CubeHash example I gave a few posts ago, for instance.

Are you saying the algorithm for an individual CubeHash pipeline hardly fits on a 9P? While I haven’t studied CubeHash specifically, it’s claimed to take 200 cycles on a basic CPU, and I can implement a lot of basic CPU cores on a 9P...

senseless

hero member

Activity: 1118

Merit: 541

Quote from: s1gs3gv on May 10, 2018, 08:41:33 PM

Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

Yes. Some of the components hardly fit on a 9P. The CubeHash example I gave a few posts ago, for instance.

GPUHoarder

member

Activity: 154

Merit: 37

Quote from: s1gs3gv on May 10, 2018, 08:41:33 PM

Quote from: GPUHoarder on May 10, 2018, 08:20:47 PM

You can’t fit a Stratix 10 on an NVMe stick... I’ve tried. Kintex or Arria is about as big as you can get. Damn 22x80 form factor.

I know.

I question the need for ultra-large and expensive FPGAs instead of clusters of smaller FPGAs with high-speed interconnects.
Perhaps a custom PCI-E format board with 4-6 last-generation FPGA devices with some fast memory and a crosspoint switch. Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

That’s exactly what I’ve been working on, for reasonable definitions of small and fast.

I have two active projects in the first spin batch phase. One is NVMe with basically the biggest thing you can fit on there, and it augments GPUs more than it works standalone.

The second is 4 chips on one PCIe card, with a switch, but the most reasonable chip that can be used in that configuration is still not what you would call cheap. Frankly, even the NVMe chip costs as much as some graphics cards, just to get four PCIe 3.0 lanes.

The one advantage is that the 4-chip board uses modules, so you could buy one with 1 module populated in the 3-figure range. That is, when it is ready, which at this point is likely August for mass production.

grendel25

legendary

Activity: 2296

Merit: 1031

TLDR: Sounds Neat. Do it if you can afford it. Be aware of cost and risk factor of future support.

I have been mining off-and-on since 2014 and I remember quite well the FPGA interest back in 2014. I'm sure there have been other FPGA efforts before 2014 but my point is to simply share my own opinion.

First, I'll just say this is probably a great opportunity for people who can afford this. Not everyone will be on board to spend $5K to get a system running. That's a rough estimate.. but $4K for the FPGA and then whatever else for ancillary equipment like mobo/ram/psu/etc.

So people saying that this will cause the demise of GPU mining are being short-sighted.

Aside from being expensive, it is technically daunting. We would be relying on a programmer for future firmware updates, and from what I've seen the support is just not on the same scale as the existing support for other mining options.

So I'll just summarize by saying this sounds like an awesome opportunity for diversification of a mining portfolio if you have the money for that. I imagine youtubers like VoskCoin would be jumping all over this.

TLDR: Sounds Neat. Do it if you can afford it. Be aware of cost and risk factor of future support.

s1gs3gv

legendary

Activity: 1316

Merit: 1014

ex uno plures

Quote from: GPUHoarder on May 10, 2018, 08:20:47 PM

You can’t fit a Stratix 10 on an NVMe stick... I’ve tried. Kintex or Arria is about as big as you can get. Damn 22x80 form factor.

I know.

I question the need for ultra-large and expensive FPGAs instead of clusters of smaller FPGAs with high-speed interconnects.
Perhaps a custom PCI-E format board with 4-6 last-generation FPGA devices with some fast memory and a crosspoint switch. Are there many individual components of the Xnn series of algos which won't fit on an Arria or even a Cyclone?

senseless

hero member

Activity: 1118

Merit: 541

Quote from: Way2Paradise on May 10, 2018, 08:09:44 PM

I discovered Bitcore yesterday. It seems to be a pure GPU coin, hence my question.

Is it possible to mine Bitcore (BTX) coins with this FPGA miner? The algo is Timetravel10.

Yes, it's basically lyra2rev2 with only a single round of cubehash and no memory. I'd guess maybe 900 MH/s to 1.2 GH/s.

Edit: No, sorry, it's nist5 + bmw, luffa and cube, with a randomized order to the hashes. Yeah, maybe 900 MH/s with some intelligent buffering. It also depends on how long the chain is, and I'm not quite sure I understand that.

HardwareCollector

member

Activity: 144

Merit: 10

@senseless

@2112

@GPUHoarder

Thank you for your explanations. I understood what was said; much obliged.

No need to explain what is quoted below. I have the patience to wait and see how throughput can be doubled or quadrupled when the cards are daisy-chained, as stated below. I must not be as good at math as I thought (grin)

Quote from: whitefire990 on May 02, 2018, 04:55:30 PM

- It is possible by using data from initial algorithms to project the hashrate within +/-10% for future algorithms, and in that light the expected rates (per card) are about 300MH/s for X17 & X16R, 25MH/s for Neoscrypt, 600MH/s for Lyra2v2; 150MH/s for Xevan (Bittware card only for Xevan!). For Equihash it is much harder to calculate the projected hash rate. I don't think Ethash would be profitable enough to be worth it. Those numbers are just projections though, and their profits are in the same range as the initial algorithms being released, with X16R and Xevan looking the best at around $75/day
- X17 and X16R require two FPGA cards daisy chained together with 2 x 100G ethernet cables, one FPGA does half the function, the other FPGA does the other half
- Xevan requires FOUR FPGA cards daisy chained together with 6 x 100G ethernet cables; this is only possible with the Bittware card

Quote from: whitefire990 on May 02, 2018, 08:29:05 PM

Clarifying the projected hash rates
X17: 2 cards daisy chained get 600MH/s total
X16R: 2 cards daisy chained get 600MH/s total
Xevan: 4 Bittware cards daisy chained get 600MH/s total
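To compare the clarified chain totals with the earlier per-card projections, it helps to divide each total by the number of cards in the chain. The sketch below does only that; the dictionary layout and function name are illustrative, and the numbers are the projections quoted above, not measurements.

```python
# Projected totals from the clarification post: algorithm -> (cards in chain, total MH/s).
CHAINS = {
    "X17": (2, 600),
    "X16R": (2, 600),
    "Xevan": (4, 600),
}


def per_card_mhs(cards: int, total_mhs: float) -> float:
    """Average hashrate contributed per card in a daisy chain."""
    return total_mhs / cards


if __name__ == "__main__":
    for algo, (cards, total) in CHAINS.items():
        print(f"{algo}: {cards} cards, {total} MH/s total, "
              f"{per_card_mhs(cards, total):.0f} MH/s per card")
```

This works out to 300 MH/s per card for X17 and X16R and 150 MH/s per card for Xevan, matching the per-card figures in the first quoted post. On these projections, chaining splits one algorithm across cards without raising the per-card rate.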

GPUHoarder

member

Activity: 154

Merit: 37

Quote from: s1gs3gv on May 10, 2018, 07:54:18 PM

Quote from: senseless on May 10, 2018, 11:40:53 AM

In the next couple of years we'll be buying Stratix 10 PCI-E boards at Walmart for $600 a pop.

And gamers will be complaining that miners have bought up all the NVMe Stratix 10 FPGA sticks.

You can’t fit a Stratix 10 on an NVMe stick... I’ve tried. Kintex or Arria is about as big as you can get. Damn 22x80 form factor.

Way2Paradise

jr. member

Activity: 322

Merit: 1

I discovered Bitcore yesterday. It seems to be a pure GPU coin, hence my question.

Is it possible to mine Bitcore (BTX) coins with this FPGA miner? The algo is Timetravel10.

s1gs3gv

legendary

Activity: 1316

Merit: 1014

ex uno plures

Quote from: senseless on May 10, 2018, 11:40:53 AM

In the next couple of years we'll be buying Stratix 10 PCI-E boards at Walmart for $600 a pop.

And gamers will be complaining that miners have bought up all the NVMe Stratix 10 FPGA sticks.

senseless

hero member

Activity: 1118

Merit: 541

Quote from: BittWareFPGATech on May 10, 2018, 10:17:59 AM

Quote from: greerso on May 10, 2018, 09:14:07 AM

I'm wanting to try fpga mining on an AWS EC2 instance. It seems anyone that has done/is doing that is keeping the 'how to' close to their chest. BFGMiner seems to be the way to go, but where does one get the bitstream (or in the case of AWS the AFI containing the bitstream)?

After seeing posts saying someone fried an AWS F1 board with 300A, and that now there is a 150W limit but it is only a warning, I did a little searching and found that it appears AWS F1 limits your core power (Vccint) to 85W, which would be 100A at 0.85V. They say they may/will shut you down (gate your clocks) if you exceed this, see https://github.com/aws/aws-fpga/blob/master/hdk/docs/afi_power.md

That's not what I said at all. Power limitations were not introduced until AWS shell v1.3.5 IIRC (maybe as early as 1.3.0? I don't remember offhand), sometime around Sept/Oct 2017. And yes, when I compile firmware for my VCU118 (80 A, 0.85 V Vccint), I can ignore the power warning if I wish, continue compiling, load the bitstream, and fry my $7500 board, if I wanted to.

Quote from: senseless on May 06, 2018, 12:29:07 PM

Really? What happens if you try to draw 300 amps on a board that only has a 160 A Vccint supply? Did you know Vivado only tosses an ignorable warning? That you can still compile and complete the firmware? I know people who have fried their own FPGA boards by drawing more current than the board can supply. I have destroyed Amazon boards by drawing too much current (unintentionally). This is just one of many ways you can physically destroy an FPGA with a bad firmware / design problem.

The only fud about what I said is that it could possibly happen.
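The power and current figures traded in this exchange are all related by P = V x I on the core (Vccint) rail. The sketch below just reproduces that arithmetic; the function names are hypothetical, and it says nothing about how any board actually enforces or fails to enforce its limits.

```python
# Core-rail (Vccint) bookkeeping: power = voltage * current.
# All figures are the ones quoted in the posts above.

def rail_amps(watts: float, volts: float) -> float:
    """Current drawn on a rail for a given power and voltage."""
    return watts / volts


def rail_watts(amps: float, volts: float) -> float:
    """Power delivered on a rail for a given current and voltage."""
    return amps * volts


def overload_ratio(drawn_amps: float, supply_amps: float) -> float:
    """How far a design's current draw exceeds what the supply is rated for."""
    return drawn_amps / supply_amps


if __name__ == "__main__":
    print(f"AWS F1 core limit: {rail_amps(85, 0.85):.0f} A at 0.85 V")
    print(f"80 A rail at 0.85 V: {rail_watts(80, 0.85):.0f} W")
    print(f"300 A on a 160 A supply: {overload_ratio(300, 160):.2f}x rated current")
```

So 85 W at 0.85 V corresponds to about 100 A, an 80 A rail at 0.85 V delivers about 68 W, and a 300 A design on a 160 A supply asks for nearly twice the rated current.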

GPUHoarder

member

Activity: 154

Merit: 37

Quote from: HardwareCollector on May 10, 2018, 06:21:05 PM

Quote from: 2112 on May 10, 2018, 06:09:08 PM

Quote from: GPUHoarder on May 10, 2018, 05:52:31 PM

Quote from: HardwareCollector on May 10, 2018, 05:28:52 PM

As it relates to Ravencoin mining with FPGAs, OP will need to store over 300 million bitstreams to account for every possible combination. Better get back to the drawing board because this design will never work.

Partial reconfiguration - you don’t need every combination, just every building block.

Yeah, for X16R coins that's 16^2=256; for X16S coins that's 16!/14!=240. Certainly doable.
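The two counts above are the number of ordered pairs of hash functions drawn from 16, with and without repetition. A quick check, assuming a "building block" means such a pair (the post does not spell this out). For comparison, the sketch also counts full 16-stage orderings, on the assumption that X16R may repeat an algorithm in any slot while X16S uses each algorithm exactly once.

```python
import math

ALGOS = 16  # hash functions in the X16 family

# Ordered pairs of stages, as in the post.
pairs_with_repeats = ALGOS ** 2                                       # 256
pairs_without_repeats = math.factorial(16) // math.factorial(14)      # 240

# Full 16-stage chain orderings (assumption: X16R allows repeats, X16S does not).
full_chains_x16r = ALGOS ** ALGOS
full_chains_x16s = math.factorial(ALGOS)

if __name__ == "__main__":
    print("pair blocks, repeats allowed:", pairs_with_repeats)
    print("pair blocks, no repeats:     ", pairs_without_repeats)
    print("full chains, repeats allowed:", full_chains_x16r)
    print("full chains, no repeats:     ", full_chains_x16s)
```

The gap between a few hundred building blocks and the number of full orderings is the point of the partial reconfiguration argument: only the blocks need a bitstream, not every chain.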

Hmmm, two accelerator cards will be daisy chained with pipelining and their performance will magically double. I will believe it when I see it like all other claims made by the OP.

This isn’t hard. I do it all the time for a few algorithms. Here’s a contrived example: fill the scratchpad for CN7 on one FPGA dedicated to that, spitting out 2 MB scratchpads all day long, then taking them back in and compressing / finalizing them. Total bandwidth for (for example) 22 kH/s is 343 Gbps. That’s achievable on lots of current hardware.
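The 343 Gbps figure can be reproduced as follows. This is only a sketch of the arithmetic: it assumes a 2 MiB scratchpad per hash, counts one direction of transfer, and ignores any link overhead. The quoted number appears to use binary prefixes (2^30 bits per "G"); in decimal gigabits the same stream is closer to 369 Gbps.

```python
SCRATCHPAD_BYTES = 2 * 1024 * 1024   # 2 MiB scratchpad per hash (assumed size)


def scratchpad_bits_per_second(hashes_per_second: float) -> float:
    """Bits per second needed to ship one full scratchpad per hash, one way."""
    return hashes_per_second * SCRATCHPAD_BYTES * 8


if __name__ == "__main__":
    bits = scratchpad_bits_per_second(22_000)          # 22 kH/s, as in the post
    print(f"{bits / 2**30:.2f} Gbps with binary prefixes")
    print(f"{bits / 1e9:.2f} Gbps with decimal prefixes")
```

If the scratchpads also travel back for finalization over the same link, the required bandwidth doubles, so the one-way figure is best read as a lower bound.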

This makes it a lot easier to build two sets of pipelines on two FPGAs for two related but very different sets of operations. Running the whole algorithm on each of two separate FPGAs couldn’t achieve the same performance.

This was a back-of-the-envelope example, but the Xxx algorithms, which just chain more hashes onto the process, definitely lend themselves to this kind of operation. This (along with memory bandwidth and easier cooling) is why my accelerator cards have four 75 W UltraScale+ FPGAs and not one big Virtex. Interconnect on those is 256 Gbps.

Edit: Let me try to phrase this in a few words. Don’t waste the extremely high-bandwidth interconnect and resources inside the FPGA on something you can accomplish with the slow external interfaces.

hogito

newbie

Activity: 31

Merit: 0

Please PM me if you're looking for someone to guinea-pig or collaborate to help progress your goal. I have funds I'm willing to use for a POC.

2112

legendary

Activity: 2128

Merit: 1074

Quote from: HardwareCollector on May 10, 2018, 06:21:05 PM

Hmmm, two accelerator cards will be daisy chained with pipelining and their performance will magically double. I will believe it when I see it like all other claims made by the OP.

Why wouldn't it pipeline efficiently? There are 8 transceivers available over QSFP28 that work at a raw speed of 32.75 Gbps, for a total of 262 Gbps from one board to the other in one direction. No back-channel is required. We also don't care about protocols or error detection & correction over our link; it is for lottery purposes only.

If that is not enough there are 16 of the same transceivers connected to the PCIe edge connector. This may be a little more tricky, to run them at full blast without obeying PCIe protocols we would need to either do some simple trace cuts on the backplane or find a way to busy-out and/or disable the PCIe bridge chip.
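The raw inter-board numbers above translate into a hash-forwarding budget as sketched below. The lane count and line rate are the ones quoted in this post; the 256 bits moved per hash is an assumption for illustration, and real links would lose some of the raw rate to encoding.

```python
LANE_GBPS = 32.75   # raw line rate per transceiver, as quoted in the post


def raw_link_gbps(lanes: int, lane_gbps: float = LANE_GBPS) -> float:
    """Aggregate raw one-way bandwidth over the given number of lanes, in Gbps."""
    return lanes * lane_gbps


def max_hashes_per_second(link_gbps: float, bits_per_hash: int = 256) -> float:
    """Upper bound on hashes per second the link can forward.

    bits_per_hash is an assumed payload size, not a figure from the post.
    """
    return link_gbps * 1e9 / bits_per_hash


if __name__ == "__main__":
    qsfp = raw_link_gbps(8)     # the 8 QSFP28 transceivers
    edge = raw_link_gbps(16)    # the 16 transceivers on the PCIe edge connector
    print(f"QSFP28: {qsfp:.0f} Gbps, about {max_hashes_per_second(qsfp) / 1e9:.2f} GH/s of 256-bit states")
    print(f"PCIe edge lanes: {edge:.0f} Gbps more if they can be driven raw")
```

On these assumptions the QSFP28 lanes alone could carry roughly one 256-bit state per nonce at about 1 GH/s, which supports the view that inter-board bandwidth is not the main limitation.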

I seriously don't see the inter-board bandwidth as an important limitation. I haven't done the above with the Ultrascale+ technology, but I've successfully found ways to abuse the older connectivity standards (for an application not related to cryptocoins, but also tolerating occasional noise.)

Topic: DIY FPGA Mining rig for any algorithm with fast ROI - page 72. (Read 99502 times)