
Topic: DIY FPGA Mining rig for any algorithm with fast ROI - page 74. (Read 99472 times)

legendary
Activity: 1316
Merit: 1014
ex uno plures
I think you are not factoring in the cost of FPGAs.

That's a good point, but the FPGA market looks totally dysfunctional to me. I suspect that pricing practices will change as the impact of the Intel/Altera acquisition works through the system.
newbie
Activity: 10
Merit: 6
I'm wanting to try fpga mining on an AWS EC2 instance.  It seems anyone that has done/is doing that is keeping the 'how to' close to their chest.  BFGMiner seems to be the way to go, but where does one get the bitstream (or in the case of AWS the AFI containing the bitstream)?

After seeing posts saying someone fried an AWS F1 board with 300A, and that now there is a 150W limit but it is only a warning, I did a little searching and found that it appears AWS F1 limits your core power (Vccint) to 85W, which would be 100A at 0.85V.  They say they may/will shut you down (gate your clocks) if you exceed this, see https://github.com/aws/aws-fpga/blob/master/hdk/docs/afi_power.md
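For anyone who wants to check that figure, here is a quick sketch in Python. It assumes nothing beyond P = V x I; the function names are made up for illustration and are not part of any AWS tool.

```python
def core_current_amps(power_watts: float, vccint_volts: float) -> float:
    """Current drawn on the core rail for a given power at a given voltage."""
    return power_watts / vccint_volts


def core_power_watts(current_amps: float, vccint_volts: float) -> float:
    """Power on the core rail for a given current at a given voltage."""
    return current_amps * vccint_volts


# The 85 W Vccint limit at the nominal 0.85 V core voltage.
print(f"{core_current_amps(85.0, 0.85):.0f} A at the 85 W limit")

# The 300 A figure from the earlier posts, expressed as power at 0.85 V.
print(f"{core_power_watts(300.0, 0.85):.0f} W at 300 A")
```

So the 300A story would have been about three times the documented core power limit, if that number was accurate.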

Also, someone suggested that since the power supplies are current limited, you could just raise the voltage to lower the current. Unfortunately, this is not how these chips work.  They basically take the same amount of current to do a given job, regardless of voltage.  So the lower the voltage, the lower the power.  This is why chip makers keep trying to run at lower and lower voltages, since it saves power.

There is another power consideration in all these latest generation chips as well, which is static power (as opposed to dynamic), which is due to leakage in the transistors.  While dynamic power is linear with voltage, static power (leakage) is non-linear, going up as the square of voltage.  It also increases dramatically with temperature.  Static power in general has nothing to do with how much work is being done, hence the name "static".

Note that raising the voltage may let you run faster, but faster means yet more power, since dynamic power is highly dependent on clock rate.  You can run certain VU9Ps at 0.72V instead of 0.85V and this will certainly lower your power, BUT it also kills your performance by 20-30%.  You can also run certain VU9Ps at 0.9V and get higher performance, with the associated higher power.  Of course the chip vendors charge a premium for these special devices over the standard ones.

So, determining the optimal combination of FPGA size, core voltage, and clock rate is not such a simple task.  As mentioned, we are working with the OP on this.
member
Activity: 531
Merit: 29

Edit: And the irony of this -> Only ASICs will survive.



No. We'll see adaptive algos introduced which favor FPGAs. While FPGA programming may be a specialized skill and it takes time to code up a new algo, the cost and time to market of an FPGA solution will always be a lot less than designing and deploying new ASIC solutions, and their energy efficiency (operational cost) will always be better than GPUs.

I think there is no reason why FPGAs can't be game changers.

I think you are not factoring in the cost of FPGAs.

I looked at x16r / RVN numbers. I estimate about $25 per FPGA per day with 10000 on the network. That's 5+ months to recover the cost. And 10000 is conservative, considering x16r probably has 4 times the GPUs on it currently compared to Phi.

Make it 20000, and it's down to $12/card.
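A rough sketch of that arithmetic in Python, for anyone who wants to plug in their own numbers. It assumes a fixed daily reward pool split evenly across all cards, an unchanged coin price, and the $4000 card cost mentioned elsewhere in this thread; the function names are hypothetical.

```python
def daily_revenue_per_card(pool_usd_per_day: float, cards: int) -> float:
    """Even split of a fixed daily reward pool across all cards on the network."""
    return pool_usd_per_day / cards


def payback_days(card_cost_usd: float, usd_per_card_per_day: float) -> float:
    """Days needed to earn back the card cost, ignoring power and difficulty drift."""
    return card_cost_usd / usd_per_card_per_day


# Pool size implied by the estimate above: $25 per card per day with 10000 cards.
pool = 25.0 * 10_000
card_cost = 4000.0  # assumption: board price quoted elsewhere in the thread

for cards in (10_000, 20_000):
    per_card = daily_revenue_per_card(pool, cards)
    days = payback_days(card_cost, per_card)
    print(f"{cards} cards: ${per_card:.2f}/card/day, payback in {days:.0f} days")
```

Doubling the card count halves the per-card return and doubles the payback time, which is the whole point of the estimate.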

Anyway, it's in motion; this is not going to end well for GPUs or FPGAs.
legendary
Activity: 1316
Merit: 1014
ex uno plures

Edit: And the irony of this -> Only ASICs will survive.



No. We'll see adaptive algos introduced which favor FPGAs. While FPGA programming may be a specialized skill and it takes time to code up a new algo, the cost and time to market of an FPGA solution will always be a lot less than designing and deploying new ASIC solutions, and their energy efficiency (operational cost) will always be better than GPUs.

Personally, I think there is no reason why FPGAs can't be game changers.
member
Activity: 144
Merit: 10
I'm wanting to try fpga mining on an AWS EC2 instance.  It seems anyone that has done/is doing that is keeping the 'how to' close to their chest.  BFGMiner seems to be the way to go, but where does one get the bitstream (or in the case of AWS the AFI containing the bitstream)?

You will have to design it yourself and here's a good place to begin:
https://github.com/aws/aws-fpga
jr. member
Activity: 33
Merit: 1
I'm wanting to try fpga mining on an AWS EC2 instance.  It seems anyone that has done/is doing that is keeping the 'how to' close to their chest.  BFGMiner seems to be the way to go, but where does one get the bitstream (or in the case of AWS the AFI containing the bitstream)?
legendary
Activity: 1453
Merit: 1011
Bitcoin Talks Bullshit Walks

Keccak (Smartcash, Maxcoin): 136GH/s (17GH/s per card x eight) ($160/day at Apr-30 prices)
Tribus (Denarius, Virtus): 16.8GH/s (2.1GH/s per card x eight) ($304/day at Apr-30 prices)
Phi1612 (Luxcoin, Folm): 5.2GH/s (650MH/s per card x eight) ($456/day at Apr-30 prices)
Skunkhash (Various coins): 10.4GH/s (1.3GH/s per card x eight) ($261/day at Apr-30 prices)

Those yield around US$20-$57 per card per day ($160-$456 per day for the rig).
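The per-card figures follow from dividing the rig totals by eight. A small Python sketch that reproduces them from the numbers listed above (the dictionary layout and function name are just for illustration):

```python
CARDS_PER_RIG = 8

# (rig hashrate in GH/s, rig revenue in USD/day at Apr-30 prices), as listed above
rig_figures = {
    "Keccak": (136.0, 160),
    "Tribus": (16.8, 304),
    "Phi1612": (5.2, 456),
    "Skunkhash": (10.4, 261),
}


def per_card(rig_ghs: float, rig_usd_per_day: float, cards: int = CARDS_PER_RIG):
    """Split rig-level hashrate and revenue evenly across the cards in the rig."""
    return rig_ghs / cards, rig_usd_per_day / cards


for algo, (ghs, usd) in rig_figures.items():
    card_ghs, card_usd = per_card(ghs, usd)
    print(f"{algo:10s} {card_ghs:6.2f} GH/s per card  ${card_usd:6.2f} per card per day")
```

The low and high ends of the range come from Keccak and Phi1612 respectively.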


This is a mutually-assured-destruction arms race.

I calculated for Phi: if there are 5000 such cards online, the per-day return will be $12 per card and not $57. (The GPUs will make pennies, meanwhile.) That will be a year to get back the cost at $4000.

And 5000 is a conservative number, I think, for one algo.

Edit: And the irony of this -> Only ASICs will survive.



Drops the mic!

Thanks for spelling it out for these folks

BR
newbie
Activity: 1
Merit: 1
ok now i see. sorry for my noob question.

Actually it's not a silly question and you're not incorrect. You most likely will require a different bitstream for each board even though the Xilinx chip is the same. The reason for this is that the IO pins on the device are also configurable (i.e. you can change, within certain limits, which pins are connected to which external signals) and it is likely that the pins on the device have been connected up differently on each board. This may be a case of a simple recompile with no changes to the code, but if the pinout is radically different you may have to reoptimise some of the timing to account for the differences in propagation delays between the pins and the logic.
member
Activity: 531
Merit: 29

Keccak (Smartcash, Maxcoin): 136GH/s (17GH/s per card x eight) ($160/day at Apr-30 prices)
Tribus (Denarius, Virtus): 16.8GH/s (2.1GH/s per card x eight) ($304/day at Apr-30 prices)
Phi1612 (Luxcoin, Folm): 5.2GH/s (650MH/s per card x eight) ($456/day at Apr-30 prices)
Skunkhash (Various coins): 10.4GH/s (1.3GH/s per card x eight) ($261/day at Apr-30 prices)

Those yield around US$20-$57 per card per day ($160-$456 per day for the rig).


This is a mutually-assured-destruction arms race.

I calculated for Phi: if there are 5000 such cards online, the per-day return will be $12 per card and not $57. (The GPUs will make pennies, meanwhile.) That will be a year to get back the cost at $4000.

And 5000 is a conservative number, I think, for one algo.

Edit: And the irony of this -> Only ASICs will survive.

member
Activity: 531
Merit: 29
Is this the last chance for Xilinx to cash out on some high-priced FPGA volume before Intel makes its single-chip Xeon-FPGA available to the public? It was announced for the second half of this year.

Don't worry, there is no last chance in an arms race, except when everything blows up. They will just come back with a bigger/better bomb.
sr. member
Activity: 736
Merit: 262
Me, Myself & I
Is this the last chance for Xilinx to cash out on some high-priced FPGA volume before Intel makes its single-chip Xeon-FPGA available to the public? It was announced for the second half of this year.
sr. member
Activity: 512
Merit: 260
ok now i see. sorry for my noob question.
newbie
Activity: 30
Merit: 0
I'm a little confused by the OP's suggestions. In the first post he clearly recommends the Xilinx VCU1525, but later on, throughout the thread, he shows a preference for the Bittware XUPP3R-VU9P.

Surely the bitstream will be chip-specific, so if that is the case then he or someone will have to "compile" a bitstream for each device. And as he seems to be working more closely with Bittware, that will take preference.

The Xilinx DK-U1-VCU1525-A-G is a dev board and locally the price is very high. If I need to buy two to run one, but the OP's preference is Bittware, then that is a sizeable investment to risk. I really think that this will change the face of altcoins, but we will need more bitstream developers.

As said before, we need the Claymores/Wolfs of FPGAs and open hardware design.



They use the same Xilinx chip.
sr. member
Activity: 512
Merit: 260
I'm a little confused by the OP's suggestions. In the first post he clearly recommends the Xilinx VCU1525, but later on, throughout the thread, he shows a preference for the Bittware XUPP3R-VU9P.

Surely the bitstream will be chip-specific, so if that is the case then he or someone will have to "compile" a bitstream for each device. And as he seems to be working more closely with Bittware, that will take preference.

The Xilinx DK-U1-VCU1525-A-G is a dev board and locally the price is very high. If I need to buy two to run one, but the OP's preference is Bittware, then that is a sizeable investment to risk. I really think that this will change the face of altcoins, but we will need more bitstream developers.

As said before, we need the Claymores/Wolfs of FPGAs and open hardware design.

legendary
Activity: 3248
Merit: 1070
just think about it, it's not as crazy as it may sound at first glance

one of these is just 30 GTX 1070s, at the cost of 10 of them (you can find a 1070 for 330 or lower now, so it's up to 12 of them), so it's 3:1 at best, but the new nvidia gpu will be 100% faster than a single 1070

this means that this vs the 1170 will be only 1.5:1, 50% faster or lower... nothing special
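The ratio argument above, sketched in Python. It takes the poster's premises at face value: the board equals 30 GTX 1070s, costs as much as 10 to 12 of them, and a next-gen card is assumed to be twice as fast at the same price (a premise the replies dispute). The function name is hypothetical.

```python
def board_vs_gpu_ratio(gpu_equivalents: float,
                       board_cost_in_gpus: float,
                       new_gpu_speedup: float = 1.0) -> float:
    """Hashrate per dollar of the FPGA board relative to buying GPUs instead.

    gpu_equivalents:    how many reference GPUs the board matches in hashrate
    board_cost_in_gpus: how many reference GPUs the board's price would buy
    new_gpu_speedup:    speed of a newer GPU relative to the reference GPU,
                        assumed to sell at the same price
    """
    return gpu_equivalents / (board_cost_in_gpus * new_gpu_speedup)


print(board_vs_gpu_ratio(30, 10))        # board vs 1070s, board priced at 10 cards
print(board_vs_gpu_ratio(30, 12))        # same, with 1070s at the lower street price
print(board_vs_gpu_ratio(30, 10, 2.0))   # board vs a hypothetical card twice as fast
```

The whole comparison hinges on the speedup and price assumed for the new card, so changing either input moves the ratio a lot.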

https://www.techradar.com/news/nvidia-geforce-gtx-1180-leak-shows-that-its-faster-than-a-titan-xp

That Nvidia will release a GPU that is 100% faster than a 1070 for 330 USD (which is your price in the comparison) is quite unrealistic.
This generation we had quite a boost, yet the 1060 is only about 12% faster than a 970, as an example.
And don't forget the difference in power consumption! I would love a single FPGA instead of a 1080 Ti rig, but have to see that it works first. Lead time in Norway is 12 weeks, though.

it will be 500 at launch, as usual (so this board vs 15 1170s will be $4000 vs $7500), and 700 for the 1180, which is arguably as fast as this fpga board for the same cost and same consumption with some tweaking

a 1170 will consume 100 watts; the tdp of a 1180, which is the strongest card, is the same as a 1070 Ti, which consumes only 100, so a 1170 will probably consume less at a lower tdp

You are dreaming if you think the new 1180 will be as fast as this card.
The 1180 will be an incremental improvement over the outgoing 1080, and even if it were faster than a Titan it would still be far slower than this card.

Depending on the algorithm, we're talking about an FPGA that looks like it can be equivalent to 10+ Titans!


not as fast, but not so far away FOR THE SAME COST; people need to read better, i was not talking 1 vs 1

unless you think that a 1180 will cost $4000...
full member
Activity: 219
Merit: 100

45% -- Would have been lower but I lost a LOT in Jan 2018 when shitsmartcash got hacked

Oh yes, that was me. I was controlling the market as much as I could. The most FPGAs I ever had at a single time was around 800.


Well, I am jealous, and I knew something was up when the prices kept climbing.

Don't be, I wasted all my profits flying all over the world trying to raise $5M to build out a huge FPGA mine. All the investors thought it was too good to be true and did not believe that I was mining on AWS. They kept saying "why isn't anyone else doing this", "I love the idea! Come see me!", "I want to introduce you to my friend! They're ready to invest! You just need to go see them!". Hey, at least I got to see India, China, Hong Kong, Switzerland, Germany, France. Definitely never going to fly anywhere to meet an investor again though.





The famous "take a plane and let's have a beer" scam! What a pity!
member
Activity: 154
Merit: 37
Sorry, I've been meaning to make a post related to this.  We have been working with the OP to determine the optimal configuration and board for this application, as we have a few to choose from.

If you could include a table showing how much (LP)DRAM and SRAM (or low-latency alternative) can be hooked up to each FPGA, as well as the resulting cost, that would be very helpful. Thanks in advance!

The best thing to hook up to the FPGA would be a hybrid memory cube +$500. This would get you the same level of memory performance as HBM memory. The other nice thing, the HMC has a silicon memory controller on it along with some basic logic functions that can speed up certain applications (xor, and, or).



Can the hybrid memory cube be used with the VCU1525?

HMC is usually soldered on to the board just like HBM.  HMC can provide staggering amount of bandwidth although it suffers in latency.  Uses SERDES communication.
There are HMC+Altera FPGA paired boards by PicoComputing iirc.

What is hoped to be accomplished with HBM/HMC? Which algorithms? These typically offer on the order of 500 GB/s. That means for ETH they support a max of 62 MH/s, and for CN7 they alone support “only” 16 KH/s, but they add huge cost premiums to FPGAs, and offer no advantage over GPUs.
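A back-of-the-envelope version of that bound in Python. The Ethash figure of 64 accesses of 128 bytes per hash is from my reading of the Ethash parameters, so treat it as an assumption; for CN7 the sketch only back-derives the memory traffic per hash implied by the 16 KH/s number rather than asserting a spec value. Names are hypothetical.

```python
# Assumption: Ethash performs 64 DAG accesses of 128 bytes each per hash.
ETHASH_BYTES_PER_HASH = 64 * 128


def max_hashrate(bandwidth_bytes_per_s: float, bytes_per_hash: float) -> float:
    """Upper bound on hashes/second when memory bandwidth is the only limit."""
    return bandwidth_bytes_per_s / bytes_per_hash


bandwidth = 500e9  # 500 GB/s, the ballpark quoted for HBM/HMC above

eth_bound = max_hashrate(bandwidth, ETHASH_BYTES_PER_HASH)
print(f"Ethash bound: {eth_bound / 1e6:.1f} MH/s")

# Memory traffic per CN7 hash implied by the 16 KH/s figure in the post.
cn7_bytes_per_hash = bandwidth / 16e3
print(f"Implied CN7 traffic: {cn7_bytes_per_hash / 1e6:.2f} MB per hash")
```

At exactly 500 GB/s the Ethash bound comes out a little under the 62 MH/s quoted, which matches if the bandwidth is closer to 512 GB/s.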

The FPGA advantage is in calculation-bound code, or algorithms whose work happens in memory spaces < 40MB or so, where the onboard 11+ TB/s of UltraRAM/BRAM can be used.

External QDR is going to top out at 80-160 GB/s too because of pin count. Sure it is low latency but that is only part of the equation.

The absolute most bandwidth you can get off chip is if you use 128 transceivers at 32 Gbit/s each in a very expensive chip, and that will give you... 512 GByte/second again.
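The transceiver number is straightforward to check. A minimal Python sketch, assuming raw line rate with no encoding or protocol overhead (real links would deliver less); the function name is made up for illustration.

```python
def aggregate_gbytes_per_s(lanes: int, gbit_per_s_per_lane: float) -> float:
    """Raw aggregate bandwidth in GByte/s across all serial lanes (8 bits per byte)."""
    return lanes * gbit_per_s_per_lane / 8


print(aggregate_gbytes_per_s(128, 32), "GByte/s")
```

So even the most extreme serial-link setup lands in the same range as HBM/HMC.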

There’s little to be gained with off chip memory of the very expensive variety, in most cases.
full member
Activity: 729
Merit: 114
Sorry, I've been meaning to make a post related to this.  We have been working with the OP to determine the optimal configuration and board for this application, as we have a few to choose from.

If you could include a table showing how much (LP)DRAM and SRAM (or low-latency alternative) can be hooked up to each FPGA, as well as the resulting cost, that would be very helpful. Thanks in advance!

The best thing to hook up to the FPGA would be a hybrid memory cube +$500. This would get you the same level of memory performance as HBM memory. The other nice thing, the HMC has a silicon memory controller on it along with some basic logic functions that can speed up certain applications (xor, and, or).



Can the hybrid memory cube be used with the VCU1525?

HMC is usually soldered on to the board just like HBM.  HMC can provide staggering amount of bandwidth although it suffers in latency.  Uses SERDES communication.
There are HMC+Altera FPGA paired boards by PicoComputing iirc.
newbie
Activity: 14
Merit: 0
Sorry, I've been meaning to make a post related to this.  We have been working with the OP to determine the optimal configuration and board for this application, as we have a few to choose from.

If you could include a table showing how much (LP)DRAM and SRAM (or low-latency alternative) can be hooked up to each FPGA, as well as the resulting cost, that would be very helpful. Thanks in advance!

The best thing to hook up to the FPGA would be a hybrid memory cube +$500. This would get you the same level of memory performance as HBM memory. The other nice thing, the HMC has a silicon memory controller on it along with some basic logic functions that can speed up certain applications (xor, and, or).



Can the hybrid memory cube be used with the VCU1525?

No, a new board would have to be designed.

What is the best 64GB DDR4 RAM to get for the VCU1525 then?
hero member
Activity: 1118
Merit: 541
Sorry, I've been meaning to make a post related to this.  We have been working with the OP to determine the optimal configuration and board for this application, as we have a few to choose from.

If you could include a table showing how much (LP)DRAM and SRAM (or low-latency alternative) can be hooked up to each FPGA, as well as the resulting cost, that would be very helpful. Thanks in advance!

The best thing to hook up to the FPGA would be a hybrid memory cube +$500. This would get you the same level of memory performance as HBM memory. The other nice thing, the HMC has a silicon memory controller on it along with some basic logic functions that can speed up certain applications (xor, and, or).



Can the hybrid memory cube be used with the VCU1525?

No, a new board would have to be designed.