Topic: DIY FPGA Mining rig for any algorithm with fast ROI - page 46. (Read 99472 times)

newbie
Activity: 32
Merit: 0

Power in a circuit scales linearly with clock speed and quadratically with voltage.

P = capacitance * voltage^2 * frequency

The reason thermals spiral out of control at high clock speeds is that you also have to keep cranking up the voltage to keep things stable as the clocks go up.

I don't think this is any different for FPGAs. Is it?

If that holds true, dropping the voltage on the FPGA to 0.75V from default 0.85V will allow for about 125% of baseline unmodified performance by allowing 25% OC within the same thermal envelope, assuming it is stable. So maybe about 11GH/s per card.
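
As a rough check of that estimate, here is a minimal sketch that applies only the dynamic-power formula above. It ignores static leakage and assumes the card stays stable at the lower voltage; the function names are made up for illustration.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power from the formula above: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def overclock_headroom(v_default, v_reduced):
    """Frequency multiplier that keeps P constant after a voltage drop.

    With C fixed, C * V1^2 * f1 == C * V2^2 * f2 gives f2 / f1 == (V1 / V2)^2.
    """
    return (v_default / v_reduced) ** 2


headroom = overclock_headroom(0.85, 0.75)
print(f"clock headroom at equal power: {headroom:.3f}x")
```

Under those assumptions the multiplier comes out a little above 1.28, which fits the "about 25%" figure once some margin is left.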
jr. member
Activity: 94
Merit: 1


but those are relatively low frequencies there

a real test would be to set frequency @ 500MHz by whitefire and see an overall power consumption, then
hero member
Activity: 1118
Merit: 541
full member
Activity: 197
Merit: 100
@whitefire990
Could immersion cooling help to avoid at least some of the modifications, like soldering a $4K FPGA board?
People have achieved great results with transformer oil cooling, using ASICs and GPUs.

Based on current experiments, immersion cooling an unmodified VCU1525 will get you to about 80-85% of peak performance on power-hungry algorithms without other modifications. 

Thank you for your work!
I think the average person can do «drop core voltage (no mechanical mod needed)».
If we have both a volt mod and immersion cooling, will it be more than 80-85%?
jr. member
Activity: 94
Merit: 1

It depends on the design; whitefire's designs are far more optimized than mine for those specific algos. It should scale linearly with frequency. So if it's using 306A at 708MHz, it should use 153A (130W) at 354MHz, and that would provide 8.5GH/s. It is a good idea to provide A/C for these cards. The colder the card, the faster it can run.
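
The arithmetic behind those numbers can be sketched as follows, taking the linear-scaling assumption at face value (it is disputed further down the thread). The 0.85V core voltage is an assumption taken from the default mentioned earlier, and the 17GH/s baseline is the figure reported for the 708MHz bitstream.

```python
CORE_VOLTAGE = 0.85  # volts; assumed default core voltage


def scale_linear(value, f_from_mhz, f_to_mhz):
    """Scale a quantity in direct proportion to clock frequency."""
    return value * f_to_mhz / f_from_mhz


amps = scale_linear(306, 708, 354)       # core current at half clock
watts = amps * CORE_VOLTAGE              # P = I * V
hashrate = scale_linear(17, 708, 354)    # GH/s at half clock

print(f"{amps:.0f} A, {watts:.0f} W, {hashrate:.1f} GH/s")
```

If power also needed a lower voltage at the lower clock, the real draw would fall below this linear estimate.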

@whitefire990 Could you reduce the core clock to a more reasonable number and provide some feedback? I'm not sure everyone is going to want to run their card at those junction temps. Keep in mind that even with cooling modifications, at those thermal envelopes, someone running their card in an environment 5°F warmer than yours, even with the same cooling, may have problems.



Of course it does NOT scale linearly with frequency... For example:




I would rather expect more or less 150W @ 500MHz

That's not an FPGA.



...but the rule I'm talking about is GENERAL, for devices using transistors :)
member
Activity: 154
Merit: 37
- does anyone determine profitability including power for GPUs without any mods? The % you achieve without mods is about the same here.

I am quite certain that fewer than 0.001% of GPU miners take a soldering iron to their GPUs to improve mining performance.

The biggest gain is a software undervolt...

And none of those mods are necessary to see better than GPU performance per $ spent, at significantly less power. Some people just like to take it to extreme tuning as well.
newbie
Activity: 32
Merit: 0
- does anyone determine profitability including power for GPUs without any mods? The % you achieve without mods is about the same here.

I am quite certain that fewer than 0.001% of GPU miners take a soldering iron to their GPUs to improve mining performance.
hero member
Activity: 1118
Merit: 541

It depends on the design; whitefire's designs are far more optimized than mine for those specific algos. It should scale linearly with frequency. So if it's using 306A at 708MHz, it should use 153A (130W) at 354MHz, and that would provide 8.5GH/s. It is a good idea to provide A/C for these cards. The colder the card, the faster it can run.

@whitefire990 Could you reduce the core clock to a more reasonable number and provide some feedback? I'm not sure everyone is going to want to run their card at those junction temps. Keep in mind that even with cooling modifications, at those thermal envelopes, someone running their card in an environment 5°F warmer than yours, even with the same cooling, may have problems.



Of course it does NOT scale linearly with frequency... For example:




I would rather expect more or less 150W @ 500MHz

That's not an FPGA.
member
Activity: 154
Merit: 37
GPU_Hoarder has his CN7 (19.7-22KH/s = 10-12x Vega 64) algorithm running on an unmodified VCU1525.

Bitstreams+miner or it didn't happen. :-)

The only thing that has been demonstrated thus far is that buying a VCU1525 for mining Keccak is not a sane proposition vs. getting a 1080Ti for mining ethash.
I am still very hopeful that other algorithms such as CN7 will make this into a much more worthwhile endeavour.

Most importantly, however, I think the profitability measurement should be based on what can be achieved with an unmodified FPGA at 30C ambient temperature, rather than based on optimistic measurements with modifications that involve replacing PCB components.

- does anyone determine profitability including power for GPUs without any mods? The % you achieve without mods is about the same here.

First - Absolutely no one was suggesting you buy these to mine Keccak. Those who understand and take the time to read will profit; those who don't, won't. Simple as that. Whitefire is providing a great service to the community in exposing all this publicly - it has been working in private for a long time. He only stands to lose money by sharing, yet he is sharing. I applaud that alone.

Second - You do realize no one has any incentive to provide you this bitstream and they have every incentive to keep it private, right? Also, it wasn't developed for the VCU1525, just for the same chip, so some modification is always needed.

At a certain point in life “bragging rights” have no value.

The people who do this understand the value of the Keccak test. It proves numerous things on this card - hardware limitations, power and thermals, achievable fabric speed at logic level, and performance of components. Take a look at the Keccak speeds on little baby precious classes of mining FPGAs. 100MH. Sometimes less.

FPGAs are not GPUs. Repeat that a few dozen times. The complexity of a best in class implementation on a large, complicated FPGA like this is very large. So are the gains when it is done right. So is the difference between the trivial implementation and the efficient implementation.

I can say that  all this functionality will be available in the cards come delivery in August. With proper protections. If the community has any say I’m sure open bitstreams will exist before then, we will see if they achieve the marks.

The open source implementations of Keccak are quite poor and come nowhere near this performance, and FPGAs operate below 1GHz, where the graph slope is much shallower. You're missing that this is more than 25x the performance of a GPU in Keccak for 10x the cost of that GPU. Keccak may not be profitable now on either (hint - it was very, very recently), but the principle applies across the algorithm space.
jr. member
Activity: 94
Merit: 1

It depends on the design; whitefire's designs are far more optimized than mine for those specific algos. It should scale linearly with frequency. So if it's using 306A at 708MHz, it should use 153A (130W) at 354MHz, and that would provide 8.5GH/s. It is a good idea to provide A/C for these cards. The colder the card, the faster it can run.

@whitefire990 Could you reduce the core clock to a more reasonable number and provide some feedback? I'm not sure everyone is going to want to run their card at those junction temps. Keep in mind that even with cooling modifications, at those thermal envelopes, someone running their card in an environment 5°F warmer than yours, even with the same cooling, may have problems.



Of course it does NOT scale linearly with frequency... For example:




I would rather expect more or less 150W @ 500MHz
newbie
Activity: 47
Merit: 0
whitefire had stated that this algo was meant to test the thermal limits and ensure the functions work. If this one works for you, he said, all the others will work.......

IMHO, he started with this algo only for these reasons:
1) This algo was already released for FPGA, with VHDL/Verilog sources on GitHub. OP did not write his own hashing-core design; he only adapted it to a specific FPGA board;
2) This is a very simple and very small hashing algorithm that can get maximum performance on an FPGA. Any other algos discussed in this thread will use more of the FPGA's logic cells and will therefore give lower performance (profitability).
legendary
Activity: 1049
Merit: 1001
I have been following this thread for quite some time and I am glad to see things progressing, I would be interested in Tribus and Timetravel10 and running them at lower speeds to ensure a long life of the FPGA. I would also be interested in Lyra2z even though Zcoin is moving away from it, I still see that other coins may continue to use it. What are your thoughts on the Lyra2z330 algo, is this something that might work with an FPGA as well?
member
Activity: 144
Merit: 10
Summary: Go CN7 or go home.
Why would anyone bother developing a bitstream for a $1/day/card fee when you could do it for a $2.5/day/card fee? Keccak seems like a waste of everyone's time.

You've got a point there; why start with the least profitable algorithm? I do not understand the rationale behind this.
newbie
Activity: 32
Merit: 0
No disagreement on that, given that the power costs eat about 27% of mining revenue.
ROI time, however, is the only metric even remotely worth considering. Improving power efficiency gets you from 73% net profit (based on 1080Ti mining ETH) to about 93% net profit (based on this supposed 5x efficiency improvement you mention).
The question then becomes: how long does that 20% profitability improvement take to cover the 500% increase in hardware cost? And that figure is starting to look like it is measured in years rather than months.

The assumption is made based on total cost of ownership, say you spend exactly the same amount of money on hardware, could be FPGA boards or GPUs. You should get at least a 5x reduction in power and space for generating the same amount of profit when compared to efficient GPU mining.

Let me put it another way:
Revenue with keccak: $10/day.
Mining fee: 10%=$1/day
Electricity cost: $1/day.
Net profit: $8/day
Cost of the card: $5000.
$5000/8 = 625 days = 20.5 months

I don't see that keccak is worth bothering with, except as a proof of concept learning tool.

20KH/s of CN7 = $25/day
Mining fee + electricity = $3.5/day
Net: $21.5/day
ROI: 8 months

That is at least in a reasonable ballpark.
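
For anyone who wants to plug in their own numbers, the two estimates above can be reproduced with a small sketch. It assumes constant daily revenue and costs (no difficulty or price changes), and the function name and the 30.44 days-per-month constant are just illustrative choices.

```python
DAYS_PER_MONTH = 30.44  # average month length; an assumption for the conversion


def roi_days(card_cost, revenue_per_day, costs_per_day):
    """Days until cumulative net profit covers the hardware cost."""
    net = revenue_per_day - costs_per_day
    if net <= 0:
        raise ValueError("net profit must be positive to ever reach ROI")
    return card_cost / net


# Keccak: $10/day revenue, $1/day mining fee + $1/day electricity
keccak_days = roi_days(5000, 10, 1 + 1)
# CN7: $25/day revenue, $3.5/day mining fee + electricity combined
cn7_days = roi_days(5000, 25, 3.5)

for name, days in (("keccak", keccak_days), ("CN7", cn7_days)):
    print(f"{name}: {days:.0f} days = {days / DAYS_PER_MONTH:.1f} months")
```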

Summary: Go CN7 or go home.
Why would anyone bother developing a bitstream for a $1/day/card fee when you could do it for a $2.5/day/card fee? Keccak seems like a waste of everyone's time.
member
Activity: 144
Merit: 10
No disagreement on that, given that the power costs eat about 27% of mining revenue.
ROI time, however, is the only metric even remotely worth considering. Improving power efficiency gets you from 73% net profit (based on 1080Ti mining ETH) to about 93% net profit (based on this supposed 5x efficiency improvement you mention).
The question then becomes: how long does that 20% profitability improvement take to cover the 500% increase in hardware cost? And that figure is starting to look like it is measured in years rather than months.

The assumption is made based on total cost of ownership, say you spend exactly the same amount of money on hardware, could be FPGA boards or GPUs. You should get at least a 5x reduction in power and space for generating the same amount of profit when compared to efficient GPU mining.
legendary
Activity: 1453
Merit: 1011
Bitcoin Talks Bullshit Walks
So, in brief, 53 pages of hope, sweat and tribal dancing have ended with ROI=never.

Especially when this is what the community will have to perform to get these specs. This is a bad dream, I think.

From the link

Advanced Modifications

Soaking the heat sink in isopropyl alcohol overnight, then removing it
Removing the 6 power inductors on the board and replacing them with bigger versions with the same footprint
Desoldering the fan connector on the PCB and adding an 8-pin PCIe power connector
Adding a small heat sink to the back of the board on the LTC3884 chip
Adding a heat sink to the 6 inductors
Mounting an air blowing fan to cool the LTC3884 and the inductors
Cutting a custom copper plate from stock metal, and using SAC305 solder paste to solder it to a Thermaltake water cooling block using a toaster oven
Modifying (drilling/grinding) the stock Thermaltake water block mounting hardware to fit the VCU1525 mounting holes
Applying either Arctic Silver 5, or Conductonaut Liquid metal (gallium) to the VU9P die and mounting the water block
Building a 360mm radiator mounted with 3 x 120mm Antminer class fans
Completing the water cooling loop with a Noctua manual fan controller and water pump/hoses


So I don't think I would do this to a $300 card, let alone a $4k one. Just to let you know, I could perform these tasks. But do I want to hack a board to that degree to make it work? How about designing it right from the get-go?

BR
newbie
Activity: 2
Merit: 0
Hi all,

Sorry I haven't had time to read this thread, I have been working like nuts trying to deliver as promised.

The Keccak-24core/708MHz (17GH/s) bitstream burns 306A and will run on an unmodified VCU1525 for approximately 10 seconds.  To run it continuously requires extensive modifications as described here:
http://zetheron.com/index.php/hardware-modifications/

The Keccak bitstream is now posted live on the website:
http://zetheron.com/index.php/downloads/

Also, there is an interesting table where I calculated how many FPGAs each algorithm can support:
http://zetheron.com/index.php/fpga-performance-profit/

So based on the current market, FPGA-friendly algorithms can support 25,000 FPGAs.


But how can you continuously support 306A? The modifications do not give a clue.
copper member
Activity: 166
Merit: 84
@whitefire990
Could immersion cooling help to avoid at least some of the modifications, like soldering a $4K FPGA board?
People have achieved great results with transformer oil cooling, using ASICs and GPUs.

Based on current experiments, immersion cooling an unmodified VCU1525 will get you to about 80-85% of peak performance on power-hungry algorithms without other modifications. 
newbie
Activity: 58
Merit: 0
@whitefire990
Could immersion cooling help to avoid at least some of the modifications, like soldering a $4K FPGA board?
 
People have achieved great results with transformer oil cooling (when using ASICs and GPUs). The fans were often removed to achieve better results.
newbie
Activity: 32
Merit: 0
GPU_Hoarder has his CN7 (19.7-22KH/s = 10-12x Vega 64) algorithm running on an unmodified VCU1525.

Bitstreams+miner or it didn't happen. :-)

The only thing that has been demonstrated thus far is that buying a VCU1525 for mining Keccak is not a sane proposition vs. getting a 1080Ti for mining ethash.
I am still very hopeful that other algorithms such as CN7 will make this into a much more worthwhile endeavour.

Most importantly, however, I think the profitability measurement should be based on what can be achieved with an unmodified FPGA at 30C ambient temperature, rather than based on optimistic measurements with modifications that involve replacing PCB components.