
Topic: DIY FPGA Mining rig for any algorithm with fast ROI - page 69. (Read 99472 times)

sr. member
Activity: 512
Merit: 260
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Here’s an on-topic question. What bitstreams / algorithms does the community actually want?


If I could cherry-pick, it would be CN7, Equihash, NeoScrypt, Lyra2REv2, X17 and, depending on the cost implications, Ethash.

jr. member
Activity: 33
Merit: 1
Lol, I don't code bitstreams myself Cheesy -- I've got a basic understanding, best case. It's also hard to stop chattering when I'm so close to having half a purple coin.

If you don't code bitstreams yourself, how did you start FPGA mining on AWS? Hire help?

Slightly pedantic, but bitstreams, built from RTL via VHDL or Verilog, are not, strictly speaking, “programmed” or “coded”.

Designed and implemented are probably better terms. Code / programs are usually statements of instructions to be executed sequentially by a processor (after being compiled or interpreted).

RTL describes the design of actual hardware interconnections / logic between registers, all of which happens essentially at the same time unless you set up clocks and registers to serialize it. After being written (ok, we can call it coded if you really want) and described in Verilog or VHDL, it is synthesized, or turned into actual logic gates for the hardware in question, using the LUTs and flip-flops and such available on FPGAs, or standard cells for ASICs. Once the synthesizer decides how the logic will be constructed, place and route figures out where to put it all on the chip and how to wire up the connections. The bitstream is actually just a list of essentially every possible connection point on an FPGA and whether or not it is connected. It isn't executed or run like a program. Programming the bitstream onto the chip would be more like handing an electrician a wiring blueprint and having them connect all the appliances, lights, outlets, etc. with wires according to it.


The more I learn, the more I realize how far I am from succeeding at FPGA mining on AWS. I'll keep learning, so that in a few weeks, if the OP follows through or the expert I have solicited for help becomes available, I'll at least know enough to have informed conversations.

Thank you for continuing to post this useful information. I feel like a few more posts and you FPGA guys may let me in on the secret handshake.
newbie
Activity: 64
Merit: 0
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Here’s an on-topic question. What bitstreams / algorithms does the community actually want?


Equihash-144-5, assuming the Z9 is rendered useless by it.
Lyra2REv2
NeoScrypt
CryptoNightV7
member
Activity: 154
Merit: 37
Lol, I don't code bitstreams myself Cheesy -- I've got a basic understanding, best case. It's also hard to stop chattering when I'm so close to having half a purple coin.

If you don't code bitstreams yourself, how did you start FPGA mining on AWS? Hire help?

Slightly pedantic, but bitstreams, built from RTL via VHDL or Verilog, are not, strictly speaking, “programmed” or “coded”.

Designed and implemented are probably better terms. Code / programs are usually statements of instructions to be executed sequentially by a processor (after being compiled or interpreted).

RTL describes the design of actual hardware interconnections / logic between registers, all of which happens essentially at the same time unless you set up clocks and registers to serialize it. After being written (ok, we can call it coded if you really want) and described in Verilog or VHDL, it is synthesized, or turned into actual logic gates for the hardware in question, using the LUTs and flip-flops and such available on FPGAs, or standard cells for ASICs. Once the synthesizer decides how the logic will be constructed, place and route figures out where to put it all on the chip and how to wire up the connections. The bitstream is actually just a list of essentially every possible connection point on an FPGA and whether or not it is connected. It isn't executed or run like a program. Programming the bitstream onto the chip would be more like handing an electrician a wiring blueprint and having them connect all the appliances, lights, outlets, etc. with wires according to it.
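The "everything happens at once" semantics described above can be sketched in a few lines of Python (a hypothetical model for illustration, not real HDL): on each clock edge every register takes its new value from the *previous* state simultaneously, unlike sequential program statements.

```python
# Sketch: RTL register semantics -- on a clock edge, all registers update
# simultaneously from the OLD state, not one after another like a program.
def clock_edge(state):
    # Hypothetical 3-stage shift register described at the register level:
    # every "assignment" below happens at once on the edge, so every
    # right-hand side reads the previous value.
    return {
        "din": state["din"],
        "r0": state["din"],
        "r1": state["r0"],
        "r2": state["r1"],
    }

state = {"din": 1, "r0": 0, "r1": 0, "r2": 0}
for _ in range(3):
    state = clock_edge(state)
print(state["r2"])  # the input bit reaches r2 only after 3 clocks -> 1
```

If the updates were executed sequentially (as a processor would), the input would race through all three registers in one "clock" -- which is exactly the difference between code and a register-transfer description.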




jr. member
Activity: 33
Merit: 1
Lol, I don't code bitstreams myself Cheesy -- I've got a basic understanding, best case. It's also hard to stop chattering when I'm so close to having half a purple coin.

If you don't code bitstreams yourself, how did you start FPGA mining on AWS? Hire help?
jr. member
Activity: 33
Merit: 1
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Here’s an on-topic question. What bitstreams / algorithms does the community actually want?


From my (limited) experience in the (VoskCoin) GPU mining community, the community wants the most profitable algorithm for the available hardware: for AMD that's Ethash, for Nvidia Equihash. From the original post, that looks to be PHI1612 for these FPGAs.

Personally, I'm just looking for something, anything, that will get me mining on an AWS EC2 F1 instance. Profitable or not, I'm interested to see it working. In 6 weeks I may find out from the dentist that I purchased on MTurk Tongue

Just imagine what @GPUHoarder, @senseless, @2112 and others could do if they were to put their heads together...
newbie
Activity: 16
Merit: 0
Has anyone used a Nallatech 520 with Stratix 10?
member
Activity: 154
Merit: 37
I agree with Ethash being a bonus / using extra memory - but your HMC costs more
than a GPU that gives you the same bandwidth.

I've never been able to find any reliable data on memory bandwidth utilization efficiency in GPU mining. If only 50% of memory bandwidth is being utilized with GPU GDDR/HBM vs 90% on HMC or HBM with an FPGA, that would mean the FPGA could achieve hashrates 40%+ over a GPU if the biggest limiting factor was memory bandwidth (even more if it's not). Is it even possible to measure memory bandwidth utilization efficiency with GPU miner code? That's the one piece of information I've been missing in my hashrate estimations for algos using these memory types.

Sure, we can say a GPU is more cost-effective, but it's not clear that it actually is.

Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Lol, I don't code bitstreams myself Cheesy -- I've got a basic understanding, best case.


Luckily, since Ethash is perfectly randomly distributed reads from a large set, cache effects are almost entirely negligible. That means you can almost perfectly estimate memory speed from hashrate.

Each hash takes 8192 bytes of memory access (64 * 128). So 30 MH = 228 GB/s. The 570 is advertised at 224 GB/s I/O @ 1750 clock, and 30 MH is usually a 15% overclock, so theoretical 256 GB/s: 89% efficiency. The ROCm guys will confirm numbers up to 90% for Ethash. I don't know how it translates to other algorithms.
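The arithmetic above can be scripted as a sanity check (hypothetical helper, decimal GB throughout; note that 30 MH/s works out to 245.76 decimal GB/s, which is the same quantity as the 228 figure quoted when expressed in binary GiB/s):

```python
# Sketch: back-of-envelope Ethash bandwidth check from the numbers above.
# Each hash performs 64 accesses x 128 bytes = 8192 bytes of DAG reads.
BYTES_PER_HASH = 64 * 128  # 8192

def required_bandwidth_gb_s(hashrate_mh):
    """Decimal GB/s needed to sustain hashrate_mh MH/s of Ethash."""
    return hashrate_mh * 1e6 * BYTES_PER_HASH / 1e9

need = required_bandwidth_gb_s(30)  # 30 MH/s -> 245.76 GB/s (~228 GiB/s)
spec = 224 * 1.15                   # advertised 224 GB/s plus a 15% overclock
print(round(need, 2), round(need / spec, 2))
```

With consistent decimal units the utilization ratio lands a little higher than the quoted 89% (which appears to divide GiB/s by GB/s), but either way the conclusion stands: GPU Ethash is already running close to the memory's rated bandwidth.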

hero member
Activity: 1118
Merit: 541
I agree with Ethash being a bonus / using extra memory - but your HMC costs more
than a GPU that gives you the same bandwidth.

I've never been able to find any reliable data on memory bandwidth utilization efficiency in GPU mining. If only 50% of memory bandwidth is being utilized with GPU GDDR/HBM vs 90% on HMC or HBM with an FPGA, that would mean the FPGA could achieve hashrates 40%+ over a GPU if the biggest limiting factor was memory bandwidth (even more if it's not). Is it even possible to measure memory bandwidth utilization efficiency with GPU miner code? That's the one piece of information I've been missing in my hashrate estimations for algos using these memory types.

Sure, we can say a GPU is more cost-effective, but it's not clear that it actually is.

Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Lol, I don't code bitstreams myself Cheesy -- I've got a basic understanding, best case. It's also hard to stop chattering when I'm so close to having half a purple coin.
member
Activity: 144
Merit: 10
OP really needs to be involved in his own thread and give reassurance to his supporters. Grin
member
Activity: 154
Merit: 37
What’s wrong with 900W to 1kw per hour exactly? Other than being pedantic I think that saying consumption is 0.9-1 KWH is understood.
Your teacher should have explained to you the difference between kW/h and kWh.

But this is on a marketing site dedicated to miners who, to quote an earlier post in this thread:
The fact is that 95% of miners out there have a very rudimentary understanding of computers, algorithms, and programming.
I would add that they also have a rudimentary understanding of literacy and numeracy.

This is what makes reading mining forums such great fun. Are people really that stupid, or are they just pretending? How are they going to bamboozle people with bullshit calculations involving non-existent units of measure like kelvin-watt-henry?

On this occasion I'd like to repost some good advice that reeses gave about six years ago:
I'd recommend reading "The Big Con" for some of the history, and watching Confidence and The Sting as examples of the "classic" con games.
I read that book, and although it was written between the world wars, it is very pertinent to all cryptocurrencies. Here's a short excerpt:

  • Locating and investigating a well-to-do victim. (Putting the mark up.)
  • Gaining the victim’s confidence. (Playing the con for him.)
  • Steering him to meet the insideman. (Roping the mark.)
  • Permitting the insideman to show him how he can make a large amount of money dishonestly. (Telling him the tale.)
  • Allowing the victim to make a substantial profit. (Giving him the convincer.)
  • Determining exactly how much he will invest. (Giving him the breakdown.)
  • Sending him home for his amount of money. (Putting him on the send.)
  • Playing him against a big store and fleecing him. (Taking off the touch.)
  • Getting him out of the way as quietly as possible. (Blowing him off.)
  • Forestalling action by the law. (Putting in the fix.)


I think you precisely missed my statement about being pedantic. I knew what you meant, and was saying that everyone reading the marketing speak knew what they meant. They used the wrong unit, but everyone still understood, and nothing of value was lost, other than it clearly irritating you.

Agree that everyone here that doesn’t do this type of work themselves should proceed cautiously and be responsible for their own destiny. I’m amazed when I see people immediately pop up and want to order 100 of something with zero understanding or evidence. Fear of missing out or hoping to be the first in and get rich before the jig is up?
legendary
Activity: 2128
Merit: 1073
What’s wrong with 900W to 1kw per hour exactly? Other than being pedantic I think that saying consumption is 0.9-1 KWH is understood.
Your teacher should have explained to you the difference between kW/h and kWh.
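For anyone following along, the distinction is simple arithmetic: kW is power (an instantaneous rate), kWh is energy (power sustained over time), and "kW per hour" would be a rate of change of power, which is not what a miner's spec sheet means. A two-line sketch:

```python
# Sketch: power (kW) vs energy (kWh).
# A rig drawing 0.9-1 kW continuously consumes 0.9-1 kWh *each hour*;
# "1 kW per hour" (kW/h) would mean the draw itself is ramping up.
power_kw = 0.95                 # instantaneous draw, kilowatts
hours = 24
energy_kwh = power_kw * hours   # energy, which is what the utility bills
print(f"{energy_kwh:.1f} kWh per day")
```

So a "0.9-1 kWh" consumption claim only makes sense with an implied "per hour", i.e. the unit the marketing copy actually wanted was kW.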

But this is on a marketing site dedicated to miners who, to quote an earlier post in this thread:
The fact is that 95% of miners out there have a very rudimentary understanding of computers, algorithms, and programming.
I would add that they also have a rudimentary understanding of literacy and numeracy.

This is what makes reading mining forums such great fun. Are people really that stupid, or are they just pretending? How are they going to bamboozle people with bullshit calculations involving non-existent units of measure like kelvin-watt-henry?

On this occasion I'd like to repost some good advice that reeses gave about six years ago:
I'd recommend reading "The Big Con" for some of the history, and watching Confidence and The Sting as examples of the "classic" con games.
I read that book, and although it was written between the world wars, it is very pertinent to all cryptocurrencies. Here's a short excerpt:

  • Locating and investigating a well-to-do victim. (Putting the mark up.)
  • Gaining the victim’s confidence. (Playing the con for him.)
  • Steering him to meet the insideman. (Roping the mark.)
  • Permitting the insideman to show him how he can make a large amount of money dishonestly. (Telling him the tale.)
  • Allowing the victim to make a substantial profit. (Giving him the convincer.)
  • Determining exactly how much he will invest. (Giving him the breakdown.)
  • Sending him home for his amount of money. (Putting him on the send.)
  • Playing him against a big store and fleecing him. (Taking off the touch.)
  • Getting him out of the way as quietly as possible. (Blowing him off.)
  • Forestalling action by the law. (Putting in the fix.)
member
Activity: 154
Merit: 37
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...

Here’s an on-topic question. What bitstreams / algorithms does the community actually want?
sr. member
Activity: 512
Merit: 260
Can you experts please stop blabbering and start coding! The crypto community needs you to provide viable and usable bitstreams that early adopters can use. Whoever releases the first solution will most likely make back their development time...
member
Activity: 154
Merit: 37
Could someone familiar with any of those technologies make another post with relevant calculations?


Edit:



Btw, did some reading on algebraic logic minimization last night, along with a couple of other techniques. This is already done, automatically, during synth (but can be turned off). Seeing the process, yes, it's something that could be added to simplify logic circuits. HOWEVER, Vivado already does it! Starting to question the OP and whether this Bittware account is even really Bittware. I might have to put my foot in my mouth in 18 days, but the more I look at it, the more I'm thinking it's not possible. Elaborate scam?


I'm not an expert in that part, but I believe the synthesizer does basic reduction and propagates constants / eliminates unused nets. Actual 3-SAT-style reduction is a very hard problem (NP-hard, I think), and I am fairly certain the default options are not generating a provably minimal circuit. I believe that would take exhaustive analysis of the 640+ bit input-state-to-output truth table.

You can very likely achieve “better” for some definition of better with a longer running algorithm, much like the place and route problem.

Additionally, the synthesizer can't help you at all if you don't have all your inputs constrained. If your design allows arbitrary input to the Keccak engine, or uses the same module for each round, etc., the synthesizer can't make those optimizations for you.
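A toy illustration of both points (hypothetical, nothing to do with Vivado's actual internals): exhaustively tabulating a 640-bit input space is astronomically large, while constant propagation on a constrained input is trivial and cheap.

```python
# Sketch: why "provably minimal" logic is out of reach, and why
# constraining inputs helps the synthesizer anyway.

# Exhaustively tabulating a 640-bit input means 2**640 truth-table rows:
rows = 2 ** 640
print(len(str(rows)), "decimal digits in the row count")

# Constant propagation, by contrast, is cheap. A 2:1 mux described
# behaviorally:
def mux(sel, a, b):
    return a if sel else b

# With sel constrained to a constant 1, mux(1, a, b) reduces to the
# wire 'a' -- no LUT is needed at all.
assert mux(1, "a_net", "b_net") == "a_net"
```

This is the gap between "the tool simplifies what it can see" and "the tool proves a global minimum": the former is routine, the latter is exponential.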

Re memory bandwidth: the specs are published for those memory types, so no need to guess. I glossed over the details of overhead and latency vs I/O bandwidth, but for the most part you can hide those.

Edit:

I agree with Ethash being a bonus / using extra memory - but your HMC costs more
than a GPU that gives you the same bandwidth.
hero member
Activity: 1118
Merit: 541
The PASCAL A1 ASIC miner is dead on arrival anyway, or is it?

Based on the OP's claims, a single VCU1525 with his bitstreams is 20x the throughput of a GTX 1080 Ti while consuming at most 160 watts of power. But based on PASCAL's claims, their ASIC miner is 16x the throughput of a GTX 1080 (equivalent to 12x a GTX 1080 Ti) while consuming at most 1,000 watts of power. Why would they pursue this?

As an aside, I have yet to implement an algorithm on an AWS F1 (single FPGA) instance that is 2x better in throughput than a TITAN V. I can see a 2x-4x throughput FPGA advantage with an on-premise card, which is consistent with the findings below.

https://www.xilinx.com/support/documentation/white_papers/wp492-compute-intensive-sys.pdf
https://science.energy.gov/~/media/ascr/ascac/pdf/meetings/201612/Finkel_FPGA_ascac.pdf
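Taking both sets of claims at face value (these are the posters' numbers, not verified specs), the throughput-per-watt gap can be computed directly:

```python
# Sketch: comparing the claimed perf-per-watt figures above.
# Assumptions from the posts: VCU1525 = 20x a GTX 1080 Ti at 160 W;
# PASCAL A1 = 16x a GTX 1080 at 1000 W, and 16x 1080 is taken as
# roughly 12x a 1080 Ti.
fpga_perf, fpga_watts = 20.0, 160.0   # throughput in 1080 Ti equivalents
asic_perf, asic_watts = 12.0, 1000.0

fpga_eff = fpga_perf / fpga_watts
asic_eff = asic_perf / asic_watts
print(round(fpga_eff / asic_eff, 1))  # FPGA claim is ~10x the ASIC claim per watt
```

A claimed "ASIC" that is an order of magnitude worse per watt than a claimed FPGA bitstream is exactly why the question "why would they pursue this?" is worth asking.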


I'm not sure the PASCAL A1 is an ASIC. If they claim it's an ASIC, it's almost certainly a scam.


member
Activity: 144
Merit: 10
The PASCAL A1 ASIC miner is dead on arrival anyway, or is it?

Based on the OP's claims, a single VCU1525 with his bitstreams is 20x the throughput of a GTX 1080 Ti while consuming at most 160 watts of power. But based on PASCAL's claims, their ASIC miner is 16x the throughput of a GTX 1080 (equivalent to 12x a GTX 1080 Ti) while consuming at most 1,000 watts of power. Why would they pursue this?

As an aside, I have yet to implement an algorithm on an AWS F1 (single FPGA) instance that is 2x better in throughput than a TITAN V. I can see a 2x-4x throughput FPGA advantage with an on-premise card, which is consistent with the findings below.

https://www.xilinx.com/support/documentation/white_papers/wp492-compute-intensive-sys.pdf
https://science.energy.gov/~/media/ascr/ascac/pdf/meetings/201612/Finkel_FPGA_ascac.pdf
hero member
Activity: 1118
Merit: 541
The FPGA advocates here have not been talking about those algorithms

Which algos?

For Ethash my plan was 37P + HMC. Logic usage would be minimal; most of the chip would be doing something else, CN7, etc. I never saw those large-memory algos as bread winners, just icing on the cake.

It's kind of moot with ASICs out for Equihash and Ethash. Unless there's some other large-mem algo I'm not aware of.

I also think your HBM/GDDR6 bandwidth is off. I doubt memory bandwidth is going to triple in the next gen; double, probably -- maybe 512 GB/s max for HBM3.

member
Activity: 154
Merit: 37

Do you really think that they have in that box a single chip dissipating 1kW with 2 USB, 1 Ethernet and 1 HDMI port?
 

To be fair, I've viewed that website in the past and I did not see their ASIC chip specifications. At least to me, their specifications were for the unit as a whole.


What’s wrong with 900W to 1kw per hour exactly? Other than being pedantic I think that saying consumption is 0.9-1 KWH is understood.

HBM/HMC doesn't really change the bandwidth game to that level. It is possible to build, as I said, in either configuration. 1 TB/s HBM may be coming up in the next generation of chips/interposers, and one could achieve those numbers with 4-8 physical ASICs + HBM, each running 125 W. Latency is annoying but manageable. I also didn't see a price point listed other than 2.5-3x cheaper, but without an exact price it makes all of the “can the speeds be achieved” questions less relevant. Of course they can, for an arbitrary price.

Next gen GDDR6/HBM based GPUs will almost certainly achieve 75-85MH (that’s in the 600-800GB/s range, Titan V for example). Their GPU comparison performance numbers are based on 32MH/GPU - so say a previous $200 MSRP 570. But they probably used $500/hosted GPU range prices to make their numbers look better. 16*500/3 = $2500-3000 range. As far as power, expect 7 next gen GPUs to use about the same 1000W to make those numbers. All that I am nearly certain of is this isn’t one massive chip, it is many like every other ASIC for mining.
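The bandwidth-to-hashrate translation behind those estimates can be sketched as follows (assuming the ~8192 bytes of DAG reads per hash discussed earlier in the thread, and a hypothetical utilization factor rather than a measured one):

```python
# Sketch: turning memory bandwidth into an Ethash hashrate ceiling.
BYTES_PER_HASH = 8192  # 64 accesses x 128 bytes per Ethash hash

def max_hashrate_mh(bandwidth_gb_s, utilization=0.9):
    """MH/s achievable at a given decimal GB/s and utilization fraction."""
    return bandwidth_gb_s * 1e9 * utilization / BYTES_PER_HASH / 1e6

for bw in (600, 700, 800):
    print(bw, "GB/s ->", round(max_hashrate_mh(bw), 1), "MH/s")
```

At 90% utilization, 600-800 GB/s works out to roughly 66-88 MH/s, which is consistent with the 75-85 MH figure above for next-gen parts; the utilization factor is the assumption that moves the answer most.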

Back on topic: the memory bandwidth war is going to continue to be apples-to-apples across the board - ASIC/GPU/FPGA are all playing the same game for Ethash, etc. It simply comes down to how low a margin the manufacturer is willing to go and how much effort the miner is willing to put in vs. a turnkey solution. At least unless someone wants to unleash more than 128 VU9P-based systems fully interconnected and achieve 32 GH. Oh wait - that would cost more than the equivalent GPUs.

The FPGA advocates here have not been talking about those algorithms, but the ones that are not inherently ASIC-resistant but have chosen to employ regular change to ward off ASICs. FPGAs will almost certainly be able to perform a set of calculations much faster than GPUs.
