
Topic: Advanced Hardware, SHA256 and Scrypt questions (Read 4063 times)

phk
newbie
Activity: 28
Merit: 0

That, or I've misunderstood the purpose of thread-concurrency.

Probably this.
sr. member
Activity: 322
Merit: 250
Supersonic
Isn't GPU memory external RAM? Or is that a whole different ballgame because of the high clock frequency?

GPUs have an L1/L2 cache hierarchy just like a CPU.  SCRYPT only performs well if its working memory is small enough to fit into on-chip memory.  If you have to go off-chip, it's going to be a turd.  But yes, most graphics cards have seemingly endless amounts of external memory.

The SCRYPT configuration used in Litecoin needs 128 KB, which fits easily into on-chip cache in a CPU/GPU, and into on-chip block memory on many (not all) FPGAs.



I don't think so... I use cgminer on my 7950 with:
Code:
--thread-concurrency 25768

Meaning it's using 25768 * 128 KB of memory (roughly 3.14 GB). When I increase the concurrency, cgminer fails on startup, saying I'm trying to allocate more memory than is available.

That, or I've misunderstood the purpose of thread-concurrency.
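
The arithmetic in this post can be checked with a quick sketch. It assumes, as the poster does, that every unit of --thread-concurrency claims one full 128 KB scratchpad; whether cgminer really allocates memory that way is not confirmed anywhere in this thread, and the function name is made up for illustration.

```python
KIB = 1024

def concurrency_memory_bytes(thread_concurrency, scratchpad_kib=128):
    """Bytes needed if each unit of concurrency holds one full scratchpad.

    Assumption (taken from the post, not from cgminer's source): one
    128 KiB scratchpad per unit of thread concurrency.
    """
    return thread_concurrency * scratchpad_kib * KIB

total = concurrency_memory_bytes(25768)
print(f"{total} bytes = {total / KIB**3:.2f} GiB")
```

Under that assumption the total comes to a little over 3.1 GiB, in line with the rough figure quoted in the post.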
phk
newbie
Activity: 28
Merit: 0
Isn't GPU memory external RAM? Or is that a whole different ballgame because of the high clock frequency?

GPUs have an L1/L2 cache hierarchy just like a CPU.  SCRYPT only performs well if its working memory is small enough to fit into on-chip memory.  If you have to go off-chip, it's going to be a turd.  But yes, most graphics cards have seemingly endless amounts of external memory.

The SCRYPT configuration used in Litecoin needs 128 KB, which fits easily into on-chip cache in a CPU/GPU, and into on-chip block memory on many (not all) FPGAs.
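
For reference, the 128 KB figure can be derived from the scrypt parameters. This is a minimal sketch, assuming the commonly cited Litecoin parameters N=1024, r=1, p=1 (they are not stated in this thread) and the standard scrypt layout of N scratchpad blocks of 128 * r bytes each. The function name is hypothetical.

```python
def scrypt_scratchpad_bytes(n, r):
    """Size of the scrypt scratchpad (the V array): N blocks of 128 * r bytes."""
    return 128 * r * n

# Assumed Litecoin parameters: N=1024, r=1. With p=1 there is a single
# scratchpad per hash, so p does not enter the calculation here.
size = scrypt_scratchpad_bytes(n=1024, r=1)
print(size // 1024, "KiB per hash")
```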

sr. member
Activity: 378
Merit: 250
Yes,
GPU memory is external DDR memory chips connected to the GPU ASIC.
sr. member
Activity: 322
Merit: 250
Supersonic
Connecting high-speed DDR RAM to an FPGA is not a big deal.

I was under the impression that the kind of memory latency necessary (extremely low) is only possible with the most expensive types of DDR chips.

Let's put it this way: if you actually need external memory to run SCRYPT as used in Litecoin (you don't), then yes, it is going to be a total turd performance-wise.
It doesn't matter what chip you put out there.



Isn't GPU memory external RAM? Or is that a whole different ballgame because of the high clock frequency?
phk
newbie
Activity: 28
Merit: 0
Connecting high-speed DDR RAM to an FPGA is not a big deal.

I was under the impression that the kind of memory latency necessary (extremely low) is only possible with the most expensive types of DDR chips.

Let's put it this way: if you actually need external memory to run SCRYPT as used in Litecoin (you don't), then yes, it is going to be a total turd performance-wise.
It doesn't matter what chip you put out there.

sr. member
Activity: 322
Merit: 250
Connecting high-speed DDR RAM to an FPGA is not a big deal.
I see no reason a DSP-based FPGA with DDR RAM attached could not run SCRYPT.
The FPGA by itself would not be effective.




I was under the impression that the kind of memory latency necessary (extremely low) is only possible with the most expensive types of DDR chips.


I'm not sure if the economics of the scrypt mining business have changed, but I do believe the math was done on this a year ago and it was a total waste of investment.
sr. member
Activity: 378
Merit: 250
Connecting high-speed DDR RAM to an FPGA is not a big deal.
I see no reason a DSP-based FPGA with DDR RAM attached could not run SCRYPT.
The FPGA by itself would not be effective.
erk
hero member
Activity: 826
Merit: 500
hero member
Activity: 1162
Merit: 500
Frizz23-
I was honestly trying to ask a sincere question, just trying to understand what the real technical reasons are. You can see from my comment history I was away during the whole FPGA revolution. I know that's not your fault, it was my own.

http://lmgtfy.com/?q=FPGA
phk
newbie
Activity: 28
Merit: 0
FPGAs can be programmed to do (almost) anything you want them to do.  You can reprogram / repurpose them for whatever you like, if you know how to do such things.

An ASIC only does whatever the guy who designed it told it to do before the factory built the chip.

There is nothing stopping somebody from releasing updated bitstreams for already purchased FPGA hardware that would allow them to do a different algorithm.  But nobody has done that yet.
There is not any technical reason preventing it from happening.

Whether or not they would have adequate performance is a different topic.

legendary
Activity: 954
Merit: 1000
Frizz23-
I was honestly trying to ask a sincere question, just trying to understand what the real technical reasons are. You can see from my comment history I was away during the whole FPGA revolution. I know that's not your fault, it was my own.

I was trying to ask a question to learn, trying to apply the bits and pieces of current posts (scrypt needs fast access to memory... FPGAs can't mine scrypt because they need memory), and "Here's a quarter, go get a clue" doesn't really answer the question or help me learn. If you wish to point me to articles, discussions... that would be appreciated. If you wish to tell me yourself, even better. I'm honestly trying to learn, and not trying to troll.

I'm really trying to creatively think, in an abstract way, if there would be any way to extend the income lifespan as it were of an FPGA, whether through some creative addition of an addon board through some fashion, or even a service where boards are submitted for retrofitting their FPGA processor chips onto a new board with memory cache chips. (With this, you can see why I was asking if access to DRAM, or to onboard cache, was the limitation.)

I really am not confident that the market will ramp up the fiat value of a bitcoin as fast as the difficulty increase created by the addition of ASICs (the Avalon ASICs already in the field and due to ship... the Avalon DIY chips coming to the field by mid-late summer... and, maybe, the BFL ASICs... in addition to whatever reliable newcomer comes around). I think that the combination of increased difficulty and a market that does not adjust as quickly as that increase will ramp down the profitability of a given FPGA or farm of FPGAs. I know this particular side of the forum does not have a favorable view toward any minable cryptocurrency other than Bitcoin, and is skeptical of the lifespan of any given currency. However, if you do look at exchanges other than MtGox, it remains that there is a viable daily volume in many of the other currencies, at least one of which is double-tradable to both bitcoin and fiat. It is my thinking that the profitability of an FPGA will not vanish... but it will drop very far across all SHA-256 currencies, possibly even to a negative profitability, until the market corrects for the additional weight of ASIC mining.

I am trying to creatively think of a way that an FPGA might be able to mine multiple types of computations, if it's even possible (thus my question asking HOW they worked and WHAT the limitation actually was), to try to extend the profitability lifetime. If it's not possible, just let me know what the real issue was, in a real, person-to-person way, so I can learn. Like I asked... Is it an L2-style cache issue, or is it a DRAM addressability issue, or is it just a rigid limitation in the design of the chips themselves?

Not "Here's a quarter, go buy a clue." I'm trying to get a constructive comment.
hero member
Activity: 1162
Merit: 500
legendary
Activity: 954
Merit: 1000
We have four major types of mining hardware (which, with the exception of ASICs, have multiple generations of hardware providing different results.)

-CPU
-GPU
-FPGA
-ASIC

I know FPGA devices were written for the express purpose of mining SHA256, namely designed just for Bitcoin. I also know they were repurposed for solving on other SHA256 coins as well (namely, all other coins except Litecoin and Novacoin.... we're talking about items like Namecoin, PPcoin, Terracoin, Bytecoin, Feathercoin, etc.) ASIC is just an expansion of this.

The easy answer to tell new users is "FPGA and ASIC can't mine anything else. End of Line." The slightly advanced answer is "FPGA and ASIC can't mine Scrypt, or anything else that's non-SHA256 that will ever be made, because it doesn't have high memory to access."

So, I'm slightly curious. What are the actual higher level reasons? With an FPGA that exists today, what are the actual results if you were able to port or point a client (such as CGminer) at it, type in --scrypt, and watch it try to hash away?

Is the structure in the chip LITERALLY unable to complete the calculation, or is it just because the chip is optimized to handle only SHA256 and all other calculations are going to be less efficient?

If the chip is equally able to solve an SHA256 as anything else, and it's just a matter of pointing a calculation at it and needing memory... Are we talking about needing rapid access to a large amount of on-die L1/L2 cache? If this is the case, could it be possible to co-locate an L2 cache chip onboard (as was done for processors up until just prior to the Pentium 2, where prior to that L2 was a separate cache chip, but with the P2 it was moved on-die and an L3 was introduced as the separate cache chip)? Or are we talking about the need to (relatively) rapidly access an amount of DRAM?

Where I'm trying to think is this... there's an existing deployment of FPGA hardware in the field, and a small amount still being sold. With the rising deployment of ASICs by EOY 2013, this hardware will be about as profitable as GPUs are in Spring 2013, unless the market price of SHA coins rises in correlation to the difficulty. What are the technical reasons which would hold back a current FPGA from switching to another type of blockchain (Scrypt, or whatever rises to become the next widely supported blockchain format)?

I realize that if we're talking about something like adding an onboard cache, that is going to be fundamentally unsupportable because we're talking about the need to etch memory traces and have the cache chip be physically nearby for rapid access. If we're talking, however, about the need for rapid memory from something like a DRAM, perhaps there might be some other way around this than just junking FPGAs. (And the question would then extend to ASICs as well.)

If we're just talking that the chips were purpose created in their core logic to only be capable of solving SHA256 and never capable of solving any other calculation, then this whole question is answered there, and we'll just change the focus (as has been suggested in other forums) to specifically creating chips for solving those other calculations.