
Topic: Virtual GPU can it be done? (Read 4588 times)

newbie
Activity: 25
Merit: 0
June 26, 2011, 04:07:24 PM
#11
[... Long text ending with: ] Would CPU hashing still be more efficient in a massively parallel cloud of clients than some kind of virtual appliance emulating the essential features of a GPU? Interesting possibilities, methinks. Anyhow, thanks for the info. You can learn so much just by asking stupid questions of clever people. Grin

What kind of shit did you just smoke, man? Cheesy

You may want to stop thinking about emulating GPUs and start thinking about... a futuristic society formed by talking trees, and then use your energy to write a book about it.
full member
Activity: 185
Merit: 121
June 26, 2011, 02:42:02 PM
#10
There's no point in trying to emulate the features of a GPU when GPUs are ridiculously cheap thanks to the massive quantities in which they are produced. While FPGAs could theoretically outperform GPUs, they won't do so usefully because they're more expensive due to lower quantities and they just can't provide the massive parallelism that GPUs can.

Now, if you had $10,000,000 to blow on a few thousand fully-custom ASICs, that would be another story.


OK. Thanks for your help, joe. I'm going to follow a new thread on FPGAs I just discovered, just to learn some more.
full member
Activity: 185
Merit: 121
June 26, 2011, 02:33:06 PM
#9
you can run a weather-simulator cluster on a netbook, but it will take ages to move a single raindrop

But if I spill a glass of beer on it, then it would move a couple of my mates to shed teardrops. What makes raindrops so hard to move? Tongue
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
June 26, 2011, 02:08:44 PM
#8
There's no point in trying to emulate the features of a GPU when GPUs are ridiculously cheap thanks to the massive quantities in which they are produced. While FPGAs could theoretically outperform GPUs, they won't do so usefully because they're more expensive due to lower quantities and they just can't provide the massive parallelism that GPUs can.

Now, if you had $10,000,000 to blow on a few thousand fully-custom ASICs, that would be another story.
full member
Activity: 185
Merit: 121
June 26, 2011, 02:04:58 PM
#7
GPUs are massively parallel and are extremely good at solving computations that don't require moving large amounts of information or making large numbers of decisions. CPUs are somewhat parallel and are extremely good at solving computations that require moving large amounts of information and making large numbers of decisions. Hashing, as it happens, doesn't require moving large amounts of information and doesn't require making large numbers of decisions. So GPUs win.

A GPU is like a CPU with 1,000 very limited cores.


Ahhh. That explains quite a bit. The loss, then, would be incurred by the de-serialization algorithms. Like an ADC, it must introduce bottlenecks that require a processing speed proportional to the volume of data being hashed. Likewise, emulating a CPU on a GPU would force the opposite issue: like a DAC, I imagine it would have to shift every bit it moved into registers to address a virtual UART implemented for a serial interface. I'm just guessing here, so don't shoot me now Smiley. I would assume these virtualization apps have virtual UARTs for parallel-to-serial interfacing and vice versa. Now (if I'm right so far), I have to wonder how differently FPGAs work and whether they are more amenable to improvisation. I'm trying to see how a mobile device (or older computer) could have the best environment to churn out hashes better than one with no particular hash-optimizing code (virtual or otherwise), yet still be efficient at processing regular OS/application code while addressing the system bus, allocating memory dynamically, handling interrupts, and so on.

If we talk in terms of, say, a virtual UART dedicated to this optimization, are we still throwing good machine cycles after bad? Keep in mind, of course, that the code in service of this task has the system memory at its disposal (I think). Whereas my basic 3D on-board video has a piddling 128 MB of DRAM, my whole system has a couple of gigabytes. With good, tight, clean code that dynamically pulls the required memory into service, don't I have more room for flexible parallel assignment, like a processor that grows extra process threads or something? Even an expensive video card with a gigabyte of on-board memory is limited to its capacity. If you implement the GPU in memory (unless I'm mistaken), then each instance of its parallel processing would not require a new parallel image of it in memory, only an implementation of the internal clock and I/O addressing the virtual UART as quick as its little legs will carry it. OK, so my CPU is only 3.7 GHz, but my GPU is far worse at 1-point-something. Surely I can squeeze a bit more processing power from a 3-4:1 ratio of clock speed, especially if I have almost 10 times the RAM to dispose of in idle time.

You see what I'm saying? Even if the capacity of a CPU is nothing like a GPU in terms of hardware performance, I may still have much more CPU power and RAM for modeling (perhaps even somewhat more SMP) than could be attained by plain CPU hashing alone, if only we isolate the bottleneck and virtualise only the components/functionality that give the GPU its advantage. A dynamic VGPU might chew up available system resources and spit out an SMP implementation of scalable/adaptable clocks, I/Os and virtual UARTs, so that even granting my piddly video card may be better in hardware design, my virtual GPU may be somewhat better given the much higher CPU clock speed and the disposable memory available to its registers on demand. Efficient is something the CPU may not be (for hashing), but in terms of being resource-rich, having dynamically delegable functionality and all-round versatility, it seems like an undervalued resource. The challenge may be virtualising no more of the GPU architecture than is needed to re-purpose a reasonably fast CPU to outperform a relatively slow GPU (or no GPU at all). I can afford the overheads if putting a graphics card into my machine is not an option and the VGPU is basically a targeted optimization of GPU architecture that could interface with the CPU and perform better than the CPU trying to crunch hashes all on its own.
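Just to put my own hand-waving into numbers, here's a back-of-envelope comparison (the figures below are purely illustrative guesses, not measurements of any real hardware); the GPU's core count is what the CPU's clock-speed ratio has to fight against.

Code:
# Purely illustrative guesses -- not measurements of any real hardware.
cpu_cores, cpu_clock_ghz = 4, 3.7      # a desktop CPU like mine
gpu_cores, gpu_clock_ghz = 800, 1.2    # a mid-range GPU of the day

# Pretend each "core" does one hash step per clock (wildly simplified).
cpu_rate = cpu_cores * cpu_clock_ghz   # ~14.8 units
gpu_rate = gpu_cores * gpu_clock_ghz   # ~960 units

print(gpu_rate / cpu_rate)             # roughly 65x in the GPU's favour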

My understanding of how a GPU performs hash calculations or draws polygons, by comparison with ordinary CPU processing, calls upon an analogy (perhaps a less than satisfactory one) with the difference between bitmap (grid plotting and pixel rendering) and vector graphics. The former plots the absolute coordinates of points on a plane, whereas the latter uses minimal differences between frames of reference and compact mathematical descriptions of lines, planes and the relationships between points (geometry); bitmaps were only ever intended to describe pixels and their attributes one by one as they were scanned to the display. OK, so a CPU can calculate vectors and do the same tricks, but a GPU (in my naive understanding) was designed to describe the contents of a given chunk of video memory by the geometrical relationships and differences between frames, and to render the result in the form the dumb electronics would accept (i.e. scan lines and fixed values for each pixel); to calculate that, it only needs to know the relatively small differences between frames and send the information that is relevant. Given the typical similarity between frames, the difference could be subtracted and only the changes refreshed on each pass of the scan line.

As I understand it, this approach of describing only what has changed is particularly favorable to description by vector relationships, reducing the data to so many lines, angles, points and the relationships between them (geometry). You can, as I understand it, describe a point, plane or polygon far more efficiently in terms of vectors. Typical video images contain lots of objects moving (or shifting in perspective) incrementally; while their perspective and position change, their general consistency is preserved. The share of pixels in each frame that must be adjusted is therefore usually a much smaller fraction than redrawing the whole screen, so their description in a multidimensional array is simplified to which pixels are to be re-rendered and the incremental degree of change, rather than absolute whole values. In vector terms this may be trivial, except for the colour encoding, shading and ray-tracing dimensions; as I understand it, modern cards have a fair bit of separate dedicated circuitry to handle that extra dimension of contextual complexity. They really are marvelous and clever devices on close inspection. Not that CPUs and computers in general aren't, but this clever complementarity, so well devised to model and render complex 3D fields of vision beyond the resolution required by human eyesight, is just a marvel to behold.
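To illustrate what I mean by the difference (a toy sketch only, nothing like real rendering code): moving a bitmap means rewriting every pixel, while moving a vector line means adjusting two endpoints.

Code:
# Toy contrast between the two representations -- not real rendering code.

# Bitmap: shifting a 100x100 block of pixels one step right means
# rewriting every pixel.
width, height = 100, 100
bitmap = [[0] * width for _ in range(height)]
shifted = [[0] + row[:-1] for row in bitmap]      # 10,000 pixel writes

# Vector: the same move is two additions on the line's endpoints.
line = {"x1": 10, "y1": 10, "x2": 90, "y2": 90}
line["x1"] += 1
line["x2"] += 1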

Comprehending the complexity of what is being drawn on the screen gives some sense of the magnitude of information processing that modern computing deals with, and of the clever ways used to compress it and avoid reprocessing redundant information. Yet that is but a fraction of the processing power of a modest desktop box. It's beguiling to think of the amount of information being processed by the whole machine and all those lines of code, with all the conditional tests being tried, ports being read, I/O accessed, buffers read and written to, while registers are being peeked, poked and mov'ed. People may look at their screen and not only take it all for granted, but assume that what they see up there IS the program and IS about all that is happening.

Anyhow, I digress. Now I have to wonder about the future potential of field-programmable gate arrays and how differently they work. The potential to have mobile devices with dedicated processors, optimized to be adaptable for whatever processes presently demand greater resources (whether they be CPU-like, GPU-like or otherwise), is an interesting line of inquiry, costly as it may be for R&D. I wonder if an FPGA could allow a device to be just as efficient at playing CPU as GPU depending on present needs, dynamically adjusting its provisions to suit each process accordingly. I hear that some FPGAs are down to below $300 now. Also, I wonder how the massively parallel processing of hypervisors like KVM might affect the potential of a VGPU implementation. The potential I'm investigating, by the way, is not so much for individual/personal use as for aggregation and SMP over networks. I can almost imagine bitcoin being farmed (like existing mining pools), but with many, many CPUs on participating mobile devices all sharing the hashing work of a cloud-based VGPU server, deployed over a VPN and using a hypervisor like KVM for load balancing. The question of viability then has to account for the performance leverage afforded to the whole system, given that memory constraints may be handled by the data center serving the VGPU while parallel processing of hash power is handled by the hypervisor. Would CPU hashing still be more efficient in a massively parallel cloud of clients than some kind of virtual appliance emulating the essential features of a GPU? Interesting possibilities, methinks. Anyhow, thanks for the info. You can learn so much just by asking stupid questions of clever people. Grin
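For what it's worth, the coordination side of that idea seems simple enough to sketch (everything below is hypothetical; the function name and numbers are made up for illustration): a server just hands each client a disjoint slice of the nonce space for the current block header, so many slow CPUs share one search without duplicating work.

Code:
# Hypothetical sketch of the coordinator side of such a pool: hand out
# disjoint nonce ranges so many slow CPU clients share one search.
# Names and numbers are invented for illustration only.

def split_nonce_space(num_clients, nonce_bits=32):
    total = 1 << nonce_bits
    chunk = total // num_clients
    return [(i * chunk, min((i + 1) * chunk, total)) for i in range(num_clients)]

# e.g. 1,000 phones each get ~4.3 million nonces of the 2^32 space per header
assignments = split_nonce_space(1000)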
legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
June 26, 2011, 05:16:41 AM
#6
GPUs are massively parallel and are extremely good at solving computations that don't require moving large amounts of information or making large numbers of decisions. CPUs are somewhat parallel and are extremely good at solving computations that require moving large amounts of information and making large numbers of decisions. Hashing, as it happens, doesn't require moving large amounts of information and doesn't require making large numbers of decisions. So GPUs win.

A GPU is like a CPU with 1,000 very limited cores.
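To make that concrete, here's a rough Python sketch of what a single hashing attempt looks like (the header bytes are dummies and the loop is purely illustrative, not real mining code): each try touches only an 80-byte buffer, involves no data-dependent branching, and is independent of every other nonce, which is exactly the shape of work those 1,000 limited cores are good at.

Code:
import hashlib
import struct

# Dummy 76-byte header prefix, for illustration only; real mining uses the
# actual header fields (version, prev block hash, merkle root, time, bits).
header_prefix = b"\x00" * 76

def try_nonce(nonce):
    # One attempt: double SHA-256 over an 80-byte buffer. Tiny data, no
    # data-dependent branching -- the workload GPUs are built for.
    header = header_prefix + struct.pack("<I", nonce)
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()

# Every nonce is independent, so thousands can be tried in parallel.
digests = [try_nonce(n) for n in range(8)]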
full member
Activity: 185
Merit: 121
June 26, 2011, 05:05:42 AM
#5
Perhaps somebody has already thought of this. Perhaps it's unfeasible. But I thought I'd ask anyhow.

Since GPUs can mine far more efficiently than a CPU, is it possible to create code for a virtual GPU (I'm thinking of hardware that won't support a graphics card)?
Yes, that's quite possible.

Quote
Would the overhead defeat the purpose?
Of course. If a CPU sucks, a CPU having to do the extra work to pretend to be something it isn't will suck much more. Now it won't be able to do the hashes the most efficient way for a CPU but will have to do them the most efficient way for a GPU, all while also having to pretend to be a GPU. YUCK!

Quote
Could you squeeze much more hashing power out of a CPU and make the memory dynamically allocatable to the GPU on demand (i.e. in native CPU idle time)? Just a thought.  Undecided
How can you squeeze *more* power out when you're wasting most of that power just trying to pretend to be a GPU? You'll have to hash with what little is left.

OK. So that's an unequivocal NO then. I was not sure that the 'power' of a GPU for hashing or rendering polygons etc. was an equivalent measure of 'power' for CPU processing, nor whether, once the overhead of emulation had been paid, the subsequent cost in power would rise in proportion. I was thinking it might be a fixed overhead to emulate the hardware, so if you had a relatively powerful CPU or no GPU at all, you'd lose a fixed amount of CPU processing ability yet gain from re-purposing your CPU while it would otherwise be idle. I would have assumed that other emulated hardware, such as tap and tun devices, had a fixed overhead unrelated (or at least not proportional) to the bandwidth of data they handle.

I understand basic thermodynamics (better than most, especially in relation to information theory), so I realize you can't get something for nothing. Also, my (perhaps naive) understanding is that GPUs may be better designed for hashing, but that the CPU in a typical box is a much more powerful processor overall than the GPU, which, while designed for graphics, is able to do so much more within its specialty than a CPU, largely because of the way it handles video data, which has well-defined and limited characteristics.

So if you don't have a GPU at all (and can't install one), there's nothing to be gained by emulating one. I'm interested to learn where the efficiency difference lies. Clearly it's not simply because a GPU mainly has to address video memory, or such mundane architectural differences in its I/O design; it's obviously more to do with its intrinsic method of processing algorithms. It's an interesting thought in my view, as I understand that field-programmable gate arrays can do yet more efficient processing for a given task, with literally much less power in terms of kilojoules. So again we would conclude that emulation is futile if you're trying to trump the hardware alternative. I would have suspected as much, but what if there is no alternative (i.e. no GPU to speak of, or if you want platform-independent, standardized, virtual deployment over a VPN running on a hypervisor like KVM)? That's the crux of the issue. I'm not simply wondering whether a CPU-emulated GPU is better than a hardware GPU, but rather whether it's worth implementing at the expense of direct CPU hashing power, specifically where hardware GPU power would be negligible to non-existent.

So, I'll take it the answer is still an unequivocal NO, unless perhaps I didn't make my original query clear enough and you have some clarification to qualify the result. Either way, thanks for the advice.

PS: I promise I won't try to save the world with my 'over unity' bitcoin-generating virtual GPU, even though we know there are suckers out there who would buy it. Grin

legendary
Activity: 1596
Merit: 1012
Democracy is vulnerable to a 51% attack.
June 25, 2011, 11:55:24 PM
#4
Perhaps somebody has already thought of this. Perhaps it's unfeasible. But I thought I'd ask anyhow.

Since GPUs can mine far more efficiently than a CPU, is it possible to create code for a virtual GPU (I'm thinking of hardware that won't support a graphics card)?
Yes, that's quite possible.

Quote
Would the overhead defeat the purpose?
Of course. If a CPU sucks, a CPU having to do the extra work to pretend to be something it isn't will suck much more. Now it won't be able to do the hashes the most efficient way for a CPU but will have to do them the most efficient way for a GPU, all while also having to pretend to be a GPU. YUCK!

Quote
Could you squeeze much more hashing power out of a CPU and make the memory dynamically allocatable to the GPU on demand (i.e. in native CPU idle time)? Just a thought.  Undecided
How can you squeeze *more* power out when you're wasting most of that power just trying to pretend to be a GPU? You'll have to hash with what little is left.
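A toy illustration of where that waste goes (the two-instruction "virtual GPU" below is entirely made up): every emulated step has to be fetched, decoded and dispatched by the host CPU before any actual hashing happens, and the result is no better than just hashing natively.

Code:
import hashlib

# Made-up two-instruction "virtual GPU", purely to show that each emulated
# step costs extra host work before any hashing gets done.
def run_vgpu(program, data):
    acc = data
    for op in program:                    # fetch
        if op == "LOAD":                  # decode
            acc = data                    # execute
        elif op == "SHA":
            acc = hashlib.sha256(acc).digest()
    return acc

emulated = run_vgpu(["LOAD", "SHA", "SHA"], b"header")
native = hashlib.sha256(hashlib.sha256(b"header").digest()).digest()
assert emulated == native                 # same answer, more work per hash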
hero member
Activity: 616
Merit: 500
Firstbits.com/1fg4i :)
June 25, 2011, 12:14:51 PM
#3
you can run a weather-simulator cluster on a netbook, but it will take ages to move a single raindrop
full member
Activity: 126
Merit: 100
June 25, 2011, 11:58:40 AM
#2
I'm not really sure what direction you want to go with your virtual GPU, but maybe have a look at the FPGA miner threads here on the forum.
full member
Activity: 185
Merit: 121
June 25, 2011, 11:25:05 AM
#1
Perhaps somebody has already thought of this. Perhaps it's unfeasible. But I thought I'd ask anyhow.

Since GPUs can mine far more efficiently than a CPU, is it possible to create code for a virtual GPU (I'm thinking of hardware that won't support a graphics card)? Would the overhead defeat the purpose? Could you squeeze much more hashing power out of a CPU and make the memory dynamically allocatable to the GPU on demand (i.e. in native CPU idle time)? Just a thought.  Undecided