Topic: Nvidia Kepler : killer or average mining card ? (Read 3969 times)

rjk
sr. member
Activity: 448
Merit: 250
1ngldh
February 15, 2012, 08:47:38 PM
#21
http://fastra2.ua.ac.be/?page_id=214

I have heard people report they couldn't get 7 or 8 GPUs to be detected in BAMT or other Linux distros.  I have never had an issue at 8 GPUs, but I turn everything off in the BIOS, which may free up space in this legacy 64KB I/O port map to allow at least 8 GPUs to be detected.

Praise you! That is a brilliant find indeed, DAT. Yet another potential point of failure I never considered.
Then again, always having gone berserk in the BIOS I never encountered any issues with missing GPUs.
The diligence does pay off, I guess...

The truth is, the old BIOS implementations can barely be called up to snuff today.
UEFI should deal the coup de grace and take over as a ruling standard with great benefit to the whole ecosystem.
No more messing with code devised in the 80s...

Kepler will without doubt suck at integer ops - it's non-trivial to change the architecture enough to turn a well known Achilles' heel into a strong point.
Bitcoin is by no means significant enough to warrant the labor, not by orders of magnitude.

It's quite funny as the same performance characteristics that make AMD cards better suited for mining are what makes those cards inferior to nVidia's for many GPGPU applications.
According to Trenton and other manufacturers of incredible computing systems, their design service includes BIOS rewrites - presumably because a custom BIOS must be written for each application, or at least for some of the high density GPGPU systems they produce.
full member
Activity: 210
Merit: 100
http://fastra2.ua.ac.be/?page_id=214

I have heard people report they couldn't get 7 or 8 GPUs to be detected in BAMT or other Linux distros.  I have never had an issue at 8 GPUs, but I turn everything off in the BIOS, which may free up space in this legacy 64KB I/O port map to allow at least 8 GPUs to be detected.

Praise you! That is a brilliant find indeed, DAT. Yet another potential point of failure I never considered.
Then again, always having gone berserk in the BIOS I never encountered any issues with missing GPUs.
The diligence does pay off, I guess...

The truth is, the old BIOS implementations can barely be called up to snuff today.
UEFI should deal the coup de grace and take over as a ruling standard with great benefit to the whole ecosystem.
No more messing with code devised in the 80s...

Kepler will without doubt suck at integer ops - it's non-trivial to change the architecture enough to turn a well known Achilles' heel into a strong point.
Bitcoin is by no means significant enough to warrant the labor, not by orders of magnitude.

It's quite funny as the same performance characteristics that make AMD cards better suited for mining are what makes those cards inferior to nVidia's for many GPGPU applications.
donator
Activity: 1218
Merit: 1079
Gerald Davis
So when looking at >8 GPU issues/solutions I found this

Quote
The remaining problem was unexpected: each GPU requires a block of 4KB of I/O port space, for which only 64KB is reserved in total. Together with low-level system devices and devices like network and USB controllers also taking up I/O space this was a very tight fit. We needed to re-map inefficiently allocated system devices and disable as many devices as possible entirely, such as the RAID controller and the second network controller. From later experiments we suspect it might actually only be necessary to allocate this 4KB block of I/O ports for the primary VGA controller, but we haven’t verified that.

http://fastra2.ua.ac.be/?page_id=214

I have heard people report they couldn't get 7 or 8 GPUs to be detected in BAMT or other Linux distros.  I have never had an issue at 8 GPUs, but I turn everything off in the BIOS, which may free up space in this legacy 64KB I/O port map to allow at least 8 GPUs to be detected.

Anyways, kinda off topic, but if the Nvidia card does turn out to be decent and someone tries >8 GPUs, that might help.
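
For a rough feel of how tight that 64KB of I/O port space is, here is a back-of-the-envelope sketch in Python. The 64KB total and the 4KB per GPU come from the FASTRA quote above; the amounts taken by other devices are made-up placeholders, not measured figures, and the function name is just for illustration.

```python
IO_SPACE_BYTES = 64 * 1024   # total legacy I/O port space (from the quote)
GPU_BLOCK_BYTES = 4 * 1024   # I/O block claimed per GPU (from the quote)


def max_gpus(other_devices_bytes: int) -> int:
    """Return how many 4KB GPU blocks fit once other devices took their share."""
    free = IO_SPACE_BYTES - other_devices_bytes
    return max(free, 0) // GPU_BLOCK_BYTES


if __name__ == "__main__":
    # Hypothetical usage by chipset devices, NICs, USB, RAID controller, etc.
    for other_kb in (0, 8, 16, 32):
        gpus = max_gpus(other_kb * 1024)
        print(f"{other_kb:>2} KB used elsewhere -> room for up to {gpus} GPUs")
```

With nothing else in the way the ceiling would be 16 blocks, so every 4KB freed by disabling an onboard device is potentially room for one more GPU. This ignores alignment and how the BIOS actually lays out the ranges, so treat it as an upper bound only.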

hero member
Activity: 518
Merit: 500
OT


What do you call x86?  Smiley Performance for native code was fine, emulating x86 was poor.

If you made an ARM processor which has x86 emulation support performance in x86 mode would suck also.

Yes, but I was talking about its native performance. It sucks. Always has sucked from the very first implementation. Despite an estimated $10+ billion in R&D, despite its enormous die size, gigantic caches, monstrous I/O capabilities, despite heroic efforts by Intel's compiler teams, on many benchmarks (that aren't HDD benches) it often gets beaten by lowly Celerons, literally.  Let's be kind, and not mention the CPUs it actually competes with, like Beckton/Westmere/... based Xeons, let alone Power 5/6/7.

It's a dog, and the only reason it still sells is because it's the only chip to run HP-UX, NonStop and Tandem (for now). Anyone not bound to those OSes runs on a different architecture. It's not for no reason Microsoft, Red Hat and others pulled the plug on Itanium; if you run Windows, you are far better off with x86, and if you are on Linux, might as well go for Power or x86.

Performance, however, isn't even my main point. Nor the nightmare it apparently is to develop optimized code for Itanium; there is this little issue that the ISA is so complex and protected by patents it would guarantee a single supplier market.  Which was basically the whole point of Intel developing it.

Like I said, thanks, good riddance.

Quote
Somewhat ironically the only implementation of locking down hardware to a particular OS through the use of UEFI signatures is the Windows 8 ARM platform. Smiley

I don't see the irony. It's not like it has anything to do with ARM as an ISA. It has everything to do with MS.

/OT
donator
Activity: 1218
Merit: 1079
Gerald Davis
Itanium? Uefi? Dear God, what good riddance.
One a totally proprietary platform that performs like shit

What do you call x86?  Smiley Performance for native code was fine, emulating x86 was poor.

If you made an ARM processor which has x86 emulation support performance in x86 mode would suck also.

Quote
and the other a DRM infected curse forcing us into closed source firmware drivers.

UEFI doesn't require closed source firmware or DRM.  Most systems will be using UEFI in time anyways; it will just be full of backward compatible junk due to x86's long and tortured history.

Quote
If you want to get rid of x86 look no further than ARM (possibly MIPS); and for BIOS replacement we have openboot/coreboot

Somewhat ironically the only implementation of locking down hardware to a particular OS through the use of UEFI signatures is the Windows 8 ARM platform. Smiley
hero member
Activity: 518
Merit: 500
Itanium? Uefi? Dear God, what good riddance.
One a totally proprietary platform that performs like shit, and the other a DRM infected curse forcing us into closed source firmware drivers. Thanks but no thanks.

If you want to get rid of x86 look no further than ARM (possibly MIPS); and for BIOS replacement we have openboot/coreboot
donator
Activity: 1218
Merit: 1079
Gerald Davis
That would surprise me very much.
But even if true, just run multiple xservers.

Well, it gets more complicated than that.  BIOSes make a lot of assumptions, so without BIOS support the hardware isn't exposed to the driver or the OS.  Part of the problem is that a motherboard BIOS needs to handle everything from last gen components running on Windows NT without AHCI support to modern hardware, to edge cases like 8 GPUs that aren't actually used as GPUs.

The Linux kernel also makes some performance assumptions which aren't necessarily valid for this kind of "niche" high performance computing. In the links above it looks like the first guy was "working on" getting 9+ GPUs, and in the second link the company had to write a custom Linux kernel and a custom BIOS to get the system to even boot.

Sadly x86 has a lot of backwards compatibility "junk" and x64 being just an extension doesn't make a clean break.  Sometimes I wish Itanium had won the 64 bit wars.  A native 64-bit platform using only UEFI without any need for legacy support would have been a "clean slate" to build on.

Oh well.  Maybe someday (although a new architecture seems unlikely now).
donator
Activity: 1218
Merit: 1079
Gerald Davis
Gotcha.  Yeah I really hoped that when AMD bought ATI, that driver support would improve.  Given that now AMD is making big inroads into performance per watt you would think that they might want to win a couple supercomputer bids and thus step up their OpenCL drivers but it has been BLAH for two years now.

NVidia has IMHO always had better drivers and they also have an advantage of working with GPGPU since 2007.  Still most academic & scientific work doesn't involve integers and it appears internally the company has decided (at least in the past) to drive floating point (especially double precision used in scientific work) performance at the expense of integer performance.
hero member
Activity: 518
Merit: 500
Total BS.

Nvidia has no such hardcoded limit in their drivers like ATI engineers that live in a virtual BOX.

Thanks for the links.  In the last discussion I saw on this, someone linked to an NVidia forum thread where an NVidia rep indicated the drivers don't support more than 8.  Looks like they do.

Still in your first link the guy said "when I get this to post".  The second link the company had to write custom bios.  Smiley

I imagine most miners will be stuck w/ 8 GPUs for the time being.

Quote
I'm not even a Nvidia fanboy. I just go with whatever is the best offering but I must say this : I am sick and tired of ATI's BS with SDK 2.6 and CPU bugs and export DISPLAY=:0 and all that ...

NVidia has always had better drivers but they also have always had beyond horrible integer performance.  Sorry to disappoint, but unless you see something from NVidia indicating otherwise their int efficiency is likely going to blow (just like the prior 3 generations).

Still not sure how you have CPU bug and not sure why you are using SDK 2.6 unless you have a 7000 series card.

I don't have the CPU bug myself ( using 5870s with 2.1 ) but I was just giving an example of ATI's crap driver problems people are facing.
hero member
Activity: 518
Merit: 500
NVidia needs xserver and also has an 8 GPU limit.

That would surprise me very much.
But even if true, just run multiple xservers.
donator
Activity: 1218
Merit: 1079
Gerald Davis
Total BS.

Nvidia has no such hardcoded limit in their drivers like ATI engineers that live in a virtual BOX.

Thanks for the links.  In the last discussion I saw on this, someone linked to an NVidia forum thread where an NVidia rep indicated the drivers don't support more than 8.  Looks like they do.

Still in your first link the guy said "when I get this to post".  The second link the company had to write custom bios.  Smiley

I imagine most miners will be stuck w/ 8 GPUs for the time being.

Quote
I'm not even a Nvidia fanboy. I just go with whatever is the best offering but I must say this : I am sick and tired of ATI's BS with SDK 2.6 and CPU bugs and export DISPLAY=:0 and all that ...

NVidia has always had better drivers but they also have always had beyond horrible integer performance.  Sorry to disappoint, but unless you see something from NVidia indicating otherwise their int efficiency is likely going to blow (just like the prior 3 generations).

Still not sure how you have CPU bug and not sure why you are using SDK 2.6 unless you have a 7000 series card.
hero member
Activity: 518
Merit: 500
Screw ATI with their stupid 8 GPU limit and need for xserver to be running Angry

NVidia needs xserver and also has an 8 GPU limit.


Total BS.

Nvidia has no such hardcoded limit in their drivers like ATI engineers that live in a virtual BOX.

http://www.overclock.net/t/486609/gpu-milking-machine -> guy running 6 * dual GPU cards = 12 GPUs

http://fastra2.ua.ac.be/ -> total 13 GPUs ( 1 single card and 6 dual cards ) using tweaked BIOS and kernel

Somebody needs to let us know if the xserver needs to run but I doubt it. Also no annoying 100% CPU bug presumably as well. Nvidia drivers are MUCH better than the crap ATI gives us IMHO.

I'm not even a Nvidia fanboy. I just go with whatever is the best offering but I must say this : I am sick and tired of ATI's BS with SDK 2.6 and CPU bugs and export DISPLAY=:0 and all that ...
donator
Activity: 1218
Merit: 1079
Gerald Davis
Screw ATI with their stupid 8 GPU limit and need for xserver to be running Angry

NVidia needs xserver and also has an 8 GPU limit.
hero member
Activity: 518
Merit: 500
My opinion:
if 512 cores @ 830 could pump out 150 MH/s (or something like that),
then ~2300 cores @ 925 should give us 4.5 x 150 = 675 ....
Can't wait to see that Cheesy

This would be amazing if true.

Screw ATI with their stupid 8 GPU limit and need for xserver to be running Angry

Does anyone know if Nvidia needs the xserver active during mining on Linux ?

Probably not as they are much saner than ATI ( living in a box ).
donator
Activity: 1218
Merit: 1079
Gerald Davis
My opinion:
if 512 cores @ 830 could pump out 150 MH/s (or something like that),
then ~2300 cores @ 925 should give us 4.5 x 150 = 675 ....
Can't wait to see that Cheesy

You can't compare SP to SP as it is a new architecture.

GTX 580 gets 1.5 TFLOPS; the new card is estimated to be ~3.0 TFLOPS.  Now granted that is floating point computational power, but unless NVidia changed relative integer performance it should scale roughly the same.  So if integer performance is similar we are looking at 300 MH/s, maybe 350 MH/s if it has higher overclocking ability than the 580.


Currently NVidia integer performance is significantly inferior to AMD's and that holds back Bitcoin performance:

Take a stock 5870.
Floating Point: 2720 GFLOPS
Integer: ~390 MH/s.  

Units aren't important, just relative performance.  AMD cards get roughly 1 MH/s per 6 to 7 GFLOPS (2720 / 390 is about 7).  Now before someone flames, I am not saying GFLOPS are used in hashing.  It is just a metric to look at how balanced AMD cards are (integer performance vs floating point performance).


Now lets look at a stock GTX 580
Floating Point: 1500 GFLOPs
Integer: ~140 MH/s.

Once again units don't matter.  NVidia gets roughly 1 MH/s per 10 GFLOPS.  Clock for clock, transistor for transistor AMD designs deliver about 60% higher int ops than NVidia's do.  You can look at it by transistor count vs performance also, the same bias is present.  This doesn't just apply to Bitcoin it appears in any integer heavy computation (like brute forcing passwords).

IF (and like I said nothing indicates that) it had roughly the same relative performance (integer vs floating point) that AMD 5000 series cards have we would be looking at 450-500 MH/s.   Maybe someday NVidia will care about integer performance.
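
To make the arithmetic above easy to redo with other numbers, here is a minimal Python sketch. The 5870 and GTX 580 figures are the ones quoted in this post, and the 3000 GFLOPS Kepler number is the rumored estimate, not a spec. The function names are made up for illustration, and the whole thing assumes hashrate tracks floating point throughput at a fixed per-vendor ratio, which is exactly the assumption that may not hold for a new architecture.

```python
def gflops_per_mhs(gflops: float, mhs: float) -> float:
    """GFLOPS of floating point throughput per 1 MH/s of hashing."""
    return gflops / mhs


def projected_mhs(gflops: float, ratio: float) -> float:
    """Hashrate a card would reach if it kept the given GFLOPS-per-MH/s ratio."""
    return gflops / ratio


# Figures quoted in the post (stock cards).
HD5870 = {"gflops": 2720.0, "mhs": 390.0}
GTX580 = {"gflops": 1500.0, "mhs": 140.0}
KEPLER_GFLOPS_ESTIMATE = 3000.0  # rumored, unconfirmed

amd_ratio = gflops_per_mhs(HD5870["gflops"], HD5870["mhs"])
nv_ratio = gflops_per_mhs(GTX580["gflops"], GTX580["mhs"])

if __name__ == "__main__":
    print(f"AMD 5870: {amd_ratio:.1f} GFLOPS per MH/s")
    print(f"GTX 580:  {nv_ratio:.1f} GFLOPS per MH/s")
    print(f"AMD int advantage: {nv_ratio / amd_ratio - 1:.0%}")
    print(f"Kepler at NVidia ratio: {projected_mhs(KEPLER_GFLOPS_ESTIMATE, nv_ratio):.0f} MH/s")
    print(f"Kepler at AMD ratio:    {projected_mhs(KEPLER_GFLOPS_ESTIMATE, amd_ratio):.0f} MH/s")
```

Run on the unrounded figures, the ratios come out nearer 7 and 10.7 GFLOPS per MH/s, and the two Kepler projections land around 280 and 430 MH/s, a bit under the rounded numbers used in the text. Same conclusion either way.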
donator
Activity: 1218
Merit: 1079
Gerald Davis
It'll stink, just like all other NV cards. I'd go as far as to say it won't get 300 MH/s.

I am trying to be as objective as possible here. Fanboy comments do not count Grin

I don't think it is a fanboy comment.  Nvidia has always made integer performance a low priority and nothing indicates this has changed.
sr. member
Activity: 274
Merit: 250
My opinion:
if 512 cores @ 830 could pump out 150 MH/s (or something like that),
then ~2300 cores @ 925 should give us 4.5 x 150 = 675 ....
Can't wait to see that Cheesy
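
A quick Python sketch of that naive scaling, for anyone who wants to plug in other rumored specs. It assumes hashrate grows linearly with core count (and optionally clock), i.e. that a Kepler core does the same work per clock as a Fermi core, which is a big assumption for a new architecture. The function and parameter names are made up for illustration.

```python
def naive_scaled_mhs(base_mhs: float, base_cores: int, base_mhz: float,
                     new_cores: int, new_mhz: float,
                     include_clock: bool = True) -> float:
    """Scale a known hashrate linearly by core count and, optionally, clock."""
    scale = new_cores / base_cores
    if include_clock:
        scale *= new_mhz / base_mhz
    return base_mhs * scale


if __name__ == "__main__":
    # Numbers from the post: 512 cores @ 830 -> ~150 MH/s, ~2300 cores @ 925.
    cores_only = naive_scaled_mhs(150, 512, 830, 2300, 925, include_clock=False)
    with_clock = naive_scaled_mhs(150, 512, 830, 2300, 925)
    print(f"cores only:      {cores_only:.0f} MH/s")
    print(f"cores and clock: {with_clock:.0f} MH/s")
```

The cores-only figure is the 4.5 x 150 estimate above; counting the clock bump as well pushes it to roughly 750 MH/s, but only if per-core performance really is unchanged.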
hero member
Activity: 518
Merit: 500
It'll stink, just like all other NV cards. I'd go as far as to say it won't get 300 MH/s.

I am trying to be as objective as possible here. Fanboy comments do not count Grin
donator
Activity: 1419
Merit: 1015
It is my understanding that the Kepler shader is actually inferior to the Fermi shader from a Bitcoin standpoint, right? So having more of a crap unit doesn't really help, or maybe I'm reading this wrong. That's a lot of stream processing for sure.
sr. member
Activity: 322
Merit: 250
It'll stink, just like all other NV cards. I'd go as far as to say it won't get 300 MH/s.