Topic: [XPM] Working on a GPU miner for Primecoin, new thread :) - page 14.

newbie
Activity: 23
Merit: 0
Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple Inc. I don't think NVIDIA made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.

Yeah, Apple owns the trademarks because they're the ones who brought everyone to the table.  Apple loved CUDA but isn't dumb enough to sole-source any of their parts.  So they told NVIDIA and ATI that they should all play nice and standardize CUDA.  OpenCL was the result.  It's only barely different from CUDA.  The biggest differences are primarily in making CUDA fit a programming model similar to the shaders already used in OpenGL.  NVIDIA wanted to win a contract with Apple and they had a huge head start on the competition.  AMD's Brook and CAL/IL were mostly a flop, so they happily jumped onboard with a Khronos standard.

If you look just at hashing (and now prime number computation), you're missing a much bigger part of the GPGPU marketplace.  Most of the GPGPU customers (in terms of units purchased) are running floating point computations of enormous matrices and using the interpolation hardware.  They're used in scientific applications, Oil&Gas, Medical stuff, etc.  In those applications, NVIDIA does very well, often better than AMD.
sr. member
Activity: 406
Merit: 250
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple Inc. I don't think NVIDIA made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.
newbie
Activity: 23
Merit: 0
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.
sr. member
Activity: 406
Merit: 250
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU. Forget about it ever running there. Even if it does run, performance will be unbearable.
newbie
Activity: 23
Merit: 0
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

I doubt it's a priority.  With NVIDIA's poor integer performance it seems like it's not even worth it.  Maybe it would be if the GPU miner could hit ~20x performance over typical CPUs, but as it stands it's not really there yet.

But yeah there's definitely a way to eliminate the dependency on OCL 1.2.  You just need to find someone motivated enough to do it.
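
For anyone who wants to try: the usual first step is just checking at runtime what OpenCL version the driver actually reports, so a port could pick a 1.1-compatible path on NVIDIA instead of refusing to start. Here's a minimal host-side sketch in C (illustrative only, not the miner's code; assumes the standard OpenCL headers and library are installed):

/* Illustrative sketch: query each GPU's reported OpenCL version so a miner
 * could select an OpenCL 1.1-compatible code path on NVIDIA hardware.
 * Build with e.g.: gcc check_cl.c -lOpenCL
 */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(8, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms; ++p) {
        cl_device_id devices[16];
        cl_uint num_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 16, devices, &num_devices);

        for (cl_uint d = 0; d < num_devices; ++d) {
            char name[256], version[256];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_VERSION, sizeof(version), version, NULL);
            /* NVIDIA drivers of this era typically report "OpenCL 1.1 CUDA ..." */
            printf("%s: %s\n", name, version);
        }
    }
    return 0;
}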
full member
Activity: 213
Merit: 100
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?
legendary
Activity: 1694
Merit: 1054
Point. Click. Blockchain
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

bug fix first or hear people bitching... (yip-yip hooray...)


-tb-
member
Activity: 100
Merit: 10
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
I would of course prefer it if you fixed the bug I am having first, but then again I am biased :P
//DeaDTerra

+1

bug fix 
donator
Activity: 1064
Merit: 1000
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
I would of course prefer it if you fixed the bug I am having first, but then again I am biased :P
//DeaDTerra
hero member
Activity: 812
Merit: 1000
Yeah, if it works on Linux, release the kraken :p
hero member
Activity: 574
Merit: 500
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

Personally I think releasing that version right now sounds awesome. :)

Fractional assert problem... can it be bypassed by the OS or other workarounds?

If so then release it...
legendary
Activity: 1713
Merit: 1029
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

Personally I think releasing that version right now sounds awesome. :)
hero member
Activity: 532
Merit: 500
It seems like memory corruption, the most annoying kind of bug. Might take some time to fix, I haven't been able to reproduce it on my computer.
This is a serious problem that may not have a fix.  I have a GPU that I still have running on SHA256 because it starts throwing massive errors if I try to use the same intensity on scrypt coins.

I don't think this is a GPU error, but rather an error in the CPU code somewhere.
The reason I brought it up is that several people had memory problems on relatively new cards, and one of the posters explained the tolerances of the GDDR5 memory used in GPUs compared to the DDR3 used in computers.  http://www.mersenneforum.org/showthread.php?t=17598 is the initial discussion.

The answer given to the question "Why is GPU memory weaker than RAM?" was:

Quote
That is normal for video cards. I have already commented on this a few times: the video card industry is the only one (of all the electronics-related branches) which accepts memory with "some percent" of unstable or bad bits. That is done to be able to make cheap cards for people like you and me, and it is accepted because your eyes will see no difference (and even with specialized instruments it is difficult to see) between a pixel on the screen which is Green-126 and one which is Green-127.

Your monitor only has 18 physical "wires" (6 for each of Red, Green, and Blue) through which the color is transmitted, so it CAN'T show more than 2^18 = 262144 colors, and your eyes may see more than 4000 only if you are a professional photographer or painter (or a woman, hehe; my wife always tells me that I only see 16 colors, and indeed, for me "peach" is a fruit, not a color, and "rose" is a flower). So displaying 24-bit or even 32-bit color is just because old processors used a full byte for each of R/G/B, or because new processors can operate faster on 32-bit registers. But of the 8 bits of red, only the 6 most significant go to the monitor. When you use your monitor in 16-bit mode, there are 5 lines for red and blue and 6 for green (as the human eye is more sensitive to green); the rest of the "bits" of the color are "wasted" anyhow (physically, the lines are not connected to the LCD controllers; I am not talking about serialization, LVDS, and the other stuff your card uses to send signals to your monitor, only about what happens at the LCD controller level).

That is why a Tesla C/M2090 is 4-6 times the price of a GTX 580; there is no difference between them except the ECC memory and intensive production testing for endurance.

--------------------------------------------------------------------------------

One possible remedy postulated is downclocking the memory.
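
If you want to rule flaky video memory in or out before blaming the miner, a crude host-side check is to push a known pattern into a large GPU buffer, read it back, and count mismatched words. This is only a hypothetical sketch under those assumptions (error checking omitted, and it mostly exercises plain transfers; a thorough test would also run kernels that hammer the memory), but persistent mismatches would point at marginal GDDR5 and make downclocking worth a try:

/* Hypothetical sketch, not a real memory tester: write a pattern into a GPU
 * buffer, read it back and count mismatches. Error checking omitted. */
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

#define WORDS (64u * 1024u * 1024u)   /* 256 MB of 32-bit words */

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, NULL);

    cl_uint *src = malloc(WORDS * sizeof(cl_uint));
    cl_uint *dst = malloc(WORDS * sizeof(cl_uint));
    for (cl_uint i = 0; i < WORDS; ++i)
        src[i] = 0xA5A5A5A5u ^ i;     /* recognisable pattern */

    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, WORDS * sizeof(cl_uint), NULL, NULL);
    clEnqueueWriteBuffer(q, buf, CL_TRUE, 0, WORDS * sizeof(cl_uint), src, 0, NULL, NULL);
    clEnqueueReadBuffer(q, buf, CL_TRUE, 0, WORDS * sizeof(cl_uint), dst, 0, NULL, NULL);

    unsigned long bad = 0;
    for (cl_uint i = 0; i < WORDS; ++i)
        if (dst[i] != src[i]) ++bad;
    printf("%lu corrupted words out of %u\n", bad, WORDS);

    clReleaseMemObject(buf);
    clReleaseCommandQueue(q);
    clReleaseContext(ctx);
    free(src);
    free(dst);
    return 0;
}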
sr. member
Activity: 406
Merit: 250
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)

I'd be offended if I was offered only $27/hour for a professional job.



Yes, it is quite underpaid considering the job that is being done. He seems to be satisfied, though, as he isn't asking for any more donations?

Of course, that figure assumes he works on it full time.
hero member
Activity: 560
Merit: 500
But apart from the above, it's good to see the community working with mtrlt to debug and improve the program - and no sign of the pitchfork wavers.
hero member
Activity: 560
Merit: 500
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)

I'd be offended if I was offered only $27/hour for a professional job.

sr. member
Activity: 406
Merit: 250
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)
full member
Activity: 210
Merit: 100
I don't use any kind of messenger; beware of scammers
Ignoring the total, the Fermat test on the GPU gives 1000x fewer 2-chains and 100x fewer 4-chains.  This is with 2 x 7950, an AMD FX(tm)-4130 quad-core processor, and Ubuntu 13.

Am I missing something somewhere?
Likely not; it may be my settings or some other factor (MinGW-compiled vs. native Linux, etc.). I have a very weak CPU in there (a Celeron G530, I think) that gets maxed out when the CPU test is running (against 2 GPUs).
cpu_mining_threads  2,  worksize 64,  aggression 23,  sievepercentage 10, sievesize 25165824

CPU: 350.824k 2-chains     33.2151k 3-chains     3.13091k 4-chains     318.715 5-chains     18.7479 6-chains
GPU: 502.131k 2-chains     45.7112k 3-chains     4.8202k 4-chains     392.074 5-chains     46.1263 6-chains

I have a faster quad-core lying around; I will give that a try when I get a moment.
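
For reference, here is roughly what those chain counts measure: each candidate is run through a base-2 Fermat probable-prime test, and a k-chain means k numbers in a row in the chain pass it. Below is a minimal CPU-side sketch with GMP; it's an illustration of the idea only, not the miner's actual code, which also handles second-kind and bi-twin chains and fractional lengths.

/* Minimal CPU-side sketch (not the miner's code): base-2 Fermat test and
 * Cunningham chain (first kind) length counting with GMP.
 * Build with: gcc chains.c -lgmp
 */
#include <stdio.h>
#include <gmp.h>

/* Fermat probable-prime test to base 2: n passes if 2^(n-1) == 1 (mod n). */
static int fermat_test(const mpz_t n)
{
    mpz_t base, exp, r;
    mpz_inits(base, exp, r, NULL);
    mpz_set_ui(base, 2);
    mpz_sub_ui(exp, n, 1);          /* exp = n - 1          */
    mpz_powm(r, base, exp, n);      /* r = 2^(n-1) mod n    */
    int ok = (mpz_cmp_ui(r, 1) == 0);
    mpz_clears(base, exp, r, NULL);
    return ok;
}

/* Length of the Cunningham chain of the first kind starting at origin:
 * origin, 2p+1, 2(2p+1)+1, ... counted while each element passes the test. */
static unsigned chain_length(const mpz_t origin)
{
    mpz_t p;
    mpz_init_set(p, origin);
    unsigned len = 0;
    while (fermat_test(p)) {
        ++len;
        mpz_mul_2exp(p, p, 1);      /* p = 2p   */
        mpz_add_ui(p, p, 1);        /* p = 2p+1 */
    }
    mpz_clear(p);
    return len;
}

int main(void)
{
    mpz_t p;
    mpz_init_set_ui(p, 89);         /* 89, 179, 359, 719, 1439, 2879: length 6 */
    printf("chain length starting at 89: %u\n", chain_length(p));
    mpz_clear(p);
    return 0;
}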
sr. member
Activity: 771
Merit: 258
Trident Protocol | Simple «buy-hold-earn» system!
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
If this is of any help: watching the console as all the data flies by, my setup (i7 chip, Windows 8, 64-bit) reels out all the data, starting at 0 fermats/sec and 786 gandalfs/s. As time passes the gandalfs/s progressively drops, and if the app doesn't crash earlier, each time the console reports the gandalfs/s it shows a smaller and smaller number, falling to 60 gandalfs/s; then it usually gives up and passes on to Windows error reporting. It only ever shows 0 fermats/s - what is that meant to be?

I had this same issue with the gandalfs dropping, and I managed to keep it at a steady average by playing around with the "worksize" parameter in primecoin.conf; I got the best result with 64.

I can confirm worksize of 64 was optimal for me too.
This does seem an improvement - thanks - this time it runs for 1-5 minutes.
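
Pulling the options mentioned in this thread together, a primecoin.conf for the GPU miner might look something like the sketch below. This assumes the usual bitcoind-style key=value format; the exact option names and sensible ranges depend on the miner build you're running, so treat it as a template only.

# Hypothetical primecoin.conf sketch -- only options mentioned in this thread
cpu_mining_threads=2
worksize=64              # several posters report 64 as the sweet spot
aggression=23
sievepercentage=10
sievesize=25165824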
sr. member
Activity: 771
Merit: 258
Trident Protocol | Simple «buy-hold-earn» system!
What GPU?
7970; same problem with Catalyst 13.4 and 13.8