Topic: [XPM] Working on a GPU miner for Primecoin, new thread :) - page 14.

newbie
Activity: 23
Merit: 0
Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple Inc. I don't think NVIDIA made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.

Yeah, Apple owns the trademarks because they're the ones who brought everyone to the table.  Apple loved CUDA but isn't dumb enough to sole-source any of their parts.  So they told NVIDIA and ATI that they should all play nice and standardize CUDA.  OpenCL was the result.  It's only barely different from CUDA.  The biggest differences are primarily in making CUDA fit a programming model similar to the shaders already used in OpenGL.  NVIDIA wanted to win a contract with Apple and they had a huge head start on the competition.  AMD's Brook and CAL/IL were mostly a flop, so they happily jumped onboard with a Khronos standard.

If you look just at hashing (and now prime number computation), you're missing a much bigger part of the GPGPU marketplace.  Most of the GPGPU customers (in terms of units purchased) are running floating point computations of enormous matrices and using the interpolation hardware.  They're used in scientific applications, Oil&Gas, Medical stuff, etc.  In those applications, NVIDIA does very well, often better than AMD.
sr. member
Activity: 406
Merit: 250
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple Inc. I don't think NVIDIA made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.
newbie
Activity: 23
Merit: 0
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.
sr. member
Activity: 406
Merit: 250
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

NVIDIA is poor at doing anything GPGPU. Forget about it ever running there. Even if it does run, performance will be unbearable.
newbie
Activity: 23
Merit: 0
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?

I doubt it's a priority.  With NVIDIA's poor integer performance it seems like it's not even worth it.  Maybe it would be if the GPU miner could hit ~20x performance over typical CPUs, but as it stands it's not really there yet.

But yeah there's definitely a way to eliminate the dependency on OCL 1.2.  You just need to find someone motivated enough to do it.
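
For anyone who wants to try: the usual first step is just checking at runtime what OpenCL version the driver actually reports, so a port could pick a 1.1-compatible path on NVIDIA instead of refusing to start. Here's a minimal host-side sketch in C (illustrative only, not the miner's code; assumes the standard OpenCL headers and library are installed):

/* Illustrative sketch: query each GPU's reported OpenCL version so a miner
 * could select an OpenCL 1.1-compatible code path on NVIDIA hardware.
 * Build with e.g.: gcc check_cl.c -lOpenCL
 */
#include <stdio.h>
#include <CL/cl.h>

int main(void)
{
    cl_platform_id platforms[8];
    cl_uint num_platforms = 0;
    clGetPlatformIDs(8, platforms, &num_platforms);

    for (cl_uint p = 0; p < num_platforms; ++p) {
        cl_device_id devices[16];
        cl_uint num_devices = 0;
        clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_GPU, 16, devices, &num_devices);

        for (cl_uint d = 0; d < num_devices; ++d) {
            char name[256], version[256];
            clGetDeviceInfo(devices[d], CL_DEVICE_NAME, sizeof(name), name, NULL);
            clGetDeviceInfo(devices[d], CL_DEVICE_VERSION, sizeof(version), version, NULL);
            /* NVIDIA drivers of this era typically report "OpenCL 1.1 CUDA ..." */
            printf("%s: %s\n", name, version);
        }
    }
    return 0;
}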
full member
Activity: 213
Merit: 100
Any word on getting this to work on NVIDIA cards? From what I understand it's because the NVIDIA cards don't support OpenCL 1.2 (yet?). Any potential workarounds on Windows or Linux?
legendary
Activity: 1694
Merit: 1054
Point. Click. Blockchain
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

bug fix first or hear people bitching... (yip-yip hooray...)


-tb-
member
Activity: 100
Merit: 10
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
I would of course prefer it if you fixed the bug I am having first, but then again I am biased :P
//DeaDTerra

+1

bug fix 
donator
Activity: 1064
Merit: 1000
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
I would of course prefer it if you fixed the bug I am having first, but then again I am biased :P
//DeaDTerra
hero member
Activity: 812
Merit: 1000
Yeah, if it works on Linux, release the kraken :p
hero member
Activity: 574
Merit: 500
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

Personally I think releasing that version right now sounds awesome. :)

Fractional assert problem... can it be bypassed by the OS or other workarounds?

If so then release it...
legendary
Activity: 1713
Merit: 1029
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?

Personally I think releasing that version right now sounds awesome. :)
hero member
Activity: 532
Merit: 500
It seems like memory corruption, the most annoying kind of bug. Might take some time to fix, I haven't been able to reproduce it on my computer.
This is a serious problem that may not have a fix.  I have a GPU that I still have running on SHA256 because it starts throwing massive errors if I try to use the same intensity on scrypt coins.

I don't think this is a GPU error, but rather an error in the CPU code somewhere.
The reason I brought it up is that several people had memory problems on relatively new cards, and one of the posters explained the tolerances of the GDDR5 memory used in GPUs compared to the DDR3 used in computers.  http://www.mersenneforum.org/showthread.php?t=17598 is the initial discussion.

The answer given to the question "Why is GPU memory weaker than RAM?" was:

Quote
That is normal for video cards. I have already commented on this a few times: the video card industry is the only one (of all the electronics-related branches) which accepts memory with "some percent" of unstable or bad bits. That is done to be able to make cheap cards for people like you and me, and it is accepted because your eyes will see no difference (and even with specialized instruments it is difficult to see) between a pixel on the screen which is Green-126 and one which is Green-127.

Your monitor only has 18 physical "wires" (6 for each of Red, Green, and Blue) through which the color is transmitted, so it CAN'T show more than 2^18 = 262144 colors, and your eyes may see more than 4000 only if you are a professional photographer or painter (or a woman, hehe; my wife always tells me that I only see 16 colors, and indeed, for me "peach" is a fruit, not a color, and "rose" is a flower). So displaying 24-bit or even 32-bit color is just because old processors used a full byte for each of R/G/B, or because new processors can operate faster on 32-bit registers. But of the 8 bits of red, only the 6 most significant go to the monitor. When you use your monitor in 16-bit mode, there are 5 lines for red and blue and 6 for green (as the human eye is more sensitive to green); the rest of the "bits" of the color are "wasted" anyhow (physically, the lines are not connected to the LCD controllers; I am not talking about serialization, LVDS, and the other stuff your card uses to send signals to your monitor, only about what happens at the LCD controller level).

That is why a Tesla C/M2090 is 4-6 times the price of a GTX 580; there is no difference between them except the ECC memory and intensive production testing for endurance.

--------------------------------------------------------------------------------

One possible remedy postulated is downclocking the memory.
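
If you want to rule flaky video memory in or out before blaming the miner, a crude host-side check is to push a known pattern into a large GPU buffer, read it back, and count mismatched words. This is only a hypothetical sketch under those assumptions (error checking omitted, and it mostly exercises plain transfers; a thorough test would also run kernels that hammer the memory), but persistent mismatches would point at marginal GDDR5 and make downclocking worth a try:

/* Hypothetical sketch, not a real memory tester: write a pattern into a GPU
 * buffer, read it back and count mismatches. Error checking omitted. */
#include <stdio.h>
#include <stdlib.h>
#include <CL/cl.h>

#define WORDS (64u * 1024u * 1024u)   /* 256 MB of 32-bit words */

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, device, 0, NULL);

    cl_uint *src = malloc(WORDS * sizeof(cl_uint));
    cl_uint *dst = malloc(WORDS * sizeof(cl_uint));
    for (cl_uint i = 0; i < WORDS; ++i)
        src[i] = 0xA5A5A5A5u ^ i;     /* recognisable pattern */

    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, WORDS * sizeof(cl_uint), NULL, NULL);
    clEnqueueWriteBuffer(q, buf, CL_TRUE, 0, WORDS * sizeof(cl_uint), src, 0, NULL, NULL);
    clEnqueueReadBuffer(q, buf, CL_TRUE, 0, WORDS * sizeof(cl_uint), dst, 0, NULL, NULL);

    unsigned long bad = 0;
    for (cl_uint i = 0; i < WORDS; ++i)
        if (dst[i] != src[i]) ++bad;
    printf("%lu corrupted words out of %u\n", bad, WORDS);

    clReleaseMemObject(buf);
    clReleaseCommandQueue(q);
    clReleaseContext(ctx);
    free(src);
    free(dst);
    return 0;
}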
sr. member
Activity: 406
Merit: 250
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)

I'd be offended if I was offered only $27/hour for a professional job.



Yes, it is quite underpaid considering the job that is being done. He seems to be satisfied, though, as he isn't asking for any more donations?

Of course, that figure assumes he works on it full time.
hero member
Activity: 560
Merit: 500
But apart from the above, it's good to see the community working with mtrlt to debug and improve the program - and no sign of the pitchfork wavers.
hero member
Activity: 560
Merit: 500
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)

I'd be offended if I was offered only $27/hour for a professional job.

sr. member
Activity: 406
Merit: 250
http://howmuchmoneyhasmtrltmadefromtheprimecoingpuminer.housecat.name/

I totally promise that it is bias-free this time. :P

(I was bored.)
full member
Activity: 210
Merit: 100
I don't use any kind of messenger; beware of scammers
Ignoring the total, the Fermat test on the GPU gives 1000x fewer 2-chains and 100x fewer 4-chains.  This is with 2 x 7950, an AMD FX(tm)-4130 quad-core processor, and Ubuntu 13.

Am I missing something somewhere?
Likely not; it may be my settings or some other factor (MinGW-compiled vs. native Linux, etc.). I have a very weak CPU in there (a Celeron G530, I think) that gets maxed out when the CPU test is running (against 2 GPUs).
cpu_mining_threads  2,  worksize 64,  aggression 23,  sievepercentage 10, sievesize 25165824

CPU: 350.824k 2-chains     33.2151k 3-chains     3.13091k 4-chains     318.715 5-chains     18.7479 6-chains
GPU: 502.131k 2-chains     45.7112k 3-chains     4.8202k 4-chains     392.074 5-chains     46.1263 6-chains

I have a faster quad-core lying around; I will give that a try when I get a moment.
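
For reference, here is roughly what those chain counts measure: each candidate is run through a base-2 Fermat probable-prime test, and a k-chain means k numbers in a row in the chain pass it. Below is a minimal CPU-side sketch with GMP; it's an illustration of the idea only, not the miner's actual code, which also handles second-kind and bi-twin chains and fractional lengths.

/* Minimal CPU-side sketch (not the miner's code): base-2 Fermat test and
 * Cunningham chain (first kind) length counting with GMP.
 * Build with: gcc chains.c -lgmp
 */
#include <stdio.h>
#include <gmp.h>

/* Fermat probable-prime test to base 2: n passes if 2^(n-1) == 1 (mod n). */
static int fermat_test(const mpz_t n)
{
    mpz_t base, exp, r;
    mpz_inits(base, exp, r, NULL);
    mpz_set_ui(base, 2);
    mpz_sub_ui(exp, n, 1);          /* exp = n - 1          */
    mpz_powm(r, base, exp, n);      /* r = 2^(n-1) mod n    */
    int ok = (mpz_cmp_ui(r, 1) == 0);
    mpz_clears(base, exp, r, NULL);
    return ok;
}

/* Length of the Cunningham chain of the first kind starting at origin:
 * origin, 2p+1, 2(2p+1)+1, ... counted while each element passes the test. */
static unsigned chain_length(const mpz_t origin)
{
    mpz_t p;
    mpz_init_set(p, origin);
    unsigned len = 0;
    while (fermat_test(p)) {
        ++len;
        mpz_mul_2exp(p, p, 1);      /* p = 2p   */
        mpz_add_ui(p, p, 1);        /* p = 2p+1 */
    }
    mpz_clear(p);
    return len;
}

int main(void)
{
    mpz_t p;
    mpz_init_set_ui(p, 89);         /* 89, 179, 359, 719, 1439, 2879: length 6 */
    printf("chain length starting at 89: %u\n", chain_length(p));
    mpz_clear(p);
    return 0;
}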
sr. member
Activity: 771
Merit: 258
Trident Protocol | Simple «buy-hold-earn» system!
I've made the primality tester 4-5x faster, but I haven't managed to fix the "fractional assert" problem. Should I just release the faster version (many people won't be able to use it) or try to fix the bug first (will take time)?
If this is of any help: watching the console as all the data flies by, my setup (i7 chip, Windows 8, 64-bit) reels out all the data, starting at 0 fermats/sec and 786 gandalfs/s. As time passes the gandalfs/s progressively drops, and if the app doesn't crash earlier, each time the console reports the gandalfs/s it shows a smaller and smaller number, falling to 60 gandalfs/s; then it usually gives up and passes on to Windows error reporting. It only ever shows 0 fermats/s - what is that meant to be?

I had this same issue with the gandalfs dropping, and I managed to keep it at a steady average by playing around with the "worksize" parameter in primecoin.conf; I got the best result with 64.

I can confirm worksize of 64 was optimal for me too.
This does seem an improvement - thanks - this time it runs for 1-5 minutes.
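
Pulling the options mentioned in this thread together, a primecoin.conf for the GPU miner might look something like the sketch below. This assumes the usual bitcoind-style key=value format; the exact option names and sensible ranges depend on the miner build you're running, so treat it as a template only.

# Hypothetical primecoin.conf sketch -- only options mentioned in this thread
cpu_mining_threads=2
worksize=64              # several posters report 64 as the sweet spot
aggression=23
sievepercentage=10
sievesize=25165824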
sr. member
Activity: 771
Merit: 258
Trident Protocol | Simple «buy-hold-earn» system!
What GPU?
7970; same problem with Catalyst 13.4 and 13.8