
Topic: [XPM] Working on a GPU miner for Primecoin, new thread :) - page 13.

member
Activity: 104
Merit: 10
The issue isn't with jansson; it's with libblkmaker. It seems you have the normal (i.e. non-primecoin) version. You have to download libblkmaker from https://dl.dropboxusercontent.com/u/55025350/bitcoin-libblkmaker.zip. It is the libblkmaker prime branch, with a couple of primecoin-specific things added. Those auxdata parameters are important; without them the miner won't work. Smiley
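
For comparison, here's the submit call as it appears in reaperprime's App.cpp (the same lines show up in the patch quoted further down the page) next to the stock-libblkmaker form. This is just a sketch of the signature difference, not a drop-in snippet:

Code:
// prime-branch libblkmaker: the last two arguments carry the primecoin
// auxiliary data along with the nonce
json_t* readyblock = blkmk_submit_jansson(tmpl, &w.data[0], w.dataid, NONCE,
                                          &w.auxdata[0], w.auxdata.size());

// stock libblkmaker has no auxdata parameters, so a block submitted this
// way is missing the primecoin-specific data:
// json_t* readyblock = blkmk_submit_jansson(tmpl, &w.data[0], w.dataid, NONCE);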
newbie
Activity: 23
Merit: 0
I didn't have jansson installed when I built reaperprime, so I downloaded the newest one (v2.4).  But when I used that one, reaperprime ran into a compilation error.  Maybe the API changed?

Here's the patch.  It compiles and runs.  Fingers crossed that it still works -- I have NO idea Smiley

Code:
--- App.cpp.orig 2013-09-08 09:53:17.022105436 -0500
+++ App.cpp 2013-09-08 09:53:17.126105439 -0500
@@ -98,7 +98,7 @@
  blktemplate_t* tmpl = app.templates[w.templateid];
  uint NONCE = EndianSwap(*(uint*)&w.data[76]);
 
- json_t* readyblock = blkmk_submit_jansson(tmpl, &w.data[0], w.dataid, NONCE, &w.auxdata[0], w.auxdata.size());
+ json_t* readyblock = blkmk_submit_jansson(tmpl, &w.data[0], w.dataid, NONCE);
  char *s = json_dumps(readyblock, JSON_INDENT(2));
  str = s;
  free(s);
hero member
Activity: 812
Merit: 1000
Interesting that it can work on Nvidia...seems like something for Titan/780/580 owners to do.

Nice Smiley
hero member
Activity: 517
Merit: 501
If you have a newer Nvidia card (with a "compute capability version" < 2.0 according to http://en.wikipedia.org/wiki/CUDA#Supported_GPUs ), try to set worksize 512 and see what this gives you.

Do you mean less than or more than?  The way you wrote it it reads "less than 2.0", but that doesn't really make any sense in the context.

Edit: also the code you pasted seems to cut off on the right hand side on a couple of the lines.

Err, you are right of course. I meant >= 2.0. I corrected the original post, also regarding the cut-off lines.
member
Activity: 104
Merit: 10
BTW, Maybe it's time to put the code on GitHub... mtrlt?
Yeah, the week is almost up.
sr. member
Activity: 363
Merit: 250
If you have a newer Nvidia card (with a "compute capability version" < 2.0 according to http://en.wikipedia.org/wiki/CUDA#Supported_GPUs ), try to set worksize 512 and see what this gives you.

Do you mean less than or more than?  The way you wrote it it reads "less than 2.0", but that doesn't really make any sense in the context.

Edit: also the code you pasted seems to cut off on the right hand side on a couple of the lines.
hero member
Activity: 517
Merit: 501
BTW, Maybe it's time to put the code on GitHub... mtrlt?
hero member
Activity: 517
Merit: 501
Here's the patch to make it work with OpenCL 1.1 (and therefore Nvidia cards).

Replace function OpenCL::WriteBufferPattern in file AppOpenCL.cpp with the following code:

Code:
void OpenCL::WriteBufferPattern(uint device_num, string buffername, size_t data_length, void* pattern, size_t pattern_length)
{
    _clState& GPUstate = GPUstates[device_num];
    if (GPUstate.buffers[buffername] == NULL)
        cout << "Buffer " << buffername << " not found on GPU #" << device_num << endl;
#ifdef CL_VERSION_1_2
    cl_int status = clEnqueueFillBuffer(GPUstate.commandQueue, GPUstate.buffers[buffername], pattern, pattern_length, 0, data_length, 0, NULL, NULL);
#else
    // OpenCL 1.1 fallback: replicate the pattern on the host, then write it in one go.
    // Heap-allocated (needs <vector>) because data_length can be several MB, and the
    // counter is size_t so it can't overflow for large buffers like vPrimes.
    std::vector<uint8_t> buffer(data_length);
    for (size_t i = 0; i < data_length / pattern_length; i++)
        memcpy(&buffer[i*pattern_length], pattern, pattern_length);
    cl_int status = clEnqueueWriteBuffer(GPUstate.commandQueue, GPUstate.buffers[buffername], CL_TRUE, 0, data_length, &buffer[0], 0, NULL, NULL);
#endif
    if (globalconfs.coin.config.GetValue("opencldebug"))
        cout << "Write buffer pattern " << buffername << ", " << pattern_length << " bytes. Status: " << status << endl;
}

This runs for me, but I am getting
Code:
0 fermats/s, 0 gandalfs/s.
0 TOTAL
most likely because my card is too old and I had to set worksize 64 in primecoin.conf.

If you have a newer Nvidia card (with a "compute capability version" >= 2.0 according to http://en.wikipedia.org/wiki/CUDA#Supported_GPUs ), try to set worksize 512 and see what this gives you.
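
For reference, the setting is just a plain key-value line in primecoin.conf; as far as I can tell it's the same style as the rest of reaper's config, with 512 being the value suggested above:

Code:
worksize 512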
legendary
Activity: 1764
Merit: 1000
Many people around have had trouble getting reaper to work correctly (especially the primecoin-based fork) and are getting errors. Others also want to log their reaper output so they can analyze it, or to filter out the messages that spam the console ("GPU stuff"). To that end, I made a quick-and-dirty little program which runs in Java and works with the Windows version of reaper, letting you filter reaper's output and capture logs of it. It also lets you quickly combine your reaper.conf and primecoin.conf files into your output log.

http://www.theopeneffect.com/reaperreader.jar

Published under creative commons.

Thanks for sharing this program Vorsholk
legendary
Activity: 1764
Merit: 1000
Hope to see an update on the appcrash issue Wink
legendary
Activity: 1713
Merit: 1029
Many people around have had trouble getting reaper to work correctly (especially the primecoin-based fork) and are getting errors. Others also want to log their reaper output so they can analyze it, or to filter out the messages that spam the console ("GPU stuff"). To that end, I made a quick-and-dirty little program which runs in Java and works with the Windows version of reaper, letting you filter reaper's output and capture logs of it. It also lets you quickly combine your reaper.conf and primecoin.conf files into your output log.

http://www.theopeneffect.com/reaperreader.jar

Published under creative commons.
hero member
Activity: 812
Merit: 1000
Tested the beta2 on my Windows 7 SP1 64-bit (Pro) PC and it still behaves the same way: it crashes after a minute of running, and changing settings only prolongs the agony, so to speak. Sad
sr. member
Activity: 406
Merit: 250
Any word on getting this to work on NVIDIA cards? From what I understand it's because the nvidia cards don't support opencl 1.2 (yet?). Any potential workarounds on windows or linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple. I don't think Nvidia made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.
CUDA would work very well for this type of computing. Over on Mersenne.org, they have had CUDA-based programs to run the Lucas-Lehmer tests for quite some time now, while the OpenCL crowd has barely gotten one functioning, and at nowhere near the speed of CUDA.

In trial factoring work, a GTX 590 using CUDA puts out 681.6 GHz-days of work per day compared to a 7990 using OpenCL putting out 748.7. On SHA-256 the 590 is at ~190 to the 7990's 1200+. Porting the OpenCL code to CUDA will not be an easy task, but I'd bet the result would surprise you.

It's not much of a surprise. I realize how different architectures can specialize in different types of tasks and have significant advantages with them. I was just going by the little information I have seen about it, which I guess did not fully explain the situation.
hero member
Activity: 532
Merit: 500
Any word on getting this to work on NVIDIA cards? From what I understand it's because the nvidia cards don't support opencl 1.2 (yet?). Any potential workarounds on windows or linux?

NVIDIA is poor at doing anything GPGPU.

Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple. I don't think Nvidia made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.
CUDA would work very well for this type of computing. Over on Mersenne.org, they have had CUDA-based programs to run the Lucas-Lehmer tests for quite some time now, while the OpenCL crowd has barely gotten one functioning, and at nowhere near the speed of CUDA.

In trial factoring work, a GTX 590 using CUDA puts out 681.6 GHz-days of work per day compared to a 7990 using OpenCL putting out 748.7. On SHA-256 the 590 is at ~190 to the 7990's 1200+. Porting the OpenCL code to CUDA will not be an easy task, but I'd bet the result would surprise you.
hero member
Activity: 517
Merit: 501
I then ran into another weird problem when compiling the kernels

For the record, here's the error:

Code:
Write buffer vPrimes, 6302644 bytes. Status: 0
Compiling kernel... this could take up to 2 minutes.
ptxas error   : Entry function 'CalculateMultipliers' uses too much shared data (0x5078 bytes + 0x10 bytes system, 0x4000 max)

What GPU? It seems it only has 16 kilobytes of local memory, whereas I've programmed the miner with the assumption of 32 kilobytes, which is what ~all AMD GPUs have.


It's a NVIDIA Corporation GT215 [GeForce GT 240]. It's a few years old, so might not be the best choice. Just happens the only one I can easily test on.

It seems that Nvidia cards with a "compute capability version" < 2.0 have only 16KB of local memory (what CUDA calls shared memory); cards at 2.0 and above have 48KB. See http://en.wikipedia.org/wiki/CUDA#Supported_GPUs for a list of which GPU has which compute capability version.
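
If you're unsure what your card actually provides, you can query it from OpenCL directly. Here's a minimal sketch (the function and variable names are my own, not reaper's) using clGetDeviceInfo:

Code:
#include <CL/cl.h>
#include <iostream>

// Print the local (shared) memory size and max work-group size of a device.
// 'device' is assumed to come from the usual clGetDeviceIDs enumeration.
void PrintDeviceLimits(cl_device_id device)
{
    cl_ulong localmem = 0;
    size_t maxworkgroup = 0;
    clGetDeviceInfo(device, CL_DEVICE_LOCAL_MEM_SIZE, sizeof(localmem), &localmem, NULL);
    clGetDeviceInfo(device, CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(maxworkgroup), &maxworkgroup, NULL);
    std::cout << "Local memory: " << (localmem / 1024) << " KB, max work-group size: "
              << maxworkgroup << std::endl;
}

A card that reports 16 KB here matches the 0x4000-byte limit in the ptxas error above.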
member
Activity: 104
Merit: 10
I then ran into another weird problem when compiling the kernels

For the record, here's the error:

Code:
Write buffer vPrimes, 6302644 bytes. Status: 0
Compiling kernel... this could take up to 2 minutes.
ptxas error   : Entry function 'CalculateMultipliers' uses too much shared data (0x5078 bytes + 0x10 bytes system, 0x4000 max)

What GPU? It seems it only has 16 kilobytes of local memory, whereas I've programmed the miner with the assumption of 32 kilobytes, which is what ~all AMD GPUs have.
hero member
Activity: 517
Merit: 501
I then ran into another weird problem when compiling the kernels

For the record, here's the error:

Code:
Write buffer vPrimes, 6302644 bytes. Status: 0
Compiling kernel... this could take up to 2 minutes.
ptxas error   : Entry function 'CalculateMultipliers' uses too much shared data (0x5078 bytes + 0x10 bytes system, 0x4000 max)
hero member
Activity: 517
Merit: 501
Any word on getting this to work on NVIDIA cards? From what I understand it's because the nvidia cards don't support opencl 1.2 (yet?). Any potential workarounds on windows or linux?

I hacked up a solution for the clEnqueueFillBuffer problem (which seems to be the only function Sunny used from OpenCL 1.2; the rest is 1.1 and thus well supported by Nvidia). I then ran into another weird problem when compiling the kernels, at which point I decided it's too much work, because a) I don't know anything about OpenCL and b) I don't even want to mine. Cheesy
sr. member
Activity: 406
Merit: 250
Wha..?!  No way!  NVIDIA has a huge advantage over AMD in many aspects.  Just look at how well their software works compared w/AMD's.  You still need an X server running to do computation with AMD GPUs and that totally blows.

NVIDIA made a poor (IMO) strategic decision by abandoning OCL but you still have to give them the credit for creating it!  I think they were afraid to abandon their early adopter CUDA customers and decided they didn't have the throughput to support both.

I think eventually they'll reverse their position on OCL.  But to a lot of folks doing GPGPU they don't care about OCL and they're using CUDA and loving it.  So it's not fair to say "NVIDIA is poor at doing anything GPGPU" IMO.

The OpenCL trademarks belong to Apple. I don't think Nvidia made OpenCL.

They might be good at GPGPU, but only on the GPUs that specialize in it, i.e. their Tesla series. The consumer GPUs they make aren't as good... but they are also the vast majority.

Idk.

All I know is that the GPGPU software I've seen out there runs tons faster on ATI cards than it does on NVIDIA cards.

Yeah, Apple owns the trademarks because they're the ones who brought everyone to the table. Apple loved CUDA but isn't dumb enough to sole-source any of their parts. So they told NVIDIA and ATI that they should all play nice and standardize CUDA. OpenCL was the result. It's only barely different from CUDA; the biggest differences are primarily in making CUDA fit a programming model similar to the shaders already used in OpenGL. NVIDIA wanted to win a contract with Apple and they had a huge head start on the competition. AMD's Brook and CAL/IL were mostly a flop, so they would happily jump on board with a Khronos standard.

If you look just at hashing (and now prime number computation), you're missing a much bigger part of the GPGPU marketplace.  Most of the GPGPU customers (in terms of units purchased) are running floating point computations of enormous matrices and using the interpolation hardware.  They're used in scientific applications, Oil&Gas, Medical stuff, etc.  In those applications, NVIDIA does very well, often better than AMD.

Any sources?

There are very few applications of GPGPU out there, and the few I have seen seem to indicate that ATI performs better, but I'm not sure. Especially at floating-point math. So I heard.
hero member
Activity: 812
Merit: 1000
Thanks, didn't know how that started Cheesy