Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 477. (Read 5805546 times)

legendary
Activity: 1484
Merit: 1005
okay, i'll just write a python script to restart it again every two hours i guess.

Code:
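import subprocess
import time

# Raw string keeps the Windows backslashes literal.
REAPER = r"C:\Users\my-pc\Desktop\reaper\reaper.exe"

while True:
    print("Starting reaper...")
    p = subprocess.Popen([REAPER])
    time.sleep(7200)  # let it mine for two hours
    print("Terminating reaper...")
    p.terminate()
    p.wait()          # make sure the old process is gone before restarting
    time.sleep(10)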
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
at least make it a multiple of the shaders count, not 8000

okay;

cgminer (thread_concurrency=7168, v=1, w=64, lookup_gap=2, I=13): 230 shares accepted

I spent one hell of a long time tweaking the 7xxx cards (~6 hours) with reaper and found the following:
- there is an optimal thread concurrency equal to approximately 64 * bits_bus_width
- thread concurrencies slightly above or below this give approximately the same results so long as the number is a multiple of 64; 8000 seems optimal for my 7770 while 24000 seems optimal for my 7950s
- too small a thread concurrency results in hardware errors at high intensities, so low intensities of ~13 must be used instead; the lower the thread concurrency, the lower the highest intensity you can use before hardware errors occur
- worksize, vectors, and sharethreads have little impact on performance

i'm really, really leaning towards larger buffer sizes being required for the 7xxx series in order to hash effectively.

I'm going back to mining with reaper now.  I wouldn't bitch about this but reaper seems to suddenly kill the buffer of one of my cards after 12 or so hours and has to be restarted (the memory usage just disappears and the hash rate goes down to 10kh/s), which is a pain in my ass.
Yes I think you are better off with reaper because it just ignores the errors. Sometimes that works, and as you have seen, eventually it fails. I can't afford to have cgminer do that kind of random thing though. Sorry I can't help you any further with scrypt on cgminer.
legendary
Activity: 1484
Merit: 1005
at least make it a multiple of the shaders count, not 8000

okay;

cgminer (thread_concurrency=7168, v=1, w=64, lookup_gap=2, I=13): 230 shares accepted

I spent one hell of a long time tweaking the 7xxx cards (~6 hours) with reaper and found the following:
- there is an optimal thread concurrency equal to approximately 64 * bits_bus_width
- thread concurrencies slightly above or below this give approximately the same results so long as the number is a multiple of 64; 8000 seems optimal for my 7770 while 24000 seems optimal for my 7950s
- too small a thread concurrency results in hardware errors at high intensities, so low intensities of ~13 must be used instead; the lower the thread concurrency, the lower the highest intensity you can use before hardware errors occur
- worksize, vectors, and sharethreads have little impact on performance
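
The rule of thumb in the list above can be turned into a quick calculation. This is only a sketch: the bus widths (128-bit for the 7770, 384-bit for the 7950) are an assumption about those cards rather than something stated in the post, and `suggested_thread_concurrency` is a made-up helper name.

```python
def suggested_thread_concurrency(bus_width_bits, multiple=64):
    """Rule of thumb from the post: roughly 64 * memory bus width in bits."""
    return multiple * bus_width_bits


def is_multiple_of_64(thread_concurrency):
    """The post reports similar results for nearby values that are multiples of 64."""
    return thread_concurrency % 64 == 0


# Assumed bus widths; not taken from the thread.
for card, bus_width in (("7770", 128), ("7950", 384)):
    print(card, suggested_thread_concurrency(bus_width))

# The values actually used in the post are nearby multiples of 64.
print(is_multiple_of_64(8000), is_multiple_of_64(24000))
```

Under those assumed bus widths the rule lands close to the 8000 and 24000 figures quoted above.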

i'm really, really leaning towards larger buffer sizes being required for the 7xxx series in order to hash effectively.

I'm going back to mining with reaper now.  I wouldn't bitch about this but reaper seems to suddenly kill the buffer of one of my cards after 12 or so hours and has to be restarted (the memory usage just disappears and the hash rate goes down to 10kh/s), which is a pain in my ass.
legendary
Activity: 1484
Merit: 1005
Reaper (thread_concurrency=24000, v=1, w=64, lookup_gap=2, I=20): 449 shares accepted
cgminer (thread_concurrency=8000, v=1, w=64, lookup_gap=2, I=13): 232 shares accepted
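
A rough way to judge how much a five-minute share count can be trusted, as a sketch only. It assumes shares arrive as a Poisson process (so a count N has a standard deviation of about sqrt(N)) and that both runs were at the same share difficulty, neither of which is stated in the thread. `share_rate_ratio` is a hypothetical helper.

```python
import math


def share_rate_ratio(shares_a, shares_b):
    """Return (ratio, approximate one-sigma error) for two share counts
    collected over equal time, assuming Poisson-distributed counts."""
    ratio = shares_a / shares_b
    # Relative errors of independent counts add in quadrature.
    rel_err = math.sqrt(1.0 / shares_a + 1.0 / shares_b)
    return ratio, ratio * rel_err


ratio, err = share_rate_ratio(449, 232)
print(f"reaper/cgminer share ratio: {ratio:.2f} +/- {err:.2f}")
```

Note the two runs used different thread concurrency and intensity settings, so the ratio compares those configurations, not just the two programs.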
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Yes it's all the same... as I said the only thing different is it does not check the return codes from the opencl calls. Once again I have to ask you, are you actually getting more shares returned by reaper? Please don't assume that because the hashrate displays higher that that is evidence.

Okay.  I'll run it for exactly 5 minutes with both programs and report back the number of shares I get.  I'm going to have to run cgminer with suboptimal settings (thread_concurrency = 8000, intensity = 13) because otherwise I'll get all hardware errors.
at least make it a multiple of the shaders count, not 8000
hero member
Activity: 988
Merit: 1000
Yes it's all the same... as I said the only thing different is it does not check the return codes from the opencl calls. Once again I have to ask you, are you actually getting more shares returned by reaper? Please don't assume that because the hashrate displays higher that that is evidence.

Okay.  I'll run it for exactly 5 minutes with both programs and report back the number of shares I get.  I'm going to have to run cgminer with suboptimal settings (thread_concurrency = 8000, intensity = 13) because otherwise I'll get all hardware errors.

Use the reported # shares at the pool
legendary
Activity: 1484
Merit: 1005
Yes it's all the same... as I said the only thing different is it does not check the return codes from the opencl calls. Once again I have to ask you, are you actually getting more shares returned by reaper? Please don't assume that because the hashrate displays higher that that is evidence.

Okay.  I'll run it for exactly 5 minutes with both programs and report back the number of shares I get.  I'm going to have to run cgminer with suboptimal settings (thread_concurrency = 8000, intensity = 13) because otherwise I'll get all hardware errors.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
That has nothing to do with what I said. It's not like I'm making the number up, it reads it back from the device. It does NOT mean the amount of memory the device has. You have read the docs correctly and that is the value reported back for that max alloc size by your device.

Okay, I had a look at reaper's source code to see if something different is being done then.  The initialization is almost the same, however reaper first declares padbuffer8 using this command:
Code:
cl_mem padbuffer8
Other than that it's almost verbatim.  Does the usage of a memory object allow you to override the limitations imposed by the buffer size?
Also, why does my 7950 have the same buffer size restrictions as my 7770?

There are also a number of calls to clSetKernelArg in reaper that I'm not sure what they're doing (or if they're already included elsewhere in cgminer).

Code:
./ocl.h:        cl_mem padbuffer8;

Yes it's all the same... as I said the only thing different is it does not check the return codes from the opencl calls. Once again I have to ask you, are you actually getting more shares returned by reaper? Please don't assume that because the hashrate displays higher that that is evidence.
legendary
Activity: 1484
Merit: 1005
That has nothing to do with what I said. It's not like I'm making the number up, it reads it back from the device. It does NOT mean the amount of memory the device has. You have read the docs correctly and that is the value reported back for that max alloc size by your device.

Okay, I had a look at reaper's source code to see if something different is being done then.  The initialization is almost the same, however reaper first declares padbuffer8 using this command:
Code:
cl_mem padbuffer8
Other than that it's almost verbatim.  Does the usage of a memory object allow you to override the limitations imposed by the buffer size?
Also, why does my 7950 have the same buffer size restrictions as my 7770?

There are also a number of calls to clSetKernelArg in reaper that I'm not sure what they're doing (or if they're already included elsewhere in cgminer).
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
if you could, it'd be extremely helpful to add a flag to the program to simply ignore the memory size checking and create the pad on the GPU anyway (for some reason, cgminer is only reporting that my video cards have 512mb of memory when they have 3gb.  I don't know why this is)

It will set it to whatever you choose regardless if it can instead of what it detects as the maximum.

The reported size is just the reported max size allocatable by opencl, it is NOT the gpu ramsize. I already said that it will try to set it to what you set it to. It fails, and we are back to my original response - I have no idea why it fails, but that is the response to the command asking for that much memory.

Is that something you need to get and store with clGetDeviceInfo?  Are you sure that that's not just the max default allocatable size for OpenCL?

From http://www.khronos.org/registry/cl/specs/opencl-1.x-latest.pdf#page=52 :
Quote
CL_INVALID_BUFFER_SIZE returned if size is 0 or is greater than CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in table 4.3 for all devices in context.

CL_DEVICE_MAX_MEM_ALLOC_SIZE specifications
Quote
CL_DEVICE_MAX_MEM_ALLOC_SIZE (cl_ulong) Max size of memory object allocation in bytes.  The minimum value is max (1/4th of CL_DEVICE_GLOBAL_MEM_SIZE, 128*1024*1024)
ulong is a huge integer, it should be able to be set higher than 512MB
That has nothing to do with what I said. It's not like I'm making the number up, it reads it back from the device. It does NOT mean the amount of memory the device has. You have read the docs correctly and that is the value reported back for that max alloc size by your device.
legendary
Activity: 1484
Merit: 1005
if you could, it'd be extremely helpful to add a flag to the program to simply ignore the memory size checking and create the pad on the GPU anyway (for some reason, cgminer is only reporting that my video cards have 512mb of memory when they have 3gb.  I don't know why this is)

It will set it to whatever you choose regardless if it can instead of what it detects as the maximum.

The reported size is just the reported max size allocatable by opencl, it is NOT the gpu ramsize. I already said that it will try to set it to what you set it to. It fails, and we are back to my original response - I have no idea why it fails, but that is the response to the command asking for that much memory.

Is that something you need to get and store with clGetDeviceInfo?  Are you sure that that's not just the max default allocatable size for OpenCL?

From http://www.khronos.org/registry/cl/specs/opencl-1.x-latest.pdf#page=52 :
Quote
CL_INVALID_BUFFER_SIZE returned if size is 0 or is greater than CL_DEVICE_MAX_MEM_ALLOC_SIZE value specified in table 4.3 for all devices in context.

CL_DEVICE_MAX_MEM_ALLOC_SIZE specifications
Quote
CL_DEVICE_MAX_MEM_ALLOC_SIZE (cl_ulong) Max size of memory object allocation in bytes.  The minimum value is max (1/4th of CL_DEVICE_GLOBAL_MEM_SIZE, 128*1024*1024)
ulong is a huge integer, it should be able to be set higher than 512MB
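
For what it's worth, the quoted spec text only gives a floor for CL_DEVICE_MAX_MEM_ALLOC_SIZE; cl_ulong is just the type the value is reported in, not a range you can set it within. A small sketch of what that floor implies, using hypothetical helper names. The inference in the last line holds only if the driver follows the quoted rule, and it is not confirmed anywhere in this thread.

```python
MIB = 1024 * 1024


def min_max_alloc(global_mem_bytes):
    """Smallest CL_DEVICE_MAX_MEM_ALLOC_SIZE the quoted spec text allows:
    max(1/4 of CL_DEVICE_GLOBAL_MEM_SIZE, 128 MiB)."""
    return max(global_mem_bytes // 4, 128 * MIB)


def max_global_mem_for(max_alloc_bytes):
    """Largest global memory size consistent with a reported max alloc size,
    since max_alloc >= global_mem / 4 means global_mem <= 4 * max_alloc."""
    return 4 * max_alloc_bytes


# If OpenCL saw the full 3 GiB, the floor would be 768 MiB.
print(min_max_alloc(3 * 1024 * MIB) // MIB, "MiB")

# A reported 512 MiB limit is consistent with at most 2 GiB visible to OpenCL.
print(max_global_mem_for(512 * MIB) // MIB, "MiB")
```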
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
It keeps hashing for up to 2 minutes before it can tell the pool is down. How long did you leave it for?

Over 30 minutes? There's something broken in it, really. At this stage I cannot recommend people to update, because when I restart the pool (update or whatever), cgminer will freeze forever :(
I hate sockets. They never do what you expect. Ok well it was always going to be a rough introduction.
legendary
Activity: 1386
Merit: 1097
It keeps hashing for up to 2 minutes before it can tell the pool is down. How long did you leave it for?

Over 30 minutes? There's something broken in it, really. At this stage I cannot recommend people to update, because when I restart the pool (update or whatever), cgminer will freeze forever :(
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I can confirm the bug, cgminer doesn't detect connection failure.
It keeps hashing for up to 2 minutes before it can tell the pool is down. How long did you leave it for?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
can't compile with opencl support under windows!
Result after
Code:
CFLAGS="-O2 -msse2" ./configure
Code:
Configuration Options Summary:

  curses.TUI...........: FOUND: pdcurses
  OpenCL...............: NOT FOUND. GPU mining support DISABLED
configure: error: No mining configured in
folder ADL_SDK contains:
adl_defines.h
adl_sdk.h
adl_structures.h
Last time this happened it was a packaging error on my part. I'll reupload shortly. I've reuploaded it.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
if you could, it'd be extremely helpful to add a flag to the program to simply ignore the memory size checking and create the pad on the GPU anyway (for some reason, cgminer is only reporting that my video cards have 512mb of memory when they have 3gb.  I don't know why this is)

It will set it to whatever you choose regardless if it can instead of what it detects as the maximum.

The reported size is just the reported max size allocatable by opencl, it is NOT the gpu ramsize. I already said that it will try to set it to what you set it to. It fails, and we are back to my original response - I have no idea why it fails, but that is the response to the command asking for that much memory.
member
Activity: 125
Merit: 10
that is for ADL support, now go back and re-read the windows-build.txt file
But it all worked when compiling cgminer 2.7.5.
I have APP SDK 2.5.
Do I need the latest APP SDK 2.7?
legendary
Activity: 1484
Merit: 1005
Are high HW values normal? Utility is also very low!
I haven't yet found correct LTC settings for 79xx cards. Changing to BTC ;)
yeah, i realized it after posting. still broken.
hero member
Activity: 607
Merit: 500
Are high HW values normal? Utility is also very low!
I haven't yet found correct LTC settings for 79xx cards. Changing to BTC ;)
legendary
Activity: 1484
Merit: 1005
You should be able to easily replicate this just by setting --thread-concurrency 12288 (which works fine on reaper).

I'm pretty sure the problem has to do with these lines,
Code:
clState->padbufsize = bufsize;
clState->padbuffer8 = clCreateBuffer(clState->context, CL_MEM_READ_WRITE, bufsize, NULL, &status);

For whatever reason your program is calculating 0 for the bufsize.  You should be able to step through this with a debugger and figure it out pretty easily I would presume.
I'm unable to reproduce this anywhere. Can you give me your whole command line minus any account details?

--scrypt -I 20 -g 1 -v 1 -w 256 --shaders 1792 --thread-concurrency 12288 or 24000

Off the top of my head

It's been a noted bug in the windows version since the tittiez beta builds
Now that is just bizarre. I tried it even on a windows machine and it didn't give me zero...

EDIT: Nm, can now reproduce.
Okay I've done quite a bit of investigation around this "0" displayed issue. Ironically, that is a display bug in windows. The buffer size is actually being worked out to something like 1.5 billion, and if it's put on a separate line you can see that bufsize is not zero (I'll do it in the next version). However this does not fix your original complaint that you can't set very high thread concurrency counts like you could on reaper. But you've reminded me of what happened when I investigated this originally.
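
To put numbers on the buffer sizes being discussed, here is a sketch of the arithmetic. The formula is an assumption (recalled from cgminer's scrypt setup code, not quoted in this thread), though it does give about 1.5 billion bytes for a thread concurrency of 24000 with lookup_gap=2. `scrypt_padbuffer_bytes` is a hypothetical name, and 512 MiB is the max alloc size reported earlier in the thread.

```python
MIB = 1024 * 1024


def scrypt_padbuffer_bytes(thread_concurrency, lookup_gap=2):
    # Assumed formula: 128 bytes * ceil(1024 / lookup_gap) entries per thread.
    entries = 1024 // lookup_gap + (1 if 1024 % lookup_gap else 0)
    return 128 * entries * thread_concurrency


MAX_ALLOC = 512 * MIB  # limit the cards report, per the posts above

for tc in (7168, 8000, 12288, 24000):
    size = scrypt_padbuffer_bytes(tc)
    verdict = "fits" if size <= MAX_ALLOC else "exceeds max alloc"
    print(tc, size, verdict)
```

If the assumed formula is right, 8000 is just under the 512 MiB limit while 12288 and 24000 are well over it, which would match the failures described here.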

There are a number of problems with the way reaper uses the padbuffer there. Firstly it is reused between threads which means that if you set multiple threads per device they fight over it and can trash the data in the buffer. That's not a huge problem with reaper because its threading is pretty primitive, unlike cgminer which is heavily multithreaded. However, the main problem is that there is NO error checking on setting values to run the opencl commands. If it returns invalid values, reaper just does it again, and assumes the hashes have been done. So it intermittently works, and intermittently just counts up a number of hashes that never happened. So what happens is you get a displayed hash rate that is really high that does not translate into a proportional rise in number of shares generated.
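
A toy illustration of the point about unchecked return codes, not either miner's actual code. `run_kernel` is a hypothetical stand-in for an OpenCL call that fails on every other invocation, and the hashes-per-call figure is made up.

```python
HASHES_PER_CALL = 1000  # made-up figure for illustration


def run_kernel(call_index):
    """Hypothetical stand-in for an OpenCL enqueue: every other call fails."""
    return call_index % 2 == 0  # True means the call succeeded


def count_hashes(calls, check_status):
    total = 0
    for i in range(calls):
        ok = run_kernel(i)
        if check_status and not ok:
            continue  # a failed call did no work, so don't count it
        total += HASHES_PER_CALL
    return total


print("unchecked:", count_hashes(10, check_status=False))
print("checked:  ", count_hashes(10, check_status=True))
```

The unchecked counter reports twice as many hashes as were actually computed, while the number of shares found would track the checked figure.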

Summary: I implore you to compare the best share generation rate of reaper to cgminer rather than the displayed hashrate.

I only go by pool hashrate.  Reaper pulls 550kh/s per card versus 460kh/s per card with cgminer.  That's a 20% improvement.  Reaper gets ~3% stales which is more or less the same as compared with cgminer.

edit: tried these settings
Code:
--scrypt --thread-concurrency 8000 --shaders 1792 -I 20 -g 1 -v 1 -w 64


Okay.  The problem is the same as the one with reaper, in that when thread_concurrency is too low 7xxx cards will only yield hardware errors at intensity = 20.  to fix this, you need to use more memory.  if you could, it'd be extremely helpful to add a flag to the program to simply ignore the memory size checking and create the pad on the GPU anyway (for some reason, cgminer is only reporting that my video cards have 512mb of memory when they have 3gb.  I don't know why this is)



basically, i'm pretty sure the problem is that cgminer is calculating the memory size available on the card incorrectly.