
Topic: cgminer - CPU/GPU miner in C for linux/windows - page 8. (Read 81916 times)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

I put some effort into minimising the risk of rejects and CL errors, and into not missing shares that arrive close to each other. I did this by creating an array for the buffer variables passed to and from the GPU, making it extremely unlikely for a race to occur over the same slot in the array. The entire array is then scanned whenever a match is flagged, but in a separate thread so as not to delay further work being passed to the GPU. This change should allow you to use higher intensity values without increasing the reject or error rate.

In the interim I discovered a nice bug whereby there was a chance that the struct holding the thread id had its memory freed before an attempt was made to detach the thread with pthread_detach, which would lead to a segfault.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)

You can't build a truly static build of something that sends or receives network packets, sorry.
member
Activity: 63
Merit: 10
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)

I have the same problem with static linking. Even tried to compile curl from curl.haxx.se manually. But still no luck.
member
Activity: 63
Merit: 10
Recent git version works fine on nvidia. Thanks! Good job!

One more thing. I do not exactly understand how bitcoin generation works but I have noticed the following thing.

oclHashCat-lite (http://www.hashcat.net/oclhashcat-lite/) gives me about ~290 MH/s for SHA-256 password brute-forcing, but gpumine only ~120 MH/s. Are there any differences between SHA-256 password brute-forcing and the bitcoin mining process?
Maybe it is possible to improve gpumine using kernels from hashcat?

here is the way I start hashcat:

Code:
./cudaHashcat-lite64.bin -m 1400 762d689acf34b57c52be4fad090626d4f44d3cfd83bbd2cceb4526bd95c54551
...
Hash.Type....: SHA256
Speed........:  292.6M/s

Of course I've tried playing with gpumine parameters like threads, intensity, worksize and vectors, but I was unable to increase the mining speed significantly.
newbie
Activity: 22
Merit: 0
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)
member
Activity: 80
Merit: 10
Any ideas how to compile that static?

Code:
CFLAGS="-O3 -static -Wall -msse2 -I/usr/include/nvidia-current/" ./configure
This gives me the following error:
Code:
checking for curl-config... /usr/bin/curl-config
checking for the version of libcurl... 7.21.3
checking for libcurl >= version 7.10.1... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.10.1

Code:
# apt-cache policy libcurl4-gnutls-dev
libcurl4-gnutls-dev:
  Installed: 7.21.3-1ubuntu1.2

Try sudo apt-get install libcurl4-openssl-dev

I had this issue yesterday and it worked with that package installed, rather than the other libcurl4 variants.
newbie
Activity: 42
Merit: 0
The last version with the poclbm kernel and intensity 3 gives me 25.1 MH/s - already very close to the baseline 29.5 MH/s of other miners! Higher intensity doesn't increase the hash rate much but slows down the desktop a lot.
newbie
Activity: 22
Merit: 0
Any ideas how to compile that static?

Code:
CFLAGS="-O3 -static -Wall -msse2 -I/usr/include/nvidia-current/" ./configure
This gives me the following error:
Code:
checking for curl-config... /usr/bin/curl-config
checking for the version of libcurl... 7.21.3
checking for libcurl >= version 7.10.1... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.10.1

Code:
# apt-cache policy libcurl4-gnutls-dev
libcurl4-gnutls-dev:
  Installed: 7.21.3-1ubuntu1.2
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
clGetDeviceInfo() tends to return BS sometimes :) Anyway, I don't think using the maximum allowed worksize is optimal, as you are resource-constrained anyway. The bad thing is that it is hard to determine the optimum without experimenting with the workgroup size (starting from 32 on nvidia, all the way up to 512, in multiples of 32).

No, of course not. I use max work size / vectors. That works surprisingly well as a default starting setting when none is chosen :)

So, is anyone actually finding this client useful? It's getting quite mature now, but apart from Burp's feedback I don't really get a sense that anyone is. I see a huge improvement in throughput from it at intensity levels that don't affect my desktop.
sr. member
Activity: 256
Merit: 250
clGetDeviceInfo() tends to return BS sometimes :) Anyway, I don't think using the maximum allowed worksize is optimal, as you are resource-constrained anyway. The bad thing is that it is hard to determine the optimum without experimenting with the workgroup size (starting from 32 on nvidia, all the way up to 512, in multiples of 32).
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree

I've modified the log to show only the summary, and not the testing information, unless in debug mode. There are now counters stored recording which gpu or cpu found each share, and hw errors are stored as well. The added information can be used to decide whether to turn down intensity or to overclock less.

The output looks like this now:

[2011-06-29 10:46:19] GPU: 0 Accepted: 100 Rejected: 4 HW errors: 0
[2011-06-29 10:46:24] [230.23 | 218.86 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:29] [227.39 | 218.88 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:34] [218.19 | 218.88 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:40] [239.39 | 218.94 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:45] [230.92 | 218.97 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:45] GPU: 0 Accepted: 101 Rejected: 4 HW errors: 0

Also, I've updated the code to not allow automatically setting work sizes greater than 512, as a simple way of preventing the nvidia bug mentioned earlier.

EDIT: I've also made the first rate reported (the log-interval one) a decaying average so it doesn't jump around as much.
member
Activity: 63
Merit: 10
Thanks and thanks. I wondered why they returned 1024. Looks like more phayl from nvidia with opencl :(

Well, I understand that nvidia is not the best hardware for mining, but could I somehow help resolve this bug anyway? Maybe some additional information or debug data is needed?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks and thanks. I wondered why they returned 1024. Looks like more phayl from nvidia with opencl :(
member
Activity: 63
Merit: 10
...
To see what I'm doing with linux kernel, check out http//ck-hack.blogspot.com

Please add a : after //, broken link :)
sr. member
Activity: 256
Merit: 250
Well then, welcome to the wonderful world of experimental programming and cursing AMD and Nvidia :)

It's a lot of fun though :) GPGPU stuff is among the most interesting things I've gotten into, for sure.
member
Activity: 63
Merit: 10
There is a problem with worksize autodetection on an nvidia gtx 570. By default it is autodetected as 1024, but that leads to a segfault:

Code:
./minerd --userpass xxx:xxx --url http://xxxxxx:8332/
[2011-06-29 02:51:01] Init GPU thread 0
[2011-06-29 02:51:01] List of devices:
[2011-06-29 02:51:01]   0       GeForce GTX 570
[2011-06-29 02:51:01] Selected 0: GeForce GTX 570
[2011-06-29 02:51:18] Initialising kernel poclbm.cl without BFI_INT patching, 1 vectors and worksize 1024
[2011-06-29 02:51:18] initCl() finished. Found GeForce GTX 570
[2011-06-29 02:51:18] Init GPU thread 1
[2011-06-29 02:51:18] List of devices:
[2011-06-29 02:51:18]   0       GeForce GTX 570
[2011-06-29 02:51:18] Selected 0: GeForce GTX 570
[2011-06-29 02:51:18] Long-polling activated for http://xxxxx:8332/LP
Segmentation fault

If I set worksize 512 or less it works fine:

Code:
./minerd --userpass xxx:xxx --url http://xxxx:8332/ --worksize 512
[2011-06-29 02:53:56] Init GPU thread 0
[2011-06-29 02:53:56] List of devices:
[2011-06-29 02:53:56]   0       GeForce GTX 570
[2011-06-29 02:53:56] Selected 0: GeForce GTX 570
[2011-06-29 02:54:12] Initialising kernel poclbm.cl without BFI_INT patching, 1 vectors and worksize 512
[2011-06-29 02:54:12] initCl() finished. Found GeForce GTX 570
[2011-06-29 02:54:12] Init GPU thread 1
[2011-06-29 02:54:12] List of devices:
[2011-06-29 02:54:12]   0       GeForce GTX 570
[2011-06-29 02:54:12] Selected 0: GeForce GTX 570
[2011-06-29 02:54:12] Long-polling activated for http://xxxxxx:8332/LP
[2011-06-29 02:54:13] [3.27 | 3.27 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:18] [70.47 | 18.51 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:23] [89.52 | 31.64 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:28] [69.74 | 37.58 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:33] [69.03 | 41.83 Mhash/s] [0 Accepted] [0 Rejected]
...
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I thought you quit kernel hacking. I compiled some of your kernels a while ago on my desktop :) I had no idea you were into bitcoin stuff and OpenCL. Nice :)
Actually, I'm very new to opencl and bitcoin. I just started a week ago and had to learn all about opencl. I've put in over a hundred hours on this code already to get up to speed :P

To see what I'm doing with linux kernel, check out http://ck-hack.blogspot.com
sr. member
Activity: 256
Merit: 250
I thought you quit kernel hacking. I compiled some of your kernels a while ago on my desktop :) I had no idea you were into bitcoin stuff and OpenCL. Nice :)
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Sorry for the rude OT question, but was that you who maintained the -ck tree? :)
Not rude at all. Yes, it's me, and I still do :)
member
Activity: 98
Merit: 10
OK, it looks very good for me with 1 thread per gpu, intensity 10, and worksize 256. I get 619 MH/s in total, meaning ~309 MH/s per card. The rejection rate is at a normal level. For me it seems beneficial to increase intensity and worksize rather than using 2 gpu threads (which leads to more rejections for me).

EDIT: The rejection rate for now is higher than with poclbm (equal settings); minerd so far: 10/220 ~ 4.5%, poclbm: 41/2752 ~ 1.5%
EDIT2: Better results with intensity 8: it gives me "just" 617 MH/s in total, but no rejections in 100 accepted shares so far. Probably the perfect settings for me.