
Topic: cgminer - CPU/GPU miner in C for linux/windows - page 8. (Read 81916 times)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:

I put some effort into minimising the risk of rejects and CL errors, and into not missing shares that arrive close to each other. I did this by creating an array for the buffer variables passed to and from the GPU, making it extremely unlikely for a race to occur over the same slot in the array. The entire array is then scanned whenever a match is flagged, but in a separate thread so as not to delay further work being passed to the GPU. This change should allow you to use higher intensity values without increasing the reject or error rate.

In the interim I discovered a nice bug whereby there was a chance that the struct holding the thread id had its memory freed before an attempt was made to detach the thread with pthread_detach, which would lead to a segfault.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)

You can't build a truly static build of something that sends or receives network packets, sorry.
member
Activity: 63
Merit: 10
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)

I have the same problem with static linking. Even tried to compile curl from curl.haxx.se manually. But still no luck.
member
Activity: 63
Merit: 10
Recent git version works fine on nvidia. Thanks! Good job!

One more thing. I do not exactly understand how bitcoin generation works but I have noticed the following thing.

oclHashCat-lite (http://www.hashcat.net/oclhashcat-lite/) gives me about ~290 MH/s for SHA-256 password brute-forcing, but gpumine only ~120 MH/s. Are there any differences between SHA-256 password brute-forcing and the bitcoin mining process?
Maybe it is possible to improve gpumine using kernels from hashcat?

here is the way I start hashcat:

Code:
./cudaHashcat-lite64.bin -m 1400 762d689acf34b57c52be4fad090626d4f44d3cfd83bbd2cceb4526bd95c54551
...
Hash.Type....: SHA256
Speed........:  292.6M/s

Of course I've tried playing with gpumine parameters like threads, intensity, worksize and vectors, but I was unable to increase the mining speed significantly.
newbie
Activity: 22
Merit: 0
I tried with openssl-dev, nss-dev and also with curl from http://curl.haxx.se/download.html.
Everything works when I build minerd dynamically, but the static build doesn't want to be so polite :)
member
Activity: 80
Merit: 10
Any ideas how to compile that static?

Code:
CFLAGS="-O3 -static -Wall -msse2 -I/usr/include/nvidia-current/" ./configure
This gives me the following error:
Code:
checking for curl-config... /usr/bin/curl-config
checking for the version of libcurl... 7.21.3
checking for libcurl >= version 7.10.1... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.10.1

Code:
# apt-cache policy libcurl4-gnutls-dev
libcurl4-gnutls-dev:
  Installed: 7.21.3-1ubuntu1.2

Try sudo apt-get install libcurl4-openssl-dev

I had this issue yesterday and it worked with that package installed, rather than the other libcurl4 variants.
newbie
Activity: 42
Merit: 0
The last version with the poclbm kernel and intensity 3 gives me 25.1 MH/s - already very close to the baseline 29.5 MH/s of other miners! Higher intensity doesn't increase the hash rate much but slows down the desktop a lot.
newbie
Activity: 22
Merit: 0
Any ideas how to compile that static?

Code:
CFLAGS="-O3 -static -Wall -msse2 -I/usr/include/nvidia-current/" ./configure
This gives me the following error:
Code:
checking for curl-config... /usr/bin/curl-config
checking for the version of libcurl... 7.21.3
checking for libcurl >= version 7.10.1... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.10.1

Code:
# apt-cache policy libcurl4-gnutls-dev
libcurl4-gnutls-dev:
  Installed: 7.21.3-1ubuntu1.2
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
clGetDeviceInfo() tends to return BS sometimes :) Anyway, I don't think using the maximum allowed worksize is optimal, as you are resource-constrained anyway. The bad thing is that it is hard to determine the optimum without experimenting with the workgroup size (starting from 32 on nvidia, all the way up to 512, in multiples of 32).

No, of course not. I use max work size / vectors. That works surprisingly well as a default starting setting when none is chosen :)

So, is anyone actually finding this client useful? It's getting quite mature now, but apart from Burp's feedback I don't really get a sense that anyone is. I see a huge improvement in throughput from it at intensity levels that don't affect my desktop.
sr. member
Activity: 256
Merit: 250
clGetDeviceInfo() tends to return BS sometimes :) Anyway, I don't think using the maximum allowed worksize is optimal, as you are resource-constrained anyway. The bad thing is that it is hard to determine the optimum without experimenting with the workgroup size (starting from 32 on nvidia, all the way up to 512, in multiples of 32).
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree

I've modified the log to show only the summary, and not the testing information, unless in debug mode. There are now counters stored recording which gpu or cpu found each share, and hw errors are stored as well. The added information can be used to decide whether to turn down intensity or to overclock less.

The output looks like this now:

[2011-06-29 10:46:19] GPU: 0 Accepted: 100 Rejected: 4 HW errors: 0
[2011-06-29 10:46:24] [230.23 | 218.86 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:29] [227.39 | 218.88 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:34] [218.19 | 218.88 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:40] [239.39 | 218.94 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:45] [230.92 | 218.97 Mhash/s] [105 Accepted] [4 Rejected] [0 HW errors]
[2011-06-29 10:46:45] GPU: 0 Accepted: 101 Rejected: 4 HW errors: 0

Also, I've updated the code to not allow automatically setting work sizes greater than 512, as a simple way of preventing the nvidia bug mentioned earlier.

EDIT: I've also made the first rate reported (the log-interval one) a decaying average so it doesn't jump around as much.
member
Activity: 63
Merit: 10
Thanks and thanks. I wondered why they returned 1024. Looks like more phayl from nvidia with opencl :(

Well, I understand that nvidia is not the best hardware for mining, but could I somehow help resolve this bug anyway? Maybe some additional information or debug data is needed?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks and thanks. I wondered why they returned 1024. Looks like more phayl from nvidia with opencl :(
member
Activity: 63
Merit: 10
...
To see what I'm doing with linux kernel, check out http//ck-hack.blogspot.com

Please add a : after //, broken link :)
sr. member
Activity: 256
Merit: 250
Well then, welcome to the wonderful world of experimental programming and cursing AMD and Nvidia :)

It's a lot of fun though :) GPGPU stuff is among the most interesting things I've gotten into, for sure.
member
Activity: 63
Merit: 10
There is a problem with worksize autodetection on an nvidia gtx 570. By default it is autodetected as 1024, but that leads to a segfault:

Code:
./minerd --userpass xxx:xxx --url http://xxxxxx:8332/
[2011-06-29 02:51:01] Init GPU thread 0
[2011-06-29 02:51:01] List of devices:
[2011-06-29 02:51:01]   0       GeForce GTX 570
[2011-06-29 02:51:01] Selected 0: GeForce GTX 570
[2011-06-29 02:51:18] Initialising kernel poclbm.cl without BFI_INT patching, 1 vectors and worksize 1024
[2011-06-29 02:51:18] initCl() finished. Found GeForce GTX 570
[2011-06-29 02:51:18] Init GPU thread 1
[2011-06-29 02:51:18] List of devices:
[2011-06-29 02:51:18]   0       GeForce GTX 570
[2011-06-29 02:51:18] Selected 0: GeForce GTX 570
[2011-06-29 02:51:18] Long-polling activated for http://xxxxx:8332/LP
Segmentation fault

If I set worksize 512 or less it works fine:

Code:
./minerd --userpass xxx:xxx --url http://xxxx:8332/ --worksize 512
[2011-06-29 02:53:56] Init GPU thread 0
[2011-06-29 02:53:56] List of devices:
[2011-06-29 02:53:56]   0       GeForce GTX 570
[2011-06-29 02:53:56] Selected 0: GeForce GTX 570
[2011-06-29 02:54:12] Initialising kernel poclbm.cl without BFI_INT patching, 1 vectors and worksize 512
[2011-06-29 02:54:12] initCl() finished. Found GeForce GTX 570
[2011-06-29 02:54:12] Init GPU thread 1
[2011-06-29 02:54:12] List of devices:
[2011-06-29 02:54:12]   0       GeForce GTX 570
[2011-06-29 02:54:12] Selected 0: GeForce GTX 570
[2011-06-29 02:54:12] Long-polling activated for http://xxxxxx:8332/LP
[2011-06-29 02:54:13] [3.27 | 3.27 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:18] [70.47 | 18.51 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:23] [89.52 | 31.64 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:28] [69.74 | 37.58 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-29 02:54:33] [69.03 | 41.83 Mhash/s] [0 Accepted] [0 Rejected]
...
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I thought you quit kernel hacking. I compiled some of your kernels a while ago on my desktop :) I had no idea you were into bitcoin stuff and OpenCL. Nice :)
Actually, I'm very new to opencl and bitcoin. I just started a week ago and had to learn all about opencl. I've put in over a hundred hours on this code already to get up to speed :P

To see what I'm doing with linux kernel, check out http://ck-hack.blogspot.com
sr. member
Activity: 256
Merit: 250
I thought you quit kernel hacking. I compiled some of your kernels a while ago on my desktop :) I had no idea you were into bitcoin stuff and OpenCL. Nice :)
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Sorry for the rude OT question, but was that you who maintained the -ck tree? :)
Not rude at all. Yes, it's me, and I still do :)
member
Activity: 98
Merit: 10
OK, it looks very good for me with 1 thread per gpu, intensity 10, and worksize 256. I get 619 MH/s in total, meaning ~309 MH/s per card. The rejection rate is at a normal level. For me it seems beneficial to increase intensity and worksize rather than using 2 gpu threads (which leads to more rejections for me).

EDIT: The rejection rate for now is higher than with poclbm (equal settings); minerd so far: 10/220 ~ 4.5%, poclbm: 41/2752 ~ 1.5%
EDIT2: Better results with intensity 8: it gives me "just" 617 MH/s in total, but no rejections in 100 accepted shares so far. Probably the perfect settings for me.