Topic: cgminer - CPU/GPU miner in C for linux/windows - page 11.

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Per-GPU optimisations and bugfixes.

It's still segfaulting for me deep in the ati opencl code during the setup of the second card in a three card system.


Without doubt I'm still not setting up multiple cards properly yet. Thanks for testing.
staff
Activity: 4326
Merit: 8951
Per-GPU optimisations and bugfixes.

It's still segfaulting for me deep in the ati opencl code during the setup of the second card in a three card system.

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Further to that thought, I've committed a change to the tree which should prevent 32 bit overflows.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks for the quick updates.

With the latest version the hashrate is still not optimal on my GPU, and it still drops to zero periodically when CPU threads are disabled:
Quote
...
[2011-06-24 09:06:39] [17.43 | 16.47 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:44] [17.41 | 16.49 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:49] [17.43 | 0.02 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:54] [17.42 | 0.35 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:59] [17.22 | 0.67 Mhash/s] [0 Accepted] [0 Rejected]
...


Ah. Question: When it drops to zero, do you ever see it still find blocks despite it reading zero? It may just be a 32 bit overflow because I just remembered you're on 32 bits.
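
For what it's worth, the timing fits a wrapped counter. Below is a minimal sketch in C, assuming the miner keeps a running total of hashes; the helper names `mhash_rate` and `wrapped_total` are made up for illustration and are not taken from the tree. A 32-bit total wraps after 2^32 hashes, which at about 17 Mhash/s is 4294967296 / 17000000, or roughly 253 seconds, a little over four minutes into a run.

```c
#include <stdint.h>

/* Hypothetical helper: hash rate in Mhash/s from a 64-bit running total. */
double mhash_rate(uint64_t total_hashes, double elapsed_secs)
{
    if (elapsed_secs <= 0.0)
        return 0.0;
    return (double)total_hashes / elapsed_secs / 1e6;
}

/* Hypothetical helper: what a 32-bit counter would hold after the same
 * amount of work. The conversion keeps only the low 32 bits, so the
 * stored total falls back towards zero once 2^32 hashes have been done. */
uint32_t wrapped_total(uint64_t total_hashes)
{
    return (uint32_t)total_hashes;
}
```

Keeping the total in a `uint64_t` avoids the wrap regardless of the platform's native word size.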
newbie
Activity: 42
Merit: 0
Thanks for the quick updates.

With the latest version the hashrate is still not optimal on my GPU, and it still drops to zero periodically when CPU threads are disabled:
Quote
...
[2011-06-24 09:06:39] [17.43 | 16.47 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:44] [17.41 | 16.49 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:49] [17.43 | 0.02 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:54] [17.42 | 0.35 Mhash/s] [0 Accepted] [0 Rejected]
[2011-06-24 09:06:59] [17.22 | 0.67 Mhash/s] [0 Accepted] [0 Rejected]
...
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Updated tree:
Ensure the GPU doesn't keep working on blocks longer than opt_scantime.

This makes for far fewer false blocks on slower GPUs.
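
The idea can be sketched as a simple age check on the current work item. This is only an illustration: `opt_scantime` is the name used above, but treating it as a number of seconds, and the helper name `work_is_stale`, are assumptions rather than the actual code in the tree.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper: returns true once a work item has been in use for
 * at least scantime_secs seconds, so the caller can fetch fresh work
 * instead of continuing to hash an old block. */
bool work_is_stale(time_t work_start, time_t now, int scantime_secs)
{
    return difftime(now, work_start) >= (double)scantime_secs;
}
```

A GPU thread would call something like this between kernel runs and request new work as soon as it returns true.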

I've also limited the max --intensity variable to 10, as higher values returned garbage from the GPU.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Per-GPU optimisations and bugfixes.

I've updated the code now to test every card for its ideal settings and it runs a kernel suitable for each card. This allows you to have different GPUs now and have them all work to their best.

As for the multiple threads question about Diablo's miner, the worker thread that hands out work to the GPU in minerd works asynchronously with very low overhead, so it can keep the GPU busy with just the one thread. Ultimately, the lower overhead of this approach and the lack of workload switching on the GPU should make it perform better.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Hmm yeah I still haven't figured out what the bug is there, but I'm still working on it.

Just for the record, this miner is now faster than any other miner for my hardware (1 x 6770) when left idle. I used to get 197 with phoenix+phatk, plus another 12 from CPU mining on top (209 in total), but with minerd and default settings I'm getting 216.
newbie
Activity: 42
Merit: 0
It's a bit faster now on the Radeon 4650 (17 Mhash/s instead of 16). But the all-time average still drops to zero at every getwork, which I guess is a bug, unless "all-time average" means the average since the last getwork.
staff
Activity: 4326
Merit: 8951
diff --git a/ocl.c b/ocl.c
index 4173026..a8240eb 100644
--- a/ocl.c
+++ b/ocl.c
@@ -425,7 +425,7 @@ _clState *initCl(int gpu, char *name, size_t nameSize)
                        return NULL;
                }
 
-               clState->program = clCreateProgramWithBinary(clState->context, numDevices, &devices[gpu], binary_sizes, (const unsigned char **)binaries, &status, NULL);
+               clState->program = clCreateProgramWithBinary(clState->context, 1, &devices[gpu], binary_sizes, (const unsigned char **)binaries, &status, NULL);


Meh. Makes it not crash, but it's not sufficient for it to work right. I should look at this after sleeping. ;)

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I've committed some changes to the kernel based on phatk's kernel's use of arrays instead of individual variables and this has afforded another speedup at least on my machine.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I fixed a bug which would make it segfault occasionally when BFI_INT patching.

I've updated the output log to display both a log-interval average and an all-time average.

I've added an option to configure the log interval in seconds with --log.

I think I also fixed the bug where it would stop doing work after a few minutes.

The output now looks like this:
[2011-06-23 22:26:17] [166.81 | 176.26 Mhash/s] [63 Accepted] [1 Rejected]

First entry is rolling average, 2nd is all time average.
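
A rough sketch of how the two figures can be tracked side by side, in C. The struct and function names here are hypothetical and this is not the code in the tree; it also assumes the first figure is the average over the last log interval.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bookkeeping for the two averages in the log line. */
struct rate_stats {
    uint64_t total_hashes;    /* hashes since the miner started */
    uint64_t interval_hashes; /* hashes since the last log line */
    double total_secs;        /* seconds since the miner started */
    double interval_secs;     /* seconds since the last log line */
};

/* Record a finished chunk of work in both sets of counters. */
void rate_add(struct rate_stats *s, uint64_t hashes, double secs)
{
    s->total_hashes += hashes;
    s->interval_hashes += hashes;
    s->total_secs += secs;
    s->interval_secs += secs;
}

static double rate_mhash(uint64_t hashes, double secs)
{
    return secs > 0.0 ? (double)hashes / secs / 1e6 : 0.0;
}

/* Format "[interval | all-time Mhash/s]" into buf, then reset only the
 * interval counters so the next line starts a fresh window. */
int rate_log(struct rate_stats *s, char *buf, size_t len)
{
    int n = snprintf(buf, len, "[%.2f | %.2f Mhash/s]",
                     rate_mhash(s->interval_hashes, s->interval_secs),
                     rate_mhash(s->total_hashes, s->total_secs));
    s->interval_hashes = 0;
    s->interval_secs = 0.0;
    return n;
}
```

Because only the interval counters are reset, the first figure reacts quickly to stalls while the second converges slowly towards the long-run rate.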

Now I'll look some more into the internals and performance.
staff
Activity: 4326
Merit: 8951
Testing on a silly nvidia machine:

Selected 0: GeForce GTX 275
[2011-06-23 06:00:58] Preferred vector width reported 1
[2011-06-23 06:00:58] Max work group size reported 512
[2011-06-23 06:00:58] cl_amd_media_ops not found, will not BFI_INT patch
initCl() finished. Found GeForce GTX 275
[...]
[2011-06-23 06:04:18] GPU 0 found something?
[2011-06-23 06:04:18] No best_g found! Error in OpenCL code?

Looks like it never accepts results from the GPU. CPU mines fine.


on a system with 3 5850s:

(gdb) run -t 6 -a 4way --url http://pool.bitcoin.dashjr.org:8337/ --userpass 15xWuDHSyKzpvp6FacGKXijBeaaaYhKWSi:x --retry-pause 1 -r -1 --intensity 16
Starting program: /root/gm/cpuminer/minerd -t 6 -a 4way --url http://pool.bitcoin.dashjr.org:8337/ --userpass 15xWuDHSyKzpvp6FacGKXijBeaaaYhKWSi:x --retry-pause 1 -r -1 --intensity 16
[Thread debugging using libthread_db enabled]
[New Thread 0x7ffff4141700 (LWP 17510)]
[New Thread 0x7ffff3940700 (LWP 17511)]
Init GPU 0
List of devices:
        0       Cypress
        1       Cypress
        2       Cypress
Selected 0: Cypress
[2011-06-23 06:17:35] Preferred vector width reported 4
[2011-06-23 06:17:35] Max work group size reported 256
[2011-06-23 06:17:35] Preferred vector width reported 4
[2011-06-23 06:17:35] Max work group size reported 256
[2011-06-23 06:17:35] Preferred vector width reported 4
[2011-06-23 06:17:35] Max work group size reported 256
[2011-06-23 06:17:35] Patched source to suit 4 vectors
[2011-06-23 06:17:35] cl_amd_media_ops found, patched source with BFI_INT
[New Thread 0x7ffff313f700 (LWP 17512)]
initCl() finished. Found Cypress
[New Thread 0x7ffff307e700 (LWP 17513)]
[New Thread 0x7ffff287d700 (LWP 17514)]
[Thread 0x7ffff287d700 (LWP 17514) exited]
Init GPU 1
List of devices:
        0       Cypress
        1       Cypress
        2       Cypress
Selected 1: Cypress
[2011-06-23 06:17:37] Preferred vector width reported 4
[2011-06-23 06:17:37] Max work group size reported 256
[2011-06-23 06:17:37] Preferred vector width reported 4
[2011-06-23 06:17:37] Max work group size reported 256
[2011-06-23 06:17:37] Preferred vector width reported 4
[2011-06-23 06:17:37] Max work group size reported 256
[2011-06-23 06:17:37] Patched source to suit 4 vectors
[2011-06-23 06:17:37] cl_amd_media_ops found, patched source with BFI_INT

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6a2b4b0 in ?? ()
   from /root/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/libamdocl64.so
(gdb) bt
#0  0x00007ffff6a2b4b0 in ?? ()
   from /root/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/libamdocl64.so
#1  0x00007ffff6a7a31b in ?? ()
   from /root/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/libamdocl64.so
#2  0x00007ffff6a225a0 in clCreateProgramWithBinary ()
   from /root/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/libamdocl64.so
#3  0x000000000040793b in initCl ()


Stupid binary libraries.  :-/

Update: Still crashing on the system with 5850s with 181070d129259d088219a0dcd0ef41d9a45439d3
newbie
Activity: 42
Merit: 0
Yes, that was it. With one CPU thread, the rate stays constant at ~17MH/s even after fetching getworks.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I run minerd --protocol-dump, and the rate drops to zero immediately after the next getwork.

It could be the threads=0 option. Try allowing the CPUs to run, or at least one thread.
newbie
Activity: 42
Merit: 0
I run minerd --protocol-dump, and the rate drops to zero immediately after the next getwork.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Looks like other miners show the hash rate as an average over the last few seconds. I think showing the average over the whole run time may be confusing. Also, it's strange that the average reported rate would be zero at some point if it's calculated over the whole run time.

It does indeed sound like a bug. Perhaps having a rolling average and a total average would be helpful too.
newbie
Activity: 42
Merit: 0
Looks like other miners show the hash rate as an average over the last few seconds. I think showing the average over the whole run time may be confusing. Also, it's strange that the average reported rate would be zero at some point if it's calculated over the whole run time.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
The hash rate shown is the total number of hashes done divided by the total run time, so it tends to rise to the real value over a few minutes. If it drops off, it's usually because the server has not responded and the miner is waiting for more work, though there may well be other bugs there. Setting "2 way vectors" in other apps such as phoenix is the same as an optimal vector count of 2. minerd detects the optimal count from what the card reports and uses up to 4-way vectors (the most supported by any card currently).
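
As an illustration of the vector width selection, here is a small sketch in C. The function name is hypothetical, and rounding down to 1, 2 or 4 is an assumption about how a reported value might be mapped onto the widths mentioned above; in a real program the input would come from the device, for example via `clGetDeviceInfo` with `CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT`.

```c
/* Hypothetical helper: map the preferred vector width a device reports
 * onto a width the kernel can be patched for, capped at 4. */
unsigned pick_vector_width(unsigned preferred)
{
    if (preferred >= 4)
        return 4;   /* never go beyond 4-way vectors */
    if (preferred >= 2)
        return 2;
    return 1;       /* scalar kernel for 0 or 1 */
}
```

With the values from the logs in this thread, a card reporting 4 would get the 4-vector kernel and one reporting 1 would get the scalar kernel.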
newbie
Activity: 42
Merit: 0
I let it run a bit longer, and there was a strange drop in the hash rate:
Quote
Init GPU 0
List of devices:
        0       ATI RV730
Selected 0: ATI RV730
[2011-06-23 12:12:01] Preferred vector width reported 4
[2011-06-23 12:12:01] Max work group size reported 128
[2011-06-23 12:12:01] Patched source to suit 4 vectors
[2011-06-23 12:12:01] cl_amd_media_ops not found, will not BFI_INT patch
initCl() finished. Found ATI RV730
[2011-06-23 12:12:32] Long-polling activated for http://mineco.in:3000/LP
[2011-06-23 12:12:32] [0.03 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:12:33] 1 gpu miner threads started
[2011-06-23 12:12:33] 0 cpu miner threads started, using SHA256 'c' algorithm.
[2011-06-23 12:12:41] [0.12 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:12:46] [6.59 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:12:52] [9.54 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:12:57] [11.19 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:02] [12.29 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:07] [13.06 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:12] [13.63 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:17] [14.07 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:22] [14.41 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:27] [14.68 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:32] [14.89 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:37] [15.08 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:13:42] [15.24 Mhash/sec] [0 Accepted] [0 Rejected]
....
[2011-06-23 12:16:13] [16.70 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:16:18] [16.71 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 12:16:22] GPU 0 found something?
[2011-06-23 12:16:22] PROOF OF WORK RESULT: false (booooo)
[2011-06-23 12:16:23] [16.73 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:28] [16.74 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:33] [16.76 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:38] [16.77 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:43] [16.78 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:48] [16.80 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:53] [0.31 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:16:58] [0.64 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:17:04] [0.95 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:17:09] [1.26 Mhash/sec] [0 Accepted] [1 Rejected]
[2011-06-23 12:17:14] [1.55 Mhash/sec] [0 Accepted] [1 Rejected]
...

I stopped the miner and re-ran it, with the same results - the hash rate drops to zero after about 4-5 minutes of run time.