
Topic: cgminer - CPU/GPU miner in C for linux/windows - page 12. (Read 81916 times)

newbie
Activity: 42
Merit: 0
It says "Preferred vector width reported 4" for ATI RV730 (Radeon 4650). Is the vector width thing even the same as "2-way vectors" in Phoenix?

Anyway, with the latest version the rate is somewhat better: it started at ~10Mh/sec and slowly rose to ~17Mh/sec over the next few minutes. Is the hash rate calculated over the whole program run time?
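A slowly climbing figure like this is what a cumulative average produces when the clock starts before the GPU is actually hashing. A minimal sketch in C, assuming (the thread does not confirm this) that the displayed rate is total hashes divided by total elapsed time; the function name is hypothetical:

```c
/*
 * Hypothetical helper: cumulative average hash rate in Mhash/sec,
 * measured from program start rather than over a recent window.
 */
static double cumulative_mhs(double total_hashes, double elapsed_secs)
{
    if (elapsed_secs <= 0.0)
        return 0.0;
    return total_hashes / elapsed_secs / 1e6;
}
```

With an illustrative 30 seconds of startup and a steady 14 Mh/sec afterwards, cumulative_mhs(14e6 * 10, 40) gives 3.5 after 40 seconds, and the value keeps creeping towards 14 the longer the program runs without ever jumping straight to it.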
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Okay, I've implemented some rudimentary testing of what the GPU reports as its maximum work size and preferred vector width, plus dynamic patching to make the most of those values. In my brief testing this provides the optimal throughput on the 2 cards I've tried it on (ati 6770 and nvidia GT 330). It does not yet cope with multiple different GPUs on the same machine. Please pull the latest tree and give it a try!
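As a rough illustration of turning a reported preference into a setting (not the actual cgminer code), the sketch below maps a device's preferred vector width onto the widths 1, 2 or 4 that the option list in this thread mentions. In OpenCL the preference can be read with clGetDeviceInfo and CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT; the function name, the override parameter and the clamping policy are assumptions:

```c
/*
 * Hypothetical policy: use a valid user override if given, otherwise
 * the largest of 1, 2 or 4 that does not exceed the device preference.
 */
static int pick_vector_width(int preferred, int override)
{
    if (override == 1 || override == 2 || override == 4)
        return override;   /* explicit user choice wins */
    if (preferred >= 4)
        return 4;
    if (preferred >= 2)
        return 2;
    return 1;              /* scalar fallback, also for bogus values */
}
```

A reported preference is only a starting point: a card can report 4 and still hash faster with a narrower width, which is why a manual override is worth keeping.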
staff
Activity: 4326
Merit: 8951
Quote
If you grab the latest version, you'll see it reports the preferred vector width. I'm planning on getting all the preferred details back from the cards to automatically set the best options.

Diablo reported that running multiple OpenCL threads improves utilization substantially; his miner runs three per GPU. Though this may be due to the excessive use of blocking IO in the fast path (wtf‽).

I'm looking forward to the phatk kernel.  Getting a miner with a control plane which isn't crap is a dream that I haven't had the free time to realize.

Thanks for working on this!
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
If you grab the latest version, you'll see it reports the preferred vector width. I'm planning on getting all the preferred details back from the cards to automatically set the best options.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks, that's most helpful, especially since this currently uses 4 way vectors. That could explain why it's unhappy.
newbie
Activity: 42
Merit: 0
I can only say that turning off VECTORS ("Enables 2-way vectors. This may improve hashrate if enabled, but it can be slower on some hardware") for phoenix improves hashrate from about 24 to 29 Mh/sec. Looks like 4xxx cards need more workarounds in the code.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks for that. I'm not sure why it's not working with 2.4. Do you know what settings (if any) help on that card you have with poclbm? I'm currently at work and won't be able to do anything major right now, but I'll look at implementing more tuneables and detection soon.
newbie
Activity: 42
Merit: 0
APP SDK 2.4 still doesn't work:
Quote
$ ldd minerd|fgrep OpenCL
        libOpenCL.so.1 => /opt/AMD-APP-SDK-v2.4-lnx32/lib/x86/libOpenCL.so.1 (0x00b98000)
$ ./minerd --threads 0 --intensity 3 --url http://mineco.in:3000
Error: Getting Platforms. (clGetPlatformsIDs)
newbie
Activity: 42
Merit: 0
With the latest change and intensity=3, the CPU threads parameter works, the desktop feels adequate (about the same as when running other miners), but the hashrate is too low: ~12Mh/sec
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Try again please with the new git tree. You should be able to disable the CPU threads. The default optimisations are for 5xxx and 6xxx cards as I have no experience with the other cards. I've yet to implement other command line options to tweak performance.

EDIT: Yes it does recompile the kernel each time it starts (for now).
newbie
Activity: 42
Merit: 0
It works with AMD APP SDK 2.1.

But there are a few issues:
 - CPU threads can't be disabled - the appropriate option doesn't work
 - the desktop is dog slow (probably as a result of above)
 - there's a delay of about 20 seconds before "initCl() finished" - does it recompile the CL scripts every time?
 - hashrate is well below poclbm's: 13.5Mh/sec for minerd vs 22.5 for poclbm. Running on Radeon 4650:

Quote
Init GPU 0
List of devices:
        0       ATI RV730
Selected 0: ATI RV730
[2011-06-23 08:54:21] cl_amd_media_ops not found, will not BFI_INT patch
initCl() finished. Found ATI RV730
[2011-06-23 08:54:52] Long-polling activated for http://mineco.in:3000/LP
[2011-06-23 08:54:52] [0.27 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:54:53] 1 gpu miner threads started
[2011-06-23 08:54:53] Binding thread 1 to cpu 1
[2011-06-23 08:54:54] Binding thread 2 to cpu 0
[2011-06-23 08:54:55] 2 cpu miner threads started, using SHA256 'c' algorithm.
[2011-06-23 08:55:02] [1.15 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:08] [6.50 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:13] [8.61 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:19] [9.74 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:25] [11.56 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:30] [11.87 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:35] [12.04 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:41] [12.21 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:46] [12.33 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:52] [12.44 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:55:57] [12.53 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:56:03] [12.60 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:56:08] [12.67 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:56:13] [13.14 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:56:19] [13.15 Mhash/sec] [0 Accepted] [0 Rejected]
[2011-06-23 08:56:21] GPU 0 found something?
[2011-06-23 08:56:21] PROOF OF WORK RESULT: true (yay!!!)
[2011-06-23 08:56:24] [13.57 Mhash/sec] [1 Accepted] [0 Rejected]
[2011-06-23 08:56:29] [13.55 Mhash/sec] [1 Accepted] [0 Rejected]
[2011-06-23 08:56:29] GPU 0 found something?
[2011-06-23 08:56:29] PROOF OF WORK RESULT: true (yay!!!)
[2011-06-23 08:56:34] [13.54 Mhash/sec] [2 Accepted] [0 Rejected]
[2011-06-23 08:56:39] [13.52 Mhash/sec] [2 Accepted] [0 Rejected]
[2011-06-23 08:56:44] [13.52 Mhash/sec] [2 Accepted] [0 Rejected]
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks for testing and reporting. I've just uploaded another commit to the git tree which should fix that. Please pull the latest changes.

I've now also added a command line tuneable to allow you to choose how hard to work the GPU:

--intensity
(-I) Intensity of scanning (0 - 16, default 5)

The higher you set it to, the greater the GPU throughput, but the more lag you'll get. Very large numbers can cause massive stalls without a huge improvement to throughput. Try them cautiously!
newbie
Activity: 42
Merit: 0
It errors out under Fedora 14 i386 and AMD APP 2.4:

Quote
$ ./minerd -t 0 --url http://mineco.in:3000
Error: Getting Platforms. (clGetPlatformsIDs)
[2011-06-23 08:45:27] 0 gpu miner threads started
[2011-06-23 08:45:27] Binding thread -1 to cpu -1
[2011-06-23 08:45:27] Long-polling activated for http://mineco.in:3000/LP
Segmentation Fault (core dumped)

Here's the configure line:

CFLAGS="-O2 -Wall -msse2 -I/opt/AMD-APP-SDK-v2.4-lnx32/include -g" LDFLAGS="-L/opt/AMD-APP-SDK-v2.4-lnx32/lib/x86/ -g"  ./configure

Here's the backtrace:

Quote
#0  0x080541ea in scanhash_c (thr_id=-128,
    midstate=0xb65ff0c0 "\276\063mR\262~-\020\323\352\320\366\223\017\250\327\216\233\002]\016H\300n\345\336[JIC\336D\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377",
    data=0xb65ff040 "͢\213]N\002\304#\032\023!\205", hash1=0xb65ff080 "",
    hash=0xb65ff100 "",
    target=0xb65ff0e0 "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377", max_nonce=16777215,
    hashes_done=0xb65ff27c) at sha256_generic.c:252
#1  0x0804a32d in miner_thread (userdata=0x9a590c4) at cpu-miner.c:662
#2  0x00567e99 in start_thread () from /lib/libpthread.so.0
#3  0x0047ad2e in clone () from /lib/libc.so.6
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I've now added testing for cl_amd_media_ops to detect which cards support BFI_INT and only patch those. This means it should work on all cards now, including nvidia and old ati that don't support it.
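For anyone curious what such a test can look like: OpenCL reports a device's extensions as one space-separated string (clGetDeviceInfo with CL_DEVICE_EXTENSIONS). The sketch below is not the code in the git tree, just one way to check that string for a whole-token match; both function names are hypothetical, and the string is passed in so the example does not need an OpenCL runtime:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `name` appears as a whole space-separated token in
 * `extensions`. A bare strstr() would also match a longer extension
 * name that merely starts with `name`, so token boundaries are checked.
 */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    const char *p = extensions;

    if (len == 0)
        return false;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p += 1;   /* keep searching after this partial match */
    }
    return false;
}

/* Hypothetical policy: only apply the BFI_INT patch when the extension is present. */
static bool should_patch_bfi_int(const char *extensions)
{
    return has_extension(extensions, "cl_amd_media_ops");
}
```

Cards whose extension string lacks cl_amd_media_ops (nvidia, older ati) would then simply run the unpatched kernel.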
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Thanks for starting a new topic, JG. I need this project to have more exposure so that its GPU mining code gets tested and I can keep improving it. The background behind minerd is that it's built on top of Jeff Garzik's excellent cpuminer application, which is written in C (efficient, fast, low overhead for low-level work), and I've added GPU mining code to it. The idea is to make an all-encompassing CPU + GPU miner that performs very well, is cross-platform and is actively developed.

EDIT: Now called cgminer:
Latest source release:
http://ck.kolivas.org/apps/cgminer-1.2.3.tar.bz2

Git tree:
https://github.com/ckolivas/cgminer

Latest git source tarball:
https://github.com/ckolivas/cgminer/tarball/cgminer

Windows binary:
http://ck.kolivas.org/apps/cgminer-1.2.3-win32.zip

Summary of GPU mining features so far: custom modified phatk and poclbm kernels, BFI_INT patching, new VECTOR code, long poll support, and multi-card support with or without CPU mining alongside. The GPU mining code now detects the optimal parameters and capabilities for every card and sets each card separately. It has extensive failover logic to keep work going at all times and GPUs should virtually never go idle.

Initially the oclminer code was ported across to get the gpu mining working, but a lot of work has been done in gpu mining since then. I've drastically modified the way work is passed to the GPU to make it as asynchronous as possible and keep the GPU busy without making the GUI come to a standstill. Since then I've ported across the poclbm and phatk kernels, added BFI_INT patching and vectors. I've custom modified the vector code to make the most of modern pipelines and added code to detect optimal settings for each card and set them separately. Performance on my 6770 and 4x6970 is now better than any other GPU mining software.

The output looks like this:

cgminer version 1.2.0
--------------------------------------------------------------------------------
Totals: [(5s):166.9  (avg):194.3 Mh/s] [Q:43  A:14  R:0  HW:2  E:33%  U:2.53/m]
--------------------------------------------------------------------------------
GPU 0: [183.5 Mh/s] [Q:15  A:14  R:0  HW:2  E:93%  U:2.57/m]
CPU 0: [0.0 Mh/s] [Q:1  A:0  R:0  HW:0  E:0%  U:0.00/m]
CPU 1: [3.2 Mh/s] [Q:1  A:0  R:0  HW:0  E:0%  U:0.00/m]
CPU 2: [3.2 Mh/s] [Q:1  A:0  R:0  HW:0  E:0%  U:0.00/m]
CPU 3: [3.2 Mh/s] [Q:5  A:0  R:0  HW:0  E:0%  U:0.00/m]
--------------------------------------------------------------------------------
[2011-07-11 13:35:41] Share accepted from GPU 0
[2011-07-11 13:36:00] Share accepted from GPU 0
[2011-07-11 13:36:37] Share accepted from GPU 0
[2011-07-11 13:36:57] Share accepted from GPU 0
[2011-07-11 13:37:06] Server not providing work fast enough, generating work locally
[2011-07-11 13:37:07] Resumed retrieving work from server
[2011-07-11 13:37:23] LONGPOLL detected new block, flushing work queue
[2011-07-11 13:37:41] Share accepted from GPU 0
[2011-07-11 13:37:43] LONGPOLL detected new block, flushing work queue
[2011-07-11 13:38:10] Share accepted from GPU 0
[2011-07-11 13:39:08] Share accepted from GPU 0
[2011-07-11 13:39:25] Share accepted from GPU 0
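The E and U columns in the sample above are not defined in this post. One reading that matches the numbers shown is E = accepted shares as a percentage of queued work items (14/15 is about 93% for GPU 0, 14/43 is about 33% for the totals) and U = accepted shares per minute. A small sketch under that assumption; the function names are made up for illustration:

```c
/* Assumed meaning of E: accepted shares as a rounded percentage of queued work. */
static unsigned efficiency_pct(unsigned accepted, unsigned queued)
{
    if (queued == 0)
        return 0;
    return (100u * accepted + queued / 2u) / queued;   /* round to nearest */
}

/* Assumed meaning of U: accepted shares per minute of run time. */
static double utility_per_min(unsigned accepted, double elapsed_secs)
{
    if (elapsed_secs <= 0.0)
        return 0.0;
    return accepted * 60.0 / elapsed_secs;
}
```

Under this reading the low E on the totals line comes from the CPU threads queueing work without finding shares, not from the GPU.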

Options:
--cpu-threads|-t    Number of miner CPU threads (default: 0)
--debug|-D          Enable debug output
--gpu-threads|-g    Number of threads per GPU (0 - 10) (default: 2)
--intensity|-I      Intensity of GPU scanning (0 - 14) (default: 4)
--log|-l            Interval in seconds between log output (default: 5)
--no-longpoll       Disable X-Long-Polling support
--pass|-p           Password for bitcoin JSON-RPC server
--protocol-dump|-P  Verbose dump of protocol-level activities
--queue|-Q          Number of extra work items to queue (1 - 10) (default: 1)
--quiet|-q          Disable per-thread hashmeter output
--retries|-r        Number of times to retry before giving up, if JSON-RPC call fails (-1 means never) (default: -1)
--retry-pause|-R    Number of seconds to pause, between retries (default: 5)
--scan-time|-s      Upper bound on time spent scanning current work, in seconds (default: 60)
--syslog            Use system log for output messages (default: standard error)
--url|-o            URL for bitcoin JSON-RPC server (default: "http://127.0.0.1:8332/")
--user|-u           Username for bitcoin JSON-RPC server
--vectors|-v        Override detected optimal vector width (1, 2 or 4)
--verbose           Log verbose output to stderr as well as status output
--worksize|-w       Override detected optimal worksize (default: 0)
--userpass|-O       Username:Password pair for bitcoin JSON-RPC server
Options for command line only:
--config|-c         Load a JSON-format configuration file
See example-cfg.json for an example configuration.
--help|-h           Print this message
--ndevs|-n          Display number of detected GPUs and exit

---
Please test it and report back!

I recommend limiting intensity to no more than 8 as higher levels cause serious stalls and very little improvement (with possibly more stale shares). Some nvidia cards crash immediately on startup and that's because they report bogus values for their worksize. Try setting worksize to 128 or 256 manually if that happens.
legendary
Activity: 1596
Merit: 1100
The software formerly known as 'cpuminer' is being updated to include OpenCL GPU mining capability, thanks to Con Kolivas.  Until this software is fully "baked" and ready, it is being developed on a git branch at https://github.com/ckolivas/cgminer

Once stable, we intend to merge Con's work and rename 'cpuminer' to something more appropriate.

Update:  See this cgminer thread for official cgminer support and development.


