hashkill - testing bitcoin miner plugin - page 7.

xanatos

newbie

Activity: 42

Merit: 0

Quote from: gat3way on June 10, 2011, 04:28:04 PM

@xanatos, nice

Anyway, why did you remove that one?

Quote

__attribute__((reqd_work_group_size(64, 1, 1)))

I understand that workgroup size is configurable in poclbm. However, in most cases, 64 should be the best one. Also, hardcoding the required workgroup size helps the OpenCL compiler do better register allocation, because it "knows" the workgroup size at compile time and does not have to make worst-case assumptions. You are losing performance due to this.

Another thing (I don't know if that's possible with pyopencl): don't use clEnqueueReadBuffer() (or whatever its equivalent is). Use clEnqueueMapBuffer() instead. It's noticeably faster. Hm, I've really started wondering about modifying some Python miner to incorporate that kernel; it looks like a quick way to make it portable to Windows. Besides, there are obvious problems with the non-OpenCL part which are due to code immaturity.

Tried the first... No measurable difference (technically I first tried it with 64, but it was much slower; then I changed it to 256, and there was no difference from the version without the attribute). Tried the second... Mmmmh... perhaps there was a difference, but it was very small. I even tried changing the size of the output buffer: you use a one-element array, poclbm uses a 256-element array, but there was no difference. The memory bandwidth of the graphics adapter is probably high enough that 1 KB of data every second or so isn't truly measurable.

Tomorrow I'll modify the poclbm kernel to work with your frontend. It shouldn't be very difficult.

Clipse

hero member

Activity: 504

Merit: 502

Not sure why you have such poor eff, because I'm on 100-104% the whole time. I'm confused, though, about how I can get more than 100% eff.

Dusty

hero member

Activity: 731

Merit: 503

Libertas a calumnia

After some weeks of using hashkill (I find it very neat to use just one program to mine on all GPUs), I got tired of the very low efficiency (80-90%), so I tried phoenix.

Thanks to server-side statistics (I'm using bitcoins.lc) the results are astounding: while phoenix is somewhat slower, the total number of accepted shares is WAAAAAY higher.
I gained more than 30% efficiency overall.

So I suppose something is very, very broken in hashkill, but I'm unable to understand what.
I still don't understand why the number of shares processed differs from the number submitted: most of the time it is (much) higher, but sometimes it can also be lower (!).

Would someone care to explain to me how this is possible?

Thanks!
Dusty

gat3way

sr. member

Activity: 256

Merit: 250

And how do you know your pool does not do the same?

Capitan

member

Activity: 112

Merit: 10

I would not run this without the source being released. How do you know he's not stealing 1%, 1/2%, 1/4%, etc. of your compute cycles? Or doing some other nefarious thing?

gat3way

sr. member

Activity: 256

Merit: 250

@xanatos, nice

Anyway, why did you remove that one?

Quote

__attribute__((reqd_work_group_size(64, 1, 1)))

I understand that workgroup size is configurable in poclbm. However, in most cases, 64 should be the best one. Also, hardcoding the required workgroup size helps the OpenCL compiler do better register allocation, because it "knows" the workgroup size at compile time and does not have to make worst-case assumptions. You are losing performance due to this.

Another thing (I don't know if that's possible with pyopencl): don't use clEnqueueReadBuffer() (or whatever its equivalent is). Use clEnqueueMapBuffer() instead. It's noticeably faster. Hm, I've really started wondering about modifying some Python miner to incorporate that kernel; it looks like a quick way to make it portable to Windows. Besides, there are obvious problems with the non-OpenCL part which are due to code immaturity.

gat3way

sr. member

Activity: 256

Merit: 250

http://hashkill.sourceforge.net/

I am _not_ providing any support for building it, sorry.

lizthegrey

newbie

Activity: 56

Merit: 0

Quote from: gat3way on June 10, 2011, 04:56:26 AM

It's GPL so it can be embedded into GPL software.

I couldn't find source for the non-opencl portions. Would you mind providing a download link for the source corresponding to the 0.2.4 build? Thanks!

snoopy

newbie

Activity: 7

Merit: 0

Quote from: dudel42 on June 10, 2011, 04:00:49 AM

Quote from: twmz on June 09, 2011, 06:23:49 PM

Here is a data point for you. Not sure if it is useful.

I have been using poclbm (up-to-date from git). I decided to try hashkill for 24 hours to compare. Here is what I found.

* The hash rate shown on screen was in fact higher with hashkill vs poclbm. Close to 8% higher.
* % stale was not significantly better or worse (my pool supports LP). It was good with both (about 0.5%).
* However, the submitted share rate was significantly lower. I understand that the rate at which shares are found is driven somewhat by luck, but I feel like 24 hours of share data is long enough that a comparison is fair. I collected data from two different machines running Ubuntu:

Over two 24 hour periods:

Machine 1, 6850. poclbm: 3974 shares, 195 MH/s displayed. hashkill: 3630 shares, 201 MH/s displayed.
Machine 2, 2x6970. poclbm: 13256 shares, 670 MH/s displayed. hashkill: 12270 shares, 710 MH/s displayed.

I can't explain this data. But the reality is that I earned less on my PPS pool with hashkill.

I've noticed the exact same behavior, I just haven't had time to conduct long-running tests yet. hashkill displayed more hashes/s, but judging by the pool statistics online, the pool always registered a lower hashrate for it than for phoenix.

That's why I switched back to phoenix for the time being, even though I really like hashkill (especially the fact that one process supports multiple GPUs).

FWIW, I cannot reproduce what you two are seeing: when I compare the projected coins created per day given the hashrate hashkill displays with what I'm actually getting from my pool, it is accurate to .1 btc. So on my rig at least, it seems to be hashing at exactly the rate it's displaying. The same holds true for my poclbm instances as well when I compare those... (Just my .02 btc)
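For anyone who wants to repeat this kind of check, here is a rough sketch of the arithmetic. It assumes the pool hands out difficulty-1 shares (one share per 2**32 hashes on average) and that each miner ran for a full 24 hours; neither is stated outright in the thread. The function names are made up for illustration, and the numbers are the ones twmz posted.

```python
# Compare shares actually submitted with the number the displayed hash rate
# predicts. Assumption: difficulty-1 shares, i.e. on average one share per
# 2**32 hashes, and a full 24 hours of mining per measurement.
HASHES_PER_SHARE = 2 ** 32
SECONDS_PER_DAY = 24 * 60 * 60


def expected_shares(mhash_per_sec, seconds=SECONDS_PER_DAY):
    """Average number of difficulty-1 shares found at the given hash rate."""
    return mhash_per_sec * 1e6 * seconds / HASHES_PER_SHARE


def share_ratio(mhash_per_sec, shares_found, seconds=SECONDS_PER_DAY):
    """Shares actually found divided by the expected number."""
    return shares_found / expected_shares(mhash_per_sec, seconds)


# (label, displayed MH/s, shares in 24 hours) as reported by twmz
reports = [
    ("6850 poclbm", 195, 3974),
    ("6850 hashkill", 201, 3630),
    ("2x6970 poclbm", 670, 13256),
    ("2x6970 hashkill", 710, 12270),
]

for label, mhash, shares in reports:
    print("%-16s expected %7.0f  got %6d  ratio %.3f"
          % (label, expected_shares(mhash), shares, share_ratio(mhash, shares)))
```

Under those assumptions both poclbm runs land within about 2% of the expected share count, while the hashkill runs come out roughly 10% and 14% below what the displayed rate predicts.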

xanatos

newbie

Activity: 42

Merit: 0

Here is the "poclbm" version. To use it with poclbm, replace BitcoinMiner.cl with the file I'm providing. Remember that the base file I used (amd_bitcoin.cl) is licensed under the GPL, and so is the work I'm redistributing (I don't know if it's GPL v2, v2 or later, or v3; ask the original author).
This version supports the BFI_INT "trick". It doesn't work correctly without the -v flag (because it then does 3 hashes at a time, but poclbm supports only 1 or 2 hashes at a time, and I think the frontend doesn't know when it should increment the root because it has already tested all the nonce values).

If you TRULY want to test the 3 hashes at a time mode, modify BitcoinMiner.py at line 96 from:

   (self.defines, self.rateDivisor, self.hashspace) = if_else(vectors, ('-DVECTORS', 500, 0x7FFFFFFF), ('', 1000, 0xFFFFFFFF))

to

   (self.defines, self.rateDivisor, self.hashspace) = if_else(vectors, ('-DVECTORS', 500, 0x7FFFFFFF), ('', 333, 0x55555555))

and call poclbm without the -v flag.
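A quick sanity check on where those two constants come from, as far as I can tell from the line above: both are simply the single-hash values divided by the number of hashes each work item computes. This is only a sketch of that relationship, not code from poclbm, and the helper name is invented.

```python
# How the (rateDivisor, hashspace) pairs in the poclbm line relate to the
# number of hashes computed per work item. Illustrative helper only.
FULL_NONCE_SPACE = 0xFFFFFFFF   # hashspace used when each work item tests 1 nonce
BASE_RATE_DIVISOR = 1000        # rateDivisor used when each work item tests 1 nonce


def per_work_item_params(hashes_per_item):
    """Return (rate_divisor, hashspace) for a kernel doing N hashes per work item."""
    return (BASE_RATE_DIVISOR // hashes_per_item,
            FULL_NONCE_SPACE // hashes_per_item)


for n in (1, 2, 3):
    divisor, hashspace = per_work_item_params(n)
    print("%d hash(es) per work item: rateDivisor=%d hashspace=0x%08X"
          % (n, divisor, hashspace))
```

With 3 hashes per work item this gives 333 and 0x55555555, the same values as in the modified line.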

To call poclbm use something like:

   ./poclbm -d 0 -v -w 256 -f1 -u username --pass password -o address -p port

(skip the ./ if you are under Windows)

On my 5870 this kernel is 2-3% slower than the standard poclbm kernel, but it could be that the small mods I made to make it compatible with poclbm have "unbalanced" it.

The link with the file: http://www.mediafire.com/file/c3id5ruw4o7wadi/BitcoinMiner.cl

Oh... If anyone wants to tip me... I would be very happy :-) My first tip as a programmer! :-) (ok... I DO it for work, but a tip is a tip :-) )

Tips appreciated: 1AwtyweUV9GBUhEHPtowAmgcj5uoUq4y1c

sagefool1975

newbie

Activity: 6

Merit: 0

Quote from: gat3way on June 10, 2011, 01:04:54 AM

You need the ICD stuff installed. Go to /opt/AMD-APP-SDK-v2.4-lnx64, tar -zxvf icd-registration.tgz; cp -rp OpenCL/* /etc/OpenCL

SDK2.4 has different library names. You need to reinstall the ICD files for that. DO NOT DELETE ANYTHING FROM /etc/OpenCL !!! Otherwise you would lose backward compatibility with older SDKs.

Ahh ok, new registration. Easy enough.

Also you probably want:

cp -rp etc/OpenCL/* /etc/OpenCL/

(you were missing the leading etc), in case you plan to include those directions in some kind of 'updating from < 2.3' section of the docs. Thanks.
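If you want to see what actually ended up registered after copying the files, something like the following can help. It is a small diagnostic sketch, not part of hashkill or the SDK. It assumes the conventional Linux layout in which the ICD loader reads *.icd files from /etc/OpenCL/vendors, each containing the name or path of a vendor library; the function name is invented.

```python
# List the OpenCL ICD registrations the loader would see.
# Assumption: the usual Linux convention of /etc/OpenCL/vendors/*.icd files,
# each holding the name or path of a vendor library.
import os


def list_icd_entries(vendors_dir="/etc/OpenCL/vendors"):
    """Return {icd_filename: library_name} for every .icd file in vendors_dir."""
    entries = {}
    if not os.path.isdir(vendors_dir):
        return entries  # nothing registered at all
    for name in sorted(os.listdir(vendors_dir)):
        if name.endswith(".icd"):
            with open(os.path.join(vendors_dir, name)) as icd_file:
                entries[name] = icd_file.read().strip()
    return entries


if __name__ == "__main__":
    found = list_icd_entries()
    if not found:
        print("No ICD files found; clGetPlatformIDs will probably fail.")
    for icd_name, library in found.items():
        print("%s -> %s" % (icd_name, library))
```

An empty result would be consistent with the "clGetPlatformIDs returned error" message quoted earlier, though other causes are possible.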

gat3way

sr. member

Activity: 256

Merit: 250

It's GPL so it can be embedded into GPL software.

You SHOULD NOT set VLIW4 on 5xxx. The reason it is slower without VLIW4 is most likely that you don't take into account that 3 hash operations are calculated instead of 2 (you can think of it as using uint3s instead of uint2s). I am not using uint3 because the OpenCL compiler is buggy and generates wrong ISA code, so I interleave one uint2 and one uint hash operation instead.

Another thing is: do not pass OLD_ATI to the kernel unless you have a 4xxx GPU. Otherwise you'd have no BITALIGN and BFI_INT.

xanatos

newbie

Activity: 42

Merit: 0

I'll add that I have converted the Hashkill "amd kernel" to be usable by the poclbm "front end". I'm using a single 5970. When the VLIW4 #define was not set the speed was very low; with VLIW4 set the speed was about equal to (a little lower than) the standard poclbm kernel. So your observations confirm mine (but note that I had to make some changes to the hashkill kernel, like how the parameters are passed to the kernel, so my modified kernel isn't totally equal to the "original" hashkill kernel). I cannot release the "modified" kernel because the license of the kernel isn't "clear". But I CAN probably release a diff with instructions on how to apply it. If anyone is interested, just ask.

dudel42

member

Activity: 111

Merit: 10

Quote from: twmz on June 09, 2011, 06:23:49 PM

Here is a data point for you. Not sure if it is useful.

I have been using poclbm (up-to-date from git). I decided to try hashkill for 24 hours to compare. Here is what I found.

* The hash rate shown on screen was in fact higher with hashkill vs poclbm. Close to 8% higher.
* % stale was not significantly better or worse (my pool supports LP). It was good with both (about 0.5%).
* However, the submitted share rate was significantly lower. I understand that the rate at which shares are found is driven somewhat by luck, but I feel like 24 hours of share data is long enough that a comparison is fair. I collected data from two different machines running Ubuntu:

Over two 24 hour periods:

Machine 1, 6850. poclbm: 3974 shares, 195 MH/s displayed. hashkill: 3630 shares, 201 MH/s displayed.
Machine 2, 2x6970. poclbm: 13256 shares, 670 MH/s displayed. hashkill: 12270 shares, 710 MH/s displayed.

I can't explain this data. But the reality is that I earned less on my PPS pool with hashkill.

I've noticed the exact same behavior, I just haven't had time to conduct long-running tests yet. hashkill displayed more hashes/s, but judging by the pool statistics online, the pool always registered a lower hashrate for it than for phoenix.

That's why I switched back to phoenix for the time being, even though I really like hashkill (especially the fact that one process supports multiple GPUs).

gat3way

sr. member

Activity: 256

Merit: 250

Quote from: sagefool1975 on June 09, 2011, 08:40:15 PM

LD_LIBRARY_PATH=/opt/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/

[hashkill] Version 0.2.5
[hashkill] Plugin 'bitcoin' loaded successfully
[error] (ocl-threads.c:97) clGetPlatformIDs returned error (no OpenCL installed?)
[hashkill] Threads queue size: 8 plaintexts/thread

ls -l /opt/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/
total 19940
-rwxr-xr-x 1 root root 19089776 2011-03-28 22:06 libamdocl64.so
-rwxr-xr-x 1 root root 512998 2011-03-28 22:06 libGLEW.so
-rwxr-xr-x 1 root root 753434 2011-03-28 22:06 libglut.so
lrwxrwxrwx 1 root root 14 2011-06-09 18:32 libOpenCL.so -> libOpenCL.so.1
-rwxr-xr-x 1 root root 20864 2011-03-28 22:06 libOpenCL.so.1

You need the ICD stuff installed. Go to /opt/AMD-APP-SDK-v2.4-lnx64, tar -zxvf icd-registration.tgz; cp -rp OpenCL/* /etc/OpenCL

SDK2.4 has different library names. You need to reinstall the ICD files for that. DO NOT DELETE ANYTHING FROM /etc/OpenCL !!! Otherwise you would lose backward compatibility with older SDKs.

P.S. I am also mining on mining.bitcoin.cz. This morning when I got up, I noticed hashkill was stuck on "authentication failure". I lost a couple of hours not mining anything. That's not good. I will investigate that. Anyway, multi-workers (in failover or load-balance mode) will be supported soon.

Quote

One thing I noted on machine 2 is that twice, hashkill got into a state where the displayed hash rate was 355 MH/s and I could see from aticonfig output that one of the GPUs was idle.

The speed indicator displays speed averaged in last 5 seconds. Anyway, that's not good. This means one or more threads are not "fed" fast enough with getworks and are waiting for more work to come and so they are idle. This could be a normal situation with slower links after a long polling notification as the queues are flushed and we are waiting for new work. But this is not normal otherwise.
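To illustrate why a 5-second average shows 355 MH/s only a few seconds after one of two GPUs goes idle, here is a minimal sliding-window sketch. It is not hashkill's implementation (which is C); the class and method names are invented, and the per-GPU rate of 355 MH/s is taken from the numbers in this thread.

```python
# Sliding-window speed indicator: average hash rate over the last N seconds.
# Illustrative only; not taken from hashkill's source.
from collections import deque


class SpeedMeter:
    def __init__(self, window_seconds=5.0):
        self.window = window_seconds
        self.samples = deque()  # (timestamp, hashes_done) pairs, oldest first

    def record(self, timestamp, hashes_done):
        """Register that a thread finished `hashes_done` hashes at `timestamp`."""
        self.samples.append((timestamp, hashes_done))

    def mhash_per_sec(self, now):
        """Average MH/s over the window ending at `now`."""
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= now - self.window:
            self.samples.popleft()
        total = sum(hashes for _, hashes in self.samples)
        return total / self.window / 1e6


meter = SpeedMeter()
for second in range(1, 11):          # both GPUs busy for 10 seconds
    meter.record(second, 355e6)
    meter.record(second, 355e6)
busy_rate = meter.mhash_per_sec(10)

for second in range(11, 16):         # one GPU starved of work for 5 seconds
    meter.record(second, 355e6)
starved_rate = meter.mhash_per_sec(15)

print("both GPUs fed: %.0f MH/s, one GPU idle: %.0f MH/s" % (busy_rate, starved_rate))
```

With a window this short, the display drops to the single-GPU rate as soon as the idle thread has been starved for the whole window.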

mamad

newbie

Activity: 17

Merit: 0

I am giving this a try but for some reason it keeps complaining about authentication error as soon as it finds new shares:

[hashkill] Version 0.2.5
[hashkill] Plugin 'bitcoin' loaded successfully
[hashkill] Found GPU device: Advanced Micro Devices, Inc. - Cayman
[hashkill] Found GPU device: Advanced Micro Devices, Inc. - Cayman
[hashkill] Found GPU device: Advanced Micro Devices, Inc. - Cayman
[hashkill] GPU0: AMD Radeon HD 6900 Series [busy:13%] [temp:41C]
[hashkill] GPU1: AMD Radeon HD 6900 Series [busy:0%] [temp:36C]
[hashkill] GPU2: AMD Radeon HD 6900 Series [busy:0%] [temp:44C]
[hashkill] Temperature threshold set to 90 degrees C
[hashkill] This plugin supports GPU acceleration.
[hashkill] Initialized hash indexes
[hashkill] Initialized thread mutexes
[hashkill] Spawned worker threads
[hashkill] Successfully connected and authorized at mining.bitcoin.cz:8332
[hashkill] Compiling OpenCL kernel source (amd_bitcoin.cl)
[hashkill] Binary size: 203728
[hashkill] Doing BFI_INT magic...

Mining statistics...
Speed: 976 MHash/sec [proc: 10] [subm: 1] [stale: 3] [eff: 10%]
[error] (ocl_bitcoin.c:318) Failure connecting and authenticating to server mining.bitcoin.cz at port 8332!
[error] (ocl_bitcoin.c:326) Retrying in 20s...
Speed: 0 MHash/sec [proc: 12] [subm: 1] [stale: 6] [eff: 8%]
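Going only by the two status lines above, the eff figure looks like submitted divided by processed, as a whole-number percentage (1/10 gives 10%, 1/12 gives 8%). That is an inference from this log, not something confirmed from the hashkill source. A small sketch that parses such a line and recomputes the value under that assumption (function names invented):

```python
# Parse a hashkill status line and recompute "eff" as subm/proc.
# Assumption: eff = 100 * subm / proc, truncated; inferred from the log only.
import re

STATUS_RE = re.compile(
    r"Speed: (?P<speed>\d+) MHash/sec "
    r"\[proc: (?P<proc>\d+)\] \[subm: (?P<subm>\d+)\] "
    r"\[stale: (?P<stale>\d+)\] \[eff: (?P<eff>\d+)%\]"
)


def parse_status(line):
    """Return the numeric fields of a status line as a dict, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def recomputed_eff(status):
    """Submitted shares as a whole-number percentage of processed work."""
    if status["proc"] == 0:
        return 0
    return 100 * status["subm"] // status["proc"]


for line in (
    "Speed: 976 MHash/sec [proc: 10] [subm: 1] [stale: 3] [eff: 10%]",
    "Speed: 0 MHash/sec [proc: 12] [subm: 1] [stale: 6] [eff: 8%]",
):
    status = parse_status(line)
    print("reported %d%%, recomputed %d%%" % (status["eff"], recomputed_eff(status)))
```

If that reading is right, an eff above 100% just means more shares were submitted than units of work were counted as processed.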

sagefool1975

newbie

Activity: 6

Merit: 0

LD_LIBRARY_PATH=/opt/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/

[hashkill] Version 0.2.5
[hashkill] Plugin 'bitcoin' loaded successfully
[error] (ocl-threads.c:97) clGetPlatformIDs returned error (no OpenCL installed?)
[hashkill] Threads queue size: 8 plaintexts/thread

ls -l /opt/AMD-APP-SDK-v2.4-lnx64/lib/x86_64/
total 19940
-rwxr-xr-x 1 root root 19089776 2011-03-28 22:06 libamdocl64.so
-rwxr-xr-x 1 root root 512998 2011-03-28 22:06 libGLEW.so
-rwxr-xr-x 1 root root 753434 2011-03-28 22:06 libglut.so
lrwxrwxrwx 1 root root 14 2011-06-09 18:32 libOpenCL.so -> libOpenCL.so.1
-rwxr-xr-x 1 root root 20864 2011-03-28 22:06 libOpenCL.so.1

twmz

hero member

Activity: 737

Merit: 500

Here is a data point for you. Not sure if it is useful.

I have been using poclbm (up-to-date from git). I decided to try hashkill for 24 hours to compare. Here is what I found.

* The hash rate shown on screen was in fact higher with hashkill vs poclbm. Close to 8% higher.
* % stale was not significantly better or worse (my pool supports LP). It was good with both (about 0.5%).
* However, the submitted share rate was significantly lower. I understand that the rate at which shares are found is driven somewhat by luck, but I feel like 24 hours of share data is long enough that a comparison is fair. I collected data from two different machines running Ubuntu:

Over two 24 hour periods:

Machine 1, 6850. poclbm: 3974 shares, 195 MH/s displayed. hashkill: 3630 shares, 201 MH/s displayed.
Machine 2, 2x6970. poclbm: 13256 shares, 670 MH/s displayed. hashkill: 12270 shares, 710 MH/s displayed.

I can't explain this data. But the reality is that I earned less on my PPS pool with hashkill.

One thing I noted on machine 2 is that twice, hashkill got into a state where the displayed hash rate was 355 MH/s and I could see from aticonfig output that one of the GPUs was idle. Both times, I could tell from the long poll messages in the output that this had been going on for only a few minutes. I'm sure this caused some drop in the rate of submitted shares, but I don't believe it can explain everything, because it lasted only a few minutes each time and because Machine 1, which has only a single GPU, also had a lower share rate.

Anyway, not sure what to make of this, but FYI.

gat3way

sr. member

Activity: 256

Merit: 250

Don't think so, but anyway I should check this possibility. Definitely though 2.1 would crash.

sagefool1975

newbie

Activity: 6

Merit: 0

Gotcha, I'll try 2.4 later today.

I did have 2.2 downloaded and tried it and got a segfault as well.

Any chance it is because one of my cards doesn't support opencl? (Old ATI 2600)

Topic: hashkill - testing bitcoin miner plugin - page 7. (Read 90981 times)