
Topic: python OpenCL bitcoin miner - page 18. (Read 1239035 times)

member
Activity: 73
Merit: 10
April 22, 2011, 08:37:31 PM
I just installed the 270.61 NVIDIA drivers, and they seem to crash each time I exit poclbm by clicking the X; if I Ctrl-C, it seems to exit fine most of the time. Nothing serious yet, but I keep getting these in my system event log:

"Display driver nvlddmkm stopped responding and has successfully recovered."



Have you done any overclocking?

No.
qed
full member
Activity: 196
Merit: 100
April 22, 2011, 07:37:33 PM

The if (H == 0) modification fixed my crossfire issue. Now all the GPUs are used at 97%-99%; before, I had one GPU at 99%, one at around 94%, and one at around 85%.

Wow, really? How could that be I wonder?

Did you do anything else besides pasting in the above section of code?

I just made all 3 of the changes posted, that's it. I'm using three HD 6950s on Windows 7 64-bit.
full member
Activity: 126
Merit: 100
April 22, 2011, 06:37:26 PM
It MAY have lowered the amplitude of the problem, but not by much; it still requires -f 60 to be somewhat stable.
full member
Activity: 126
Merit: 100
April 22, 2011, 06:31:24 PM
Hmm, I'm going to try it and see if it fixes mine too, brb.
legendary
Activity: 3920
Merit: 2349
Eadem mutata resurgo
April 22, 2011, 06:01:42 PM

The if (H == 0) modification fixed my crossfire issue. Now all the GPUs are used at 97%-99%; before, I had one GPU at 99%, one at around 94%, and one at around 85%.

Wow, really? How could that be I wonder?

Did you do anything else besides pasting in the above section of code?
qed
full member
Activity: 196
Merit: 100
April 22, 2011, 05:53:49 PM
The if (H == 0) modification fixed my crossfire issue. Now all the GPUs are used at 97%-99%; before, I had one GPU at 99%, one at around 94%, and one at around 85%.
full member
Activity: 126
Merit: 100
April 22, 2011, 01:54:24 PM
I go from 233 to 238 Mhash/s with that code, but fan speed has no effect at all with either version of the code.
full member
Activity: 196
Merit: 100
April 22, 2011, 01:07:58 PM
Code:
#ifdef VECTORS
if (H.x == 0)
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (H.y == 0)
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (H == 0)
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif

I can confirm that this code does give a faster hash rate.

Windows 7 64bit
XFX ATI HD 5850
SDK 2.1
11.3 Catalyst

With the original code I get ~249Mhash/s

With the new code suggested above I get ~251Mhash/s

Update: I noticed that this new code is temperature dependent with my setup. I turned my fan up from 50% to 60% and got 255Mhash/s.

At 50% fan speed I get 74°C
At 60% fan speed I get 66°C

If I use the old code turning the fan up has no effect. I'm not sure why this is.
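
For anyone wondering what the store in the snippet at the top of this post actually does, here is a rough Python model of it. It is only a sketch: the values of OUTPUT_SIZE and OUTPUT_MASK are assumptions picked for illustration (the real ones are defined in the kernel source), and the helper names are made up.

```python
# Assumed values for illustration only; the real constants live in the kernel source.
OUTPUT_SIZE = 256
OUTPUT_MASK = OUTPUT_SIZE - 1


def new_output_buffer():
    # OUTPUT_SIZE result slots plus one extra slot at index OUTPUT_SIZE,
    # which acts as a "something was found" flag.
    return [0] * (OUTPUT_SIZE + 1)


def record_nonce(output, nonce):
    # Mirrors: output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
    output[nonce & OUTPUT_MASK] = nonce
    output[OUTPUT_SIZE] = nonce


def collect_nonces(output):
    # The host only needs to scan the slots if the flag slot is non-zero.
    if output[OUTPUT_SIZE] == 0:
        return []
    return [n for n in output[:OUTPUT_SIZE] if n != 0]


buf = new_output_buffer()
record_nonce(buf, 0x12345678)
print([hex(n) for n in collect_nonces(buf)])
```

In this model a winning nonce of exactly 0 would go unnoticed, and two nonces with the same low bits would overwrite each other. Whether the real miner cares about either case is not something this sketch can tell you.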
sr. member
Activity: 308
Merit: 251
April 22, 2011, 09:18:03 AM
No. The kernel correctly computes only the first 4 bytes. It's confusing, because there is code in BitcoinMiner.cl, belowOrEquals(), which checks 8 bytes; this produces better assembler for some reason, at least in my setup. It can be replaced with 'if (H == 0)' (but it was slower). That's exactly why the targets are hard coded to difficulty of 1 (00000000 FFFF0000).

If you mean that this:
Code:
#ifdef VECTORS
if (belowOrEquals(H.x, targetH, G.x, targetG))
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (belowOrEquals(H.y, targetH, G.y, targetG))
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (belowOrEquals(H, targetH, G, targetG))
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
}
is faster than this:
Code:
#ifdef VECTORS
if (H.x == 0)
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (H.y == 0)
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (H == 0)
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
then I got very different results in my tests on Windows 7 x64 & HD5850.

With belowOrEquals I get ~270Mhps
With the H == 0 version I get ~275Mhps

That's a 1.8% speed increase.

Quote
Actually you are right, but in a different way - because of this I should use hard coded kernel target of 00000000 FFFFFFFF in order to not lose 1 thousandth of a percent of all valid difficulty=1 candidates. I'll do this with the next release.

I'd just stop the target from being sent to the GPU and do that H==0 thing. I see no logical reason that it would be slower. There might be illogical reasons, though :-)

Quote
Why do you check for 'lower' with a 'greater than' operator? Where did the '0x80000000' come from?

I guess I didn't explain that part that well. I chose 0x80000000 to make it easy to determine if bytes 4-7 were actually calculated correctly. I edited BitcoinMiner.cl to only accept hashes below 0x80000000. In the Python code, "> 0x80000000" (should've been >= though) is checking if that part of the hash is over 0x80000000 and warns the user if it is, i.e. if the GPU calculated the hash wrong. But none of that matters since it wasn't even supposed to calculate bytes 4-7 right.


mOmchill,

Is this H==0 mod going to go into the official version of the poclbm miner on GitHub?

Cheers.

I just tried this and I'm also getting around a 1.8% increase in speed.

debian sid x86_64
ATI APP v2.4
11.3 catalyst
pyopencl git
poclbm git
XFX ATI Radeon HD5830 xxx

with the git version I'm getting around 240000
adding the above I'm getting 245000
newbie
Activity: 55
Merit: 0
April 22, 2011, 07:16:54 AM
I just installed the 270.61 NVIDIA drivers, and they seem to crash each time I exit poclbm by clicking the X; if I Ctrl-C, it seems to exit fine most of the time. Nothing serious yet, but I keep getting these in my system event log:

"Display driver nvlddmkm stopped responding and has successfully recovered."



Have you done any overclocking?
member
Activity: 73
Merit: 10
April 21, 2011, 10:17:29 PM
I just installed the 270.61 NVIDIA drivers, and they seem to crash each time I exit poclbm by clicking the X; if I Ctrl-C, it seems to exit fine most of the time. Nothing serious yet, but I keep getting these in my system event log:

"Display driver nvlddmkm stopped responding and has successfully recovered."

newbie
Activity: 6
Merit: 0
April 21, 2011, 12:15:47 PM
Does anyone have the best settings for a 4850, or are the stock settings the best?
legendary
Activity: 3920
Merit: 2349
Eadem mutata resurgo
April 21, 2011, 04:28:53 AM
No. The kernel correctly computes only the first 4 bytes. It's confusing, because there is code in BitcoinMiner.cl, belowOrEquals(), which checks 8 bytes; this produces better assembler for some reason, at least in my setup. It can be replaced with 'if (H == 0)' (but it was slower). That's exactly why the targets are hard coded to difficulty of 1 (00000000 FFFF0000).

If you mean that this:
Code:
#ifdef VECTORS
if (belowOrEquals(H.x, targetH, G.x, targetG))
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (belowOrEquals(H.y, targetH, G.y, targetG))
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (belowOrEquals(H, targetH, G, targetG))
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
}
is faster than this:
Code:
#ifdef VECTORS
if (H.x == 0)
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (H.y == 0)
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (H == 0)
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
then I got very different results in my tests on Windows 7 x64 & HD5850.

With belowOrEquals I get ~270Mhps
With the H == 0 version I get ~275Mhps

That's a 1.8% speed increase.

Quote
Actually you are right, but in a different way - because of this I should use hard coded kernel target of 00000000 FFFFFFFF in order to not lose 1 thousandth of a percent of all valid difficulty=1 candidates. I'll do this with the next release.

I'd just stop the target from being sent to the GPU and do that H==0 thing. I see no logical reason that it would be slower. There might be illogical reasons, though :-)

Quote
Why do you check for 'lower' with a 'greater than' operator? Where did the '0x80000000' come from?

I guess I didn't explain that part that well. I chose 0x80000000 to make it easy to determine if bytes 4-7 were actually calculated correctly. I edited BitcoinMiner.cl to only accept hashes below 0x80000000. In the Python code, "> 0x80000000" (should've been >= though) is checking if that part of the hash is over 0x80000000 and warns the user if it is, i.e. if the GPU calculated the hash wrong. But none of that matters since it wasn't even supposed to calculate bytes 4-7 right.


mOmchill,

Is this H==0 mod going to go into the official version of the poclbm miner on GitHub?

Cheers.
newbie
Activity: 1
Merit: 0
April 13, 2011, 07:44:02 PM
Hello,

A pool I'm trying to connect to requires a higher TIMEOUT than the 5 set in BitcoinMiner.py. Would it be possible to make that into a parameter so I can avoid having to recompile it? I attempted it anyway but I just don't have the time to install and configure all the required prereqs.

Thanks,

Nameroc
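
One way the request above could be handled is a command-line option that falls back to the current value. This is only a sketch: the `--timeout` flag and the `parse_args` helper are hypothetical and are not part of poclbm; the only thing taken from the post is the default of 5.

```python
import argparse

# Default taken from the post above (TIMEOUT is 5 in BitcoinMiner.py).
DEFAULT_TIMEOUT = 5


def parse_args(argv=None):
    # Hypothetical option parser; "--timeout" is an illustrative flag name,
    # not an existing poclbm option.
    parser = argparse.ArgumentParser(description="hypothetical miner options")
    parser.add_argument(
        "--timeout",
        type=int,
        default=DEFAULT_TIMEOUT,
        help="connection timeout (default: %(default)s)",
    )
    return parser.parse_args(argv)


options = parse_args(["--timeout", "30"])
print(options.timeout)
```

The parsed value would then be passed to whatever code currently reads the TIMEOUT constant.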
member
Activity: 104
Merit: 10
April 13, 2011, 02:16:36 PM
No. The kernel correctly computes only the first 4 bytes. It's confusing, because there is code in BitcoinMiner.cl, belowOrEquals(), which checks 8 bytes; this produces better assembler for some reason, at least in my setup. It can be replaced with 'if (H == 0)' (but it was slower). That's exactly why the targets are hard coded to difficulty of 1 (00000000 FFFF0000).

If you mean that this:
Code:
#ifdef VECTORS
if (belowOrEquals(H.x, targetH, G.x, targetG))
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (belowOrEquals(H.y, targetH, G.y, targetG))
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (belowOrEquals(H, targetH, G, targetG))
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
}
is faster than this:
Code:
#ifdef VECTORS
if (H.x == 0)
{
output[OUTPUT_SIZE] = output[nonce.x & OUTPUT_MASK] = nonce.x;
}
else if (H.y == 0)
{
output[OUTPUT_SIZE] = output[nonce.y & OUTPUT_MASK] = nonce.y;
}
#else
if (H == 0)
{
output[OUTPUT_SIZE] = output[nonce & OUTPUT_MASK] = nonce;
}
#endif
then I got very different results in my tests on Windows 7 x64 & HD5850.

With belowOrEquals I get ~270Mhps
With the H == 0 version I get ~275Mhps

That's a 1.8% speed increase.
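
For reference, the percentage comes straight from the two hash rates quoted above. The helper name below is just for illustration.

```python
def speedup_percent(before, after):
    # Relative change from the old rate to the new rate, in percent.
    return (after - before) / before * 100.0


# ~270 Mhash/s with belowOrEquals, ~275 Mhash/s with H == 0 (figures from this post)
print(round(speedup_percent(270, 275), 2))  # prints 1.85
```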

Quote
Actually you are right, but in a different way - because of this I should use hard coded kernel target of 00000000 FFFFFFFF in order to not lose 1 thousandth of a percent of all valid difficulty=1 candidates. I'll do this with the next release.

I'd just stop the target from being sent to the GPU and do that H==0 thing. I see no logical reason that it would be slower. There might be illogical reasons, though :-)
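
To illustrate why dropping the target should not change which nonces are returned, here is a small Python model. The body of `below_or_equals` is an assumption about what the kernel helper does (an unsigned 64-bit "less than or equal" on the pair H, G); the real belowOrEquals() in BitcoinMiner.cl may be written differently.

```python
MAX_U32 = 0xFFFFFFFF


def below_or_equals(h, target_h, g, target_g):
    # Assumed semantics: treat (h, g) and (target_h, target_g) as 64-bit
    # unsigned values, high word first, and test (h, g) <= target.
    return h < target_h or (h == target_h and g <= target_g)


def h_is_zero(h):
    # The simplified test proposed in this thread.
    return h == 0


# With the target 00000000 FFFFFFFF, G can never exceed target_g,
# so the two tests agree for every value of G.
samples = [0, 1, 0xFFFF0000, 0x80000000, MAX_U32]
agree = all(
    below_or_equals(h, 0, g, MAX_U32) == h_is_zero(h)
    for h in samples
    for g in samples
)
print(agree)
```

With the older target 00000000 FFFF0000 the two tests would differ only when H is 0 and G is above FFFF0000.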

Quote
Why do you check for 'lower' with a 'greater than' operator? Where did the '0x80000000' come from?

I guess I didn't explain that part that well. I chose 0x80000000 to make it easy to determine if bytes 4-7 were actually calculated correctly. I edited BitcoinMiner.cl to only accept hashes below 0x80000000. In the Python code, "> 0x80000000" (should've been >= though) is checking if that part of the hash is over 0x80000000 and warns the user if it is, i.e. if the GPU calculated the hash wrong. But none of that matters since it wasn't even supposed to calculate bytes 4-7 right.
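
A host-side version of that sanity check might look roughly like the sketch below. All names are hypothetical and this is not the code from BitcoinMiner.py. It assumes the usual convention of reading the double SHA-256 digest as a little-endian 256-bit number, so H is the most significant 32-bit word and G the next one; the word and byte order inside the kernel may differ.

```python
import hashlib

SANITY_LIMIT = 0x80000000  # the test value described in the post above


def block_hash_int(header):
    # Double SHA-256 of the header, read as a little-endian integer.
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return int.from_bytes(digest, "little")


def split_top_words(hash_int):
    # Two most significant 32-bit words of a 256-bit hash value.
    h = hash_int >> 224
    g = (hash_int >> 192) & 0xFFFFFFFF
    return h, g


def g_out_of_range(g, limit=SANITY_LIMIT):
    # ">=" rather than ">", following the correction noted in the post.
    return g >= limit


# The difficulty-1 target itself splits into 00000000 FFFF0000.
h, g = split_top_words(0xFFFF << 208)
print(f"{h:08X} {g:08X}")
```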
full member
Activity: 124
Merit: 100
April 13, 2011, 12:15:16 PM
Can someone give me advice on how to stabilize this miner on a 5970, overclocked with MSI Afterburner to 875 MHz core | 695 MHz memory? The Mhash/s are way too low, approx. 530; they should be around 640-650. Current extra flags are -v -w128.
I'm using the early preview version of 11.4 under Windows 7 x64 because every other driver gives me a BSOD.

sr. member
Activity: 350
Merit: 250
April 12, 2011, 12:44:37 PM
If you run your miner on a dedicated rig, then the -f 0 flag is handy. That will really lag your display, but will run the miner at full power.
full member
Activity: 171
Merit: 127
April 12, 2011, 11:49:48 AM
I think I've found a bug in poclbm. If I'm right, it goes like this:

The OpenCL part is supposed to calculate the 8 first bytes of the hash.

No. The kernel correctly computes only the first 4 bytes. It's confusing, because there is code in BitcoinMiner.cl, belowOrEquals(), which checks 8 bytes; this produces better assembler for some reason, at least in my setup. It can be replaced with 'if (H == 0)' (but it was slower). That's exactly why the targets are hard coded to difficulty of 1 (00000000 FFFF0000).

Quote
The 4 first bytes are calculated fine, but the next ones are not, so they end up being essentially random numbers. In specific situations, it could lead to the miner ignoring a valid block.

Actually you are right, but in a different way - because of this I should use hard coded kernel target of 00000000 FFFFFFFF in order to not lose 1 thousandth of a percent of all valid difficulty=1 candidates. I'll do this with the next release.
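
A quick check of the "1 thousandth of a percent" figure, assuming the miscomputed second word G is effectively a uniform random 32-bit value (that uniformity is an assumption, and the function name is made up):

```python
def rejected_fraction(target_g):
    # Share of H == 0 candidates thrown away because a random 32-bit G
    # happens to land above target_g.
    return (0xFFFFFFFF - target_g) / 2**32


# Current hard-coded target 00000000 FFFF0000
print(f"{rejected_fraction(0xFFFF0000) * 100:.4f}%")  # prints 0.0015%
# Proposed target 00000000 FFFFFFFF
print(f"{rejected_fraction(0xFFFFFFFF) * 100:.4f}%")  # prints 0.0000%
```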

Quote
            
Code:
elif h[6] > 0x80000000:
...

I saw the target given to OpenCL is hardcoded. I changed 0xFFFF0000 to 0x80000000. That should cause the OpenCL part to return nonces that result in hashes having the first 8 bytes lower than 0x0000000080000000, right?

Why do you check for 'lower' with a 'greater than' operator? Where did the '0x80000000' come from?

Anyway, thank you for your comments, they are always welcome.
newbie
Activity: 8
Merit: 0
April 12, 2011, 01:54:34 AM
jgarzik: I tested this against vanilla client to be sure blocks are actually accepted. On ATI 4350 it makes ~5800 khash/s.
Did you have to do anything special for it to work with a 4350? I can't get mine to work: after installing the latest AMD drivers, the SDK says there is no compliant OpenCL video card.
full member
Activity: 171
Merit: 127
April 12, 2011, 12:17:35 AM

I'm using the following flags : -v -w128

Judging from the hardware comparison table, I should be able to pull out some more, possibly up to 350. Do you think those kinds of results are only possible on a dedicated box running Linux? I'm not sure whether to downgrade the SDK to 2.1/2.2 to test it though. I've also heard people were having problems with the GUI miner on 2.1, thoughts?

You can try with lower -f, i.e. under 20. Or you can overclock your GPU (core, memory doesn't affect performance) a little. I don't think downgrading to 2.2 will bring much more.