
Topic: BFGMiner/CGMiner, Catalyst 13.4 and 58xx: Fix!

member
Activity: 106
Merit: 10
September 30, 2013, 12:48:34 PM
#13
Sadly, it did not work in my case (Radeon Mobility 5870, 13.10 beta and Win8 x64). I'm using bfgminer now, where it works :)
newbie
Activity: 35
Merit: 0
Thank you so much!

This worked like a charm for me. I upgraded my rig from BAMT to something more modern and ran into this problem.
I'm now successfully running the following configuration, with hash rates about the same as before, as near as I can remember.

  • AMD/ATI Radeon 5770
  • Linux Kernel 3.8.0-29-generic x64 (Ubuntu)
  • Catalyst 13.101 (experimental)
  • AMD-APP-SDK-V2.8


-E
legendary
Activity: 2576
Merit: 1186
The old BFI_INT patching doesn't work with SDKs newer than the one included in Catalyst 13.1.

I'll have the next release of BFGMiner (3.1.2) disable the patching on newer SDKs, but note the bitselect implementation does not perform as well (about 6 Mh/s lost).

These drivers seem to also have some ADL issues (at least for me), so I'm adding a workaround for that too.
newbie
Activity: 1
Merit: 0
For reference, the same error occurred using guiminer and a Radeon 6950 after updating to 13.4. The fix in the first post was successful in correcting the error.
full member
Activity: 193
Merit: 100
Thanks for referencing my post :)
newbie
Activity: 2
Merit: 0
Hi,

I own a little ATI Mobility Radeon 5870 with the 13.4 driver and have been getting the HW errors too since I updated my driver.
I saw your post and tried it, but with no success; I'm still getting HW errors.
When I modified my file with this code:
Code:
#ifdef BITALIGN
#pragma OPENCL EXTENSION cl_amd_media_ops : enable
#define rot(x, y) amd_bitalign(x, x, (uint)(32 - y))

// This part is not from the stock poclbm kernel. It's part of an optimization
// added in the Phoenix Miner.

// Some AMD devices have Vals[0] BFI_INT opcode, which behaves exactly like the
// SHA-256 Ch function, but provides it in exactly one instruction. If
// detected, use it for Ch. Otherwise, construct Ch out of simpler logical
// primitives.

//We have an SDK which automatically optimizes to BFI INT, so lets do this

#define Ch(x, y, z) bitselect(z, y, x)
#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)

#else // BITALIGN
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Ma(x, y, z) ((x & z) | (y & (x | z)))
#define rot(x, y) rotate((u)x, (u)y)
#define rotr(x, y) rotate((u)x, (u)(32-y))
#endif

bfgminer crashes at startup.

PS: The tweak with poclbm from your link works fine, but I don't like to use GUIMiner.

newbie
Activity: 28
Merit: 0
Good first post!
newbie
Activity: 14
Merit: 0
Thanks!

Hopefully when I'm out of the newbie corner I can move it to a more visible area :)

Yeah, you're not going to get very good answers in this part of the forum.
newbie
Activity: 10
Merit: 0
Thanks!

Hopefully when I'm out of the newbie corner I can move it to a more visible area :)
member
Activity: 72
Merit: 10
I do not own a 5000 series card, nor have I ever encountered this problem, but I'd just like to say nice work putting this together. I think it's worthy of a bump, at the least. :)
newbie
Activity: 10
Merit: 0
Don't have that card so I can't test if it helps, but you could provide a patch to make it even easier to implement.

I'd be glad to; however, it's not worth implementing in its current form. I removed some of the checks that might need to be there for other card series!

Code snippet of the diff:
Code:
--- phatk121016.cl      2013-06-07 09:38:40.000000000 +0100
+++ phatk121016-modified.cl     2013-06-07 10:41:05.000000000 +0100
@@ -57,27 +57,12 @@
 // SHA-256 Ch function, but provides it in exactly one instruction. If
 // detected, use it for Ch. Otherwise, construct Ch out of simpler logical
 // primitives.
-
- #ifdef BFI_INT
-       // Well, slight problem... It turns out BFI_INT isn't actually exposed to
-       // OpenCL (or CAL IL for that matter) in any way. However, there is
-       // a similar instruction, BYTE_ALIGN_INT, which is exposed to OpenCL via
-       // amd_bytealign, takes the same inputs, and provides the same output.
-       // We can use that as a placeholder for BFI_INT and have the application
-       // patch it after compilation.

-       // This is the BFI_INT function
-       #define Ch(x, y, z) amd_bytealign(x,y,z)
-       // Ma can also be implemented in terms of BFI_INT...
-       #define Ma(z, x, y) amd_bytealign(z^x,y,x)
- #else // BFI_INT
-       // Later SDKs optimise this to BFI INT without patching and GCN
-       // actually fails if manually patched with BFI_INT
-
-       #define Ch(x, y, z) bitselect((u)z, (u)y, (u)x)
+       //We have an SDK which automatically optimizes to BFI INT, so lets do this
+       #define Ch(x, y, z) bitselect(z, y, x)
        #define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
        #define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
- #endif
+
 #else // BITALIGN
        #define Ch(x, y, z) (z ^ (x & (y ^ z)))
        #define Ma(x, y, z) ((x & z) | (y & (x | z)))


...and a pastebin for those that prefer it:

http://pastebin.com/tZ7DeV2c
newbie
Activity: 6
Merit: 0
Don't have that card so I can't test if it helps, but you could provide a patch to make it even easier to implement.
newbie
Activity: 10
Merit: 0
Hey guys, figured I'd make my first post useful.

If you have a 58xx series card (other series may be affected too) and upgraded your Catalyst drivers to the latest version, you may have noticed that you are getting nothing but this:

Quote
GPU0: invalid nonce - HW error

Of course, this is no good! Every result the GPU returns is being thrown out as invalid, so you are contributing nothing to whatever pool you might be on!

The reason for this seems to be that amd_bytealign is either working differently or no longer usable in the 13.4 drivers (and probably a few before it). The phatk kernel that cgminer and bfgminer use was written to harness this optimization; however, with the newer drivers, "bitselect" is automatically optimized to BFI_INT without us having to do it!
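
For anyone who wants to convince themselves that bitselect really is a drop-in for the Ch function, here is a minimal host-side sketch in plain C (not OpenCL, so it runs without a GPU). It assumes the OpenCL definition of bitselect(a, b, c): each result bit comes from b where c has a 1 and from a where c has a 0. The helper names bitselect32, ch_ref and ch_matches_bitselect are made up for this illustration and are not part of any miner.

```c
#include <stdint.h>

/* Hypothetical host-side stand-in for OpenCL's bitselect(a, b, c):
 * take the bit from b where c is 1, otherwise take it from a. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* SHA-256 Ch exactly as written in the non-BITALIGN branch of the kernel. */
static uint32_t ch_ref(uint32_t x, uint32_t y, uint32_t z)
{
    return z ^ (x & (y ^ z));
}

/* Ch is purely bitwise, so trying all 8 combinations of all-zero and
 * all-one words covers every possible bit pattern.
 * Returns 1 if bitselect(z, y, x) matches Ch(x, y, z) in every case. */
static int ch_matches_bitselect(void)
{
    for (uint32_t bits = 0; bits < 8; bits++) {
        uint32_t x = (bits & 4u) ? 0xFFFFFFFFu : 0u;
        uint32_t y = (bits & 2u) ? 0xFFFFFFFFu : 0u;
        uint32_t z = (bits & 1u) ? 0xFFFFFFFFu : 0u;
        if (bitselect32(z, y, x) != ch_ref(x, y, z))
            return 0;
    }
    return 1;
}
```

Calling ch_matches_bitselect() from a small main returns 1. Note the argument order: Ch(x, y, z) maps to bitselect(z, y, x), because x is the selector.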

So, how do you fix it? Simple!

1. To start, navigate to your miner's folder and look for a file that begins with "phatk" and ends with ".cl". This is the OpenCL source code for the kernel; since it's plain text, it can be opened in any text editor (I highly recommend Notepad++!).

2. The inside of this file may look a bit daunting, but don't worry, only a few small changes are needed! Around line 61 (depending on your version of the miner) you'll see the following (or something very similar):

Code:
#ifdef BITALIGN
#pragma OPENCL EXTENSION cl_amd_media_ops : enable
#define rot(x, y) amd_bitalign(x, x, (uint)(32 - y))

// This part is not from the stock poclbm kernel. It's part of an optimization
// added in the Phoenix Miner.

// Some AMD devices have Vals[0] BFI_INT opcode, which behaves exactly like the
// SHA-256 Ch function, but provides it in exactly one instruction. If
// detected, use it for Ch. Otherwise, construct Ch out of simpler logical
// primitives.
 #ifdef BFI_INT
// Well, slight problem... It turns out BFI_INT isn't actually exposed to
// OpenCL (or CAL IL for that matter) in any way. However, there is
// a similar instruction, BYTE_ALIGN_INT, which is exposed to OpenCL via
// amd_bytealign, takes the same inputs, and provides the same output.
// We can use that as a placeholder for BFI_INT and have the application
// patch it after compilation.

// This is the BFI_INT function
#define Ch(x, y, z) amd_bytealign(x,y,z)
// Ma can also be implemented in terms of BFI_INT...
#define Ma(z, x, y) amd_bytealign(z^x,y,x)
 #else // BFI_INT
// Later SDKs optimise this to BFI INT without patching and GCN
// actually fails if manually patched with BFI_INT

#define Ch(x, y, z) bitselect((u)z, (u)y, (u)x)
#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)
 #endif
#else // BITALIGN
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Ma(x, y, z) ((x & z) | (y & (x | z)))
#define rot(x, y) rotate((u)x, (u)y)
#define rotr(x, y) rotate((u)x, (u)(32-y))
#endif

Part of this block is the problematic code for the Catalyst 13.4 drivers, so we need to change it!

3. Delete the above block of code (approx. lines 49 to 86 inclusive), and replace it with the following:
Code:
#ifdef BITALIGN
#pragma OPENCL EXTENSION cl_amd_media_ops : enable
#define rot(x, y) amd_bitalign(x, x, (uint)(32 - y))

// This part is not from the stock poclbm kernel. It's part of an optimization
// added in the Phoenix Miner.

// Some AMD devices have Vals[0] BFI_INT opcode, which behaves exactly like the
// SHA-256 Ch function, but provides it in exactly one instruction. If
// detected, use it for Ch. Otherwise, construct Ch out of simpler logical
// primitives.

//We have an SDK which automatically optimizes to BFI INT, so lets do this

#define Ch(x, y, z) bitselect(z, y, x)
#define Ma(x, y, z) bitselect((u)x, (u)y, (u)z ^ (u)x)
#define rotr(x, y) amd_bitalign((u)x, (u)x, (u)y)

#else // BITALIGN
#define Ch(x, y, z) (z ^ (x & (y ^ z)))
#define Ma(x, y, z) ((x & z) | (y & (x | z)))
#define rot(x, y) rotate((u)x, (u)y)
#define rotr(x, y) rotate((u)x, (u)(32-y))
#endif

This avoids all of the logic that could cause you problems.

4. If your miner is running, shut it down. If you have any leftover files starting with "phatk" and ending in ".bin", it's probably best to delete those, since they are cached builds of the old kernel.

5. Start up your miner. It should now start accepting blocks!
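
As a sanity check on the replacement block from step 3, the sketch below compares its BITALIGN macros (Ma, rot, rotr) against the plain fallback versions in the #else branch. It is plain C on the host, not OpenCL. It assumes amd_bitalign(hi, lo, shift) returns the low 32 bits of the 64-bit value hi:lo shifted right by (shift & 31), which is how I read the cl_amd_media_ops extension, and that OpenCL's rotate(x, n) rotates left. All function names here are hypothetical stand-ins written for this illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for OpenCL bitselect(a, b, c). */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Assumed behaviour of amd_bitalign(hi, lo, shift):
 * low 32 bits of ((hi:lo) >> (shift & 31)). */
static uint32_t bitalign32(uint32_t hi, uint32_t lo, uint32_t shift)
{
    uint64_t wide = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(wide >> (shift & 31u));
}

/* OpenCL rotate(x, n): rotate left by n modulo 32. */
static uint32_t rotate32(uint32_t x, uint32_t n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* BITALIGN branch of the replacement block. */
static uint32_t ma_fast(uint32_t x, uint32_t y, uint32_t z) { return bitselect32(x, y, z ^ x); }
static uint32_t rot_fast(uint32_t x, uint32_t y)  { return bitalign32(x, x, 32u - y); }
static uint32_t rotr_fast(uint32_t x, uint32_t y) { return bitalign32(x, x, y); }

/* Fallback (#else) branch. */
static uint32_t ma_ref(uint32_t x, uint32_t y, uint32_t z) { return (x & z) | (y & (x | z)); }
static uint32_t rot_ref(uint32_t x, uint32_t y)  { return rotate32(x, y); }
static uint32_t rotr_ref(uint32_t x, uint32_t y) { return rotate32(x, 32u - y); }

/* Returns 1 if both branches agree on every case tried below. */
static int branches_agree(void)
{
    const uint32_t word = 0x6A09E667u; /* arbitrary asymmetric test word */

    /* Ma is bitwise, so all-zero/all-one words cover every bit pattern. */
    for (uint32_t bits = 0; bits < 8; bits++) {
        uint32_t x = (bits & 4u) ? 0xFFFFFFFFu : 0u;
        uint32_t y = (bits & 2u) ? 0xFFFFFFFFu : 0u;
        uint32_t z = (bits & 1u) ? 0xFFFFFFFFu : 0u;
        if (ma_fast(x, y, z) != ma_ref(x, y, z))
            return 0;
    }

    /* Compare both rotations for every shift amount from 1 to 31. */
    for (uint32_t n = 1; n < 32; n++) {
        if (rot_fast(word, n) != rot_ref(word, n))
            return 0;
        if (rotr_fast(word, n) != rotr_ref(word, n))
            return 0;
    }
    return 1;
}
```

Under those assumptions branches_agree() returns 1, meaning the two branches compute the same values and the BITALIGN one is only a faster route to the same result. This says nothing about what a given driver actually compiles the kernel to; it only checks the math.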


Hope this has helped. Note that this is modified from the following post regarding a similar problem on poclbm: https://bitcointalksearch.org/topic/amd-134-and-phatk-kernel-on-poclbm-possible-workarround-221041.

Happy mining!