OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 662.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: -ck on February 10, 2012, 01:45:46 AM

Quote from: Diapolo on February 10, 2012, 01:27:55 AM

Quote from: -ck on February 10, 2012, 12:59:47 AM

Okay, so as I wrote, if Phatk works, then the base-nonces passed to the kernel should be correct for diakgcn. I will check the phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn, that's not using bitalign() or any other OpenCL function, but simply does it's thing directly. What is that for, i'm not sure if that's needed for a GCN kernel anyway

.
Another idea, are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working Cheesy

, I stared at it quite a few hours too ^^.

BITALIGN is to enable amd media ops for platforms that have it.

BFI INT patching does NOT work on Tahiti. It makes a corrupt kernel. SDK2.6 automatically uses the BFI INT instruction anyway so there is no need for this crappy patching.

That's what I said, to be sure that BFI_INT patching is DISABLED for Tahiti, I thought it coule be active, so that would have been a problem

. BITALIGN flag is not needed for DiaKGCN, because amd_bitalign() is use nowhere ... cl_amd_media_ops is only needed for doing BFI_INT patching on non GCN hardware (where amd_bytealign() is "patched" into bfi_int instruction). The amd_bitalign() was used with former SDKs to speedup the rotations, these are now optimized via the OpenCL compiler into bitalign anyway.

Dia

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Diapolo on February 10, 2012, 01:27:55 AM

Quote from: -ck on February 10, 2012, 12:59:47 AM

Edit: Perhaps we could try my old approach of writing to output in the kernel, because I know that worked for me?

That's the code I used, but uses your NFLAG. It would need to scan the output buffer on host side everytime after a kernel execution, which could lead to higher CPU usage (and needs changes in host code), but saves the IF-clause and another write into output (which saves the kernel quite some instructions, even on GCN).

Code:

u result = (V[7] == 0x136032ed) * nonce;
output[NFLAG & result] = result;

This code would be more like your current code, but uses the approach of comparison and mul to save 0 or a positive nonce in result (and is slower than your current code). But for sure that can't be the problem we are looking for ...

Code:

u result = (V[7] == 0x136032ed) * nonce;
if (result)
	output[FOUND] = output[NFLAG & result] = result;

Dia

Writing to output on every iteration isn't going to fix the problem, and I can't see how this would help to be honest. Note that your last code will end up setting output[FOUND] to 0 and would undo anything you wrote to it with other threads Wink

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Diapolo on February 10, 2012, 01:27:55 AM

Quote from: -ck on February 10, 2012, 12:59:47 AM

Okay, so as I wrote, if Phatk works, then the base-nonces passed to the kernel should be correct for diakgcn. I will check the phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn, that's not using bitalign() or any other OpenCL function, but simply does it's thing directly. What is that for, i'm not sure if that's needed for a GCN kernel anyway

.
Another idea, are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working Cheesy

, I stared at it quite a few hours too ^^.

BITALIGN is to enable amd media ops for platforms that have it.

BFI INT patching does NOT work on Tahiti. It makes a corrupt kernel. SDK2.6 automatically uses the BFI INT instruction anyway so there is no need for this crappy patching.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: -ck on February 10, 2012, 12:59:47 AM

Quote from: Diapolo on February 10, 2012, 12:54:14 AM

Well ... as I said, I have no IDE setup, so currently I can't compile a version for myself. If you don't have the time to fiddle around with my commits, then I really need help in setting up an IDE in Windows. Have you got this in a readme, wiki or can you give me a brief explanation in how to do this? I worked with MS VC++ Express as a hobby some time ago ...

You said local copy, is it a copy of the last version of my fork? As you've observed I am new to this kind of working, but I hope you see my progress Cheesy

.

Dia

Compiling this on windows is nothing short of a DISASTER so forget it.

Anyway I fixed up a few things on my Diapolo branch on github. Pull the changes to bring your local tree into sync. Alas I'm still only getting HW errors, so there's clearly something wrong. The return code for giving me a nonce I use works fine, provided I'm testing for the right thing before sending the nonce back. I've stared at it for half a day and can't find what's wrong. I even tried diablo's kernel and encountered exactly the same problem. For some reason I keep thinking it's something to do with confusion about the initial offset of the nonce and what is passed to the kernel.

Okay, so as I wrote, if Phatk works, then the base-nonces passed to the kernel should be correct for diakgcn. I will check the phatk.cl to be sure. I saw you added a BITALIGN path to diakgcn, that's not using bitalign() or any other OpenCL function, but simply does it's thing directly. What is that for, i'm not sure if that's needed for a GCN kernel anyway

.
Another idea, are you applying a BFI_INT patch on Tahiti (it must not use amd_bytealign())? This is not needed and produces wrong values ... I want that damn thing working Cheesy

, I stared at it quite a few hours too ^^.

Edit: Perhaps we could try my old approach of writing to output in the kernel, because I know that worked for me?

That's the code I used, but uses your NFLAG. It would need to scan the output buffer on host side everytime after a kernel execution, which could lead to higher CPU usage (and needs changes in host code), but saves the IF-clause and another write into output (which saves the kernel quite some instructions, even on GCN).

Code:

u result = (V[7] == 0x136032ed) * nonce;
output[NFLAG & result] = result;

This code would be more like your current code, but uses the approach of comparison and mul to save 0 or a positive nonce in result (and is slower than your current code). But for sure that can't be the problem we are looking for ...

Code:

u result = (V[7] == 0x136032ed) * nonce;
if (result)
	output[FOUND] = output[NFLAG & result] = result;

Dia

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Diapolo on February 10, 2012, 12:54:14 AM

Well ... as I said, I have no IDE setup, so currently I can't compile a version for myself. If you don't have the time to fiddle around with my commits, then I really need help in setting up an IDE in Windows. Have you got this in a readme, wiki or can you give me a brief explanation in how to do this? I worked with MS VC++ Express as a hobby some time ago ...

You said local copy, is it a copy of the last version of my fork? As you've observed I am new to this kind of working, but I hope you see my progress Cheesy

.

Dia

Compiling this on windows is nothing short of a DISASTER so forget it.

Anyway I fixed up a few things on my Diapolo branch on github. Pull the changes to bring your local tree into sync. Alas I'm still only getting HW errors, so there's clearly something wrong. The return code for giving me a nonce I use works fine, provided I'm testing for the right thing before sending the nonce back. I've stared at it for half a day and can't find what's wrong. I even tried diablo's kernel and encountered exactly the same problem. For some reason I keep thinking it's something to do with confusion about the initial offset of the nonce and what is passed to the kernel.

Diapolo

hero member

Activity: 772

Merit: 500

Quote from: -ck on February 09, 2012, 04:34:41 PM

Quote from: Diapolo on February 09, 2012, 02:25:36 PM

Hey Con,

I looked again through every kernel argument and compared line by line with my Python code. I found 2 small differences and 2 brackets, that are not needed (see last commit https://github.com/Diapolo/cgminer/commit/68e36c657318fbe1e7714be470cf954a1d512333), but I guess they don't fix the persisting problem with false-positive nonces (perhaps you can give it a try - I have no compiler or IDE setup to test it by myself). The argument order is exactly as DiaKGCN awaits it, so that can't be the problem either.

It could be a problem of your changes to the output code in the kernel, a problem with the base-nonces, who are passed to the kernel or something with the output-buffer in the CGMINER host code ... :-/. Where resides the output-buffer processing? As I said my kernel used ulong * natively, which I changed to uint * in one commit of my fork, I guess I need to look at it.

Edit: OMFG, I introduced a bug with one of my former commits, which changed the type of the output buffer from uint * to int * ... fixed that one! It's time for another try Con Cheesy

.

Dia

Diapolo... I appreciate the effort you're putting in, and I realise you're new to this collaborative coding and source control management, but probably a good idea to see your code actually compiles before you ask someone to test it. Usually people compile and test their own code before asking someone else to test it for them.

Anyway... I fixed the !(find) in my local copy and it still produces hardware errors.

edit: It doesn't matter what vectors or worksize I try this with.

Well ... as I said, I have no IDE setup, so currently I can't compile a version for myself. If you don't have the time to fiddle around with my commits, then I really need help in setting up an IDE in Windows. Have you got this in a readme, wiki or can you give me a brief explanation in how to do this? I worked with MS VC++ Express as a hobby some time ago ...

You said local copy, is it a copy of the last version of my fork? As you've observed I am new to this kind of working, but I hope you see my progress Cheesy

.

Dia

kano

legendary

Activity: 4634

Merit: 1851

Linux since 1997 RedHat 4

Quote from: SAC on February 09, 2012, 06:09:04 PM

Quote from: gnar1ta$ on February 09, 2012, 06:04:23 PM

Didn't' forget about this Tongue

I ran 2.1.2 with no errors for 18 hours then started 2.2.3 yesterday with the same flags and got this:

Weird how las initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.

Both 2.2.2 and 2.2.3 will not even start on the machine I compile on they fail to initialize the GPUs never tried on the others I have.

Just an aside ... don't try to use 2.2.2

rcocchiararo

newbie

Activity: 78

Merit: 0

something to add to my issue:

1) autologin is enabled on that rig

2) if a display is attached, i can remote controll it with vnc (ubuntu/debian integrated)
3) if no display is attached, that wont work

gnar1ta$

donator

Activity: 798

Merit: 500

Anyone know how to solve rcocchiararo's "no protocol specified" issue? I've reformatted a couple rigs cuz I couldn't figure that one out. Sometimes host + works, but usually not.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Endeavour79 on February 09, 2012, 06:14:13 PM

Quote from: -ck on February 09, 2012, 06:12:11 PM

Quote from: Endeavour79 on February 09, 2012, 06:06:51 PM

Thanks for the reply ckolivas and again thanks for the good work..
If I remember right, with version 1.5.1 or a bit later you've upgraded the kernels and again with version 2.2.1or3. I don't blame you for the performance decrease ckolivas! I just want to find out what other users do for best performance, what the best config is.
Btw.. do you still support SDK 2.1, in your FAQ you mention 2.4/2.5 only.

The kernel updates recently were purely bugfixes for platforms they wouldn't work Tongue

SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.

Thanks. Will try 11.6 with SDK 2.1 then. Post an update soon..
Btw.. 1.5.x was best performance ever.. Tongue

I seriously cannot see how that can happen...

Endeavour79

full member

Activity: 174

Merit: 100

Quote from: -ck on February 09, 2012, 06:12:11 PM

Quote from: Endeavour79 on February 09, 2012, 06:06:51 PM

Thanks for the reply ckolivas and again thanks for the good work..
If I remember right, with version 1.5.1 or a bit later you've upgraded the kernels and again with version 2.2.1or3. I don't blame you for the performance decrease ckolivas! I just want to find out what other users do for best performance, what the best config is.
Btw.. do you still support SDK 2.1, in your FAQ you mention 2.4/2.5 only.

The kernel updates recently were purely bugfixes for platforms they wouldn't work Tongue

SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.

Thanks. Will try 11.6 with SDK 2.1 then. Post an update soon..
Btw.. 1.5.x was best performance ever.. Tongue

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Endeavour79 on February 09, 2012, 06:06:51 PM

Thanks for the reply ckolivas and again thanks for the good work..
If I remember right, with version 1.5.1 or a bit later you've upgraded the kernels and again with version 2.2.1or3. I don't blame you for the performance decrease ckolivas! I just want to find out what other users do for best performance, what the best config is.
Btw.. do you still support SDK 2.1, in your FAQ you mention 2.4/2.5 only.

The kernel updates recently were purely bugfixes for platforms they wouldn't work Tongue

SDK 2.1 should work fine for 5x cards with poclbm. I don't think they work with phatk.

SAC

sr. member

Activity: 322

Merit: 250

Quote from: gnar1ta$ on February 09, 2012, 06:04:23 PM

Didn't' forget about this Tongue

I ran 2.1.2 with no errors for 18 hours then started 2.2.3 yesterday with the same flags and got this:

Weird how las initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.

Both 2.2.2 and 2.2.3 will not even start on the machine I compile on they fail to initialize the GPUs never tried on the others I have.

Endeavour79

full member

Activity: 174

Merit: 100

Quote from: -ck on February 09, 2012, 05:56:21 PM

Quote from: Endeavour79 on February 09, 2012, 05:54:11 PM

Over time, with new CGMINER versions, never kernels and for sure updated drivers/app sdk the performance is lower and lower.

Currently with 12.1 driver and APP SDK 2.5 I only get around 280Mhash with poclbm and worksize 128 (I8) (tried many settings and different kernels and this works out best).

See this is the thing. You're saying it's the newer kernels and the updated drivers and sdk.... but there have been no updated kernels. They are essentially unchanged for 7 months now. So look at the other things you've blamed instead.

Thanks for the reply ckolivas and again thanks for the good work..
If I remember right, with version 1.5.1 or a bit later you've upgraded the kernels and again with version 2.2.1or3. I don't blame you for the performance decrease ckolivas! I just want to find out what other users do for best performance, what the best config is.
Btw.. do you still support SDK 2.1, in your FAQ you mention 2.4/2.5 only.

gnar1ta$

donator

Activity: 798

Merit: 500

Quote from: -ck on February 08, 2012, 09:09:28 AM

Quote from: gnar1ta$ on February 08, 2012, 09:07:49 AM

Quote from: ?? on ??

Hello ck!

Apparently the problem with "OFF" decreased but not disappeared completely:

Same here. Ran 18 hrs and got 3 of 4 cards showing OFF with 2.2.3. Started 2.1.2 with the same flags last night, I'll check it in 8 hours. Are you using --auto-fan? Not using auto-fan seemed to solve it in 2.2.1, until my power crashed anyway.

Can you check in the menu when the GPUs were "last initialised" ?

Didn't' forget about this Tongue

I ran 2.1.2 with no errors for 18 hours then started 2.2.3 yesterday with the same flags and got this:

Code:

cgminer version 2.2.3 - Started: [2012-02-08 22:52:08]

GPU 2:  74.0C 1801RPM | OFF  / 36.9Mh/s | A: 545 R:  2 HW:0 U: 0.48/m I: 8

GPU 2: 0.0 / 37.1 Mh/s | A:545  R:2  HW:0  U:0.48/m  I:8
74.0 C  F: 31% (1805 RPM)  E: 157 MHz  M: 200 Mhz  V: 0.950V  A: 0% P: 0%
Last initialised: [2012-02-08 22:52:12]
Intensity: 8
Thread 4: 0.0 Mh/s Disabled ALIVE
Thread 5: 0.0 Mh/s Disabled ALIVE

Log entry:

Code:

[2012-02-09 00:48:34] Device 2 idle for more than 60 seconds, GPU 2 declared SICK!
[2012-02-09 00:48:34] Attempting to restart GPU
[2012-02-09 00:48:34] Thread 4 still exists, killing it off
[2012-02-09 00:48:34] Thread 5 still exists, killing it off
[2012-02-09 00:48:35] Thread 4 restarted
[2012-02-09 00:48:35] Thread 5 restarted
[2012-02-09 00:48:36] Thread 4 being disabled
[2012-02-09 00:48:36] Thread 5 being disabled

Weird how las initialised is slightly after start time, but it wasn't disabled for a few hours. You might be on to something here.

rcocchiararo

newbie

Activity: 78

Merit: 0

./configure said OpenCL and ADL sdks where available. (i have fought long enough with compiling cgminer xD)

Code:

adrian@mine01:~$ export DISPLAY=:0
adrian@mine01:~$ cgminer
[2012-02-09 19:57:58] Started cgminer 2.2.3
No protocol specified
[2012-02-09 19:57:58] Error: Getting Device IDs (num)
[2012-02-09 19:57:58] clDevicesNum returned error, no GPUs usable
All devices disabled, cannot mine!
adrian@mine01:~$

Also

Code:

adrian@mine01:~/cgminer-2.2.3$ export DISPLAY=:0.0
adrian@mine01:~/cgminer-2.2.3$ cgminer
[2012-02-09 20:01:01] Started cgminer 2.2.3
No protocol specified
[2012-02-09 20:01:02] Error: Getting Device IDs (num)
[2012-02-09 20:01:02] clDevicesNum returned error, no GPUs usable
All devices disabled, cannot mine!
adrian@mine01:~/cgminer-2.2.3$ aticonfig --pplib-cmd "set fanspeed 0 65"
No protocol specified
aticonfig: This program must be run as root when no X server is active
adrian@mine01:~/cgminer-2.2.3$ sudo aticonfig --pplib-cmd "set fanspeed 0 65"
[sudo] password for adrian:
No protocol specified
No protocol specified
ati_pplib_cmd: Unable to open display `:0.1'.
aticonfig: parsing the command-line failed.

i have a 5850 and a 5830 on this machine.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: Endeavour79 on February 09, 2012, 05:54:11 PM

Over time, with new CGMINER versions, never kernels and for sure updated drivers/app sdk the performance is lower and lower.

Currently with 12.1 driver and APP SDK 2.5 I only get around 280Mhash with poclbm and worksize 128 (I8) (tried many settings and different kernels and this works out best).

See this is the thing. You're saying it's the newer kernels and the updated drivers and sdk.... but there have been no updated kernels. They are essentially unchanged for 7 months now. So look at the other things you've blamed instead.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: rcocchiararo on February 09, 2012, 04:03:03 PM

but i cant start mining, except with cpu only (on 2.1.2, 2.2.3 has no cpu support i think), it tells me that there is no valid gpu available.

What did ./configure show when you built it?
Also, when running remotely:

Code:

export DISPLAY=:0

then start it.

Endeavour79

full member

Activity: 174

Merit: 100

Hi @ll,

first I want to say CGMINER is a fantastic app and I really appreciate all the work and time in it. I already donated Grin

One thing I notices since version 1.5 (when I started to use it)..

I have a 4xHD5830 Windows x64 rig with clocks at 960@300 and had in the beginning constant hashrates around 315MHash or more.
Over time, with new CGMINER versions, never kernels and for sure updated drivers/app sdk the performance is lower and lower.

Currently with 12.1 driver and APP SDK 2.5 I only get around 280Mhash with poclbm and worksize 128 (I8) (tried many settings and different kernels and this works out best).
I am wondering what may be the issue here? What settings do you currently use for best performance and what driver-app sdk combination.

Any recommendations would be nice. (Remember Windows OS please.)

Thanks!! Cool

rcocchiararo

newbie

Activity: 78

Merit: 0

Quote from: jjiimm_64 on February 09, 2012, 04:11:25 PM

Quote from: rcocchiararo on February 09, 2012, 04:03:03 PM

i had an ubuntu box that was ok running phoenix

but i like cgminer more, so i wanted to compile it (it has ubuntu x86)

I tried, but failed, because ncurses version was not correct.

I then upgraded from 10.04 to 10.11 (all through ssh)

After reinstalling the drivers (ssh -X), i was able to compile cgminer.

but i cant start mining, except with cpu only (on 2.1.2, 2.2.3 has no cpu support i think), it tells me that there is no valid gpu available.

I tried automatically starting cgminer with screen and upstart (like i was told arroung page 178 to 180).

It gives the same error.

i then tried with the same trick i used on debian, and nothing happens.

Im not sure if this PC with ubuntu has any trouble with not having a display connected.

did you accept the 'sdk license?'

when, where and how am i supposed to do that ? don't remember doing it on my debian machine.

This ubuntu pc is at my parents house (i moved last week xD), and has no display attached right now.

Important to mention tho, is that right now, i can't use phoenix either.

And ati commands for OC and such, fail as the "normal user" telling me that i MUST start X if i want to run them without being "su", and if i run them with "sudo", they fail too.

I guess that my only choice is to go back, plug a display, and see what happens Tongue

Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 662. (Read 5806103 times)