
Topic: ATTN Litecoin GPU Miners - Scrypt support for cgminer - page 12. (Read 175855 times)

newbie
Activity: 40
Merit: 0
thanks for the hints ... I will try them ... after having some sleep and a day at work :-)
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
system RAM is 2GB ... no other hungry processes running

I know that Linux drivers from AMD are not very good - with more than 1 card it can be a lottery to get the X server up without freezing the system ...

is there any chance another driver version has a higher memory buffer limit ?
Not sure. I'm on 12.6 which gives me a decent buffer size but then I'm also running 7970s and it does appear to be affecting 5/6x and not 7x on linux. Also sdk2.6+ is mandatory with this. You could try making it ignore the reported limits and forcing higher values manually as well by --thread-concurrency and giving it higher multiples of 1440. Try with -g 1 to begin with so that you take that out of the equation at least. There are still far too many variables to know what's best with this FPOS.
newbie
Activity: 40
Merit: 0
system RAM is 2GB ... no other hungry processes running

I know that Linux drivers from AMD are not very good - with more than 1 card it can be a lottery to get the X server up without freezing the system ...

is there any chance another driver version has a higher memory buffer limit ?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
my configuration:
Linux 3.1.10-1.9 x86_64 GNU/Linux
AMD SDK V2.7
fglrx driver 12.6
cgminer commit bff58c3bed937bd027e46907acd1eab7327e838b compiled from git


cgminer -D -T reports

 [2012-07-25 02:11:38] Preferred vector width reported 4
 [2012-07-25 02:11:38] Max work group size reported 256
 [2012-07-25 02:11:38] Max mem alloc size is 134217728
 [2012-07-25 02:11:38] Selecting scrypt kernel
 [2012-07-25 02:11:38] GPU 0: selecting lookup gap of 2
 [2012-07-25 02:11:38] GPU 0: selecting thread concurrency of 1440
 [2012-07-25 02:11:38] Loaded binary image scrypt120724Cypressglg2tc1440w256l8.bin
 [2012-07-25 02:11:38] Initialising kernel scrypt120724.cl with bitalign, 1 vectors and worksize 256   
 [2012-07-25 02:11:38] Creating scrypt buffer sized 134217728
 [2012-07-25 02:11:38] initCl() finished. Found Cypress


any ideas what is wrong here ?
Unfortunately this is nowhere near as logical as btc mining. Yes for some reason on linux the amd drivers aren't allowing larger buffer sizes. How much system ram do you have? You'll see it's setting concurrency to 1440 whereas something like 7200 is best, but that will ONLY work if the driver allows you to allocate ram. As for the reported hashrate, you are right in that once you go over the optimal values, it either starts doing work that returns less shares or starts creating invalid shares.
newbie
Activity: 40
Merit: 0
After a very good experience mining BTC with cgminer on p2pool for a few months I decided to try the scrypt version on one of my machines.

my experience with a 5850 is worse than what others report:

$ cgminer -o http://my_local_p2pool:9327 -u x -p y --scrypt --shaders 1440 -I 10

is doing only about 83 Kh/s

GPU 0: 83.5 / 83.6 Mh/s | A:129  R:3  HW:0  U:27.80/m  I:10
69.5 C  F: 34% (1819 RPM)  E: 900 MHz  M: 1000 Mhz  V: 1.118V  A: 99% P: 0%
Last initialised: [2012-07-25 02:24:44]
Intensity: 10
Thread 0: 41.6 Mh/s Enabled ALIVE
Thread 1: 41.9 Mh/s Enabled ALIVE


any higher intensity does not really work. p2pool logs many messages like this:

2012-07-25 02:32:18.484973 Worker x submitted share with hash > target:
2012-07-25 02:32:18.485073     Hash:   5a417c48970ce4d87e9b4c082f855a656a90a0e8e0e50a3c62c0dc6273e03b97
2012-07-25 02:32:18.485106     Target: 6b8bd775c948180000000000000000000000000000000000000000000000
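
For reference, a small Python sketch of the check behind that message, assuming the pool reads the share hash and the target as plain big integers written in hex and accepts a share only if hash <= target. The function name and the short sample values are made up for illustration; this is not p2pool's actual code.

```python
def share_meets_target(hash_hex, target_hex):
    """True if the hash, read as a hex integer, does not exceed the target."""
    return int(hash_hex, 16) <= int(target_hex, 16)

if __name__ == "__main__":
    # Short made-up values; real hashes and targets are 256-bit numbers.
    print(share_meets_target("00ff", "0fff"))  # hash below target
    print(share_meets_target("5a41", "0fff"))  # hash above target
```

If the printed target has fewer hex digits than the hash, its leading zeros were simply dropped, and the integer comparison handles that without any padding.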


at intensity 13 the hash rate estimate of p2pool stays way below 100, but cgminer reports about 250Kh/s and no rejected shares !

GPU 0: 242.5 / 93.2 Mh/s | A:270  R:3  HW:0  U:34.04/m  I:13
70.5 C  F: 60% (3633 RPM)  E: 900 MHz  M: 1000 Mhz  V: 1.118V  A: 99% P: 0%
Last initialised: [2012-07-25 02:24:44]
Intensity: 13
Thread 0: 123.9 Mh/s Enabled ALIVE
Thread 1: 126.5 Mh/s Enabled ALIVE


my configuration:
Linux 3.1.10-1.9 x86_64 GNU/Linux
AMD SDK V2.7
fglrx driver 12.6
cgminer commit bff58c3bed937bd027e46907acd1eab7327e838b compiled from git


cgminer -D -T reports

 [2012-07-25 02:11:38] Preferred vector width reported 4
 [2012-07-25 02:11:38] Max work group size reported 256
 [2012-07-25 02:11:38] Max mem alloc size is 134217728
 [2012-07-25 02:11:38] Selecting scrypt kernel
 [2012-07-25 02:11:38] GPU 0: selecting lookup gap of 2
 [2012-07-25 02:11:38] GPU 0: selecting thread concurrency of 1440
 [2012-07-25 02:11:38] Loaded binary image scrypt120724Cypressglg2tc1440w256l8.bin
 [2012-07-25 02:11:38] Initialising kernel scrypt120724.cl with bitalign, 1 vectors and worksize 256   
 [2012-07-25 02:11:38] Creating scrypt buffer sized 134217728
 [2012-07-25 02:11:38] initCl() finished. Found Cypress


any ideas what is wrong here ?
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
This is horrible for me. 2 5850's.

Code:
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  43.0C 3581RPM |  13.4/ 15.1Kh/s | A:1 R:0 HW:0 U:1.73/m I:13
 GPU 1:  46.0C 3410RPM |  50.4/ 53.6Kh/s | A:1 R:0 HW:0 U:1.73/m I:13

cgminer --scrypt -o http://site:port -u username -p password --shaders 1440 --intensity 13

vs

809 khash/s


cgminer --scrypt -o http://site:port -u username -p password --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19
What debugging output do you get when you run it with -D -T and --shaders 1440, then stop it before it starts mining? Look for messages about buffer size, lookup gap and thread concurrency.

I don't know what debugging is. I'll just stick to

ckolivas-cgminer-3a0d60c
&
cgminer --scrypt -o http:// -u  -p  --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19

Seems to work the best.
Guys come on... how exactly do you expect me to move forward without your help? I already said start it with "-D -T" added to your command line.
hero member
Activity: 770
Merit: 502
This is horrible for me. 2 5850's.

Code:
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  43.0C 3581RPM |  13.4/ 15.1Kh/s | A:1 R:0 HW:0 U:1.73/m I:13
 GPU 1:  46.0C 3410RPM |  50.4/ 53.6Kh/s | A:1 R:0 HW:0 U:1.73/m I:13

cgminer --scrypt -o http://site:port -u username -p password --shaders 1440 --intensity 13

vs

809 khash/s


cgminer --scrypt -o http://site:port -u username -p password --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19
What debugging output do you get when you run it with -D -T and --shaders 1440, then stop it before it starts mining? Look for messages about buffer size, lookup gap and thread concurrency.

I don't know what debugging is. I'll just stick to

ckolivas-cgminer-3a0d60c
&
cgminer --scrypt -o http:// -u  -p  --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19

Seems to work the best.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
This is horrible for me. 2 5850's.

Code:
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  43.0C 3581RPM |  13.4/ 15.1Kh/s | A:1 R:0 HW:0 U:1.73/m I:13
 GPU 1:  46.0C 3410RPM |  50.4/ 53.6Kh/s | A:1 R:0 HW:0 U:1.73/m I:13

cgminer --scrypt -o http://site:port -u username -p password --shaders 1440 --intensity 13

vs

809 khash/s


cgminer --scrypt -o http://site:port -u username -p password --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19
What debugging output do you get when you run it with -D -T and --shaders 1440, then stop it before it starts mining? Look for messages about buffer size, lookup gap and thread concurrency.
hero member
Activity: 770
Merit: 502
Added the --shaders option now
- Make the thread concurrency and lookup gap options hidden on the command line and autotune parameters with a newly parsed --shaders option.

So you should only need to try --shaders and -I now. Note that any intensity above 13 is a gamble and highly dependent on hardware/software combination as to whether it's better, so -I 13 is a good start.

Here's the table again:
Code:
GPU  Processing Elements
7750 512
7770 640
7850 1024
7870 1280
7950 1792
7970 2048

6850 960
6870 1120
6950 1408
6970 1536
6990 (6970x2)

6570 480
6670 480
6790 800

6450 160

5670 400
5750 720
5770 800
5830 1120
5850 1440
5870 1600
5970 (5870x2)

And for those who need me to spell it out, let's say you have a 5830, try this first:
--shaders 1120 -I 13
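
A minimal Python sketch of how the table turns into a starting command line. The dictionary copies the single-GPU rows from the table above; the helper name `starting_args` is made up for illustration, and pool URL and credentials are left out.

```python
# Shader counts copied from the table above (single-GPU cards only;
# the dual-GPU 5970 and 6990 would use the per-GPU value of a 5870 / 6970).
SHADERS = {
    "7750": 512, "7770": 640, "7850": 1024, "7870": 1280,
    "7950": 1792, "7970": 2048,
    "6850": 960, "6870": 1120, "6950": 1408, "6970": 1536,
    "6570": 480, "6670": 480, "6790": 800, "6450": 160,
    "5670": 400, "5750": 720, "5770": 800, "5830": 1120,
    "5850": 1440, "5870": 1600,
}

def starting_args(model, intensity=13):
    """Return suggested starting flags for a card model (hypothetical helper)."""
    if model not in SHADERS:
        raise KeyError(f"no shader count listed for {model}")
    return ["--scrypt", "--shaders", str(SHADERS[model]), "-I", str(intensity)]

if __name__ == "__main__":
    # Builds the flags for a 5830, the card used as the example above.
    print("cgminer", *starting_args("5830"))
```

Intensity defaults to 13 here only because that is the suggested starting point; anything higher still has to be tested per card.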


This is horrible for me. 2 5850's.

Code:
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  43.0C 3581RPM |  13.4/ 15.1Kh/s | A:1 R:0 HW:0 U:1.73/m I:13
 GPU 1:  46.0C 3410RPM |  50.4/ 53.6Kh/s | A:1 R:0 HW:0 U:1.73/m I:13

cgminer --scrypt -o http://site:port -u username -p password --shaders 1440 --intensity 13

vs

809 khash/s


cgminer --scrypt -o http://site:port -u username -p password --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19

-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Under Ubuntu 12.04 with 12.6 drivers and sdk, with the latest build of cgminer, the highest you can set the thread concurrency is 2048. I'm trying to figure out why. I guess that is the limit in cgminer and not reaper, if the kernels are similar?

four 5850s

"intensity" : "11,11,11,11",
"vectors" : "1,1,1,1",
"worksize" : "128,128,128,128",
"kernel" : "scrypt,scrypt,scrypt,scrypt",
"gpu-engine" : "0-750,0-700,0-700,0-700",
"gpu-memclock" : "1000,1000,1000,1000",
"gpu-vddc" : "1.000,0.950,0.950,0.950",
"gpu-threads" : "2",
"scrypt" : true,

this works perfectly at -I 11  ~675 kh
at -I 12 start errors miss-target

I get the same exact results under reaper

worksize 128
aggression 11
threads_per_gpu 2
sharethreads 18
lookup_gap 2
gpu_thread_concurrency 2048

aggression 11 works great no errors  ~ 675 kh
aggression 12 throws errors


Ok the best results for reaper for me seems to be

worksize 128
aggression 12
threads_per_gpu 2
sharethreads 18
lookup_gap 2
gpu_thread_concurrency 5760

this works great no errors ~1030kh pool reports roughly same

I don't have any 5850s running under windows to see if the results are the same. The same settings that work on both give nearly identical khs in both programs, but the difference is the thread concurrency maximum of 2048 in cgminer vs. 5760 in reaper? Does this even make sense lol?

I have a couple 5870s I need to test this out on, but I'm thinking 4 x shader count is the optimum?

I just noticed that reaper builds the buffer at 360mb @ 5760tc   4 x 360 = 1440

cgminer's buffer build, thread concurrency, lookup gap and intensity/aggression work identically to reaper so you will get identical buffer results, base performance etc with the same settings. The only thing I do differently is I check for errors when submitting the requests to opencl. You CAN override the upper limit for thread concurrency by just putting it in manually. However what likely happens is the error from the kernel running happens randomly. cgminer does not ignore the errors, whereas reaper doesn't even check for them, it just keeps sending the kernel over and over ignoring whether it's working or not. Why linux has lower reported memory limits than windows on this occasion I don't know but I'm guessing it's just the driver differences. I could go in and make it ignore the results of failed kernel queueing, but that's so counter to good programming it's ridiculous. However it seems I have no choice...
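
For anyone trying to make sense of the numbers in this exchange, here is a rough back-of-envelope sketch in Python. It assumes Litecoin's scrypt parameters (N=1024, r=1), so each hash needs 1024 scratchpad entries of 128 bytes, and that a lookup gap of g stores only every g-th entry. The function names are made up for illustration and this is not cgminer's or reaper's actual code.

```python
# Rough model of scrypt scratchpad memory on the GPU.
# Assumptions (not taken from any miner's source): N=1024, r=1, so one
# scratchpad entry is 128 bytes, and lookup gap g keeps every g-th entry.

SCRYPT_N = 1024
ENTRY_BYTES = 128

def bytes_per_thread(lookup_gap):
    # ceil(N / gap) entries are stored per concurrent hash
    entries = -(-SCRYPT_N // lookup_gap)
    return entries * ENTRY_BYTES

def buffer_bytes(thread_concurrency, lookup_gap=2):
    # total scratchpad buffer for the given thread concurrency
    return thread_concurrency * bytes_per_thread(lookup_gap)

def max_thread_concurrency(max_alloc_bytes, lookup_gap=2):
    # largest thread concurrency whose buffer fits in one allocation
    return max_alloc_bytes // bytes_per_thread(lookup_gap)

if __name__ == "__main__":
    for tc in (1440, 2048, 5760, 7200):
        print(tc, buffer_bytes(tc) // (1024 * 1024), "MiB")
    print("limit for 134217728 bytes:", max_thread_concurrency(134217728))
```

Under these assumptions a thread concurrency of 5760 with lookup gap 2 comes out at 360 MiB, which matches the buffer size reported for reaper above, and a max alloc size of 134217728 bytes works out to a ceiling of 2048, the same limit seen on Linux.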
sr. member
Activity: 322
Merit: 250
Litecoin is a scam - there is no such thing as a "CPU coin" and never can be. 

Could you go into more details on this one ?
It is impossible to design a proof-of-work that cannot be specialized (e.g., ASICs). Designing them to be resistant to existing technologies increases the security risk such specialization presents. Simple proofs like SHA256 (in Bitcoin) have proven to work well, since it scales up in tiers (CPU -> GPU -> FPGA -> ASIC).

Some of you Bitcoin purists really are scared of LTC aren't you?  Start your own thread if you'd like to discuss this.  I'm sure some would be more than happy to talk.  Otherwise STFU.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
This thread is not the forum for that discussion.
legendary
Activity: 2576
Merit: 1186
Litecoin is a scam - there is no such thing as a "CPU coin" and never can be. 

Could you go into more details on this one ?
It is impossible to design a proof-of-work that cannot be specialized (e.g., ASICs). Designing them to be resistant to existing technologies increases the security risk such specialization presents. Simple proofs like SHA256 (in Bitcoin) have proven to work well, since it scales up in tiers (CPU -> GPU -> FPGA -> ASIC).
member
Activity: 112
Merit: 10
Litecoin is a scam - there is no such thing as a "CPU coin" and never can be. 

Could you go into more details on this one ?
sr. member
Activity: 322
Merit: 250
Okay, maybe I am being painfully slow today but...

1) Litecoin is intended to be a CPU coin, right ?

2) So, from this thread it appears that GPU mining it in a manner that is more economically effective than CPU mining is possible, right ?

But, if both 1 and 2 are true, doesn't that mean that litecoin is kinda...pointless?
Litecoin is a scam - there is no such thing as a "CPU coin" and never can be. I think Con is only doing this because he's being paid - in Bitcoins Wink

https://en.bitcoin.it/wiki/Litecoin#Criticism

Well considering you wrote that.  It's totally believable! lol.
legendary
Activity: 2576
Merit: 1186
Okay, maybe I am being painfully slow today but...

1) Litecoin is intended to be a CPU coin, right ?

2) So, from this thread it appears that GPU mining it in a manner that is more economically effective than CPU mining is possible, right ?

But, if both 1 and 2 are true, doesn't that mean that litecoin is kinda...pointless?
Litecoin is a scam - there is no such thing as a "CPU coin" and never can be. I think Con is only doing this because he's being paid - in Bitcoins Wink

https://en.bitcoin.it/wiki/Litecoin#Criticism
member
Activity: 112
Merit: 10
Okay, maybe I am being painfully slow today but...

1) Litecoin is intended to be a CPU coin, right ?

2) So, from this thread it appears that GPU mining it in a manner that is more economically effective than CPU mining is possible, right ?

But, if both 1 and 2 are true, doesn't that mean that litecoin is kinda...pointless?
hero member
Activity: 770
Merit: 502
Added the --shaders option now
- Make the thread concurrency and lookup gap options hidden on the command line and autotune parameters with a newly parsed --shaders option.

So you should only need to try --shaders and -I now. Note that any intensity above 13 is a gamble and highly dependent on hardware/software combination as to whether it's better, so -I 13 is a good start.

Here's the table again:
Code:
GPU  Processing Elements
7750 512
7770 640
7850 1024
7870 1280
7950 1792
7970 2048

6850 960
6870 1120
6950 1408
6970 1536
6990 (6970x2)

6570 480
6670 480
6790 800

6450 160

5670 400
5750 720
5770 800
5830 1120
5850 1440
5870 1600
5970 (5870x2)

And for those who need me to spell it out, let's say you have a 5830, try this first:
--shaders 1120 -I 13


This is horrible for me. 2 5850's.

Code:
[P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  43.0C 3581RPM |  13.4/ 15.1Kh/s | A:1 R:0 HW:0 U:1.73/m I:13
 GPU 1:  46.0C 3410RPM |  50.4/ 53.6Kh/s | A:1 R:0 HW:0 U:1.73/m I:13

cgminer --scrypt -o http://site:port -u username -p password --shaders 1440 --intensity 13

vs

809 khash/s


cgminer --scrypt -o http://site:port -u username -p password --worksize 256 --lookup-gap 2 --thread-concurrency 7200 -g 1 --intensity 19
sr. member
Activity: 277
Merit: 250
Tesla M2050    79.8          1550    448    DiabloMiner
Tesla M2050    94.5          1550       poclbm

https://en.bitcoin.it/wiki/Mining_hardware_comparison#Nvidia

¤¿¤
member
Activity: 98
Merit: 10
Does CGminer-Scrypt support 1400 x Tesla M2050, the card BX used to launch the attack against the litecoin network?
How many kH/s can you get with that NVIDIA card?
Is it more than 90 kH/s?