
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 779. (Read 5805728 times)

newbie
Activity: 73
Merit: 0
Well... Now I see this, though cgminer 2.0 with your modified exe functioned just fine. I have Windows 7 x64 and 2x 6990s, if it matters.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
sr. member
Activity: 462
Merit: 250
I heart thebaron
Changing S will not significantly change much of anything at all I'm afraid. Your GPU will probably finish a work item in significantly less than 1 minute and then get more work. The 1 minute cutoff is used to decide the latest possible time that a solution would be allowed to be submitted. So if you find a share in say 30 seconds, you have up to 30 more seconds to submit it (bad network conditions will limit that). cgminer only gets work as often as it needs to keep your GPU busy, and GPUs should never run out and hit the 1 minute cutoff unless you significantly increase the number of threads per GPU. The other thing is cgminer can use existing work to generate more work (rolltime it's called) to keep getting useful shares out of the same work for up to 1 minute. If you significantly decrease -s to less than how long it takes for your GPU to find a share, you are more likely to throw work away unnecessarily. All in all, don't bother changing it, but it would be ideal to get at least 1/m Utility (i.e. 1 accepted share per minute) per thread, which would be 2 per device.
So -s can't be used as an easy-share cherry picker? A low -s, perhaps -s 15, so that your GPU gets 15 seconds to solve, otherwise it moves on. This combined with a large queue should theoretically help with a higher output, no? A sort of 'PPS cheat'....LOL, just pick off the easy ones and dump the rest....lol
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Changing S will not significantly change much of anything at all I'm afraid.
--snip--
so if I get about 1/m per thread utility, wouldn't increasing that value to, say, 90 seconds be beneficial? Because when a share is taking slightly longer, I don't want to have to get new work as often
Pools don't accept shares older than a certain age. Depending on the pool, it's somewhere between 60 and 120 seconds, so I have to default to the safe lower level.
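The cutoff behaviour described in the two posts above can be sketched roughly as follows; `work_t`, `opt_scantime` and `share_still_fresh` are illustrative names for this sketch, not cgminer's actual internals:

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical work item: only the fetch timestamp matters here. */
typedef struct {
    time_t fetched;   /* when the work was fetched from the pool */
} work_t;

/* Default -s value: pools reject shares older than roughly this. */
static int opt_scantime = 60;

/* A share found against this work may still be submitted only while
 * the work is younger than the cutoff.  Finding a share at t=30s thus
 * leaves up to 30 more seconds to submit it before the pool would
 * consider it stale. */
static bool share_still_fresh(const work_t *work, time_t now)
{
    return (now - work->fetched) < opt_scantime;
}
```

This is why lowering -s below your per-share solve time only throws away still-valid work rather than "cherry picking" anything.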
hero member
Activity: 658
Merit: 500
Changing S will not significantly change much of anything at all I'm afraid.
--snip--
so if I get about 1/m per thread utility, wouldn't increasing that value to, say, 90 seconds be beneficial? Because when a share is taking slightly longer, I don't want to have to get new work as often
sr. member
Activity: 467
Merit: 250
Well I honestly am completely baffled since the build has -ldl added... Unless the LDFLAGS are not being passed at all. God I hate autofoo tools. Anyone with a clue out there?

PM me, happy to give you temporary access to a box to take a peek if you want.

I realize I haven't said this enough, but I _LOVE_ cgminer....


-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Changing S will not significantly change much of anything at all I'm afraid.
--snip--
sr. member
Activity: 462
Merit: 250
I heart thebaron
you're supposed to use 2-way vectors and a 256 worksize when you underclock the memory to around 1/3 of the core clock, or sometimes a little higher or lower

so I use -w 256 -v 2 -I 8 --gpu-engine 755 --gpu-memclock 250 on my 5750
Thanks for the tips so far. They have all really helped. I will give this a shot as well. I completely forgot about toying with WORK size as everything has just been so impressive thus far.

Onto the next step in optimizing....

Could someone please help by explaining the '-s' flag?
Is this simply a work timeout? Like with the default being '60', if a worker thread is unable to calculate/submit a hash within 60 seconds, does it simply ditch the work and move on to the next work unit, rather than being stuck any longer and wasting time?

As a PPS Miner, I would really like to get the highest share/hour count that I can....which also brings me to effective worker thread counts. CGMiner is a great monitoring tool to help with this in small increments during optimizations that one might try.

On average, what would be a sufficient Mh/s per thread? What would be counterproductive?
With Default being 2, my 5770's @ 220 Mh/s seem to work quite well. With this in mind, should I change to 3 threads for my 6870's (300 Mh/s) and perhaps 4 threads for my 6950's (just under 400 Mh/s) ?

Would using a combination of a lower work unit timeout (-s flag) and an increased thread count (1 thread per 100 Mh/s) be an effective optimization strategy?

Hopefully I understood the concept of the -s flag, otherwise, this post was mostly a waste and it's back to the drawing board....LOL
Allan
member
Activity: 90
Merit: 12
Same problem my friend.. unless I add ldl I get:

Quote
/usr/bin/ld: cgminer-adl.o: undefined reference to symbol 'dlopen@@GLIBC_2.1'
/usr/bin/ld: note: 'dlopen@@GLIBC_2.1' is defined in DSO /lib/libdl.so.2 so try adding it to the linker command line
/lib/libdl.so.2: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
make[2]: *** [cgminer] Error 1
make[2]: Leaving directory `/RAM/new/cgminer'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/RAM/new/cgminer'
make: *** [all] Error 2


If you're pulling from git, make sure you re-run autogen.sh or you won't see his change for this.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Doh, -lpthread, not -pthread is also an issue it seems?

Anyway, try pulling the git tree again please.

Same problem my friend.. unless I add ldl I get:

Quote
/usr/bin/ld: cgminer-adl.o: undefined reference to symbol 'dlopen@@GLIBC_2.1'
--snip--

Well I honestly am completely baffled since the build has -ldl added... Unless the LDFLAGS are not being passed at all. God I hate autofoo tools. Anyone with a clue out there?
sr. member
Activity: 467
Merit: 250
Doh, -lpthread, not -pthread is also an issue it seems?

Anyway, try pulling the git tree again please.

Same problem my friend.. unless I add ldl I get:

Quote
/usr/bin/ld: cgminer-adl.o: undefined reference to symbol 'dlopen@@GLIBC_2.1'
--snip--
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
I don't understand how this is possible.

CGMiner is the FASTEST miner I have ever used and accomplishes equally impressive results on 3 different generations/sub-generations of cards that I mine with.

--snip--

To say I am happy would be an understatement. I will definitely support and donate to this project.
Thanks for the excellent feedback
hero member
Activity: 658
Merit: 500
you're supposed to use 2-way vectors and a 256 worksize when you underclock the memory to around 1/3 of the core clock, or sometimes a little higher or lower

so I use -w 256 -v 2 -I 8 --gpu-engine 755 --gpu-memclock 250 on my 5750
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
New version 2.0.1 - links in top post.

Executive summary of new features:
Fanspeeds, gpu engine speeds now accept ranges, eg:
--auto-gpu --gpu-engine 750-950,945,700-930,960

Temperature targets are per-device, eg:
--temp-cutoff 95,105

Disable adl option
--no-adl

Should detect more cards that it can monitor/clock now, even if only with partial support, including 2nd cores in dual core cards.

Temps and fanspeeds in status line
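The per-device range syntax above could be parsed along these lines. This is a sketch only; `range_t` and `parse_ranges` are hypothetical names, not cgminer's code:

```c
#include <stdio.h>
#include <string.h>

/* One entry per device: either a single value ("945") or a
 * "min-max" range ("750-950"). */
typedef struct {
    int min;
    int max;   /* equal to min when no range was given */
} range_t;

/* Parse up to max_devs comma-separated entries from s into out;
 * returns the number of entries successfully parsed. */
static int parse_ranges(const char *s, range_t *out, int max_devs)
{
    char buf[256];
    int n = 0;

    strncpy(buf, s, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (char *tok = strtok(buf, ","); tok && n < max_devs;
         tok = strtok(NULL, ",")) {
        int lo, hi;
        if (sscanf(tok, "%d-%d", &lo, &hi) == 2) {
            out[n].min = lo;
            out[n].max = hi;
        } else if (sscanf(tok, "%d", &lo) == 1) {
            out[n].min = out[n].max = lo;   /* fixed value, no range */
        } else {
            break;   /* malformed entry: stop parsing */
        }
        n++;
    }
    return n;
}
```

So "--gpu-engine 750-950,945,700-930,960" would yield a min/max pair per card, with the single values collapsing to min == max.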


Full changelog:
- Hopefully fix building on 32bit glibc with dlopen with -lpthread and -ldl
- ByteReverse is not used and the bswap opcode breaks big endian builds. Remove
it.
- Ignore whether the display is active or not since only display enabled devices
work this way, and we skip over repeat entries anyway.
- Only reset values on exiting if we've ever modified them.
- Flag adl as active if any card is successfully activated.
- Add a thermal cutoff option as well and set it to 95 degrees by default.
- Change the fan speed by only 5% if it's over the target temperature but less
than the hysteresis value to minimise overshoot down in temperature.
- Add a --no-adl option to disable ADL monitoring and GPU settings.
- Only show longpoll received delayed message at verbose level.
- Allow temperatures greater than 100 degrees.
- We should be passing a float for the remainder of the vddc values.
- Implement accepting a range of engine speeds as well to allow a lower limit to
be specified on the command line.
- Allow per-device fan ranges to be set and use them in auto-fan mode.
- Display which GPU has overheated in warning message.
- Allow temperature targets to be set on a per-card basis on the command line.
- Display fan range in autofan status.
- Setting the hysteresis is unlikely to be useful on the fly and doesn't belong
in the per-gpu submenu.
- With many cards, the GPU summaries can be quite long so use a terse output
line when showing them all.
- Use a terser device status line to show fan RPM as well when available.
- Define max gpudevices in one macro.
- Allow adapterid 0 cards to enumerate as a device as they will be non-AMD
cards, and enable ADL on any AMD card.
- Do away with the increasingly confusing and irrelevant total queued and
efficiency measures per device.
- Only display values in the log if they're supported and standardise device log
line printing.
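The fan-speed hysteresis entry in the changelog above roughly corresponds to logic like this. A sketch only; the 10%/5% step sizes and the function name are assumptions for illustration, not cgminer's exact values:

```c
/* Hypothetical auto-fan step: nudge the fan percentage toward keeping
 * temp near target.  When the temperature is over target but within
 * the hysteresis band, step by only 5% to minimise overshoot down in
 * temperature; ramp harder only once it exceeds the band. */
static int adjust_fan(int fan_percent, double temp,
                      double target, double hysteresis)
{
    if (temp > target + hysteresis)
        fan_percent += 10;   /* well over target: ramp hard */
    else if (temp > target)
        fan_percent += 5;    /* inside hysteresis band: gentle step */
    else if (temp < target - hysteresis)
        fan_percent -= 5;    /* comfortably cool: slow the fan down */

    /* Clamp to a valid fan duty-cycle percentage. */
    if (fan_percent > 100) fan_percent = 100;
    if (fan_percent < 0)   fan_percent = 0;
    return fan_percent;
}
```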
sr. member
Activity: 462
Merit: 250
I heart thebaron
try underclocking your mem clock to around 300 MHz or so, or about 1/3 of your core clock
I just finished underclocking the memory on all my DEDICATED RIGS that use 5770's, 6870's and 6950's......AND THE RESULT DOESN'T MAKE SENSE (based on my past testing anyways....).

Another small performance increase on all platforms, TEMP seems steady with no real change to mention.

This has NEVER worked for me before and has nearly always resulted in excessive stales, poor performance and instability, especially with the 6xxx series cards.

That also reminds me, forgot to add another thing to my list above:

What I have experienced:.....continued.
- Constant 6-9% (5-7% w/1:1 mem above, 6-9% w/300MHz Mem) performance increase on 5770, 6870 & 6950 Cards using unoptimized/default settings.
- Dead-Nutz Clock, Temp and Fan control using the 3 cards above.
- the cleanest/easiest installation yet.
- UNDER 1.5% Stale shares. Most times, single digit stales on triple+ digit accepted.
- Definitely Love.....LOL

I am not sure if this makes a difference, but all my Mining Rigs run Windows 7 x64 with at least 4GB of RAM (some have more, because I have it to spare). Also, all my rigs are Intel LGA1366-based i7 desktop CPUs or LGA1366 Quad-Core Xeons (because I had them, no other reason). For my dedicated Miners, my excessively-SPEC'd CPU/Motherboard setups are in NO WAY a reflection of my desire to increase performance...they were just put to use because they were already here and I didn't have to purchase anything else, for the most part.


Can anyone suggest any other CMD LINE optimizations that I might use in my Start-up BAT Files to further increase performance?

Cheers,
Allan
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
[...]
      Win32 does not use dlopen so link in -ldl only when not on win32 and display what ldflags are being passed on ./configure.
[...]

FreeBSD doesn't have a separate -ldl either, since dlopen() and dlclose() are part of the FreeBSD libc library. The same might be true for MacOSX and any other BSD OS.
Thanks, but there's no ADL support on any BSD OS so you won't be using dlopen anyway.
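For what it's worth, one portable way to handle this kind of thing in autoconf is AC_SEARCH_LIBS, which appends -ldl only on systems where dlopen is not already in libc (so glibc gets -ldl while FreeBSD and Mac OS X do not). This is a sketch, not necessarily what cgminer's configure.ac actually does:

```m4
# Sketch of a configure.ac fragment.  AC_SEARCH_LIBS first tries to
# link dlopen with no extra library (FreeBSD, Mac OS X, where it lives
# in libc), then retries with -ldl (glibc), appending -ldl to LIBS
# only when that second attempt succeeds.
AC_SEARCH_LIBS([dlopen], [dl], [],
    [AC_MSG_ERROR([could not find dlopen])])
```

Putting the flag in LIBS this way also sidesteps the problem of LDFLAGS not being passed through to the final link.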
hero member
Activity: 658
Merit: 500
try underclocking your mem clock to around 300 MHz or so, or about 1/3 of your core clock
sr. member
Activity: 462
Merit: 250
I heart thebaron
I don't understand how this is possible.

CGMiner is the FASTEST miner I have ever used and accomplishes equally impressive results on 3 different generations/sub-generations of cards that I mine with.

There have been other miners that are better suited for 69xx or 68xx cards or even 57xx or 58xx, but nothing I have come across that performs well on all platforms.

To the lay person (as well as myself at first), it almost felt that by trying this package I would be going BACKWARDS, but that couldn't be further from the truth, as I replaced MSI Afterburner & Guiminer with simply CGMiner as a do-all program. Comparing dedicated identical machines side-by-side told me everything I needed to know.....

What I have experienced:
- Constant 5-7% performance increase on 5770, 6870 & 6950 Cards with NO OPTIMIZATION (all auto settings).
- Dead-Nutz Clock, Temp and Fan control using the 3 cards above.
- the cleanest/easiest installation yet.
- Love ? .....LOL

For hopeful longevity, I DO NOT INCREASE VOLTAGE on any of my Rigs. Here are the basic settings and contents of the BAT files I use:

5770 Workstations (Mild Static OC using ATI CPL, Dynamic -I)
------------------------------------------------------------
cgminer -o http://pool:port -u username -p password

4x 5770 Dedicated (Simple setup - No problems)
----------------------------------------------
cgminer -o http://pool:port -u username -p password -I 8 -Q 10 --auto-fan --auto-gpu --gpu-engine 950 --gpu-memclock 950 --temp-target 70

4x 6870 Dedicated (Even using -Q 10, still reports Server can not supply work fast enough)
------------------------------------------------------------------------------------------
cgminer -o http://pool:port -u username -p password -I 8 -Q 10 --auto-fan --auto-gpu --gpu-engine 935 --gpu-memclock 935 --temp-target 70

2x 6950 Dedicated (-I 9 too aggressive, driver errors, -I 8 stable)
------------------------------------------------------------------
cgminer -o http://pool:port -u username -p password -I 8 -Q 10 --gpu-powertune 20 --auto-fan --auto-gpu --gpu-engine 900 --gpu-memclock 900 --temp-target 70


To say I am happy would be an understatement. I will definitely support and donate to this project.
Allan
member
Activity: 90
Merit: 12
Aha! Well that's why it thinks the device doesn't support ADL, because it doesn't support the full feature set. I'm pretty sure I can fix that. Is there only one fan setting for both GPUs then?

Right, there's just one fan that both GPUs share. Sorry, I didn't even think the fan was possibly causing this.
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Aha! Well that's why it thinks the device doesn't support ADL, because it doesn't support the full feature set. I'm pretty sure I can fix that. Is there only one fan setting for both GPUs then?