Author

Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 573. (Read 5805728 times)

legendary
Activity: 1540
Merit: 1001

It's a 480 GB SSD; I don't think SpinRite will help.

The box is a fairly recent reload, and I'm not keen on doing it again.  I was hoping someone might have a magic bullet.

Actually, I read recently that SpinRite Level 1 is read-only, and can work wonders on SSDs.

I'm not even sure how that would make sense?

You can read from SSDs almost indefinitely without causing wear.  It's the writing that causes problems.

So by doing a Level 1 read across the drive, you're forcing the wear-levelling logic to notice that failing sectors are having problems, and it then swaps them out for spares.

Steve's words:

Quote
And all of our listeners just got a new tip for running SpinRite, if you have a drive which, like this - the problem is that all of the other levels are writing something. Level 1 is a read-only pass. And that's why it's safe to run on thumb drives, because it doesn't write anything, absolutely nothing. It only reads.

But the beauty of that is that, as we were saying before, the act of reading shows the drive it has a problem. And clearly this, whatever was going wacky with this and a couple other drives that Mike found, writing gave the drive fits, but reading was okay. So reading was sort of eased into it more gently and allowed the drive to fix the problems so that then writing to them was writing to different areas because the bad spots had been relocated to good areas on the drive. So that's a great tip. It'll definitely make it into our notes for the future.

from: https://www.grc.com/sn/sn-343.pdf
legendary
Activity: 1260
Merit: 1000

It's a 480 GB SSD; I don't think SpinRite will help.

The box is a fairly recent reload, and I'm not keen on doing it again.  I was hoping someone might have a magic bullet.

Actually, I read recently that SpinRite Level 1 is read-only, and can work wonders on SSDs.

I'm not even sure how that would make sense?
legendary
Activity: 1540
Merit: 1001

It's a 480 GB SSD; I don't think SpinRite will help.

The box is a fairly recent reload, and I'm not keen on doing it again.  I was hoping someone might have a magic bullet.

Actually, I read recently that SpinRite Level 1 is read-only, and can work wonders on SSDs.
legendary
Activity: 1540
Merit: 1001
Question: Is there a way to mine on multiple pools at the same time with CGMiner? In other words, if I have 2 BFL miners and want to point one to one pool and the other to a different pool, how would I go about setting that up?
The --load-balance flag will basically do that for you.
How accurately would it load balance between the pools?

I've heard (haven't tried it myself) that it doesn't quite work as expected for balancing.  I use it for failover only.
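For reference, the multi-pool setup being discussed can be written as a cgminer config along these lines (pool URLs and workers are placeholders; with "load-balance" set, cgminer rotates work across all live pools instead of strictly failing over):

```json
{
  "pools" : [
    { "url" : "http://pool-a.example.com:8332", "user" : "workerA", "pass" : "x" },
    { "url" : "http://pool-b.example.com:8332", "user" : "workerB", "pass" : "x" }
  ],
  "load-balance" : true
}
```

Pointing each BFL at its own specific pool, as originally asked, would instead mean running two cgminer instances, each given one pool and one device.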
-ck
legendary
Activity: 4088
Merit: 1631
Ruu \o/
Please take a good look through the readme, and read carefully the extensive documentation on the advanced option --gpu-map.
legendary
Activity: 1274
Merit: 1004
Check the debug log file.  Sometimes ADL does return -1 from the fan speed/temp APIs.  I've had people report it with 6000-series cards (not in cgminer, but in my akbash watchdog).  BTW, -1 means "Most likely one or more of the Escape calls to the driver failed".

Not sure why; maybe it is "not always supported" (as their ADL docs say)?  I raised this issue with AMD support and am waiting for their response.

Not sure how re-ordering would help; the ADL APIs use the adapter index, not the OpenCL GPU number.



The lack of an RPM reading for the one 5970 has to do with how it died. The GPU that failed was the one closest to the output, so this card isn't capable of outputting an image or controlling/reporting the fan speed. The fan is just always pegged at 100%.
legendary
Activity: 1274
Merit: 1004
I tried --gpu-reorder, and it did organize them in a more logical layout but it doesn't fix the problem.

Code:
 [P]ool management [G]PU management [S]ettings [D]isplay options [Q]uit
 GPU 0:  64.0C 3192RPM | 273.0/274.5Mh/s | A:15 R:0 HW:0 U: 4.82/m I: 8
 GPU 1:  59.0C 4408RPM | 331.0/330.2Mh/s | A:25 R:0 HW:0 U: 8.04/m I: 8
 GPU 2:  60.5C 4403RPM |  88.6/177.9Mh/s | A: 5 R:0 HW:0 U: 1.61/m I: 8
 GPU 3:  36.0C         | 331.3/330.2Mh/s | A:14 R:0 HW:0 U: 4.50/m I: 8

Here, I changed the core clock of the single-GPU 5970 to 200MHz and the voltage to 0.950V. The hashrate dropped on GPU 2, but the corresponding drop in temperature happened on GPU 3.
hero member
Activity: 591
Merit: 500
No, I haven't. What does the flag do?
--gpu-reorder       Attempt to reorder GPU devices according to PCI Bus ID
Although in my experience, it was only necessary when I used Windows. So far it's been accurate without that flag on Ubuntu.
legendary
Activity: 1274
Merit: 1004
Has anyone else noticed that the order of temperatures and fan speeds is sometimes incorrect? For instance, this is a rig with 3 cards: a 6870, a 5970, and a 5970 with one bad GPU.

...

It's not just this configuration of cards either, I've noticed this before with 4 GPUs in the system but the cards in different orders.
Have you tried using the --gpu-reorder flag?
No, I haven't. What does the flag do?
hero member
Activity: 591
Merit: 500
Has anyone else noticed that the order of temperatures and fan speeds is sometimes incorrect? For instance, this is a rig with 3 cards: a 6870, a 5970, and a 5970 with one bad GPU.

...

It's not just this configuration of cards either, I've noticed this before with 4 GPUs in the system but the cards in different orders.
Have you tried using the --gpu-reorder flag?
legendary
Activity: 1274
Merit: 1004
Has anyone else noticed that the order of temperatures and fan speeds is sometimes incorrect? For instance, this is a rig with 3 cards: a 6870, a 5970, and a 5970 with one bad GPU.

GPU 0 is the 6870; the one with the exceedingly low temperature and no fan speed is the one with the bad GPU.
The GPU order corresponds with what I see in clocktweak.
Code:
Reading data:
Adapter#:0 Temp:75 Load:99 Fan:67 Level:2   CoreL0:250 CoreL1:399 CoreL2:900 MemL0:198 MemL1:799 MemL2:800 mVoltL0:950 mVoltL1:999 mVoltL2:1150
Adapter#:3 Temp:62 Load:99 Fan:NA Level:2   CoreL0:250 CoreL1:500 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:4 Temp:43 Load:99 Fan:NA Level:2   CoreL0:157 CoreL1:399 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000
Adapter#:5 Temp:60 Load:99 Fan:86 Level:2   CoreL0:157 CoreL1:400 CoreL2:750 MemL0:198 MemL1:199 MemL2:200 mVoltL0:950 mVoltL1:999 mVoltL2:1000

In this case, I changed the clock speed on the 5970 with only one working GPU (Adapter #4) to 400MHz, and as you can see the hashrate of GPU 1 went down instead of GPU 2's.

It's not just this configuration of cards either, I've noticed this before with 4 GPUs in the system but the cards in different orders.
hero member
Activity: 591
Merit: 500
How accurately would it load balance between the pools?
It sounds like it tries to balance the work equally, but I haven't tested it out myself to see how it works.
legendary
Activity: 1400
Merit: 1005
Question: Is there a way to mine on multiple pools at the same time with CGMiner? In other words, if I have 2 BFL miners and want to point one to one pool and the other to a different pool, how would I go about setting that up?
The --load-balance flag will basically do that for you.
How accurately would it load balance between the pools?
legendary
Activity: 1795
Merit: 1208
This is not OK.
OK, I have a few issues I would like to raise. I would look at these myself, but I just don't have the time to analyse and learn the code to do the work. I am still looking at it; it's just going to take me some time...

1. If a config file is specified on the cmd line, don't attempt loading from the default location, even if the specified file is invalid. Only attempt to read from the default location if no file is specified.

2. When saving a file via the API, a blank parameter (null text) indicates that the loaded file should be updated. It should save to the file it actually uses, whether that's the file specified on the cmd line or the default location. (Kano is looking into this, I think?)

3. For the BFL (and possibly the other FPGAs), when the unit is disabled, either by the user or by a fault, the worker thread should not be terminated. Comms with the device should be maintained for (at least) 2 reasons:
a) To update the state of the device (alive or dead, etc.): no comms attempts mean we don't know.
b) To update the temps of the device: no comms means we can't ask for the temperature.
Disabling the device should just stop new work being sent to it.

4. For the BFL (and possibly the other FPGAs), if a device dies and a re-enable request is sent, the device should be re-initialised.
I found that when a BFL throttles, cgminer would report it producing zero hashes and disable it. Hitting re-enable didn't do anything, but restarting cgminer would bring it back to life. This suggests the BFL just needs re-initialising.

Also, I see that when work is submitted to the BFL, it waits 4500ms and then starts polling at 10ms intervals for results. I think there should be a timeout at maybe 10s (BFL say that work should take 5.125s, so 10s is ample), at which point the BFL is declared sick and re-initialised.
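The proposed behaviour could be sketched like this (illustrative Python, not cgminer's actual C code; the device object and its poll()/reinitialise() hooks are hypothetical):

```python
import time

def wait_for_result(device, work_delay=4.5, poll_interval=0.01, timeout=10.0):
    """Wait for the BFL to return a result; declare it sick on timeout.

    work_delay:    initial wait after sending work (the 4500 ms above).
    poll_interval: gap between polls (10 ms).
    timeout:       total time before the device is declared sick
                   (~2x the 5.125 s BFL quotes for a work unit).
    """
    time.sleep(work_delay)
    deadline = time.monotonic() + (timeout - work_delay)
    while time.monotonic() < deadline:
        result = device.poll()   # None until the unit finishes
        if result is not None:
            return result
        time.sleep(poll_interval)
    # No result within the timeout: mark the device sick and
    # re-initialise it, rather than leaving the thread spinning forever.
    device.sick = True
    device.reinitialise()
    return None
```

The key point is the bounded loop: a throttled or wedged unit gets re-initialised automatically instead of requiring a full cgminer restart.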
legendary
Activity: 1260
Merit: 1000
I actually turn swap off on all of my primary machines for a number of reasons, but primarily for security.  Having worked in data forensics for a while, you'd be horrified to see what kind of cruft is left over in your Windows swap file that can be easily recovered with the right tools.  Even if you delete your swap file regularly, chances are large portions of previous swap files are still floating around on your disk, not yet overwritten by the OS.

The first thing I do is turn it off... in this case, this machine has 24 GB of RAM, so if it's swapping, it's got other issues that are far more pressing than running out of memory for what I use it for.  I went back to a previous version of cgminer that I know worked and it had the same problem, so it's definitely not any changes made to cgminer that are causing it.

I will try X'ing out of CGMiner instead of Q'ing out to see what happens there, but I don't hold out a lot of hope for that.
rjk
sr. member
Activity: 448
Merit: 250
1ngldh
And totally off topic:
At work, our analysis server's storage array just got upgraded to 20 250GB enterprise-grade SSDs.  Utterly mind-blowing performance on complex simulations.  The coolest part is that the array chassis is only 2U and has hotswap bays for 32 2.5" drives.  It is only a matter of time before SSDs replace everything (except maybe nearline storage).  The only con is cost, and that is falling by about 50% every year (faster than Moore's law).
Everyone needs to experience a huge array of enterprise SSDs at least once in their life. It is very impressive. And I doubt you can make your throughput graphs dip by shouting at the array, like you can with regular drives.

EDIT: http://www.youtube.com/watch?v=tDacjrSCeq4
donator
Activity: 1218
Merit: 1079
Gerald Davis
Is your swap file and temp directory set to the SSD?  If so, I would put a regular spinning hard drive in the system just for that purpose.  It's supposedly bad to continuously re-write data to SSDs.

True, to a point.  Still, in any properly configured system the amount of writing done to swap and temp files shouldn't materially contribute to wear.  All SSDs have a finite number of writes, but wear levelling and overprovisioning (a 480GB SSD likely has 500GB to 550GB of actual flash) mean they should work just fine for years.

Some back-of-napkin math:
480GB SSD (w/ 500GB of actual flash) and 500K average writes per cell.  Swap file writes per day: 1TB (very unlikely).
Gross write throughput = 500 GB * 500K = 250 PB (1PB = 1000TB).

In 10 years, swap file writes ~= 4 PB, roughly 2% of the drive's rated write capacity.  It is unlikely someone's swap file usage approaches anywhere near 1 TB per day unless they open and close applications on a near-continual basis.  When drives were much smaller, had lower write ratings, and system memory was more expensive (greater % of paging), it was more of an issue.
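The figures check out; here's the same arithmetic spelled out (same assumptions as above: 500 GB of raw flash, 500K writes per cell, 1 TB of swap writes per day):

```python
flash_gb = 500             # raw flash in the 480GB drive
writes_per_cell = 500_000  # rated average writes per cell

# Gross write capacity: 500 GB * 500K writes = 250 PB (1 PB = 1000 TB)
gross_pb = flash_gb * writes_per_cell / 1_000_000  # GB -> PB

# Ten years of 1 TB/day swap writes
swap_pb = 1 * 365 * 10 / 1000                      # TB -> PB

print(gross_pb)                            # 250.0
print(swap_pb)                             # 3.65 (the ~4 PB above)
print(round(100 * swap_pb / gross_pb, 1))  # 1.5 (roughly the 2% above)
```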

And totally off topic:
At work, our analysis server's storage array just got upgraded to 20 250GB enterprise-grade SSDs.  Utterly mind-blowing performance on complex simulations.  It provided more of a performance boost than increasing the RAM from 32GB to 64GB, and cost less.  The coolest part is that the chassis for the external array is only 2U and has 32 2.5" hotswap bays.  It is only a matter of time before SSDs replace everything (except maybe nearline storage).  The only con is cost, and that is falling by about 50% every year (faster than Moore's law).
donator
Activity: 1218
Merit: 1079
Gerald Davis
If I type in --temp-cutoff 60

would this mean that if a GPU core reaches 60C then cgminer turns off?

Yes.  Technically cgminer never shuts down; it simply marks the GPU threads disabled and idles them.  Remember that even when idle the GPUs are going to dump ~30W each into the loop (less for the 7000 series).  Also see the notes on hysteresis below (by default, cutoff wouldn't occur until 63).

Quote

Right now my GPU core never goes above 52C; the VRM, however, reaches 65C.

So unless the pump fails I will never reach more than 52C, if that makes sense.

edit ---> I found this in my cgminer.cfg:
"temp-cutoff" : "95",  (I set this to 65)
"temp-overheat" : "85",  (I set this to 64)
"temp-target" : "75",  (I set this to 64)
^^^^^^^ There are no fans controlled by the GPU; it's a radiator with fans.

temp-target & temp-overheat should be set, but they won't have much effect.

when temp > temp-target + hysteresis, cgminer will either lower the clock OR raise the fan speed
when temp > temp-overheat + hysteresis, cgminer will drop the clock to the lowest value AND set the fan @ 100%
when temp > temp-cutoff + hysteresis, cgminer will disable the GPU thread (forced idle)

A couple of other notes:
You must have "auto-gpu" : true in your config file or cgminer will ignore all the above values.
If you set a gpu clock range (gpu-engine: 750-850) then cgminer will start @ highest clock and lower it as temp > temp-target.
If you set no gpu range (gpu-engine: 850) then cgminer will create a range from default to specified value (i.e. 725-850 for a 5970).

For watercooling, the temp-target & temp-overheat values will not have much effect.  They will cause cgminer to lower clocks, but watercooling is so good at keeping temps in line that any jump in temp likely means lowering the clocks won't be sufficient to keep the system from going over the cutoff value.

Keep in mind the effect of the hysteresis value (default is 3?).  It is how far the temp has to move above or below the specified value for an effect to take place.  So if your cutoff is 60 and hysteresis is 3, your GPU won't idle until 63.  Make sure to consider that when setting your values.
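Pulling those notes together, a cgminer.cfg fragment along these lines (values illustrative, using the watercooled setup's numbers above) enables the auto-GPU behaviour being described:

```json
{
  "auto-gpu" : true,
  "gpu-engine" : "750-850",
  "temp-target" : "64",
  "temp-overheat" : "64",
  "temp-cutoff" : "65",
  "temp-hysteresis" : "3"
}
```

With the hysteresis at 3, the GPU thread in this example wouldn't actually idle until 68, per the note above.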

 

hero member
Activity: 546
Merit: 500
Does it do the same when you just force-close it by hitting the "x" instead of Q?  What's the benefit of using Q?

legendary
Activity: 3583
Merit: 1094
Think for yourself

It's a 480 GB SSD; I don't think SpinRite will help.

The box is a fairly recent reload, and I'm not keen on doing it again.  I was hoping someone might have a magic bullet.

Ah, no, I don't think it would be good to run any hard-drive utility on an SSD.

Is your swap file and temp directory set to the SSD?  If so, I would put a regular spinning hard drive in the system just for that purpose.  It's supposedly bad to continuously re-write data to SSDs.
Sam