Topic: Avalon ASIC users thread - page 154. (Read 438724 times)

Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

So the shown HW errors are a multiple of the diff I'm mining at, which gives a higher percentage?

cgminer restarted in the night, MHz is again at 341

I'm getting a high rate of rejects from the pool, so cgminer is showing me nearly 79 GH/s but on the pool (bitparking) it's only around 71 GH/s, like it was at 300 MHz?
watching this

The hardware errors need to be divided by the diff... I'm absolutely sure you're not at 10-20% errors.

fhh

legendary

Activity: 1206

Merit: 1000

Quote from: shmadz on June 25, 2013, 11:56:21 PM

Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

So the shown HW errors are a multiple of the diff I'm mining at, which gives a higher percentage?

cgminer restarted in the night, MHz is again at 341

I'm getting a high rate of rejects from the pool, so cgminer is showing me nearly 79 GH/s but on the pool (bitparking) it's only around 71 GH/s, like it was at 300 MHz?
watching this

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: -ck on June 25, 2013, 11:39:41 PM

Very much dependent on the chips, so this can only be a wild guess, but... 80+ degrees?

Thank you CKolivas! 80C will cook me and my tiny apartment. I think I need to figure out a way to vent the heat directly outside without letting the rain and snow in.

Hah, well don't take my word for it, as I said, it's pure speculation.

shmadz

legendary

Activity: 1512

Merit: 1000

@theshmadz

Quote from: -ck on June 25, 2013, 11:39:41 PM

Quote from: shmadz on June 25, 2013, 11:34:56 PM

Quote from: -ck on June 25, 2013, 11:10:03 PM

Quote from: shmadz on June 25, 2013, 11:34:56 PM

I am seeing little or no improvement by cooling with a portable A/C.

Unit with A/C
1h 37m 58s 83896.29 temp3 43 freq(auto) 354

Unit without A/C
6h 54m 22s 83111.32 temp3 53 freq(auto) 353

I guessed this might be the case since the temperatures really aren't getting into the error range even with regular air cooling - especially since it's 3 degrees at my home overnight and the hashrate doesn't go up. I suspect the hashrate will only get higher with more voltage given to the chips.

just curious what might the "error range" be? it's getting rather hot in here and I think I might have to buy another AC unit... summer is right around the corner...

Very much dependent on the chips, so this can only be a wild guess, but... 80+ degrees?

Thank you CKolivas! 80C will cook me and my tiny apartment. I think I need to figure out a way to vent the heat directly outside without letting the rain and snow in.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: -ck on June 25, 2013, 11:10:03 PM

I am seeing little or no improvement by cooling with a portable A/C.

Unit with A/C
1h 37m 58s 83896.29 temp3 43 freq(auto) 354

Unit without A/C
6h 54m 22s 83111.32 temp3 53 freq(auto) 353

I guessed this might be the case since the temperatures really aren't getting into the error range even with regular air cooling - especially since it's 3 degrees at my home overnight and the hashrate doesn't go up. I suspect the hashrate will only get higher with more voltage given to the chips.

just curious what might the "error range" be? it's getting rather hot in here and I think I might have to buy another AC unit... summer is right around the corner...

Very much dependent on the chips, so this can only be a wild guess, but... 80+ degrees?

shmadz

legendary

Activity: 1512

Merit: 1000

@theshmadz

I am seeing little or no improvement by cooling with a portable A/C.

Unit with A/C
1h 37m 58s 83896.29 temp3 43 freq(auto) 354

Unit without A/C
6h 54m 22s 83111.32 temp3 53 freq(auto) 353

I guessed this might be the case since the temperatures really aren't getting into the error range even with regular air cooling - especially since it's 3 degrees at my home overnight and the hashrate doesn't go up. I suspect the hashrate will only get higher with more voltage given to the chips.

just curious what might the "error range" be? it's getting rather hot in here and I think I might have to buy another AC unit... summer is right around the corner...

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: johnyj on June 25, 2013, 09:00:26 PM

I am seeing little or no improvement by cooling with a portable A/C.

Unit with A/C
1h 37m 58s 83896.29 temp3 43 freq(auto) 354

Unit without A/C
6h 54m 22s 83111.32 temp3 53 freq(auto) 353

I guessed this might be the case since the temperatures really aren't getting into the error range even with regular air cooling - especially since it's 3 degrees at my home overnight and the hashrate doesn't go up. I suspect the hashrate will only get higher with more voltage given to the chips.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Thanks for the detailed info! I noticed that during the first 10 or so hours the overclocked avalon was stable, but then it became more and more unstable. Even though the outside temp dropped significantly during the night, cgminer restarted repeatedly. I feel that the instability might come from the FPGA. What could be the cause of that? Have you observed the same accumulated instability over time?

P.S. also sent 1B to you, cgminer still rules

And thank you

I'm sure instability can manifest in any number of ways, and it's probably either resetting the device regularly due to the chips failing or idling frequently due to the PSU not keeping up or something along those lines.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: dogie on June 25, 2013, 08:56:35 PM

How is the auto balancing between maximising clock speed but minimising fan speed? What is the hierarchy?

Unlike the GPU code, they're totally independent. Clock speed is determined solely by hardware errors, whereas fan speed is determined by temperature. HW errors tend to go hand in hand with temperature rise on this sort of hardware, whereas GPUs are designed to be deterministic right up to failure, so hw errors are meant to almost never happen.

johnyj

legendary

Activity: 1988

Merit: 1012

Beyond Imagination

Quote from: -ck on June 25, 2013, 06:58:04 PM

A few notes about the auto-clocking approach.

First and foremost, you can fry your hardware as you are running your avalon out of specification, especially if you try it on a batch 1 device with its lower power and quality PSU.

As is virtually always the case, manually fine tuning the final result will always be better than an automated process that guesses. With time I wish to get rid of the requirement to have fixed intervals and allow the user to specify any arbitrary value for the frequency, though the interface coping with it is a bit of an issue at the moment.

Ironically some people are finding the frequency a little too high and others a little too low. I suspect everyone is looking at a different endpoint for what is an ideal frequency in their eyes. The targets I've set are based on hardware error as a percentage, with hysteresis of +/- 0.25% - this is because a 0.5% increase in hardware errors works out to the amount the hashrate would rise with 2 MHz increments; i.e. if your hardware error count is going up at the same rate as the hashrate should rise, you are wasting energy. Ideally, a regression plot is what would be needed, getting the hashrate rise with each increment and the hw error percentage rise, and seeing when one grows faster than the other, but these are absurd stats to go looking for, especially when the values fluctuate wildly even under normal circumstances. By default with avalon-auto, you will get hardware errors of 1~1.5%. When looking at the hardware error count, make sure you are comparing it to the diff1 shares and not the accepted shares, since you will almost certainly be mining at higher diff. Hardware errors are harmless in their own right but indicative of how hard you're pushing the chips for their available voltage and cooling. It sounds like these chips are capable of much more with more voltage but no one's done said mod yet.
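To make the hysteresis idea concrete, here is a rough sketch of how such a rule could look. This is not the cgminer source. The function names, the 1.25% midpoint (taken as the middle of the 1~1.5% default range), and the assumption that hashrate scales linearly with clock are all illustrative guesses on top of what the post says.

```python
STEP_MHZ = 2            # frequency increment mentioned in the post
TARGET_PCT = 1.25       # ASSUMPTION: midpoint of the 1~1.5% default range
HYSTERESIS_PCT = 0.25   # +/- band mentioned in the post


def relative_gain_pct(freq_mhz, step_mhz=STEP_MHZ):
    """Percent hashrate gained by one step, ASSUMING hashrate scales linearly with clock."""
    return step_mhz * 100.0 / freq_mhz


def next_frequency(freq_mhz, hw_error_pct):
    """Hypothetical hysteresis rule: only move the clock when errors leave the band."""
    if hw_error_pct > TARGET_PCT + HYSTERESIS_PCT:
        return freq_mhz - STEP_MHZ   # too many errors, back off one step
    if hw_error_pct < TARGET_PCT - HYSTERESIS_PCT:
        return freq_mhz + STEP_MHZ   # headroom left, try one step up
    return freq_mhz                  # inside the band, hold steady


if __name__ == "__main__":
    print(round(relative_gain_pct(350), 2))  # gain from one 2 MHz step at 350 MHz
    print(next_frequency(350, 2.0))
    print(next_frequency(350, 1.2))
```

Around 350 MHz a 2 MHz step is worth roughly 0.57% of hashrate under the linear assumption, which is in the same ballpark as the 0.5% figure above. The dead band is what stops the clock from flapping up and down on every noisy reading.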

The way to calculate hardware error percentage is:
HW * 100 / (diff1 + HW)
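
As a quick worked example of the formula above, here is a small sketch. The function name and the sample numbers are made up for illustration; only the formula itself comes from the post.

```python
def hw_error_percent(hw_errors, diff1_shares):
    """HW * 100 / (diff1 + HW), returning 0.0 when nothing has been counted yet."""
    total = hw_errors + diff1_shares
    if total == 0:
        return 0.0
    return hw_errors * 100.0 / total


if __name__ == "__main__":
    hw = 150       # made-up hardware error count
    diff1 = 9850   # made-up diff1 share count
    print(hw_error_percent(hw, diff1))

    # Plugging in accepted shares instead of diff1 shares inflates the figure.
    # ASSUMPTION: at pool difficulty 16, accepted shares are roughly diff1 / 16.
    accepted = diff1 // 16
    print(round(hw_error_percent(hw, accepted), 1))
```

With these made-up numbers the correct figure is 1.5%, while dividing by accepted shares at difficulty 16 gives a number close to 20%, even though the hardware is behaving the same.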

It's also worth mentioning that to simplify the calculation of different frequencies, the values passed to the avalon with this latest firmware on the "regular values", i.e. 300 and below, are slightly lower than the values that would have been passed to it before, but it should make only a negligible difference to hashrate, lost in the noise of normal hashrate variance. The "timeout" value passed is also smaller now, which means you may hit the limit at lower speeds than you used to - but the old timeouts were too high, and even if you apparently had a higher hashrate, if you go back and check your stats you may find you were getting more rejects. This is because the higher timeouts were leading to duplicate shares being generated, so it is only a disadvantage.

A sure-fire sign that you're overdoing it is cgminer repeatedly being restarted by the avalon watchdog, or periods of hashrate dropping, or smoke coming out of your PSU.

Thanks for the detailed info! I noticed that during the first 10 or so hours the overclocked avalon was stable, but then it became more and more unstable. Even though the outside temp dropped significantly during the night, cgminer restarted repeatedly. I feel that the instability might come from the FPGA. What could be the cause of that? Have you observed the same accumulated instability over time?

P.S. also sent 1B to you, cgminer still rules

dogie

legendary

Activity: 1666

Merit: 1185

dogiecoin.com

How is the auto balancing between maximising clock speed but minimising fan speed? What is the hierarchy?

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Quote from: btcsql on June 25, 2013, 07:41:29 PM

Quote from: -ck on June 25, 2013, 07:31:02 PM

Quote from: btcsql on June 25, 2013, 07:29:00 PM

Quote from: -ck on June 25, 2013, 07:31:02 PM

Hi ckolivas, thanks for everything. What I have noticed is STROMBOM's firmware + 325 MHz is giving about 1-2% HW errors. On the other hand the latest you put out with the --auto and temp targeting is throwing 15-20% HW errors at the same clock. Is there any way to combine the best of both worlds and get strombom's level of HW errors, but with the ability to control the temp? Would be greatly appreciated.

No idea why that would be the case. Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

Right? No idea here either. I literally have the exact settings saved and switched between the two firmwares. Auto was setting the clock to 327, but even without Auto and manually set to 325, the HW error was still 15-20%, compared to strombom's 2%. So weird!

Try restarting it a few times from the interface perhaps? I find it a bit less reliable to start up normally. But yeah, I don't know why that would be the case...

Tried restarting multiple times from the interface, still seeing 15-20% HW errors. Soo weird.

Hmm... Auto won't start changing clocks unless the actual nonces returned are within 10% of expected, so perhaps try enabling auto and start at lower clocks like 300.
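
For anyone wondering what a "within 10% of expected" gate might look like, here is a minimal sketch. It is an assumption about the shape of the check, not the real cgminer code: the function name is hypothetical, and treating the 10% as a symmetric band around the expected count is a guess.

```python
def auto_may_adjust(actual_nonces, expected_nonces, tolerance=0.10):
    """Hypothetical gate: allow clock changes only if nonces are near the expected count."""
    if expected_nonces <= 0:
        return False  # nothing to compare against yet
    return abs(actual_nonces - expected_nonces) <= tolerance * expected_nonces


if __name__ == "__main__":
    print(auto_may_adjust(95, 100))   # close to expected, auto may change clocks
    print(auto_may_adjust(80, 100))   # too far off, auto leaves the clock alone
```

If the device is returning far fewer nonces than expected at 325, a gate like this would keep auto from touching the clock at all, which is why starting lower (e.g. 300) and letting it climb could behave differently.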

btcsql

sr. member

Activity: 292

Merit: 250

Quote from: btcsql on June 25, 2013, 07:29:00 PM

Hi ckolivas, thanks for everything. What I have noticed is STROMBOM's firmware + 325 MHz is giving about 1-2% HW errors. On the other hand the latest you put out with the --auto and temp targeting is throwing 15-20% HW errors at the same clock. Is there any way to combine the best of both worlds and get strombom's level of HW errors, but with the ability to control the temp? Would be greatly appreciated.

No idea why that would be the case. Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

Right? No idea here either. I literally have the exact settings saved and switched between the two firmwares. Auto was setting the clock to 327, but even without Auto and manually set to 325, the HW error was still 15-20%, compared to strombom's 2%. So weird!

Try restarting it a few times from the interface perhaps? I find it a bit less reliable to start up normally. But yeah, I don't know why that would be the case...

Tried restarting multiple times from the interface, still seeing 15-20% HW errors. Soo weird.

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/

Hi ckolivas, thanks for everything. What I have noticed is STROMBOM's firmware + 325 MHz is giving about 1-2% HW errors. On the other hand the latest you put out with the --auto and temp targeting is throwing 15-20% HW errors at the same clock. Is there any way to combine the best of both worlds and get strombom's level of HW errors, but with the ability to control the temp? Would be greatly appreciated.

No idea why that would be the case. Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

Right? No idea here either. I literally have the exact settings saved and switched between the two firmwares. Auto was setting the clock to 327, but even without Auto and manually set to 325, the HW error was still 15-20%, compared to strombom's 2%. So weird!

Try restarting it a few times from the interface perhaps? I find it a bit less reliable to start up normally. But yeah, I don't know why that would be the case...

btcsql

sr. member

Activity: 292

Merit: 250

Hi ckolivas, thanks for everything. What I have noticed is STROMBOM's firmware + 325 MHz is giving about 1-2% HW errors. On the other hand the latest you put out with the --auto and temp targeting is throwing 15-20% HW errors at the same clock. Is there any way to combine the best of both worlds and get strombom's level of HW errors, but with the ability to control the temp? Would be greatly appreciated.

No idea why that would be the case. Auto tries to keep HW errors below 1.5%. Are you sure you're not mining at a higher diff?

Right? No idea here either. I literally have the exact settings saved and switched between the two firmwares. Auto was setting the clock to 327, but even without Auto and manually set to 325, the HW error was still 15-20%, compared to strombom's 2%. So weird!

-ck

legendary

Activity: 4088

Merit: 1631

Ruu \o/