
Topic: Hacking The KNC Firmware: Overclocking - page 55. (Read 144343 times)

legendary
Activity: 1260
Merit: 1008
December 31, 2013, 09:41:38 AM
On the November unit I have for testing, disabling work flushes in the cgminer source and running cgminer with -q -T gives me the following stats:

(5s):731.0G (avg):681.2Gh/s | A:88420915  R:1590264  HW:133439  WU:9414.2/m

Comes to ~0.15% HW errors.  Pool side 3-hour rate is 670Gh.  Disabling flushes causes an increase in rejected shares, but this seems to be far less of a loss than the false HW errors.
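For anyone who wants to reproduce the ~0.15% figure, here is a minimal Python sketch. It assumes the HW error rate is taken as HW / (A + R + HW) using the counters from the status line quoted above; the function names are made up for illustration, and the regex only targets this exact line layout.

```python
import re

# Status line copied from the post above.
STATUS = "(5s):731.0G (avg):681.2Gh/s | A:88420915  R:1590264  HW:133439  WU:9414.2/m"


def parse_status(line):
    """Pull the A, R, HW and WU counters out of a cgminer summary line."""
    fields = dict(re.findall(r"\b(A|R|HW|WU):([0-9.]+)", line))
    return {
        "accepted": int(fields["A"]),
        "rejected": int(fields["R"]),
        "hw": int(fields["HW"]),
        "wu_per_min": float(fields["WU"]),
    }


def hw_error_rate(stats):
    """Assumed definition: HW errors as a fraction of all returned work."""
    total = stats["accepted"] + stats["rejected"] + stats["hw"]
    return stats["hw"] / total


stats = parse_status(STATUS)
print(f"HW error rate: {hw_error_rate(stats) * 100:.2f}%")  # prints 0.15%
```

If your firmware defines the percentage differently (for example HW / A only), swap the denominator; with these counters the result barely moves.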

Basically, flushing is broken in all current firmwares it seems.  There is also a false HW error issue in the cgminer driver that was partially fixed in this commit to their cgminer fork.

I'm currently working on a fresh rewrite of the driver.

Just figured I'd throw this out there somewhere.

-wk

thanks for sharing wk!

Flushwork has been the weakest point of the KNC cgminer driver since the beginning. Just drop us a line when your rewrite is complete.
legendary
Activity: 1428
Merit: 1000
https://www.bitworks.io
December 30, 2013, 11:54:19 PM
~650Gh-ish and would steadily fall.

Binary I made that works on the november unit I have for dev, untested on prior units (but should work): http://vpn.wizkid057.com/nas/cgminer-binary-wizdev20131222

-wk

Thanks for posting.. Gave it a shot on my October.. Spewing out 6-8 HW errors per second on the console.. The hash rate on the pool settled about 10% less than it was before so I reverted back..

Interesting experiment, I assumed the performance would be similar to yours but I guess there are more differences between October and November than I would have first guessed.
legendary
Activity: 1223
Merit: 1006
December 30, 2013, 09:38:45 PM
On the November unit I have for testing, disabling work flushes in the cgminer source and running cgminer with -q -T gives me the following stats:

(5s):731.0G (avg):681.2Gh/s | A:88420915  R:1590264  HW:133439  WU:9414.2/m

Comes to ~0.15% HW errors.  Pool side 3-hour rate is 670Gh.  Disabling flushes causes an increase in rejected shares, but this seems to be far less of a loss than the false HW errors.

Basically, flushing is broken in all current firmwares it seems.  There is also a false HW error issue in the cgminer driver that was partially fixed in this commit to their cgminer fork.

I'm currently working on a fresh rewrite of the driver.

Just figured I'd throw this out there somewhere.

-wk

What was this Jupiter reporting at the pool for a 3-hour before this change? Any chance you would be willing to share the build? Would like to shake it out some on one of my Octobers..

~650Gh-ish and would steadily fall.

Binary I made that works on the november unit I have for dev, untested on prior units (but should work): http://vpn.wizkid057.com/nas/cgminer-binary-wizdev20131222

-wk
legendary
Activity: 1428
Merit: 1000
https://www.bitworks.io
December 30, 2013, 08:33:30 PM
On the November unit I have for testing, disabling work flushes in the cgminer source and running cgminer with -q -T gives me the following stats:

(5s):731.0G (avg):681.2Gh/s | A:88420915  R:1590264  HW:133439  WU:9414.2/m

Comes to ~0.15% HW errors.  Pool side 3-hour rate is 670Gh.  Disabling flushes causes an increase in rejected shares, but this seems to be far less of a loss than the false HW errors.

Basically, flushing is broken in all current firmwares it seems.  There is also a false HW error issue in the cgminer driver that was partially fixed in this commit to their cgminer fork.

I'm currently working on a fresh rewrite of the driver.

Just figured I'd throw this out there somewhere.

-wk

What was this Jupiter reporting at the pool for a 3-hour before this change? Any chance you would be willing to share the build? Would like to shake it out some on one of my Octobers..
legendary
Activity: 1223
Merit: 1006
December 30, 2013, 06:35:55 PM
On the November unit I have for testing, disabling work flushes in the cgminer source and running cgminer with -q -T gives me the following stats:

(5s):731.0G (avg):681.2Gh/s | A:88420915  R:1590264  HW:133439  WU:9414.2/m

Comes to ~0.15% HW errors.  Pool side 3-hour rate is 670Gh.  Disabling flushes causes an increase in rejected shares, but this seems to be far less of a loss than the false HW errors.

Basically, flushing is broken in all current firmwares it seems.  There is also a false HW error issue in the cgminer driver that was partially fixed in this commit to their cgminer fork.

I'm currently working on a fresh rewrite of the driver.

Just figured I'd throw this out there somewhere.

-wk
legendary
Activity: 1400
Merit: 1000
I owe my soul to the Bitcoin code...
December 30, 2013, 05:54:55 PM
I have my jupiter now at '201' with the case off and an extra 120mm fan right in the middle pushing the cooler intake air toward the rear boards.  Seems to give steady 621GH/s (12+hrs) with temps from 40-58C across the boards.  With proper cooling I have kept the wattage per asic around 40W so as not to overload the vrms.  Not a bad result for a small text editor change and a fan.

Next up will be to see if I can lower the voltage needed to keep this performance and maybe go to '211'.  I had missed tinkering in the asic era,  no longer.  Grin

what's your HW errors rate?

I'm going to apply 201/211 to the rear-left asic just to see how it behaves. if everything is ok I'll set 201 for the front boards as well.


Right now it's at 1.9% hardware errors.  I may go up to '211' if I can keep the wattage close to 40W per vrm.
newbie
Activity: 31
Merit: 0
December 30, 2013, 03:38:25 PM
Just to share my settings/performance:
- October unit (Asic 1 = 4 VRM, other 3 are 8 VRM but only 4 active)
- Firmware 0.99-tune
- Clocksetting on 231, no other firmware changes.
- Hardware mod: Lowered the Asicboard-fans (more airflow over the board), added 2 fans at the outlet (so now push + pull) and slightly lowered fan speed to reduce the noise (80%).

Images
Status page --> 676 GH/s / 9131 WU
Advanced page --> 2.5V SPI and 256kHz frequency, slightly modded volts (higher, as required for the increased clock), not completely fine-tuned.
CGminer --> Error rate used to be 3% @ 560GH/s; after I lowered the volts a bit following the initial increase, it rose to just below 4%. Fine with me, as it lowered the temps 5-10 degrees and decreased power to around 40 W per DC instead of closer to 50 W.
Pool stat  --> 650ghs average
Power consumption in bertmod used to be 453 W, now it is 651 W (a 43% increase in consumption for an 18-19% performance increase). With voltage fine-tuning the power consumption should come down a bit; right now everything is on the safe side (safe as in: higher voltage than required, so that cores don't get shut down).
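The numbers in this post can be cross-checked with a short Python sketch. It assumes WU is difficulty-1 shares per minute and that each such share represents about 2^32 hashes on average; the helper names are illustrative only.

```python
HASHES_PER_DIFF1_SHARE = 2 ** 32  # expected hashes per difficulty-1 share


def wu_to_ghs(wu_per_min):
    """Convert cgminer work utility (diff-1 shares/min) to effective GH/s."""
    return wu_per_min * HASHES_PER_DIFF1_SHARE / 60 / 1e9


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


effective_ghs = wu_to_ghs(9131)          # WU from the status page above
power_up = percent_increase(453, 651)    # watts before and after the overclock

print(f"effective hashrate: {effective_ghs:.1f} GH/s")
print(f"power increase:     {power_up:.1f}%")
print(f"efficiency:         {651 / effective_ghs:.2f} W per effective GH/s")
```

The WU-derived rate comes out a little above 650 GH/s, which lines up with the reported pool average rather than the 676 GH/s shown on the status page, and the power change matches the quoted 43%.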
legendary
Activity: 1260
Merit: 1008
December 30, 2013, 01:58:28 PM
I have my jupiter now at '201' with the case off and an extra 120mm fan right in the middle pushing the cooler intake air toward the rear boards.  Seems to give steady 621GH/s (12+hrs) with temps from 40-58C across the boards.  With proper cooling I have kept the wattage per asic around 40W so as not to overload the vrms.  Not a bad result for a small text editor change and a fan.

Next up will be to see if I can lower the voltage needed to keep this performance and maybe go to '211'.  I had missed tinkering in the asic era,  no longer.  Grin

what's your HW errors rate?

I'm going to apply 201/211 to the rear-left asic just to see how it behaves. if everything is ok I'll set 201 for the front boards as well.
legendary
Activity: 1400
Merit: 1000
I owe my soul to the Bitcoin code...
December 30, 2013, 11:31:25 AM
I have my jupiter now at '201' with the case off and an extra 120mm fan right in the middle pushing the cooler intake air toward the rear boards.  Seems to give steady 621GH/s (12+hrs) with temps from 40-58C across the boards.  With proper cooling I have kept the wattage per asic around 40W so as not to overload the vrms.  Not a bad result for a small text editor change and a fan.

Next up will be to see if I can lower the voltage needed to keep this performance and maybe go to '211'.  I had missed tinkering in the asic era,  no longer.  Grin
legendary
Activity: 1260
Merit: 1008
December 30, 2013, 10:43:20 AM
other two things I've just discovered:

- voltage settings are persistent across reboots. it makes sense since they're stored in /config/advanced.conf

- the rear asic boards run hotter than the front ones (with or without the case, and with or without overclocking). in my case the hottest is the rear-left one. I have a spare case fan and I've decided to use it to cool down the aforementioned asic board. The result was quite impressive: on a 1F1 OC machine the temps dropped from 65C to 42C and the overall HW error rate dropped from 4.5% to 3.1%.

I'm planning to do the same for the rear-right asic the next time I go to the colo facility. Once I've done that I'll try to run the miner at a higher freq/voltage.
legendary
Activity: 1260
Merit: 1008
December 29, 2013, 06:35:16 PM
Wonderful thread gentlemen, really shows the pioneering spirit of the mining core. Thank you.

Has there been a conglomeration yet of the tuning suite's best practices?  Kind of like 'these ranges of voltages seem to keep up performance for less power' or 'certain SPI frequencies yield better stability results' etc.?

I would like to start adjusting values on the Adv. page if there is cause but without a manual or even best practices it is daunting. 

no. unfortunately there's no "easy" howto so far. anyway, the two things I've learnt about the "Advanced" settings are:

- once you've overclocked you'll probably notice more disabled cores than usual. to get those cores working again you need to set a higher voltage for the die they belong to.

- The higher the hashrate, the higher the SPI freq needs to be if you want to keep the HW error rate in check, and that probably means you also need to raise the SPI voltage (I still haven't tried this, though)



legendary
Activity: 1400
Merit: 1000
I owe my soul to the Bitcoin code...
December 29, 2013, 01:02:55 PM
Wonderful thread gentlemen, really shows the pioneering spirit of the mining core. Thank you.

Has there been a conglomeration yet of the tuning suite's best practices?  Kind of like 'these ranges of voltages seem to keep up performance for less power' or 'certain SPI frequencies yield better stability results' etc.?

I would like to start adjusting values on the Adv. page if there is cause but without a manual or even best practices it is daunting. 
legendary
Activity: 1260
Merit: 1008
December 27, 2013, 03:38:14 PM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231

Wow, that is a crazy overclock then  Grin

Do you mind posting a screenshot of your Status and Advanced Tab + SSH Scgminer Tab + your settings.

Also are you not worried about drawing more than 200A per board?

Thank you.

Yes, I am worried about the 200A per board, but I'm willing to risk it. Here's the info you requested:

FW version: 0.99-tuning (October)
cgminer version: 3.9.0
Status Page
Advanced Tab
modified cgminer.sh file

I want to congratulate you on your setup and your achievement Smiley

having said that, the temps shown on the status page are very high for the two 4-VRM asic slots (68 and 78). Did you buy these two from KNC's expansion batch?

One last thing I noticed is that, again for these two 4-VRM slots, you have set a negative voltage for each die. what does that mean?

Anyway, after seeing this it seems that 8-VRM boards are better suited for OC.

Many thanks for sharing this info, really appreciated.
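To put the "more than 200A per board" concern from the quoted exchange into numbers, here is a rough Python sketch. It assumes 4 active VRMs per board (a figure mentioned for October units elsewhere in this thread, not confirmed for this particular machine) and uses the 57-60 A per VRM range reported above; the function name is illustrative.

```python
ACTIVE_VRMS_PER_BOARD = 4   # assumption: 4 active VRMs on each board
BOARD_LIMIT_AMPS = 200      # the per-board figure raised in the thread


def board_current(amps_per_vrm, active_vrms=ACTIVE_VRMS_PER_BOARD):
    """Total board current if every active VRM carries the same load."""
    return amps_per_vrm * active_vrms


for amps in (57, 60):
    total = board_current(amps)
    over = total - BOARD_LIMIT_AMPS
    print(f"{amps} A per VRM -> {total} A per board ({over} A over {BOARD_LIMIT_AMPS} A)")
```

Under that assumption the board total lands between 228 A and 240 A, which is why the question about the 200 A figure keeps coming up.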
legendary
Activity: 1260
Merit: 1008
December 27, 2013, 03:27:23 PM

nothing.

your overclock settings will stay in place, but your change to voltage won't get applied :/


So the voltages I have set would not get back to default when I apply the overclocking?



right, they would not get back.

restarting cgminer through cgminer.sh does not overwrite the voltage settings you apply via "Advanced" tab.

hero member
Activity: 812
Merit: 502
December 27, 2013, 01:58:06 PM

nothing.

your overclock settings will stay in place, but your change to voltage won't get applied :/


So the voltages I have set would not get back to default when I apply the overclocking?

full member
Activity: 226
Merit: 100
December 27, 2013, 12:18:59 PM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231

Wow, that is a crazy overclock then  Grin

Do you mind posting a screenshot of your Status and Advanced Tab + SSH Scgminer Tab + your settings.

Also are you not worried about drawing more than 200A per board?

Thank you.

Yes, I am worried about the 200A per board, but I'm willing to risk it. Here's the info you requested:

FW version: 0.99-tuning (October)
cgminer version: 3.9.0
Status Page
Advanced Tab
modified cgminer.sh file

Thank you. It looks like your last board is not a team player Smiley Is the last die so bad that you even underclocked it below factory frequency?

Also I assume you installed CGminer manually, because 0.99-tuning comes with 3.8.1 and 0.99.1-Tune comes with 3.9.0.

Yep, that last board hates me. And yes, I did install cgminer manually.
hero member
Activity: 812
Merit: 502
December 27, 2013, 12:16:59 PM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231

Wow, that is a crazy overclock then  Grin

Do you mind posting a screenshot of your Status and Advanced Tab + SSH Scgminer Tab + your settings.

Also are you not worried about drawing more than 200A per board?

Thank you.

Yes, I am worried about the 200A per board, but I'm willing to risk it. Here's the info you requested:

FW version: 0.99-tuning (October)
cgminer version: 3.9.0
Status Page
Advanced Tab
modified cgminer.sh file

Thank you. It looks like your last board is not a team player Smiley Is the last die so bad that you even underclocked it below factory frequency?

Also I assume you installed CGminer manually, because 0.99-tuning comes with 3.8.1 and 0.99.1-Tune comes with 3.9.0.
full member
Activity: 226
Merit: 100
December 27, 2013, 12:06:55 PM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231

Wow, that is a crazy overclock then  Grin

Do you mind posting a screenshot of your Status and Advanced Tab + SSH Scgminer Tab + your settings.

Also are you not worried about drawing more than 200A per board?

Thank you.

Yes, I am worried about the 200A per board, but I'm willing to risk it. Here's the info you requested:

FW version: 0.99-tuning (October)
cgminer version: 3.9.0
Status Page
Advanced Tab
modified cgminer.sh file
hero member
Activity: 812
Merit: 502
December 27, 2013, 11:47:19 AM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231

Wow, that is a crazy overclock then  Grin

Do you mind posting a screenshot of your Status and Advanced Tab + SSH Scgminer Tab + your settings.

Also are you not worried about drawing more than 200A per board?

Thank you.
full member
Activity: 226
Merit: 100
December 27, 2013, 11:36:54 AM
I just wanted to report that my Oct Jupiter has been running at ~950GH/s (6-modules) and pulling between 57-60 amps per VRM for about a week now. Temps range between 57-78.5 degrees Celsius. So far so good!

You are a brave man for pulling much more than the recommended max safe current  Shocked

You use 211 I assume?

I use different settings for each die, but many of them are using 231