Hacking The KNC Firmware: Overclocking - page 56.

CYPER

hero member

Activity: 812

Merit: 502

Quote from: sickpig on December 26, 2013, 10:54:38 AM

From my limited experience I think that both points apply to everyone. For what it's worth, I've been able to verify only the second one: every time I overclock the miner, a good number of cores are disabled during the first two minutes of the new cgminer session. Usually they're concentrated in one or two dies.

In my case dies are being disabled (1 die = 48 cores). If just cores are disabled then that is easy to fix - just apply more voltage to stabilize them.

Quote from: sickpig on December 26, 2013, 10:54:38 AM

On the other hand, I've never tried to verify the first point, mainly because I don't know how to do it. How do you know for sure that the clock has been reset? Do you look at the Amps, or is there any way to read the value of a PLL register?

Yes, the Amps drop. For example, you apply the overclock and you see a VRM working at 54A. You change the voltage setting just 1-2 values lower, and after you hit Apply the Amps drop from 54 to 44, which means the higher overclock frequency is no longer applied.

Quote from: sickpig on December 26, 2013, 10:54:38 AM

Instead of increasing the voltage to the maximum value, I just set it a little bit higher, taking into account how many cores are disabled in a particular die. Then I restart cgminer and wait 1 minute to see if the changes make any difference.

I use this approach because I don't want to cook my ASICs/VRMs.

I've tried that, but it doesn't work as effectively as applying max voltage. With max voltage the sleepy dies usually kick in immediately.
Also, I have a 2nd theory, which I haven't tested much: if you supply sufficient voltage to a sleepy die, it might wake up after 2-6 hours. But I don't think this method gives consistent results, and I prefer to wake them up immediately with stress/shock voltage rather than waiting hours for them to wake up naturally, which is not guaranteed to happen.

Quote from: sickpig on December 26, 2013, 10:54:38 AM

I've also increased the SPI frequency because of what 'orama said in one of his last posts:

How much? I played with this and I usually stick to 256000Mhz. I tried even more, but I can't see any correlation between this and any results.

Quote from: sickpig on December 26, 2013, 10:54:38 AM

Another thing I do is take note of all the changes I apply along the way (a good habit is copying /config/advanced.conf at different moments in time).

To check the distribution of disabled cores I use a modified version of a Perl script included in bertmod. It is an ASCII version; it only outputs temps and disabled cores per die, e.g.

Code:

Board 0: Temperature sensor: 47.5C
 DIE 0 ON: 46 OFF: 2  95.8% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 2: Temperature sensor: 64.0C
 DIE 0 ON: 47 OFF: 1  97.9% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 3: Temperature sensor: 55.0C
 DIE 0 ON: 48 OFF: 0  100% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 4: Temperature sensor: 49.0C
 DIE 0 ON: 48 OFF: 0  100% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK

How exactly did you do that?
You created an additional page within the lighttpd server?

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: CYPER on December 25, 2013, 09:10:47 PM

Quote from: sickpig on December 25, 2013, 07:52:01 PM

So you're implying that setting the clock to the default value lowers the Amps, despite the fact that you're increasing the voltage per die?

I have 2 problems with the current state of overclocking:
Problem 1 applies to everyone:
Setting the voltage on the Advanced tab resets the clock.
Problem 2 applies to me only (and maybe other people):
Setting any overclock in cgminer.sh kills a number of dies. I think it happens because when you increase the frequency of the chip the current required increases, which creates a change in the voltage/current values and so this change makes the dies sleepy.

From my limited experience I think that both points apply to everyone. For what it's worth, I've been able to verify only the second one: every time I overclock the miner, a good number of cores are disabled during the first two minutes of the new cgminer session. Usually they're concentrated in one or two dies. On the other hand, I've never tried to verify the first point, mainly because I don't know how to do it. How do you know for sure that the clock has been reset? Do you look at the Amps, or is there any way to read the value of a PLL register?

Quote from: Bitcoinorama on December 23, 2013, 10:39:30 AM

If the SPI frequency is too low then there is not enough bandwidth to collect all the good nonces found. So you want to find an equilibrium where by SPI frequency is high enough not to miss any of the nonces found, but low enough to retain a healthy noise to signal ratio and thus minimise hardware errors.

Another thing I do is take note of all the changes I apply along the way (a good habit is copying /config/advanced.conf at different moments in time).
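The habit of copying the config at different moments can be sketched as a small helper. This is only an illustration, assuming a POSIX shell on the miner; the function name `snapshot_conf`, the backup directory and the `CHANGES.log` file are hypothetical, and only /config/advanced.conf comes from the thread.

```shell
# snapshot_conf SRC DEST_DIR [NOTE]
# Copies SRC into DEST_DIR under a timestamped name and prints the new path.
# The helper name, DEST_DIR layout and CHANGES.log are illustrative assumptions.
snapshot_conf() {
    src=$1
    dest_dir=$2
    note=${3:-}
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    stamp=$(date +%Y%m%d-%H%M%S)
    out="$dest_dir/$(basename "$src").$stamp"
    cp "$src" "$out" || return 1
    # Optional one-line note describing what was changed before this snapshot.
    if [ -n "$note" ]; then
        echo "$stamp $note" >> "$dest_dir/CHANGES.log"
    fi
    echo "$out"
}
```

For example, `snapshot_conf /config/advanced.conf /config/backups "raised voltage on board 2"` would leave a dated copy next to a running log of what was changed (the /config/backups path is an assumption).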

To check the distribution of disabled cores I use a modified version of a Perl script included in bertmod. It is an ASCII version; it only outputs temps and disabled cores per die, e.g.

Code:

Board 0: Temperature sensor: 47.5C
 DIE 0 ON: 46 OFF: 2  95.8% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 2: Temperature sensor: 64.0C
 DIE 0 ON: 47 OFF: 1  97.9% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 3: Temperature sensor: 55.0C
 DIE 0 ON: 48 OFF: 0  100% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
Board 4: Temperature sensor: 49.0C
 DIE 0 ON: 48 OFF: 0  100% OK
 DIE 1 ON: 48 OFF: 0  100% OK
 DIE 2 ON: 48 OFF: 0  100% OK
 DIE 3 ON: 48 OFF: 0  100% OK
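A listing in the format above is easy to post-process. The following is a rough sketch, not part of bertmod: the function name `die_summary` is hypothetical, and it assumes the exact "Board N:" / "DIE N ON: x OFF: y" layout shown in the sample output.

```shell
# die_summary: read the per-die listing on stdin, print one line for every
# die that has disabled cores, then a grand total.
die_summary() {
    awk '
        /^ *Board [0-9]+:/ { board = $2; sub(/:$/, "", board) }
        /^ *DIE [0-9]+ ON:/ {
            # Fields: DIE <n> ON: <on> OFF: <off> <pct>% OK
            n_on += $4; n_off += $6
            if ($6 > 0)
                printf "board %s die %s: %d of %d cores off\n", board, $2, $6, $4 + $6
        }
        END { printf "total: %d on, %d off\n", n_on, n_off }
    '
}
```

Fed with the sample listing, it would flag only die 0 on boards 0 and 2, which makes it quicker to see where the disabled cores are concentrated after each restart.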

CYPER

hero member

Activity: 812

Merit: 502

Quote from: sickpig on December 25, 2013, 07:52:01 PM

So you're implying that setting the clock to the default value lowers the Amps, despite the fact that you're increasing the voltage per die?

I have 2 problems with the current state of overclocking:
Problem 1 applies to everyone:
Setting the voltage on the Advanced tab resets the clock.
Problem 2 applies to me only (and maybe other people):
Setting any overclock in cgminer.sh kills a number of dies. I think it happens because when you increase the frequency of the chip the current required increases, which creates a change in the voltage/current values and so this change makes the dies sleepy.

This is what I do: on boards with sleepy dies I increase the voltage to the max, and when I see they kick in and are alive I immediately lower it to safe values, but not so low that they fall asleep again.
Then I restart cgminer.sh, where the overclock values are, and this makes the dies fall asleep again even though the voltage hasn't changed (only the current changes, as a higher frequency results in higher current).

Quote from: sickpig on December 25, 2013, 07:52:01 PM

nothing.

your overclock settings will stay in place, but your change to voltage won't get applied :/

edit:

give IRC a try and see if hno is hanging around..

So it won't work. To be honest I don't see any certain way to achieve the same result twice. I'm just fiddling with all the settings/values I can change hoping it will work.
2-3 days ago this same miner was happily hashing overclocked at around 600GH/s.
Now it is hard even to make it hash at stock, as some dies are very hard to keep awake.

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: CYPER on December 25, 2013, 07:36:05 PM

Quote from: sickpig on December 25, 2013, 06:51:02 PM

So every time you click the "Apply" button on the "Advanced" tab, the clock is reset to default?

edit:

I've gone through the code and it seems that what happens when you click on Apply is the execution of these commands:

Code:

    waas -c /config/advanced.conf > /dev/null
    killall monitordcdc

In /config/advanced.conf there's a JSON representation of the data contained in the Advanced tab.

This means that the waas command resets the PLL to its default values. Unfortunately there's no source code for the waas executable.

Yes, I believe so as the Amps drop.

So you're implying that setting the clock to the default value lowers the Amps, despite the fact that you're increasing the voltage per die?

Quote from: CYPER on December 25, 2013, 07:36:05 PM

So what would happen if I remove that command?

nothing.

your overclock settings will stay in place, but your change to voltage won't get applied :/

edit:

give IRC a try and see if hno is hanging around..

CYPER

hero member

Activity: 812

Merit: 502

Quote from: sickpig on December 25, 2013, 06:51:02 PM

Quote from: CYPER on December 25, 2013, 05:13:45 PM

So I have this very temperamental miner with around 6-7 sleepy dies on 3 boards with the 4th board being OK.
At stock clock I can easily awaken them by applying max voltage to all 4 dies of a single board and when I see them kick in I quickly lower them back to safe values.

But whenever I try to overclock it the dies fall asleep, and because I need to change voltage settings to awaken them, the overclock disappears :(

And that same miner used to be overclocked with all dies working, but I don't remember how I did it.

Any ideas? Is there a way to NOT remove the overclock when changing voltage settings?

So every time you click the "Apply" button on the "Advanced" tab, the clock is reset to default?

edit:

I've gone through the code and it seems that what happens when you click on Apply is the execution of these commands:

Code:

    waas -c /config/advanced.conf > /dev/null
    killall monitordcdc

In /config/advanced.conf there's a JSON representation of the data contained in the Advanced tab.

This means that the waas command resets the PLL to its default values. Unfortunately there's no source code for the waas executable.

Yes, I believe so as the Amps drop.

So what would happen if I remove that command?

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: CYPER on December 25, 2013, 05:13:45 PM

So I have this very temperamental miner with around 6-7 sleepy dies on 3 boards with the 4th board being OK.
At stock clock I can easily awaken them by applying max voltage to all 4 dies of a single board and when I see them kick in I quickly lower them back to safe values.

But whenever I try to overclock it the dies fall asleep, and because I need to change voltage settings to awaken them, the overclock disappears :(

And that same miner used to be overclocked with all dies working, but I don't remember how I did it.

Any ideas? Is there a way to NOT remove the overclock when changing voltage settings?

So every time you click the "Apply" button on the "Advanced" tab, the clock is reset to default?

edit:

I've gone through the code and it seems that what happens when you click on Apply is the execution of these commands:

Code:

    waas -c /config/advanced.conf > /dev/null
    killall monitordcdc

In /config/advanced.conf there's a JSON representation of the data contained in the Advanced tab.

This means that the waas command resets the PLL to its default values. Unfortunately there's no source code for the waas executable.

CYPER

hero member

Activity: 812

Merit: 502

So I have this very temperamental miner with around 6-7 sleepy dies on 3 boards with the 4th board being OK.
At stock clock I can easily awaken them by applying max voltage to all 4 dies of a single board and when I see them kick in I quickly lower them back to safe values.

But whenever I try to overclock it the dies fall asleep, and because I need to change voltage settings to awaken them, the overclock disappears :(

And that same miner used to be overclocked with all dies working, but I don't remember how I did it.

Any ideas? Is there a way to NOT remove the overclock when changing voltage settings?

CYPER

hero member

Activity: 812

Merit: 502

Quote from: tolip_wen on December 24, 2013, 05:45:57 PM

Quote from: sudya_dred on December 24, 2013, 03:46:12 PM

I found something interesting, and I think it's my problem. Please check on yours (October miner): ls /sys/class/gpio/. What is the last gpiochip? I have only 96, and that is the number of my enabled cores; there is no directory for the others (there should be 192).
How can I copy /sys/class/gpio/gpiochipXXX with a new name, or from another device, to /sys/class/gpio/*.*

Those are the GPIO pins of the BBB; they have nothing to do with ASICs, dies, or cores.
GPIO = General Purpose Input/Output

The miner only uses a few of them.

For anyone still wondering how to get bfgminer + overclocking: the easiest way is to install bertmod and then make a copy of cgminer.sh:

Code:

cp /etc/init.d/cgminer.sh /config/bfg1.sh

Then you need to edit the newly created file by:

Code:

vi /config/bfg1.sh

Delete everything by pressing this on your keyboard: 120dd
Make sure all text is gone.
Then press i and paste everything from this into your file: http://pastebin.com/fFCWngnq
Then press :x! and then Enter
Finally do /config/bfg1.sh restart
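The vi steps above amount to "copy the init script, then swap its whole body". A non-interactive sketch of the same idea follows; it assumes the pastebin contents have already been saved to a local file, and the helper name `make_bfg_script` and that file's location are hypothetical.

```shell
# make_bfg_script TEMPLATE NEW_BODY DEST
# Copies TEMPLATE to DEST (keeping its permissions), then overwrites the
# contents of DEST with NEW_BODY, mirroring the cp + vi steps.
make_bfg_script() {
    template=$1
    new_body=$2
    dest=$3
    # Refuse to continue if the replacement body is missing or empty.
    [ -s "$new_body" ] || { echo "empty or missing: $new_body" >&2; return 1; }
    cp -p "$template" "$dest" || return 1
    cat "$new_body" > "$dest"
}
```

With the new body saved as, say, /tmp/bfg1.body (an assumed path), `make_bfg_script /etc/init.d/cgminer.sh /tmp/bfg1.body /config/bfg1.sh` would produce the same file as the manual edit, after which `/config/bfg1.sh restart` applies as before.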

The reason BFGminer didn't want to start with the original cgminer.sh file, even after editing it, is this part. It basically checks whether the BFGminer setting is checked in the web interface and then starts the appropriate software accordingly:

Code:

MINING_SW=`ls -l /usr/bin/cgminer`
if [ "`echo $MINING_SW | grep bfgminer`" != "" ] ; then
        export LD_LIBRARY_PATH=/usr/bfgminer/
        start-stop-daemon -b -S -x screen -- -S cgminer -t cgminer -m -d "$DAEMON" --api-listen -c /config/cgminer.conf -S knc:auto
else
        start-stop-daemon -b -S -x screen -- -S cgminer -t cgminer -m -d "$DAEMON" --api-listen --default-config /config/cgminer.conf
fi
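The check in that excerpt greps the `ls -l` output of /usr/bin/cgminer for "bfgminer", which suggests the path is a symlink when BFGminer is selected. Under that assumption, the same test can be sketched with `readlink`; the function name `which_miner` is hypothetical and not part of the firmware.

```shell
# which_miner [LINK]
# Prints "bfgminer" if LINK is a symlink whose target mentions bfgminer,
# otherwise "cgminer". LINK defaults to /usr/bin/cgminer as in the excerpt.
which_miner() {
    link=${1:-/usr/bin/cgminer}
    case "$(readlink "$link" 2>/dev/null)" in
        *bfgminer*) echo bfgminer ;;
        *)          echo cgminer ;;
    esac
}
```

Running `which_miner` on the miner before editing the script shows which branch the stock cgminer.sh would take.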

What I've noticed is that in order to stabilize bad dies with cores shutting off and on, you need to increase the voltage so that the total current (Amps) is around 50. Again, according to an engineer from KNC we should not go above 50A per VRM, but Bitcoinorama said 64A max. Until this is settled, don't go too far. Remember that increasing the frequency also increases the current, so depending on your miner you might have to reduce the voltage at the Advanced tab.

If you are not sure what you are doing, better not to start.
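As a small aid for the 50A versus 64A question, here is a sketch of a limit check. It only compares numbers you type in yourself from the web interface; the function name `amps_status` is hypothetical, and the default limit of 50 is simply the more conservative of the two figures quoted.

```shell
# amps_status AMPS [LIMIT]
# Prints "ok" when AMPS is at or below LIMIT, "over" otherwise.
# LIMIT defaults to 50; pass 64 to test against the higher quoted figure.
amps_status() {
    awk -v a="$1" -v lim="${2:-50}" \
        'BEGIN { print ((a + 0 <= lim + 0) ? "ok" : "over") }'
}
```

For instance, `amps_status 54` reports "over" against the 50A figure, while `amps_status 54 64` reports "ok" against the 64A one.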

tolip_wen

sr. member

Activity: 386

Merit: 250

Quote from: sudya_dred on December 24, 2013, 03:46:12 PM

I found something interesting, and I think it's my problem. Please check on yours (October miner): ls /sys/class/gpio/. What is the last gpiochip? I have only 96, and that is the number of my enabled cores; there is no directory for the others (there should be 192).
How can I copy /sys/class/gpio/gpiochipXXX with a new name, or from another device, to /sys/class/gpio/*.*

Those are the GPIO pins of the BBB; they have nothing to do with ASICs, dies, or cores.
GPIO = General Purpose Input/Output

The miner only uses a few of them.

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: sudya_dred on December 24, 2013, 03:46:12 PM

I found something interesting, and I think it's my problem. Please check on yours (October miner): ls /sys/class/gpio/. What is the last gpiochip? I have only 96, and that is the number of my enabled cores; there is no directory for the others (there should be 192).
How can I copy /sys/class/gpio/gpiochipXXX with a new name, or from another device, to /sys/class/gpio/*.*

dunno what you mean, but this is the content of aforementioned dir on my october miner.

Code:

root@mine:~# ls -l /sys/class/gpio/
--w-------    1 root     root          4096 Jan  1  2000 export
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio49 -> ../../devices/virtual/gpio/gpio49
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio59 -> ../../devices/virtual/gpio/gpio59
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio66 -> ../../devices/virtual/gpio/gpio66
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio67 -> ../../devices/virtual/gpio/gpio67
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio69 -> ../../devices/virtual/gpio/gpio69
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio70 -> ../../devices/virtual/gpio/gpio70
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio71 -> ../../devices/virtual/gpio/gpio71
lrwxrwxrwx    1 root     root             0 Jan  1  2000 gpio76 -> ../../devices/virtual/gpio/gpio76
lrwxrwxrwx    1 root     root             0 Dec 24 20:57 gpiochip0 -> ../../devices/virtual/gpio/gpiochip0
lrwxrwxrwx    1 root     root             0 Dec 24 20:57 gpiochip32 -> ../../devices/virtual/gpio/gpiochip32
lrwxrwxrwx    1 root     root             0 Dec 24 20:57 gpiochip64 -> ../../devices/virtual/gpio/gpiochip64
lrwxrwxrwx    1 root     root             0 Dec 24 20:57 gpiochip96 -> ../../devices/virtual/gpio/gpiochip96
--w-------    1 root     root          4096 Dec 24 20:57 unexport

Anyway, I don't think you can just "create/copy" something in /sys. sysfs is the way modern Linux kernels export information about hardware to user space, so if something is missing there, it is probably because there's no such thing on the hardware side, or at least the kernel is not able to deal with it.
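To see what the gpiochip entries actually describe, the legacy sysfs GPIO interface exposes a `base` and an `ngpio` file inside each gpiochipN directory. A minimal sketch that reads them follows; the function name `list_gpiochips` is hypothetical, and the directory argument exists only so the helper can be tried against a copy.

```shell
# list_gpiochips [DIR]
# For every gpiochipN entry under DIR (default /sys/class/gpio), print its
# base number and how many GPIO lines it provides, then the total.
list_gpiochips() {
    dir=${1:-/sys/class/gpio}
    total=0
    for chip in "$dir"/gpiochip*; do
        # Skip anything that does not look like a gpiochip directory.
        [ -r "$chip/base" ] && [ -r "$chip/ngpio" ] || continue
        base=$(cat "$chip/base")
        ngpio=$(cat "$chip/ngpio")
        echo "$(basename "$chip"): base=$base ngpio=$ngpio"
        total=$((total + ngpio))
    done
    echo "total gpio lines: $total"
}
```

In the listing above the chips sit at 0, 32, 64 and 96, so the number in the name is where that controller's GPIO numbering starts. If each chip reports 32 lines, as that spacing suggests, gpiochip96 is simply the last bank of the controller board, and its name is not a count of cores.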

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: CYPER on December 24, 2013, 02:03:21 PM

Quote from: sickpig on December 24, 2013, 01:14:34 PM

Quote from: CYPER on December 24, 2013, 01:09:49 PM

Quote from: sickpig on December 24, 2013, 01:04:48 PM

Quote from: the-skeptic on December 21, 2013, 01:56:34 PM

Looks good. But please, don't help add a couple hundred THs to the network by being the hero and writing an overclocking tutorial!

wow this is bold. I have no words.

Imagine what would have happened if tolip_wen had applied the same reasoning...

We would have had to wait until January for the new firmware to come with built-in overclocking.

You can't be sure, because tolip_wen's findings could have influenced KnC's choice to release an OC-ready firmware.

The thing that really annoys me is the attitude. Without the sharing of knowledge almost all the bitcoin ecosystem would not exist at all.

What is stopping you from providing such a tutorial?

I'm not knowledgeable enough, otherwise I would have done it. As simple as that.

sudya_dred

newbie

Activity: 12

Merit: 0

I found something interesting, and I think it's my problem. Please check on yours (October miner): ls /sys/class/gpio/. What is the last gpiochip? I have only 96, and that is the number of my enabled cores; there is no directory for the others (there should be 192).
How can I copy /sys/class/gpio/gpiochipXXX with a new name, or from another device, to /sys/class/gpio/*.*

CYPER

hero member

Activity: 812

Merit: 502

Quote from: sickpig on December 24, 2013, 01:14:34 PM

Quote from: CYPER on December 24, 2013, 01:09:49 PM

Quote from: sickpig on December 24, 2013, 01:04:48 PM

Quote from: the-skeptic on December 21, 2013, 01:56:34 PM

Looks good. But please, don't help add a couple hundred THs to the network by being the hero and writing an overclocking tutorial!

wow this is bold. I have no words.

Imagine what would have happened if tolip_wen had applied the same reasoning...

We would have had to wait until January for the new firmware to come with built-in overclocking.

You can't be sure, because tolip_wen's findings could have influenced KnC's choice to release an OC-ready firmware.

The thing that really annoys me is the attitude. Without the sharing of knowledge almost all the bitcoin ecosystem would not exist at all.

What is stopping you from providing such a tutorial?

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: CYPER on December 24, 2013, 01:09:49 PM

Quote from: sickpig on December 24, 2013, 01:04:48 PM

Quote from: the-skeptic on December 21, 2013, 01:56:34 PM

Looks good. But please, don't help add a couple hundred THs to the network by being the hero and writing an overclocking tutorial!

wow this is bold. I have no words.

Imagine what would have happened if tolip_wen had applied the same reasoning...

We would have had to wait until January for the new firmware to come with built-in overclocking.

You can't be sure, because tolip_wen's findings could have influenced KnC's choice to release an OC-ready firmware.

The thing that really annoys me is the attitude. Without the sharing of knowledge almost all the bitcoin ecosystem would not exist at all.

CYPER

hero member

Activity: 812

Merit: 502

Quote from: sickpig on December 24, 2013, 01:04:48 PM

Quote from: the-skeptic on December 21, 2013, 01:56:34 PM

Looks good. But please, don't help add a couple hundred THs to the network by being the hero and writing an overclocking tutorial!

wow this is bold. I have no words.

Imagine what would have happened if tolip_wen had applied the same reasoning...

We would have had to wait until January for the new firmware to come with built-in overclocking.

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: the-skeptic on December 21, 2013, 01:56:34 PM

Looks good. But please, don't help add a couple hundred THs to the network by being the hero and writing an overclocking tutorial!

wow this is bold. I have no words.

Imagine what would have happened if tolip_wen had applied the same reasoning...

sickpig

legendary

Activity: 1260

Merit: 1008

Quote from: sudya_dred on December 24, 2013, 12:05:16 PM

DC/DC is off after running the diagnostic tool for November :( I tried enable-core, recovery and different firmware. :(

Did you try the Nov diagnostic on an October miner?

Topic: Hacking The KNC Firmware: Overclocking - page 56. (Read 144387 times)