
Topic: Antminer S5 - Underclock - Undervolt - Best J/GH - page 3. (Read 31123 times)

legendary
Activity: 3808
Merit: 1723
Didn't have time to read this entire thread. So basically there is no way to mod the S5 by reducing the resistance like on the S1/S3 ?

Only way is to supply 10V from a power supply?

Can't you put a resistor between the PCIe connection to drop the voltage down to 10V from the PSU's 12V or is there some kind of protection built into PSUs ?


hero member
Activity: 588
Merit: 500
I'm slightly against the PLL Idea, since I believe that's part of the ASIC, is it not? Therefore it would be based on, most likely, the digital voltage which comes from the LDO, and since that's stable... I think if we can figure out how the oscillator is triggered (why it fires up and when) that might help a little bit. Does the power from the LDO's go anywhere besides the ASICs?

Yes, the PLL is part of the ASIC, and yes, the ASIC IO circuitry and the Digital / Analog parts of the PLL are powered by the LDO. Each Node / Chip pair has its own LDO that draws its input voltage from the Core Voltage of the Node 3 chips up the Chain.

So at 12V supply when each node is seeing 0.8V then the input to the LDO sees 0.8V x 4 = 3.2V. The LDO gives out 1.8V for the PLL Analog Circuitry and the IO Circuitry. There is then a potential divider that drops the voltage to 0.9V for the PLL Digital Circuitry. That is all each LDO supplies, and as I said earlier there is plenty of headroom on the input voltage to the LDO for any sensible supply voltage reduction.
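A rough sketch of the arithmetic above, for anyone who wants to see how much input the LDO has left as the supply comes down. The figure of 15 nodes in series is an assumption inferred from 12V / 0.8V per node, not something measured on the board, and the function names are made up for illustration. The LDO's actual dropout voltage is not given in the thread, so the sketch only reports the margin above the 1.8V output.

```python
# Sketch only. NODES_IN_SERIES is an assumption inferred from the
# quoted figures (12 V supply / 0.8 V per node); it is not a measured value.
NODES_IN_SERIES = 15
LDO_SPAN_NODES = 4     # LDO input spans 4 node voltages (0.8 V x 4 = 3.2 V at 12 V)
LDO_OUTPUT_V = 1.8     # rail for IO and PLL analog circuitry


def node_voltage(supply_v: float) -> float:
    """Core voltage across one node, assuming the supply divides evenly."""
    return supply_v / NODES_IN_SERIES


def ldo_input(supply_v: float) -> float:
    """Voltage seen at the LDO input for a given supply voltage."""
    return LDO_SPAN_NODES * node_voltage(supply_v)


def ldo_margin(supply_v: float) -> float:
    """LDO input minus the 1.8 V output (dropout voltage not accounted for)."""
    return ldo_input(supply_v) - LDO_OUTPUT_V


def pll_digital_voltage() -> float:
    """The potential divider halves the 1.8 V rail for the PLL digital part."""
    return LDO_OUTPUT_V / 2


if __name__ == "__main__":
    for supply in (12.0, 11.0, 10.0, 9.0):
        print(
            f"{supply:4.1f} V supply: node {node_voltage(supply):.3f} V, "
            f"LDO in {ldo_input(supply):.2f} V, margin {ldo_margin(supply):.2f} V"
        )
```

Under these assumptions the LDO input stays above 1.8V at every supply voltage discussed in this thread, which is in line with the "plenty of headroom" comment.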

The oscillator itself seems to be fine at all voltage levels and starts without problem. The problem is the Clock Output (Pin 19) from the bottom ASIC that you have monitored at the CLK01 Test Point. If you look at the 25MHz oscillator itself it is a nice sine wave, whereas the 25MHz clock signal at the test point is not that nice looking and has obviously been through more than just a buffer?


Rich

Important update to the above, and back to the oscillator being the problem. I carefully checked the oscillator again, and at the point that the Clock Output CLK01 stops, although there is still an oscillation, its amplitude drops and it's unstable. It is also at that point that the core voltage drops from 0.7V to 0.48V. However, if you increase the voltage then the oscillation does not become stable again until cgminer is restarted. I suspect a reset command is sent to the ASICs, and I am sure commands to set the PLL dividers will be sent as well.

So I think it is worth trying a separately powered oscillator.
member
Activity: 67
Merit: 10
I'm slightly against the PLL Idea, since I believe that's part of the ASIC, is it not? Therefore it would be based on, most likely, the digital voltage which comes from the LDO, and since that's stable... I think if we can figure out how the oscillator is triggered (why it fires up and when) that might help a little bit. Does the power from the LDO's go anywhere besides the ASICs?
hero member
Activity: 588
Merit: 500
I think this is really a game of finding a stable voltage for some sort of infrastructure-like device, such as the oscillator. Question is: is it just the oscillator or are there more components at play here?

I think this is one of those problems where there are multiple factors in play, always more difficult to sort. Keep at the back of your mind when working on this the key changes identified between the earlier board & the V1.91 that does allow undervolting.

1) It has multiple oscillators every 4 Nodes up the chain.

2) The 1.8V for the IO stages & PLL is in the earlier versions derived from an LDO regulator fed from the Core Voltage 3 Nodes up the Chain, whereas V1.91 takes its voltage directly from the next Node up the chain.

I have already checked that the LDO Regulators continue to supply 1.8V when the supply is reduced and they have plenty in hand for a voltage way below where we are going to.

A theory I have is that having it directly connected to the stage above means that the IO Voltage will track down with the Core voltage and the ASIC "likes this"? My other theory is that it was merely a Bitmain cost reduction unrelated to anything else?

I think the problem we have here is not the oscillator stopping but the PLL that converts to the higher operating frequency not locking?

Rich
member
Activity: 67
Merit: 10
I think this is really a game of finding a stable voltage for some sort of infrastructure-like device, such as the oscillator. Question is: is it just the oscillator or are there more components at play here?
hero member
Activity: 588
Merit: 500
As you were.....While typing this I decided to check the output of the oscillator itself and it keeps running at the lower voltage, so it's already ok.... So this means that the reduced core voltage for some reason stops the Clock Output from the first chip not the oscillator itself?

Not sure. I got distracted and ended up leaving the thing running for about 20 minutes at ~11V (300MHz) and the clock stayed going for quite a while then apparently out of nowhere, it started beeping, and I looked over to see it off. Not sure what might trigger it to go off.

If it is not hashing after a few minutes the miner will beep, the same as if it has lost Internet connection. Yes I have found if you are on the edge of working it can stop for no reason, also I have had one board stop & the other keep going.
member
Activity: 67
Merit: 10
As you were.....While typing this I decided to check the output of the oscillator itself and it keeps running at the lower voltage, so it's already ok.... So this means that the reduced core voltage for some reason stops the Clock Output from the first chip not the oscillator itself?

Not sure. I got distracted and ended up leaving the thing running for about 20 minutes at ~11V (300MHz) and the clock stayed going for quite a while then apparently out of nowhere, it started beeping, and I looked over to see it off. Not sure what might trigger it to go off.
hero member
Activity: 588
Merit: 500
The LM75A Temperature sensor runs on 3.3V which comes from the Controller Board, that is a convenient place to pick up a voltage for an external oscillator test, not sure if I have anything "in stock" or if I will have to buy something?

As you were.....While typing this I decided to check the output of the oscillator itself and it keeps running at the lower voltage, so it's already ok.... So this means that the reduced core voltage for some reason stops the Clock Output from the first chip not the oscillator itself?

Rich
member
Activity: 67
Merit: 10
Another observation is that when the oscillator is running it keeps running during a cgminer restart. However if the oscillator stops it will not restart when the voltage is increased, you have to restart cgminer to get it going again.

At the moment this is the only thing I can process a little farther. Maybe cgminer sends out an instruction of some sort, triggering the oscillator to do its thing, and if it's already running and it does a quick reset, it's so fast we're not going to see it unless we're looking closely enough and know what to look for and when. I think setting up an oscillator to run off of a dedicated supply somewhere, the control board, an individual linear regulator, something along those lines?
hero member
Activity: 588
Merit: 500
I have run a few quick tests and I think you are onto something with the oscillator not running. I have a v1.3 Board which I am repairing. I have bypassed the top two stages as there is a faulty ASIC or two there, but otherwise it is operational and I am operating it at reduced voltage so that the Core Voltages are correct.

So I have reduced the voltage so that I get to the point where it does not start hashing and like you the key is that the oscillator is not running. It typically runs briefly and then stops. I think there are other factors preventing undervolting of the earlier boards, however this is definitely one of them. I am wondering about fitting a separate oscillator powered from the 3.3V as opposed to relying on the oscillator circuitry in the first ASIC?

Another "feature" I want to explore is that I have been monitoring the Core Voltage on the first stage during power up. For some reason it is low / lower than the other stages (0.45V v 0.7V) when the oscillator is not running. There is some variance between the other stages but not as much as this? This may not be helping or may be a red herring?

Another observation is that when the oscillator is running it keeps running during a cgminer restart. However if the oscillator stops it will not restart when the voltage is increased, you have to restart cgminer to get it going again.


Rich
hero member
Activity: 588
Merit: 500
Not sure to what degree interaction will be possible, just up in the UK, it's 07:40. A few comments.

So that clock signal is on the output of the bottom ASIC in the chain (the one with the Y1 oscillator at the centre of the board) where it feeds the clock input of the next chip up. As a matter of interest the 4 pads to the left are also to monitor the signals between the pairs of chips.

On a v1.91 board of the four I have measured at 300MHz the voltage for an acceptable HW error rate varied between 10.4V & 11V. You might want to cut the frequency right back for low voltage testing to 125MHz as this gives more scope for voltage reduction with a low error rate. It will of course make no difference to oscillators stopping. At 125MHz my four 1.91 boards had acceptable errors between 9V & 9.7V.
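To put those supply figures in per-node terms, here is a small sketch that converts them to approximate core voltages. It assumes 15 nodes in series sharing the supply evenly (inferred from the 12V / 0.8V figure quoted elsewhere in this thread, not measured), and the names are hypothetical.

```python
# Sketch only. NODES_IN_SERIES is an assumption (12 V / 0.8 V per node),
# and an even split of the supply across nodes is assumed as well.
NODES_IN_SERIES = 15

# Supply voltage range (V) for an acceptable HW error rate on four
# V1.91 boards, keyed by chip frequency in MHz, as reported in the post.
REPORTED_SUPPLY_RANGE = {
    300: (10.4, 11.0),
    125: (9.0, 9.7),
}


def per_node(supply_v: float) -> float:
    """Approximate core voltage per node for a given board supply voltage."""
    return supply_v / NODES_IN_SERIES


def node_range(freq_mhz: int) -> tuple[float, float]:
    """Per-node voltage range corresponding to the reported supply range."""
    low, high = REPORTED_SUPPLY_RANGE[freq_mhz]
    return per_node(low), per_node(high)


if __name__ == "__main__":
    for freq in sorted(REPORTED_SUPPLY_RANGE, reverse=True):
        low, high = node_range(freq)
        print(f"{freq} MHz: about {low:.3f} V to {high:.3f} V per node")
```

The real split between nodes will not be perfectly even, so treat these as ballpark numbers to compare against a measurement across C17.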

So oscillator stopping is not good, can you confirm the minimum voltage that you can restart cgminer and have a clock? Also worth measuring the bottom Node Core voltage (Across the big capacitor C17)  for reference?


Rich

member
Activity: 67
Merit: 10
Okay. In the lab on some good gear, so if y'all can reply quick enough I can do some interactive testing.
Something I did notice when I lower the voltage is I lose the clock signal that's observed on CLK01 directly underneath the ribbon cable. That signal is a 25MHz (mostly) square wave at about 3V. If I reset cgminer (using the apply settings button in the interface) after lowering the voltage, I'll lose the clock. If I adjust on the fly, the clock stays going at a lower voltage. I'm running the ASICs at 300MHz right now, down to 11.0V and still have the clock and hashing at 975GH/s with a .02% HWE.

Here's the clock signal I lose if started below 12V at any frequency, but seems stable above 10 something volts.
http://wavemadison.net/mine/s5_clock.jpg
hero member
Activity: 588
Merit: 500
I'm about to clean out the cat fur from my V1.3 S5, would you like some high-resolution images of the boards disassembled for any reason?

I have pictures of most of the Rev's but some are poor so yes one good high res picture of the component side of the board would be good. At some point I am going to start a guide on Troubleshooting and Repair and that might be handy.


Rich
member
Activity: 67
Merit: 10
I'm about to clean out the cat fur from my V1.3 S5, would you like some high-resolution images of the boards disassembled for any reason?
hero member
Activity: 588
Merit: 500
yeah, it has hit the point I can afford to run mine 24/7 which averages out to 13.8c/KWh, even with this crappy supply I've got. Still want to score more efficiency if I can, especially if I can figure out WHY some of these machines are cutting out at certain voltages, but something dawned on me earlier...

LDO's are mentioned for the older boards, and they switched regulation methods. Is this a game of losses? Instead of using LDO's, which in my understanding, are still linear devices, they switched to a switch-mode type regulation which can be set dead on. One of my older ideas was that the signal wasn't going through at a high enough level to be considered "high" or "low" to whatever receiver was getting them... Also still haven't checked things out on the scope yet either for whatever reason.


On all the earlier boards there are LDO's that supply the 1.8V for the IO & PLL part of the ASIC, not the High Current Core Voltage that comes from the Series connection of the Cores.

On V1.91 they have completely done away with the LDO, which used to take its input from the stage 3 levels up the chain, and now they take the voltage directly from the core voltage of the stage above.

Not the best of pictures but here is V1.9 on the Left, V1.91 on the Right



So this is the bottom stage in the chain. On the Left you can see U6, which is the LDO. On the Right U6 is not fitted but R231 is fitted; this is a zero Ohm resistor and connects the Core Voltage from the next stage up the chain directly to the IO & Analog PLL circuitry. The Digital PLL retains the voltage divider to drop the voltage to half of the Analog voltage.

The LDO is, however, retained for the top stage of the chain, as there is no stage above it. It is fed from the 14V Buck converter on the Top Right of the PCB.

Rich
member
Activity: 67
Merit: 10
yeah, it has hit the point I can afford to run mine 24/7 which averages out to 13.8c/KWh, even with this crappy supply I've got. Still want to score more efficiency if I can, especially if I can figure out WHY some of these machines are cutting out at certain voltages, but something dawned on me earlier...

LDO's are mentioned for the older boards, and they switched regulation methods. Is this a game of losses? Instead of using LDO's, which in my understanding, are still linear devices, they switched to a switch-mode type regulation which can be set dead on. One of my older ideas was that the signal wasn't going through at a high enough level to be considered "high" or "low" to whatever receiver was getting them... Also still haven't checked things out on the scope yet either for whatever reason.
legendary
Activity: 1302
Merit: 1068
I've been asking and many don't reply, others "don't know", and the one seller who confirmed 1.91 sold out before I got my money. Keep digging I guess. Thanks. I'll update when I get something.

People are lazy. To get a reply I had to link a picture showing where to find this information every time (top right corner of each PCB, under the Bitmain logo + enter url here), then I got replies easily.

With the BTC price going up, however, I won't be undervolting these babies for a long time.
hero member
Activity: 588
Merit: 500
V1.9 boards have provision in the layout to convert to a 1.91. That means adding Oscillator modules with associated components, removing the LDO regulators, and feeding power directly from the stage above. See this post for pictures and some more info.

https://bitcointalksearch.org/topic/m.12164839

The oscillator in particular is not an easy addition unless you are experienced with surface mount components and have the equipment for them.

However, we have still not proven whether one or both of these changes are responsible for the undervolting ability of the V1.91.


Rich
member
Activity: 67
Merit: 10
What about the v1.9 boards? Found one of those and remember the mention of those somewhere earlier in the thread.
member
Activity: 67
Merit: 10
I've been asking and many don't reply, others "don't know", and the one seller who confirmed 1.91 sold out before I got my money. Keep digging I guess. Thanks. I'll update when I get something.