Bitmain's Released Antminer S9, World's First 16nm Miner Ready to Order - page 147.

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

A literal example of why autofreq doesn't "burn chips" and isn't "pressing the utmost hash out of each chip"...
Chain 1 (left out because it's redundant) and Chain 2 are from Batch 4, and Chain 3 is from a "preset" batch:

Code:

read PIC voltage=940 on chain[2]
Chain:2 chipnum=63
...
Asic[ 0]:625
Asic[ 1]:625 Asic[ 2]:625 Asic[ 3]:625 Asic[ 4]:625 Asic[ 5]:625 Asic[ 6]:625 Asic[ 7]:625 Asic[ 8]:625
Asic[ 9]:625 Asic[10]:625 Asic[11]:625 Asic[12]:625 Asic[13]:625 Asic[14]:625 Asic[15]:625 Asic[16]:625
Asic[17]:625 Asic[18]:625 Asic[19]:625 Asic[20]:625 Asic[21]:625 Asic[22]:625 Asic[23]:625 Asic[24]:625
Asic[25]:625 Asic[26]:625 Asic[27]:625 Asic[28]:625 Asic[29]:625 Asic[30]:625 Asic[31]:625 Asic[32]:625
Asic[33]:625 Asic[34]:625 Asic[35]:625 Asic[36]:625 Asic[37]:625 Asic[38]:625 Asic[39]:625 Asic[40]:625
Asic[41]:625 Asic[42]:625 Asic[43]:625 Asic[44]:625 Asic[45]:625 Asic[46]:625 Asic[47]:625 Asic[48]:625
Asic[49]:625 Asic[50]:625 Asic[51]:625 Asic[52]:625 Asic[53]:625 Asic[54]:625 Asic[55]:625 Asic[56]:625
Asic[57]:625 Asic[58]:625 Asic[59]:625 Asic[60]:625 Asic[61]:625 Asic[62]:625
Chain:2 max freq=625
Chain:2 min freq=625

read PIC voltage=940 on chain[3]
Chain:3 chipnum=63
...
Asic[ 0]:568
Asic[ 1]:606 Asic[ 2]:568 Asic[ 3]:593 Asic[ 4]:606 Asic[ 5]:575 Asic[ 6]:593 Asic[ 7]:516 Asic[ 8]:612
Asic[ 9]:593 Asic[10]:556 Asic[11]:612 Asic[12]:593 Asic[13]:593 Asic[14]:533 Asic[15]:600 Asic[16]:606
Asic[17]:550 Asic[18]:600 Asic[19]:606 Asic[20]:533 Asic[21]:606 Asic[22]:606 Asic[23]:504 Asic[24]:606
Asic[25]:606 Asic[26]:500 Asic[27]:606 Asic[28]:606 Asic[29]:587 Asic[30]:606 Asic[31]:606 Asic[32]:600
Asic[33]:606 Asic[34]:606 Asic[35]:612 Asic[36]:606 Asic[37]:612 Asic[38]:612 Asic[39]:606 Asic[40]:612
Asic[41]:612 Asic[42]:587 Asic[43]:612 Asic[44]:612 Asic[45]:606 Asic[46]:612 Asic[47]:606 Asic[48]:606
Asic[49]:606 Asic[50]:612 Asic[51]:606 Asic[52]:612 Asic[53]:612 Asic[54]:606 Asic[55]:575 Asic[56]:612
Asic[57]:593 Asic[58]:612 Asic[59]:612 Asic[60]:606 Asic[61]:606 Asic[62]:612
Chain:3 max freq=612
Chain:3 min freq=500

Code:

Chain#  ASIC#  Frequency(avg)  GH/S(ideal)  Temp(Chip2)
1       63     625.00          4,488.75     87
2       63     625.00          4,488.75     84
3       63     594.57          4,270.21     73
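
For anyone wanting to check the GH/S(ideal) column, here is a minimal Python sketch. It assumes the ideal figure is simply the sum of the per-chip frequencies times the cores per chip. The 114 cores-per-chip constant is not stated in the post; it is inferred from the table itself (4,488.75 GH/s over 63 chips at 625 MHz), and the function names are made up for illustration.

```python
import re

# Assumption: 114 hashing cores per chip, inferred from the table
# (4,488.75 GH/s / (63 chips * 625 MHz) = 0.114 GH/s per MHz per chip).
CORES_PER_CHIP = 114

# Matches entries such as "Asic[ 0]:625" or "Asic[17]:550".
ASIC_RE = re.compile(r"Asic\[\s*(\d+)\]:(\d+)")


def parse_freqs(log_text):
    """Return per-chip frequencies (MHz), ordered by chip index."""
    pairs = [(int(idx), int(freq)) for idx, freq in ASIC_RE.findall(log_text)]
    return [freq for _, freq in sorted(pairs)]


def ideal_ghs(freqs_mhz, cores_per_chip=CORES_PER_CHIP):
    """Ideal hashrate in GH/s, assuming one hash per core per clock cycle."""
    return sum(freqs_mhz) * cores_per_chip / 1000.0


def summarize(freqs_mhz):
    """Chip count, average frequency, and ideal hashrate for one chain."""
    return {
        "chips": len(freqs_mhz),
        "avg_mhz": round(sum(freqs_mhz) / len(freqs_mhz), 2),
        "ideal_ghs": round(ideal_ghs(freqs_mhz), 2),
    }


# A few entries in the same layout as the log excerpt.
sample = """Asic[ 0]:568
Asic[ 1]:606 Asic[ 2]:568 Asic[ 3]:593"""
print(summarize(parse_freqs(sample)))

# A chain with every chip at 625 MHz, like Chain 2.
print(summarize([625] * 63))
```

Feeding it 63 chips at 625 reproduces the 4,488.75 figure for Chains 1 and 2, and the full Chain 3 dump works out to the 594.57 / 4,270.21 row.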

OgNasty

donator

Activity: 4760

Merit: 4323

Leading Crypto Sports Betting & Casino Platform

Quote from: Micky25 on March 29, 2017, 02:20:24 AM

Thank you guys for sharing your experience. For me, the problems started with the autotune models, and I can only speak from personal experience. From my point of view, if the autotune software burns chips, it doesn't do its job well.

Agreed. There should be an underclock option.

Micky25

legendary

Activity: 974

Merit: 1000

Thank you guys for sharing your experience. For me, the problems started with the autotune models, and I can only speak from personal experience. From my point of view, if the autotune software burns chips, it doesn't do its job well.

fanatic26

hero member

Activity: 756

Merit: 560

Quote from: Micky25 on March 28, 2017, 02:04:41 PM

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.

I will be kind enough to enlighten you. You are talking out your ass with no hard numbers to back up the claims you are making. I have personally managed hundreds of these machines and can say with certainty that the autotune software did not increase failure rates. It just made deployments more annoying because in some cases you have to wait hours for it to settle on a speed and start actually hashing so you can verify full functionality.

As for the testing the autotune function does, it is to find the weakest chip on the chain and not run the system outside of that spec. As these are a string design, you cannot target and test individual chips without powering the rest of the chain to pass its data to the controller. It does stability checks to make sure it is running at the optimal voltage for those specific chips on that specific board. It is not designed to push the absolute maximum out of the board, as you seem to erroneously think.

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: Biodom on March 28, 2017, 05:00:26 PM

...I don't think that I have the exact numbers, but, anecdotally, non-autotuned S9s had at least a 10-13% board failure rate, maybe even more initially...

That's right about in the margins we experienced.

Biodom

legendary

Activity: 4004

Merit: 4656

Quote from: Micky25 on March 28, 2017, 02:04:41 PM

Quote from: ComputerGenie on March 28, 2017, 01:51:58 PM

Quote from: Micky25 on March 28, 2017, 01:19:06 PM

...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....

That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.

I don't think that it did, really. If anything, autotune was introduced to ameliorate a high failure rate of non-autotuned S9s because, among other things, people probably tried to overclock them as they did with the S5 and S7. Naturally, I don't really blame anyone for doing this, as margins are thin. I don't think that I have the exact numbers, but, anecdotally, non-autotuned S9s had at least a 10-13% board failure rate, maybe even more initially.

It would be interesting to know the autotuned numbers. Any large scale miner can calculate this number for his/her farm and at scale it would have a smaller standard error.

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: Micky25 on March 28, 2017, 02:04:41 PM

Quote from: ComputerGenie on March 28, 2017, 01:51:58 PM

Quote from: Micky25 on March 28, 2017, 01:19:06 PM

...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....

That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.

In psychology, they call it "selective perception".

Micky25

legendary

Activity: 974

Merit: 1000

Quote from: ComputerGenie on March 28, 2017, 01:51:58 PM

Quote from: Micky25 on March 28, 2017, 01:19:06 PM

...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....

That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: Micky25 on March 28, 2017, 01:19:06 PM

...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....

That is because you, quite literally, have no clue what you're on about.

Micky25

legendary

Activity: 974

Merit: 1000

Quote from: fanatic26 on March 28, 2017, 01:23:57 PM

P.S. it has nothing to do with the autotune firmware. That firmware was created specifically to verify it was not running any chips on a hashing board out of spec.

Did you watch what happens during this autotune process? Though I'm no coder, this doesn't look overly elaborate to me. It seems like a trial-and-error approach.

fanatic26

hero member

Activity: 756

Merit: 560

Quote from: elokk on March 27, 2017, 09:17:05 PM

I never had one case of burned asic on prior Bitmain products (s1, s3, s5, s7)

I can show you pictures of HUNDREDS of burnt-up S7 hashing boards I have in house. The early-batch S7s had a firmware bug and they would catch on fire for a variety of reasons. I once lost 24 of them because a fan failed on a ProCurve switch and it shut down. This is NOT a new thing.

P.S. it has nothing to do with the autotune firmware. That firmware was created specifically to verify it was not running any chips on a hashing board out of spec.

Micky25

legendary

Activity: 974

Merit: 1000

Quote from: elokk on March 27, 2017, 09:17:05 PM

After speaking with a Bitmain representative regarding the issues with their 16nm products, I would highly recommend using a hosting service. Things as small as a power outage, lost internet connection, etc. can cause ASICs to burn, which may not even be detectable by a quick visual inspection. Obviously, a hosting company is not immune from these service interruptions but should be set up with fail-overs. All of my S9 & R4 repairs were burned ASICs. I never had one case of a burned ASIC on prior Bitmain products (S1, S3, S5, S7).

Could that be because you don't have autotune on the prior products? For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software. Made to maximize profit by pressing the utmost hash out of each chip, some of them get, unsurprisingly, wasted by that.

Colohub

member

Activity: 89

Merit: 10

Quote from: ComputerGenie on March 28, 2017, 11:38:13 AM

Quote from: VentMine on March 28, 2017, 11:08:54 AM

I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?

Data-centers have large-scale surge protection and redundant power supply sources (things that are cost-ineffective in a smaller environment). Boards can become damaged by brown-outs and surges that occur during standard power recovery; the protections that a data-center has nearly eliminate those issues.

Agreed. Not only do we host in our data center, we are nearing 300 Antminers of our own, with 20 more that went online yesterday. Power is strictly monitored at the distribution unit as well as at the outlet level on the PDUs in the cage. No more than 80% of a 30 A circuit is ever drawn. Irregularities, if any, are reported via text message. In the event of any outage, battery backups immediately kick in for 20 seconds before a 2-megawatt generator starts supplying power. This has been put to the test twice since 2013 and worked flawlessly. We also ring fiber in the center and to Chicago/Omaha to ensure no connectivity loss. We have hosted an A4 for an overseas client who was concerned about excessive reboots; there has not been a single one since it came aboard in late January.
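
To put the 80% rule in numbers, here is a rough Python sketch of how many miners fit on one circuit. Only the 30 A breaker and the 80% ceiling come from the post; the 208 V supply and the 1,350 W per-miner draw are placeholder assumptions, and the function names are made up for illustration.

```python
import math


def usable_amps(breaker_amps, derate=0.80):
    """Continuous current allowed on a circuit after derating."""
    return breaker_amps * derate


def max_miners_per_circuit(breaker_amps, volts, miner_watts, derate=0.80):
    """Whole miners that fit under the derated load of one circuit."""
    usable_watts = usable_amps(breaker_amps, derate) * volts
    return math.floor(usable_watts / miner_watts)


# From the post: a 30 A circuit held to 80%.
print(usable_amps(30))

# Hypothetical inputs (not from the post): 208 V supply, 1,350 W per miner.
print(max_miners_per_circuit(30, 208, 1350))
```

Swap in your own voltage and measured wall draw; the point is only that the miner count is set by the derated figure, not the breaker rating.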

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: VentMine on March 28, 2017, 11:08:54 AM

I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?

Data-centers have large-scale surge protection and redundant power supply sources (things that are cost-ineffective in a smaller environment). Boards can become damaged by brown-outs and surges that occur during standard power recovery; the protections that a data-center has nearly eliminate those issues.

VentMine

full member

Activity: 236

Merit: 105

Quote from: elokk on March 27, 2017, 06:24:14 PM

After speaking with a Bitmain representative regarding the issues with their 16nm products, I would highly recommend using a hosting service. Things as small as a power outage, lost internet connection, etc. can cause ASICs to burn, which may not even be detectable by a quick visual inspection. Obviously, a hosting company is not immune from these service interruptions but should be set up with fail-overs. All of my S9 & R4 repairs were burned ASICs. I never had one case of a burned ASIC on prior Bitmain products (S1, S3, S5, S7).

I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: elokk on March 27, 2017, 09:17:05 PM

You are free to run whatever client you wish or support a different chain. Nobody is forcing you to use Core. However, the best and most competent developers contribute to and maintain the Core client. You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available.

If you do not support segwit, run versions prior to 0.13.1, that is if you run a node.
...

Yes, it is wholly true that if you want to run a Core node, you are free to not support segwit by running an older, less optimized version of Core that is worse for the network as a whole.
Meanwhile in the decentralized world where Bitcoin is supposed to be run by a consensus, if the majority want x in a wallet, then Core devs should put x in the wallet. In the decentralized world where Bitcoin is supposed to be run by a consensus, one shouldn't have to make the choice between a client that is, literally, slower and one that signals something that the majority obviously don't want.

"You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available." Satoshi Nakamoto himself could write a BIP that 99% of the community wants, and if the devs personally dislike the change, then it's not going into Core.

elokk

hero member

Activity: 723

Merit: 519

Quote from: ComputerGenie on March 27, 2017, 07:18:03 PM

Quote from: elokk on March 27, 2017, 06:24:14 PM

Core updates code with new releases; nodes are not required to upgrade.

Q: And the version of Core that is up-to-date and allows you to not signal in favor of segwit is where?

A: Nowhere, because no such creature exists.

Q: And the version of Core that is up-to-date and allows you to signal in favor of a BIP that Core devs are not in favor of is where?

A: Nowhere, because no such creature exists.

You are free to run whatever client you wish or support a different chain. Nobody is forcing you to use Core. However, the best and most competent developers contribute to and maintain the Core client. You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available.

If you do not support segwit, run versions prior to 0.13.1, that is if you run a node.

Being that this is the Antminer s9 thread, we should get back on topic:

After speaking with a Bitmain representative regarding the issues with their 16nm products, I would highly recommend using a hosting service. Things as small as a power outage, lost internet connection, etc. can cause ASICs to burn, which may not even be detectable by a quick visual inspection. Obviously, a hosting company is not immune from these service interruptions but should be set up with fail-overs. All of my S9 & R4 repairs were burned ASICs. I never had one case of a burned ASIC on prior Bitmain products (S1, S3, S5, S7).

fanatic26

hero member

Activity: 756

Merit: 560

Quote from: HagssFIN on March 27, 2017, 08:27:48 AM

There used to be such high demand for manual control with autotune S9 models, but has anyone now tried my method with an S9?
I've done this successfully to my two R4s.

I have a couple of misbehaving S9s that I wouldn't mind trying this on. If I get time later this week to try it, I will report back on what I find. I have a couple of batches of autotunes I can try this on.

fanatic26

hero member

Activity: 756

Merit: 560

Quote from: ComputerGenie on March 27, 2017, 07:18:03 PM

Q: And the version of Core that is up-to-date and allows you to not signal in favor of segwit is where?

A: Nowhere, because no such creature exists.

Q: And the version of Core that is up-to-date and allows you to signal in favor of a BIP that Core devs are not in favor of is where?

A: Nowhere, because no such creature exists.

How dare you point out flaws in the almighty CORE OVERLORDS. Shouldn't you just read /r/bitcoin and follow all the sheep there in blindly laying blame on random people rather than objectively looking at the facts? Pretty sure that's how most of the community works these days lol

ComputerGenie

hero member

Activity: 1092

Merit: 552

Retired IRCX God

Quote from: elokk on March 27, 2017, 06:24:14 PM

Core updates code with new releases; nodes are not required to upgrade.

Q: And the version of Core that is up-to-date and allows you to not signal in favor of segwit is where?

A: Nowhere, because no such creature exists.

Q: And the version of Core that is up-to-date and allows you to signal in favor of a BIP that Core devs are not in favor of is where?

A: Nowhere, because no such creature exists.
