
Topic: Bitmain's Released Antminer S9, World's First 16nm Miner Ready to Order - page 147. (Read 531168 times)

hero member
Activity: 1092
Merit: 552
Retired IRCX God
Why not run a bitcoin unlimited node?
Yes, because the answer to stopping the centralization of Bitcoin around the Core devs' beliefs is to centralize Bitcoin around the BU devs' beliefs...
legendary
Activity: 974
Merit: 1000
Thank you guys for sharing your experience. For me the problems started with the autotune models, and I can only speak from personal experience. From my point of view, if the autotune software burns chips, it doesn't do its job well.

Agreed.  There should be an underclock option.


Well, apparently we can get this option by doing the HagssFIN hack, if we dare. Has anybody tried it?



There used to be such high demand for manual control of autotune S9 models. Has anyone now tried my method with an S9?
I've done this successfully to my two R4s.

This is also S9 related, so I thought to share this here as well.
If anyone wants non-autotune settings, be brave and please report whether you get the same results I had.

Originally posted in the R4 thread https://bitcointalksearch.org/topic/m.18267781
Well, this is an interesting find and I thought I'd share it with you.

I had a problem booting my Antminer R4 8.0 Th/s, batch 2, autotune-model.

So I went looking in here:
Bitmain.com: Three Ways to Restore Factory Settings (R4/S9/T9)
https://enforum.bitmain.com/bbs/topics/3957

I used the last option in the list, IP Reporter button restore.
Quote from: Bitmain
Usage: Please power off the miner, then hold down the IP Reporter and don’t release it. At the same time, please power on the miner.
Releasing the IP Reporter after 5 seconds, the machine will automatically restore factory settings.

My miner restored with the Aug. 9, 2016 firmware and the autotune settings were gone.
I am now suddenly able to control frequency and custom fan settings.

This was not my original problem, but I'm happy with it.
My original problem was that the miner jammed somehow in the booting process and didn't even show up in my router ip list.

sr. member
Activity: 546
Merit: 253
You are free to run whatever client you wish or support a different chain. Nobody is forcing you to use Core. However, the best and most competent developers contribute code to and maintain the Core client. You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available.

If you do not support segwit, run versions prior to 0.13.1, that is if you run a node.
...
Yes, it is wholly true that if you want to run a Core node, you are free to not support segwit by running an older, less optimized version of Core that is worse for the network as a whole.
Meanwhile in the decentralized world where Bitcoin is supposed to be run by a consensus, if the majority want x in a wallet, then Core devs should put x in the wallet. In the decentralized world where Bitcoin is supposed to be run by a consensus, one shouldn't have to make the choice between a client that is, literally, slower and one that signals something that the majority obviously don't want.

"You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available." Satoshi Nakamoto himself could write a BIP that 99% of the community wants, and if the devs personally dislike the change, it's not going into Core.


Why not run a bitcoin unlimited node?
newbie
Activity: 56
Merit: 0
Indeed. It also doesn't seem to do _anything_ per-run - it's just validating that the speeds stored in the PIC allow the card to hash sensibly.

It is possible to force it to recalibrate itself, but apparently that's a support call to Bitmain where they add your specific MAC to 'the list' and it gets to run their utility, which presumably does some sort of encrypted communication to the mothership.

Interesting that the non-autotuned boards all show a static clock speed. I have one of these in my autotuning miner and was wondering the cause - it would seem that this might just have been old stock boards.
hero member
Activity: 1092
Merit: 552
Retired IRCX God
A literal example of why autofreq doesn't "burn chips" and isn't "pressing the utmost hash out of each chip"...
Chain 1 (left out because it's redundant) and Chain 2 are from Batch 4, and Chain 3 is from a "preset" batch:
Code:
read PIC voltage=940 on chain[2]
Chain:2 chipnum=63
...
Asic[ 0]:625
Asic[ 1]:625 Asic[ 2]:625 Asic[ 3]:625 Asic[ 4]:625 Asic[ 5]:625 Asic[ 6]:625 Asic[ 7]:625 Asic[ 8]:625
Asic[ 9]:625 Asic[10]:625 Asic[11]:625 Asic[12]:625 Asic[13]:625 Asic[14]:625 Asic[15]:625 Asic[16]:625
Asic[17]:625 Asic[18]:625 Asic[19]:625 Asic[20]:625 Asic[21]:625 Asic[22]:625 Asic[23]:625 Asic[24]:625
Asic[25]:625 Asic[26]:625 Asic[27]:625 Asic[28]:625 Asic[29]:625 Asic[30]:625 Asic[31]:625 Asic[32]:625
Asic[33]:625 Asic[34]:625 Asic[35]:625 Asic[36]:625 Asic[37]:625 Asic[38]:625 Asic[39]:625 Asic[40]:625
Asic[41]:625 Asic[42]:625 Asic[43]:625 Asic[44]:625 Asic[45]:625 Asic[46]:625 Asic[47]:625 Asic[48]:625
Asic[49]:625 Asic[50]:625 Asic[51]:625 Asic[52]:625 Asic[53]:625 Asic[54]:625 Asic[55]:625 Asic[56]:625
Asic[57]:625 Asic[58]:625 Asic[59]:625 Asic[60]:625 Asic[61]:625 Asic[62]:625
Chain:2 max freq=625
Chain:2 min freq=625

read PIC voltage=940 on chain[3]
Chain:3 chipnum=63
...
Asic[ 0]:568
Asic[ 1]:606 Asic[ 2]:568 Asic[ 3]:593 Asic[ 4]:606 Asic[ 5]:575 Asic[ 6]:593 Asic[ 7]:516 Asic[ 8]:612
Asic[ 9]:593 Asic[10]:556 Asic[11]:612 Asic[12]:593 Asic[13]:593 Asic[14]:533 Asic[15]:600 Asic[16]:606
Asic[17]:550 Asic[18]:600 Asic[19]:606 Asic[20]:533 Asic[21]:606 Asic[22]:606 Asic[23]:504 Asic[24]:606
Asic[25]:606 Asic[26]:500 Asic[27]:606 Asic[28]:606 Asic[29]:587 Asic[30]:606 Asic[31]:606 Asic[32]:600
Asic[33]:606 Asic[34]:606 Asic[35]:612 Asic[36]:606 Asic[37]:612 Asic[38]:612 Asic[39]:606 Asic[40]:612
Asic[41]:612 Asic[42]:587 Asic[43]:612 Asic[44]:612 Asic[45]:606 Asic[46]:612 Asic[47]:606 Asic[48]:606
Asic[49]:606 Asic[50]:612 Asic[51]:606 Asic[52]:612 Asic[53]:612 Asic[54]:606 Asic[55]:575 Asic[56]:612
Asic[57]:593 Asic[58]:612 Asic[59]:612 Asic[60]:606 Asic[61]:606 Asic[62]:612
Chain:3 max freq=612
Chain:3 min freq=500

Code:
Chain#  ASIC#  Frequency(avg)  GH/S(ideal)  Temp(Chip2)
1       63     625.00          4,488.75     87
2       63     625.00          4,488.75     84
3       63     594.57          4,270.21     73
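
For anyone who wants to check the summary table against the raw log, here is a minimal Python sketch, assuming the `Asic[ n]:freq` line format shown above. The factor of 0.114 GH/s per MHz per chip is not taken from Bitmain documentation; it is back-derived from the table itself (4,488.75 GH/s / 63 chips / 625 MHz). The function names are made up for illustration.

```python
import re

# Assumption: 0.114 GH/s per MHz per chip, back-derived from the table
# (4,488.75 GH/s / 63 chips / 625 MHz). Not an official Bitmain figure.
GHS_PER_MHZ_PER_CHIP = 0.114

# Matches entries such as "Asic[ 7]:516" in the miner log.
ASIC_RE = re.compile(r"Asic\[\s*(\d+)\]:(\d+)")


def parse_chain_freqs(log_text):
    """Return per-chip frequencies (MHz), ordered by chip index."""
    pairs = [(int(idx), int(freq)) for idx, freq in ASIC_RE.findall(log_text)]
    return [freq for _, freq in sorted(pairs)]


def summarize(freqs):
    """Summarize one chain: chip count, min/max/avg MHz, ideal GH/s."""
    return {
        "chips": len(freqs),
        "min": min(freqs),
        "max": max(freqs),
        "avg": round(sum(freqs) / len(freqs), 2),
        "ideal_ghs": round(sum(freqs) * GHS_PER_MHZ_PER_CHIP, 2),
    }


if __name__ == "__main__":
    # Short excerpt only; paste a full chain from the log to get real numbers.
    sample = "Asic[ 0]:625\nAsic[ 1]:625 Asic[ 2]:625"
    print(summarize(parse_chain_freqs(sample)))
```

Fed the 63 Chain 3 frequencies from the log above, this gives an average of 594.57 MHz and an ideal 4,270.21 GH/s, which matches the table. The spread between min and max is also what shows whether a board is "preset" (all chips equal) or individually tuned.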

donator
Activity: 4760
Merit: 4323
Thank you guys for sharing your experience. For me the problems started with the autotune models, and I can only speak from personal experience. From my point of view, if the autotune software burns chips, it doesn't do its job well.

Agreed.  There should be an underclock option.
legendary
Activity: 974
Merit: 1000
Thank you guys for sharing your experience. For me the problems started with the autotune models, and I can only speak from personal experience. From my point of view, if the autotune software burns chips, it doesn't do its job well.
hero member
Activity: 756
Merit: 560
So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.


I will be kind enough to enlighten you. You are talking out your ass with no hard numbers to back up the claims you are making. I have personally managed hundreds of these machines and can say with certainty that the autotune software did not increase failure rates. It just made deployments more annoying because in some cases you have to wait hours for it to settle on a speed and start actually hashing so you can verify full functionality.

As far as the testing the autotune function does, it is to find the weakest chip on the chain and not run the system outside of that spec. As these are a string design, you cannot target and test individual chips without powering the rest of the chain to pass its data to the controller. It does stability checks to make sure it is running at the optimal voltage for those specific chips on that specific board. It is not designed to push the absolute maximum out of the board, as you seem to erroneously think.
hero member
Activity: 1092
Merit: 552
Retired IRCX God
...I don't think that I have the exact numbers, but, anecdotally, non-autotuned S9 had at least 10-13% board failure rate, maybe even more initially...
That's right about in the margins we experienced.
legendary
Activity: 3892
Merit: 4331
...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....
That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.

I don't think that it did, really. If anything, autotune was introduced to ameliorate a high failure rate of non-autotuned S9s because, among other things, people probably tried to overclock them as they did with the S5 and S7. Naturally, I don't really blame anyone for doing this, as margins are thin. I don't think that I have the exact numbers, but, anecdotally, non-autotuned S9 had at least 10-13% board failure rate, maybe even more initially.

It would be interesting to know the autotuned numbers. Any large-scale miner can calculate this number for his/her farm, and at scale it would have a smaller standard error.
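
To put a rough number on "smaller standard error at scale", here is a small Python sketch using the usual standard error of a proportion, sqrt(p(1-p)/n). The 12% rate is only an illustrative value inside the anecdotal 10-13% range quoted above, and the board counts are hypothetical farm sizes, not data from anyone in this thread.

```python
import math


def failure_rate_standard_error(p, n):
    """Standard error of an observed failure proportion p over n boards."""
    return math.sqrt(p * (1 - p) / n)


# Illustrative only: 12% sits inside the anecdotal 10-13% range above,
# and the board counts are hypothetical farm sizes.
for boards in (30, 300, 3000):
    se = failure_rate_standard_error(0.12, boards)
    print(f"{boards:>5} boards: 12% +/- {se * 100:.1f} percentage points")
```

The error shrinks with the square root of the board count, so a ten-unit home setup cannot really distinguish a 10% failure rate from a 15% one, while a farm with thousands of boards can.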
hero member
Activity: 1092
Merit: 552
Retired IRCX God
...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....
That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.
In psychology, they call it "selective perception".
legendary
Activity: 974
Merit: 1000
...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....
That is because you, quite literally, have no clue what you're on about.

So be so kind to enlighten me with your wisdom and, while doing that, explain why the high failure rate just started with the autotune models.
hero member
Activity: 1092
Merit: 552
Retired IRCX God
...For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software....
That is because you, quite literally, have no clue what you're on about.
legendary
Activity: 974
Merit: 1000
P.S. it has nothing to do with the autotune firmware. That firmware was created specifically to verify it was not running any chips on a hashing board out of spec.

Did you watch what happens during this autotune process? Though I'm no coder, this doesn't look overly elaborate to me. It seems like a trial-and-error approach.
hero member
Activity: 756
Merit: 560
I never had one case of burned asic on prior Bitmain products (s1, s3, s5, s7)

I can show you pictures of HUNDREDS of burnt-up S7 hashing boards I have in house. The early-batch S7s had a firmware bug and they would catch on fire for a variety of reasons. I once lost 24 of them because a fan failed on a ProCurve switch and it shut down. This is NOT a new thing.


P.S. it has nothing to do with the autotune firmware. That firmware was created specifically to verify it was not running any chips on a hashing board out of spec.
legendary
Activity: 974
Merit: 1000

After speaking with a Bitmain representative in regards to the issues with their 16nm products, I would highly recommend using a hosting service. Things as little as a power outage, lost internet connection, etc., can cause asics to burn, which may not even be detectable by a quick visual inspection. Obviously, a hosting company is not immune from these service interruptions but should be set up with fail-overs. All of my s9 & r4 repairs were burned asics. I never had one case of burned asic on prior Bitmain products (s1, s3, s5, s7)

Could that be because you don't have autotune on the prior products? For me, the most cogent and simplest explanation for the burned asics is their wretched autotune software. Made to maximize profit by pressing the utmost hash out of each chip, some of them get, unsurprisingly, wasted by that.
member
Activity: 89
Merit: 10
I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?
Data-centers have large-scale surge protection and redundant power supply sources (things that are cost ineffective in a smaller environment). Boards can become damaged by brown-outs and surges that occur during standard power recovery; the protections that a data-center has nearly eliminate those issues.

Agreed. Not only do we host in our data center, we are nearing 300 Antminers of our own, with 20 more that went online yesterday. Power is strictly monitored at the distribution unit as well as at the outlet level on the PDUs in the cage. No more than 80% of a 30-amp circuit is ever drawn. Irregularities, if any, are reported via text message. In the event of any outages, battery backups immediately kick in for 20 seconds prior to a 2 megawatt generator supplying power. This has been put to the test twice since 2013 and worked flawlessly. We also ring fiber in the center and to Chicago/Omaha to ensure no connectivity loss. We have hosted an A4 for an overseas client who was concerned about excessive reboots; there has not been a single one since it came aboard in late January.
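
As a rough illustration of the 80% rule mentioned above, here is a short Python sketch of the arithmetic. Only the 30 A breaker and the 80% limit come from the post; the 208 V supply and the 1350 W per-miner draw in the usage example are placeholder assumptions, so substitute your own measured values.

```python
def continuous_limit_amps(breaker_amps, derate=0.80):
    """Maximum continuous draw when a circuit is held to `derate` of its rating."""
    return breaker_amps * derate


def miners_per_circuit(breaker_amps, volts, miner_watts, derate=0.80):
    """Whole miners that fit under the derated limit of one circuit."""
    usable_watts = continuous_limit_amps(breaker_amps, derate) * volts
    return int(usable_watts // miner_watts)


# 30 A and 80% are from the post above. 208 V and 1350 W are hypothetical
# placeholders, not figures from this thread.
print(continuous_limit_amps(30))
print(miners_per_circuit(30, volts=208, miner_watts=1350))
```

Under those placeholder values a 30 A circuit is capped at 24 A of continuous draw, which leaves room for three miners with some headroom rather than a fourth running right at the limit.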
hero member
Activity: 1092
Merit: 552
Retired IRCX God
I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?
Data-centers have large-scale surge protection and redundant power supply sources (things that are cost ineffective in a smaller environment). Boards can become damaged by brown-outs and surges that occur during standard power recovery; the protections that a data-center has nearly eliminate those issues.
full member
Activity: 236
Merit: 105
After speaking with a Bitmain representative in regards to the issues with their 16nm products, I would highly recommend using a hosting service. Things as little as a power outage, lost internet connection, etc., can cause asics to burn, which may not even be detectable by a quick visual inspection. Obviously, a hosting company is not immune from these service interruptions but should be set up with fail-overs. All of my s9 & r4 repairs were burned asics. I never had one case of burned asic on prior Bitmain products (s1, s3, s5, s7)

I'm just curious what you mean here. What safety measures would a premier hosting service have in place to help prevent failed / burnt hashboards? What kind of "failovers"?
hero member
Activity: 1092
Merit: 552
Retired IRCX God
You are free to run whatever client you wish or support a different chain. Nobody is forcing you to use Core. However, the best and most competent developers contribute code to and maintain the Core client. You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available.

If you do not support segwit, run versions prior to 0.13.1, that is if you run a node.
...
Yes, it is wholly true that if you want to run a Core node, you are free to not support segwit by running an older, less optimized version of Core that is worse for the network as a whole.
Meanwhile in the decentralized world where Bitcoin is supposed to be run by a consensus, if the majority want x in a wallet, then Core devs should put x in the wallet. In the decentralized world where Bitcoin is supposed to be run by a consensus, one shouldn't have to make the choice between a client that is, literally, slower and one that signals something that the majority obviously don't want.

"You are free to open a Git and submit proposals if you know of a beneficial feature that is not currently available." Satoshi Nakamoto himself could write a BIP that 99% of the community wants, and if the devs personally dislike the change, it's not going into Core.