Topic: Bitmain's Released Antminer S9, World's First 16nm Miner Ready to Order - page 193. (Read 531173 times)

member
Activity: 135
Merit: 11
I have a batch 13 S9 and have lost internet access two times over the past few months with no issues and no sign of the bug that caused previous models to overheat.

Still running the original July 2016 firmware that it came with.
legendary
Activity: 3892
Merit: 4331
Does anybody know if the S9 still has the "no internet" bug that plagued the S7?
I am asking because at least the earlier S9s could be set to manual fan control, which is what prevented the "no internet" damage on the S7.
hero member
Activity: 770
Merit: 523
It's at the datacenter down the street.
I imagine they were probably idling.
It just shut down the hashing, i.e. the high power usage.
I could still get to the GUI to reboot, and didn't have to log in to Linux to reboot.
member
Activity: 135
Merit: 11
That is good to know. Curious to know: were the fans shut down as well, or still spinning?
hero member
Activity: 770
Merit: 523
Looks like one S9 board (11.85 TH) went up to 140C and the software shut down the miner, but the API was still alive.
Rebooted and A/OK.
Have to monitor these babies.
Oh no, maybe one or a few of the heat sinks have come off?
Looks like it is ok, must just be a fluke in this miner that occurs every month.
My point was the software shut it down, very pleased with that.
Never had a heatsink pop off (fingers crossed)
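
Since the API stayed alive after the thermal shutdown, the "have to monitor these babies" part can be scripted. Below is a minimal sketch, not a tested tool. Assumptions on my part: the miner exposes a cgminer-style JSON API on TCP port 4028, the "stats" reply carries per-board chip temperatures in fields starting with "temp2_", and 100C is just a made-up alert threshold. Check what your own firmware actually reports before relying on any of this.

```python
import json
import socket

TEMP_LIMIT_C = 100  # hypothetical alert threshold, well below the 140C reported above


def query_miner(host, command="stats", port=4028, timeout=5.0):
    """Send one JSON command to a cgminer-style API and return the parsed reply.

    Port 4028 and the command name are assumptions about the firmware.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(json.dumps({"command": command}).encode())
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    raw = b"".join(chunks).decode(errors="replace").rstrip("\x00")
    # Some firmware reportedly omits a comma between JSON objects; patch that up.
    raw = raw.replace("}{", "},{")
    return json.loads(raw)


def hot_boards(stats_reply, limit_c=TEMP_LIMIT_C, prefix="temp2_"):
    """Return {field_name: temperature} for every temperature field above limit_c."""
    hot = {}
    for section in stats_reply.get("STATS", []):
        for key, value in section.items():
            if key.startswith(prefix) and isinstance(value, (int, float)) and value > limit_c:
                hot[key] = value
    return hot
```

Usage would be something like hot_boards(query_miner("192.168.1.50")) from cron every few minutes (the IP is a placeholder), sending yourself an alert whenever the result is not empty.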
legendary
Activity: 2464
Merit: 1710
Electrical engineer. Mining since 2014.
Oh no, maybe one or a few of the heat sinks have come off?
hero member
Activity: 770
Merit: 523
Looks like one S9 board (11.85 TH) went up to 140C and the software shut down the miner, but the API was still alive.
Did it about a month ago also. Same miner.
Rebooted from the GUI and A/OK.
Have to monitor these babies.
legendary
Activity: 3822
Merit: 2703
Evil beware: We have waffles!
Running 10 S9s (along with 15 S7s) here, from batch 1 to batch 23, with only 2 board failures to date: 1 board from my batch 1 and 1 board from a batch 17. Both were sent to BitmainWarranty in CO, yes, even the one from batch 17, which was still well within the 90-day period. It is just faster, with no customs issues and cheaper shipping, including being fully insured for the full $450 value.
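
For anyone comparing numbers, here is that failure count as a per-board rate. This is a quick sketch that assumes 3 hash boards per S9 (boards 1-2-3); the function name is just mine.

```python
BOARDS_PER_S9 = 3  # assumption: three hash boards per unit (boards 1-2-3)


def board_failure_rate(miners, failed_boards, boards_per_miner=BOARDS_PER_S9):
    """Fraction of all installed hash boards that have failed."""
    total_boards = miners * boards_per_miner
    if total_boards <= 0:
        raise ValueError("need at least one board")
    return failed_boards / total_boards


rate = board_failure_rate(miners=10, failed_boards=2)
print(f"{rate:.1%} of boards failed")  # 2 of 30 boards -> 6.7%
```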

All run in a clean industrial environment, most in 2nd floor parts storage area - gets up to 90F in the summer.
PSUs are mostly Bitmain's 1600 W units, which also run the bulk of my S7s, along with a few IBM 2 kW PSUs using sidehack's breakout.

I agree with your ID layout. After all, the boards slide in/out of the exhaust end of the case so makes sense to use that as the reference end. Looking at it and moving left-to-right we have boards 1-2-3.

Controller socket numbering, OTOH... They are all over the map from batch to batch. I think my B1 has 5 sockets on it and uses every other one....

Definitely agree on ESD: ALWAYS either ground yourself before touching/handling a hashboard or, better yet, wear an ESD grounding wrist strap.

If you do repairs on the boards, or at least some troubleshooting, it would be nice to know what the failures are. So far Bitmain and BitmainWarranty just give excuses for why the repairs are not fully documented and why they can't tell us what failed. Given the low cost of the (2) repairs I've had done so far, I'm thinking it may be PIC data getting munged and just needing to be reprogrammed, or, if it is an actual physical part, something in the Vcore reg or perhaps a node-bypass cap.
hero member
Activity: 700
Merit: 501
https://bitcointalk.org/index.php?topic=905210.msg


Thanks for your feedback. I like to hear this positive news.
I hope others follow your lead and chime in as well.
full member
Activity: 219
Merit: 100
Bitcoin Mining Hosting
I thought the boards would be 1 2 3 left to right or right to left.
Not the case. Seems they can be in any order.
For me it is trial and error to find sick or dead boards.
Dead boards are easier, little red light is out.
When they are sick the red light is still on. So I pull hash cables one at a time and restart until the sick board goes away.

I have spent quite some time troubleshooting only a few dead boards myself. While I have had only a few boards go dead I will say that the few I have tried to fix had no rhyme or reason to the fixes. The order of boards to I/O is weird.

I took pictures of the one I troubleshot yesterday. I will try to get them up in a post with a step by step for others. I know you guys know how to troubleshoot down to a hash board, using the LED, etc., but it sure might make life easier for others just getting into this situation to have a 1,2,3 approach.

What do you guys think? Is it worth the time? Please give me your thoughts before I go to all the trouble of cleaning up all the pictures I took.

Something along the lines of:
A. When viewing the miner from the back we will call the hash boards from right to left board 1, 2, 3. (back meaning facing the exhaust fan)

B. When viewing the miner from the back we will call the connectors on the Bitmain I/O board from right to left J1, J2, J3, and so on. (back still means facing the exhaust fan.)

C. If you see channels 2 and 4 functioning in the miner user interface (in my case) hash board 1 was the one at fault. Hash board 1 was connected to J1 on the IO board.


I can state these things now because I took the time to disconnect power from each hash board, repower the unit, and note which channel shows activity. To cover all possible scenarios, I also connected the suspect board using the power leads and the IO-board cable from a hashboard which was functioning correctly.
So even with a different cable, a different connection point on the IO board where a known good hashboard ran, and the PCIe power connections from a known functioning hash board, I have a board which still never shows up in the GUI.
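
The unplug-one-board-at-a-time procedure above can be written down as a small bookkeeping sketch. The idea: record which chains the GUI shows with everything powered, then which chains remain with each board unpowered, and whatever disappeared belongs to that board. The chain numbers below are made up for illustration (only "board 1 out leaves 2 and 4" mirrors step C); substitute whatever your own GUI shows.

```python
def map_boards_to_chains(baseline, observations):
    """Work out which GUI chain belongs to which physical board.

    baseline:     chain ids visible with all boards powered.
    observations: {board_label: chain ids still visible with that board unpowered}.
    Returns {board_label: chain id that vanished}, or None for a board when
    zero or several chains vanished (the test was inconclusive).
    """
    mapping = {}
    for board, visible in observations.items():
        missing = set(baseline) - set(visible)
        mapping[board] = next(iter(missing)) if len(missing) == 1 else None
    return mapping


# Hypothetical readings, boards numbered right to left facing the exhaust fan.
baseline = {2, 3, 4}
observations = {
    "board 1": {2, 4},
    "board 2": {3, 4},
    "board 3": {2, 3},
}
result = map_boards_to_chains(baseline, observations)
print(result)  # {'board 1': 3, 'board 2': 2, 'board 3': 4}
```

A None in the result means the reading was ambiguous and that board needs the swap-cable, swap-socket treatment described above.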

Once I am this far I remove the board from the unit and take it to the bench for further testing, some of which must be under power so be prepared for such to move on. I use many steps from the posts I quoted in my last post in this thread above. I am not going to put those in a hash board step by step at this point as there are too many variables and I feel if someone has the ability to troubleshoot at that level we have a different discussion.

What concerns me is that (as you guys have mentioned) the hash board which fails doesn't show up (or go missing) as a consistent chain in the GUI. This may be why I was unable to simply conclude that if a hash board is connected to J1, it is always the board on the right (when facing the exhaust fan). It sounds like you guys are saying that it is indeed not always the same channel which appears, and therefore a board connected to J1 may not be the board on the right.

Please confirm I understand, and also I appreciate your and others input on if you think this is useful along with your experience on which hash board fails versus which channel was "missing" in the GUI.

I know we do not hear people talk when everything is running great, but that is another point I think many people would enjoy hearing. Regarding the last three S9 batches, or let's say any purchases within the last 3 months, has anyone had any failures?

If people could share their personal experience it would be a great help. Say, I purchased 4 miners and I had one board failure, or I purchased 14 miners and had zero failures.

For people with several or zero failures are you performing any regular maintenance, how are your miners setup, what are your ambient and operating temperatures, what type of power supply are you using and any other details you would be willing to share.

I look forward to your replies.

Thanks  

Edit 1:
PS
I deal with ESD scenarios quite a bit in my day job. The equipment I work with creates a great deal of high frequency noise in normal use and we must use components which are isolated against such and determine ways to further isolate the equipment - particularly PC components. For example we use optically isolated serial ports for RS422 communications and fiber for our network connection, etc.
I understand the debate regarding some people who say "I have been working on electronics for 20 years and have never seen or damaged a component due to noise or a static discharge." I am not interested in that debate. What I am interested in is do people take any precautions while troubleshooting to alleviate a potential issue due to a static discharge? If so, what are those precautions?

Thanks Again

Edit 2:
I have had many hashboards that required repair, which has obviously cost me a great deal. Two of the units I purchased were used, and 4 of the failed hashboards came from those two units. One is a 550 and the other a 600, but both are from the same time frame. I am hoping to hear positive words from people who have made more recent purchases.


The S9 has progressively gotten better batch by batch. We usually get at least one large set per batch, and the last couple of batches have been nearly flawless (a couple of machines needed one random reboot for a false dead board report). Compare that to the earlier hardware, and it's way better. CO repair costs are fair, and it's usually a 5-8 business day turnaround max.
sr. member
Activity: 338
Merit: 251

I have spent quite some time troubleshooting only a few dead boards myself. While I have had only a few boards go dead I will say that the few I have tried to fix had no rhyme or reason to the fixes. The order of boards to I/O is weird.
hero member
Activity: 770
Merit: 523
I thought the boards would be 1 2 3 left to right or right to left.
Not the case. Seems they can be in any order.
For me it is trial and error to find sick or dead boards.
Dead boards are easier, little red light is out.
When they are sick the red light is still on. So I pull hash cables one at a time and restart until the sick board goes away. Bingo.
hero member
Activity: 700
Merit: 501
https://bitcointalk.org/index.php?topic=905210.msg

Biodom, I agree it would be nice to see that information.
I would like to hear from anyone who purchased from the last couple of batches.
Have the QC issues behind the hash board failures been corrected?

I have very few S9s left, and with its power usage it can be a good machine. My personal experience has not been good regarding failures.
BitmainWarranty has repaired several, and I managed to spend the time on one myself.

The BTC hardware landscape has been difficult most of the time. At least it isn't pre-order ripoffs.
It is fairly difficult to grab a unit from one of these unannounced S9 sales. Not that I would be first in line to buy one, but they sold out fast, so are they going "all over" in any significant amount?
 - I do not think Bitmain can build mining equipment fast enough, between their own use, the huge contracts they have with other large-scale miners, deals with other equipment manufacturers, and the cash. Man, it must take quite a few people just to count all the money rolling in. It is said they can control bitcoin in almost every way, especially the price. If they need to limit the amount of equipment in use, especially equipment "competing" against them, why not deal only in large-money deals for a while? I bet they seriously cannot build them fast enough... if they want to do so. Magic money boxes indeed. (With boards that fail, arggg.)
I do want to see an S9+, but not an increase in hashpower. I want to see an S9 perfected, running at the originally quoted hashrate without needing to be screwed with.

Hashpower growing while the form factor stays the same or shrinks is a great thing, but when a significant portion of the hash rate sits in one hash board, we need the ability to stock spares. So I would like to hear positive news regarding the failure rates. Have they diminished?

On another note how difficult can it be to hide BTC on the blockchain, or get it off? Very fucking difficult to do something with, if someone is determined.

I am still running several underclocked S7s, thanks to:

sidehack and co:
https://bitcointalksearch.org/topic/hacking-the-s7-improving-efficiency-through-minor-hardware-manipulation-1504228

RadekG and co:
https://bitcointalksearch.org/topic/cheap-and-simple-repair-of-s7-hash-board-1420909

These guys and the people posting feedback, contributing to the threads help us.
Throw them tips if you can, I know sidehack is always trying to do something to help miners be better miners.

By the way, which chain is which board when looking at the S9?
I have searched through this thread and Google-fu'd it, but in my case the FU may mean something else. I'd like to see a picture showing which port on the controller board controls which hash board. I have to go take one apart now, so maybe I'll remember this time.

Edit 1:
I found this description:
The HW errors don't look like a problem but that reported hashrate sure does. You might try swapping where the hash board is connected on the controller board. My B3 has 4 connectors on the control board, so there is an extra, and I have seen another post where someone else said theirs did too. Looking down at the unit from the top with the intake fan pointing towards you (what I would call the front), I believe that the bottom chain in the GUI should be the right-hand board.

Thanks all




 
legendary
Activity: 3892
Merit: 4331
So, Bitmain did not release any info on 13.5 Th machines.
This is odd. Perhaps, it is an average of 12.93 and 14? The number is close enough (13.46) that I start to think that it could be the case.
Oddball machines that were rejected, then brought back. Or machines with mixed autotuning boards.
If it is just a population average, then this would be unfair from the perspective of those who will get essentially a 12.93 Th machine.
We shall see. Radio silence from BMT on the issue so far.
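
To put numbers on the averaging guess above, here is a quick check. The plain average of the two ratings is straight arithmetic. The mixed-board part is purely my assumption for illustration: it pretends each of the 3 boards contributes exactly one third of its machine's rating, which real autotuned boards will not do exactly.

```python
from statistics import mean

RATED_TH = (12.93, 14.0)  # the two ratings mentioned above

average = mean(RATED_TH)
print(f"plain average: {average:.3f} TH")  # 13.465


def mixed_machine(fast_boards, boards=3, slow=12.93, fast=14.0):
    """Rating of a machine mixing boards from the two bins.

    Assumes each board contributes exactly 1/boards of its machine's rating.
    """
    if not 0 <= fast_boards <= boards:
        raise ValueError("fast_boards must be between 0 and boards")
    return (fast_boards * fast + (boards - fast_boards) * slow) / boards


for n in range(4):
    print(f"{n} fast board(s): {mixed_machine(n):.2f} TH")
```

Under that assumption no single mix of three boards lands on 13.5, while the plain average of the two ratings comes out at about 13.465, which fits the population-average reading better than the mixed-board one.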
legendary
Activity: 1050
Merit: 1000
Time for BITMAIN to release a 30 TH 4U rackmount unit!
A bit better build quality, please..

SP50 dreams live...via Bitmain? Possible, but shipping is a problem.

3 s9s running off a 4k DPS-2500 is really close to that dream.

To all those who fear there will be no new gear if Bitmain cuts off sales: buy from Avalon or Bitfury, as they will have some for sale.

I think what we won't see for two years is a 0.05 watt miner
With some crafty engineering you could do this,
It would be nice to have 9 blades on one controller, in an enclosed 4u case, more fans, power supplies could be in a 2u rack etc.

I think the cooling, ease of use, etc. would be pretty nice.
full member
Activity: 294
Merit: 100
I think these were indeed surplus units, or the few miners from the last batch that couldn't quite reach the promised 14 TH/s. Either way they are gone for good now.

I am a bit worried about Bitmain not stocking new units, between the way the last batch went down and the fact that they have gone bitcoin only. Could this change indicate that they are shutting down sales of the S9 for a while, or forever? Could that last batch of R4 sales be just to get rid of surplus frames and stock from the first R4 batch a few months ago?

I wonder if Bitmain will suddenly shut down miner sales for 6 months or more, like they did at the end of the S5. They gave no notice of when they would stop sales or for how long, they just did.

Maybe yes, maybe not.. I think they will restock next month after Xmas / New Year 2017, or maybe they are going to release an S10/S11. The BTC difficulty is jumping like crazy....
id5
newbie
Activity: 25
Merit: 0
Looks like they are confiscating gold from India's people. Jewelry also.
And to balance that with the Official stance:
http://timesofindia.indiatimes.com/india/No-tax-on-jewellery/gold-purchased-out-of-disclosed-income-Finance-Ministry/articleshow/55724734.cms
It is Gold from undisclosed income they are after.

Most Indians don't trust banks because they have had so many failures over the last century, so they store wealth in wearable gold instead. The I-T Act allows the seizure of gold, including jewellery, if you have more than $19K per married lady, $8.5K per unmarried lady and $3.8K per male. You then have to prove where you got it from to get it back. Good luck with that in a country where the average worker is paid in cash with no receipt.