Topic: New used batch of S9 13.5T and PSU - not working - how to test PSUs? (Read 235 times)

member
Activity: 124
Merit: 47
Bitmaxz, I will do this when I get home then. Thanks for looking at this. The internet connection is pretty good. It's usually around 6Mb down 300+ Kb up - but sometimes it's a little laggy. It's actually cell based internet service through AT&T. No DSL or other option here for us. However, I have 12 other miners besides these 3 (8 S9 variants and 4 L3 variants) and they are all connecting and mining fine (And have been for months).

The 015 unit has only shown 2 boards a couple of times - but the second board would drop quickly. It has never shown 3 boards yet.

The PICKit3 is interesting - I looked at the links you gave on that. I did not understand what these PICs were - so now I have a better understanding. I may go ahead and get one of those PICKit3 units - but won't use it on these 3 miners. If I don't find a firmware or other fix I will send them back. The seller does not want me to open them if I'm going to return them and they still have their warranty stickers on them. So no opening them up.

I bet that a PICKit3 setup, along with copying the PIC file/data from a working hash board to one with a missing or corrupted PIC, could solve a lot of problems on here.



@Bitmaxz, I tried to follow your directions - but quickly ran into a problem on all 3 of these units - I could not flash the s9_fix_upgrade.tar.gz as I kept getting the "413 - Request Entity Too Large" error! At one point I did notice something odd too. I am using Windows 10 Pro - but when I would download that file from Bitmain, it would tell me the name of the file being downloaded was s9_fix_upgrade.tar.tar - but then when I viewed the file in the folder I saved it in, it would be just s9_fix_upgrade.tar (no .gz, or extra .tar). I have my system set to show all file extensions too - that was not the issue. Anyway, I tried it the way it was saved, and I also renamed it to add the .gz extension - but no go either way. ONE time 013 took it with the .gz extension added and didn't give the 413 error - but then it wouldn't take the Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz flash. It was back to the same 413 error. I even tried older fixed freq firmwares, S9i and S9j firmwares - I just kept getting the 413 errors (after re-boots and power cycles too). I hunted around for a workaround to that and found that @Tim-bc had said that about the only way to force the miners to take the firmware when this error was coming up was to use a MicroSD card and to change jumper positions on the controller to write the firmware from the SD card. Long story short - I got more very weird status from these miners - 015 even showing that one chain had 150 ASIC chips on it at one point - and I just decided to cut my losses on this and am returning them to the seller.
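
For anyone hitting the same .tar / .tar.tar / .tar.gz naming confusion, one way to see what a downloaded file really is, regardless of its name, is to look at its contents. The sketch below is only an illustration: `describe_archive` is a made-up helper name, and nothing here is confirmed to be related to the 413 error (which is an HTTP status meaning the server refused the upload as too large). It relies on the fact that gzip data always starts with the bytes 0x1f 0x8b, and lets Python's `tarfile` module decide whether the remaining data is a tar archive.

```python
import gzip
import io
import tarfile
import zlib


def _looks_like_tar(data: bytes) -> bool:
    """Return True if the bytes can be opened as an uncompressed tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tf:
            tf.getmembers()  # force a read of the member headers
        return True
    except (tarfile.TarError, EOFError):
        return False


def describe_archive(data: bytes) -> str:
    """Classify raw file bytes as 'tar.gz', 'tar', or 'unknown' (hypothetical helper)."""
    # gzip streams always begin with the two magic bytes 0x1f 0x8b.
    if data[:2] == b"\x1f\x8b":
        try:
            inner = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            return "unknown"
        return "tar.gz" if _looks_like_tar(inner) else "unknown"
    return "tar" if _looks_like_tar(data) else "unknown"
```

Usage would be something like `describe_archive(open(path, "rb").read())`, with `path` pointing at the saved download. If it reports "tar" for a file that was published as .tar.gz, the download was probably decompressed somewhere along the way, and renaming it to .gz would not turn it back into a gzip file.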

The jumper was accessible by pulling the controller toward the hash boards - without having to open the unit up - and then using curved forceps to reach and move the jumper - but I didn't find what file(s) needed to be put on the SD card - the .gz file, or the extracted contents, or something else? I was becoming leery of having to use this approach in case anything went wrong - I didn't want to give the seller reason to not take the units back AND - it was after 1:30AM this morning at this point - so that's that. I'm done with these.

I appreciate your time reviewing the kernel logs. I did record a few more kernel logs during all this as well - including the one with the 150 ASICs on one hash board - if you would be interested in looking at that for the fun of it. If so, let me know and I'll post that on the pastebin.com site as well. Thanks!
legendary
Activity: 3374
Merit: 3095
I will try the flashing process you listed when I get off work.

I have pasted their kernel logs (I refer to these 3 as 13, 14, and 15) to the pastebin.com site as you asked. The links to the pastes are Miner 013, Miner 014, and Miner 015.

Miner 013: https://pastebin.com/wqUFDsKB
Miner 014: https://pastebin.com/fMsav2Zx
Miner 015: https://pastebin.com/ThwBPsXu

I appreciate reviews of these. I can find some things in these myself - but much of it is still unknown to me on how to interpret them.

Do you have a stable internet connection? I noticed that miner 013 and miner 014 have the same network error, "Fatal Error: network connection lost!", and miner 015 only detects 1 hashboard.
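
To check how often that message shows up in a saved kernel log, a short script can count it. This is only a sketch: it assumes the log was saved as plain text and that the message appears verbatim as quoted above. The function name is made up for illustration and the log layout is not taken from the actual pastes.

```python
NETWORK_ERROR = "Fatal Error: network connection lost!"  # message quoted in this thread


def count_network_errors(log_text: str) -> int:
    """Count the lines of a kernel log that contain the network error message."""
    return sum(1 for line in log_text.splitlines() if NETWORK_ERROR in line)


def miners_with_network_errors(logs: dict) -> list:
    """Given {miner_name: log_text}, return the sorted names whose log has the error."""
    return sorted(name for name, text in logs.items() if count_network_errors(text) > 0)
```

Running this on the logs of the working miners as well would show whether the error is specific to these three units or common to everything on the same connection.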

You can try the method below; it might solve the issue. If not, the last solution that I know of is to flash the hashing board with a PICKit 3.
There is a guide for that. The method is written for the T9 miner, but you can apply it to the S9. Check the thread here: https://bitcointalksearch.org/topic/repair-a-t9-after-a-bad-firmware-lost-chain-5032987

How long should I leave them unplugged before re-plugging? Is this to make sure they have all power dissipated - like that being held in capacitors or in the PSU? Is that part of the process critical - unplugging the power cable and data cables? If not, I can do the flashing remotely - but if this really needs to be done in this process I will have to do it after work.

Replug them after flashing the miner's control board. Make sure only the control board has power, so that you can go to the miner's dashboard and flash it with the firmware above, "s9_fix_upgrade.tar.gz", then flash it again with the autofreq firmware.

After the flashing is done, you can replug all the hashing boards and test whether it works.

Please read this thread on how he fixes a miner that has a missing hashboard: https://bitcointalksearch.org/topic/a-quick-fix-for-antminer-s9-missingnot-showing-hash-board-no-skills-needed-d-5034849
member
Activity: 124
Merit: 47
Can you try to flash this firmware, s9_fix_upgrade.tar.gz? It is used to fix the file system. Then remove the hashboard cables from the hashboards, including the power cables, and flash it again with this firmware: Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz
This firmware has fixed this kind of issue in most cases.

Honestly, it's hard to troubleshoot the miner without kernel logs. If you can provide the kernel logs, we can find out whether it's a hardware-related or software-related issue.

You can find the kernel logs in the System tab menu. Paste them to http://pastebin.com/ and put the link here; we will try to find what the issue with the miner is and give some basic solutions to fix the problem.

I will try the flashing process you listed when I get off work.

I have pasted their kernel logs (I refer to these 3 as 13, 14, and 15) to the pastebin.com site as you asked. The links to the pastes are Miner 013, Miner 014, and Miner 015.

Miner 013: https://pastebin.com/wqUFDsKB
Miner 014: https://pastebin.com/fMsav2Zx
Miner 015: https://pastebin.com/ThwBPsXu

I appreciate reviews of these. I can find some things in these myself - but much of it is still unknown to me on how to interpret them.

How long should I leave them unplugged before re-plugging? Is this to make sure they have all power dissipated - like that being held in capacitors or in the PSU? Is that part of the process critical - unplugging the power cable and data cables? If not, I can do the flashing remotely - but if this really needs to be done in this process I will have to do it after work.
legendary
Activity: 3374
Merit: 3095
Can you try to flash this firmware, s9_fix_upgrade.tar.gz? It is used to fix the file system. Then remove the hashboard cables from the hashboards, including the power cables, and flash it again with this firmware: Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz
This firmware has fixed this kind of issue in most cases.

Honestly, it's hard to troubleshoot the miner without kernel logs. If you can provide the kernel logs, we can find out whether it's a hardware-related or software-related issue.

You can find the kernel logs in the System tab menu. Paste them to http://pastebin.com/ and put the link here; we will try to find what the issue with the miner is and give some basic solutions to fix the problem.
member
Activity: 124
Merit: 47
Yeah, I will probably do that. And yes, 3 PSUs (APW3++ units) and 3 S9-13.5T's. All are connected to 240V mains (last volt check was 242V).

I was hoping that maybe warming them up a bit would be the answer - the other tests get so time consuming. :)

I have these running in the same space as 12 other miners - but they were all running before the cool temps started in.



Looking at the kernel logs - one of them was having a fan error. I checked the fan and a piece of the protective plastic "grill" covering the fan blades had broken off and was lodged in the miner and the fan blade. It never made any noise or rattles - but had the fan stuck! After getting this out, it started hashing - but only with 2 boards. One of the boards had 11K+ errors after 3 minutes - I rebooted and now only one board shows up, with a temp of 15 (should be high 60s to mid 70s). The others are showing good numbers on the boards that are hashing - HW, temps, hash rates - but one only has 1 board hashing and the other has 2 boards hashing. :( Will do the PSU and cable swapping this evening to track patterns and issues.



These machines are making me crazy! I tried known good PSUs - no difference. I tried raising the ambient temp for them - and one time they came up good - well, two of them did - one showed two hash boards, but then later dropped one again (like within half an hour). So I figured - maybe they just didn't want to start up cold - it was about 9 or 10 degrees C. And warming them up got a better result when starting them. But after running for nearly 3 hours, they should have been nice and warmed up, so I added them to AwesomeMiner and did a restart through AM to make sure it was able to control them properly - and they started doing the same thing - missing hash boards, and one hash board in each running like 10 C higher than the others. I then tried raising the intake air temp again, but with sporadic results. At this point, after many restarts and trying different intake temps, pretty much one has 3 hash boards showing - but one is always at 0TH/s, for a total hash speed of about 9TH/s; the other is showing 3 hash boards, but one of them is only at like 1.7TH/s instead of the 4.5TH/s it should be at - for a total of about 10.7TH/s; and the last one has one hash board running at 4.5TH/s. So, I think these are being returned. I cannot see how or why they should be so finicky and unstable unless they are just bad. My other 12 miners re-start and come back up fine in these temps, and having to raise the temp to the point that their fans start running faster is not acceptable - not that this is even helping at this point.
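
The totals above follow from simple arithmetic: a 13.5TH/s miner with 3 hash boards should run about 4.5TH/s per board. A small sketch of that check is below. The per-board figures are the ones reported in this post; the function name and the 90% "weak board" threshold are arbitrary choices made for illustration, not anything from Bitmain.

```python
RATED_TOTAL_THS = 13.5          # rated total for an S9 13.5T
BOARDS_PER_MINER = 3
EXPECTED_PER_BOARD = RATED_TOTAL_THS / BOARDS_PER_MINER  # 4.5 TH/s per board
WEAK_FRACTION = 0.9             # assumption: below 90% of expected counts as weak


def summarize(board_rates_ths):
    """Summarize one miner from the TH/s of each hash board that shows up."""
    total = round(sum(board_rates_ths), 2)
    weak = [r for r in board_rates_ths if r < WEAK_FRACTION * EXPECTED_PER_BOARD]
    return {
        "total_ths": total,
        "pct_of_rated": round(100 * total / RATED_TOTAL_THS, 1),
        "missing_boards": BOARDS_PER_MINER - len(board_rates_ths),
        "weak_boards": len(weak),
    }


# The three miners as described in this post.
reports = {
    "one board at 0": summarize([4.5, 4.5, 0.0]),
    "one board at 1.7": summarize([4.5, 4.5, 1.7]),
    "single board": summarize([4.5]),
}
```

None of the three comes close to the rated 13.5TH/s, and each has a different failure pattern (dead board, weak board, missing boards).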
legendary
Activity: 1554
Merit: 2037
Just want to get everything straight here.

Do you have 3 PSU's and 3 Miners?

The temperature you state of 9-10 Celsius is fine. I've never had an issue at those temps, closer to 0 I've had to warm them up a little. I'd say the output voltage from the PSU's seems fine as well. We still need to know what the PSU Specs are to be sure you have the proper input voltage and that it has the proper capacity.

One thing you could do, assuming you're providing enough power, is swap around the combinations of PSUs and miners to see if the problem follows the PSU. You can also take note of which chains/board(s) are not hashing - swap the connectors around and again see if the problem follows the connectors or stays on the same chains/board(s).
member
Activity: 124
Merit: 47
I tested the PSUs, which looked to be brand new - sealed with original tape. The miners are used. One has 1 hash board hashing, one has 2 hash boards hashing, and one has no hash boards hashing. I tested the PSU output with a voltmeter before connecting them, and they all read 12.26 to 12.27VDC. The temperature in the room is 49-50 degrees F (about 9 - 10 degrees C). Is this too cool to get them to start up? Is there another, better way to test the PSUs' output voltage - like, the voltage may be ok, but are they putting out enough amps?

Basically I'm trying to rule out temp and PSUs right now - if it is other issues - I will probably just return them. Oh, and I already flashed them with the latest firmware prior to the LPM firmware. No change.
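
On the amps question: a voltmeter reading taken with nothing connected only shows the no-load voltage, so it cannot show whether the supply holds that voltage while delivering current. The current a load needs follows from power = volts x amps. The sketch below only does that arithmetic; the function name is made up, and the wattage in the usage note is a placeholder, not a measured or published figure for these miners.

```python
def required_amps(load_watts: float, rail_volts: float) -> float:
    """Current (A) a DC load draws at a given rail voltage, from P = V * I."""
    if rail_volts <= 0:
        raise ValueError("rail_volts must be positive")
    if load_watts < 0:
        raise ValueError("load_watts must not be negative")
    return load_watts / rail_volts


def sags_under_load(no_load_volts: float, loaded_volts: float, max_drop: float = 0.5) -> bool:
    """True if the rail drops by more than max_drop volts once the miner is running.

    max_drop is an arbitrary illustrative threshold, not a manufacturer spec.
    """
    return (no_load_volts - loaded_volts) > max_drop
```

For example, `required_amps(1226.0, 12.26)` works out to about 100 A for a hypothetical 1226 W load on the 12.26V measured here. Measuring the rail again while the miner is hashing and comparing it to the no-load reading says more about the PSU than the unloaded reading alone.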

My other miners are working fine, but were also already running before the temps dropped.

Thanks!