Topic: S17 Pro Issues (Read 96 times)

legendary
Activity: 1988
Merit: 1561
CLEAN non GPL infringing code made in Rust lang
February 04, 2022, 08:43:30 AM
#7
Remember that the Target Temperature is also the pre-heat value (temperature control MUST be set to Auto).

If you are starting a miner from cold in winter, it may help to pre-heat it yourself with a hair dryer or the output from another miner first.
legendary
Activity: 3234
Merit: 2943
Block halving is coming.
January 30, 2022, 07:11:03 PM
#6
Would you mind switching it back to stock firmware? The logs above are from BraiinsOS; if you can switch back to the original firmware and run it again, maybe the temp issue will show up.
Then post the kernel logs here, because I'm a bit confused by the logs from BraiinsOS compared to the original firmware.

Also, I think BraiinsOS is still detecting those temps; you can only disable that if you configure your miner to disable temperature sensor scanning (--no-sensor-scan).
I don't know how to apply it on the S17, but try posting in the thread below and ask how to disable the temp sensor scanning.

- https://bitcointalksearch.org/topic/braiins-os-braiins-os-custom-asic-firmware-optimize-performance-efficiency-5036844

But if you are looking for someone who can repair this issue, check https://www.zeusbtc.com/Repair.asp and ask them if they have a repair shop near your area.
newbie
Activity: 4
Merit: 0
January 30, 2022, 03:51:42 PM
#5
I believe the problem is from flaky solder connections, these miners are plagued with them. I have found the problems tend to show up more often when cold, then when you warm them up thermal expansion can close the flaky connection enough for the miner to run. You may be able to limp along for a while, but in my experience, it is just a matter of time before it starts failing solidly and won't come up even when warm.

The same issues can cause the temperature sensor issues. Sometimes all the downstream sensors from the place where the issue is start having communication errors.

That would fit with my symptoms. I think I may have reached that point of no return today. Couldn't get it to stay hashing on all 3 boards, even with the ambient temp up in the 50F range, which is a new low for this machine.

I disabled the 2 hashboards that were reporting sensor issues and so far it's been running fine on the remaining board. I won't know if it's significantly more stable until it has a chance to run overnight, but I've accepted I will need repairs.

I see you're in the market for 17-series components, I don't suppose you offer repair services?
hero member
Activity: 544
Merit: 589
January 30, 2022, 03:38:31 PM
#4
I believe the problem is from flaky solder connections, these miners are plagued with them. I have found the problems tend to show up more often when cold, then when you warm them up thermal expansion can close the flaky connection enough for the miner to run. You may be able to limp along for a while, but in my experience, it is just a matter of time before it starts failing solidly and won't come up even when warm.

The same issues can cause the temperature sensor issues. Sometimes all the downstream sensors from the place where the issue is start having communication errors.
newbie
Activity: 27
Merit: 0
January 30, 2022, 01:53:35 PM
#3
Had a similar issue as well before my board went down. Before I pulled the board and sent it out for repair, when I saw this error I shut down the machine for 10 minutes and moved my intake to a mix of warm room air and cold outside air. That brought the board back for another week or so. Also try the Vnish firmware; it might bring it back totally.
newbie
Activity: 4
Merit: 0
January 30, 2022, 01:05:02 PM
#2
Update:

I turned it on long enough to get a log and this time it's reporting temp sensor errors from two boards:

Sun Jan 30 09:23:28 2022 daemon.err bosminer[1632]: Jan 30 16:23:28.484 ERROR bosminer_hal::sensor: Sensor hb3.11[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 11 )
Sun Jan 30 09:23:29 2022 daemon.err bosminer[1632]: Jan 30 16:23:29.918 ERROR bosminer_hal::sensor: Sensor hb2.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )
Sun Jan 30 09:23:30 2022 daemon.err bosminer[1632]: Jan 30 16:23:30.020 ERROR bosminer_hal::sensor: Sensor hb2.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )
Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.696 ERROR bosminer_hal::sensor: Sensor hb3.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )
Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.797 ERROR bosminer_hal::sensor: Sensor hb3.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )

Assuming I'm correct in interpreting the sensor IDs, looks like there are 2x sensors on hashboard 2, and 3x on hashboard 3 that aren't communicating.

I'm also assuming there's nothing I can do about this myself if I don't trust my soldering skills in this context?
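
For anyone who wants to repeat the per-board count on a longer log, here is a minimal Python sketch. It assumes, as the post does, that a tag like hb3.11 means hashboard 3, chip position 11; that reading is the poster's interpretation, not something confirmed by Braiins documentation. The function name failing_sensors is made up for this example, and the sample text is the five log lines quoted in the post.

```python
import re
from collections import defaultdict

# Matches the sensor tag as it appears in the quoted log, e.g. "Sensor hb3.11[".
# Assumption: the first number is the hashboard, the second the chip position.
SENSOR_RE = re.compile(r"Sensor hb(\d+)\.(\d+)\[")


def failing_sensors(log_text):
    """Map hashboard number -> sorted chip positions that logged 'read failed'."""
    boards = defaultdict(set)
    for line in log_text.splitlines():
        if "read failed" not in line:
            continue
        match = SENSOR_RE.search(line)
        if match:
            boards[int(match.group(1))].add(int(match.group(2)))
    return {board: sorted(chips) for board, chips in sorted(boards.items())}


# The five lines quoted in the post, unchanged.
sample = """\
Sun Jan 30 09:23:28 2022 daemon.err bosminer[1632]: Jan 30 16:23:28.484 ERROR bosminer_hal::sensor: Sensor hb3.11[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 11 )
Sun Jan 30 09:23:29 2022 daemon.err bosminer[1632]: Jan 30 16:23:29.918 ERROR bosminer_hal::sensor: Sensor hb2.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )
Sun Jan 30 09:23:30 2022 daemon.err bosminer[1632]: Jan 30 16:23:30.020 ERROR bosminer_hal::sensor: Sensor hb2.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )
Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.696 ERROR bosminer_hal::sensor: Sensor hb3.8[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 8 )
Sun Jan 30 09:25:03 2022 daemon.err bosminer[1632]: Jan 30 16:25:03.797 ERROR bosminer_hal::sensor: Sensor hb3.36[ii_hwmon::tmp451::TMP451]: read failed: I2C error: general error Hashchip: no response for read_register(reg=0x1c) from chip One( 36 )
"""

print(failing_sensors(sample))  # {2: [8, 36], 3: [8, 11, 36]}
```

On the quoted lines this gives two sensors on hashboard 2 and three on hashboard 3, which matches the count in the post.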
newbie
Activity: 4
Merit: 0
January 30, 2022, 12:20:38 PM
#1
Hello,

First post here but I've been lurking for a while.

I have one uppity S17 Pro that seems to be highly sensitive to low temperatures, much more so than the others. The main symptom is when incoming air gets cold (like 35F or below), it will essentially go into a reboot loop. It will boot up and start hashing for ~30-45 seconds, then it appears to lose all 3 boards and restart itself. Sometimes I can coax it back online but it's been gradually getting worse over the last couple months. Then, just a few days ago, I started seeing intermittent, slightly erratic chip temp readings from one board. This device's stability is now so low I'm at the point where I want to send it out for repair, but I thought I'd throw this out here in case it turns out to be something I can diag/repair myself.

For comparison, the other miners will run happily until incoming air gets into the teens (F). Then, they will typically reboot once and run happily again for 1-6 hours before they do it again, if they do it again. Yes, I'm sure you'll tell me that's bad for them, and I normally modulate the incoming cooling air temp but there's some diagnostic value in knowing that difference exists.

I'm running Braiins OS+, just installed the new 21.12.1 release. Didn't improve anything. Changing the power settings doesn't affect anything. If I look at the log, I see lots of "TX fifo on hashboard (n) is empty" where (n) is 1, 2, or 3. I can post more of the log if you'd like to read it. There's no mention of temp sensors or anything else. The fact that the "TX fifo" errors occur simultaneously on all three hashboards has me thinking it might be a control board issue?

Thank you for your time.
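
One way to check the "simultaneously on all three hashboards" observation against a saved log is sketched below in Python. This rests on assumptions: the full "TX fifo" lines are not shown in this thread, so the sketch assumes each one starts with a syslog-style timestamp such as "Sun Jan 30 09:23:28 2022" and contains the quoted message with a board number in place of (n). The function names are hypothetical, and "simultaneous" here just means "within the same logged second".

```python
import re
from collections import defaultdict

# Assumed line shape: a timestamp like "Sun Jan 30 09:23:28 2022" at the start,
# and the message quoted in the post somewhere after it.
TIMESTAMP_RE = re.compile(r"^(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})")
FIFO_RE = re.compile(r"TX fifo on hashboard (\d+) is empty")


def fifo_errors_by_second(lines):
    """Map timestamp (1 s resolution) -> set of hashboards reporting an empty TX fifo."""
    seen = defaultdict(set)
    for line in lines:
        stamp = TIMESTAMP_RE.match(line)
        fifo = FIFO_RE.search(line)
        if stamp and fifo:
            seen[stamp.group(1)].add(int(fifo.group(1)))
    return dict(seen)


def simultaneous_failures(lines, boards=(1, 2, 3)):
    """Return the timestamps at which every listed hashboard logged the error."""
    wanted = set(boards)
    return [
        stamp
        for stamp, hit in fifo_errors_by_second(lines).items()
        if wanted <= hit
    ]
```

Feeding it the lines of a saved log (for example `simultaneous_failures(open(path))`, with `path` pointing at wherever the log was saved) lists the seconds in which all three boards dropped together. If most "TX fifo" errors fall in such seconds, that is at least consistent with a shared cause such as the control board or power, though it does not prove one.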