Topic: S17+ "read temp sensor failed"

legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
September 28, 2021, 06:13:00 PM
#4
Quote
Thank you for the quick reply. I've searched and seen that some other people with this error apparently fixed it with another FW or by reflashing the FW. Or do my logs differ from theirs?

You have not posted the other logs, so I can't tell. However, on the rare occasions when the temp sensor error really is a faulty temp sensor, some custom firmware will work as long as 2 out of 4 sensors are working, unlike the stock firmware, which requires 4 out of 4. This isn't the usual case, but you lose nothing by trying another firmware like Vnish or BO.
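To make the 2-out-of-4 versus 4-out-of-4 point concrete, here is a rough Python sketch that collects the sensors that fail on a chain from `read temp sensor failed` lines and compares what is left against each threshold. The function names, the fixed total of 4 sensors per board, and the idea that a sensor with no failure line counts as working are all assumptions for illustration, not how any firmware actually decides.

```python
import re

# Matches lines such as:
# ... temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
FAIL_RE = re.compile(
    r"read temp sensor failed: chain = (\d+), sensor = (\d+), chip = (\d+), reg = (\d+)"
)

SENSORS_PER_CHAIN = 4  # assumption taken from the "4 out of 4" remark above


def failed_sensors(log_text, chain):
    """Return the set of sensor indexes that logged a read failure on `chain`."""
    failed = set()
    for m in FAIL_RE.finditer(log_text):
        if int(m.group(1)) == chain:
            failed.add(int(m.group(2)))
    return failed


def passes(log_text, chain, required):
    """True if at least `required` sensors on `chain` have no failure line."""
    working = SENSORS_PER_CHAIN - len(failed_sensors(log_text, chain))
    return working >= required


sample = """\
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 1
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 0
2021-09-28 18:43:24 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1
"""

print(sorted(failed_sensors(sample, 2)))  # [2, 3]
print(passes(sample, 2, required=4))      # False (stock-style rule: 4 of 4)
print(passes(sample, 2, required=2))      # True  (custom-style rule: 2 of 4)
```

With only those four lines, chain 2 would fail a 4-of-4 check but pass a 2-of-4 check. That says nothing about whether the chips on the board still hash, which is the more likely problem.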

Quote
How do I spot the physical hashboard with chain 2?


Follow the ribbon cable to its end at the control board; there are tiny but readable labels with the chain numbers.
newbie
Activity: 33
Merit: 0
September 28, 2021, 05:06:56 PM
#3

Quote
This is caused by physical damage; there isn't much you can do to fix it if you don't have the tools and skills needed.

Note that while the kernel log reports an issue with the temp sensor, this problem is rarely caused by an actually faulty temp sensor. It's usually caused by a bad chip on the hash board that reports the error, in your case chain 2.

Sometimes the PSU can be the cause. To test this, unplug the other two hash boards and see if this one works. If it does not, then it's not a PSU-related issue, and you will have to unplug that hash board and mine with the other two.

By the way, this is a very common issue with the whole 17 series; at this point, probably the vast majority of these miners have lost at least one hash board.

Thank you for the quick reply. I've searched and seen that some other people with this error apparently fixed it with another FW or by reflashing the FW. Or do my logs differ from theirs?

How do I spot the physical hashboard with chain 2?

Thanks!
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
September 28, 2021, 04:40:59 PM
#2
Quote
2021-09-28 18:43:31 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1

This is caused by physical damage; there isn't much you can do to fix it if you don't have the tools and skills needed.

Note that while the kernel log reports an issue with the temp sensor, this problem is rarely caused by an actually faulty temp sensor. It's usually caused by a bad chip on the hash board that reports the error, in your case chain 2.
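One way to see that the board itself is the problem: the `asic_status_monitor_thread` lines in your log carry a map of `o` and `x` marks. A rough Python sketch for counting them follows. Reading `o` as a chip that reported and `x` as one that did not is my assumption (it matches the counter in the same line, 18 of 65), and the function name is made up for illustration.

```python
import re

# Matches lines such as:
# ... ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ...
MAP_RE = re.compile(
    r"chain (\d+) get hashrate_reg_counter (\d+), require (\d+), failed times \d+: ([ox ]+)"
)


def chip_map(line):
    """Parse one monitor line into (chain, counter, required, marks), or None."""
    m = MAP_RE.search(line)
    if m is None:
        return None
    marks = m.group(4).replace(" ", "")  # drop the spaces between groups of 5
    return int(m.group(1)), int(m.group(2)), int(m.group(3)), marks


line = (
    "2021-09-28 18:43:20 thread.c:1273:asic_status_monitor_thread: ERROR: "
    "chain 2 get hashrate_reg_counter 18, require 65, failed times 1: "
    "ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx"
)

chain, counter, required, marks = chip_map(line)
print(chain, counter, required)      # 2 18 65
print(len(marks), marks.count("o"))  # 65 18
print(marks.find("x"))               # 18 (0-based index of the first 'x')
```

The marks stop showing `o` after the 18th position and the counter also reads 18, so everything past that point on chain 2 is not answering. Which physical chip that index corresponds to depends on the board layout, so treat it as a pointer for whoever repairs the board rather than a diagnosis.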

Sometimes the PSU can be the cause. To test this, unplug the other two hash boards and see if this one works. If it does not, then it's not a PSU-related issue, and you will have to unplug that hash board and mine with the other two.

By the way, this is a very common issue with the whole 17 series; at this point, probably the vast majority of these miners have lost at least one hash board.
newbie
Activity: 33
Merit: 0
September 28, 2021, 03:29:01 PM
#1
Hey guys

One of my S17+ units started showing this error today: one of the hashboards disappeared and the fans slowed down.

Here's the log:

Code:
2021-09-28 18:41:44 thread.c:1338:create_asic_status_monitor_thread: create thread
2021-09-28 18:41:44 frequency.c:1110:inc_freq_with_fixed_vco: chain = 255, freq = 555, is_higher_voltage = true
2021-09-28 18:41:44 power_api.c:352:set_to_voltage_by_steps: Set to voltage raw 2090, step by step.
2021-09-28 18:41:45 power_api.c:85:check_voltage_multi: retry time: 0
2021-09-28 18:41:46 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 21.000000
2021-09-28 18:41:48 power_api.c:40:_get_avg_voltage: chain = 1, voltage = 20.940731
2021-09-28 18:41:49 power_api.c:40:_get_avg_voltage: chain = 2, voltage = 21.003812
2021-09-28 18:41:49 power_api.c:53:_get_avg_voltage: average_voltage = 20.981514
2021-09-28 18:41:49 power_api.c:71:check_voltage: target_vol = 20.90, actural_vol = 20.98, check voltage passed.
2021-09-28 18:43:20 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-09-28 18:43:22 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 0 times.
2021-09-28 18:43:22 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 1
2021-09-28 18:43:23 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 0
2021-09-28 18:43:24 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 1 times.
2021-09-28 18:43:24 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1
2021-09-28 18:43:24 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-09-28 18:43:26 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 2 times.
2021-09-28 18:43:26 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-09-28 18:43:26 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
2021-09-28 18:43:27 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 1
2021-09-28 18:43:27 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 0
2021-09-28 18:43:27 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1
2021-09-28 18:43:28 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 3 times.
2021-09-28 18:43:28 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 18, require 65, failed times 1: ooooo ooooo ooooo oooxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-09-28 18:43:30 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 4 times.
2021-09-28 18:43:30 thread.c:1297:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 5 times, power off.
2021-09-28 18:43:30 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
2021-09-28 18:43:30 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 1
2021-09-28 18:43:31 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 0
2021-09-28 18:43:31 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1
2021-09-28 18:44:17 power_api.c:352:set_to_voltage_by_steps: Set to voltage raw 2080, step by step.
2021-09-28 18:44:19 power_api.c:85:check_voltage_multi: retry time: 0
2021-09-28 18:44:20 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 20.857143
2021-09-28 18:44:21 power_api.c:40:_get_avg_voltage: chain = 1, voltage = 20.820933
2021-09-28 18:44:21 power_api.c:53:_get_avg_voltage: average_voltage = 20.839038
2021-09-28 18:44:21 power_api.c:71:check_voltage: target_vol = 20.80, actural_vol = 20.84, check voltage passed.
2021-09-28 18:45:11 power_api.c:352:set_to_voltage_by_steps: Set to voltage raw 2030, step by step.
2021-09-28 18:45:13 power_api.c:85:check_voltage_multi: retry time: 0
2021-09-28 18:45:15 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 20.380952
2021-09-28 18:45:16 power_api.c:40:_get_avg_voltage: chain = 1, voltage = 20.367425
2021-09-28 18:45:16 power_api.c:53:_get_avg_voltage: average_voltage = 20.374189
2021-09-28 18:45:16 power_api.c:71:check_voltage: target_vol = 20.30, actural_vol = 20.37, check voltage passed.
2021-09-28 18:47:40 driver-btm-api.c:765:set_timeout: freq = 555, percent = 90, hcn = 44236, timeout = 79
2021-09-28 18:47:40 power_api.c:310:set_to_working_voltage_by_steps: Set to voltage raw 1980, step by step.
2021-09-28 18:47:44 power_api.c:85:check_voltage_multi: retry time: 0
2021-09-28 18:47:46 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 19.880952
2021-09-28 18:47:47 power_api.c:40:_get_avg_voltage: chain = 1, voltage = 19.839647
2021-09-28 18:47:47 power_api.c:53:_get_avg_voltage: average_voltage = 19.860300
2021-09-28 18:47:47 power_api.c:71:check_voltage: target_vol = 19.80, actural_vol = 19.86, check voltage passed.
2021-09-28 18:47:47 thread.c:1373:create_check_system_status_thread: create thread
2021-09-28 18:47:47 driver-btm-api.c:2618:bitmain_soc_init: Init done!
2021-09-28 18:47:47 driver-btm-api.c:222:set_miner_status: STATUS_INIT
2021-09-28 18:47:52 driver-btm-api.c:222:set_miner_status: STATUS_OKAY


I've tried rebooting and upgrading the stock FW, but to no avail.
Any ideas on how to work around this?


Thanks!