
Topic: Any ideas why my S17 53th keeps showing 127 degrees and 0.00 hash? (Read 179 times)

jr. member
Activity: 37
Merit: 6
Not sure if I understood correctly, but if you're seeing 127-127-127-127 temp readouts, that's almost certainly a PSU issue. Try swapping PSUs with another miner that you know works fine to confirm or rule it out. I've had 127-127-127-127 errors on 4 different 17-series miners, and replacing the PSU fixed every one.

So far, installing the recovery image via SD card and then flashing the most recent firmware seems to have made it run well again.
full member
Activity: 201
Merit: 404
Not sure if I understood correctly, but if you're seeing 127-127-127-127 temp readouts, that's almost certainly a PSU issue. Try swapping PSUs with another miner that you know works fine to confirm or rule it out. I've had 127-127-127-127 errors on 4 different 17-series miners, and replacing the PSU fixed every one.
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
Seems to be running great now! No 127 degrees. So far it's been 30 minutes at 40-60 degrees on all chips like before, and it seems to be OK. This is the longest it has run without crapping out on me.

I hope this isn't just a coincidence of your voltage happening to become stable, but either way, should the problem appear again you have to measure the voltage. Don't count on what your electrician tells you unless he is there measuring it live. Get a plug-and-play voltage meter and measure the voltage consistently; if it goes anywhere above 240 V, that will likely cause issues.

jr. member
Activity: 37
Merit: 6
Ok so I did the SD card recovery, had some issues, but finally got that to work. Then I saw in the dashboard it was back to the 2019 firmware, so I flashed the newest firmware and rebooted the rig.

Seems to be running great now! No 127 degrees. So far it's been 30 minutes at 40-60 degrees on all chips like before, and it seems to be OK. This is the longest it has run without crapping out on me.

So hopefully this works out. I appreciate your help, I learned some new stuff!
jr. member
Activity: 37
Merit: 6
I'm running it directly to a PDU on a 30 amp breaker like I always do on my 100amp service.

This is irrelevant to the question; it doesn't matter how you connect it, the AC voltage is what matters. These PSUs need a consistent input voltage of 220-240 V. They might survive 200 V but nothing below that, and anything above 240 V is going to cause issues. You need to measure the input VOLTAGE and not the amps. The problem could be caused by something else, but everything as of now points to a power-related issue: it's either a bad PSU or bad voltage.

My electrician just told me I have 240V
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
I'm running it directly to a PDU on a 30 amp breaker like I always do on my 100amp service.

This is irrelevant to the question; it doesn't matter how you connect it, the AC voltage is what matters. These PSUs need a consistent input voltage of 220-240 V. They might survive 200 V but nothing below that, and anything above 240 V is going to cause issues. You need to measure the input VOLTAGE and not the amps. The problem could be caused by something else, but everything as of now points to a power-related issue: it's either a bad PSU or bad voltage.
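For anyone logging readings from a plug-in voltage meter, here is a minimal Python sketch of the ranges described in this post (220-240 V expected, down to 200 V possibly survivable, above 240 V likely to cause issues). The function names and labels are made up for illustration, and the cut-offs are only this post's rule of thumb, not a Bitmain specification.

```python
from collections import Counter


def classify_mains_voltage(volts: float) -> str:
    """Bucket one AC input reading using the ranges quoted in the post."""
    if volts < 200:
        return "too low"    # below what the PSU is said to survive
    if volts < 220:
        return "marginal"   # might survive, but under the 220-240 V range
    if volts <= 240:
        return "ok"         # inside the 220-240 V range
    return "too high"       # above 240 V is said to cause issues


def summarize(readings):
    """Count how many readings fall into each bucket."""
    return Counter(classify_mains_voltage(v) for v in readings)


if __name__ == "__main__":
    # Hypothetical readings taken over time, for illustration only.
    print(summarize([231.0, 238.5, 243.2, 229.9]))
```

A single reading tells you little; the point of counting buckets is to see whether the voltage drifts out of range over a longer stretch of time.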
jr. member
Activity: 37
Merit: 6
Where do you find the recovery firmware for the SD card?

You can find it on the Bitmain download page; check the link below:

- https://service.bitmain.com/support/download?product=Flashing%20SD%20card%20with%20image

I don't know if it will fix this issue, but it is worth trying.

Have you already tried running your miner with only one hashboard? If not, test them one by one.

Good call, I'll try running them one hashboard at a time.

I did order a replacement PSU to see if that's the issue. Would be weird since the PSU is basically new.
legendary
Activity: 3206
Merit: 2904
Block halving is coming.
Where do you find the recovery firmware for the SD card?

You can find it on the Bitmain download page; check the link below:

- https://service.bitmain.com/support/download?product=Flashing%20SD%20card%20with%20image

I don't know if it will fix this issue, but it is worth trying.

Have you already tried running your miner with only one hashboard? If not, test them one by one.
jr. member
Activity: 37
Merit: 6
It ran fine for 7 minutes before showing 127 degrees across all chips.
jr. member
Activity: 37
Merit: 6
After the reinstall I hadn't rebooted it and it wasn't running, so I rebooted it. Now the temps look normal and the kernel log says this. I guess I'll see how long it runs for...

2021-09-07 16:04:14 power_api.c:219:set_working_voltage_raw: working_voltage_raw = 1800
2021-09-07 16:04:15 temperature.c:281:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 0 success.
2021-09-07 16:04:17 temperature.c:281:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 1 success.
2021-09-07 16:04:18 temperature.c:281:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 2 success.
2021-09-07 16:04:18 uart.c:71:set_baud: set fpga_baud to 6000000
2021-09-07 16:04:20 driver-btm-api.c:245:check_bringup_temp: Bring up temperature is 28
2021-09-07 16:04:20 thread.c:798:create_check_miner_status_thread: create thread
2021-09-07 16:04:20 thread.c:788:create_set_miner_status_thread: create thread
2021-09-07 16:04:20 thread.c:773:create_temperature_monitor_thread: create thread
2021-09-07 16:04:20 frequency.c:749:inc_freq_with_fixed_vco: chain = 255, freq = 565, is_higher_voltage = true
2021-09-07 16:04:20 power_api.c:287:set_to_voltage_by_steps: Set to voltage raw 2020, step by step.
2021-09-07 16:04:38 power_api.c:87:check_voltage_multi: retry time: 0
2021-09-07 16:04:39 power_api.c:37:_get_avg_voltage: chain = 0, voltage = 20.108086
2021-09-07 16:04:40 power_api.c:37:_get_avg_voltage: chain = 1, voltage = 20.089717
2021-09-07 16:04:41 power_api.c:37:_get_avg_voltage: chain = 2, voltage = 20.101963
2021-09-07 16:04:41 power_api.c:50:_get_avg_voltage: average_voltage = 20.099922
2021-09-07 16:04:41 power_api.c:68:check_voltage: target_vol = 20.20, actural_vol = 20.10, check voltage passed.
2021-09-07 16:05:11 power_api.c:287:set_to_voltage_by_steps: Set to voltage raw 1880, step by step.
2021-09-07 16:05:31 power_api.c:87:check_voltage_multi: retry time: 0
2021-09-07 16:05:32 power_api.c:37:_get_avg_voltage: chain = 0, voltage = 18.730400
2021-09-07 16:05:34 power_api.c:37:_get_avg_voltage: chain = 1, voltage = 18.656924
2021-09-07 16:05:35 power_api.c:37:_get_avg_voltage: chain = 2, voltage = 18.693662
2021-09-07 16:05:35 power_api.c:50:_get_avg_voltage: average_voltage = 18.693662
2021-09-07 16:05:35 power_api.c:68:check_voltage: target_vol = 18.80, actural_vol = 18.69, check voltage passed.
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 0, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 1, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 2, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 0, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 1, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:801:inc_freq_with_fixed_step: chain = 2, freq_start = 565, freq_end = 565, freq_step = 5, is_higher_voltage = true
2021-09-07 16:05:41 frequency.c:618:inc_asic_diff_freq_by_steps: chain = 0, start = 565, freq_step = 5
2021-09-07 16:05:44 frequency.c:618:inc_asic_diff_freq_by_steps: chain = 1, start = 565, freq_step = 5
2021-09-07 16:05:47 frequency.c:618:inc_asic_diff_freq_by_steps: chain = 2, start = 565, freq_step = 5
2021-09-07 16:05:50 driver-btm-api.c:528:set_timeout: freq = 595, percent = 90, hcn = 73728, timeout = 123
2021-09-07 16:05:50 power_api.c:269:set_to_working_voltage_by_steps: Set to voltage raw 1800, step by step.
2021-09-07 16:06:11 power_api.c:87:check_voltage_multi: retry time: 0
2021-09-07 16:06:12 power_api.c:37:_get_avg_voltage: chain = 0, voltage = 17.897666
2021-09-07 16:06:13 power_api.c:37:_get_avg_voltage: chain = 1, voltage = 17.867051
2021-09-07 16:06:14 power_api.c:37:_get_avg_voltage: chain = 2, voltage = 17.879297
2021-09-07 16:06:14 power_api.c:50:_get_avg_voltage: average_voltage = 17.881338
2021-09-07 16:06:14 power_api.c:68:check_voltage: target_vol = 18.00, actural_vol = 17.88, check voltage passed.
2021-09-07 16:06:14 thread.c:793:create_check_system_status_thread: create thread
2021-09-07 16:06:15 driver-btm-api.c:1988:bitmain_soc_init: Init done!
2021-09-07 16:06:15 driver-btm-api.c:198:set_miner_status: STATUS_INIT
2021-09-07 16:06:19 driver-btm-api.c:198:set_miner_status: STATUS_OKAY
2021-09-07 16:06:20 frequency.c:216:get_ideal_hash_rate_GH: ideal_hash_rate = 54675
2021-09-07 16:06:20 frequency.c:236:get_sale_hash_rate_GH: sale_hash_rate = 53000
2021-09-07 16:06:24 driver-btm-api.c:1199:dhash_chip_send_job: Version num 4
jr. member
Activity: 37
Merit: 6
The weird thing about it is that the PSU

There is nothing weird about that; the S17 series is a piece of trash and you should expect anything. What is your input AC voltage?

I'm running it directly to a PDU on a 30 amp breaker like I always do on my 100amp service.
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
The weird thing about it is that the PSU

There is nothing weird about that; the S17 series is a piece of trash and you should expect anything. What is your input AC voltage?
jr. member
Activity: 37
Merit: 6
I reinstalled the S17 Pro firmware and the kernel log still says:

2021-09-07 15:43:12 thread.c:127:pic_heart_beat_thread: chain[1] heart beat fail 9 times.
2021-09-07 15:43:13 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 0, chip = 55, reg = 1
2021-09-07 15:43:13 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 40, reg = 0
2021-09-07 15:43:15 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 40, reg = 1
2021-09-07 15:43:16 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 180, reg = 0
2021-09-07 15:43:17 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 180, reg = 1
2021-09-07 15:43:18 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 0
2021-09-07 15:43:19 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 1
2021-09-07 15:43:20 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 0
2021-09-07 15:43:20 power_api.c:37:_get_avg_voltage: chain = 0, voltage = 0.000000
2021-09-07 15:43:21 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 1
2021-09-07 15:43:21 thread.c:127:pic_heart_beat_thread: chain[2] heart beat fail 9 times.
2021-09-07 15:43:22 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 40, reg = 0
2021-09-07 15:43:23 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 40, reg = 1
2021-09-07 15:43:24 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 180, reg = 0
2021-09-07 15:43:25 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 180, reg = 1
2021-09-07 15:43:26 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 195, reg = 0
2021-09-07 15:43:27 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 195, reg = 1
2021-09-07 15:43:29 power_api.c:37:_get_avg_voltage: chain = 1, voltage = 0.000000
2021-09-07 15:43:29 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 0
2021-09-07 15:43:30 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 1
2021-09-07 15:43:31 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 1, chip = 40, reg = 0
2021-09-07 15:43:33 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 1, chip = 40, reg = 1
2021-09-07 15:43:33 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 2, chip = 180, reg = 0
2021-09-07 15:43:35 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 2, chip = 180, reg = 1
2021-09-07 15:43:35 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 3, chip = 195, reg = 0
2021-09-07 15:43:37 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 3, chip = 195, reg = 1
2021-09-07 15:43:37 power_api.c:37:_get_avg_voltage: chain = 2, voltage = 0.000000
2021-09-07 15:43:37 power_api.c:50:_get_avg_voltage: average_voltage = 0.000000
2021-09-07 15:43:37 power_api.c:63:check_voltage: target_vol = 18.80, actural_vol = 0.00, more than 1.0v diff.
2021-09-07 15:43:37 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 0, chip = 55, reg = 0
2021-09-07 15:43:38 power_api.c:87:check_voltage_multi: retry time: 10
2021-09-07 15:43:39 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 0, chip = 55, reg = 1
2021-09-07 15:43:40 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 40, reg = 0
2021-09-07 15:43:41 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 1, chip = 40, reg = 1
2021-09-07 15:43:41 thread.c:127:pic_heart_beat_thread: chain[0] heart beat fail 9 times.
2021-09-07 15:43:42 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 180, reg = 0
2021-09-07 15:43:43 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 2, chip = 180, reg = 1
2021-09-07 15:43:44 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 0
2021-09-07 15:43:45 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 1
2021-09-07 15:43:46 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 0
2021-09-07 15:43:46 power_api.c:37:_get_avg_voltage: chain = 0, voltage = 0.000000
2021-09-07 15:43:47 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 1
2021-09-07 15:43:48 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 40, reg = 0
2021-09-07 15:43:49 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 40, reg = 1
2021-09-07 15:43:50 thread.c:127:pic_heart_beat_thread: chain[1] heart beat fail 10 times.
2021-09-07 15:43:50 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 180, reg = 0
2021-09-07 15:43:51 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 180, reg = 1
2021-09-07 15:43:52 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 195, reg = 0
2021-09-07 15:43:54 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 195, reg = 1
2021-09-07 15:43:55 power_api.c:37:_get_avg_voltage: chain = 1, voltage = 0.000000
2021-09-07 15:43:55 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 0
2021-09-07 15:43:57 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 1
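The `check_voltage` lines are the quickest thing to look for in a log like this: the firmware prints the target and the measured hashboard voltage and complains when they differ by "more than 1.0v". Below is a rough Python sketch that pulls those pairs out of a saved kernel log. The regex is written against the line format seen in this thread only (including the firmware's own "actural_vol" spelling); the function names are hypothetical, and other firmware versions may log this differently.

```python
import re

# Matches e.g. "check_voltage: target_vol = 18.80, actural_vol = 0.00"
CHECK_VOLTAGE = re.compile(
    r"check_voltage: target_vol = (?P<target>\d+\.\d+), "
    r"actural_vol = (?P<actual>\d+\.\d+)"
)


def voltage_checks(log_text):
    """Yield (target, actual, diff) for every check_voltage line."""
    for m in CHECK_VOLTAGE.finditer(log_text):
        target = float(m.group("target"))
        actual = float(m.group("actual"))
        yield target, actual, abs(target - actual)


def failed_checks(log_text, max_diff=1.0):
    """Return (target, actual) pairs that differ by more than max_diff volts."""
    return [(t, a) for t, a, d in voltage_checks(log_text) if d > max_diff]


# Two lines copied from the kernel logs posted in this thread.
SAMPLE = """\
2021-09-07 15:43:37 power_api.c:63:check_voltage: target_vol = 18.80, actural_vol = 0.00, more than 1.0v diff.
2021-09-07 16:04:41 power_api.c:68:check_voltage: target_vol = 20.20, actural_vol = 20.10, check voltage passed.
"""

if __name__ == "__main__":
    for target, actual in failed_checks(SAMPLE):
        print(f"target {target:.2f} V but measured {actual:.2f} V")
```

A measured value of 0.00 V on every chain, as in the log above, means the boards are reporting no supply voltage at all, which fits the power-related suspicion raised elsewhere in the thread.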
jr. member
Activity: 37
Merit: 6
It seems that the temp sensor is reading abnormally or is broken.

Have you tried flashing it? If not, try that first, then test the hashboards one by one.

Also, a little bit of research here on the forum will bring you to https://bitcointalksearch.org/topic/antminer-t17s17-temp-sensor-problem-discussion-5244120
Reading that should help you get rid of the sensor problem.

I've been searching online but can't seem to find the recovery firmware to use with the SD card. I did find the normal S17 firmware, which I can install.

Where do you find the recovery firmware for the SD card?
jr. member
Activity: 37
Merit: 6
Code:
2021-09-02 16:20:19 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 0
2021-09-02 16:20:09 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 1
2021-09-02 16:20:10 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 0

As the kernel log shows, the firmware isn't able to read the temps from these temp sensors, but that isn't exactly the problem. It's very unlikely for a temp sensor to break; in most cases it's actually a bad chip that tricks the firmware into thinking there is an issue with one of the sensors.

However, the above is valid when the error only shows for 1 hash board. You said the 3 hash boards were working fine and then stopped together, and since it's highly unlikely that ALL three boards will fail at the same time, this confirms Zeusbtc's tech's theory, which suggests that in most cases when you get a temp error on all three boards it means your PSU is bad.

The weird thing about it is that the PSU is pretty much brand new.
jr. member
Activity: 37
Merit: 6
It seems that the temp sensor is reading abnormally or is broken.

Have you tried flashing it? If not, try that first, then test the hashboards one by one.

Also, a little bit of research here on the forum will bring you to https://bitcointalksearch.org/topic/antminer-t17s17-temp-sensor-problem-discussion-5244120
Reading that should help you get rid of the sensor problem.

I was thinking of trying the SD card / flashing but wasn't 100% sure how to do it and didn't want to mess up the rig.


Thanks for the link, I'll try to do this today and see if it works.
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
Code:
2021-09-02 16:20:19 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 0
2021-09-02 16:20:09 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 1
2021-09-02 16:20:10 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 0

As the kernel log shows, the firmware isn't able to read the temps from these temp sensors, but that isn't exactly the problem. It's very unlikely for a temp sensor to break; in most cases it's actually a bad chip that tricks the firmware into thinking there is an issue with one of the sensors.

However, the above is valid when the error only shows for 1 hash board. You said the 3 hash boards were working fine and then stopped together, and since it's highly unlikely that ALL three boards will fail at the same time, this confirms Zeusbtc's tech's theory, which suggests that in most cases when you get a temp error on all three boards it means your PSU is bad.
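The rule of thumb in this post (errors on one board point to a bad chip, errors on all three point to the PSU) can be checked against a saved kernel log with a few lines of Python. This is only a sketch of that heuristic, not a diagnostic tool: the function names are hypothetical, the regex assumes the "read temp sensor failed: chain = N" format quoted above, and three chains is assumed because the S17 logs in this thread show chains 0 to 2.

```python
import re
from collections import Counter

TEMP_FAIL = re.compile(r"read temp sensor failed: chain = (\d+)")


def failing_chains(log_text):
    """Count temp sensor read failures per chain (hash board)."""
    return Counter(int(chain) for chain in TEMP_FAIL.findall(log_text))


def triage(log_text, total_chains=3):
    """Apply the one-board vs. all-boards heuristic from the post."""
    chains = failing_chains(log_text)
    if not chains:
        return "no temp sensor read failures"
    if len(chains) == total_chains:
        return "all boards failing: suspect PSU or input voltage first"
    listed = ", ".join(str(c) for c in sorted(chains))
    return "only some boards failing: suspect a bad chip on chain(s) " + listed


# The three log lines quoted at the top of this post.
SAMPLE = """\
2021-09-02 16:20:19 temperature.c:695:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 55, reg = 0
2021-09-02 16:20:09 temperature.c:695:get_temp_info: read temp sensor failed: chain = 1, sensor = 3, chip = 195, reg = 1
2021-09-02 16:20:10 temperature.c:695:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 55, reg = 0
"""

if __name__ == "__main__":
    print(triage(SAMPLE))
```

Swapping in a known-good PSU, or running one hashboard at a time, is still the real test; the script only tells you which of the two cases the log looks like.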
legendary
Activity: 3206
Merit: 2904
Block halving is coming.
It seems that the temp sensor is reading abnormally or is broken.

Have you tried flashing it? If not, try that first, then test the hashboards one by one.

Also, a little bit of research here on the forum will bring you to https://bitcointalksearch.org/topic/antminer-t17s17-temp-sensor-problem-discussion-5244120
Reading that should help you get rid of the sensor problem.
jr. member
Activity: 37
Merit: 6
I took apart the rig and checked everything, cleaned it thoroughly, and put it back together, and it's still giving me 127 degrees on each chip after 5 minutes of running. I have no idea why. Everything looks good, and the power supply is pretty much brand new. Any ideas?
jr. member
Activity: 37
Merit: 6
Thanks for the tip! Hadn't heard of Pastebin. I just signed up and pasted it there. Good to know about this resource, I would've liked to have used it for the past several years lol