Has anyone considered putting a bad hash board in the oven, as some people do with GPUs, in the hope that the solder reflows?
I have plenty of dead hash boards to test with. Give me the oven recipe and I'll cook the hell out of them.
The science behind it, as it was explained to me years ago:
Things shrink when cooled. The boards shrink at one rate, the wire traces in the board at another. The chips shrink at one rate, the solder connections at another.
We are talking about fractions of a percent of a millimeter, but a connection that was just on the edge at room temperature will sometimes start to work.
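To put rough numbers on that, here is a small sketch of the arithmetic using the linear expansion formula dL = alpha * L * dT. The expansion coefficients are approximate textbook values, and the 10 mm span and the 40 degree drop from room temperature to a freezer are assumptions picked for illustration, not measurements from a hash board.

```python
# Approximate linear coefficients of thermal expansion, in ppm per degree C.
# Ballpark textbook values (assumption), not measured on any hash board.
CTE_PPM_PER_C = {
    "FR-4 laminate (in-plane)": 14.0,
    "copper trace": 17.0,
    "silicon die": 2.6,
    "lead-free solder (SAC305)": 22.0,
}


def length_change_um(cte_ppm_per_c, length_mm, delta_t_c):
    """Return the change in length in micrometers (negative means it shrinks)."""
    # dL = alpha * L * dT. ppm * mm gives nanometers, so divide by 1000.
    return cte_ppm_per_c * length_mm * delta_t_c / 1000.0


SPAN_MM = 10.0     # hypothetical distance across one chip footprint
DELTA_T_C = -40.0  # hypothetical drop from room temperature to a freezer

for material, cte in CTE_PPM_PER_C.items():
    change = length_change_um(cte, SPAN_MM, DELTA_T_C)
    print(f"{material}: {change:+.2f} um over {SPAN_MM:.0f} mm")

# Mismatch between the board and the chip sitting on it over the same span.
board = length_change_um(CTE_PPM_PER_C["FR-4 laminate (in-plane)"], SPAN_MM, DELTA_T_C)
die = length_change_um(CTE_PPM_PER_C["silicon die"], SPAN_MM, DELTA_T_C)
print(f"board vs. die mismatch: {abs(board - die):.2f} um")
```

Under those assumptions the board and the die end up a few micrometers apart over the span, which is tiny but plausibly enough to close or open a cracked joint that was borderline to begin with.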
That explains it very well.
Update:
The 2 hash boards are still working just fine. I also noticed that one of the boards that came back to life and then died now shows the full ASIC count (65 ASICs), but it has a temp sensor error:
2021-02-23 19:04:14 driver-btm-api.c:1134:check_asic_number_with_power_on: Chain[0]: find 65 asic, times 0
2021-02-23 19:04:25 driver-btm-api.c:1134:check_asic_number_with_power_on: Chain[2]: find 65 asic, times 0
2021-02-23 19:04:30 driver-hash-chip.c:266:set_uart_relay: set uart relay to 0x330003
2021-02-23 19:04:30 driver-btm-api.c:435:set_order_clock: chain[0]: set order clock, stragegy 3
2021-02-23 19:04:30 driver-btm-api.c:435:set_order_clock: chain[2]: set order clock, stragegy 3
2021-02-23 19:04:31 driver-hash-chip.c:502:set_clock_delay_control: core_data = 0x34
2021-02-23 19:04:31 driver-btm-api.c:1892:check_clock_counter: freq 50 clock_counter_limit 6
2021-02-23 19:04:31 voltage[0] = 1980
2021-02-23 19:04:31 voltage[2] = 1980
2021-02-23 19:04:31 power_api.c:226:set_working_voltage_raw: working_voltage_raw = 1980
2021-02-23 19:04:32 temperature.c:340:calibrate_temp_sensor_one_chain: chain 0 temp sensor NCT218
2021-02-23 19:04:33 temperature.c:340:calibrate_temp_sensor_one_chain: chain 2 temp sensor NCT218
2021-02-23 19:04:33 uart.c:72:set_baud: set fpga_baud to 12000000
2021-02-23 19:04:34 driver-btm-api.c:293:check_bringup_temp: Bring up temperature is 21
2021-02-23 19:04:34 thread.c:1378:create_check_miner_status_thread: create thread
2021-02-23 19:04:34 thread.c:1368:create_show_miner_status_thread: create thread
2021-02-23 19:04:34 thread.c:1348:create_temperature_monitor_thread: create thread
2021-02-23 19:04:34 frequency.c:514:check_bringup_temp_dec_freq: dec freq = 0 when bringup temp = 21 dec_freq_index=0
2021-02-23 19:04:34 freq_tuning.c:183:freq_tuning_get_max_freq: Max freq of tuning is 650
2021-02-23 19:04:34 driver-btm-api.c:1765:send_null_work: [DEBUG] Send null work.
2021-02-23 19:04:34 thread.c:1338:create_asic_status_monitor_thread: create thread
2021-02-23 19:04:34 frequency.c:1110:inc_freq_with_fixed_vco: chain = 255, freq = 500, is_higher_voltage = true
2021-02-23 19:05:58 power_api.c:352:set_to_voltage_by_steps: Set to voltage raw 2090, step by step.
2021-02-23 19:05:59 power_api.c:85:check_voltage_multi: retry time: 0
2021-02-23 19:06:01 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 20.817602
2021-02-23 19:06:02 power_api.c:40:_get_avg_voltage: chain = 2, voltage = 20.720663
2021-02-23 19:06:02 power_api.c:53:_get_avg_voltage: average_voltage = 20.769132
2021-02-23 19:06:02 power_api.c:71:check_voltage: target_vol = 20.90, actural_vol = 20.77, check voltage passed.
2021-02-23 19:09:27 frequency.c:1152:inc_freq_with_fixed_step: chain = 2, freq_start = 500, freq_end = 520, freq_step = 5, is_higher_voltage = true
2021-02-23 19:09:35 frequency.c:1181:inc_asic_diff_freq_by_steps: chain = 0, start = 500, freq_step = 5
2021-02-23 19:09:41 frequency.c:1181:inc_asic_diff_freq_by_steps: chain = 2, start = 520, freq_step = 5
2021-02-23 19:09:44 driver-btm-api.c:765:set_timeout: freq = 550, percent = 90, hcn = 44236, timeout = 80
2021-02-23 19:09:44 power_api.c:310:set_to_working_voltage_by_steps: Set to voltage raw 1980, step by step.
2021-02-23 19:09:49 power_api.c:85:check_voltage_multi: retry time: 0
2021-02-23 19:09:50 power_api.c:40:_get_avg_voltage: chain = 0, voltage = 19.727040
2021-02-23 19:09:51 power_api.c:40:_get_avg_voltage: chain = 2, voltage = 19.630102
2021-02-23 19:09:51 power_api.c:53:_get_avg_voltage: average_voltage = 19.678571
2021-02-23 19:09:51 power_api.c:71:check_voltage: target_vol = 19.80, actural_vol = 19.68, check voltage passed.
2021-02-23 19:09:51 thread.c:1373:create_check_system_status_thread: create thread
2021-02-23 19:09:52 driver-btm-api.c:2618:bitmain_soc_init: Init done!
2021-02-23 19:09:52 driver-btm-api.c:222:set_miner_status: STATUS_INIT
2021-02-23 19:09:56 driver-btm-api.c:222:set_miner_status: STATUS_OKAY
2021-02-23 19:09:57 frequency.c:205:get_ideal_hash_rate_GH: ideal_hash_rate = 45227
2021-02-23 19:09:57 frequency.c:223:get_sale_hash_rate_GH: sale_hash_rate = 43000
2021-02-23 19:10:00 driver-btm-api.c:1496:dhash_chip_send_job: Version num 4.
2021-02-23 19:10:00 driver-btm-api.c:1644:dhash_chip_send_job: stime.tv_sec 1614107400, block_ntime 1614107373
2021-02-23 19:10:04 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 63, require 65, failed times 1: xxooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo ooooo
2021-02-23 19:10:05 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 0 times.
2021-02-23 19:10:23 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 5, require 65, failed times 1: ooooo xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-02-23 19:10:24 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 14, reg = 0
2021-02-23 19:10:24 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 0, chip = 14, reg = 1
2021-02-23 19:10:24 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 10, reg = 0
2021-02-23 19:10:24 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 0 times.
2021-02-23 19:10:25 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 1, chip = 10, reg = 1
2021-02-23 19:10:25 thread.c:1273:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 5, require 65, failed times 1: ooooo xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx xxxxx
2021-02-23 19:10:25 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 0
2021-02-23 19:10:25 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 2, chip = 54, reg = 1
2021-02-23 19:10:26 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 0
2021-02-23 19:10:26 temperature.c:843:get_temp_info: read temp sensor failed: chain = 2, sensor = 3, chip = 50, reg = 1
2021-02-23 19:10:26 temperature.c:875:get_temp_info: ERROR: chain 2 can get NONE temp info or temp value abnormal, power it off
2021-02-23 19:10:26 thread.c:1293:asic_status_monitor_thread: chain 2 can't get enough hashrate reg val for 1 times.
2021-02-23 19:10:27 thread.c:1265:asic_status_monitor_thread: ERROR: chain 2 get hashrate_reg_counter 0, require 65, failed times 1
2021-02-23 19:10:28 frequency.c:205:get_ideal_hash_rate_GH: ideal_hash_rate = 22656
2021-02-23 19:10:28 frequency.c:223:get_sale_hash_rate_GH: sale_hash_rate = 22000
2021-02-23 19:40:00 thread.c:259:calc_hashrate_avg: avg rate is 23083.64 in 30 mins
2021-02-23 19:40:00 temperature.c:516:temp_statistics_show: pcb temp 49~65 chip temp 72~80
2021-02-23 20:10:01 thread.c:259:calc_hashrate_avg: avg rate is 22760.07 in 30 mins
2021-02-23 20:10:01 temperature.c:516:temp_statistics_show: pcb temp 50~65 chip temp 71~79
Maybe I froze those temp sensors more than I should have, or it could just be a loose heatsink somewhere. Nonetheless, showing the full ASIC count after the freezer visit is a good sign.
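For anyone comparing boards before and after a freezer run, here is a rough sketch that pulls the per-chip map out of the asic_status_monitor_thread ERROR lines in the log above. It reads "o" as a chip that reported and "x" as one that did not. That reading is inferred from the counts lining up with hashrate_reg_counter in this log, not taken from any firmware documentation, and whether position 0 in the map is physically chip 0 is also an assumption. The function name parse_chip_map is made up for this example.

```python
import re

# Matches ERROR lines from asic_status_monitor_thread that carry a chip map
# after "failed times N:". Lines without a map do not match.
PATTERN = re.compile(
    r"chain (?P<chain>\d+) get hashrate_reg_counter (?P<got>\d+), "
    r"require (?P<need>\d+), failed times \d+: (?P<map>[xo ]+)$"
)


def parse_chip_map(line):
    """Return a summary dict for one log line, or None if it has no chip map."""
    m = PATTERN.search(line.strip())
    if m is None:
        return None
    chips = m.group("map").replace(" ", "")
    return {
        "chain": int(m.group("chain")),
        "reported": int(m.group("got")),    # counter value printed in the log
        "required": int(m.group("need")),
        "responding": chips.count("o"),     # assumption: "o" means the chip reported
        "missing_positions": [i for i, c in enumerate(chips) if c == "x"],
    }


PREFIX = "2021-02-23 19:10:04 thread.c:1273:asic_status_monitor_thread: ERROR: "
# Rebuilds the chain 2 line from 19:10:04: 13 groups of 5, the first two chips "x".
first_map = " ".join(["xxooo"] + ["ooooo"] * 12)
first = parse_chip_map(
    PREFIX + "chain 2 get hashrate_reg_counter 63, require 65, failed times 1: " + first_map
)
print(first["responding"], first["missing_positions"])

# The later lines: only the first 5 chips still answer.
second_map = " ".join(["ooooo"] + ["xxxxx"] * 12)
second = parse_chip_map(
    PREFIX + "chain 2 get hashrate_reg_counter 5, require 65, failed times 1: " + second_map
)
print(second["responding"], len(second["missing_positions"]))
```

Read this way, chain 2 first lost two chips at the start of the map, then everything after the fifth chip, and then the temp sensor reads on that chain failed too.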