
Topic: Error overheated chip t17+ (Read 193 times)

full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
May 07, 2021, 04:27:04 PM
#14
From what I can tell from the log, the miner starts mining and it takes about 3 minutes for the chip to overheat.


AntMiner

Chain#   ASIC#   Frequency(avg)   Voltage   Consumption (W)   GH/S(ideal)   GH/S(RT)   Errors(HW)   Temp(PCB)   Temp(Chip)   

3               44                        450                  16.5   494   13,305.60   7,857.16   0   43-49-47-53   71-61-80-62   Auto-tuning

I'm assuming that was captured during the 3 minutes, and one chip temp is already up to 80 deg, significantly higher than the others. So it looks like it runs for a little while, and that chip just keeps getting hotter until it hits the 90 deg limit and shuts down. You could confirm this by watching the status page and refreshing it constantly during bootup to monitor the temperature. If it just gradually rises to failure, then I don't think it is a faulty sensor, but rather a bad ASIC or a bad connection to its heatsink.

These miners have a lot of issues with heatsinks. Many times the copper plating that interfaces the chip to the solder holding the heatsink on delaminates, which can leave a small gap that kills the efficiency of the heatsink. It is very possible that if you shake that miner a bit, or push on that heatsink, it will just fall right off. I assume you've already checked that there isn't a loose or missing heatsink, and that the heatsinks aren't clogged up with anything?

You may be able to limp along if you block the 2 empty slots somehow. With them blocked, a lot more air will be forced through the heatsinks and it might be enough to keep it below 90.


Dear wndsnb
This status is with one hashboard installed; the other two hashboards were removed.
Chain#   ASIC#   Frequency(avg)   Voltage   Consumption (W)   GH/S(ideal)   GH/S(RT)   Errors(HW)   Temp(PCB)   Temp(Chip)   

3               44                        450                  16.5   494   13,305.60   7,857.16   0   43-49-47-53   71-61-80-62
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 29, 2021, 10:54:54 PM
#13
You may be able to limp along if you block the 2 empty slots somehow. With them blocked, a lot more air will be forced through the heatsinks and it might be enough to keep it below 90.

Or he could use something like Vnish and use the "chip frequency settings" to lower the frequency of the bad chip to the lowest point possible, that way the rest of the chips will run at the default frequency and he won't need any hardware modifications.
sr. member
Activity: 446
Merit: 347
April 28, 2021, 03:22:53 PM
#12
I agree completely with what wndsnb says; he is certainly right!

On the other hand, you ask for help, we suggest tests, and you do not run them... so why come and ask us?

Do you even know that the original Bitmain firmware makes the machine heat up much more? It is not for nothing that I asked you to test with my firmware, at VERY LOW SPEED and with the fans at 100%! If that scares you, ask others to confirm that I can be trusted...


So, give it a try...
hero member
Activity: 544
Merit: 589
April 28, 2021, 02:27:42 PM
#11
From what I can tell from the log, the miner starts mining and it takes about 3 minutes for the chip to overheat.


AntMiner

Chain#   ASIC#   Frequency(avg)   Voltage   Consumption (W)   GH/S(ideal)   GH/S(RT)   Errors(HW)   Temp(PCB)   Temp(Chip)   

3               44                        450                  16.5   494   13,305.60   7,857.16   0   43-49-47-53   71-61-80-62   Auto-tuning

I'm assuming that was captured during the 3 minutes, and one chip temp is already up to 80 deg, significantly higher than the others. So it looks like it runs for a little while, and that chip just keeps getting hotter until it hits the 90 deg limit and shuts down. You could confirm this by watching the status page and refreshing it constantly during bootup to monitor the temperature. If it just gradually rises to failure, then I don't think it is a faulty sensor, but rather a bad ASIC or a bad connection to its heatsink.

These miners have a lot of issues with heatsinks. Many times the copper plating that interfaces the chip to the solder holding the heatsink on delaminates, which can leave a small gap that kills the efficiency of the heatsink. It is very possible that if you shake that miner a bit, or push on that heatsink, it will just fall right off. I assume you've already checked that there isn't a loose or missing heatsink, and that the heatsinks aren't clogged up with anything?

You may be able to limp along if you block the 2 empty slots somehow. With them blocked, a lot more air will be forced through the heatsinks and it might be enough to keep it below 90.
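
The reading of the status row above, where one sensor sits well ahead of the others, can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, the 90 deg limit is the shutdown value reported in the log, and treating each dash-separated value as one sensor reading is an assumption about how the status page formats Temp(Chip).

```python
def parse_chip_temps(field):
    """Split a dash-separated Temp(Chip) field such as '71-61-80-62' into integers."""
    return [int(part) for part in field.split("-")]


def hottest_sensor(temps, limit=90):
    """Report the hottest reading, how far it is above the rest, and the margin left.

    `limit` is the shutdown temperature seen in the log (an assumption that it
    applies per sensor).
    """
    hottest = max(temps)
    index = temps.index(hottest)
    others = [t for i, t in enumerate(temps) if i != index]
    return {
        "index": index,                                   # position in the field, 0-based
        "temp": hottest,
        "above_others_avg": hottest - sum(others) / len(others),
        "headroom": limit - hottest,                      # degrees left before shutdown
    }


temps = parse_chip_temps("71-61-80-62")
report = hottest_sensor(temps)
print(report)
```

For the row in this thread, the third reading (80) is roughly 15 degrees above the average of the other three and only 10 degrees short of the shutdown limit, which is consistent with a single hot spot rather than the whole board running warm.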
full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
April 28, 2021, 01:56:58 PM
#10
2021-04-27 11:37:34 thread.c:1161:check_temperature: over max temp, pcb temp 68 (max 80), chip temp 105(max 103)
2021-04-27 11:37:34 driver-btm-api.c:222:set_miner_status: ERROR_TEMP_TOO_HIGH

I see you created another thread just for this problem, which IMO isn't needed at all. In my last post I asked you to set the fans to full RPM using a static fan speed of 100% and then report back on the findings. You said you flashed the Bitmain firmware, but you didn't mention anything about the fan speed.
Dear mikeywith
I tested with the fans at 100% and the error still occurs.
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 27, 2021, 05:30:40 PM
#9
2021-04-27 11:37:34 thread.c:1161:check_temperature: over max temp, pcb temp 68 (max 80), chip temp 105(max 103)
2021-04-27 11:37:34 driver-btm-api.c:222:set_miner_status: ERROR_TEMP_TOO_HIGH

I see you created another thread just for this problem, which IMO isn't needed at all. In my last post I asked you to set the fans to full RPM using a static fan speed of 100% and then report back on the findings. You said you flashed the Bitmain firmware, but you didn't mention anything about the fan speed.
full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
April 27, 2021, 06:42:25 AM
#8
I changed the firmware to the Bitmain firmware and it gives this error too.

Code:
Booting Linux on physical CPU 0x0
Linux version 4.6.0-xilinx-gff8137b-dirty (lzq@armdev2) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-23) ) #25 SMP PREEMPT Fri Nov 23 15:30:52 CST 2018
CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Xilinx Zynq
cma: Reserved 16 MiB at 0x0e000000
Memory policy: Data cache writealloc
On node 0 totalpages: 61440
free_area_init_node: node 0, pgdat c0b39280, node_mem_map cde10000
  Normal zone: 480 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 61440 pages, LIFO batch:15
percpu: Embedded 12 pages/cpu @cddf1000 s19776 r8192 d21184 u49152
pcpu-alloc: s19776 r8192 d21184 u49152 alloc=12*4096
pcpu-alloc: [0] 0 [0] 1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 60960
Kernel command line: mem=240M console=ttyPS0,115200 ramdisk_size=33554432 root=/dev/ram rw earlyprintk
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 203416K/245760K available (6345K kernel code, 231K rwdata, 1896K rodata, 1024K init, 223K bss, 25960K reserved, 16384K cma-reserved, 0K highmem)
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    vmalloc : 0xcf800000 - 0xff800000   ( 768 MB)
    lowmem  : 0xc0000000 - 0xcf000000   ( 240 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0xc0008000 - 0xc090c424   (9234 kB)
      .init : 0xc0a00000 - 0xc0b00000   (1024 kB)
      .data : 0xc0b00000 - 0xc0b39fe0   ( 232 kB)
       .bss : 0xc0b39fe0 - 0xc0b71c28   ( 224 kB)
Preemptible hierarchical RCU implementation.
Build-time adjustment of leaf fanout to 32.
RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
NR_IRQS:16 nr_irqs:16 16
efuse mapped to cf800000
ps7-slcr mapped to cf802000
L2C: platform modifies aux control register: 0x72360000 -> 0x72760000
L2C: DT/platform modifies aux control register: 0x72360000 -> 0x72760000
L2C-310 erratum 769419 enabled
L2C-310 enabling early BRESP for Cortex-A9
L2C-310 full line of zeros enabled for Cortex-A9
L2C-310 ID prefetch enabled, offset 1 lines
L2C-310 dynamic clock gating enabled, standby mode enabled
L2C-310 cache controller enabled, 8 ways, 512 kB
L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x76760001
zynq_clock_init: clkc starts at cf802100
Zynq clock init
sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every 4398046511103ns
clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles: 0x4ce07af025, max_idle_ns: 440795209040 ns
Switching to timer-based delay loop, resolution 3ns
clocksource: ttc_clocksource: mask: 0xffff max_cycles: 0xffff, max_idle_ns: 537538477 ns
ps7-ttc #0 at cf80a000, irq=18
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency.. 666.66 BogoMIPS (lpj=3333333)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x100000 - 0x100058
CPU1: failed to boot: -1
Brought up 1 CPUs
SMP: Total of 1 processors activated (666.66 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
cpuidle: using governor menu
hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 4 bytes.
zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xcf880000
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
media: Linux media interface: v0.10
Linux video capture interface: v2.00
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
PTP clock support registered
EDAC MC: Ver: 3.0.0
Advanced Linux Sound Architecture Driver Initialized.
clocksource: Switched to clocksource arm_global_timer
NET: Registered protocol family 2
TCP established hash table entries: 2048 (order: 1, 8192 bytes)
TCP bind hash table entries: 2048 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
UDP hash table entries: 256 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
PCI: CLS 0 bytes, default 64
Trying to unpack rootfs image as initramfs...
rootfs image is not initramfs (no cpio magic); looks like an initrd
Freeing initrd memory: 12920K (cce62000 - cdb00000)
hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
futex hash table entries: 512 (order: 3, 32768 bytes)
workingset: timestamp_bits=28 max_order=16 bucket_order=0
jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
dma-pl330 f8003000.ps7-dma: Loaded driver for PL330 DMAC-241330
dma-pl330 f8003000.ps7-dma: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16
e0000000.serial: ttyPS0 at MMIO 0xe0000000 (irq = 158, base_baud = 6249999) is a xuartps
console [ttyPS0] enabled
xdevcfg f8007000.ps7-dev-cfg: ioremap 0xf8007000 to cf86e000
[drm] Initialized drm 1.1.0 20060810
brd: module loaded
loop: module loaded
CAN device driver interface
gpiod_set_value: invalid GPIO
libphy: MACB_mii_bus: probed
macb e000b000.ethernet eth0: Cadence GEM rev 0x00020118 at 0xe000b000 irq 31 (00:0a:35:00:00:00)
Generic PHY e000b000.etherne:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e000b000.etherne:00, irq=-1)
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
i2c /dev entries driver
Xilinx Zynq CpuIdle Driver started
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
sdhci-pltfm: SDHCI platform and OF driver helper
mmc0: SDHCI controller on e0100000.ps7-sdio [e0100000.ps7-sdio] using ADMA
ledtrig-cpu: registered to indicate activity on CPUs
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAEAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
nand: WARNING: pl35x-nand: the ECC used on your system is too weak compared to the one required by the NAND chip
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
6 ofpart partitions found on MTD device pl35x-nand
Creating 6 MTD partitions on "pl35x-nand":
0x000000000000-0x000002800000 : "BOOT.bin-env-dts-kernel"
0x000002800000-0x000004800000 : "ramfs"
0x000004800000-0x000005000000 : "configs"
0x000005000000-0x000006000000 : "reserve"
0x000006000000-0x000008000000 : "ramfs-bak"
0x000008000000-0x000010000000 : "reserve1"
NET: Registered protocol family 10
sit: IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
can: controller area network core (rev 20120528 abi 9)
NET: Registered protocol family 29
can: raw protocol (rev 20120528)
can: broadcast manager protocol (rev 20120528 t)
can: netlink gateway (rev 20130117) max_hops=1
zynq_pm_ioremap: no compatible node found for 'xlnx,zynq-ddrc-a05'
zynq_pm_late_init: Unable to map DDRC IO memory.
Registering SWP/SWPB emulation handler
hctosys: unable to open rtc device (rtc0)
ALSA device list:
  No soundcards found.
RAMDISK: gzip image found at block 0
EXT4-fs (ram0): couldn't mount as ext3 due to feature incompatibilities
EXT4-fs (ram0): mounted filesystem without journal. Opts: (null)
VFS: Mounted root (ext4 filesystem) on device 1:0.
devtmpfs: mounted
Freeing unused kernel memory: 1024K (c0a00000 - c0b00000)
EXT4-fs (ram0): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr
random: dd urandom read with 0 bits of entropy available
ubi0: attaching mtd2
ubi0: scanning is finished
ubi0: attached mtd2 (name "configs", size 8 MiB)
ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi0: good PEBs: 64, bad PEBs: 0, corrupted PEBs: 0
ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 24/12, WL threshold: 4096, image sequence number: 2881768101
ubi0: available PEBs: 0, total reserved PEBs: 64, PEBs reserved for bad PEB handling: 40
ubi0: background thread "ubi_bgt0d" started, PID 708
UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 711
UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "configs"
UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi0:0): FS size: 1396736 bytes (1 MiB, 11 LEBs), journal size 888833 bytes (0 MiB, 5 LEBs)
UBIFS (ubi0:0): reserved for root: 65970 bytes (64 KiB)
UBIFS (ubi0:0): media format: w4/r0 (latest is w4/r0), UUID 2AF94D23-0760-4801-8C27-532C11A81FC1, small LPT model
ubi1: attaching mtd5
ubi1: scanning is finished
ubi1: attached mtd5 (name "reserve1", size 128 MiB)
ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi1: good PEBs: 1020, bad PEBs: 4, corrupted PEBs: 0
ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi1: max/mean erase counter: 383/61, WL threshold: 4096, image sequence number: 3563076859
ubi1: available PEBs: 0, total reserved PEBs: 1020, PEBs reserved for bad PEB handling: 36
ubi1: background thread "ubi_bgt1d" started, PID 720
UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 723
UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "reserve1"
UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi1:0): FS size: 123039744 bytes (117 MiB, 969 LEBs), journal size 6221824 bytes (5 MiB, 49 LEBs)
UBIFS (ubi1:0): reserved for root: 4952683 bytes (4836 KiB)
UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID A21581E4-693A-4BBF-A66A-FCE5ACD2EF7F, small LPT model
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
macb e000b000.ethernet eth0: unable to generate target frequency: 25000000 Hz
macb e000b000.ethernet eth0: link up (100/Full)
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
In axi fpga driver!
request_mem_region OK!
AXI fpga dev virtual address is 0xcfb38000
*base_vir_addr = 0xb023
In fpga mem driver!
request_mem_region OK!
fpga mem virtual address is 0xd2000000
random: nonblocking pool is initialized
2021-04-27 11:31:48 driver-btm-api.c:762:init_freq_mode: This is scan-user version
2021-04-27 11:31:48 driver-btm-api.c:2553:bitmain_soc_init: opt_multi_version     = 1
2021-04-27 11:31:48 driver-btm-api.c:2554:bitmain_soc_init: opt_bitmain_ab        = 1
2021-04-27 11:31:48 driver-btm-api.c:2555:bitmain_soc_init: opt_bitmain_work_mode = 0
2021-04-27 11:31:48 driver-btm-api.c:2556:bitmain_soc_init: Miner compile time: Tue Jun  2 10:11:10 CST 2020 type: Antminer T17+
2021-04-27 11:31:48 driver-btm-api.c:2557:bitmain_soc_init: commit version: 9e9a5bf 2020-06-01 18:48:14, build by: lol 2020-06-02 10:21:50
2021-04-27 11:31:48 driver-btm-api.c:2165:show_sn: no SN got, please write SN to /config/sn
2021-04-27 11:31:48 driver-btm-api.c:2220:handle_sn_for_factory_mode: read sn failed, use fpga-id to replace sn
2021-04-27 11:31:48 driver-btm-api.c:2226:handle_sn_for_factory_mode: read fpga id success 807c250c2b10481c
2021-04-27 11:31:48 fan.c:263:front_fan_power_on: Note: front fan is power on!
2021-04-27 11:31:48 fan.c:273:rear_fan_power_on: Note: rear fan is power on!
2021-04-27 11:31:48 driver-btm-api.c:1357:miner_device_init: Detect 256MB control board of XILINX
2021-04-27 11:31:48 driver-btm-api.c:1298:init_fan_parameter: fan_eft : 1  fan_pwm : 100
2021-04-27 11:31:54 driver-btm-api.c:1281:init_miner_version: miner ID : 807c250c2b10481c
2021-04-27 11:31:54 driver-btm-api.c:1287:init_miner_version: FPGA Version = 0xB023
2021-04-27 11:31:56 driver-btm-api.c:814:get_product_id: product_id[0] = 0
2021-04-27 11:31:56 driver-btm-api.c:845:get_chip_version: chip_version[0] = 1
2021-04-27 11:31:56 driver-btm-api.c:2293:update_conf_by_power_feedback: Power feedback is disabled
2021-04-27 11:31:56 driver-btm-api.c:2297:update_conf_by_power_feedback: get_calibration_voltage, vol:1850.
2021-04-27 11:31:56 frequency.c:1389:adjust_higer_max_vol_table: adjust_higer_max_vol_table, ideal_hashrate = 19019, index = 0, adjust_vol = 50
2021-04-27 11:31:56 thread.c:1462:create_read_nonce_reg_thread: create thread
2021-04-27 11:32:02 driver-btm-api.c:1281:init_miner_version: miner ID : 807c250c2b10481c
2021-04-27 11:32:02 driver-btm-api.c:1287:init_miner_version: FPGA Version = 0xB023
2021-04-27 11:32:04 driver-btm-api.c:814:get_product_id: product_id[0] = 0
2021-04-27 11:32:04 driver-btm-api.c:845:get_chip_version: chip_version[0] = 1
2021-04-27 11:32:04 driver-btm-api.c:1944:get_ccdly_opt: ccdly_opt[0] = 0
2021-04-27 11:32:04 driver-btm-api.c:2439:bitmain_board_init: g_ccdly_opt = 0
2021-04-27 11:32:04 driver-btm-api.c:775:_set_project_type: project:0, set to T17Plus always
2021-04-27 11:32:04 driver-btm-api.c:790:_set_project_type: Project type: Antminer T17+
2021-04-27 11:32:04 driver-btm-api.c:801:dump_pcb_bom_version: Chain [0] PCB Version: 0x0100
2021-04-27 11:32:04 driver-btm-api.c:802:dump_pcb_bom_version: Chain [0] BOM Version: 0x0100
2021-04-27 11:32:09 driver-btm-api.c:2461:bitmain_board_init: Fan check passed.
2021-04-27 11:32:10 board.c:36:jump_and_app_check_restore_pic: chain[0] PIC jump to app
2021-04-27 11:32:12 board.c:40:jump_and_app_check_restore_pic: Check chain[0] PIC fw version=0x88
2021-04-27 11:32:12 thread.c:1457:create_pic_heart_beat_thread: create thread
2021-04-27 11:32:13 power_api.c:231:power_init: power type version: 0x0042
2021-04-27 11:32:13 power_api.c:249:power_init: Power init:
2021-04-27 11:32:13 power_api.c:250:power_init: current_voltage_raw = 0
2021-04-27 11:32:13 power_api.c:251:power_init: highest_voltage_raw = 1950
2021-04-27 11:32:13 power_api.c:252:power_init: working_voltage_raw = 1900
2021-04-27 11:32:13 power_api.c:253:power_init: higher_voltage_raw  = 1930
2021-04-27 11:32:13 power_api.c:254:power_init: check_asic_voltage_raw  = 2000
2021-04-27 11:32:13 driver-btm-api.c:2471:bitmain_board_init: Enter 30s sleep to make sure power release finish.
2021-04-27 11:32:13 power_api.c:218:power_off: init gpio907
2021-04-27 11:32:13 power_api.c:221:power_off: set gpio907 to 1
2021-04-27 11:32:44 power_api.c:206:power_on: set gpio907 to 0
2021-04-27 11:32:45 power_api.c:388:set_to_voltage_by_steps: Set to voltage raw 1850, step by step.
2021-04-27 11:33:02 power_api.c:120:check_voltage_multi: retry time: 0
2021-04-27 11:33:03 power_api.c:56:_get_avg_voltage: an6 raw = 2.516129, factor = 1.006452
2021-04-27 11:33:03 power_api.c:60:_get_avg_voltage: chain 0, an0 = 0.584615, an2 18.732052, an6 2.500000.
2021-04-27 11:33:03 power_api.c:85:_get_avg_voltage: average_voltage = 18.732051
2021-04-27 11:33:03 power_api.c:106:check_voltage: target_vol = 18.50, actural_vol = 18.73, check voltage passed.
2021-04-27 11:33:05 power_api.c:56:_get_avg_voltage: an6 raw = 2.519355, factor = 1.007742
2021-04-27 11:33:05 power_api.c:60:_get_avg_voltage: chain 0, an0 = 0.583867, an2 18.732394, an6 2.500000.
2021-04-27 11:33:05 power_api.c:85:_get_avg_voltage: average_voltage = 18.732393
2021-04-27 11:33:05 power_api.c:106:check_voltage: target_vol = 18.50, actural_vol = 18.73, check voltage passed.
2021-04-27 11:33:05 uart.c:81:set_baud: set fpga_baud = 115200, fpga_divider = 26
2021-04-27 11:33:14 driver-btm-api.c:1162:check_asic_number_with_power_on: Chain[0]: find 44 asic, times 0
2021-04-27 11:33:16 driver-btm-api.c:425:set_order_clock: chain[0]: set order clock, stragegy 3 clock_en=0x1
2021-04-27 11:33:16 driver-btm-api.c:1931:get_core_clock_delay_setting: PWTH_SEL 3, CCDLY_SEL 0
2021-04-27 11:33:16 driver-hash-chip.c:502:set_clock_delay_control: core_data = 0x34
2021-04-27 11:33:16 driver-btm-api.c:1972:check_clock_counter: freq 50 clock_counter_limit 6
2021-04-27 11:33:24 voltage[0] = 1790
2021-04-27 11:33:24 power_api.c:262:set_working_voltage_raw: working_voltage_raw = 1790
2021-04-27 11:33:25 temperature.c:313:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 0 success. NCT218
2021-04-27 11:33:25 uart.c:81:set_baud: set fpga_baud = 6000000, fpga_divider = 3
2021-04-27 11:33:26 driver-btm-api.c:294:check_bringup_temp: Bring up temperature is 34
2021-04-27 11:33:26 thread.c:1477:create_check_miner_status_thread: create thread
2021-04-27 11:33:26 thread.c:1467:create_set_miner_status_thread: create thread
2021-04-27 11:33:26 driver-btm-api.c:712:calculate_timeout: dev->timeout = 245
2021-04-27 11:33:26 thread.c:1447:create_temperature_monitor_thread: create thread
2021-04-27 11:33:26 frequency.c:527:check_bringup_temp_dec_freq: dec freq = 0 when bringup temp = 34 dec_freq_index=0
2021-04-27 11:33:26 freq_tuning.c:184:freq_tuning_get_max_freq: Max freq of tuning is 790
2021-04-27 11:33:26 driver-btm-api.c:1802:send_null_work: [DEBUG] Send null work.
2021-04-27 11:33:26 thread.c:1437:create_asic_status_monitor_thread: create thread
2021-04-27 11:33:26 frequency.c:1194:inc_freq_with_fixed_vco: chain = 255, freq = 640, is_higher_voltage = true
2021-04-27 11:33:26 frequency.c:1210:inc_freq_with_fixed_vco: [0] _POSTDIV1 = 7, _POSTDIV2 = 7, USER_DIV = 1, freq = 52
2021-04-27 11:33:26 power_api.c:388:set_to_voltage_by_steps: Set to voltage raw 1870, step by step.
2021-04-27 11:33:42 power_api.c:120:check_voltage_multi: retry time: 0
2021-04-27 11:33:43 power_api.c:56:_get_avg_voltage: an6 raw = 2.522581, factor = 1.009032
2021-04-27 11:33:43 power_api.c:60:_get_avg_voltage: chain 0, an0 = 18.805627, an2 18.854220, an6 2.500000.
2021-04-27 11:33:43 power_api.c:85:_get_avg_voltage: average_voltage = 18.854220
2021-04-27 11:33:43 power_api.c:106:check_voltage: target_vol = 18.70, actural_vol = 18.85, check voltage passed.
2021-04-27 11:33:45 power_api.c:56:_get_avg_voltage: an6 raw = 2.519355, factor = 1.007742
2021-04-27 11:33:45 power_api.c:60:_get_avg_voltage: chain 0, an0 = 18.829705, an2 18.902688, an6 2.500000.
2021-04-27 11:33:45 power_api.c:85:_get_avg_voltage: average_voltage = 18.902688
2021-04-27 11:33:45 power_api.c:106:check_voltage: target_vol = 18.70, actural_vol = 18.90, check voltage passed.
2021-04-27 11:33:47 frequency.c:1210:inc_freq_with_fixed_vco: [1] _POSTDIV1 = 7, _POSTDIV2 = 6, USER_DIV = 1, freq = 61
2021-04-27 11:33:49 frequency.c:1210:inc_freq_with_fixed_vco: [2] _POSTDIV1 = 6, _POSTDIV2 = 6, USER_DIV = 1, freq = 71
2021-04-27 11:33:51 frequency.c:1210:inc_freq_with_fixed_vco: [3] _POSTDIV1 = 7, _POSTDIV2 = 5, USER_DIV = 1, freq = 73
2021-04-27 11:33:53 frequency.c:1210:inc_freq_with_fixed_vco: [4] _POSTDIV1 = 6, _POSTDIV2 = 5, USER_DIV = 1, freq = 85
2021-04-27 11:33:55 frequency.c:1210:inc_freq_with_fixed_vco: [5] _POSTDIV1 = 7, _POSTDIV2 = 4, USER_DIV = 1, freq = 91
2021-04-27 11:33:57 frequency.c:1210:inc_freq_with_fixed_vco: [6] _POSTDIV1 = 5, _POSTDIV2 = 5, USER_DIV = 1, freq = 102
2021-04-27 11:34:02 frequency.c:1210:inc_freq_with_fixed_vco: [7] _POSTDIV1 = 6, _POSTDIV2 = 4, USER_DIV = 1, freq = 106
2021-04-27 11:34:06 frequency.c:1210:inc_freq_with_fixed_vco: [8] _POSTDIV1 = 7, _POSTDIV2 = 3, USER_DIV = 1, freq = 122
2021-04-27 11:34:11 frequency.c:1210:inc_freq_with_fixed_vco: [9] _POSTDIV1 = 5, _POSTDIV2 = 4, USER_DIV = 1, freq = 128
2021-04-27 11:34:15 frequency.c:1210:inc_freq_with_fixed_vco: [10] _POSTDIV1 = 6, _POSTDIV2 = 3, USER_DIV = 1, freq = 142
2021-04-27 11:34:20 frequency.c:1210:inc_freq_with_fixed_vco: [11] _POSTDIV1 = 4, _POSTDIV2 = 4, USER_DIV = 1, freq = 160
2021-04-27 11:34:28 frequency.c:1210:inc_freq_with_fixed_vco: [12] _POSTDIV1 = 5, _POSTDIV2 = 3, USER_DIV = 1, freq = 170
2021-04-27 11:34:28 power_api.c:388:set_to_voltage_by_steps: Set to voltage raw 1850, step by step.
2021-04-27 11:34:44 power_api.c:120:check_voltage_multi: retry time: 0
2021-04-27 11:34:46 power_api.c:56:_get_avg_voltage: an6 raw = 2.522581, factor = 1.009032
2021-04-27 11:34:46 power_api.c:60:_get_avg_voltage: chain 0, an0 = 18.586957, an2 18.659847, an6 2.500000.
2021-04-27 11:34:46 power_api.c:85:_get_avg_voltage: average_voltage = 18.659846
2021-04-27 11:34:46 power_api.c:106:check_voltage: target_vol = 18.50, actural_vol = 18.66, check voltage passed.
2021-04-27 11:34:48 power_api.c:56:_get_avg_voltage: an6 raw = 2.519355, factor = 1.007742
2021-04-27 11:34:48 power_api.c:60:_get_avg_voltage: chain 0, an0 = 18.610755, an2 18.708066, an6 2.500000.
2021-04-27 11:34:48 power_api.c:85:_get_avg_voltage: average_voltage = 18.708066
2021-04-27 11:34:48 power_api.c:106:check_voltage: target_vol = 18.50, actural_vol = 18.71, check voltage passed.
2021-04-27 11:34:56 frequency.c:1210:inc_freq_with_fixed_vco: [13] _POSTDIV1 = 7, _POSTDIV2 = 2, USER_DIV = 1, freq = 183
2021-04-27 11:35:04 frequency.c:1210:inc_freq_with_fixed_vco: [14] _POSTDIV1 = 4, _POSTDIV2 = 3, USER_DIV = 1, freq = 213
2021-04-27 11:35:16 frequency.c:1210:inc_freq_with_fixed_vco: [15] _POSTDIV1 = 6, _POSTDIV2 = 2, USER_DIV = 1, freq = 213
2021-04-27 11:35:29 frequency.c:1210:inc_freq_with_fixed_vco: [16] _POSTDIV1 = 5, _POSTDIV2 = 2, USER_DIV = 1, freq = 256
2021-04-27 11:35:47 frequency.c:1210:inc_freq_with_fixed_vco: [17] _POSTDIV1 = 4, _POSTDIV2 = 2, USER_DIV = 1, freq = 320
2021-04-27 11:36:11 frequency.c:1210:inc_freq_with_fixed_vco: [18] _POSTDIV1 = 7, _POSTDIV2 = 1, USER_DIV = 1, freq = 366
2021-04-27 11:36:43 frequency.c:1210:inc_freq_with_fixed_vco: [19] _POSTDIV1 = 6, _POSTDIV2 = 1, USER_DIV = 1, freq = 427
2021-04-27 11:37:24 frequency.c:1210:inc_freq_with_fixed_vco: [20] _POSTDIV1 = 5, _POSTDIV2 = 1, USER_DIV = 1, freq = 512
2021-04-27 11:37:34 thread.c:1161:check_temperature: over max temp, pcb temp 68 (max 80), chip temp 105(max 103)
2021-04-27 11:37:34 driver-btm-api.c:222:set_miner_status: ERROR_TEMP_TOO_HIGH
2021-04-27 11:37:34 driver-btm-api.c:156:stop_mining: stop mining: over max temp
2021-04-27 11:37:34 thread.c:1487:cancel_temperature_monitor_thread: cancel thread
2021-04-27 11:37:34 thread.c:1502:cancel_read_nonce_reg_thread: cancel thread
2021-04-27 11:37:34 thread.c:1442:cancel_asic_status_monitor_thread: cancel thread
2021-04-27 11:37:34 driver-btm-api.c:142:killall_hashboard: ****power off hashboard****
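
The key line in the log above is the `check_temperature` message, which reports both the PCB and chip readings against their limits. A minimal Python sketch for pulling those numbers out is shown below. The regular expression and the function name are assumptions written for this one message format as it appears in this thread; other firmware versions may word it differently.

```python
import re

# The over-temperature line, copied verbatim from the kernel log above.
LINE = (
    "2021-04-27 11:37:34 thread.c:1161:check_temperature: over max temp, "
    "pcb temp 68 (max 80), chip temp 105(max 103)"
)

# Hypothetical pattern for this message; note the log has no space before "(max 103)".
PATTERN = re.compile(
    r"pcb temp (\d+)\s*\(max (\d+)\), chip temp (\d+)\s*\(max (\d+)\)"
)


def parse_over_temp(line):
    """Return the margin to each limit in degrees, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    pcb, pcb_max, chip, chip_max = map(int, match.groups())
    return {
        "pcb_margin": pcb_max - pcb,     # positive means still below the limit
        "chip_margin": chip_max - chip,  # negative means the limit was exceeded
    }


result = parse_over_temp(LINE)
print(result)
```

Here the PCB reading is still 12 degrees under its limit while the chip reading is 2 degrees over, so the shutdown is triggered by the chip temperature alone.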
sr. member
Activity: 446
Merit: 347
April 23, 2021, 11:22:24 AM
#7
Hi, I see that you are using another firmware. I suggest one thing: I see that it starts in silent mode, which is not very good for diagnosis...

Please find my firmware for the T17+ below; it is based entirely on the original firmware. Try installing it, force the fans to 100%, and start with a very low speed, 400 MHz and 1600 mV, to see how it behaves. Then post your results here, with a photo of the miner status page if it works well.



https://drive.google.com/drive/folders/1yl0fym6ezHnKptGaHrWB6W2Rl2p9nwNp
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 21, 2021, 06:40:12 PM
#6
So you are running with only one hashboard in the miner? If so, the cooling will not be ideal. The air will take the path of least resistance, which is right down the empty slots instead of being forced through the heatsinks.

While that is true, it's unlikely to impact the temps to this degree. Based on the kernel log, the starting temp is pretty low and within 2 minutes it goes to 90c, which is abnormal.

OP, can you set the fans to 100% so we can confirm whether this is indeed a temp issue? If it is, the miner should run fine with the fans at full speed, or in the worst-case scenario it should run a little longer before reaching the safety threshold of 90c.
full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
April 21, 2021, 12:42:09 PM
#5
AntMiner

Chain#   ASIC#   Frequency(avg)   Voltage   Consumption (W)   GH/S(ideal)   GH/S(RT)   Errors(HW)   Temp(PCB)   Temp(Chip)   

3               44                        450                  16.5   494   13,305.60   7,857.16   0   43-49-47-53   71-61-80-62   Auto-tuning
full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
April 21, 2021, 12:39:53 PM
#4

[2021/04/21 17:34:25] INFO: fan[0] - OK
[2021/04/21 17:34:25] INFO: fan[1] - OK
[2021/04/21 17:34:25] INFO: fan[2] - OK
[2021/04/21 17:34:25] INFO: fan[3] - OK
[2021/04/21 17:34:31] INFO: Power ON
[2021/04/21 17:34:33] INFO: Starting FPGA queue
[2021/04/21 17:34:33] INFO: Initializing hash boards
[2021/04/21 17:34:33] INFO: chain[2] - Initializing
[2021/04/21 17:34:46] INFO: chain[2] - 44 chips detected
[2021/04/21 17:34:51] INFO: Start-up temperature is 32 C (min -15 C)
[2021/04/21 17:34:51] INFO: Activating silent start-up mode
[2021/04/21 17:34:51] INFO: Switching to warm-up fan control (60 C)
[2021/04/21 17:34:51] INFO: Changing voltage from 19500 to 16500 mV gradually
[2021/04/21 17:35:31] INFO: Raising freq from 50 to 450 Mhz gradually
[2021/04/21 17:35:40] INFO: Switching to automatic fan control (75 C)
[2021/04/21 17:35:40] INFO: Start mining!
[2021/04/21 17:36:40] INFO: Changing voltage from 16500 to 16300 mV gradually
[2021/04/21 17:37:01] INFO: Changing voltage from 16300 to 16200 mV gradually
[2021/04/21 17:38:29] WARN: chain[2] - Overheated, chip temp=90
[2021/04/21 17:38:29] WARN: Switching to emergency fan control
[2021/04/21 17:38:29] INFO: chain[2] - Shutting down the chain
[2021/04/21 17:38:32] WARN: No working chains
[2021/04/21 17:38:32] INFO: Shutting down the miner
[2021/04/21 17:38:34] INFO: Stopping FPGA queue
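
To put a number on how quickly the chain overheats, the timestamps in the log above can be subtracted directly. The Python sketch below copies three lines from that log and measures the time from "Start mining!" to the overheat warning. The helper name `timestamp_of` is made up for the example, and it assumes every line starts with a `[YYYY/MM/DD HH:MM:SS]` stamp as shown here.

```python
from datetime import datetime

# Three lines copied verbatim from the log above.
LOG = """\
[2021/04/21 17:34:51] INFO: Start-up temperature is 32 C (min -15 C)
[2021/04/21 17:35:40] INFO: Start mining!
[2021/04/21 17:38:29] WARN: chain[2] - Overheated, chip temp=90
"""


def timestamp_of(log, needle):
    """Return the timestamp of the first log line that contains `needle`."""
    for line in log.splitlines():
        if needle in line:
            stamp = line[1:line.index("]")]  # text between the square brackets
            return datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
    raise ValueError("no log line contains " + repr(needle))


elapsed = timestamp_of(LOG, "Overheated") - timestamp_of(LOG, "Start mining!")
print(elapsed.total_seconds())  # 169.0
```

That is 2 minutes 49 seconds of mining before the 90 C shutdown.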
hero member
Activity: 544
Merit: 589
April 20, 2021, 05:28:29 PM
#3
So you are running with only one hashboard in the miner? If so, the cooling will not be ideal. The air will take the path of least resistance, which is right down the empty slots instead of being forced through the heatsinks.
legendary
Activity: 3472
Merit: 3217
Playbet.io - Crypto Casino and Sportsbook
April 20, 2021, 05:09:29 PM
#2
Did you overclock the miner recently?

I heard of someone who had the same problem after cleaning the hashboard, but they pointed to the PSU as the cause because it was full of dirt. Cleaning the PSU solved their problem.

Check carefully that the power supply on chain2 is well attached (or try reattaching the PSU cable on that hashboard).
Also check both the front and back fans; maybe one of the fans is not running or is running slow.

Try resetting the miner and then run it again.
full member
Activity: 228
Merit: 101
NEM (XEM) Top Coin
April 20, 2021, 12:04:55 AM
#1
I have an Antminer T17+ 55.
It worked until I moved one hashboard, which had been working correctly, to a different slot in the miner (the other 2 hashboards were removed for maintenance).
Now I get this error: (chain2 -overheated, chip temp=90)
What is the problem?

-----------------------------------------------------------

fan(0) - ok
fan(1) - ok
fan(2) - ok
fan(3) - ok
Power On
chain2 ...initializing
chain 2...44 chips detected
start mining!
chain2 -overheated, chip temp=90
switching to emergency fan control
chain2 - shutting down the chain.
shutting down the miner
Power off

--------------------------------------------------------