Topic: New Antminer T17 issues (Read 869 times)

legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
July 29, 2020, 11:46:01 PM
#29
Not sure how I missed this kernel log; anyway, the problem is below.

Code:
2020-07-24 11:35:12 temperature.c:268:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 0 success.
2020-07-24 11:35:13 temperature.c:268:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 1 success.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 driver-btm-api.c:197:set_miner_status: ERROR_TEMP_LOST

Your miner has a total of 12 temp sensors, 4 on each board. Right now it simply can't communicate with any of the 4 sensors on the 3rd board (usually the leftmost one). Based on information I have collected and confirmed from different sources, including my own and other members' experience, it's unlikely that all 4 sensors went bad. It's more likely that one of the chips/heatsinks lost conductivity to the PCB and that this causes the error. There are a few things you can try; chances are slim, but there is always hope.
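
If you want to check a long kernel log for this quickly, here is a minimal sketch in Python. It assumes the stock firmware prints the failure in exactly the format quoted above; the function name and the sample text are only illustrative, not part of any Bitmain tool.

```python
import re
from collections import Counter

# Matches the stock-firmware line quoted above, for example:
# "... Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry."
WRONG_TYPE = re.compile(
    r"Wrong temp sensor type, chain = (\d+), sensor = (\d+), type = (0x[0-9a-fA-F]+)"
)


def count_sensor_failures(log_text):
    """Return a Counter mapping (chain, sensor) to the number of failed reads."""
    failures = Counter()
    for line in log_text.splitlines():
        match = WRONG_TYPE.search(line)
        if match:
            chain, sensor = int(match.group(1)), int(match.group(2))
            failures[(chain, sensor)] += 1
    return failures


# Sample lines copied from the log in this thread.
SAMPLE = """\
2020-07-24 11:35:13 temperature.c:268:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 1 success.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 driver-btm-api.c:197:set_miner_status: ERROR_TEMP_LOST
"""

if __name__ == "__main__":
    for (chain, sensor), count in sorted(count_sensor_failures(SAMPLE).items()):
        print(f"chain {chain} sensor {sensor}: {count} failed reads")
```

Paste the full kernel log in place of SAMPLE to see which chain and sensor the firmware gave up on.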
legendary
Activity: 4326
Merit: 8950
'The right to privacy matters'
July 24, 2020, 11:19:10 AM
#28
do not upgrade your firmware

please let us know what firmware is in the machine.

please look at mikeywith's thread on thermal sensors.
newbie
Activity: 1
Merit: 0
July 24, 2020, 06:51:21 AM
#27
Code:
Booting Linux on physical CPU 0x0
Linux version 4.6.0-xilinx-gff8137b-dirty (lzq@armdev2) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-23) ) #25 SMP PREEMPT Fri Nov 23 15:30:52 CST 2018
CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Xilinx Zynq
cma: Reserved 16 MiB at 0x0e000000
Memory policy: Data cache writealloc
On node 0 totalpages: 61440
free_area_init_node: node 0, pgdat c0b39280, node_mem_map cde10000
  Normal zone: 480 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 61440 pages, LIFO batch:15
percpu: Embedded 12 pages/cpu @cddf1000 s19776 r8192 d21184 u49152
pcpu-alloc: s19776 r8192 d21184 u49152 alloc=12*4096
pcpu-alloc: [0] 0 [0] 1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 60960
Kernel command line: mem=240M console=ttyPS0,115200 ramdisk_size=33554432 root=/dev/ram rw earlyprintk
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 195068K/245760K available (6345K kernel code, 231K rwdata, 1896K rodata, 1024K init, 223K bss, 34308K reserved, 16384K cma-reserved, 0K highmem)
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    vmalloc : 0xcf800000 - 0xff800000   ( 768 MB)
    lowmem  : 0xc0000000 - 0xcf000000   ( 240 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0xc0008000 - 0xc090c424   (9234 kB)
      .init : 0xc0a00000 - 0xc0b00000   (1024 kB)
      .data : 0xc0b00000 - 0xc0b39fe0   ( 232 kB)
       .bss : 0xc0b39fe0 - 0xc0b71c28   ( 224 kB)
Preemptible hierarchical RCU implementation.
Build-time adjustment of leaf fanout to 32.
RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
NR_IRQS:16 nr_irqs:16 16
efuse mapped to cf800000
ps7-slcr mapped to cf802000
L2C: platform modifies aux control register: 0x72360000 -> 0x72760000
L2C: DT/platform modifies aux control register: 0x72360000 -> 0x72760000
L2C-310 erratum 769419 enabled
L2C-310 enabling early BRESP for Cortex-A9
L2C-310 full line of zeros enabled for Cortex-A9
L2C-310 ID prefetch enabled, offset 1 lines
L2C-310 dynamic clock gating enabled, standby mode enabled
L2C-310 cache controller enabled, 8 ways, 512 kB
L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x76760001
zynq_clock_init: clkc starts at cf802100
Zynq clock init
sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every 4398046511103ns
clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles: 0x4ce07af025, max_idle_ns: 440795209040 ns
Switching to timer-based delay loop, resolution 3ns
clocksource: ttc_clocksource: mask: 0xffff max_cycles: 0xffff, max_idle_ns: 537538477 ns
ps7-ttc #0 at cf80a000, irq=18
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency.. 666.66 BogoMIPS (lpj=3333333)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x100000 - 0x100058
CPU1: failed to boot: -1
Brought up 1 CPUs
SMP: Total of 1 processors activated (666.66 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
cpuidle: using governor menu
hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 4 bytes.
zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xcf880000
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
media: Linux media interface: v0.10
Linux video capture interface: v2.00
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
PTP clock support registered
EDAC MC: Ver: 3.0.0
Advanced Linux Sound Architecture Driver Initialized.
clocksource: Switched to clocksource arm_global_timer
NET: Registered protocol family 2
TCP established hash table entries: 2048 (order: 1, 8192 bytes)
TCP bind hash table entries: 2048 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
UDP hash table entries: 256 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
PCI: CLS 0 bytes, default 64
Trying to unpack rootfs image as initramfs...
rootfs image is not initramfs (no cpio magic); looks like an initrd
Freeing initrd memory: 21268K (cc63c000 - cdb01000)
hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
futex hash table entries: 512 (order: 3, 32768 bytes)
workingset: timestamp_bits=28 max_order=16 bucket_order=0
jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
dma-pl330 f8003000.ps7-dma: Loaded driver for PL330 DMAC-241330
dma-pl330 f8003000.ps7-dma: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16
e0000000.serial: ttyPS0 at MMIO 0xe0000000 (irq = 158, base_baud = 6249999) is a xuartps
console [ttyPS0] enabled
xdevcfg f8007000.ps7-dev-cfg: ioremap 0xf8007000 to cf86e000
[drm] Initialized drm 1.1.0 20060810
brd: module loaded
loop: module loaded
CAN device driver interface
gpiod_set_value: invalid GPIO
libphy: MACB_mii_bus: probed
macb e000b000.ethernet eth0: Cadence GEM rev 0x00020118 at 0xe000b000 irq 31 (00:0a:35:00:00:00)
Generic PHY e000b000.etherne:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e000b000.etherne:00, irq=-1)
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
i2c /dev entries driver
Xilinx Zynq CpuIdle Driver started
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
sdhci-pltfm: SDHCI platform and OF driver helper
mmc0: SDHCI controller on e0100000.ps7-sdio [e0100000.ps7-sdio] using ADMA
ledtrig-cpu: registered to indicate activity on CPUs
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAGAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
nand: WARNING: pl35x-nand: the ECC used on your system is too weak compared to the one required by the NAND chip
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
6 ofpart partitions found on MTD device pl35x-nand
Creating 6 MTD partitions on "pl35x-nand":
0x000000000000-0x000002800000 : "BOOT.bin-env-dts-kernel"
0x000002800000-0x000004800000 : "ramfs"
0x000004800000-0x000005000000 : "configs"
0x000005000000-0x000006000000 : "reserve"
0x000006000000-0x000008000000 : "ramfs-bak"
0x000008000000-0x000010000000 : "reserve1"
NET: Registered protocol family 10
sit: IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
can: controller area network core (rev 20120528 abi 9)
NET: Registered protocol family 29
can: raw protocol (rev 20120528)
can: broadcast manager protocol (rev 20120528 t)
can: netlink gateway (rev 20130117) max_hops=1
zynq_pm_ioremap: no compatible node found for 'xlnx,zynq-ddrc-a05'
zynq_pm_late_init: Unable to map DDRC IO memory.
Registering SWP/SWPB emulation handler
hctosys: unable to open rtc device (rtc0)
ALSA device list:
  No soundcards found.
RAMDISK: gzip image found at block 0
EXT4-fs (ram0): couldn't mount as ext3 due to feature incompatibilities
EXT4-fs (ram0): mounted filesystem without journal. Opts: (null)
VFS: Mounted root (ext4 filesystem) on device 1:0.
devtmpfs: mounted
Freeing unused kernel memory: 1024K (c0a00000 - c0b00000)
EXT4-fs (ram0): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr
random: dd urandom read with 0 bits of entropy available
ubi0: attaching mtd2
ubi0: scanning is finished
ubi0: attached mtd2 (name "configs", size 8 MiB)
ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi0: good PEBs: 64, bad PEBs: 0, corrupted PEBs: 0
ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 167/91, WL threshold: 4096, image sequence number: 516415262
ubi0: available PEBs: 0, total reserved PEBs: 64, PEBs reserved for bad PEB handling: 40
ubi0: background thread "ubi_bgt0d" started, PID 708
UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 711
UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "configs"
UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi0:0): FS size: 1396736 bytes (1 MiB, 11 LEBs), journal size 888833 bytes (0 MiB, 5 LEBs)
UBIFS (ubi0:0): reserved for root: 65970 bytes (64 KiB)
UBIFS (ubi0:0): media format: w4/r0 (latest is w4/r0), UUID D4BCF222-F8DB-4C61-9F03-52AFC11041D8, small LPT model
ubi1: attaching mtd5
ubi1: scanning is finished
ubi1: attached mtd5 (name "reserve1", size 128 MiB)
ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi1: good PEBs: 1020, bad PEBs: 4, corrupted PEBs: 0
ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi1: max/mean erase counter: 229/66, WL threshold: 4096, image sequence number: 3596985238
ubi1: available PEBs: 0, total reserved PEBs: 1020, PEBs reserved for bad PEB handling: 36
ubi1: background thread "ubi_bgt1d" started, PID 720
UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 723
UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "reserve1"
UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi1:0): FS size: 123039744 bytes (117 MiB, 969 LEBs), journal size 6221824 bytes (5 MiB, 49 LEBs)
UBIFS (ubi1:0): reserved for root: 4952683 bytes (4836 KiB)
UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID 8F961CAE-B1CF-45B5-B664-BC1F6B7EBEFD, small LPT model
alloc_contig_range: [e042, e043) PFNs busy
alloc_contig_range: [e042, e043) PFNs busy
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
alloc_contig_range: [e042, e043) PFNs busy
alloc_contig_range: [e042, e043) PFNs busy
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
macb e000b000.ethernet eth0: unable to generate target frequency: 25000000 Hz
macb e000b000.ethernet eth0: link up (100/Full)
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
In axi fpga driver!
request_mem_region OK!
AXI fpga dev virtual address is 0xcfb38000
*base_vir_addr = 0xab013
In fpga mem driver!
request_mem_region OK!
fpga mem virtual address is 0xd2000000
random: nonblocking pool is initialized
2020-07-24 11:33:23 driver-btm-api.c:782:init_freq_mode: This is scan-user version
2020-07-24 11:33:23 driver-btm-api.c:2082:bitmain_soc_init: opt_multi_version     = 1
2020-07-24 11:33:23 driver-btm-api.c:2083:bitmain_soc_init: opt_bitmain_ab        = 1
2020-07-24 11:33:23 driver-btm-api.c:2084:bitmain_soc_init: opt_bitmain_work_mode = 0
2020-07-24 11:33:23 driver-btm-api.c:2085:bitmain_soc_init: Miner compile time: Mon Jul 29 00:19:06 CST 2019 type: Antminer T17
2020-07-24 11:33:23 driver-btm-api.c:2086:bitmain_soc_init: commit version: e33ca6d 2019-07-18 12:39:03, build by: lol 2019-07-29 00:27:27
2020-07-24 11:33:23 driver-btm-api.c:1937:show_sn: no SN got, please write SN to /nvdata/sn
2020-07-24 11:33:23 driver-btm-api.c:1271:miner_device_init: Detect 256MB control board of XILINX
2020-07-24 11:33:23 driver-btm-api.c:1219:init_fan_parameter: fan_eft : 0  fan_pwm : 0
2020-07-24 11:33:23 thread.c:745:create_read_nonce_reg_thread: create thread
2020-07-24 11:33:29 driver-btm-api.c:1203:init_miner_version: miner ID : 8058e4217b808854
2020-07-24 11:33:29 driver-btm-api.c:1209:init_miner_version: FPGA Version = 0xB013
2020-07-24 11:33:31 eeprom.c:425:check_pattern_test_level: invalid pattern test result. ignore
2020-07-24 11:33:32 eeprom.c:425:check_pattern_test_level: invalid pattern test result. ignore
2020-07-24 11:33:34 eeprom.c:425:check_pattern_test_level: invalid pattern test result. ignore
2020-07-24 11:33:34 driver-btm-api.c:849:get_product_id: product_id[0] = 1
2020-07-24 11:33:34 driver-btm-api.c:849:get_product_id: product_id[1] = 1
2020-07-24 11:33:34 driver-btm-api.c:849:get_product_id: product_id[2] = 1
2020-07-24 11:33:34 driver-btm-api.c:1760:get_ccdly_opt: ccdly_opt[0] = 1
2020-07-24 11:33:34 driver-btm-api.c:1760:get_ccdly_opt: ccdly_opt[1] = 1
2020-07-24 11:33:34 driver-btm-api.c:1760:get_ccdly_opt: ccdly_opt[2] = 1
2020-07-24 11:33:34 driver-btm-api.c:1995:bitmain_board_init: g_ccdly_opt = 1
2020-07-24 11:33:34 driver-btm-api.c:795:_set_project_type: project:2
2020-07-24 11:33:34 driver-btm-api.c:825:_set_project_type: Project type: Antminer T17
2020-07-24 11:33:34 driver-btm-api.c:836:dump_pcb_bom_version: Chain [0] PCB Version: 0x0100
2020-07-24 11:33:34 driver-btm-api.c:837:dump_pcb_bom_version: Chain [0] BOM Version: 0x0100
2020-07-24 11:33:34 driver-btm-api.c:836:dump_pcb_bom_version: Chain [1] PCB Version: 0x0100
2020-07-24 11:33:34 driver-btm-api.c:837:dump_pcb_bom_version: Chain [1] BOM Version: 0x0100
2020-07-24 11:33:34 driver-btm-api.c:836:dump_pcb_bom_version: Chain [2] PCB Version: 0x0100
2020-07-24 11:33:34 driver-btm-api.c:837:dump_pcb_bom_version: Chain [2] BOM Version: 0x0100
2020-07-24 11:33:35 driver-btm-api.c:2015:bitmain_board_init: Fan check passed.
2020-07-24 11:33:37 board.c:36:jump_and_app_check_restore_pic: chain[0] PIC jump to app
2020-07-24 11:33:41 board.c:40:jump_and_app_check_restore_pic: Check chain[0] PIC fw version=0xb9
2020-07-24 11:33:42 board.c:36:jump_and_app_check_restore_pic: chain[1] PIC jump to app
2020-07-24 11:33:46 board.c:40:jump_and_app_check_restore_pic: Check chain[1] PIC fw version=0xb9
2020-07-24 11:33:47 board.c:36:jump_and_app_check_restore_pic: chain[2] PIC jump to app
2020-07-24 11:33:51 board.c:40:jump_and_app_check_restore_pic: Check chain[2] PIC fw version=0xb9
2020-07-24 11:33:51 thread.c:740:create_pic_heart_beat_thread: create thread
2020-07-24 11:33:51 power_api.c:55:power_init: power init ...
2020-07-24 11:33:51 driver-btm-api.c:2025:bitmain_board_init: Enter 30s sleep to make sure power release finish.
2020-07-24 11:34:23 power_api.c:231:set_iic_power_to_highest_voltage: setting to voltage: 17.00 ...
2020-07-24 11:34:29 power_api.c:123:check_voltage_multi: retry time: 0
2020-07-24 11:34:30 power_api.c:85:get_average_voltage: chain[0], voltage is: 17.095547
2020-07-24 11:34:32 power_api.c:85:get_average_voltage: chain[1], voltage is: 17.156777
2020-07-24 11:34:34 power_api.c:85:get_average_voltage: chain[2], voltage is: 17.107793
2020-07-24 11:34:34 power_api.c:96:get_average_voltage: aveage voltage is: 17.120039
2020-07-24 11:34:34 power_api.c:181:set_iic_power_by_voltage: now set voltage to : 17.000000
2020-07-24 11:34:34 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[0]: chip baud = 115200, chip_divider = 26
2020-07-24 11:34:34 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[1]: chip baud = 115200, chip_divider = 26
2020-07-24 11:34:34 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[2]: chip baud = 115200, chip_divider = 26
2020-07-24 11:34:34 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2020-07-24 11:34:45 driver-btm-api.c:1146:check_asic_number_with_power_on: Chain[0]: find 30 asic, times 0
2020-07-24 11:34:55 driver-btm-api.c:1146:check_asic_number_with_power_on: Chain[1]: find 30 asic, times 0
2020-07-24 11:35:06 driver-btm-api.c:1146:check_asic_number_with_power_on: Chain[2]: find 30 asic, times 0
2020-07-24 11:35:09 driver-btm-api.c:339:set_order_clock: chain[0]: set order clock, stragegy 3 clock_en=0x1
2020-07-24 11:35:09 driver-btm-api.c:339:set_order_clock: chain[1]: set order clock, stragegy 3 clock_en=0x1
2020-07-24 11:35:09 driver-btm-api.c:339:set_order_clock: chain[2]: set order clock, stragegy 3 clock_en=0x1
2020-07-24 11:35:10 driver-hash-chip.c:490:set_clock_delay_control: core_data = 0xb4
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[0]: chip baud = 3000000, chip_divider = 0
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[1]: chip baud = 3000000, chip_divider = 0
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[2]: chip baud = 3000000, chip_divider = 0
2020-07-24 11:35:10 uart.c:80:set_baud: set fpga_baud = 3000000, fpga_divider = 0
2020-07-24 11:35:10 driver-btm-api.c:1787:check_clock_counter: freq 50 clock_counter_limit 6
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[0]: chip baud = 115200, chip_divider = 26
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[1]: chip baud = 115200, chip_divider = 26
2020-07-24 11:35:10 driver-hash-chip.c:233:dhash_chip_set_baud_v2: chain[2]: chip baud = 115200, chip_divider = 26
2020-07-24 11:35:10 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2020-07-24 11:35:10 voltage[0] = 1690
2020-07-24 11:35:10 voltage[1] = 1690
2020-07-24 11:35:10 voltage[2] = 1690
2020-07-24 11:35:10 power_api.c:139:set_working_voltage: working_voltage = 16.900000
2020-07-24 11:35:12 temperature.c:268:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 0 success.
2020-07-24 11:35:13 temperature.c:268:calibrate_temp_sensor_one_chain: Temperature sensor calibration: chain 1 success.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:14 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 temperature.c:189:is_temp_sensor_type_correct: Wrong temp sensor type, chain = 2, sensor = 6, type = 0x0, retry.
2020-07-24 11:35:15 driver-btm-api.c:197:set_miner_status: ERROR_TEMP_LOST
2020-07-24 11:35:15 driver-btm-api.c:138:stop_mining: stop mining: Can't get temperature sensor type!
2020-07-24 11:35:15 thread.c:790:cancel_read_nonce_reg_thread: cancel thread
2020-07-24 11:35:15 driver-btm-api.c:124:killall_hashboard: ****power off hashboard****
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 29, 2020, 08:12:10 PM
#26
Hey nice idea. 1 question, is this issue only related to 17 series? Did you find out that any 19 series had any similar issues?

I don't think anyone except Bitmain runs any Antminer 19 series; as far as I know, they have not been delivered yet. We still have a few months to go before we start learning about the common issues of this new gear. In fact, we have only recently started to encounter all these issues with the 17 series, let alone the 19s, but I am willing to bet that they will not be much better as far as robustness is concerned.
jr. member
Activity: 43
Merit: 59
April 29, 2020, 06:46:57 PM
#25
I started a thread regarding the temp sensors, i have combined all the information available to me so far, will be adding more info, feel free to join and help.

Hey nice idea. 1 question, is this issue only related to 17 series? Did you find out that any 19 series had any similar issues?
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 29, 2020, 04:41:24 PM
#24
If anyone with more knowledge could assist here and save us from this Bitmain theft that would be awesome.

I started a thread regarding the temp sensors, i have combined all the information available to me so far, will be adding more info, feel free to join and help.
jr. member
Activity: 43
Merit: 59
April 29, 2020, 01:11:08 PM
#23
Agreed, on both of my T17+ machines same issue has occurred. I have contacted Bitmain but i think sending a half working machine is not worth it.

It is possible to semi-fix it by heating up the machine. I am not sure how to fix or bypass the sensors manually, as I don't even know where they are or what they look like. The only thing I know is that there are 4 in total per board. If anyone with more knowledge could assist here and save us from this Bitmain theft that would be awesome.
sr. member
Activity: 808
Merit: 294
Created AutoTune to saved the planet! ~USA
April 27, 2020, 06:14:12 PM
#22
Yes, no effect. Later on it actually managed to get operational for 12+ hours when using a higher frequency. It just depends on if the temp sensors fail or not.

I will write to taserz, as this looks to be a common issue for T17 units, and it would be good to have an option to disable the check, at least for troubleshooting.

Looks like a shit temp sensor. It's extremely common: the 17 series has an RMA rate in the 15-20% range when it is normally sub 5%. It sucks. Contact Bitmain or try the repair yourself, but it won't be fun.
jr. member
Activity: 43
Merit: 59
April 22, 2020, 05:37:28 AM
#21
Yes, no effect. Later on it actually managed to get operational for 12+ hours when using a higher frequency. It just depends on if the temp sensors fail or not.

I will write to taserz, as this looks to be a common issue for T17 units, and it would be good to have an option to disable the check, at least for troubleshooting.
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 21, 2020, 11:11:09 AM
#20
I am 90% sure this issue is related to temperature sensor. Is there a way we can try and disable the sensor check on that chain? Yes i am aware of a chance of miner catching fire if heatsink becomes loose.

You can post in the asic.to thread or reach out to taserz. If he does not reply, let me know and I will contact him off-forum.

have you actually tried to set a lower frequency on the bad chain?
jr. member
Activity: 43
Merit: 59
April 20, 2020, 11:17:06 PM
#19
Ok so i switched to asic.to firmware. Same thing occurred. I have noticed:

Code:
[2020/04/21 04:11:03] ERROR: src/temp.c:218 chain[0] sen[2] - Lost, no updates for 10 sec
[2020/04/21 04:11:03] ERROR: src/temp.c:218 chain[0] sen[3] - Lost, no updates for 10 sec
[2020/04/21 04:11:03] WARN: chain[0] - 2 sensor(s) reported their temps!
[2020/04/21 04:11:04] ERROR: src/temp.c:218 chain[0] sen[0] - Lost, no updates for 10 sec
[2020/04/21 04:11:04] ERROR: src/temp.c:218 chain[0] sen[1] - Lost, no updates for 10 sec
[2020/04/21 04:11:04] WARN: chain[0] - 0 sensor(s) reported their temps!
[2020/04/21 04:11:04] ERROR: driver-btm-chain.c:950 chain[0] - Failed to read temp from all sensors!
[2020/04/21 04:11:04] INFO: chain[0] - Shutting down the chain

Also the main issue olymps2020 posted had same errors regarding temperature sensors:

Code:
temperature.c:697:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 64, reg = 0

I am 90% sure this issue is related to temperature sensor. Is there a way we can try and disable the sensor check on that chain? Yes i am aware of a chance of miner catching fire if heatsink becomes loose.
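
For anyone who wants to check the same thing on their own unit, here is a rough Python sketch that lists which sensors the asic.to firmware reported as lost on each chain. It assumes the log format shown above; the function names are made up for illustration and are not part of the firmware.

```python
import re
from collections import defaultdict

SENSORS_PER_BOARD = 4  # per this thread: 4 temp sensors on each hash board

# Matches lines such as:
# [2020/04/21 04:11:03] ERROR: src/temp.c:218 chain[0] sen[2] - Lost, no updates for 10 sec
LOST = re.compile(r"chain\[(\d+)\] sen\[(\d+)\] - Lost")


def lost_sensors(log_text):
    """Map each chain index to the sorted list of sensors reported as lost."""
    lost = defaultdict(set)
    for line in log_text.splitlines():
        match = LOST.search(line)
        if match:
            lost[int(match.group(1))].add(int(match.group(2)))
    return {chain: sorted(sensors) for chain, sensors in lost.items()}


def fully_dark_chains(log_text):
    """Chains where every sensor was lost, i.e. the firmware has no temperature data at all."""
    return [
        chain
        for chain, sensors in sorted(lost_sensors(log_text).items())
        if len(sensors) >= SENSORS_PER_BOARD
    ]


# Sample lines copied from the log in this post.
SAMPLE = """\
[2020/04/21 04:11:03] ERROR: src/temp.c:218 chain[0] sen[2] - Lost, no updates for 10 sec
[2020/04/21 04:11:03] ERROR: src/temp.c:218 chain[0] sen[3] - Lost, no updates for 10 sec
[2020/04/21 04:11:03] WARN: chain[0] - 2 sensor(s) reported their temps!
[2020/04/21 04:11:04] ERROR: src/temp.c:218 chain[0] sen[0] - Lost, no updates for 10 sec
[2020/04/21 04:11:04] ERROR: src/temp.c:218 chain[0] sen[1] - Lost, no updates for 10 sec
[2020/04/21 04:11:04] WARN: chain[0] - 0 sensor(s) reported their temps!
"""

if __name__ == "__main__":
    print(lost_sensors(SAMPLE))
    print(fully_dark_chains(SAMPLE))
```

If a chain loses all 4 sensors within a second or so, as in the sample, that points at communication with the board rather than at four sensors dying one by one.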
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 19, 2020, 11:50:13 AM
#18
Do you know if switching firmware is reversible? I can try to see if issue will be perma-fixed with those firmware's

Of course, they are reversible, so if you don't like the firmware you can go back to Bitmain's. There is a good chance that you will prefer it to Bitmain's even if it doesn't fix the issue you have, because you will be able to overclock the other two working boards to get something close to the original hashrate of 3 working boards.
jr. member
Activity: 43
Merit: 59
April 19, 2020, 11:01:33 AM
#17
AwesomeMiner and Asic.to both are Vnish based and give you the ability to change both voltage and frequency

Do you know if switching firmware is reversible? I can try to see if issue will be perma-fixed with those firmware's
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 18, 2020, 05:04:30 PM
#16
Also did anyone try this method? https://bitcointalksearch.org/topic/google-chrome-access-locked-miner-features-5239515

By deleting some lines you can actually enter voltage control.

NO, you can't get access to voltage control just by altering the style="display:none". All you get is fan control (that's on all miners) and, on some models, the ability to choose the frequency (not on the T17, S17, and their family), the temperature threshold, etc. But no voltage control whatsoever.

To play with the voltage you will need firmware that allows you to do so. The two known firmwares out there for the T17 are AwesomeMiner and Asic.to; both are Vnish based and give you the ability to change both voltage and frequency. I suggest you start with lowering the frequency first, then go to voltage.
jr. member
Activity: 43
Merit: 59
April 18, 2020, 12:20:38 AM
#15
So long story short, i fixed the problem with new firmware at the cost of machine taking 15 mins to boot which im fine with. Hope my experience helps.


Update: The issue re-occurred after a few days. I also managed to re-flash the board (I had to format the SD card to FAT32) and it worked. Then I loaded the Antminer-T17+-user-OM-201911111409-sig_4637.tar.gz firmware.

Machine runs fine for 2-3 days then issue repeats. Works fine again after reboot etc.  


If it gets worse i will try and swap PSU-s between 2 T17+ machines.  


Also did anyone try this method? https://bitcointalksearch.org/topic/google-chrome-access-locked-miner-features-5239515

By deleting some lines you can actually enter voltage control.
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
April 15, 2020, 06:03:11 AM
#14

I think this is because the tuning works better on this firmware version; it probably drops the frequency or increases the voltage of the board, and thus the board shows up. Every time you lose a board you should really just give it a try with different firmware, preferably one that lets you manually tune each board. I had very good luck with my old S9s by flashing this firmware.

The success rate is not that high, and in most cases, the hash board never comes back to life, but you lose nothing trying a different firmware.
legendary
Activity: 4326
Merit: 8950
'The right to privacy matters'
April 14, 2020, 09:06:37 PM
#13
Did you try to re-flash board with SD card?

I am in Howell NJ

gear is in Clifton NJ

Corona-v lockdown means I should not drive to work on it.

I can access it via team viewer that lets me

reboot it.
upgrade firmware.
change pools.

It is numbered I can call the guy in the warehouse and shut it off or power it up.

But hands on card flashing, fan repairing, cleaning it out all have to wait until May something.  Maybe right after the ½ ing.
jr. member
Activity: 43
Merit: 59
April 14, 2020, 09:01:07 PM
#12
Did you try to re-flash board with SD card?
legendary
Activity: 4326
Merit: 8950
'The right to privacy matters'
April 14, 2020, 03:53:10 PM
#11
Of all the bitmain gear I purchased last year my t17+ has had the most issues.

I am glad you found that 17+ firmware. I will give it a spin on my t17+ since it has been running with 2 boards for more than 3 months.

I will post back if it gives me the third board back


"Antminer T17+
Hostname   antMiner
Model   GNU/Linux
Hardware Version   Socket connect failed: Connection refused
Kernel Version   Linux 4.6.0-xilinx-gff8137b-dirty #25 SMP PREEMPT Fri Nov 23 15:30:52 CST 2018
File System Version   Wed Apr 8 11:27:07 CST 2020"

Just loaded it now see above.

It booted in under five minutes, then dropped the board after it ran for 1 minute.

Booted again. Both times the failed socket message came up, which is normal, but this time the tuning is slower: 7 minutes vs 4.75 minutes, and still tuning.

And 8 minutes in we are doing 3 boards. Will get back; if it holds, it is a score.
jr. member
Activity: 43
Merit: 59
April 14, 2020, 11:33:11 AM
#10
In that 15 minute boot time i discounted the basic boot time. I have more T17 machines. And this one boots with 15 min delay by showing the socket connect fail error before it starts hashing.

But yeah im ok, as long as all 3 boards are visible and hashing. So main problem was fixed at the cost of causing a smaller problem to appear.
legendary
Activity: 4326
Merit: 8950
'The right to privacy matters'
April 14, 2020, 10:55:12 AM
#9
The cost of 15 minute boot time is pretty much what the t17 takes in the first place.
socket connect fail is also normal

So you are looking good .
jr. member
Activity: 43
Merit: 59
April 14, 2020, 09:29:22 AM
#8
I didn't, because I don't think it's either a hashboard or a PSU problem. Here is why:

I have contacted Bitmain support. They have provided me with the newest firmware: Antminer-T17+-user-OM-202002281759-sig_5446.tar.gz - File System Version - Fri Feb 28 17:59:06 CST 2020

Previous system was File System Version   Fri Dec 6 10:46:34 CST 2019

The outcome is that each time the machine boots I get "Hardware Version Socket connect failed: Connection refused", and the machine takes 10-15 minutes to start hashing. However, it now works well: all 3 hashboards are visible and none of them disappears anymore. I tried re-flashing the board from an SD card with the zip file Bitmain sent me, but didn't manage to do it: both LEDs remain on for 30 mins and nothing happens. The machine boots normally after the SD card is removed. I also tried a factory reset a few times. I tried downgrading the firmware to the 2019 version, but it won't let me go back.

So long story short, i fixed the problem with new firmware at the cost of machine taking 15 mins to boot which im fine with. Hope my experience helps.
legendary
Activity: 3472
Merit: 3217
Playbet.io - Crypto Casino and Sportsbook
April 08, 2020, 07:00:07 PM
#7
Did you manage to find any fix to this issue? One of my T17+ machines is acting the same. I get the same info in the log: after 1-2 hours of mining, one of the boards stops hashing. After a reboot it works fine for a few hours, then the issue repeats.

Have you tried the other solution above? If you have confirmed that the PSU is good, then I would suggest moving the miner to another outlet, or plugging it directly into the wall outlet, and testing it again.

Sometimes extension cords can't handle the high wattage and can't deliver enough power to the miner.
jr. member
Activity: 43
Merit: 59
April 08, 2020, 06:14:12 PM
#6
Did you manage to find any fix to this issue? One of my T17+ machines is acting the same. I get the same info in the log: after 1-2 hours of mining, one of the boards stops hashing. After a reboot it works fine for a few hours, then the issue repeats.
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
January 16, 2020, 04:04:08 PM
#5
If you have confirmed that all 3 hash boards work just fine, I think it would be best to buy a new PSU, for two main reasons. The first is that this PSU is not good and could eventually damage the hash boards, so there is a good amount of risk in keeping it. The second is economic: a T17 40 TH/s makes about $6.3 a day (before the electricity bill), so every board makes a good $60 a month. A T17 PSU costs $134; add, say, $30 for shipping and you are looking at about $160. If your power costs 5 cents per kWh, the PSU ROI will be about 5 months.
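For anyone who wants to plug in their own numbers, here is a rough sketch of that payback estimate. The function name and the 733 W per-board draw are my own assumptions for illustration (the post gives no wattage figure), so swap in whatever your meter shows.

```python
def psu_payback_months(daily_revenue_usd, boards, psu_cost_usd,
                       board_watts=0.0, power_price_per_kwh=0.0):
    """Months until ONE recovered board pays for a replacement PSU.

    board_watts is an assumed per-board draw, not a figure from the post.
    """
    revenue_per_board = daily_revenue_usd / boards
    # kW * 24 h * price per kWh = daily electricity cost of that board
    power_cost_per_board = board_watts / 1000 * 24 * power_price_per_kwh
    net_per_day = revenue_per_board - power_cost_per_board
    if net_per_day <= 0:
        return float("inf")  # the board never pays the PSU back
    return psu_cost_usd / net_per_day / 30


# $6.3/day for 3 boards, $134 PSU + $30 shipping, 5 cents per kWh,
# and an ASSUMED 733 W for the third board.
months = psu_payback_months(6.3, 3, 134 + 30,
                            board_watts=733, power_price_per_kwh=0.05)
print(f"payback in about {months:.1f} months")
```

With those inputs it lands between four and five months, in the same ballpark as the estimate above.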

If you decide to keep mining with only 2 boards, at least sell the other one that's sitting there doing nothing. Just my two sats.
newbie
Activity: 2
Merit: 0
January 15, 2020, 09:52:06 PM
#4
... So using 2 boards may be your best choice.

I'm starting to think the same. I guess the last attempt is to disassemble the PSU and look for any obvious damage.

Thank you both!
legendary
Activity: 2436
Merit: 6643
be constructive or S.T.F.U
January 15, 2020, 06:42:17 PM
#3
UPDATE:

I disassembled the miner and disconnected the power from Chain[2]. Now the miner works fine, including Chain[0], which had the problem before. Does this mean that there is a power problem?

This is a good sign that all 3 boards are still good. Here are the three possible reasons.

1- Voltage:

If you are not feeding the miner with 200-240 V, it could act weird and not be able to power on all 3 hashboards.

2- PSU is dying:

Due to unregulated voltage or simply bad luck, the PSU can't supply enough power for all boards.

3- The data cable:

Make sure you test all three of them by swapping them.

Also, can you tell us the exact voltage you are plugging the miner into? Please note that this new gear takes voltage very seriously; anything above 240 V or below 200 V is simply a free ride to RMA.
legendary
Activity: 4326
Merit: 8950
'The right to privacy matters'
January 15, 2020, 07:42:58 AM
#2
By disconnecting 1 of 3 boards you may have shifted the board counter over.
Hard to tell exactly what you did.

If you could, disconnect two boards, boot, and post the full log in a code block,

then do it again with a different single board connected and show the full log,

then do it a third time and show the full log.

Each time you do this, make sure:

board 0 is connected on the first try
board 1 is connected on the second try
board 2 is connected on the third try.

If you always get 1 board to work, and it is a different board each time, you may not have a board issue.

Then try with 2 boards connected, with board 0 disconnected
try with 2 boards connected, with board 1 disconnected
try with 2 boards connected, with board 2 disconnected

If you always get 2 boards to work, you may not have a board issue.

Lastly, try all three. If board zero does not work, I suspect the PSU, which means ordering a PSU.
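
The test sequence above can be written down as a small checklist generator. This is only a sketch of the procedure as I read it; `isolation_plan`, `diagnose` and the result labels are made-up names, and the "possible board issue" branch is my own shorthand for any case where a single board or a pair fails.

```python
from itertools import combinations

BOARDS = (0, 1, 2)


def isolation_plan(boards=BOARDS):
    """Yield which boards to leave connected: singles, then pairs, then all."""
    for size in range(1, len(boards) + 1):
        for connected in combinations(boards, size):
            yield connected


def diagnose(results, boards=BOARDS):
    """results maps each connected-board tuple to the boards that hashed."""
    full = tuple(boards)
    # Did every single-board and two-board run hash on all connected boards?
    partial_ok = all(set(hashing) == set(connected)
                     for connected, hashing in results.items()
                     if connected != full)
    full_ok = set(results[full]) == set(full)
    if not partial_ok:
        return "possible board issue"
    if not full_ok:
        return "suspect PSU"
    return "no fault reproduced"


for connected in isolation_plan():
    print("boot with boards connected:", connected)
```

Record which boards actually hash on each boot, then pass the dictionary to `diagnose`. If every partial run is clean and only the three-board run drops a board, that points at the PSU rather than a board.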

https://shop.bitmain.com/product/detail?pid=0002019072316001724716dkNtX50679

Looks like it is sold out; you may need to wait until February to get one.

https://hmtech.co/ is an option; they got me a PSU for my M20S from Whatsminer, so maybe they have a T17 PSU.

But do the troubleshooting before you buy a PSU.

There is a cost analysis to do: since you are getting 2 boards to work right now, spending 200 on a PSU may not pay off.

If you are down 14 TH/s, that is about 2 dollars a day with free power. If your power for that board costs a dollar a day, it nets a dollar a day, so a 200 dollar PSU will take 200 days to pay off. The halving comes in 120 days. So using 2 boards may be your best choice.
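
Here is a rough sketch of that payback arithmetic with the halving included. It assumes the halving simply cuts the board's dollar revenue in half (price and difficulty held constant), which is a simplification of mine; `payback_days` and the `horizon_days` cutoff are made up for the example.

```python
def payback_days(psu_cost, daily_revenue, daily_power_cost,
                 days_to_halving, horizon_days=2000):
    """Day on which cumulative net income covers the PSU, or None if never."""
    recovered = 0.0
    for day in range(1, horizon_days + 1):
        # ASSUMPTION: revenue halves at the halving, nothing else changes.
        revenue = daily_revenue if day <= days_to_halving else daily_revenue / 2
        recovered += revenue - daily_power_cost
        if recovered >= psu_cost:
            return day
    return None


# Ignoring the halving: 2 dollars a day minus 1 dollar of power.
print(payback_days(200, 2.0, 1.0, days_to_halving=10**9))
# Halving after 120 days: revenue drops to 1 dollar, the same as the power cost.
print(payback_days(200, 2.0, 1.0, days_to_halving=120))
```

Under that assumption the board only recovers 120 dollars before the halving and nets nothing afterwards, which is why running on 2 boards can be the better choice.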
newbie
Activity: 2
Merit: 0
January 15, 2020, 06:32:57 AM
#1
The miner is just over a month old and is showing issues with Chain[0].

The kernel log shows the first error:

Code:
temperature.c:697:get_temp_info: read temp sensor failed: chain = 0, sensor = 0, chip = 64, reg = 0

Then it shuts down that chip and gets stuck retrying the temperature check.

It worked normally a few weeks ago, and even worked with 2/3 chips yesterday at 2/3 hash rate.

I have followed the five steps in the first post on this website, but still nothing.

Could anyone help?

Code:
2020-01-15 11:31:28 thread.c:105:pic_heart_beat_thread: chain[0] heart beat fail 5 times.
2020-01-15 11:31:30 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:31:32 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.781328
2020-01-15 11:31:33 temperature.c:697:get_temp_info: read temp sensor failed: chain = 0, sensor = 2, chip = 168, reg = 1
2020-01-15 11:31:35 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.001758
2020-01-15 11:31:35 power_api.c:97:get_average_voltage: aveage voltage is: 11.927695
2020-01-15 11:31:35 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.93, more than 1.0v diff.
2020-01-15 11:31:36 power_api.c:124:check_voltage_multi: retry time: 6
2020-01-15 11:31:40 temperature.c:697:get_temp_info: read temp sensor failed: chain = 0, sensor = 3, chip = 184, reg = 0
2020-01-15 11:31:48 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:31:50 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.787451
2020-01-15 11:31:53 temperature.c:697:get_temp_info: read temp sensor failed: chain = 0, sensor = 3, chip = 184, reg = 1
2020-01-15 11:31:53 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.014004
2020-01-15 11:31:53 power_api.c:97:get_average_voltage: aveage voltage is: 11.933818
2020-01-15 11:31:53 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.93, more than 1.0v diff.
2020-01-15 11:31:54 power_api.c:124:check_voltage_multi: retry time: 7
2020-01-15 11:31:54 thread.c:642:check_temperature: over max temp, pcb temp 69 (max 80), chip temp 107(max 103)
2020-01-15 11:31:54 driver-btm-api.c:201:set_miner_status: ERROR_TEMP_TOO_HIGH
2020-01-15 11:31:54 driver-btm-api.c:142:stop_mining: stop mining: over max temp
2020-01-15 11:31:54 thread.c:824:cancel_temperature_monitor_thread: cancel thread
2020-01-15 11:31:54 thread.c:834:cancel_read_nonce_reg_thread: cancel thread
2020-01-15 11:31:54 driver-btm-api.c:128:killall_hashboard: ****power off hashboard****
2020-01-15 11:31:56 thread.c:105:pic_heart_beat_thread: chain[0] heart beat fail 6 times.
2020-01-15 11:32:10 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:32:12 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.775205
2020-01-15 11:32:15 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.007881
2020-01-15 11:32:15 power_api.c:97:get_average_voltage: aveage voltage is: 11.927695
2020-01-15 11:32:15 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.93, more than 1.0v diff.
2020-01-15 11:32:16 power_api.c:124:check_voltage_multi: retry time: 8
2020-01-15 11:32:29 thread.c:105:pic_heart_beat_thread: chain[0] heart beat fail 7 times.
2020-01-15 11:32:30 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:32:32 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.799697
2020-01-15 11:32:35 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.050742
2020-01-15 11:32:35 power_api.c:97:get_average_voltage: aveage voltage is: 11.950146
2020-01-15 11:32:35 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.95, more than 1.0v diff.
2020-01-15 11:32:36 power_api.c:124:check_voltage_multi: retry time: 9
2020-01-15 11:32:48 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:32:50 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.799697
2020-01-15 11:32:53 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.050742
2020-01-15 11:32:53 power_api.c:97:get_average_voltage: aveage voltage is: 11.950146
2020-01-15 11:32:53 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.95, more than 1.0v diff.
2020-01-15 11:32:54 power_api.c:124:check_voltage_multi: retry time: 10
2020-01-15 11:32:56 thread.c:105:pic_heart_beat_thread: chain[0] heart beat fail 8 times.
2020-01-15 11:33:07 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:33:08 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.799697
2020-01-15 11:33:10 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.050742
2020-01-15 11:33:10 power_api.c:97:get_average_voltage: aveage voltage is: 11.950146
2020-01-15 11:33:10 power_api.c:110:check_voltage: target_vol = 17.90, actural_vol = 11.95, more than 1.0v diff.
2020-01-15 11:33:11 power_api.c:124:check_voltage_multi: retry time: 11

UPDATE:

I disassembled the miner and disconnected the power from Chain[2]. Now the miner works fine, including Chain[0], which had the problem before. Does this mean that there is a power problem?
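
For reading a log like the one above, a short script can pull out the per-chain voltages and show why the average sits near 11.93 V: one chain reports 0 V and drags the mean of the three down. This is a sketch only; the regex matches the `get_average_voltage` lines as they appear in this log, and checking each chain against the 17.90 V target with the 1.0 V tolerance is my own addition (the firmware message compares the average).

```python
import re

# Matches lines such as "...get_average_voltage: chain[1], voltage is: 17.781328"
VOLTAGE_RE = re.compile(
    r"get_average_voltage: chain\[(\d+)\], voltage is: ([\d.]+)")


def chain_voltages(log_text):
    """Return the last reported voltage for each chain in the log text."""
    readings = {}
    for chain, volts in VOLTAGE_RE.findall(log_text):
        readings[int(chain)] = float(volts)
    return readings


def dead_chains(readings, target=17.90, tolerance=1.0):
    """Chains whose own voltage is off target by more than the tolerance."""
    return sorted(c for c, v in readings.items() if abs(v - target) > tolerance)


sample = """\
2020-01-15 11:31:30 power_api.c:86:get_average_voltage: chain[0], voltage is: 0.000000
2020-01-15 11:31:32 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.781328
2020-01-15 11:31:35 power_api.c:86:get_average_voltage: chain[2], voltage is: 18.001758
"""

readings = chain_voltages(sample)
average = sum(readings.values()) / len(readings)
print(readings, "average:", round(average, 6), "off target:", dead_chains(readings))
```

Chains 1 and 2 are within tolerance on their own; only chain 0 at 0 V is out, which fits a board that is not getting power rather than three misbehaving boards.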