Topic: Antminer T17 - Stuck on troubleshooting where the error is

jr. member
Activity: 46
Merit: 14
Have you seen the repair manual for the T17? Please find it here: https://www.zeusbtc.com/NewsDetails.asp?ID=187 (there is a PDF you can download from there; it's in Chinese, but should be "easy" to understand).


T17 Hash Board Repair Manual, EN version: https://www.zeusbtc.com/manuals/Antminer-T17-Hash-Board-Repair-Guide.asp
member
Activity: 166
Merit: 82
EET/NASA intern 2013 Bitmain/MicroBT/IPC cert
This is the problem of the hashboard, and to fix it, you need a tester to repeat the signals for you to find the failure. You need to be familiar with electronic science
Before you shot this poor guy down with a less-than-helpful answer, you should've double-checked your translation. There's no such thing as electronic science. It's like saying food pizza or seated car. And to correct you: you don't need a test jig to trace signals. When the log fills up and says zero ASICs, the control board is sending the relevant signals until it shuts down the hashboard. The only thing a test jig does is let you conveniently retry with a button push, plus give you the option to run a stress test.

Sorry for the rant, gmaxwell et al., but sheesh, he's received so much helpful advice before this.
newbie
Activity: 1
Merit: 0
This is the problem of the hashboard, and to fix it, you need a tester to repeat the signals for you to find the failure. You need to be familiar with electronic science
hero member
Activity: 544
Merit: 589
If I understand correctly, I can apply 21 V to the board's metal clamps (with +/- in the correct place, of course) and then measure the test points on the board?

The board won't do anything unless the correct commands are sent from a control board. It needs to command the pic microcontroller to turn on power to the chips on the board, and then it needs to send a command to the chips in order to see any signals at the test points.

It is possible to just use the control board from the miner, and make up some cables (need to be 4awg to 6awg cables) to run from the psu to the hashboard so you can run it on a bench so you can access the test points. It is very slow though because it takes so long to boot up and then you only get 3 chances to measure anything when it is checking the ASIC count, after the 3rd try it just shuts down the board and you need to start again. The normal Bitmain style test jig (that is just a S17+ control board with special firmware), does the same thing except it boots a bit faster and then runs a test pattern to verify the operation and performance of each chip if all chips are found. For troubleshooting a board that isn't finding all the chips, it is still very slow, maybe a couple of minutes per round of 3 ASIC counts.

The tester from Asic.Repair (https://tester.asic.repair/en) is far superior to the standard test jig for troubleshooting boards that aren't finding all chips. It allows you to run the ASIC count test continuously, about once a second, with no boot-up time. I have both and rarely use the standard test fixture any more.

legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
If I understand correctly, I can apply 21 V to the board's metal clamps (with +/- in the correct place, of course) and then measure the test points on the board?

I don't think that is possible without a tester/fixture tool. My understanding is that current won't flow in the hashboard without a control board, and the normal control board will only supply the hashboard with power for a very short period; it stops once the ASIC count fails, so you have a very short window to test. I am not sure that is doable, but I guess it is if you time it precisely against when the current flows, based on the kernel log or x seconds from powering on.

The main feature of fixture tools, regardless of brand, is that they keep current on the hash board for as long as you want, so you can measure the voltage around the chips. I think it would be best to reach out to wndsnb to confirm the above.
newbie
Activity: 6
Merit: 0
Thanks for all the help so far!

I will start focusing on one board at the time from now.

If I understand correctly, I can apply 21 V to the board's metal clamps (with +/- in the correct place, of course) and then measure the test points on the board?
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
Are the outlets on the control-board ALWAYS the same chain no., or do they alter depending on how the boards / how many boards that are connected?

It has been a while since I played with these gears, but IIRC they will always display the same chain, it would be best if you just troubleshoot them 1 by 1 to avoid confusion.



Quote
4. Started up the miner, and chain 1 now found all 30 ASICs BUT... chain 3 only found 3 ASICs on the board.

I get a feeling that something else than the boards themselves are causing this, but I'm just guessing.

My next step is to try and measure a board to see that all chips are ok.


There is a tiny chance that the PSU is faulty and causes this, but it's not very likely. It's very common for a dashboard to show 30 ASICs and then, after a reboot, show 3 or 0, so this could very well be just a coincidence. This is why it's ALWAYS best to test one board at a time. Since chain 1 (previously showing 21 ASICs) is now showing 30, let it run alone for 12-24 hours. If it sticks and nothing goes wrong, you can be reasonably sure that it has been fixed; then put it aside and start the same process all over again with chain 3. Testing a board for only a few minutes can be misleading and will give you a lot of false results.



Quote
Does anyone have info about what currents I need to apply to the board in order to get some measurements, or can this be found in the previous link?

Do you mean voltage? If so, it is 21 V DC.
newbie
Activity: 6
Merit: 0
Quote
It seems like you swapped the hashboards and are now referring to them the wrong way. Can we stick to calling them chain 0, 1, 2 as in the kernel log, where 0 in the log = 1 on the miner status page, 1 = 2 and 2 = 3, just to avoid confusion?

Absolutely, I will try my best, but as I have little knowledge about the chains and boards I have a (probably simple to answer) question:

Are the outlets on the control-board ALWAYS the same chain no., or do they alter depending on how the boards / how many boards that are connected?

Quote
According to the last image, hashboard 2 (3 on the status page) looks perfect, finding all 30 ASICs. Board 0 (1 on the miner status page) shows 21 ASICs, which suggests there is an issue with either the 21st or the 22nd chip.
...
For now I think you should focus on chain 0, which is showing 21 ASICs; it should be easier to fix. Have you seen the repair manual for the T17? Please find it here: https://www.zeusbtc.com/NewsDetails.asp?ID=187 (there is a PDF you can download from there; it's in Chinese, but should be "easy" to understand).

Now here's a strange thing that happened as I started working on the board for chain 1 (21 ASICs found):

1. I removed the data cable to verify that chain 1 with "21 asics found" was in the specific slot of the miner.
2. I removed the board and looked at chips 20-21-22 (from both directions) to see if I could find anything suspicious.
3. Resoldered the cooling flanges and put the board back in the same slot.
4. Started up the miner, and chain 1 now found all 30 ASICs BUT... chain 3 only found 3 ASICs on the board.

I get a feeling that something else than the boards themselves are causing this, but I'm just guessing.

My next step is to try and measure a board to see that all chips are ok.

Does anyone have info about what currents I need to apply to the board in order to get some measurements, or can this be found in the previous link?
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
It seems like you swapped the hashboards and are now referring to them the wrong way. Can we stick to calling them chain 0, 1, 2 as in the kernel log, where 0 in the log = 1 on the miner status page, 1 = 2 and 2 = 3, just to avoid confusion?

According to the last image, hashboard 2 (3 on the status page) looks perfect, finding all 30 ASICs. Board 0 (1 on the miner status page) shows 21 ASICs, which suggests there is an issue with either the 21st or the 22nd chip.
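
As a rough illustration of the numbering and the rule of thumb above (a sketch only: the function names are made up for this example, and the figure of 30 chips per board is taken from this thread), the kernel log counts chains from 0 while the status page counts from 1, and a chain that stops at N ASICs points at chip N or chip N+1:

```python
CHIPS_PER_BOARD = 30  # per this thread: a healthy T17 board reports 30 ASICs


def status_page_chain(log_chain: int) -> int:
    """Kernel-log chain 0/1/2 is shown as chain 1/2/3 on the status page."""
    return log_chain + 1


def suspect_chips(found: int, total: int = CHIPS_PER_BOARD) -> list[int]:
    """Chips (1-based) to inspect first when only `found` ASICs respond.

    Hypothetical helper: it only encodes the rule of thumb from the post,
    i.e. look at the last chip that answered and the first one that did not.
    """
    if found >= total:
        return []  # all chips found, nothing to suspect
    return [n for n in (found, found + 1) if 1 <= n <= total]


if __name__ == "__main__":
    # Chain 0 in the log (chain 1 on the status page) found 21 ASICs.
    print(status_page_chain(0), suspect_chips(21))
```

With 0 ASICs found the rule only yields chip 1, but in that case the fault may just as well sit upstream of the chips, so treat the result as a starting point rather than a diagnosis.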



For now I think you should focus on chain 0, which is showing 21 ASICs; it should be easier to fix. Have you seen the repair manual for the T17? Please find it here: https://www.zeusbtc.com/NewsDetails.asp?ID=187 (there is a PDF you can download from there; it's in Chinese, but should be "easy" to understand).

I would also wait for wndsnb to respond, he has the most knowledge in fixing these hashboards.
newbie
Activity: 6
Merit: 0
Now I believe something is going in the right direction!

Additional update:


I upgraded the os to try and get some more info, and this is what it looks like right now, with all 3 boards connected.

It has just started, but can anyone see if it looks "ok" (although 2 of the boards need to be checked for problems/solder balls etc.)?

https://imgur.com/a/kOAdwwl

But coming from a totally malfunctioning T17, I hope this is better...
newbie
Activity: 6
Merit: 0
Some more updates:

Electricity:
I measured 232 V at the outlets that drive the T17.

Board 2:
No action taken yet

Board 1:
I removed the cooling alu on chip 12 + 29 and 30 (counting from incoming electricity)
There were 2 solder balls that I removed on two different chips.

Board 0:
On this one I've removed all alus and measured some resistance towards ground.
I've found one place on the board that I feel is a bit suspicious:

When measuring from the red areas (to ground) in the picture below, I get the following readings:

1: 6.7 kOhm
2: 6.87 kOhm
3: 6.7 kOhm

Also some of the other measure-points at the "1, 2, 3" places differ in the same way when measuring towards ground; pos 2 does not follow the pattern.
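
To put a number on "does not follow the pattern", one option is to compare each reading with the median of the group. This is only a sketch: the function name and the 1% tolerance are assumptions made for illustration, not values from the repair manual, and a deviation of this size does not by itself prove a faulty chip.

```python
from statistics import median


def outliers(readings_kohm: dict[int, float], tolerance_pct: float = 1.0) -> dict[int, float]:
    """Return {position: deviation in %} for readings that differ from the
    group median by more than `tolerance_pct` (an arbitrary threshold)."""
    mid = median(readings_kohm.values())
    flagged = {}
    for position, value in readings_kohm.items():
        deviation = abs(value - mid) / mid * 100
        if deviation > tolerance_pct:
            flagged[position] = round(deviation, 2)
    return flagged


if __name__ == "__main__":
    # Resistance to ground at the three marked points, in kOhm.
    readings = {1: 6.7, 2: 6.87, 3: 6.7}
    print(outliers(readings))
```

For the three readings above this flags only position 2, at roughly 2.5% above the other two.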

Could it be that this specific chip marked in black is faulty?

https://imgur.com/a/G1zo0Zu
newbie
Activity: 6
Merit: 0
Ok, so I decided to pick the machine apart and clean it, and man, there was some stuff in there that probably shouldn't be...

I guess they don't put mosquitos and bees in there from the factory? :)

Anyway, I blew out the loose parts with compressed air, cleaned them with electrical cleaning spray, blew them out again, put everything together, and waited a day before starting up.

So now, whenever I run the machine, and troubleshoot it by moving the hashboards to different slots in the "controlboard", moving the datacables to different positions etc, I end up getting the same results;

- Find 0 Asics on chain 0
- Find 12 Asics on chain 1
- Find 0 Asics on chain 2


So for the boards on chain 0 and 2 I will try following this video https://www.youtube.com/watch?v=5bdRJFGLuc0 with the help of some additional info in the video's comments.

Kernel log at the moment with all 3 boards connected (shortened due to the character limit):
Code:
Booting Linux on physical CPU 0x0
Linux version 4.6.0-xilinx-gff8137b-dirty (lzq@armdev2) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-23) ) #25 SMP PREEMPT Fri Nov 23 15:30:52 CST 2018
CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine model: Xilinx Zynq
cma: Reserved 16 MiB at 0x0e000000
Memory policy: Data cache writealloc
On node 0 totalpages: 61440
free_area_init_node: node 0, pgdat c0b39280, node_mem_map cde10000
  Normal zone: 480 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 61440 pages, LIFO batch:15
percpu: Embedded 12 pages/cpu @cddf1000 s19776 r8192 d21184 u49152
pcpu-alloc: s19776 r8192 d21184 u49152 alloc=12*4096
pcpu-alloc: [0] 0 [0] 1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 60960
Kernel command line: mem=240M console=ttyPS0,115200 ramdisk_size=33554432 root=/dev/ram rw earlyprintk
PID hash table entries: 1024 (order: 0, 4096 bytes)
Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
Memory: 203752K/245760K available (6345K kernel code, 231K rwdata, 1896K rodata, 1024K init, 223K bss, 25624K reserved, 16384K cma-reserved, 0K highmem)
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
    vmalloc : 0xcf800000 - 0xff800000   ( 768 MB)
    lowmem  : 0xc0000000 - 0xcf000000   ( 240 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .text : 0xc0008000 - 0xc090c424   (9234 kB)
      .init : 0xc0a00000 - 0xc0b00000   (1024 kB)
      .data : 0xc0b00000 - 0xc0b39fe0   ( 232 kB)
       .bss : 0xc0b39fe0 - 0xc0b71c28   ( 224 kB)
Preemptible hierarchical RCU implementation.
Build-time adjustment of leaf fanout to 32.
RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
RCU: Adjusting geometry for rcu_fanout_leaf=32, nr_cpu_ids=2
NR_IRQS:16 nr_irqs:16 16
efuse mapped to cf800000
ps7-slcr mapped to cf802000
L2C: platform modifies aux control register: 0x72360000 -> 0x72760000
L2C: DT/platform modifies aux control register: 0x72360000 -> 0x72760000
L2C-310 erratum 769419 enabled
L2C-310 enabling early BRESP for Cortex-A9
L2C-310 full line of zeros enabled for Cortex-A9
L2C-310 ID prefetch enabled, offset 1 lines
L2C-310 dynamic clock gating enabled, standby mode enabled
L2C-310 cache controller enabled, 8 ways, 512 kB
L2C-310: CACHE_ID 0x410000c8, AUX_CTRL 0x76760001
zynq_clock_init: clkc starts at cf802100
Zynq clock init
sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every 4398046511103ns
clocksource: arm_global_timer: mask: 0xffffffffffffffff max_cycles: 0x4ce07af025, max_idle_ns: 440795209040 ns
Switching to timer-based delay loop, resolution 3ns
clocksource: ttc_clocksource: mask: 0xffff max_cycles: 0xffff, max_idle_ns: 537538477 ns
ps7-ttc #0 at cf80a000, irq=18
Console: colour dummy device 80x30
Calibrating delay loop (skipped), value calculated using timer frequency.. 666.66 BogoMIPS (lpj=3333333)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
CPU: Testing write buffer coherency: ok
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x100000 - 0x100058
CPU1: failed to boot: -1
Brought up 1 CPUs
SMP: Total of 1 processors activated (666.66 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
DMA: preallocated 256 KiB pool for atomic coherent allocations
cpuidle: using governor menu
hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 4 bytes.
zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xcf880000
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
media: Linux media interface: v0.10
Linux video capture interface: v2.00
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
PTP clock support registered
EDAC MC: Ver: 3.0.0
Advanced Linux Sound Architecture Driver Initialized.
clocksource: Switched to clocksource arm_global_timer
NET: Registered protocol family 2
TCP established hash table entries: 2048 (order: 1, 8192 bytes)
TCP bind hash table entries: 2048 (order: 2, 16384 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
UDP hash table entries: 256 (order: 1, 8192 bytes)
UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
PCI: CLS 0 bytes, default 64
Trying to unpack rootfs image as initramfs...
rootfs image is not initramfs (no cpio magic); looks like an initrd
Freeing initrd memory: 12584K (cceb7000 - cdb01000)
hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
futex hash table entries: 512 (order: 3, 32768 bytes)
workingset: timestamp_bits=28 max_order=16 bucket_order=0
jffs2: version 2.2. (NAND) (SUMMARY)  © 2001-2006 Red Hat, Inc.
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
dma-pl330 f8003000.ps7-dma: Loaded driver for PL330 DMAC-241330
dma-pl330 f8003000.ps7-dma: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16
e0000000.serial: ttyPS0 at MMIO 0xe0000000 (irq = 158, base_baud = 6249999) is a xuartps
console [ttyPS0] enabled
xdevcfg f8007000.ps7-dev-cfg: ioremap 0xf8007000 to cf86e000
[drm] Initialized drm 1.1.0 20060810
brd: module loaded
loop: module loaded
CAN device driver interface
gpiod_set_value: invalid GPIO
libphy: MACB_mii_bus: probed
macb e000b000.ethernet eth0: Cadence GEM rev 0x00020118 at 0xe000b000 irq 31 (00:0a:35:00:00:00)
Generic PHY e000b000.etherne:00: attached PHY driver [Generic PHY] (mii_bus:phy_addr=e000b000.etherne:00, irq=-1)
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
i2c /dev entries driver
Xilinx Zynq CpuIdle Driver started
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
sdhci-pltfm: SDHCI platform and OF driver helper
mmc0: SDHCI controller on e0100000.ps7-sdio [e0100000.ps7-sdio] using ADMA
ledtrig-cpu: registered to indicate activity on CPUs
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
nand: Micron MT29F2G08ABAGAWP
nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 128
nand: WARNING: pl35x-nand: the ECC used on your system is too weak compared to the one required by the NAND chip
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
6 ofpart partitions found on MTD device pl35x-nand
Creating 6 MTD partitions on "pl35x-nand":
0x000000000000-0x000002800000 : "BOOT.bin-env-dts-kernel"
0x000002800000-0x000004800000 : "ramfs"
0x000004800000-0x000005000000 : "configs"
0x000005000000-0x000006000000 : "reserve"
0x000006000000-0x000008000000 : "ramfs-bak"
0x000008000000-0x000010000000 : "reserve1"
NET: Registered protocol family 10
sit: IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
can: controller area network core (rev 20120528 abi 9)
NET: Registered protocol family 29
can: raw protocol (rev 20120528)
can: broadcast manager protocol (rev 20120528 t)
can: netlink gateway (rev 20130117) max_hops=1
zynq_pm_ioremap: no compatible node found for 'xlnx,zynq-ddrc-a05'
zynq_pm_late_init: Unable to map DDRC IO memory.
Registering SWP/SWPB emulation handler
hctosys: unable to open rtc device (rtc0)
ALSA device list:
  No soundcards found.
RAMDISK: gzip image found at block 0
EXT4-fs (ram0): couldn't mount as ext3 due to feature incompatibilities
EXT4-fs (ram0): mounted filesystem without journal. Opts: (null)
VFS: Mounted root (ext4 filesystem) on device 1:0.
devtmpfs: mounted
Freeing unused kernel memory: 1024K (c0a00000 - c0b00000)
EXT4-fs (ram0): re-mounted. Opts: block_validity,delalloc,barrier,user_xattr
random: dd urandom read with 0 bits of entropy available
ubi0: attaching mtd2
ubi0: scanning is finished
ubi0: attached mtd2 (name "configs", size 8 MiB)
ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi0: good PEBs: 64, bad PEBs: 0, corrupted PEBs: 0
ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi0: max/mean erase counter: 3/1, WL threshold: 4096, image sequence number: 1433474905
ubi0: available PEBs: 0, total reserved PEBs: 64, PEBs reserved for bad PEB handling: 40
ubi0: background thread "ubi_bgt0d" started, PID 708
UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 711
UBIFS (ubi0:0): recovery needed
UBIFS (ubi0:0): recovery completed
UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "configs"
UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi0:0): FS size: 1396736 bytes (1 MiB, 11 LEBs), journal size 888833 bytes (0 MiB, 5 LEBs)
UBIFS (ubi0:0): reserved for root: 65970 bytes (64 KiB)
UBIFS (ubi0:0): media format: w4/r0 (latest is w4/r0), UUID 0935B86A-4714-4EB9-9FD3-96EC09C62EF1, small LPT model
ubi1: attaching mtd5
ubi1: scanning is finished
ubi1: attached mtd5 (name "reserve1", size 128 MiB)
ubi1: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
ubi1: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
ubi1: VID header offset: 2048 (aligned 2048), data offset: 4096
ubi1: good PEBs: 1020, bad PEBs: 4, corrupted PEBs: 0
ubi1: user volume: 1, internal volumes: 1, max. volumes count: 128
ubi1: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 1861296417
ubi1: available PEBs: 0, total reserved PEBs: 1020, PEBs reserved for bad PEB handling: 36
ubi1: background thread "ubi_bgt1d" started, PID 720
UBIFS (ubi1:0): background thread "ubifs_bgt1_0" started, PID 723
UBIFS (ubi1:0): recovery needed
UBIFS (ubi1:0): recovery completed
UBIFS (ubi1:0): UBIFS: mounted UBI device 1, volume 0, name "reserve1"
UBIFS (ubi1:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
UBIFS (ubi1:0): FS size: 123039744 bytes (117 MiB, 969 LEBs), journal size 6221824 bytes (5 MiB, 49 LEBs)
UBIFS (ubi1:0): reserved for root: 4952683 bytes (4836 KiB)
UBIFS (ubi1:0): media format: w4/r0 (latest is w4/r0), UUID 57BDEC5B-CB8F-4CAA-8972-67DEDDA4EA08, small LPT model
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
macb e000b000.ethernet eth0: unable to generate target frequency: 25000000 Hz
macb e000b000.ethernet eth0: link up (100/Full)
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
In axi fpga driver!
request_mem_region OK!
AXI fpga dev virtual address is 0xcfb38000
*base_vir_addr = 0xab013
In fpga mem driver!
request_mem_region OK!
fpga mem virtual address is 0xd2000000
random: nonblocking pool is initialized

------------>


2021-05-13 05:51:42 driver-btm-api.c:663:init_freq_mode: This is scan-user version
2021-05-13 05:51:42 driver-btm-api.c:2028:bitmain_soc_init: opt_multi_version     = 1
2021-05-13 05:51:42 driver-btm-api.c:2029:bitmain_soc_init: opt_bitmain_ab        = 1
2021-05-13 05:51:42 driver-btm-api.c:2030:bitmain_soc_init: opt_bitmain_work_mode = 254
2021-05-13 05:51:42 driver-btm-api.c:2031:bitmain_soc_init: Miner compile time: Thu Apr 23 16:29:07 CST 2020 type: Antminer T17
2021-05-13 05:51:42 driver-btm-api.c:2032:bitmain_soc_init: commit version: 1c5be6f 2020-04-20 16:18:14, build by: lol 2020-04-23 16:35:04
2021-05-13 05:51:42 driver-btm-api.c:1844:show_sn: no SN got, please write SN to /nvdata/sn
2021-05-13 05:51:42 driver-btm-api.c:1167:miner_device_init: Detect 256MB control board of XILINX
2021-05-13 05:51:42 driver-btm-api.c:1115:init_fan_parameter: fan_eft : 0  fan_pwm : 0
2021-05-13 05:51:42 thread.c:885:create_read_nonce_reg_thread: create thread
2021-05-13 05:51:48 driver-btm-api.c:1099:init_miner_version: miner ID : 806cf5864e104814
2021-05-13 05:51:48 driver-btm-api.c:1105:init_miner_version: FPGA Version = 0xB013
2021-05-13 05:51:50 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:52 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:54 eeprom.c:431:check_pattern_test_level: L1 board
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[0] = 1
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[1] = 1
2021-05-13 05:51:54 driver-btm-api.c:737:get_product_id: product_id[2] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[0] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[1] = 1
2021-05-13 05:51:54 driver-btm-api.c:1666:get_ccdly_opt: ccdly_opt[2] = 1
2021-05-13 05:51:54 driver-btm-api.c:1919:bitmain_board_init: g_ccdly_opt = 1
2021-05-13 05:51:54 driver-btm-api.c:676:_set_project_type: project:2
2021-05-13 05:51:54 driver-btm-api.c:706:_set_project_type: Project type: Antminer T17
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [0] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [0] BOM Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [1] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [1] BOM Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:717:dump_pcb_bom_version: Chain [2] PCB Version: 0x0100
2021-05-13 05:51:54 driver-btm-api.c:718:dump_pcb_bom_version: Chain [2] BOM Version: 0x0100
2021-05-13 05:51:55 driver-btm-api.c:1939:bitmain_board_init: Fan check passed.
2021-05-13 05:51:57 board.c:36:jump_and_app_check_restore_pic: chain[0] PIC jump to app
2021-05-13 05:52:00 board.c:40:jump_and_app_check_restore_pic: Check chain[0] PIC fw version=0xb9
2021-05-13 05:52:02 board.c:36:jump_and_app_check_restore_pic: chain[1] PIC jump to app
2021-05-13 05:52:05 board.c:40:jump_and_app_check_restore_pic: Check chain[1] PIC fw version=0xb9
2021-05-13 05:52:07 board.c:36:jump_and_app_check_restore_pic: chain[2] PIC jump to app
2021-05-13 05:52:10 board.c:40:jump_and_app_check_restore_pic: Check chain[2] PIC fw version=0xb9
2021-05-13 05:52:10 thread.c:880:create_pic_heart_beat_thread: create thread
2021-05-13 05:52:10 power_api.c:55:power_init: power init ...
2021-05-13 05:52:10 driver-btm-api.c:1949:bitmain_board_init: Enter 30s sleep to make sure power release finish.
2021-05-13 05:52:43 power_api.c:232:set_iic_power_to_highest_voltage: setting to voltage: 17.00 ...
2021-05-13 05:52:48 power_api.c:124:check_voltage_multi: retry time: 0
2021-05-13 05:52:50 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.132285
2021-05-13 05:52:52 power_api.c:86:get_average_voltage: chain[1], voltage is: 17.175146
2021-05-13 05:52:53 power_api.c:86:get_average_voltage: chain[2], voltage is: 17.138408
2021-05-13 05:52:53 power_api.c:97:get_average_voltage: aveage voltage is: 17.148613
2021-05-13 05:52:53 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
2021-05-13 05:52:54 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2021-05-13 05:53:05 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-13 05:53:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-13 05:53:25 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-13 05:53:25 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-13 05:53:37 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-13 05:53:47 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-13 05:53:57 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
2021-05-13 05:53:57 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-13 05:54:08 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 0
2021-05-13 05:54:18 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 1
2021-05-13 05:54:28 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 2
2021-05-13 05:54:28 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2
2021-05-13 05:54:29 driver-btm-api.c:205:set_miner_status: STATUS_INIT
2021-05-13 05:54:34 driver-btm-api.c:205:set_miner_status: STATUS_OKAY
2021-05-13 05:54:36 driver-btm-c5_socketb.c:1049:main: poweroff hash board and enter sleep mode ...
2021-05-13 05:54:39 driver-btm-api.c:1325:dhash_chip_send_job: Version num 4
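
For anyone comparing several of these logs, the ASIC-count attempts can be pulled out with a few lines of Python. This is a sketch under assumptions: the regular expression is modelled only on the `check_asic_number_with_power_on` lines shown above, and the function name is made up for this example.

```python
import re

# Matches lines such as:
# "... check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0"
ATTEMPT = re.compile(
    r"check_asic_number_with_power_on: Chain\[(\d+)\]: find (\d+) asic, times (\d+)"
)

SAMPLE = """\
2021-05-13 05:53:05 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-13 05:53:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-13 05:53:25 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-13 05:53:37 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-13 05:53:47 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-13 05:53:57 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
"""


def asic_counts(log: str) -> dict[int, list[int]]:
    """Map each chain number to the ASIC counts seen on its attempts, in order."""
    counts: dict[int, list[int]] = {}
    for match in ATTEMPT.finditer(log):
        chain, found = int(match.group(1)), int(match.group(2))
        counts.setdefault(chain, []).append(found)
    return counts


if __name__ == "__main__":
    for chain, found in asic_counts(SAMPLE).items():
        print(f"chain {chain}: {found}")
```

A chain whose count is the same on every attempt (12, 12, 12 here) points to a stable break in the chip chain rather than an intermittent contact.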

hero member
Activity: 544
Merit: 589
From the log you posted, it looks like none of the boards are working when they are all connected. For the hashboard to work, all 30 asics need to be found.

Code:
2021-05-08 20:26:52 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0

2021-05-08 20:27:24 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1

2021-05-08 20:27:55 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2

The "get_average_voltage" messages show each hashboard's measurement of the PSU voltage. There is only one main supply voltage, so the three boards are measuring the same supply. It could be that the PSU is simply shutting down and the three measurements show the voltage dropping after the supply shut down; notice from the timestamps that there are a few seconds between each reading.

Code:
2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
2021-05-08 20:26:23 power_api.c:97:get_average_voltage: aveage voltage is: 16.368945
2021-05-08 20:26:23 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000 
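
To make the timing argument concrete, here is a small Python sketch that reads the three per-chain lines above and reports how far the voltage fell and over how many seconds. The regular expression and function names are assumptions written for this example; they are modelled only on the log lines quoted here.

```python
import re
from datetime import datetime

LOG = """\
2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
"""

READING = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"get_average_voltage: chain\[(\d+)\], voltage is: ([\d.]+)"
)


def parse_voltages(log: str) -> list[tuple[datetime, int, float]]:
    """Return (timestamp, chain, volts) for every per-chain voltage line."""
    readings = []
    for line in log.splitlines():
        match = READING.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            readings.append((stamp, int(match.group(2)), float(match.group(3))))
    return readings


def voltage_trend(readings: list[tuple[datetime, int, float]]) -> tuple[float, float]:
    """Return (seconds elapsed, volts dropped) between first and last reading."""
    first, last = readings[0], readings[-1]
    return (last[0] - first[0]).total_seconds(), first[2] - last[2]


if __name__ == "__main__":
    elapsed, drop = voltage_trend(parse_voltages(LOG))
    print(f"{drop:.3f} V drop over {elapsed:.0f} s")
```

On these lines the supply reading falls by about 1.48 V over 5 seconds, which fits the idea of a single rail decaying while the boards are read one after another.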

From the little info you've given, I'd guess the most probable problem would be a bad PSU or input voltage as mikeywith said. Although I'm not sure how one board would work when the log you posted shows the PSU turning off before the boards start hashing. Before they start hashing, all 3 boards together are using much less power than a single board would use while actually hashing.

We might be able to get a better idea of what the issue is if you post some more logs and screenshots of the status screen.

  • Log of 1 board working when all 3 boards are connected
  • Logs of each board working when connected individually

Also, what is the AC voltage you are powering these with? You should verify with a voltmeter.
legendary
Activity: 2170
Merit: 6279
be constructive or S.T.F.U
All 3 boards in the T17 connected as "standard";
 - 1 card mines
 - 2 card does not mine
 
Restarting with only card "1" connected to socket 0 on the control card;
 - Card "1" mines

Restarting with only card "2" connected to socket 0 on the control card;
 - Card "2" mines

I find it hard to understand this part, it could be your explanation or I am just getting old, but my best guess is that all hash boards work fine as long as you only run 1 hash board!

If the above is correct, then your issue is most likely a bad PSU or low AC input voltage.

newbie
Activity: 6
Merit: 0
Hi all, I'm both new to the mining experience and also to this forum...

I stumbled upon an Antminer T17 the other day that was throwing an error, so the price was right, and I could not resist buying it :)

I'm quite used to troubleshooting things in my daily work, but since I'm new and inexperienced with mining, I need some expert help to get by right now.

I will try to explain what I've done;

First startup (also see part of the kernel log below):

All 3 boards in the T17 connected as "standard";
 - 1 card mines
 - 2 card does not mine
 
Restarting with only card "1" connected to socket 0 on the control card;
 - Card "1" mines

Restarting with only card "2" connected to socket 0 on the control card;
 - Card "2" mines


The kernel log for "First startup" gives me this info:

2021-05-08 20:26:18 power_api.c:86:get_average_voltage: chain[0], voltage is: 17.034316
2021-05-08 20:26:20 power_api.c:86:get_average_voltage: chain[1], voltage is: 16.513857
2021-05-08 20:26:23 power_api.c:86:get_average_voltage: chain[2], voltage is: 15.558662
2021-05-08 20:26:23 power_api.c:97:get_average_voltage: aveage voltage is: 16.368945
2021-05-08 20:26:23 power_api.c:182:set_iic_power_by_voltage: now set voltage to : 17.000000
2021-05-08 20:26:23 uart.c:80:set_baud: set fpga_baud = 115200, fpga_divider = 26
2021-05-08 20:26:33 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 0
2021-05-08 20:26:43 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 1
2021-05-08 20:26:52 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[0]: find 0 asic, times 2
2021-05-08 20:26:52 driver-btm-api.c:1069:check_asic_number: Chain 0 only find 0 asic, will power off hash board 0
2021-05-08 20:27:04 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 0
2021-05-08 20:27:14 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 1
2021-05-08 20:27:24 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[1]: find 12 asic, times 2
2021-05-08 20:27:24 driver-btm-api.c:1069:check_asic_number: Chain 1 only find 12 asic, will power off hash board 1
2021-05-08 20:27:36 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 0
2021-05-08 20:27:45 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 1
2021-05-08 20:27:55 driver-btm-api.c:1042:check_asic_number_with_power_on: Chain[2]: find 0 asic, times 2
2021-05-08 20:27:55 driver-btm-api.c:1069:check_asic_number: Chain 2 only find 0 asic, will power off hash board 2


So my question, based on that, is:

Could this be an ECC/PSU problem, since the voltage on one of the cards is 15.558662?