
Topic: S9 - Fan Error when 2+ boards hooked up

full member
Activity: 538
Merit: 175
September 01, 2018, 10:03:41 PM
#6
No, only 1 fan is working. The one reading 720 is only spinning because of the other fan forcing air through it. Usually it is the exhaust fan that dies.

You are right, but it is not always the exhaust fan that dies. In my high-corrosion area, the intake fan is the one that fails maybe 4 out of 5 times.

The higher-numbered fan sockets are closer to the edge of the controller board. Just follow the wire from the socket to find out which fan you need to replace. So in your case, fan[3] is the lower-numbered one, and its socket is farther from the edge of the board.
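
To pick out the suspect fan from a long kernel log, the readings can be filtered with a few lines of Python. This is only a minimal sketch, not what the firmware does: `find_suspect_fans` is a hypothetical helper, and the `min_rpm` threshold of 1000 is an assumed value chosen for illustration. The only thing taken from this thread is the `get fan[N] speed=R` line format and the readings from post #4.

```python
import re

# Matches lines such as "get fan[3] speed=720" anywhere in the log text.
FAN_LINE = re.compile(r"get fan\[(\d+)\] speed=(\d+)")

def find_suspect_fans(log_text, min_rpm=1000):
    """Return {fan_index: last_rpm} for fans whose last reading is below min_rpm.

    min_rpm is an assumed threshold, not a value taken from the firmware.
    """
    last = {}
    for match in FAN_LINE.finditer(log_text):
        # Later readings overwrite earlier ones, so only the last one is kept.
        last[int(match.group(1))] = int(match.group(2))
    return {idx: rpm for idx, rpm in last.items() if rpm < min_rpm}

# Readings quoted in post #4 of this thread.
sample = """get fan[3] speed=720
get fan[5] speed=4560
Fatal Error: some Fan lost or Fan speed low!"""

suspects = find_suspect_fans(sample)
print(suspects)
```

The index it reports is the socket number, so the wire still has to be followed from that socket to see which physical fan it is.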
legendary
Activity: 3822
Merit: 2703
Evil beware: We have waffles!
August 10, 2018, 08:17:18 AM
#5
No, only 1 fan is working. The one reading 720 is only spinning because of the other fan forcing air through it. Usually it is the exhaust fan that dies.
newbie
Activity: 10
Merit: 0
August 09, 2018, 11:50:40 PM
#4
How did you fix this problem? I have the same problem now.

get fan[3] speed=720
get fan[5] speed=4560
Fatal Error: some Fan lost or Fan speed low!

Both fans work, but at low RPM.
newbie
Activity: 4
Merit: 1
January 22, 2018, 06:05:56 PM
#3
Here is my kernel log.


Code:
Booting Linux on physical CPU 0x0
Initializing cgroup subsys cpuset
Linux version 3.10.31-ltsi-00003-gcf03eb9 (lzq@armdev01) (gcc version 4.7.3 20121106 (prerelease) (crosstool-NG linaro-1.13.1-4.7-2012.11-20121123 - Linaro GCC 2012.11) ) #81 SMP Mon Apr 25 11:20:36 CST 2016
CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=10c5387d
CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine: Altera SOCFPGA, model: Altera SOCFPGA Cyclone V
Memory policy: ECC disabled, Data cache writealloc
On node 0 totalpages: 258048
free_area_init_node: node 0, pgdat 806e5cc0, node_mem_map 8072a000
  Normal zone: 2016 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 258048 pages, LIFO batch:31
PERCPU: Embedded 8 pages/cpu @80f17000 s11200 r8192 d13376 u32768
pcpu-alloc: s11200 r8192 d13376 u32768 alloc=8*4096
pcpu-alloc: [0] 0 [0] 1
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 256032
Kernel command line: mem=1008M console=ttyS0,115200 root=/dev/mtdblock3 rw rootfstype=jffs2
PID hash table entries: 4096 (order: 2, 16384 bytes)
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1008MB = 1008MB total
Memory: 1015844k/1015844k available, 16348k reserved, 0K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    vmalloc : 0xbf800000 - 0xff000000   (1016 MB)
    lowmem  : 0x80000000 - 0xbf000000   (1008 MB)
    modules : 0x7f000000 - 0x80000000   (  16 MB)
      .text : 0x80008000 - 0x8065a930   (6475 kB)
      .init : 0x8065b000 - 0x806adbc0   ( 331 kB)
      .data : 0x806ae000 - 0x806e9990   ( 239 kB)
       .bss : 0x806e9990 - 0x80729384   ( 255 kB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
Hierarchical RCU implementation.
NR_IRQS:16 nr_irqs:16 16
sched_clock: 32 bits at 100MHz, resolution 10ns, wraps every 42949ms
Console: colour dummy device 80x30
Calibrating delay loop... 1196.85 BogoMIPS (lpj=5984256)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
ftrace: allocating 17687 entries in 52 pages
CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
Setting up static identity map for 0x804ab220 - 0x804ab278
CPU1: failed to come online
Brought up 1 CPUs
SMP: Total of 1 processors activated (1196.85 BogoMIPS).
CPU: All CPU(s) started in SVC mode.
devtmpfs: initialized
NET: Registered protocol family 16
fpga bridge driver
DMA: preallocated 256 KiB pool for atomic coherent allocations
L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x410030c9, AUX_CTRL 0x32460000, Cache size: 524288 B
syscon fffef000.l2-cache: regmap [mem 0xfffef000-0xfffeffff] registered
syscon ffd05000.rstmgr: regmap [mem 0xffd05000-0xffd05fff] registered
syscon ffc25000.sdrctl: regmap [mem 0xffc25000-0xffc25fff] registered
syscon ff800000.l3regs: regmap [mem 0xff800000-0xff800fff] registered
syscon ffd08000.sysmgr: regmap [mem 0xffd08000-0xffd0bfff] registered
hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
hw-breakpoint: maximum watchpoint size is 4 bytes.
altera_hps2fpga_bridge fpgabridge.2: fpga bridge [hps2fpga] registered as device hps2fpga
altera_hps2fpga_bridge fpgabridge.2: init-val not specified
altera_hps2fpga_bridge fpgabridge.3: fpga bridge [lshps2fpga] registered as device lwhps2fpga
altera_hps2fpga_bridge fpgabridge.3: init-val not specified
altera_hps2fpga_bridge fpgabridge.4: fpga bridge [fpga2hps] registered as device fpga2hps
altera_hps2fpga_bridge fpgabridge.4: init-val not specified
bio: create slab at 0
FPGA Mangager framework driver
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
pps_core: LinuxPPS API ver. 1 registered
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
PTP clock support registered
Switching to clocksource timer0
NET: Registered protocol family 2
TCP established hash table entries: 8192 (order: 4, 65536 bytes)
TCP bind hash table entries: 8192 (order: 4, 65536 bytes)
TCP: Hash tables configured (established 8192 bind 8192)
TCP: reno registered
UDP hash table entries: 512 (order: 2, 16384 bytes)
UDP-Lite hash table entries: 512 (order: 2, 16384 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
arm-pmu arm-pmu: PMU:CTI successfully enabled for 1 cores
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
NTFS driver 2.1.30 [Flags: R/W].
jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
msgmni has been set to 1984
io scheduler noop registered (default)
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
ffc02000.serial0: ttyS0 at MMIO 0xffc02000 (irq = 194) is a 16550A
console [ttyS0] enabled
altera_fpga_manager ff706000.fpgamgr: fpga manager [Altera FPGA Manager] registered as minor 0
brd: module loaded
denali-nand-dt ff900000.nand: Dump timing register values:acc_clks: 4, re_2_we: 20, re_2_re: 20
we_2_re: 12, addr_2_data: 14, rdwr_en_lo_cnt: 2
rdwr_en_hi_cnt: 2, cs_setup_cnt: 2
ONFI param page 0 valid
ONFI flash detected
NAND device: Manufacturer ID: 0x2c, Chip ID: 0xda (Micron MT29F2G08ABAEAWP), 256MiB, page size: 2048, OOB size: 64
Bad block table found at page 131008, version 0x01
Bad block table found at page 130944, version 0x01
5 ofpart partitions found on MTD device denali-nand
Creating 5 MTD partitions on "denali-nand":
0x000000000000-0x000001000000 : "NAND Flash Boot Area 16MB"
0x000001000000-0x000002000000 : "NAND Flash Boot Area backup1 16MB"
0x000002000000-0x000003000000 : "NAND Flash Boot Area backup2 16MB"
0x000003000000-0x00000b000000 : "NAND Flash jffs2 Root Filesystem 128MB"
0x00000b000000-0x000010000000 : "NAND Flash jffs2 Root Filesystem 80MB"
dw_spi_mmio fff00000.spi: master is unqueued, this is deprecated
CAN device driver interface
c_can_platform ffc00000.d_can: invalid resource
c_can_platform ffc00000.d_can: control memory is not used for raminit
c_can_platform ffc00000.d_can: c_can_platform device registered (regs=bf8dc000, irq=163)
stmmac_hw_init: 1000M
stmmac - user ID: 0x10, Synopsys ID: 0x37
 Ring mode enabled
 DMA HW capability register supported
 Enhanced/Alternate descriptors
Enabled extended descriptors
 RX Checksum Offload Engine supported (type 2)
 TX Checksum insertion supported
 Enable RX Mitigation via HW Watchdog Timer
libphy: stmmac: probed
eth0: PHY ID 0007c0f1 at 0 IRQ POLL (stmmac-0:00) active
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
i2c /dev entries driver
Synopsys Designware Multimedia Card Interface Driver
dwmmc_socfpga ff704000.dwmmc0: couldn't determine pwr-en, assuming pwr-en = 0
dwmmc_socfpga ff704000.dwmmc0: Using internal DMA controller.
dwmmc_socfpga ff704000.dwmmc0: Version ID is 240a
dwmmc_socfpga ff704000.dwmmc0: DW MMC controller at irq 171, 32 bit host data width, 1024 deep fifo
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 400000Hz, actual 396825HZ div = 63)
dwmmc_socfpga ff704000.dwmmc0: 1 slots initialized
ledtrig-cpu: registered to indicate activity on CPUs
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
oprofile: using arm/armv7-ca9
TCP: cubic registered
NET: Registered protocol family 10
sit: IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
can: controller area network core (rev 20120528 abi 9)
NET: Registered protocol family 29
can: raw protocol (rev 20120528)
can: broadcast manager protocol (rev 20120528 t)
can: netlink gateway (rev 20130117) max_hops=1
8021q: 802.1Q VLAN Support v1.8
Key type dns_resolver registered
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
ThumbEE CPU extension supported.
Registering SWP/SWPB emulation handler
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 300000Hz, actual 297619HZ div = 84)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 200000Hz, actual 200000HZ div = 125)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 100000Hz, actual 100000HZ div = 250)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 400000Hz, actual 396825HZ div = 63)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 300000Hz, actual 297619HZ div = 84)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 200000Hz, actual 200000HZ div = 125)
mmc_host mmc0: Bus speed (slot 0) = 50000000Hz (slot req 100000Hz, actual 100000HZ div = 250)
jffs2: jffs2_scan_inode_node(): CRC failed on node at 0x059937cc: Read 0xffffffff, calculated 0x3ebc8775
jffs2: Empty flash at 0x05993824 ends at 0x05994000
jffs2: jffs2_scan_inode_node(): CRC failed on node at 0x073087c4: Read 0xffffffff, calculated 0xcfaca8a3
VFS: Mounted root (jffs2 filesystem) on device 31:3.
devtmpfs: mounted
Freeing unused kernel memory: 328K (8065b000 - 806ad000)
eth0: device MAC address 4a:67:55:e8:8d:3a
init phy ok
PHY DMA init OK
eth0: device MAC address 00:e9:6d:16:03:f9
init phy ok
PHY DMA init OK
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
libphy: stmmac-0:00 - Link is Up - 100/Full
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
In axi fpga driver!
Original value in RESET_MANAGER_BASE_ADDR + BRGMODRST_ADDR is 0x0
request_mem_region OK!
AXI fpga dev virtual address is 0xbf942000
*base_vir_addr = 0xc50f
In fpga mem driver!
request_mem_region OK!
fpga mem virtual address is 0xc0000000
eth0: device MAC address 00:e9:6d:16:03:f9
init phy ok
PHY DMA init OK
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
eth0: device MAC address 00:e9:6d:16:03:f9
init phy ok
PHY DMA init OK
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
libphy: stmmac-0:00 - Link is Up - 100/Full
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
This is C5 board.
DETECT HW version=0000c50f
miner ID : 00f14388186cfa13
Miner Type = S9
AsicType = 1387
real AsicNum = 63
use critical mode to search freq...
get PLUG ON=0x00000007
Find hashboard on Chain[0]
Find hashboard on Chain[1]
Find hashboard on Chain[2]
set_reset_allhashboard = 0x0000ffff
Check chain[0] PIC fw version=0x03
Check chain[1] PIC fw version=0x03
Check chain[2] PIC fw version=0x03
chain[0]: [63:22] [63:5] [63:25] [63:4] [63:5] [63:19] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[0] has freq in PIC and will jump over...
Chain[0] has core num in PIC
Chain[0] ASIC[12] has core num=3
Chain[0] ASIC[15] has core num=2
Chain[0] ASIC[17] has core num=4
Chain[0] ASIC[43] has core num=1
Check chain[0] PIC fw version=0x03
chain[1]: [63:22] [63:5] [63:24] [63:24] [63:70] [63:0] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[1] has freq in PIC and will jump over...
Chain[1] has core num in PIC
Chain[1] ASIC[1] has core num=1
Chain[1] ASIC[15] has core num=3
Chain[1] ASIC[38] has core num=1
Chain[1] ASIC[62] has core num=1
Check chain[1] PIC fw version=0x03
chain[2]: [63:22] [63:5] [63:24] [63:25] [63:40] [63:53] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[2] has freq in PIC and will jump over...
Chain[2] has core num in PIC
Chain[2] ASIC[4] has core num=1
Chain[2] ASIC[12] has core num=2
Chain[2] ASIC[15] has core num=2
Chain[2] ASIC[19] has core num=1
Chain[2] ASIC[20] has core num=1
Chain[2] ASIC[21] has core num=3
Chain[2] ASIC[23] has core num=1
Chain[2] ASIC[25] has core num=1
Chain[2] ASIC[27] has core num=2
Chain[2] ASIC[28] has core num=9
Chain[2] ASIC[30] has core num=5
Chain[2] ASIC[32] has core num=5
Chain[2] ASIC[33] has core num=1
Chain[2] ASIC[35] has core num=4
Chain[2] ASIC[37] has core num=5
Chain[2] ASIC[38] has core num=1
Chain[2] ASIC[39] has core num=1
Chain[2] ASIC[41] has core num=5
Chain[2] ASIC[42] has core num=1
Chain[2] ASIC[43] has core num=1
Chain[2] ASIC[49] has core num=12
Chain[2] ASIC[50] has core num=1
Chain[2] ASIC[58] has core num=1
Chain[2] ASIC[62] has core num=7
Check chain[2] PIC fw version=0x03
get PIC voltage=74 on chain[0], value=900
get PIC voltage=108 on chain[1], value=880
get PIC voltage=6 on chain[2], value=940
set_reset_allhashboard = 0x00000000
chain[0] temp offset record: 62,-3,32,-4,0,0,0,0
chain[0] temp chip I2C addr=0x98
chain[1] temp offset record: 62,-4,32,-4,0,0,0,0
chain[1] temp chip I2C addr=0x98
chain[2] temp offset record: 62,-4,32,-7,0,0,0,0
chain[2] temp chip I2C addr=0x98
set_reset_allhashboard = 0x0000ffff
set_reset_allhashboard = 0x00000000
CRC error counter=0
set command mode to VIL

--- check asic number
After Get ASIC NUM CRC error counter=0
set_baud=0
The min freq=700
set real timeout 52, need sleep=379392
After TEST CRC error counter=0
set_reset_allhashboard = 0x0000ffff
set_reset_allhashboard = 0x00000000
search freq for 1 times, completed chain = 3, total chain num = 3
set_reset_allhashboard = 0x0000ffff
set_reset_allhashboard = 0x00000000
restart Miner chance num=2
waiting for receive_func to exit!
waiting for pic heart to exit!
bmminer not found=  365 root       0:00 grep bmminer

bmminer not found, restart bmminer ...
This is user mode for mining
This is C5 board.
Miner Type = S9
Miner compile time: Tue Aug 15 11:37:46 CST 2017 type: Antminer S9set_reset_allhashboard = 0x0000ffff
set_reset_allhashboard = 0x00000000
set_reset_allhashboard = 0x0000ffff
miner ID : 00f14388186cfa13
set_reset_allhashboard = 0x0000ffff
get fan[0] speed=1680
get fan[0] speed=1680
get fan[0] speed=1680
Checking fans!get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=1680
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4200
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
get fan[1] speed=4080
get fan[0] speed=4320
Fatal Error: some Fan lost or Fan speed low!
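
The fan section of the log above is long and repetitive: only fan[0] and fan[1] are reported, fan[0] starts at 1680 and later reads 4320, and fan[1] goes from 4200 to 4080 shortly before the fatal error. A rough way to make that easier to read is to collapse consecutive identical readings into runs. The sketch below assumes nothing beyond the `get fan[N] speed=R` line format; `fan_runs` is a hypothetical helper name, the sample is a shortened excerpt rather than the full log, and the result does not explain why the firmware raised the error.

```python
import re
from itertools import groupby

# Also matches readings fused onto another message, e.g. "Checking fans!get fan[0] ...".
FAN_LINE = re.compile(r"get fan\[(\d+)\] speed=(\d+)")

def fan_runs(log_text):
    """Collapse consecutive identical readings per fan into (rpm, count) runs."""
    readings = {}
    for match in FAN_LINE.finditer(log_text):
        readings.setdefault(int(match.group(1)), []).append(int(match.group(2)))
    return {
        idx: [(rpm, len(list(group))) for rpm, group in groupby(rpms)]
        for idx, rpms in readings.items()
    }

# Shortened excerpt in the same format as the log above.
sample = (
    "get fan[0] speed=1680\n"
    "Checking fans!get fan[0] speed=1680\n"
    "get fan[1] speed=4200\n"
    "get fan[0] speed=4320\n"
    "get fan[1] speed=4080\n"
)

for idx, runs in sorted(fan_runs(sample).items()):
    print(idx, runs)
```

Pasting the full kernel log into `sample` gives one short list per fan instead of a hundred lines.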
hero member
Activity: 756
Merit: 560
January 22, 2018, 01:07:58 PM
#2
Flash the device with different firmware? Try a different fan port on the controller?
newbie
Activity: 4
Merit: 1
January 22, 2018, 11:00:07 AM
#1
Hello,

I recently purchased a used S9 with 1 board not hashing. I had the previous owner reboot the device and saw it come back to hashing on 2 boards. He had it hooked up to an APW3++ on 220 V. I brought it home and hooked it up to the APW3++ on 110 V, but left the non-hashing board unhooked (knowing the power limitations of the APW3++ on 110 V). Here's what happened:

1. I let it sit for over an hour. No hashing. I checked the kernel log: it recognized both boards, but then started cycling through fan checks and reported "Fatal Error: some Fan lost or Fan speed low!"
2. Unhooked I/O cable for 2 boards, leaving board #1 connected. Boots up, starts hashing within 5 minutes.
3. Disconnected board #1 and connected #2, started hashing within 5 minutes.
4. Hooked both boards back up, kernel log shows fan error.

Any ideas?