Topic: Strange problem with miners, help needed (Read 489 times)

brand new
Activity: 0
Merit: 1
December 25, 2018, 06:31:23 PM
#14
Thank you for the procedure on adjusting the fan speed.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
December 18, 2018, 09:47:43 PM
#13
Hm. Well, let's try this: Dunk the board in the fluid, and hook it up to real fans. See if it fails.

If so, it's something with the board. Could be the trimmer circuit at the end of the string. If not, it's something with the fan deletes; try running them out of the fluid with extension cables.

One step at a time.
newbie
Activity: 7
Merit: 0
December 18, 2018, 11:54:49 AM
#12
Have you tried switching the fans between the two types of miners? Maybe the model not working with your device has a different way to determine RPM.

Try to isolate the problem.

Yeah, I've been trying to isolate the problem. I appreciate the reply. I've already tried switching the fans and fan deletes between two types of miners.

So I do have an update. I did the painstaking process over the past week of removing the miners from the fluid and replacing the fans on them. I then got them all to start mining, and reflashed the firmware.

Then, (this is risky, and I would not recommend trying this), I put the fan deletes on and ran them in the air. Guess what? They started mining right before I turned them off. So, now I have the miners that were originally resetting every so often running on the fan deletes in air.

I put them back into the tank, and they start resetting again. Could this be a tank issue?

This fluid isn't supposed to carry a charge though.

I'm still basically where I was at on day 1.

Sorry for the late replies, holidays are approaching and I have been busy.
legendary
Activity: 3164
Merit: 2258
I fix broken miners. And make holes in teeth :-)
December 06, 2018, 03:08:49 PM
#11
Have you tried switching the fans between the two types of miners? Maybe the model not working with your device has a different way to determine RPM.

Try to isolate the problem.
legendary
Activity: 2394
Merit: 6581
be constructive or S.T.F.U
December 05, 2018, 05:54:43 PM
#10
Well then, start by running them at stock freq to eliminate that option. Also check the pools: try using something different. Switch PSUs!

Also, one thing I've noticed about electronics in general is that you will most likely never find two identical units, let alone with the poor quality control in these ASICs, so everything could be really different for God knows why. I have some ASICs from the same batches, with the same cooling, same fan speed and the same just about everything, yet they still have different temps and different hashrates; some need restarts every now and then, and some have been on for months with 0 problems. This is just the way it works!
newbie
Activity: 7
Merit: 0
December 05, 2018, 12:37:32 PM
#9
Thank you for the reply. Yes, I have tried custom firmware based off Bitmain's own firmware. This custom firmware allows me to disable the fans, but it doesn't work. The miner still resets.

I could cut the fan blades, but my main reason for immersion cooling is to cut utility costs (especially right now lol).

I appreciate it though. I'm open to all suggestions.



***Update: Tried Braiins OS and the machine is still resetting.

I'm hoping this is a simple problem I'm overlooking. I just don't understand why the fan mod works on one miner and not this one.
legendary
Activity: 2394
Merit: 6581
be constructive or S.T.F.U
December 04, 2018, 02:51:23 PM
#8
I think you have tried every possible solution that I could suggest to you. What's left now is to try a custom firmware like this one: https://bitcointalksearch.org/topic/antminer-s9-volt-rocket-ship-firmware-mod-time-to-max-out-your-bitmain-psu-4513567

please do your own research on this firmware, i have not tested it and i don't know anything about it except for the fact that the dev claims this can run without fans attached

If you run my firmware the 880v and don't tweak the hash-rate it will do exactly what you're trying to do. It will run without a fan check.

https://bitcointalksearch.org/topic/antminer-s9-volt-rocket-ship-firmware-mod-time-to-max-out-your-bitmain-psu-4513567

Another thing I remember one guy mentioning was cutting the fan blades to reduce the resistance to the cooling fluid. You may keep this as a last resort should everything else fail!
newbie
Activity: 7
Merit: 0
December 04, 2018, 02:02:54 PM
#7
@SidSlobber

First off, thanks for the reply. I've flashed the fixed firmware, and have verified the correct firmware on all miners. I have also already tried the fixed duty cycle for the fans.

Seems that the vast majority of the miners not working are s9is (with only 2 S9s not working).

Since my last post, I have purchased a fan delete with a 555 timer and tried that. Sadly, the results were the same. The miner constantly resets. I am currently in the process of installing Braiins OS. I'm running out of ideas, and since Braiins is a truly 3rd party OS it's worth a shot.

If you have any more logical steps to try I would really appreciate it. I feel like I've tried everything.

Thank you.
newbie
Activity: 14
Merit: 29
December 02, 2018, 03:51:36 AM
#6
To me there is nothing obviously wrong in your kernel log.  I've checked it against one of my miners with what I believe is the same hw/fw config as yours and the log is identical "line for line" including "bmminer not found, restart bmminer" and the chain temps etc - so I don't think either of those previous suggestions are your problem.   You've obviously done the logical troubleshooting - i.e. Miner-Bad runs fine and doesn't reset with original fans attached, but still resets even with a known working fan delete mod swapped from Miner-Good.   Frustrating.  It's possibly crashing when it gets to the "checking fans......" stage e.g. this is from one of my auto-tune S9is which occurs around the same time in the kernel log as when yours crashes:

Code:
Miner compile time: Wed Nov 7 11:19:02 CST 2018 type: Antminer S9i
miner ID : 803244067910485c
Checking fans...
get fan[4] speed=5880
get fan[4] speed=5880
get fan[5] speed=6000
get fan[4] speed=5880
get fan[5] speed=6000
get fan[4] speed=5880
get fan[5] speed=6000

As you have already troubleshot your hardware mod, it's possible that there is some subtle difference between your Miner-Bads and the Miner-Goods - but the issue only shows up when you attach the fan delete mod. You have probably done this already - but I'd start looking for what is different with these crashing miners - if anything. E.g. you mention that you have S9s & S9is in 13, 13.5 and 14 variants - so is there any obvious pattern with the 7 in your tank that are not working? It would give us more clues if, for example, they were all S9is. Also, as per the quote below, you use an example of 2 x 13.5 S9i miners, but your kernel log appears to be from a plain S9 - which is fine obviously, as you only mentioned S9is "as an example". So as the "devil is in the detail", can you just confirm that the kernel log was from an S9, i.e. not from an S9i with the "wrong" S9 asic-boost firmware applied?
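
A minimal sketch of that comparison, assuming each miner's kernel log has been saved as plain text from the web GUI. It pulls out the "Miner Type = ..." line, the model named on the "Miner compile time" line, and any "get fan[N] speed=" readings, so a board that reports one model while running firmware built for another (or a log that never reaches "Checking fans...") stands out. The function name and the dictionary keys are made up for illustration; only the log phrases come from the logs quoted in this thread.

```python
import re

def summarize_kernel_log(text):
    """Pull model and fan-check details out of a saved Antminer kernel log."""
    # Model the control board reports for itself, e.g. "Miner Type = S9".
    miner_type = re.search(r"^Miner Type = (\S+)", text, re.MULTILINE)
    # Model the firmware was built for, taken from the compile-time line.
    fw_model = re.search(r"type: Antminer (S9[ij]?)", text)
    fans = {}
    for fan, speed in re.findall(r"get fan\[(\d+)\] speed=(\d+)", text):
        fans[int(fan)] = int(speed)  # keep the latest reading per fan
    return {
        "miner_type": miner_type.group(1) if miner_type else None,
        "firmware_model": fw_model.group(1) if fw_model else None,
        "reached_fan_check": "Checking fans..." in text,
        "fan_rpm": fans,
    }
```

Running this over the logs of the 7 resetting miners and a few working ones would show quickly whether the bad ones share a model, a firmware build, or a point in the boot where the log stops.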

So let's say I have 2 13.5 TH S9i miners (I'll call them Miner-Good and Miner-Bad).

Until recently, there was only the fixed-freq and the auto-freq firmware for the S9 available. But since Bitmain launched the S9i and S9j and then also released the asic-boost firmwares (very confusingly in different ways for the 3 S9 models - i.e. loaded "on-top" of the previous firmware for the S9s, but a total new replacement firmware for the S9i and S9j) - there are now 11 current "latest" firmwares for the S9 models available on the Bitmain website. Combined with a lack of Bitmain documentation, this has led to people flashing the wrong firmware to their miners - e.g. S9 FW on an S9i - and lots of confusion about the asic-boost update process. The miners may appear to work with the wrong firmware in some cases, but it could cause "strange" issues or sub-optimal performance.

So assuming that you are using the standard auto-tune / auto-freq, to completely rule out any FW incompatibility issues, your miners should be as follows if they are up to date:

S9 - the asic-boost patch Antminer-S9-LPM-20181102.tar.gz  flashed on top of the latest auto-tune FW - Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz which was probably what your miners already had if they were previously up-to-date.  Unfortunately due to the inconsistent way Bitmain released the asic-boost for the S9s as a small patch file, once it is applied "on-top" there is no way of knowing what the "main" firmware version is - the miner's overview page just displays the asic boost version FW details i.e. - "File System Version   Sun Nov 2 11:55:42 UTC 2018"

S9i - just the asic-boost new FW - Antminer-S9i-all-201811071119-autofreq-user-Update2UBI-NF.tar.gz   (the very latest S9i version released a few days ago drops both the power and hashrate by around 40% - which Bitmain either didn't intend or haven't documented - so I guess you don't want to use that).

Apologies if the above is obvious to you and you are certain that you have the correct up-to-date FW on all your different S9 models.  I only mention it as the Bitmain documentation is not clear and some experienced miners have therefore flashed the wrong firmware as can be seen in other sections of this forum.
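
A rough way to sanity-check a download before flashing, assuming the full firmware file names all follow the pattern of the three quoted in this post (model, then a 12-digit build stamp, then the variant). That pattern is inferred from those names only, not from any Bitmain documentation, and the helper names are hypothetical. The small LPM patch file uses a different naming scheme and is deliberately not matched. Note that the stamp in the S9i file name (201811071119) appears to line up with the "Miner compile time: Wed Nov 7 11:19:02 CST 2018" line in the S9i log above.

```python
import re
from datetime import datetime

# Pattern inferred from the file names quoted in this thread (an assumption).
FW_NAME = re.compile(
    r"^Antminer-(S9[ij]?)-all-(\d{12})-(.+?)-user-Update2UBI-NF\.tar\.gz$"
)

def parse_firmware_name(filename):
    """Split a full firmware file name into model, build time and variant."""
    m = FW_NAME.match(filename)
    if m is None:
        return None  # e.g. the LPM patch file, which is named differently
    model, stamp, variant = m.groups()
    return {
        "model": model,
        "built": datetime.strptime(stamp, "%Y%m%d%H%M"),
        "variant": variant,
    }

def firmware_matches(filename, miner_model):
    """True only if the file name parses and names the same model as the miner."""
    info = parse_firmware_name(filename)
    return info is not None and info["model"] == miner_model
```

For example, firmware_matches("Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz", "S9i") comes back False, which is exactly the S9-FW-on-an-S9i mix-up described above.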

Therefore if you are 100% certain of your fan delete mods, all the FW is up-to-date and the correct version for the S9 variant and there are no obvious patterns with the "bad miners", then I guess it is probably going to have to be trial and error to get them working without the fans.   A couple of suggestions which are quick to try are:

a. Try flashing a "bad" S9 with the appropriate fixed-freq firmware: e.g. Antminer-S9-all-201705031838-650M-user-Update2UBI-NF.tar.gz  which is for a 14 Th/s S9 but should be fine for all of them. The fixed-freq FW does a much quicker "boot" process and skips all of the auto-tune and auto-freq processes - so that may help to get round the "fan speed checking" issue. Some of my miners are using fixed-freq FW and, unlike my normal auto-tune miners, there are no "fan checking" lines in the log - so it might work for your set-up - but no guarantee. If it works then you can apply the LPM asic-boost FW on-top, which will leave you with a boosted fixed-freq miner still running with less power, but now also faster than before. Your 13 and 13.5 Th/s S9s may be fine running with the 14 Th/s fixed-freq FW - but if not (i.e. unstable or too many HW errors) just use http://192.168.1.xx/cgi-bin/minerAdvanced.cgi to get to the hidden config page and set the freq manually to whatever is stable (or back to what the miner originally was e.g. freq=631.25 for 13.5Th/s etc). Obviously you can't do this for your S9is as Bitmain doesn't offer any fixed-freq firmware for this model.
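
For picking a starting value on that hidden page, here is a back-of-envelope sketch. It assumes hashrate scales linearly with chip frequency, which is my simplification and not something Bitmain documents, and it uses the 650M / 14 Th/s pairing from the fixed-freq firmware name as the reference point. The function names are just for illustration.

```python
def estimate_hashrate_ths(freq_mhz, ref_freq_mhz=650.0, ref_ths=14.0):
    """Estimate Th/s at freq_mhz by scaling linearly from the reference point."""
    return ref_ths * freq_mhz / ref_freq_mhz

def freq_for_target_ths(target_ths, ref_freq_mhz=650.0, ref_ths=14.0):
    """Inverse of the above: frequency (MHz) that would give target_ths."""
    return ref_freq_mhz * target_ths / ref_ths

# freq=631.25 is the setting mentioned above for a 13.5 Th/s unit.
print(round(estimate_hashrate_ths(631.25), 2))
# Rough starting frequency for a 13 Th/s unit.
print(round(freq_for_target_ths(13.0), 2))
```

Under that assumption 631.25 works out to roughly 13.6 Th/s, close to the nominal 13.5, so the linear estimate looks reasonable as a first guess. Treat the result only as a starting point and let stability and HW errors decide the final value.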

b. Another suggestion - it may not work - but it only takes a few seconds to try - is to manually fix the fan speed - or notional fan speed in your case.  I don't know if a fixed "dummy" fan speed is compatible with your "fan delete hw mod".   This should just end up with an entry as follows as the very last lines of the kernel log and hopefully will mean that the fan speed does not get checked during restart - and hence hopefully stop your miners from crashing.  
  
Code:
Set fixed fan speed=75
FAN PWM: 75
read_temp_func Done!
CRC error counter=0

As you are doing immersion cooling, you are probably far more expert than me with tweaking fans etc, however below is the method I use to manually over-ride the auto-tune fan speed - only takes a few seconds + a restart (it's only temporarily revealing the hidden options in the standard Bitmain miner web-server - so it's not changing any code etc):

The very simplest method - it takes at most 30 seconds per machine and doesn't involve customizing Bitmain's code at all - is as follows:

Note: I've used Chrome - but it is similar for other browsers.

1.  Open your miner in your browser and go to the "Miner Configuration" tab.
2.  Right-click the area around the new "Low Power Mode" and click "Inspect" - the  Chrome DevTools Elements panel will appear on the right.
3.  A few lines down from the Low Power Mode, you'll notice a line with "fan-ctrl" in it.  
4.  On the fan-ctrl line, highlight the  text :none from the style="display :none", press delete and then press enter.
5.  You will now temporarily have a "Customize the fan speed percentage" check-box and input box showing in the miner GUI.  
6.  Enter your new fixed fan speeds (85 to 90% should be good for most overclocking) and do the normal "Save & Apply".
7.  Once the miner restarts - check the kernel log - at the very bottom, it will now show the following instead of the usual auto-tune lines:

Code:
Set fixed fan speed=89
FAN PWM: 89

In summary - it's just right-click to inspect the web-page, delete the :none text on the fan control line and you are done.  
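
If you are doing this on a lot of machines, a small check like the one below can confirm the setting took effect after the restart. It is only a sketch: it assumes you have the kernel log as text, and it looks for the two lines shown above ("Set fixed fan speed=" and "FAN PWM:") near the end of the log. The function name and the 20-line window are arbitrary choices of mine.

```python
import re

def fixed_fan_speed(log_text, tail_lines=20):
    """Return the fixed fan percentage if the log tail confirms it, else None."""
    tail = log_text.strip().splitlines()[-tail_lines:]
    set_pct = None
    pwm_pct = None
    for line in tail:
        line = line.strip()
        m = re.match(r"Set fixed fan speed=(\d+)", line)
        if m:
            set_pct = int(m.group(1))
        m = re.match(r"FAN PWM: (\d+)", line)
        if m:
            pwm_pct = int(m.group(1))
    # Only report success when both lines are present and agree.
    if set_pct is not None and set_pct == pwm_pct:
        return set_pct
    return None
```

A result of None means the miner is most likely still on the usual auto-tune fan handling, or the log was cut off before the fan lines.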

Before edit screenshot:
https://imgur.com/cDHmAlF

After edit screenshot:
https://imgur.com/cIvktyx
full member
Activity: 294
Merit: 129
November 30, 2018, 03:45:00 PM
#5
According to your kernel logs "bmminer not found, restart bmminer ..."
This might be the reason why the miner keeps rebooting. The only way that I know you can fix this is to reflash it with other firmware.

Can you try to flash it with this firmware Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz
Or downgrade it to "November firmware".

The November firmware is well known to cause many problems including the bmminer not found issue. I would NEVER recommend this firmware to anyone.
legendary
Activity: 3374
Merit: 3095
Playbet.io - Crypto Casino and Sportsbook
November 30, 2018, 03:38:37 PM
#4
According to your kernel logs "bmminer not found, restart bmminer ..."
This might be the reason why the miner keeps rebooting. The only way that I know you can fix this is to reflash it with other firmware.

Can you try to flash it with this firmware Antminer-S9-all-201711171757-autofreq-user-Update2UBI-NF.tar.gz
Or downgrade it to "November firmware".

Edit:
It looks like it's unable to detect the chain 5, 6 and 7 temps. Something is going wrong when reading the temperature; it doesn't look like a software problem, more like a faulty sensor.

Code:
chain[5] has no middle temp, use special fix mode.
chain[6] temp offset record: 62,0,0,0,0,0,35,28
chain[6] temp chip I2C addr=0x98
chain[6] has no middle temp, use special fix mode.
chain[7] temp offset record: 62,0,0,0,0,0,35,28
chain[7] temp chip I2C addr=0x98
chain[7] has no middle temp, use special fix mode.
newbie
Activity: 7
Merit: 0
November 30, 2018, 03:07:56 PM
#3
Thank you for the reply:

Code:
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 3.14.0-xilinx-g16220c3 (lzq@armdev2) (gcc version 4.8.3 20140320 (prerelease) (Sourcery CodeBench Lite 2014.05-23) ) #83 SMP PREEMPT Thu Jul 12 11:42:53 CST 2018
[    0.000000] CPU: ARMv7 Processor [413fc090] revision 0 (ARMv7), cr=18c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] Machine model: Xilinx Zynq
[    0.000000] cma: CMA: reserved 128 MiB at 16800000
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] On node 0 totalpages: 126976
[    0.000000] free_area_init_node: node 0, pgdat c074ac00, node_mem_map debd8000
[    0.000000]   Normal zone: 992 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 126976 pages, LIFO batch:31
[    0.000000] PERCPU: Embedded 8 pages/cpu @debc1000 s9088 r8192 d15488 u32768
[    0.000000] pcpu-alloc: s9088 r8192 d15488 u32768 alloc=8*4096
[    0.000000] pcpu-alloc: [0] 0 [0] 1
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 125984
[    0.000000] Kernel command line: noinitrd mem=496M console=ttyPS0,115200 root=ubi0:rootfs ubi.mtd=1 rootfstype=ubifs rw rootwait
[    0.000000] PID hash table entries: 2048 (order: 1, 8192 bytes)
[    0.000000] Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.000000] Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
[    0.000000] Memory: 364316K/507904K available (5057K kernel code, 284K rwdata, 1928K rodata, 204K init, 258K bss, 143588K reserved, 0K highmem)
[    0.000000] Virtual kernel memory layout:
[    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
[    0.000000]     fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
[    0.000000]     vmalloc : 0xdf800000 - 0xff000000   ( 504 MB)
[    0.000000]     lowmem  : 0xc0000000 - 0xdf000000   ( 496 MB)
[    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
[    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
[    0.000000]       .text : 0xc0008000 - 0xc06da8ac   (6987 kB)
[    0.000000]       .init : 0xc06db000 - 0xc070e380   ( 205 kB)
[    0.000000]       .data : 0xc0710000 - 0xc0757138   ( 285 kB)
[    0.000000]        .bss : 0xc0757144 - 0xc0797bfc   ( 259 kB)
[    0.000000] Preemptible hierarchical RCU implementation.
[    0.000000] Dump stacks of tasks blocking RCU-preempt GP.
[    0.000000] RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
[    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
[    0.000000] NR_IRQS:16 nr_irqs:16 16
[    0.000000] ps7-slcr mapped to df802000
[    0.000000] zynq_clock_init: clkc starts at df802100
[    0.000000] Zynq clock init
[    0.000014] sched_clock: 64 bits at 333MHz, resolution 3ns, wraps every 3298534883328ns
[    0.000290] ps7-ttc #0 at df804000, irq=43
[    0.000590] Console: colour dummy device 80x30
[    0.000620] Calibrating delay loop... 1332.01 BogoMIPS (lpj=6660096)
[    0.100239] pid_max: default: 32768 minimum: 301
[    0.100449] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.100468] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[    0.102621] CPU: Testing write buffer coherency: ok
[    0.102929] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.102986] Setting up static identity map for 0x4cb118 - 0x4cb170
[    0.103206] L310 cache controller enabled
[    0.103226] l2x0: 8 ways, CACHE_ID 0x410000c8, AUX_CTRL 0x72760000, Cache size: 512 kB
[    0.180983] CPU1: Booted secondary processor
[    0.270216] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.270344] Brought up 2 CPUs
[    0.270363] SMP: Total of 2 processors activated.
[    0.270371] CPU: All CPU(s) started in SVC mode.
[    0.271030] devtmpfs: initialized
[    0.273454] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    0.274643] regulator-dummy: no parameters
[    0.281985] NET: Registered protocol family 16
[    0.284201] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.286464] cpuidle: using governor ladder
[    0.286477] cpuidle: using governor menu
[    0.293819] syscon f8000000.ps7-slcr: regmap [mem 0xf8000000-0xf8000fff] registered
[    0.295326] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    0.295339] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    0.295451] zynq-ocm f800c000.ps7-ocmc: ZYNQ OCM pool: 256 KiB @ 0xdf880000
[    0.317260] bio: create slab at 0
[    0.318634] vgaarb: loaded
[    0.319336] SCSI subsystem initialized
[    0.320243] usbcore: registered new interface driver usbfs
[    0.320898] usbcore: registered new interface driver hub
[    0.321181] usbcore: registered new device driver usb
[    0.321698] media: Linux media interface: v0.10
[    0.321862] Linux video capture interface: v2.00
[    0.322102] pps_core: LinuxPPS API ver. 1 registered
[    0.322114] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti
[    0.322234] PTP clock support registered
[    0.322594] EDAC MC: Ver: 3.0.0
[    0.323786] Advanced Linux Sound Architecture Driver Initialized.
[    0.326512] DMA-API: preallocated 4096 debug entries
[    0.326526] DMA-API: debugging enabled by kernel config
[    0.326592] Switched to clocksource arm_global_timer
[    0.346521] NET: Registered protocol family 2
[    0.347483] TCP established hash table entries: 4096 (order: 2, 16384 bytes)
[    0.347541] TCP bind hash table entries: 4096 (order: 3, 32768 bytes)
[    0.347626] TCP: Hash tables configured (established 4096 bind 4096)
[    0.347671] TCP: reno registered
[    0.347687] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.347717] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.347954] NET: Registered protocol family 1
[    0.348294] RPC: Registered named UNIX socket transport module.
[    0.348307] RPC: Registered udp transport module.
[    0.348315] RPC: Registered tcp transport module.
[    0.348323] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.348335] PCI: CLS 0 bytes, default 64
[    0.348760] hw perfevents: enabled with ARMv7 Cortex-A9 PMU driver, 7 counters available
[    0.350752] futex hash table entries: 512 (order: 3, 32768 bytes)
[    0.352804] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[    0.352993] msgmni has been set to 967
[    0.353749] io scheduler noop registered
[    0.353762] io scheduler deadline registered
[    0.353801] io scheduler cfq registered (default)
[    0.361720] dma-pl330 f8003000.ps7-dma: Loaded driver for PL330 DMAC-2364208
[    0.361739] dma-pl330 f8003000.ps7-dma: DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16
[    0.484232] e0001000.serial: ttyPS0 at MMIO 0xe0001000 (irq = 82, base_baud = 3124999) is a xuartps
[    1.051365] console [ttyPS0] enabled
[    1.055647] xdevcfg f8007000.ps7-dev-cfg: ioremap 0xf8007000 to df866000
[    1.063254] [drm] Initialized drm 1.1.0 20060810
[    1.080280] brd: module loaded
[    1.089685] loop: module loaded
[    1.099093] e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
[    1.104843] e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
[    1.112889] libphy: XEMACPS mii bus: probed
[    1.117453] ------------- phy_id = 0x3625e62
[    1.122170] xemacps e000b000.ps7-ethernet: pdev->id -1, baseaddr 0xe000b000, irq 54
[    1.130819] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[    1.137476] ehci-pci: EHCI PCI platform driver
[    1.144688] zynq-dr e0002000.ps7-usb: Unable to init USB phy, missing?
[    1.151507] usbcore: registered new interface driver usb-storage
[    1.158350] mousedev: PS/2 mouse device common for all mice
[    1.164433] i2c /dev entries driver
[    1.171355] zynq-edac f8006000.ps7-ddrc: ecc not enabled
[    1.176883] cpufreq_cpu0: failed to get cpu0 regulator: -19
[    1.182758] Xilinx Zynq CpuIdle Driver started
[    1.187666] sdhci: Secure Digital Host Controller Interface driver
[    1.193761] sdhci: Copyright(c) Pierre Ossman
[    1.198127] sdhci-pltfm: SDHCI platform and OF driver helper
[    1.204894] mmc0: no vqmmc regulator found
[    1.208985] mmc0: no vmmc regulator found
[    1.246611] mmc0: SDHCI controller on e0100000.ps7-sdio [e0100000.ps7-sdio] using ADMA
[    1.255283] usbcore: registered new interface driver usbhid
[    1.260798] usbhid: USB HID core driver
[    1.265502] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[    1.271800] nand: Micron MT29F2G08ABAEAWP
[    1.275764] nand: 256MiB, SLC, page size: 2048, OOB size: 64
[    1.281705] Bad block table found at page 131008, version 0x01
[    1.287933] Bad block table found at page 130944, version 0x01
[    1.293981] 3 ofpart partitions found on MTD device pl353-nand
[    1.299765] Creating 3 MTD partitions on "pl353-nand":
[    1.304854] 0x000000000000-0x000002000000 : "BOOT.bin-env-dts-kernel"
[    1.312922] 0x000002000000-0x00000b000000 : "angstram-rootfs"
[    1.320224] 0x00000b000000-0x000010000000 : "upgrade-rootfs"
[    1.330954] TCP: cubic registered
[    1.334189] NET: Registered protocol family 17
[    1.338884] Registering SWP/SWPB emulation handler
[    1.344736] regulator-dummy: disabling
[    1.349055] UBI: attaching mtd1 to ubi0
[    1.876302] UBI: scanning is finished
[    1.888096] UBI: attached mtd1 (name "angstram-rootfs", size 144 MiB) to ubi0
[    1.895151] UBI: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    1.901948] UBI: min./max. I/O unit sizes: 2048/2048, sub-page size 2048
[    1.908624] UBI: VID header offset: 2048 (aligned 2048), data offset: 4096
[    1.915462] UBI: good PEBs: 1152, bad PEBs: 0, corrupted PEBs: 0
[    1.921471] UBI: user volume: 1, internal volumes: 1, max. volumes count: 128
[    1.928583] UBI: max/mean erase counter: 65/19, WL threshold: 4096, image sequence number: 934887695
[    1.937699] UBI: available PEBs: 0, total reserved PEBs: 1152, PEBs reserved for bad PEB handling: 40
[    1.946915] UBI: background thread "ubi_bgt0d" started, PID 1084
[    1.946920] drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    1.950816] ALSA device list:
[    1.950820]   No soundcards found.
[    1.967219] UBIFS: background thread "ubifs_bgt0_0" started, PID 1086
[    1.996094] UBIFS: recovery needed
[    2.096333] UBIFS: recovery completed
[    2.100029] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[    2.105951] UBIFS: LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[    2.115087] UBIFS: FS size: 128626688 bytes (122 MiB, 1013 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
[    2.124970] UBIFS: reserved for root: 0 bytes (0 KiB)
[    2.130010] UBIFS: media format: w4/r0 (latest is w4/r0), UUID 2F0368C9-A3D5-4F6F-8350-C0A1080F2C3F, small LPT model
[    2.141519] VFS: Mounted root (ubifs filesystem) on device 0:11.
[    2.148715] devtmpfs: mounted
[    2.151819] Freeing unused kernel memory: 204K (c06db000 - c070e000)
[    2.990739] random: dd urandom read with 1 bits of entropy available
[    3.376631]
[    3.376631] bcm54xx_config_init
[    3.986640]
[    3.986640] bcm54xx_config_init
[    7.987455] xemacps e000b000.ps7-ethernet: Set clk to 124999998 Hz
[    7.993557] xemacps e000b000.ps7-ethernet: link up (1000/FULL)
[   22.598152] In axi fpga driver!
[   22.601223] request_mem_region OK!
[   22.604610] AXI fpga dev virtual address is 0xdf9fe000
[   22.609758] *base_vir_addr = 0xc51e
[   22.624909] In fpga mem driver!
[   22.628030] request_mem_region OK!
[   22.631541] fpga mem virtual address is 0xe2000000
[   23.426028]
[   23.426028] bcm54xx_config_init
[   24.055960]
[   24.055960] bcm54xx_config_init
[   28.056423] xemacps e000b000.ps7-ethernet: Set clk to 124999998 Hz
[   28.062524] xemacps e000b000.ps7-ethernet: link up (1000/FULL)
log_level = 4
This is XILINX board. Totalram:       507486208
Detect 512MB control board of XILINX
mmap axi_fpga_addr = 0xb6f76000
axi_fpga_addr data = 0xc51e
mmap fpga_mem_addr = 0xb5d6f000
forceFreq=-1 forceFlag=0
min work minertest[0]:912


DETECT HW version=0000c51e
miner ID : 8054748e7680881c
Miner Type = S9
AsicType = 1387
real AsicNum = 63
use critical mode to search freq...
get PLUG ON=0x000000e0
Find hashboard on Chain[5]
Find hashboard on Chain[6]
Find hashboard on Chain[7]
Check chain[5] PIC fw version=0x03
Check chain[6] PIC fw version=0x03
Check chain[7] PIC fw version=0x03
read pic freq and badcore num...
chain[5]: [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[5] has freq in PIC and will jump over...
Chain[5] has core num in PIC
Chain[5] ASIC[0] has core num=1
Chain[5] ASIC[1] has core num=1
Chain[5] ASIC[2] has core num=9
Chain[5] ASIC[3] has core num=2
Chain[5] ASIC[4] has core num=1
Chain[5] ASIC[5] has core num=1
Chain[5] ASIC[6] has core num=1
Chain[5] ASIC[7] has core num=1
Chain[5] ASIC[8] has core num=1
Chain[5] ASIC[10] has core num=1
Chain[5] ASIC[11] has core num=1
Chain[5] ASIC[16] has core num=1
Chain[5] ASIC[17] has core num=3
Chain[5] ASIC[21] has core num=1
Chain[5] ASIC[23] has core num=1
Chain[5] ASIC[25] has core num=1
Chain[5] ASIC[26] has core num=1
Chain[5] ASIC[28] has core num=1
Chain[5] ASIC[29] has core num=12
Chain[5] ASIC[30] has core num=1
Chain[5] ASIC[31] has core num=1
Chain[5] ASIC[32] has core num=3
Chain[5] ASIC[33] has core num=1
Chain[5] ASIC[34] has core num=1
Chain[5] ASIC[35] has core num=1
Chain[5] ASIC[36] has core num=1
Chain[5] ASIC[37] has core num=1
Chain[5] ASIC[38] has core num=1
Chain[5] ASIC[40] has core num=1
Chain[5] ASIC[41] has core num=1
Chain[5] ASIC[43] has core num=1
Chain[5] ASIC[44] has core num=1
Chain[5] ASIC[45] has core num=2
Chain[5] ASIC[46] has core num=1
Chain[5] ASIC[47] has core num=1
Chain[5] ASIC[48] has core num=3
Chain[5] ASIC[50] has core num=1
Chain[5] ASIC[51] has core num=2
Chain[5] ASIC[52] has core num=1
Chain[5] ASIC[53] has core num=1
Chain[5] ASIC[54] has core num=2
Chain[5] ASIC[55] has core num=1
Chain[5] ASIC[56] has core num=1
Chain[5] ASIC[57] has core num=13
Chain[5] ASIC[59] has core num=3
Chain[5] ASIC[60] has core num=1
Chain[5] ASIC[61] has core num=1
Chain[5] ASIC[62] has core num=2
Check chain[5] PIC fw version=0x03
read pic freq and badcore num...
chain[6]: [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[6] has freq in PIC and will jump over...
Chain[6] has core num in PIC
Chain[6] ASIC[0] has core num=1
Chain[6] ASIC[1] has core num=1
Chain[6] ASIC[2] has core num=1
Chain[6] ASIC[4] has core num=2
Chain[6] ASIC[5] has core num=1
Chain[6] ASIC[7] has core num=1
Chain[6] ASIC[8] has core num=1
Chain[6] ASIC[9] has core num=1
Chain[6] ASIC[10] has core num=1
Chain[6] ASIC[11] has core num=1
Chain[6] ASIC[13] has core num=1
Chain[6] ASIC[15] has core num=1
Chain[6] ASIC[16] has core num=1
Chain[6] ASIC[17] has core num=1
Chain[6] ASIC[19] has core num=1
Chain[6] ASIC[21] has core num=1
Chain[6] ASIC[22] has core num=2
Chain[6] ASIC[23] has core num=1
Chain[6] ASIC[24] has core num=2
Chain[6] ASIC[25] has core num=1
Chain[6] ASIC[26] has core num=1
Chain[6] ASIC[27] has core num=1
Chain[6] ASIC[28] has core num=1
Chain[6] ASIC[29] has core num=1
Chain[6] ASIC[30] has core num=1
Chain[6] ASIC[32] has core num=1
Chain[6] ASIC[34] has core num=1
Chain[6] ASIC[35] has core num=1
Chain[6] ASIC[36] has core num=1
Chain[6] ASIC[37] has core num=1
Chain[6] ASIC[38] has core num=1
Chain[6] ASIC[40] has core num=1
Chain[6] ASIC[41] has core num=1
Chain[6] ASIC[42] has core num=1
Chain[6] ASIC[43] has core num=1
Chain[6] ASIC[44] has core num=1
Chain[6] ASIC[48] has core num=12
Chain[6] ASIC[49] has core num=1
Chain[6] ASIC[50] has core num=1
Chain[6] ASIC[51] has core num=1
Chain[6] ASIC[52] has core num=1
Chain[6] ASIC[53] has core num=1
Chain[6] ASIC[55] has core num=1
Chain[6] ASIC[56] has core num=2
Chain[6] ASIC[58] has core num=1
Chain[6] ASIC[59] has core num=1
Chain[6] ASIC[60] has core num=1
Chain[6] ASIC[61] has core num=1
Chain[6] ASIC[62] has core num=1
Check chain[6] PIC fw version=0x03
read pic freq and badcore num...
chain[7]: [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255] [63:255]
has freq in PIC, will disable freq setting.
chain[7] has freq in PIC and will jump over...
Chain[7] has core num in PIC
Chain[7] ASIC[1] has core num=1
Chain[7] ASIC[2] has core num=1
Chain[7] ASIC[3] has core num=2
Chain[7] ASIC[4] has core num=1
Chain[7] ASIC[5] has core num=1
Chain[7] ASIC[6] has core num=1
Chain[7] ASIC[7] has core num=1
Chain[7] ASIC[8] has core num=1
Chain[7] ASIC[9] has core num=1
Chain[7] ASIC[10] has core num=1
Chain[7] ASIC[11] has core num=1
Chain[7] ASIC[12] has core num=1
Chain[7] ASIC[13] has core num=1
Chain[7] ASIC[14] has core num=15
Chain[7] ASIC[15] has core num=2
Chain[7] ASIC[16] has core num=1
Chain[7] ASIC[17] has core num=1
Chain[7] ASIC[18] has core num=2
Chain[7] ASIC[19] has core num=1
Chain[7] ASIC[20] has core num=1
Chain[7] ASIC[21] has core num=1
Chain[7] ASIC[22] has core num=1
Chain[7] ASIC[23] has core num=1
Chain[7] ASIC[25] has core num=1
Chain[7] ASIC[26] has core num=1
Chain[7] ASIC[27] has core num=1
Chain[7] ASIC[28] has core num=1
Chain[7] ASIC[29] has core num=1
Chain[7] ASIC[30] has core num=1
Chain[7] ASIC[32] has core num=1
Chain[7] ASIC[33] has core num=2
Chain[7] ASIC[34] has core num=1
Chain[7] ASIC[35] has core num=1
Chain[7] ASIC[36] has core num=1
Chain[7] ASIC[37] has core num=1
Chain[7] ASIC[38] has core num=1
Chain[7] ASIC[40] has core num=1
Chain[7] ASIC[42] has core num=1
Chain[7] ASIC[43] has core num=1
Chain[7] ASIC[45] has core num=1
Chain[7] ASIC[46] has core num=1
Chain[7] ASIC[47] has core num=1
Chain[7] ASIC[49] has core num=1
Chain[7] ASIC[51] has core num=1
Chain[7] ASIC[52] has core num=1
Chain[7] ASIC[53] has core num=1
Chain[7] ASIC[54] has core num=1
Chain[7] ASIC[55] has core num=1
Chain[7] ASIC[56] has core num=1
Chain[7] ASIC[57] has core num=1
Chain[7] ASIC[59] has core num=2
Chain[7] ASIC[60] has core num=2
Chain[7] ASIC[61] has core num=1
Chain[7] ASIC[62] has core num=1
Check chain[7] PIC fw version=0x03
get PIC voltage=6 on chain[5], value=940
get PIC voltage=6 on chain[6], value=940
get PIC voltage=6 on chain[7], value=940
chain[5] temp offset record: 62,0,0,0,0,0,35,28
chain[5] temp chip I2C addr=0x98
chain[5] has no middle temp, use special fix mode.
chain[6] temp offset record: 62,0,0,0,0,0,35,28
chain[6] temp chip I2C addr=0x98
chain[6] has no middle temp, use special fix mode.
chain[7] temp offset record: 62,0,0,0,0,0,35,28
chain[7] temp chip I2C addr=0x98
chain[7] has no middle temp, use special fix mode.
total_exist_chain_num = 3
single_board_frq_tuning enter
min_rate, des_rate, fix_volt:13800, 14000, 880
force_freq not set, don't need tuning
restart Miner chance num=2
waiting for receive_func to exit!
waiting for pic heart to exit!
bmminer not found= 1660 root       0:00 grep bmminer

bmminer not found, restart bmminer ...
This is user mode for mining
Detect 512MB control board of XILINX
Miner Type = S9
Miner compile time: Sun Nov 2 11:55:42 UTC 2018 type: Antminer S9set_reset_allhashboard = 0x0000ffff
set_reset_allhashboard = 0x00000000
set_reset_allhashboard = 0x0000ffff
miner ID : 8054748e7680881c
set_reset_allhashboard = 0x0000ffff

That's how far it gets before it resets.

The miner works fine with the air cooled fans it came with.

With a fan delete, or with firmware which overrides the fans, it just constantly resets.

I've tried taking one of the fan deletes (linked earlier) that works on a good miner and putting it on one of the bad miners, and it still just resets.
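
For anyone comparing a log like the one above between a good miner and a bad one, here is a minimal Python sketch that pulls out the per-chain "core num" entries, the PIC voltage values, and the number of bmminer restarts. The function name `summarize_log` is made up for illustration, and the patterns only match the line shapes that appear in this particular log; other firmware versions may print them differently. What "core num" actually means is not confirmed in this thread (the "read pic freq and badcore num..." line suggests it is a bad-core record, but that is an assumption).

```python
import re
from collections import defaultdict

# Patterns mirror the line shapes seen in the pasted log; they are not
# taken from any official bmminer documentation.
CORE_RE = re.compile(r"Chain\[(\d+)\] ASIC\[(\d+)\] has core num=(\d+)")
VOLT_RE = re.compile(r"get PIC voltage=\d+ on chain\[(\d+)\], value=(\d+)")
RESTART_RE = re.compile(r"bmminer not found, restart bmminer")


def summarize_log(text):
    """Collect core-num entries, PIC voltage values, and restart count."""
    cores = defaultdict(dict)  # chain -> {asic index: core num}
    volts = {}                 # chain -> reported voltage value
    restarts = 0
    for line in text.splitlines():
        m = CORE_RE.search(line)
        if m:
            chain, asic, num = map(int, m.groups())
            cores[chain][asic] = num
            continue
        m = VOLT_RE.search(line)
        if m:
            volts[int(m.group(1))] = int(m.group(2))
            continue
        if RESTART_RE.search(line):
            restarts += 1
    return {"cores": dict(cores), "volts": volts, "restarts": restarts}
```

Running it on the log from a working miner and from a resetting one, then comparing the two results, would at least show whether the boards themselves report anything different before the reset.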
full member
Activity: 294
Merit: 129
November 30, 2018, 01:45:51 PM
#2
Can you post the logs of the machines that keep restarting? That would help immensely in trying to troubleshoot the issue.
newbie
Activity: 7
Merit: 0
November 30, 2018, 01:20:53 PM
#1
I posted about this problem recently and was told to download different firmware. I tried that and the problem is the same, so I'm back again to explain in more detail. Thanks for the replies.

So, I'm going to try and keep this short and simple. I run a small mining setup (100 miners) using immersion cooling. Currently, I have a mix of S9 and S9i miners from 13 to 14 TH/s.

I purchased some fan delete mods (these, but at a cheaper price from someone else: https://www.amazon.com/Wave-Pulse-Generator-Adjustable-1Hz-150KHz/dp/B01MA1M7Y9 ). I've been soldering these and attaching them to the miners.
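
Since the fan delete is just an adjustable pulse generator standing in for the fan's tach signal, the frequency it is set to determines what RPM the control board thinks it sees. Here is a rough Python sketch of that conversion. It assumes 2 tach pulses per revolution, which is the usual figure for PC-style fans but is not confirmed for the S9 fans in this thread, and the helper names are made up for illustration.

```python
def tach_frequency_hz(rpm, pulses_per_rev=2):
    """Frequency a pulse generator needs to imitate a fan at `rpm`.

    pulses_per_rev=2 is an assumption (typical for PC-style fans).
    """
    return rpm * pulses_per_rev / 60.0


def rpm_from_frequency(freq_hz, pulses_per_rev=2):
    """RPM a controller would infer from a tach signal at `freq_hz`."""
    return freq_hz * 60.0 / pulses_per_rev
```

For example, under that assumption a 6000 RPM fan corresponds to `tach_frequency_hz(6000)`, which is 200 Hz. If the good and bad miners accept different RPM ranges, checking what frequency each fan delete is actually putting out would be one way to rule that in or out.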

This is where the problem starts. Just remember, these miners work perfectly with the out of the box air cooled fans attached.

So let's say I have two 13.5 TH/s S9i miners (I'll call them Miner-Good and Miner-Bad). I flash them both with the new AsicBoost firmware recently released by Bitmain. I then attach the fan deletes. Miner-Good works flawlessly. Miner-Bad, however, starts resetting itself every 15-30 seconds.

So, I decide to take the fan delete off Miner-Good and put it on Miner-Bad. Miner-Bad still just resets.

I tried downloading custom firmware to disable the fan check. Miner-Good works fine but Miner-Bad still resets.

The only way I can get Miner-Bad to stop resetting is to reattach the original air cooled fans. I just don't understand why. They are both S9i's at 13.5 TH/s. They have the same firmware.

And this isn't an issue with just one miner. In one tank I have 17 miners that work and 7 that do not (all resetting unless I attach the original fans).

Does anyone have any other ideas or solutions? I'm lost.

Thank you. I hope I was clear, but I'm not much of a writer :)