Hi ckolivas,
I have an issue with one of my rigs (2 x 5850), no matter which version of cgminer I use. GPU1 dies within a couple of hours and the rig stops responding; sometimes a hard reset is unavoidable.
The system runs:
CentOS 6.3
kernel 2.6.32-279.5.1.el6.x86_64
glibc-2.12-1.80.el6_3.4.x86_64
glib2-2.22.5-7.el6.x86_64
gcc-4.4.6-4.el6.x86_64
AMD-APP-SDK-v2.5-lnx64
ati-driver-installer-11-12-x86.x86_64
I tried the latest cgminer-2.7.0 with no optimizations at all, only intensity set to 2. The following is taken from /var/log/messages.
--------------------------------------------------------------------------------------------------------------------------------------
Aug 19 06:08:21 hostname kernel: [fglrx] ASIC hang happened
Aug 19 06:08:21 hostname kernel: Pid: 3430, comm: cgminer Tainted: P --------------- 2.6.32-279.5.1.el6.x86_64 #1
Aug 19 06:08:21 hostname kernel: Call Trace:
Aug 19 06:08:21 hostname kernel: [] ? KCL_DEBUG_OsDump+0xe/0x10 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_hardwareHangRecovery+0x1c/0x50 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _ZN4Asic9WaitUntil15ResetASICIfHungEv+0x9/0x10 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _ZN4Asic9WaitUntil15WaitForCompleteEv+0x9c/0xf0 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _ZN4Asic19PM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0xaf/0x170 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_trace+0x72/0x1e0 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_trace+0x72/0x1e0 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _ZN15QS_PRIVATE_CORE27multiVpuPM4ElapsedTimeStampEj14_LARGE_INTEGER12_QS_CP_RING_+0x33/0x50 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _Z19uQSTimeStampRetiredmjj14_LARGE_INTEGER+0x74/0x80 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? _Z8uCWDDEQCmjjPvjS_+0x54d/0x10c0 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? down+0x2e/0x50
Aug 19 06:08:21 hostname kernel: [] ? firegl_cmmqs_CWDDE_32+0x332/0x440 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_cmmqs_CWDDE32+0x70/0x100 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_cmmqs_CWDDE32+0x0/0x100 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? firegl_ioctl+0x1ed/0x250 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? __do_page_fault+0x1ec/0x480
Aug 19 06:08:21 hostname kernel: [] ? ip_firegl_unlocked_ioctl+0xe/0x20 [fglrx]
Aug 19 06:08:21 hostname kernel: [] ? vfs_ioctl+0x22/0xa0
Aug 19 06:08:21 hostname kernel: [] ? do_vfs_ioctl+0x84/0x580
Aug 19 06:08:21 hostname kernel: [] ? sys_ioctl+0x81/0xa0
Aug 19 06:08:21 hostname kernel: [] ? system_call_fastpath+0x16/0x1b
Aug 19 06:08:21 hostname kernel: pubdev:0xffffffffa049ed20, num of device:2 , name:fglrx, major 8, minor 92.
Aug 19 06:08:21 hostname kernel: device 0 : 0xffff88007a1b0000 .
Aug 19 06:08:21 hostname kernel: Asic ID:0x6899, revision:0x2, MMIOReg:0xffffc90000340000.
Aug 19 06:08:21 hostname kernel: FB phys addr: 0xd0000000, MC :0xf00000000, Total FB size :0x40000000.
Aug 19 06:08:21 hostname kernel: gart table MC:0xf0fb27000, Physical:0xdfb27000, size:0x1d8000.
Aug 19 06:08:21 hostname kernel: mc_node :FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf00000000, Physical:0xd0000000, size:0xfd00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0xfb27000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xfb27000, size:0x1d9000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :INV_FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf0fd00000, Physical:0xdfd00000, size:0x30300000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_USWC, total 2 zones
Aug 19 06:08:21 hostname kernel: MC start:0x260c0000, Physical:0x0, size:0x24c00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x2000000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_CACHEABLE, total 3 zones
Aug 19 06:08:21 hostname kernel: MC start:0x10400000, Physical:0x0, size:0x15cc0000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1500000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1400000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1300000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1200000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1100000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1000000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xf00000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xe00000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xd00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xb00000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x800000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x500000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x200000, size:0x300000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x200000, reference count:6, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: GRBM : 0xb0633828, SRBM : 0x20000ec0 .
Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x1920 , CP_RB_WPTR :0x1920.
Aug 19 06:08:21 hostname kernel: CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x2686f000.
Aug 19 06:08:21 hostname kernel: last submit IB buffer -- MC :0x2686f000,phys:0x7b4f5000.
Aug 19 06:08:21 hostname kernel: device 1 : 0xffff88003759c000 .
Aug 19 06:08:21 hostname kernel: Asic ID:0x6899, revision:0x2, MMIOReg:0xffffc90004980000.
Aug 19 06:08:21 hostname kernel: FB phys addr: 0xe0000000, MC :0xf00000000, Total FB size :0x40000000.
Aug 19 06:08:21 hostname kernel: gart table MC:0xf0fb27000, Physical:0xefb27000, size:0x1d8000.
Aug 19 06:08:21 hostname kernel: mc_node :FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf00000000, Physical:0xe0000000, size:0xfd00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0xfb27000, reference count:20, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x1000000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xfb27000, size:0x1d9000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :INV_FB, total 1 zones
Aug 19 06:08:21 hostname kernel: MC start:0xf0fd00000, Physical:0xefd00000, size:0x30300000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x302f4000, size:0xc000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_USWC, total 2 zones
Aug 19 06:08:21 hostname kernel: MC start:0x260c0000, Physical:0x0, size:0x24c00000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x2000000, reference count:21, mapping count:0,
Aug 19 06:08:21 hostname kernel: mc_node :GART_CACHEABLE, total 3 zones
Aug 19 06:08:21 hostname kernel: MC start:0x10400000, Physical:0x0, size:0x15cc0000.
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2c00000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2b00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2a00000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2900000, size:0x100000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2800000, size:0x100000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x2600000, size:0x200000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1d00000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x1400000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xb00000, size:0x900000, reference count:3, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x200000, size:0x900000, reference count:2, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0x0, size:0x200000, reference count:5, mapping count:0,
Aug 19 06:08:21 hostname kernel: Mapped heap -- Offset:0xef000, size:0x11000, reference count:1, mapping count:0,
Aug 19 06:08:21 hostname kernel: GRBM : 0xb0633828, SRBM : 0x200006c0 .
Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x6970 , CP_RB_WPTR :0x6990.
Aug 19 06:08:21 hostname kernel: CP_IB1_BUFSZ:0x0, CP_IB1_BASE_HI:0x0, CP_IB1_BASE_LO:0x262e7000.
Aug 19 06:08:21 hostname kernel: last submit IB buffer -- MC :0x262e7000,phys:0x693c3000.
Aug 19 06:08:21 hostname kernel: Dump the trace queue.
Aug 19 06:08:21 hostname kernel: End of dump
-----------------------------------------------------------------------------------------------------------------------------------
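One detail in the dump that may be relevant: device 0 reports equal CP_RB_RPTR and CP_RB_WPTR values (0x1920), while device 1 shows the read pointer (0x6970) behind the write pointer (0x6990). If I understand these registers correctly, that could mean device 1 still had commands queued in its ring buffer when the hang was detected, but that is an assumption on my part and not something the driver states. Below is a minimal Python sketch for pulling these values out of /var/log/messages, assuming the line format shown above. The function names are my own and purely illustrative.

```python
import re

# Patterns follow the fglrx dump lines exactly as they appear in the log above.
DEVICE_RE = re.compile(r"kernel: device (\d+) :")
RING_RE = re.compile(
    r"CP_RB_RPTR\s*:\s*(0x[0-9a-fA-F]+)\s*,\s*CP_RB_WPTR\s*:\s*(0x[0-9a-fA-F]+)"
)


def ring_pointers(lines):
    """Return {device_index: (rptr, wptr)} for each device found in the dump."""
    result = {}
    current = None
    for line in lines:
        m = DEVICE_RE.search(line)
        if m:
            current = int(m.group(1))
            continue
        m = RING_RE.search(line)
        if m and current is not None:
            result[current] = (int(m.group(1), 16), int(m.group(2), 16))
    return result


def pending_devices(lines):
    """List devices whose ring read pointer differs from the write pointer."""
    pointers = ring_pointers(lines)
    return [dev for dev, (rptr, wptr) in sorted(pointers.items()) if rptr != wptr]


# Four lines copied from the log above; a real run would read /var/log/messages.
sample = [
    "Aug 19 06:08:21 hostname kernel: device 0 : 0xffff88007a1b0000 .",
    "Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x1920 , CP_RB_WPTR :0x1920.",
    "Aug 19 06:08:21 hostname kernel: device 1 : 0xffff88003759c000 .",
    "Aug 19 06:08:21 hostname kernel: CP_RB_BASE : 0x260c00, CP_RB_RPTR : 0x6970 , CP_RB_WPTR :0x6990.",
]
print(pending_devices(sample))
```

On the four sample lines this prints [1], which matches the card I see dying, though I cannot say whether the pending commands are a cause or just a symptom of the hang.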
Can you see anything that might be causing the problem?
Thanks in advance.