Hi Claymore and all,
I'm facing an intermittent issue.
The rig can run anywhere from an hour to over 14 hours without problems, but then it fails as shown in the log below:
20:37:39:527 1044 NVML: cannot get current temperature, error 15
20:37:39:589 1044 NVML: cannot get fan speed, error 15
20:37:39:636 1044 srv bs: 0
20:37:39:636 c08 NVML: cannot get current temperature, error 15
20:37:39:636 1044 sent: 235
20:37:39:636 c08 NVML: cannot get fan speed, error 15
20:37:39:652 1524 NVML: cannot get current temperature, error 15
20:37:39:652 1524 NVML: cannot get fan speed, error 15
20:37:39:667 1524 srv bs: 0
20:37:39:667 1524 sent: 240
20:37:41:261 18dc ETH: checking pool connection...
20:37:41:261 18dc send: {"worker": "", "jsonrpc": "2.0", "params": [], "id": 3, "method": "eth_getWork"}
20:37:41:277 18dc got 248 bytes
20:37:41:277 18dc buf: {"id":3,"jsonrpc":"2.0","result":["0xf40ae5f4d129887a178eee2d9ad801f3282e0960f1886af2968b86e236180d19","0xe7cc04b23a9cc81f260579130b45591574d12ac3be13f0a279ad68fc753f9699","0x0112e0be826d694b2e62d01511f12a6061fbaec8bc02357593e70e52ba","0x3e7bca"]}
20:37:41:277 18dc parse packet: 247
20:37:41:277 18dc ETH: job is the same
20:37:41:277 18dc new buf size: 0
20:37:41:902 10d0 recv: 69
20:37:41:902 10d0 srv pck: 68
20:37:41:902 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:902 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:41:902 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:933 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:41:933 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:933 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:41:933 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:964 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:41:964 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:996 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:41:996 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:41:996 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:011 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:027 10d0 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:027 10d0 srv bs: 0
20:37:42:027 10d0 sent: 225
20:37:42:152 13b4 GPU 0, GpuMiner cu_k1 failed 30, unknown error
20:37:42:433 13b4 GPU 0, GpuMiner kx failed 1
20:37:42:433 10e4 GPU 1, GpuMiner cu_k1 failed 30, unknown error
20:37:42:454 10e4 GPU 1, GpuMiner kx failed 1
20:37:42:460 1b98 GPU 2, GpuMiner cu_k1 failed 30, unknown error
20:37:42:464 1b98 GPU 2, GpuMiner kx failed 1
20:37:42:469 b18 GPU 5, GpuMiner cu_k1 failed 30, unknown error
20:37:42:470 b18 GPU 5, GpuMiner kx failed 1
20:37:42:433 13b4 Set global fail flag, failed GPU0
20:37:42:479 838 GPU 6, GpuMiner cu_k1 failed 30, unknown error
20:37:42:480 838 GPU 6, GpuMiner kx failed 1
20:37:42:482 11c4 GPU 4, GpuMiner cu_k1 failed 30, unknown error
20:37:42:489 11c4 GPU 4, GpuMiner kx failed 1
20:37:42:492 f60 GPU 3, GpuMiner cu_k1 failed 30, unknown error
20:37:42:495 f60 GPU 3, GpuMiner kx failed 1
20:37:42:498 f60 Set global fail flag, failed GPU3
20:37:42:500 13b4 GPU 0 failed
20:37:42:460 10e4 Set global fail flag, failed GPU1
20:37:42:482 838 Set global fail flag, failed GPU6
20:37:42:505 838 GPU 6 failed
20:37:42:492 11c4 Set global fail flag, failed GPU4
20:37:42:508 11c4 GPU 4 failed
20:37:42:511 1190 GPU 0, GpuMiner cu_k1 failed 30, unknown error
20:37:42:514 f60 GPU 3 failed
20:37:42:516 f28 GPU 3, GpuMiner cu_k1 failed 30, unknown error
20:37:42:519 f28 GPU 3, GpuMiner kx failed 1
20:37:42:522 87c GPU 1, GpuMiner cu_k1 failed 30, unknown error
20:37:42:525 87c GPU 1, GpuMiner kx failed 1
20:37:42:526 c28 GPU 6, GpuMiner cu_k1 failed 30, unknown error
20:37:42:527 c28 GPU 6, GpuMiner kx failed 1
20:37:42:528 1924 GPU 4, GpuMiner cu_k1 failed 30, unknown error
20:37:42:529 1924 GPU 4, GpuMiner kx failed 1
20:37:42:530 10e4 GPU 1 failed
20:37:42:522 f28 Set global fail flag, failed GPU3
20:37:42:469 1b98 Set global fail flag, failed GPU2
20:37:42:526 87c Set global fail flag, failed GPU1
20:37:42:478 b18 Set global fail flag, failed GPU5
20:37:42:528 c28 Set global fail flag, failed GPU6
20:37:42:536 1190 GPU 0, GpuMiner kx failed 1
20:37:42:530 1924 Set global fail flag, failed GPU4
20:37:42:538 f28 GPU 3 failed
20:37:42:540 1b98 GPU 2 failed
20:37:42:541 a5c GPU 2, GpuMiner cu_k1 failed 30, unknown error
20:37:42:542 a5c GPU 2, GpuMiner kx failed 1
20:37:42:545 b18 GPU 5 failed
20:37:42:548 14a4 GPU 5, GpuMiner cu_k1 failed 30, unknown error
20:37:42:573 14a4 GPU 5, GpuMiner kx failed 1
20:37:42:598 14a4 Set global fail flag, failed GPU5
20:37:42:601 1924 GPU 4 failed
20:37:42:602 87c GPU 1 failed
20:37:42:545 a5c Set global fail flag, failed GPU2
20:37:42:604 c28 GPU 6 failed
20:37:42:537 1190 Set global fail flag, failed GPU0
20:37:42:607 14a4 GPU 5 failed
20:37:42:608 a5c GPU 2 failed
20:37:42:610 1190 GPU 0 failed
20:37:42:889 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:891 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:892 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:893 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:894 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:895 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:896 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:897 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:898 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:901 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:902 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:903 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:42:904 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:42:906 c08 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:44:598 cf4 recv: 69
20:37:44:599 cf4 srv pck: 68
20:37:46:822 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:46:167 18dc send: {"id":6,"jsonrpc":"2.0","method":"eth_submitHashrate","params":["0x0", "0x0000000000000000000000000000000000000000000000000000000058213005"]}
20:37:46:846 18dc got 39 bytes
20:37:46:850 18dc buf: {"id":6,"jsonrpc":"2.0","result":true}
20:37:46:859 18dc parse packet: 38
20:37:46:864 18dc new buf size: 0
20:37:46:903 cf4 NVML: cannot get fan speed, error 999 (an internal driver error occurred)
20:37:46:908 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:023 1984 recv: 69
20:37:47:034 1984 srv pck: 68
20:37:47:111 cf4 NVML: cannot get fan speed, error 17
20:37:47:126 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:142 cf4 NVML: cannot get fan speed, error 17
20:37:47:142 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:173 cf4 NVML: cannot get fan speed, error 17
20:37:47:189 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:189 cf4 NVML: cannot get fan speed, error 17
20:37:47:189 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:189 cf4 NVML: cannot get fan speed, error 17
20:37:47:189 cf4 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:189 cf4 NVML: cannot get fan speed, error 17
20:37:47:189 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:189 cf4 srv bs: 0
20:37:47:189 c08 NVML: cannot get fan speed, error 17
20:37:47:189 cf4 sent: 196
20:37:47:189 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 c08 NVML: cannot get fan speed, error 17
20:37:47:204 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:204 1984 NVML: cannot get fan speed, error 17
20:37:47:220 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:220 1984 NVML: cannot get fan speed, error 17
20:37:47:220 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:220 1984 NVML: cannot get fan speed, error 17
20:37:47:220 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:220 1984 NVML: cannot get fan speed, error 17
20:37:47:231 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:233 1984 NVML: cannot get fan speed, error 17
20:37:47:235 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:236 1984 NVML: cannot get fan speed, error 17
20:37:47:237 1984 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:238 1984 NVML: cannot get fan speed, error 17
20:37:47:241 1984 srv bs: 0
20:37:47:246 1984 sent: 196
20:37:49:592 1890 recv: 69
20:37:49:592 1890 srv pck: 68
20:37:49:592 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:592 1890 NVML: cannot get fan speed, error 17
20:37:49:592 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:592 1890 NVML: cannot get fan speed, error 17
20:37:49:592 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:592 1890 NVML: cannot get fan speed, error 17
20:37:49:592 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:592 1890 NVML: cannot get fan speed, error 17
20:37:49:592 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:608 1890 NVML: cannot get fan speed, error 17
20:37:49:608 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:608 1890 NVML: cannot get fan speed, error 17
20:37:49:608 1890 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:49:608 1890 NVML: cannot get fan speed, error 17
20:37:49:608 1890 srv bs: 0
20:37:49:608 1890 sent: 196
20:37:50:478 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:479 c08 NVML: cannot get fan speed, error 17
20:37:50:479 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:480 c08 NVML: cannot get fan speed, error 17
20:37:50:481 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:481 c08 NVML: cannot get fan speed, error 17
20:37:50:482 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:483 c08 NVML: cannot get fan speed, error 17
20:37:50:483 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:484 c08 NVML: cannot get fan speed, error 17
20:37:50:485 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:485 c08 NVML: cannot get fan speed, error 17
20:37:50:486 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:50:487 c08 NVML: cannot get fan speed, error 17
20:37:51:285 18dc ETH: checking pool connection...
20:37:51:285 18dc send: {"worker": "", "jsonrpc": "2.0", "params": [], "id": 3, "method": "eth_getWork"}
20:37:51:300 18dc got 248 bytes
20:37:51:301 18dc buf: {"id":3,"jsonrpc":"2.0","result":["0xf40ae5f4d129887a178eee2d9ad801f3282e0960f1886af2968b86e236180d19","0xe7cc04b23a9cc81f260579130b45591574d12ac3be13f0a279ad68fc753f9699","0x0112e0be826d694b2e62d01511f12a6061fbaec8bc02357593e70e52ba","0x3e7bca"]}
20:37:51:302 18dc parse packet: 247
20:37:51:303 18dc ETH: job is the same
20:37:51:303 18dc new buf size: 0
20:37:52:025 95c recv: 69
20:37:52:026 95c srv pck: 68
20:37:52:027 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:028 95c NVML: cannot get fan speed, error 17
20:37:52:028 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:029 95c NVML: cannot get fan speed, error 17
20:37:52:030 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:030 95c NVML: cannot get fan speed, error 17
20:37:52:031 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:032 95c NVML: cannot get fan speed, error 17
20:37:52:032 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:033 95c NVML: cannot get fan speed, error 17
20:37:52:034 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:034 95c NVML: cannot get fan speed, error 17
20:37:52:035 95c NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:52:036 95c NVML: cannot get fan speed, error 17
20:37:52:036 95c srv bs: 0
20:37:52:037 95c sent: 196
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:53:748 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:53:748 c08 NVML: cannot get fan speed, error 17
20:37:54:596 1970 recv: 69
20:37:54:596 1970 srv pck: 68
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:54:596 1970 NVML: cannot get fan speed, error 17
20:37:54:596 1970 srv bs: 0
20:37:54:596 1970 sent: 196
20:37:56:885 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:886 c08 NVML: cannot get fan speed, error 17
20:37:56:887 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:887 c08 NVML: cannot get fan speed, error 17
20:37:56:888 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:888 c08 NVML: cannot get fan speed, error 17
20:37:56:889 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:890 c08 NVML: cannot get fan speed, error 17
20:37:56:890 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:891 c08 NVML: cannot get fan speed, error 17
20:37:56:892 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:892 c08 NVML: cannot get fan speed, error 17
20:37:56:893 c08 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:56:894 c08 NVML: cannot get fan speed, error 17
20:37:57:026 6a8 recv: 69
20:37:57:027 6a8 srv pck: 68
20:37:57:028 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:029 6a8 NVML: cannot get fan speed, error 17
20:37:57:029 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:030 6a8 NVML: cannot get fan speed, error 17
20:37:57:031 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:031 6a8 NVML: cannot get fan speed, error 17
20:37:57:032 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:033 6a8 NVML: cannot get fan speed, error 17
20:37:57:033 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:034 6a8 NVML: cannot get fan speed, error 17
20:37:57:035 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:035 6a8 NVML: cannot get fan speed, error 17
20:37:57:036 6a8 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:57:036 6a8 NVML: cannot get fan speed, error 17
20:37:57:037 6a8 srv bs: 0
20:37:57:038 6a8 sent: 196
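Since the log above is long, here is a small sketch for tallying the NVML failures by reading and error code, which makes the progression from error 15 to 999 to 17 easier to see. The function name `summarize_nvml_errors` and the `sample` text are only illustrative; the regular expression assumes the line format shown in this log and nothing else about the miner.

```python
import re
from collections import Counter

# Matches lines such as:
#   20:37:39:527 1044 NVML: cannot get current temperature, error 15
NVML_LINE = re.compile(
    r"NVML: cannot get (current temperature|fan speed), error (\d+)"
)

def summarize_nvml_errors(log_text):
    """Count NVML failures per (reading, error code) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        match = NVML_LINE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts

# A few lines copied from the log above, used as sample input.
sample = """\
20:37:39:527 1044 NVML: cannot get current temperature, error 15
20:37:39:589 1044 NVML: cannot get fan speed, error 15
20:37:41:902 10d0 NVML: cannot get current temperature, error 999 (an internal driver error occurred)
20:37:47:111 cf4 NVML: cannot get fan speed, error 17
"""

for (reading, code), n in sorted(summarize_nvml_errors(sample).items()):
    print(f"{reading}: error {code} x{n}")
```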
I've enabled -wd 1, which triggers reboot.bat to restart my computer.
However, after the restart the miner can't detect my GPUs and shows the error below.
This error doesn't trigger reboot.bat, so it keeps repeating until I log in to the computer and restart it manually.
cudaGetDeviceCount failed (30, unknown error), probably no CUDA devices
20:38:19:071 1510 No NVIDIA CUDA GPUs detected.
20:38:19:071 1510 No AMD OPENCL or NVIDIA CUDA GPUs found, exit
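One possible workaround, sketched here as an assumption rather than anything the miner provides, is an external script that checks the miner's log for the messages above and runs reboot.bat itself. The marker strings are copied from this log; the function names, the `log_path` argument, and the `cmd /c reboot.bat` invocation are hypothetical and would need adjusting to the actual setup.

```python
import subprocess

# Messages copied from the log in this post; any one of them means the
# miner started without finding a usable GPU.
FATAL_MARKERS = (
    "cudaGetDeviceCount failed",
    "No NVIDIA CUDA GPUs detected",
    "No AMD OPENCL or NVIDIA CUDA GPUs found",
)

def needs_reboot(log_text):
    """Return True if the log text contains any no-GPU message."""
    return any(marker in log_text for marker in FATAL_MARKERS)

def reboot_if_needed(log_path, reboot_cmd=("cmd", "/c", "reboot.bat")):
    """Read the log at log_path (hypothetical) and run reboot_cmd on a match."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    if needs_reboot(text):
        # check=False: a failing reboot script should not crash this helper.
        subprocess.run(list(reboot_cmd), check=False)
        return True
    return False
```

This assumes the script only ever sees the log from the current miner start. If it reads an older log that already contains these messages, it would reboot on every pass, so some limit on repeated reboots would be sensible.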
Does anyone know a solution for this?