Update: Solved!
Simply add a pause before the service starts. For my 8 GPU rig I needed an extra 50 second delay to properly initialize all GPUs (if you have a rig with more cards or a generally slow system, simply increase the delay).
I added one line to the [Service] section of the awesome.service file:
[Service]
ExecStartPre=/bin/sleep 50
ExecStart=/awesomeminer-remoteagent/AwesomeMiner.RemoteAgent.Linux
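(For anyone else applying this: the change only takes effect after reloading systemd and restarting the service. Assuming the unit is installed as awesome.service, something like this should do it:)
systemctl daemon-reload
systemctl restart awesome.service
systemctl status awesome.service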
Thanks for all the detailed feedback, and it's good that you found a workaround. I will look into this scenario to find a better way for Remote Agent to handle it.
Hi patrike
Looks like my solution doesn't always work(( sometimes yes, sometimes no. I increased the delay time, but this caused another problem: the service gets stuck in the "pre-start" state after rebooting.
It looks like awesome.service should properly check GPU initialization before handing control to the AwesomeMiner main program; a rough sketch of what I mean is below.
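Something like this might work (the script path /usr/local/bin/wait-for-gpus.sh, the GPU count of 8 and the ~120 second timeout are just examples for my rig): replace the fixed sleep with a small wait script that polls nvidia-smi until all cards show up, and have the unit call that script instead.

#!/bin/bash
# wait-for-gpus.sh: wait up to ~120 s until nvidia-smi reports all 8 GPUs
for i in $(seq 1 60); do
    count=$(nvidia-smi --query-gpu=name --format=csv,noheader 2>/dev/null | wc -l)
    [ "$count" -ge 8 ] && exit 0
    sleep 2
done
exit 1

[Service]
ExecStartPre=/usr/local/bin/wait-for-gpus.sh
ExecStart=/awesomeminer-remoteagent/AwesomeMiner.RemoteAgent.Linux

(Note that ExecStartPre time counts toward systemd's start timeout, TimeoutStartSec, which defaults to 90 seconds, so a long wait also needs TimeoutStartSec= raised in the unit; maybe that is related to why my longer delay left the service stuck in the pre-start state.)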
I hope you solve this quickly. Until then, only manually restarting the service helps.
Regards.
Thanks for letting me know. I will investigate how to resolve this in a good way.
It looks like AwesomeMiner incorrectly identifies the GPUs' PCI_BUS_ID after rebooting Linux. Maybe AM uses old stored data.
If I try, for example, to manually change the fan speed on selected card 7, Awesome Miner changes it on some other card at random. If I select all GPUs and try to set the fan speed or power, the changes are only applied to the first 4 of 8 cards. It is simply dangerous how awesome.service manages GPU overclocking.
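To check whether the index-to-bus mapping really changes between reboots (just a guess on my side), the standard nvidia-smi query fields can be logged after every boot and compared:

nvidia-smi --query-gpu=index,pci.bus_id,uuid,name --format=csv

The uuid stays fixed per physical card, so if the index shown next to a given uuid differs from boot to boot, that would confirm the enumeration order is not stable.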
I want to use Linux because of its correct GPU VRAM management, but it looks like I may have to try Win7 instead of Win10. Windows 10 simply steals 20% of GPU memory for nothing (because of the new driver platform model):
https://social.technet.microsoft.com/Forums/Lync/en-US/15b9654e-5da7-45b7-93de-e8b63faef064/windows-10-does-not-let-cuda-applications-to-use-all-vram-on-especially-secondary-graphics-cards?forum=win10itprohardware
But there is not as much mining software for Linux as for Windows at the moment.
Is this on an 8 x 2080 Ti system?
If you run into this again, it would be interesting to know what the output of nvidia-smi looks like for these scenarios.
nvidia-smi --query-gpu=name,pci.bus,index,power.limit,power.min_limit,power.max_limit,power.default_limit,clocks.max.sm,clocks.max.memory --format=csv
Hi patrike
this is the output:
root@hive2080TI:/awesomeminer-remoteagent# nvidia-smi --query-gpu=name,pci.bus,index,power.limit,power.min_limit,power.max_limit,power.default_limit,clocks.max.sm,clocks.max.memory --format=csv
name, pci.bus, index, power.limit [W], power.min_limit [W], power.max_limit [W], power.default_limit [W], clocks.max.sm [MHz], clocks.max.memory [MHz]
GeForce RTX 2080 Ti, 0x01, 0, 238.00 W, 100.00 W, 310.00 W, 250.00 W, 2100 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x02, 1, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x03, 2, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x05, 3, 238.00 W, 100.00 W, 310.00 W, 250.00 W, 2100 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x06, 4, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x07, 5, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x08, 6, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
GeForce RTX 2080 Ti, 0x0A, 7, 247.00 W, 100.00 W, 330.00 W, 260.00 W, 2175 MHz, 7000 MHz
Maybe it would be better to use the pci.bus parameter (0x07, 0x08, ...) for GPU addressing instead of the simple GPU index (0, 1, 2, ...).
But then I would need to recreate all my GPU clocking profiles, where the clocking parameters are set for every GPU in the rig.
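For what it's worth, nvidia-smi itself can already target a card by PCI bus ID instead of index via its -i/--id option, so a call like this always hits the same physical card (the bus ID below is just the one shown for bus 0x0A / index 7 in the table above):

# query the card on PCI bus 0x0A, regardless of its index
nvidia-smi -i 00000000:0A:00.0 --query-gpu=index,pci.bus_id,power.limit --format=csv
# set the power limit on the same card (needs root)
nvidia-smi -i 00000000:0A:00.0 -pl 250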
Regards