I think I found the problem: either the information the program pulls from the device is wrong, or these are maximum values intentionally hardcoded in the program. Ethar, can you please make everything dynamic? I mean the device should report all parameters.
Found 1 Cuda device.
Cuda device:GeForce RTX 3080(4095Mb) <- wrong
Device have: MP:68 Cores+0 <- wrong
Shared memory total:49152 <- I guess this is system memory, but 128GB is available
Constant memory total:65536 <- not sure how this one is calculated
I am not sure, but I think MP is a unit on AMD cards and CUDA cores are for Nvidia. The 3080 has 8k+ CUDA cores, so I do not know what the 68 here means.
So many confusions.
The program uses the CUDA driver API (not the runtime API that is usually used), and the GPU code is written in PTX.
cuda.lib, which is used to call the CUDA driver API, always returns 32-bit values, even in the x64 version.
In that case you can't use/allocate more than 2**32 bytes of GPU memory.
Also, cuDeviceTotalMem() returns the memory size as a 32-bit value; that is why you see 4095Mb.
I wrote to Nvidia about this issue a few times, but according to them there is no problem)
If you look into cuda.lib you will find unofficial commands like cuDeviceTotalMem_v2 and others.
All these commands have the suffix _v2, and they return correct 64-bit values.
But Nvidia says that they do not have commands with the _v2 suffix ))
That covers the 2**32-byte GPU memory limitation.
About "Device have: MP:68 Cores+0": the 0 is there because I didn't add Ampere to the program:
Case 2 ; Fermi
  Debug "Fermi"
  If minor = 1
    cores = mp * 48
  Else
    cores = mp * 32
  EndIf
Case 3 ; Kepler
  Debug "Kepler"
  cores = mp * 192
Case 5 ; Maxwell
  Debug "Maxwell"
  cores = mp * 128
Case 6 ; Pascal
  Debug "Pascal"
  cores = mp * 64
Case 7 ; Volta / Turing
  Debug "Turing RTX"
  cores = mp * 64
Default
  Debug "Unknown device type"
EndSelect
By the way, this is needed only for information and nothing more.
To get the correct number of cores, only this needs to be added:
Case 8 ; Ampere
  Debug "Ampere RTX"
  cores = mp * 128
Default
  Debug "Unknown device type"