Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 269. (Read 3426936 times)

full member
Activity: 137
Merit: 100
My brain hurts trying to integrate CudaMiner into nvMiner :)

Maybe one of you guys who knows CUDA much better than I do can explain this to me:
1) CudaMiner is set up to use compute_10,sm_10 (CUDA code generation) out of the zip file.
This compiles fine.
2) I can change code generation to compute_30,sm_30;compute_35,sm_35.
This too compiles fine in CudaMiner.

3) I use said code in nvMiner with compute_30,sm_30;compute_35,sm_35 and get this:

Error   15   error : Instruction 'shf.l' requires .target sm_35 or higher   C:\CCMiner\nvminer\ptxas Debug\nv_kernel2.compute_30.ptx, line 4555;   nvminer

Anyone know why?  I haven't dug into it yet but was hoping somebody with experience in CUDA could give me a hint where to look or what to do!

If I change code generation in nvMiner to compute_35,sm_35 I can compile/link (great first start) into an EXE that will immediately crash due to the following. :)

Seems that between the different program versions we have three different versions of device_config:
int device_config[8][2];
char *device_config[8];
int *device_config[8];

These are not compatible of course.  Taking a break for a few before resolving this and moving on to the next issue that will pop up.

Just wanted to post some progress and see if anyone could maybe tell me why I'm seeing the problem described with shf.l and maybe what to do about it?

Sounds a bit like you dropped an #if __CUDA_ARCH__ >= 350 or #if __CUDA_ARCH__ < 350 somewhere along the way. shf.l is the funnel shift instruction and is available only from compute capability 3.5 up. Apparently you're ending up compiling it unconditionally regardless of arch, so the compute_30 pass hits it and the compiler gives you the finger.
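To make the suggestion above concrete, here is a minimal sketch of such a guard. The helper name rotl32 and the HOST_DEVICE macro are made up for illustration and are not taken from cudaMiner or ccminer. The idea is that the funnel-shift intrinsic is only visible to device passes for sm_35 and newer, while every other pass (including compute_30 and the host compiler) gets plain shifts.

```cpp
#include <cstdint>

// Lets the same source build with nvcc and with a plain host compiler.
// HOST_DEVICE is a hypothetical macro name used only in this sketch.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: rotate a 32-bit word left by n bits.
HOST_DEVICE inline uint32_t rotl32(uint32_t x, uint32_t n)
{
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
    // Device pass for sm_35 or newer: the funnel-shift intrinsic,
    // which is what is expected to end up as shf.l in the PTX.
    return __funnelshift_l(x, x, n);
#else
    // Host pass and device passes below sm_35: plain shifts only,
    // so no funnel-shift instruction can appear in compute_30 PTX.
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
#endif
}
```

If a kernel calls something like this instead of using the intrinsic directly, building for compute_30,sm_30;compute_35,sm_35 should no longer trip ptxas on the compute_30 pass. Whether that is what went missing in the nvMiner merge is only a guess.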
hero member
Activity: 868
Merit: 1000
To be honest, I am not sure this is entirely a good idea.
Besides the fact that we will end up with something big, Cudaminer and ccminer are two different pieces of software.
I don't really see the point in putting them together. Blake512 can be moved to ccminer (I was thinking of doing it, but it isn't on my top priority list); for the rest this is less obvious.
It would be like merging MS Word and PowerPoint (sure, it could work, but it isn't that useful).

Yes, I agree, and that is also what makes it VERY DIFFICULT: the two programs are similar but also different. But if all the algos can be moved into one EXE (still not sure it can be done), why not try? The more algos we can support in one EXE (program), the easier it will be for people to use nVidia hardware, and it should make support a lot easier. Would you not agree? This is one of the biggest issues with AMD hardware IMHO at the moment (you need multiple different programs for all the algos).

This would be much easier if it were done by Christian who is ultimately familiar with both programs!!!  If he were to do it then no one would question it but only welcome it (myself included!).

I know we right now support both Windows and Linux. If we were to diverge from this (not sure it's a good idea) I could build DLL files for each algo, which would make compiling and integrating additional algos so much easier from a Windows standpoint. But I resist for now. :)

But I respect your opinion, djm34. Could you elaborate on why you think this might not be a good idea (from an end-user standpoint, not a dev standpoint)?
IMHO I would think 99% of the users would like one program regardless of what hell we as devs might need to go through to make this one program!

BTW, part of what I'm trying to do is make it easier for devs like yourself, tsiv and Christian. You can concentrate on just the unique algos you want to work on, and for now I'll work on integrating them into a "master" framework. This way each of you only has to work/worry about your own code. I'll try and take care of the rest. Maybe after I publish the "master" nvMiner code others can also help with integration.

Thanks,
Carlo

PS Any non devs want to comment on what you'd like?

I don't see a problem with having 2 miners. Anyway, cudaminer is sort of the "old miner" and ccminer is the "new miner". What's on cudaminer will slowly phase out by year end, just like SHA got phased out from GPUs. I don't think it's a good idea to waste your precious time on old stuff. If you spend an equal amount of time on "new" development (e.g. improving base algo hashrate; improving Groestl pretty much improves all algo hashrates), that is time well spent IMO.
full member
Activity: 168
Merit: 100
Diverging from Linux isn't a good idea, especially for large clusters (or people don't know all they can do with ssh without even logging in to a machine).

So if it means having a linuxminer and a winminer rather than a cudaminer and a ccminer, this isn't a good idea. It means more work to implement a new algo and no guarantee that the two systems are at the same level of development (it also means two GitHub repos, which will be confusing...).
Agreed. Thus far I've resisted anything Windows-only. Everything so far should compile on Linux or Windows.

If we compare to AMD, ccminer is equivalent to sph-sgminer. I have tried to use the scrypt kernel of sph-sgminer and the only thing my card did was crash... so they did try to implement everything, but it didn't work (or at least it doesn't work for everybody).

It must be noted that at the moment there are 4 or 5 different versions of sgminer doing the exact same things... (and this isn't scrypt or scrypt-n again...).

Agreed on the sgminer front. That's why I'm trying to head this off on the nVidia front by creating an exe that can handle all algos.

Carlo
legendary
Activity: 1400
Merit: 1050
To be honest, I am not sure this is entirely a good idea.
Besides the fact that we will end up with something big, Cudaminer and ccminer are two different pieces of software.
I don't really see the point in putting them together. Blake512 can be moved to ccminer (I was thinking of doing it, but it isn't on my top priority list); for the rest this is less obvious.
It would be like merging MS Word and PowerPoint (sure, it could work, but it isn't that useful).

Yes, I agree, and that is also what makes it VERY DIFFICULT: the two programs are similar but also different. But if all the algos can be moved into one EXE (still not sure it can be done), why not try? The more algos we can support in one EXE (program), the easier it will be for people to use nVidia hardware, and it should make support a lot easier. Would you not agree? This is one of the biggest issues with AMD hardware IMHO at the moment (you need multiple different programs for all the algos).

I know we right now support both Windows and Linux. If we were to diverge from this (not sure it's a good idea) I could build DLL files for each algo, which would make compiling and integrating additional algos so much easier from a Windows standpoint. But I resist for now. :)

But I respect your opinion, djm34. Could you elaborate on why you think this might not be a good idea (from an end-user standpoint, not a dev standpoint)?

Thanks,
Carlo

Diverging from Linux isn't a good idea, especially for large clusters (or people don't know all they can do with ssh without even logging in to a machine).

So if it means having a linuxminer and a winminer rather than a cudaminer and a ccminer, this isn't a good idea. It means more work to implement a new algo and no guarantee that the two systems are at the same level of development (it also means two GitHub repos, which will be confusing...).

If we compare to AMD, ccminer is equivalent to sph-sgminer. I have tried to use the scrypt kernel of sph-sgminer and the only thing my card did was crash... so they did try to implement everything, but it didn't work (or at least it doesn't work for everybody).

It must be noted that at the moment there are 4 or 5 different versions of sgminer doing the exact same things... (and this isn't scrypt or scrypt-n again...).
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
I think it would be a pretty great idea to import cudaminer algos into ccminer, as cudaminer tends to have problems (random crashes, CPU usage, etc.) while ccminer is pretty stable, at least in my experience.
full member
Activity: 168
Merit: 100
To be honest, I am not sure this is entirely a good idea.
Besides the fact that we will end up with something big, Cudaminer and ccminer are two different pieces of software.
I don't really see the point in putting them together. Blake512 can be moved to ccminer (I was thinking of doing it, but it isn't on my top priority list); for the rest this is less obvious.
It would be like merging MS Word and PowerPoint (sure, it could work, but it isn't that useful).

Yes, I agree, and that is also what makes it VERY DIFFICULT: the two programs are similar but also different. But if all the algos can be moved into one EXE (still not sure it can be done), why not try? The more algos we can support in one EXE (program), the easier it will be for people to use nVidia hardware, and it should make support a lot easier. Would you not agree? This is one of the biggest issues with AMD hardware IMHO at the moment (you need multiple different programs for all the algos).

This would be much easier if it were done by Christian who is ultimately familiar with both programs!!!  If he were to do it then no one would question it but only welcome it (myself included!).

I know we right now support both Windows and Linux. If we were to diverge from this (not sure it's a good idea) I could build DLL files for each algo, which would make compiling and integrating additional algos so much easier from a Windows standpoint. But I resist for now. :)

But I respect your opinion, djm34. Could you elaborate on why you think this might not be a good idea (from an end-user standpoint, not a dev standpoint)?
IMHO I would think 99% of the users would like one program regardless of what hell we as devs might need to go through to make this one program!

BTW, part of what I'm trying to do is make it easier for devs like yourself, tsiv and Christian. You can concentrate on just the unique algos you want to work on, and for now I'll work on integrating them into a "master" framework. This way each of you only has to work/worry about your own code. I'll try and take care of the rest. Maybe after I publish the "master" nvMiner code others can also help with integration.

Thanks,
Carlo

PS Any non devs want to comment on what you'd like?
legendary
Activity: 1400
Merit: 1050
To be honest, I am not sure this is entirely a good idea.
Besides the fact that we will end up with something big, Cudaminer and ccminer are two different pieces of software.
I don't really see the point in putting them together. Blake512 can be moved to ccminer (I was thinking of doing it, but it isn't on my top priority list); for the rest this is less obvious.
ccminer is nice and clean and rather good as a development platform; adding more stuff to it will make it more difficult to read.

It would be like merging MS Word and PowerPoint (sure, it could work, but it isn't that useful).
Or putting all Linux packages in one package...


full member
Activity: 168
Merit: 100
My brain hurts trying to integrate CudaMiner into nvMiner :)

Maybe one of you guys who knows CUDA much better than I do can explain this to me:
1) CudaMiner is set up to use compute_10,sm_10 (CUDA code generation) out of the zip file.
This compiles fine.
2) I can change code generation to compute_30,sm_30;compute_35,sm_35.
This too compiles fine in CudaMiner.

3) I use said code in nvMiner with compute_30,sm_30;compute_35,sm_35 and get this:

Error   15   error : Instruction 'shf.l' requires .target sm_35 or higher   C:\CCMiner\nvminer\ptxas Debug\nv_kernel2.compute_30.ptx, line 4555;   nvminer

Anyone know why?  I haven't dug into it yet but was hoping somebody with experience in CUDA could give me a hint where to look or what to do!

If I change code generation in nvMiner to compute_35,sm_35 I can compile/link (great first start) into an EXE that will immediately crash due to the following. :)

Seems that between the different program versions we have three different versions of device_config:
int device_config[8][2];
char *device_config[8];
int *device_config[8];

These are not compatible of course.  Taking a break for a few before resolving this and moving on to the next issue that will pop up.

Just wanted to post some progress and see if anyone could maybe tell me why I'm seeing the problem described with shf.l and maybe what to do about it?
hero member
Activity: 494
Merit: 500


Core clock: Gpu1+204, Gpu2+237, Gpu3+171 (stability for my hardware)
GPU memory: all +400. With +500 I get 300+ H/s, but I don't want to run that in summertime :D
Miner: ccminer -a cryptonight -o stratum+tcp://pool.minexmr.com:5555 -u address -p pass -l 8x60


Thank you for your input, I will try this out.
full member
Activity: 145
Merit: 101
That is great, I can make my cards work on all the algos, just not n-scrypt. What is your config like for this one (as in, what is your .bat like)? I have ASUS 750s too.
Try:
cudaminer.exe --algo=scrypt:2048 -o pool address -u user -p pass -H 2 -i 0 -m 1 -l T5x20
With this config I get ~155 khash/s per card, but on an overclocked 750 Ti 2 GB. Replace the -l argument with -l auto, and when the miner finds your optimal configuration, change auto to the kernel prefix plus launch configuration it reports. Good luck!
full member
Activity: 207
Merit: 100
That is great, I can make my cards work on all the algos, just not n-scrypt. What is your config like for this one (as in, what is your .bat like)? I have ASUS 750s too.
n-scrypt

@ECHO off
cudaminer.exe -a scrypt:2048 -d 0,2 --benchmark
PAUSE

 =))

OK, it doesn't work here =/ I will look into it again some other time, but thanks.
full member
Activity: 348
Merit: 102
That is great, I can make my cards work on all the algos, just not n-scrypt. What is your config like for this one (as in, what is your .bat like)? I have ASUS 750s too.
n-scrypt

@ECHO off
cudaminer.exe -a scrypt:2048 -d 0,2 --benchmark
PAUSE

 =))
full member
Activity: 207
Merit: 100
Hey guys, question here: I have both a 750 Ti and a non-Ti, and I can't seem to find any good settings for scrypt-n mining on the non-Ti one.
Does anyone have any idea what the settings would look like, knowing it has 1 GB of RAM and only 512 CUDA cores instead of the 640 of the Ti?
4x24 works for normal scrypt, but n-scrypt just crashes all the time.
Any help appreciated :) thanks!

Intel® Core™ i7-930, ASUS SABERTOOTH X58, 16Gb DDR3, Win7x64, 335.23.

GPU #0: MSI GeForce GTX 750Ti, N750TI-2GD5/OC, 2Gb
GPU #2: ASUS GeForce GTX 750, GTX750-PHOC-1GD5, 1Gb

MaxCoin Keccak
[2014-07-01 16:16:47] GPU #0: GeForce GTX 750 Ti, 26015 khash/s
[2014-07-01 16:16:48] GPU #2: GeForce GTX 750, 11929 khash/s
[2014-07-01 16:16:48] Total: 37943 khash/s

VTC Adaptive N factor Scrypt
[2014-07-01 16:12:34] GPU #0: GeForce GTX 750 Ti, 134.23 khash/s
[2014-07-01 16:12:35] GPU #2: GeForce GTX 750, 111.58 khash/s
[2014-07-01 16:12:35] Total: 245.81 khash/s

Monero CryptoNight
[2014-07-06 03:02:52] GPU #0: GeForce GTX 750 Ti, 273.49 H/s
[2014-07-06 03:02:52] GPU #2: GeForce GTX 750, 257.16 H/s
[2014-07-06 03:02:52] accepted: 1391/1391 (100.00%), 530.65 H/s (yay!!!)

x11
[2014-07-03 20:08:18] GPU #0: GeForce GTX 750 Ti, 2701 khash/s
[2014-07-03 20:08:18] GPU #2: GeForce GTX 750, 2042 khash/s
[2014-07-03 20:08:18] Total: 4742 khash/s



That is great, I can make my cards work on all the algos, just not n-scrypt. What is your config like for this one (as in, what is your .bat like)? I have ASUS 750s too.
full member
Activity: 348
Merit: 102
Hey guys, question here: I have both a 750 Ti and a non-Ti, and I can't seem to find any good settings for scrypt-n mining on the non-Ti one.
Does anyone have any idea what the settings would look like, knowing it has 1 GB of RAM and only 512 CUDA cores instead of the 640 of the Ti?
4x24 works for normal scrypt, but n-scrypt just crashes all the time.
Any help appreciated :) thanks!

Intel® Core™ i7-930, ASUS SABERTOOTH X58, 16Gb DDR3, Win7x64, 335.23.

GPU #0: MSI GeForce GTX 750Ti, N750TI-2GD5/OC, 2Gb
GPU #2: ASUS GeForce GTX 750, GTX750-PHOC-1GD5, 1Gb

MaxCoin Keccak
[2014-07-01 16:16:47] GPU #0: GeForce GTX 750 Ti, 26015 khash/s
[2014-07-01 16:16:48] GPU #2: GeForce GTX 750, 11929 khash/s
[2014-07-01 16:16:48] Total: 37943 khash/s

VTC Adaptive N factor Scrypt
[2014-07-01 16:12:34] GPU #0: GeForce GTX 750 Ti, 134.23 khash/s
[2014-07-01 16:12:35] GPU #2: GeForce GTX 750, 111.58 khash/s
[2014-07-01 16:12:35] Total: 245.81 khash/s

Monero CryptoNight
[2014-07-06 03:02:52] GPU #0: GeForce GTX 750 Ti, 273.49 H/s
[2014-07-06 03:02:52] GPU #2: GeForce GTX 750, 257.16 H/s
[2014-07-06 03:02:52] accepted: 1391/1391 (100.00%), 530.65 H/s (yay!!!)

x11
[2014-07-03 20:08:18] GPU #0: GeForce GTX 750 Ti, 2701 khash/s
[2014-07-03 20:08:18] GPU #2: GeForce GTX 750, 2042 khash/s
[2014-07-03 20:08:18] Total: 4742 khash/s

full member
Activity: 168
Merit: 100
With nVidia up until recently you just needed CudaMiner and/or ccminer. With releases from djm and tsiv we would now have 4 different miners to keep track of. I wanted to simplify it as much as possible to get back to "roots" with only one or two miners. Hopefully with any luck I can get it all into one exe. The biggest challenge right now is the long compile times between code changes. I'm pushing 4 hours on the compile it's doing right now. :(

You might want to set up some conditional compilation defines so you can bypass those long compile bits when you are testing code which doesn't depend on them. Also, if the section which takes so long to compile can be reused, branch it off into a separate compile section and just reuse the object binary when building the main application.

Nothing worse than slowing down your whole development cycle because of a few slow sections of code.


I'm fighting the "unresolved external symbol" and "xxx already defined in yyy.obj" stuff when linking. :)

It's tough to integrate the cudaminer routines, as they use structures, enums and whatnot with the same names but different definitions, so it's a tedious process.
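One common way to stop same-named globals from colliding at link time is to wrap each imported code base in its own namespace. The sketch below is only an illustration and assumes the sources are compiled as C++/CUDA rather than plain C. The namespace names variant_a, variant_b and variant_c and the helper slot_count are hypothetical; the three declarations are the device_config variants quoted earlier in the thread, without claiming which program each one comes from.

```cpp
#include <cstddef>

// Hypothetical namespaces, one per imported code base. Each
// device_config now has a distinct linker symbol, so the
// "already defined in yyy.obj" clash between them goes away.
namespace variant_a {
    int device_config[8][2] = {};   // two integers per device
}

namespace variant_b {
    char *device_config[8] = {};    // one string per device
}

namespace variant_c {
    int *device_config[8] = {};     // one int pointer per device
}

// Hypothetical helper: number of top-level slots in any of the arrays.
template <typename T, std::size_t N>
constexpr std::size_t slot_count(T (&)[N])
{
    return N;
}
```

Code from each miner then refers to its own variant, for example variant_a::device_config[0][1], and the glue layer converts between them explicitly where it has to.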
full member
Activity: 145
Merit: 101
Core clock: Gpu1+204, Gpu2+237, Gpu3+171 (stability for my hardware)
GPU memory: all +400. With +500 I get 300+ H/s, but I don't want to run that in summertime :D
Miner: ccminer -a cryptonight -o stratum+tcp://pool.minexmr.com:5555 -u address -p pass -l 8x60

What BIOS are you using to get constant clock frequencies?
The unmodified (original) version 82.07.25.00.6B for the Gigabyte GV-N75TOC-2GI.
legendary
Activity: 2002
Merit: 1051
ICO? Not even once.
Core clock: Gpu1+204, Gpu2+237, Gpu3+171 (stability for my hardware)
GPU memory: all +400. With +500 I get 300+ H/s, but I don't want to run that in summertime :D
Miner: ccminer -a cryptonight -o stratum+tcp://pool.minexmr.com:5555 -u address -p pass -l 8x60

What BIOS are you using to get constant clock frequencies?
sr. member
Activity: 401
Merit: 250
With nVidia up until recently you just needed CudaMiner and/or ccminer. With releases from djm and tsiv we would now have 4 different miners to keep track of. I wanted to simplify it as much as possible to get back to "roots" with only one or two miners. Hopefully with any luck I can get it all into one exe. The biggest challenge right now is the long compile times between code changes. I'm pushing 4 hours on the compile it's doing right now. :(

You might want to set up some conditional compilation defines so you can bypass those long compile bits when you are testing code which doesn't depend on them. Also, if the section which takes so long to compile can be reused, branch it off into a separate compile section and just reuse the object binary when building the main application.

Nothing worse than slowing down your whole development cycle because of a few slow sections of code.
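As a rough sketch of the conditional-compilation idea: every macro and function name below (NVMINER_ENABLE_SCRYPT, NVMINER_ENABLE_X11, scrypt_kernels_stub, algo_available) is invented for illustration and does not exist in any of the miners. Each slow-to-compile algo gets a switch that defaults to on and can be turned off from the compiler command line while working on unrelated code.

```cpp
#include <cstring>

// Hypothetical per-algo switches. They default to enabled; pass e.g.
// -DNVMINER_ENABLE_SCRYPT=0 to the compiler to skip that algo.
#ifndef NVMINER_ENABLE_SCRYPT
#define NVMINER_ENABLE_SCRYPT 1
#endif
#ifndef NVMINER_ENABLE_X11
#define NVMINER_ENABLE_X11 1
#endif

#if NVMINER_ENABLE_SCRYPT
// Stand-in for the slow-to-compile scrypt kernels; it is only
// compiled when the switch is on.
inline int scrypt_kernels_stub() { return 1; }
#endif

// Reports whether the named algo was compiled into this build, so the
// option parser can reject algos that were switched off.
inline bool algo_available(const char *name)
{
#if NVMINER_ENABLE_SCRYPT
    if (std::strcmp(name, "scrypt") == 0) return true;
#endif
#if NVMINER_ENABLE_X11
    if (std::strcmp(name, "x11") == 0) return true;
#endif
    return false;
}
```

The second half of the advice, reusing the object binary, needs no code change: as long as the slow kernels live in their own source files and those files are untouched, an incremental build should relink the existing object files instead of recompiling them.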
full member
Activity: 145
Merit: 101
Hi, can you share your launch config and OC settings to get 280-290 on a 750 Ti? Thanks.



Core clock: Gpu1+204, Gpu2+237, Gpu3+171 (stability for my hardware)
GPU memory: all +400. With +500 I get 300+ H/s, but I don't want to run that in summertime :D
Miner: ccminer -a cryptonight -o stratum+tcp://pool.minexmr.com:5555 -u address -p pass -l 8x60