Topic: [ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1112. (Read 3426996 times)

sr. member
Activity: 252
Merit: 254
...
If you overclock that gt430 a bit you can get in the 30kh/s range. 
I currently get 36kh/s on my gt430 with configuration 20x8. 


And how do I do that?

Google a tool called NvidiaInspector (I think it's from TechPowerup). It lets you adjust the fan speed, voltage, and core/mem/shader clock speeds of almost all NVIDIA cards.

I use it to set the clocks on my gt430 card to 882 MHz core, 810 MHz mem, 1760 MHz shader, at 0.990 V. That combo yields between 36 and 38 kh/s. If I go any higher than that, the driver crashes.

Your mileage may vary though on the clock speeds you can attain.  I have an EVGA GT430 so I'm not sure how it compares to other flavors.
newbie
Activity: 41
Merit: 0
...
If you overclock that gt430 a bit you can get in the 30kh/s range. 
I currently get 36kh/s on my gt430 with configuration 20x8. 


And how do I do that?
sr. member
Activity: 252
Merit: 254
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.

ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.

well deserved bottle of wine shall be corked:
Code:
[2013-04-22 19:29:03] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-22 19:29:14] GPU #0: GeForce GT 430 with compute capability 2.1
[2013-04-22 19:29:14] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2013-04-22 19:29:14] GPU #0: Performing auto-tuning (Patience...)
[2013-04-22 19:29:21] GPU #0:   24.34 khash/s with configuration  4x6
[2013-04-22 19:29:21] GPU #0: using launch configuration  4x6
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 4608 hashes, 0.26 khash/s
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 1536 hashes, 15.62 khash/s
[2013-04-22 19:29:25] GPU #0: GeForce GT 430, 78336 hashes, 22.29 khash/s
[2013-04-22 19:29:30] GPU #0: GeForce GT 430, 112128 hashes, 22.11 khash/s
[2013-04-22 19:29:35] GPU #0: GeForce GT 430, 110592 hashes, 21.79 khash/s
[2013-04-22 19:29:40] GPU #0: GeForce GT 430, 109056 hashes, 21.37 khash/s
[2013-04-22 19:29:45] GPU #0: GeForce GT 430, 107520 hashes, 22.23 khash/s

and no i686 deps :)
first coin goes to you
thank you!


If you overclock that gt430 a bit you can get in the 30kh/s range. 
I currently get 36kh/s on my gt430 with configuration 20x8. 
newbie
Activity: 41
Merit: 0
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.

ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.

well deserved bottle of wine shall be corked:
Code:
[2013-04-22 19:29:03] 1 miner threads started, using 'scrypt' algorithm.
[2013-04-22 19:29:14] GPU #0: GeForce GT 430 with compute capability 2.1
[2013-04-22 19:29:14] GPU #0: interactive: 1, tex-cache: 0 , single-alloc: 0
[2013-04-22 19:29:14] GPU #0: Performing auto-tuning (Patience...)
[2013-04-22 19:29:21] GPU #0:   24.34 khash/s with configuration  4x6
[2013-04-22 19:29:21] GPU #0: using launch configuration  4x6
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 4608 hashes, 0.26 khash/s
[2013-04-22 19:29:21] GPU #0: GeForce GT 430, 1536 hashes, 15.62 khash/s
[2013-04-22 19:29:25] GPU #0: GeForce GT 430, 78336 hashes, 22.29 khash/s
[2013-04-22 19:29:30] GPU #0: GeForce GT 430, 112128 hashes, 22.11 khash/s
[2013-04-22 19:29:35] GPU #0: GeForce GT 430, 110592 hashes, 21.79 khash/s
[2013-04-22 19:29:40] GPU #0: GeForce GT 430, 109056 hashes, 21.37 khash/s
[2013-04-22 19:29:45] GPU #0: GeForce GT 430, 107520 hashes, 22.23 khash/s

and no i686 deps :)
first coin goes to you
thank you!
newbie
Activity: 14
Merit: 0
Great, now it works out-of-box for me (salsa_kernel), thanks.
hero member
Activity: 756
Merit: 502
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.

ok then I'll put that in and make this a final upload for today. Thank you for the explanations and for developing the patch.
newbie
Activity: 14
Merit: 0
Yes, without this on 64bit the kernel dies due to Warp Misaligned Address on compute_10.
hero member
Activity: 756
Merit: 502
Additionally shared buffers for 64bit builds must be 64bit aligned.

Does this also apply when targeting compute_10, sm_10 (which is done in salsa_kernel.cu) ?

Christian
newbie
Activity: 14
Merit: 0
Additionally shared buffers for 64bit builds must be 64bit aligned.

If it's worth saving some memory on 32bit builds, something like this can be done:

Code:
#if __x86_64__
#define _64BIT_ALIGN 1
#else
#define _64BIT_ALIGN 0
#endif

And for each buffer:

Code:
__shared__ uint32_t X[WARPS_PER_BLOCK][WU_PER_WARP][32+1+_64BIT_ALIGN];
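
To check the arithmetic behind that one extra word, here is a host-side C++ sketch (not the actual kernel code). It assumes the buffer itself starts on an 8-byte boundary. The names pad_words, row_words, row_bytes and rows_8byte_aligned are made up for illustration, and the bool parameter stands in for _64BIT_ALIGN so both cases can be checked in a single build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for _64BIT_ALIGN: one extra 32-bit word on 64-bit builds.
constexpr std::size_t pad_words(bool is_64bit) { return is_64bit ? 1 : 0; }

// Innermost dimension of X[...][...][32 + 1 + pad], counted in 32-bit words.
constexpr std::size_t row_words(bool is_64bit) { return 32 + 1 + pad_words(is_64bit); }

// Distance in bytes between the starts of two consecutive rows.
constexpr std::size_t row_bytes(bool is_64bit) {
    return row_words(is_64bit) * sizeof(std::uint32_t);
}

// Assumption: the array base is 8-byte aligned. Then every row starts on an
// 8-byte boundary exactly when the row stride is a multiple of 8 bytes.
constexpr bool rows_8byte_aligned(bool is_64bit) { return row_bytes(is_64bit) % 8 == 0; }

static_assert(rows_8byte_aligned(true), "34 words = 136 bytes, a multiple of 8");
static_assert(!rows_8byte_aligned(false), "33 words = 132 bytes, not a multiple of 8");
```

With the pad, a row is 34 words (136 bytes), so every row starts on an 8-byte boundary. Without it, a row is 33 words (132 bytes) and every second row starts at a 4-byte offset. Also note that __x86_64__ is a GCC/Clang macro; MSVC defines _M_X64 instead, so the #if above may need an extra check on Windows.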
hero member
Activity: 756
Merit: 502
Okay, this would be the 2nd attempt for the day.

uint32_t becomes typedef'd as unsigned int
ulong2 becomes uint2
ulong4 becomes uint4

and Titan kernel now does uint2 based memory transactions in a shared memory buffer of [16+2] width, which should reduce warp serialization.

now where's my bottle of wine?
full member
Activity: 196
Merit: 100
I am using 4/22 version.  Same cmd as before, however total has dropped from 520 to 410kH.
This got 520kH on the 2x titan in 4/17 release.

Wah! I need to empty a bottle of wine now.


Christian,

Perhaps it is something in my setup?  This is not made for a mining rig, I use it for day to day and gaming.

990x
12GB RAM
2x Titan
5760x1200 SLI
64bit win7

I have noticed when I set interactive to 1,1 it freezes, also when I try to let it auto-tune it freezes.

I have a ton of games and applications on this machine - so it may be my system.

Perhaps the new 4/22 build needs different settings than the ones I used on the 4/17 build?  I will try some more.

My dedicated mining machines are lean and mean and using AMD RADEON on Linux/a few win7 - so it is hard to compare.

You are trailblazing new ground!
newbie
Activity: 14
Merit: 0
Any suggestion for a portable 32 bit type among 32 bit and 64 bit builds?  I thought int changed size depending on architecture, long is always 32 bits, and long long is always 64 bits.

EDIT: I've been reading up on the differences between Microsoft's LLP64 model vs. Unix/Linux LP64 model. I will have to change a few things in the code, then.

Christian

For general purpose vars use uint*_t from <stdint.h>

I'm not sure what should be used for CUDA vector types for portability.

On 64bit linux:
sizeof(ulong2): 16, sizeof(uint2): 8

On 32bit linux it's probably 8 for both.
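
The size difference above can be reproduced on the host without CUDA. This is only a sketch: pair_of is a made-up plain struct, not the CUDA definition of ulong2 or uint2, and it only mirrors their two-member layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical two-member struct that mirrors the layout of a CUDA 2-vector.
template <typename T>
struct pair_of { T x, y; };

// Follows the platform's size of long: 16 bytes on LP64 (64bit Linux),
// 8 bytes on LLP64 (64bit Windows) and on 32bit builds.
constexpr std::size_t ulong_pair_bytes = sizeof(pair_of<unsigned long>);

// Fixed-width members give the same size on every data model.
constexpr std::size_t u32_pair_bytes = sizeof(pair_of<std::uint32_t>);

static_assert(sizeof(std::uint32_t) == 4, "uint32_t is exactly 32 bits wide");
static_assert(u32_pair_bytes == 8, "two 32-bit members occupy 8 bytes everywhere");
static_assert(ulong_pair_bytes == 2 * sizeof(unsigned long), "size tracks unsigned long");
```

So anything built on unsigned long changes size between LP64 and LLP64, while the uint32_t based pair stays at 8 bytes, which matches the 16 vs. 8 measured above.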
legendary
Activity: 1792
Merit: 1008
/dev/null
Wait, 1 Titan is 7 kh/s slower than my 580? That's sad :(
hero member
Activity: 756
Merit: 502
I am using 4/22 version.  Same cmd as before, however total has dropped from 520 to 410kH.
This got 520kH on the 2x titan in 4/17 release.

Wah! I need to empty a bottle of wine now.
hero member
Activity: 756
Merit: 502
Isn't ulong the same as uint on 32 bit builds?
On 64bit linux it breaks things, because ulong is 64bit and uint is 32bit.

Any suggestion for a portable 32 bit type among 32 bit and 64 bit builds?  I thought int changed size depending on architecture, long is always 32 bits, and long long is always 64 bits.

EDIT: I've been reading up on the differences between Microsoft's LLP64 model vs. Unix/Linux LP64 model. I will have to change a few things in the code, then.

Christian
hero member
Activity: 756
Merit: 502
Do you feel you are at release candidate level yet?  I want to add this to guiminer-scrypt when it hits maturity.

hmm I am probably not going to change the console and command line options output now. But stability (error checking) has to be improved before this can even hit beta status.
newbie
Activity: 14
Merit: 0
Isn't ulong the same as uint on 32 bit builds?
On 64bit linux it breaks things, because ulong is 64bit and uint is 32bit.
legendary
Activity: 1484
Merit: 1005
Do you feel you are at release candidate level yet?  I want to add this to guiminer-scrypt when it hits maturity.
full member
Activity: 196
Merit: 100
I am using 4/22 version.  Same cmd as before, however total has dropped from 520 to 410kH.

Same clocks. Here is the .bat:

cudaminer.exe --url http://127.0.0.1:8332/ --userpass xxx.x:123 -i 0,0 -d 0,1 -m 1,1 -C 2,2 -l 84x4,84x4


This got 520kH on the 2x titan in 4/17 release.
hero member
Activity: 756
Merit: 502

I am like so close -----> <----- to throwing out the texture cache support in 64 bit builds.

