[ANN] cudaMiner & ccMiner CUDA based mining applications [Windows/Linux/MacOSX] - page 1113.

dbabo

newbie

Activity: 41

Merit: 0

Quote from: Misiolap on April 22, 2013, 03:26:04 PM

My mistake, it shouldn't be there - at the moment -O1 for ld only turns on some optimizations for shared libraries, not the program binary.

-O3 takes a whopping 5 kH/s out of my super fast GT460

dbabo

newbie

Activity: 41

Merit: 0

Quote from: cbuchner1 on April 22, 2013, 03:29:08 PM

That doesn't qualify as almost! ;)

Ha-ha, close enough. I think I observed the same errors before the patch, so it's (hopefully) something simple.

cbuchner1

hero member

Activity: 756

Merit: 502

That doesn't qualify as almost! ;)

dbabo

newbie

Activity: 41

Merit: 0

Quote from: cbuchner1 on April 22, 2013, 02:55:22 PM

Posted an April 22nd release.

Please let me know how it compiles on Linux 64 bit, and how it performs on Titan now.

The patch posted earlier wasn't really doing things right. CUDA textures should have stayed ulong2 and ulong4 type, but the uint32_t type needed to be moved over to unsigned long (from unsigned int previously) because otherwise there would be a mismatch with the texture types.

Christian,
configure works fine if I run:
./configure -with-cuda=/usr/local/cuda

instead of ./configure.sh

And it almost compiles - http://pastebin.com/raw.php?i=JZb62Jtd

Misiolap

newbie

Activity: 14

Merit: 0

My mistake, it shouldn't be there - at the moment -O1 for ld only turns on some optimizations for shared libraries, not the program binary.

cbuchner1

hero member

Activity: 756

Merit: 502

hmm, the patch posted earlier suggests the following configure line for 64 bits

./configure "CFLAGS=-O3" "CXXFLAGS=-O3" "LDFLAGS=-Wl,-O1" --with-cuda=/usr/local/cuda

not sure what the -Wl,-O1 linker flag is supposed to do.

cbuchner1

hero member

Activity: 756

Merit: 502

Posted an April 22nd release.

Please let me know how it compiles on Linux 64 bit, and how it performs on Titan now.

The patch posted earlier wasn't really doing things right. CUDA textures should have stayed ulong2 and ulong4 type, but the uint32_t type needed to be moved over to unsigned long (from unsigned int previously) because otherwise there would be a mismatch with the texture types.

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: Aggrophobia on April 22, 2013, 09:22:47 AM

autoadjust does not find the best values for my Titan, had to find the best values :(

it works with 70x4 280khash/s

it's autotune (TM) (R).

how's 35x8 ?

Christian
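A rough way to compare the two launch configurations mentioned above, assuming the "BxW" string means B thread blocks of W warps each (that reading is an assumption here, as are the names LaunchConfig, total_warps and total_threads, which are made up for this sketch). Under that reading, 70x4 and 35x8 put the same number of warps on the card and differ only in how they are grouped into blocks:

```cpp
// Host-only sketch, plain C++11, no CUDA toolkit needed.
// Assumption: "70x4" = 70 blocks x 4 warps per block.

struct LaunchConfig {
    unsigned blocks;           // number of thread blocks
    unsigned warps_per_block;  // warps in each block
};

// NVIDIA warps are 32 threads wide.
constexpr unsigned kWarpSize = 32;

constexpr unsigned total_warps(LaunchConfig c) {
    return c.blocks * c.warps_per_block;
}

constexpr unsigned total_threads(LaunchConfig c) {
    return total_warps(c) * kWarpSize;
}

// Both configurations launch 280 warps, i.e. 8960 threads in total.
static_assert(total_warps(LaunchConfig{70, 4}) == 280, "70x4 -> 280 warps");
static_assert(total_warps(LaunchConfig{35, 8}) == 280, "35x8 -> 280 warps");
static_assert(total_threads(LaunchConfig{70, 4}) == 8960, "280 warps * 32");
```

So any speed difference between the two would come from block shape (shared memory and registers per block, occupancy per multiprocessor), not from the total amount of work in flight.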

Aggrophobia

legendary

Activity: 1106

Merit: 1001

autoadjust does not find the best values for my Titan, had to find the best values :(

Edit: now I checked the -D option
it works with 70x4 280khash/s

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: Misiolap on April 22, 2013, 07:42:05 AM

__shared__ uint32_t X[WARPS_PER_BLOCK][WU_PER_WARP][16+4];

Thanks! This helped. I did not know about newly added alignment restrictions in shared memory targeting SM 2.0 and higher. I guess that's because they're now having a unified pointer and addressing scheme. So if there's an alignment requirement, it applies to everything.

Finally the Titan kernel will get my large memory transaction fixes, which should boost performance notably.

Christian

Misiolap

newbie

Activity: 14

Merit: 0

Quote from: cbuchner1 on April 22, 2013, 03:49:34 AM

I've just run into the same compiler issue that borked the Titan kernels when I tried to compile salsa_kernel.cu for sm_30. The kernel will just crash.

Maybe using the NSight debugger I can figure out why this occurs.

Does the crash produce: CUDA_EXCEPTION_6, Warp Misaligned Address?

I've been able to compile and run salsa_kernel for sm_21, without tex-cache, when accesses to the X variable are 128-bit aligned, i.e. when it's declared like this:

Code:

__shared__ uint32_t X[WARPS_PER_BLOCK][WU_PER_WARP][16+4];
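To spell out why the [16+4] inner dimension gives 128-bit alignment, here is a small host-side sketch of the arithmetic. It assumes the array base itself sits on a 16-byte boundary; the helper names row_bytes and rows_128bit_aligned are made up for illustration, and the 16+1 case is a hypothetical contrast, not something taken from the thread:

```cpp
// Host-only sketch, plain C++11, no CUDA toolkit needed.
#include <cstddef>
#include <cstdint>

// Bytes occupied by one innermost row holding `words` 32-bit values.
constexpr std::size_t row_bytes(std::size_t words) {
    return words * sizeof(std::uint32_t);
}

// True when the row stride is a multiple of 16 bytes (128 bits), so every
// row starts on a 16-byte boundary if the array base does.
constexpr bool rows_128bit_aligned(std::size_t words) {
    return row_bytes(words) % 16 == 0;
}

// 16 data words + 4 padding words = 80 bytes per row, a multiple of 16.
static_assert(row_bytes(16 + 4) == 80, "20 words * 4 bytes");
static_assert(rows_128bit_aligned(16 + 4), "16+4 keeps every row aligned");

// Hypothetical contrast: 16 + 1 words = 68 bytes per row, so most rows
// would start off a 16-byte boundary and a 128-bit access could fault.
static_assert(!rows_128bit_aligned(16 + 1), "16+1 would break alignment");
```

The same check applies to any other padding: the row length in words has to be a multiple of 4 for 128-bit loads and stores to stay aligned.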

K1773R

legendary

Activity: 1792

Merit: 1008

/dev/null

Quote from: SubNoize on April 22, 2013, 07:13:30 AM

Quote from: K1773R on April 22, 2013, 06:47:10 AM

how much are you guys getting with a 580?

240KH/s give or take 10KH/s

Sweet, I got ~257

(slightly OC)
As soon as I've mined some coins I'll send a donation for sure ;)

SubNoize

newbie

Activity: 47

Merit: 0

Quote from: K1773R on April 22, 2013, 06:47:10 AM

how much are you guys getting with a 580?

240KH/s give or take 10KH/s

K1773R

legendary

Activity: 1792

Merit: 1008

/dev/null

how much are you guys getting with a 580?

cbuchner1

hero member

Activity: 756

Merit: 502

Quote from: InqBit on April 21, 2013, 06:05:39 PM

I assume you've seen this Kepler thread?

https://bitcointalk.org/index.php?topic=163750.0;topicseen

Seen this.

The challenges with the scrypt hashing are a bit greater than just using the funnel shifter for rotation. One issue is the speed and efficiency of memory access, the other issue is getting enough occupancy on Kepler's SMX (multiprocessor) units - shared memory and register limits are an issue. This mainly affects the GTX 660Ti, GTX 670, 680 and Titan devices which currently perform rather poorly in comparison to the 5xx series.
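For readers wondering what "using the funnel shifter for rotation" refers to: the Salsa20/8 core inside scrypt is built from 32-bit rotate-left operations, and a funnel shift with both inputs set to the same word is exactly such a rotate. The sketch below is a host-side model only. funnelshift_l_model is a made-up name that imitates the documented behaviour of CUDA's __funnelshift_l intrinsic (concatenate hi:lo, shift left by shift & 31, keep the upper 32 bits); it is not the miner's actual code:

```cpp
// Host-only sketch, plain C++11, no CUDA toolkit needed.
#include <cstdint>

// Portable 32-bit rotate-left; the n == 0 case avoids a shift by 32.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    return (n & 31u) == 0
        ? x
        : (x << (n & 31u)) | (x >> (32u - (n & 31u)));
}

// Model of a left funnel shift: treat hi:lo as one 64-bit value, shift it
// left by (shift & 31), and return the most significant 32 bits.
constexpr std::uint32_t funnelshift_l_model(std::uint32_t lo,
                                            std::uint32_t hi,
                                            unsigned shift) {
    return static_cast<std::uint32_t>(
        (((static_cast<std::uint64_t>(hi) << 32) | lo) << (shift & 31u)) >> 32);
}

// Feeding the same word as lo and hi turns the funnel shift into a rotate.
static_assert(rotl32(0x80000001u, 1) == 0x00000003u, "top bit wraps around");
static_assert(funnelshift_l_model(0x80000001u, 0x80000001u, 1) ==
                  rotl32(0x80000001u, 1),
              "funnel shift with lo == hi equals rotate-left");
static_assert(funnelshift_l_model(0x12345678u, 0x12345678u, 7) ==
                  rotl32(0x12345678u, 7),
              "holds for a Salsa20-style rotation amount as well");
```

On hardware with a funnel shifter this replaces the shift, shift, or sequence with a single instruction, which helps the arithmetic but does nothing for the memory access and occupancy limits described above.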

cbuchner1

hero member

Activity: 756

Merit: 502

I've seen reports of a single overclocked Titan doing 290 kHash/s, using a somewhat earlier code version.

cbuchner1

hero member

Activity: 756

Merit: 502

I've just run into the same compiler issue that borked the Titan kernels when I tried to compile salsa_kernel.cu for sm_30. The kernel will just crash.

Maybe using the NSight debugger I can figure out why this occurs.

peacefulmind

full member

Activity: 196

Merit: 100

Christian,

Success,

copied from settings above but seems to be only 260kH/s per TITAN.

termhn

full member

Activity: 126

Merit: 100

Quote from: jasonharty24 on April 21, 2013, 11:12:35 PM

this is my 670gtx (GIGABYTE GV-N670OC-2GD) doing over 200khash/s

Code:

cudaminer.exe --url http://notroll.in:6332/ --userpass jasonharty24.4:12345 -i 0 -m 1 -C 2 -l 70x4

JESUS CHRIST that is a great OC!
