
Topic: [ANN] YACMiner - AMD GPU miner for Scrypt-Chacha, N-Scrypt, and Scrypt coins - page 6. (Read 47379 times)

sr. member
Activity: 272
Merit: 250
Hello, YACMiner has a bug. When I switch pools (using keys P + S + pool), the N factor automatically changes to 16 and the backup pool can't run correctly. Please fix it.

It's not a bug; use --nfmin, --nfmax, and --starttime for each pool.
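For anyone wondering why those three settings matter per pool: in scrypt-chacha coins the N factor is derived from the time elapsed since the chain start, then clamped to a minimum and maximum. Below is a minimal C sketch of that kind of schedule. The struct, the function name, and the constants in the formula are assumptions written from memory of YACoin's schedule, not code taken from YACMiner, so treat it as an illustration only.

```c
#include <stdint.h>

/* Hypothetical per-pool settings mirroring --nfmin, --nfmax and --starttime. */
struct pool_nf {
    uint32_t nfmin;      /* lowest N factor the coin uses */
    uint32_t nfmax;      /* highest N factor the coin uses */
    uint32_t starttime;  /* chain start time, Unix seconds */
};

/* Time-based N factor, clamped to [nfmin, nfmax].
 * The 170 / 25 / 2320 constants are assumed (recalled from YACoin). */
static uint32_t nfactor_at(const struct pool_nf *p, uint32_t now)
{
    if (now <= p->starttime)
        return p->nfmin;

    uint64_t s = now - p->starttime;
    int l = 0;

    /* Reduce the elapsed time to its top three bits, counting the shifts. */
    while ((s >> 1) > 3) {
        l++;
        s >>= 1;
    }
    s &= 3;

    int n = (l * 170 + (int)s * 25 - 2320) / 100;
    if (n < 0)
        n = 0;

    uint32_t nf = (uint32_t)n;
    if (nf < p->nfmin)
        return p->nfmin;
    if (nf > p->nfmax)
        return p->nfmax;
    return nf;
}
```

Setting nfmin equal to nfmax pins the N factor for a fixed-N coin. If a backup pool is left without its own values, a schedule like this would fall back to whatever defaults the miner ships with, which may be why the N factor jumps after a pool switch.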
newbie
Activity: 26
Merit: 10
Hello, YACMiner has a bug. When I switch pools (using keys P + S + pool), the N factor automatically changes to 16 and the backup pool can't run correctly. Please fix it.
member
Activity: 81
Merit: 1002
It was only the wind.
Anyone mining a coin with an Nfactor of 14? Hashrates?
full member
Activity: 152
Merit: 100
Anyone got good configs for NF15?
It seems that 950 H/s is too low for an R9 280X (I have it undervolted to 0.95 V at ~930 core, 1375 mem).
I was expecting ~1.3 kH/s.

Best regards!
sr. member
Activity: 416
Merit: 250
I did not find a solution to my problem, but I found a workaround!
For some reason, when I go above -l t24x2 with -L 4, the driver crashes, but only on my Windows 8 rigs.
I don't know if it is a bad Windows install or something else, but I could not fix it.
I installed Windows 7 + Visual C++ 2010 + 337.88 drivers, and everything is working perfectly!
I am now using:
Code:
-L 4 -l t60x2 -H 2 -m 1 -i 0
which gives me around 3.6kH/s @ 1222/1500 MHz
sr. member
Activity: 318
Merit: 250
@King_pin

I use windows 7 on the 750ti rig and cudaminer 2014-02-28

Maybe you could try Kopiemtu.  Burn to a flash drive, ssh in, mine stop, monitor stop then change to cudaminer directory and start your command line.  I use Kopiemtu when I mine BBR but cudaminer works for UTC at similar speeds.

Just noticed we're in the YACMiner thread. Sorry for going off topic.
member
Activity: 81
Merit: 1002
It was only the wind.
Nope. First, clz counts LEADING zeroes, we're trying to count trailing zeroes. Now, imagine we have 1001001 - using clz by itself and subtracting wouldn't work because it would stop at the first 1 it encounters, and there may be more. Not downshifting will screw it up, too - what if the bit we lop off by downshifting happened to be 1? Might work practically, though, not sure.

The nice thing about working with powers of 2 is that there will only ever be a single 1 in the binary representation, so not much else really matters. In VS, the following does give the right answer for all the iterations of N.

Code:
	Nfactor = __lzcnt(N >> 1 );

Either way, I'm glad someone is interested in improving this portion of the miner.

Ah, I see. Nice.
sr. member
Activity: 416
Merit: 250
Thanks, I am trying all kinds of settings but I keep getting errors.
I think it is something connected with Microsoft Visual C++.
I am using Windows 8 and the latest drivers. What else should I install, or maybe my cudaminer build is not right?
sr. member
Activity: 318
Merit: 250
@Kingpin


Here is what I use on my 750 Ti rig when mining UTC; it gets a solid 3.2 kH/s:

Code:
cudaminer -s 10 --algo=scrypt-jane:UTC -H 2 -i 0 -m 1 -l t25x4 -L 4 -b 16384 -o stratum+tcp://stratum.tumblingblock.com:5555 -u **** -p x
member
Activity: 81
Merit: 1002
It was only the wind.

I'm surprised that 64-bit rotate works as well as it does; perhaps it's newer drivers. The chi step in Keccak is slow, though, and while this doesn't matter for performance, what in the flying fuck is this?

Code:
uint Nfactor = 0;
uint tmp = N >> 1;

/* Determine the Nfactor */
while ((tmp & 1) == 0) {
tmp >>= 1;
Nfactor++;
}

That shit just bugs me. It's far simpler to do this:

Code:
const uint Nfactor = 31 - clz((N >> 1) & -(N >> 1));

I'm guessing because looping and incrementing is easier to grok than bitwise comparison?  Heck, looking at what you've written, I had to do the manual look-and-see to even know that it comes up with the same answer, but then again, C is not my native language.  Today I learned the CLZ function... wouldn't a simpler formula:

Code:
const uint Nfactor = 31 - clz(N >> 1);

or even

Code:
const uint Nfactor = 30 - clz(N);

give the same result as flipping the number and bitwise-anding it?


Nope. First, clz counts LEADING zeroes, we're trying to count trailing zeroes. Now, imagine we have 1001001 - using clz by itself and subtracting wouldn't work because it would stop at the first 1 it encounters, and there may be more. Not downshifting will screw it up, too - what if the bit we lop off by downshifting happened to be 1? Might work practically, though, not sure.
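To make the leading-versus-trailing distinction concrete, here is a minimal C sketch comparing the three approaches from this exchange. `clz32` is a hypothetical portable stand-in for OpenCL's `clz` (it is not part of the miner), and the function names are made up for illustration.

```c
#include <stdint.h>

/* Portable count-leading-zeros for 32-bit values; returns 32 for x == 0.
 * Stands in for OpenCL's clz() so the sketch compiles as plain C. */
static uint32_t clz32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* The loop from the kernel: counts TRAILING zeros of N >> 1.
 * Needs N >= 2, otherwise tmp is 0 and the loop never ends. */
static uint32_t nfactor_loop(uint32_t N)
{
    uint32_t nf = 0;
    uint32_t tmp = N >> 1;
    while ((tmp & 1u) == 0) {
        tmp >>= 1;
        nf++;
    }
    return nf;
}

/* Isolate the lowest set bit with h & -h, then turn it into an index. */
static uint32_t nfactor_lowbit(uint32_t N)
{
    uint32_t h = N >> 1;
    return 31 - clz32(h & (0u - h));
}

/* Plain clz: index of the HIGHEST set bit of N >> 1. */
static uint32_t nfactor_clz(uint32_t N)
{
    return 31 - clz32(N >> 1);
}
```

For N = 32768 all three return 14. For N = 146 (binary 10010010, so N >> 1 is 1001001) the loop and the low-bit version return 0, while the plain clz version returns 6, because it reports the highest set bit rather than the lowest. The versions only agree when N is a power of two.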
sr. member
Activity: 416
Merit: 250
I acquired some GTX 750 Ti 2GB cards but I can't seem to set them up properly.
The best I managed was:
Code:
cudaminer.exe -a scrypt-jane:14 -o stratum+tcp -u user -p X -b 4096 -m 1 -i 0 -L 6 -l t96x1
which gives me only around 2.5 kH/s; they should be doing more than 3 kH/s at this frequency.
GPU 1350MHz, MEM 1525MHz
I have 5 cards running on 8G of RAM

When I try settings like:

-l t60x2 -H 2 -m 1 -i 0 -L 4
-l t5x24 -H 2 -m 1 -i 0 -L 4

I get "Cuda error", then "Driver Error", or the machine freezes.

Am I missing some command-line setting like SETX...?
sr. member
Activity: 318
Merit: 250
750 ti 3.25 kh/s
970gtx 9.5 kh/s

And the bit the AMD whore is really interested in.....R9280X 4.2kh/s, HD7950 3.8kh/s

Is the 280X at stock clocks?

No, 1100 & 1500mem

Thanks.

How are you getting on with NF14........worked any of your magic yet?

What have you got Freya hashing at?
member
Activity: 81
Merit: 1002
It was only the wind.
Works on 14.9 now; not sure if it did before, though.

Are you sure you're posting in the right thread?  Are you saying you've updated YACMiner, or are you just trolling?

I've done some work on the OpenCL, but nothing on the host code.

Ahhh. Other than the chacha-flexible branch of yacminer, I haven't really touched it since Mikaelh did some optimizing, and even then it was just updates so that lookup-gap wasn't compiled into the binary and could be changed in the host during runtime.

I'm surprised that 64-bit rotate works as well as it does; perhaps it's newer drivers. The chi step in Keccak is slow, though, and while this doesn't matter for performance, what in the flying fuck is this?

Code:
uint Nfactor = 0;
uint tmp = N >> 1;

/* Determine the Nfactor */
while ((tmp & 1) == 0) {
tmp >>= 1;
Nfactor++;
}

That shit just bugs me. It's far simpler to do this:

Code:
const uint Nfactor = 31 - clz((N >> 1) & -(N >> 1));
sr. member
Activity: 318
Merit: 250
750 ti 3.25 kh/s
970gtx 9.5 kh/s

And the bit the AMD whore is really interested in.....R9280X 4.2kh/s, HD7950 3.8kh/s

Is the 280X at stock clocks?

No, 1100 & 1500mem
sr. member
Activity: 318
Merit: 250
750 ti 3.25 kh/s
970gtx 9.5 kh/s

And the bit the AMD whore is really interested in.....R9280X 4.2kh/s, HD7950 3.8kh/s
member
Activity: 81
Merit: 1002
It was only the wind.
Works on 14.9 now; not sure if it did before, though.

Are you sure you're posting in the right thread?  Are you saying you've updated YACMiner, or are you just trolling?

I've done some work on the OpenCL, but nothing on the host code.
hero member
Activity: 693
Merit: 500
I've noticed you can always allocate more memory to a single GPU than you can get away with allocating to 4. On my rig, I was able to allocate 3800 to one card, but had to drop below 3600 to allocate to all 4. I can't explain why, and I can't explain why the new drivers do what they do; I'm riding on 13.12 until I pull the plug on my farm. I have also noted that on every 4-card rig I have, GPU 2 needs to run at either a lower frequency, a lower intensity, or a lower memory allocation. I always assumed it was because it was the one driving the display, but didn't dig into it.
legendary
Activity: 1901
Merit: 1024
I just tested with the latest 14.12 Omega drivers. I have 2x 290; one works fine, even 5% faster, but the GPU in the primary slot, whatever I try, gives me:
Error -4: Enqueueing kernel onto command queue. (clEnqueueNDRangeKernel)

Also, I had to drop GPU memory usage a lot, by 15% (from ~3500 MB to 3070 MB), but speed is still better by 5%.

With 13.12 both work fine.

So far I have tried -d 0 and -d 1, and only the card in the second slot works. I also tried connecting the monitor to the Intel integrated GPU only and not using the ATI cards for display; same result...

Does anyone have any idea what I can try?
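As far as I know, OpenCL error -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE, which would fit a buffer that is too large for that card under the new driver. Here is a rough C sketch of the arithmetic, assuming scrypt-chacha with r = 1, a 128-byte chunk per scratchpad entry, N = 2^(Nfactor + 1), and a lookup gap that divides the scratchpad roughly evenly. The function names are hypothetical and this is not YACMiner's own allocation code.

```c
#include <stdint.h>

/* Approximate scratchpad bytes for one hash.
 * Assumes 128 bytes per entry and N = 2^(nfactor + 1) entries,
 * with only every lookup_gap-th entry stored (lookup_gap >= 1). */
static uint64_t scratch_bytes(uint32_t nfactor, uint32_t lookup_gap)
{
    uint64_t n = (uint64_t)1 << (nfactor + 1);
    uint64_t full = 128u * n;
    return (full + lookup_gap - 1) / lookup_gap;  /* round up */
}

/* How many concurrent hashes fit into a buffer of the given size. */
static uint64_t max_threads(uint64_t buffer_bytes, uint32_t nfactor,
                            uint32_t lookup_gap)
{
    return buffer_bytes / scratch_bytes(nfactor, lookup_gap);
}
```

Under these assumptions, NF14 needs about 4 MiB per hash at lookup gap 1 and 2 MiB at lookup gap 2, so a 3070 MiB buffer holds about 1535 hashes at gap 2. Each N factor step doubles the per-hash footprint, so a buffer size that worked at one N factor may fail at the next.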
member
Activity: 81
Merit: 1002
It was only the wind.
Works on 14.9 now; not sure if it did before, though.
hero member
Activity: 693
Merit: 500
Nope. First, clz counts LEADING zeroes, we're trying to count trailing zeroes. Now, imagine we have 1001001 - using clz by itself and subtracting wouldn't work because it would stop at the first 1 it encounters, and there may be more. Not downshifting will screw it up, too - what if the bit we lop off by downshifting happened to be 1? Might work practically, though, not sure.

The nice thing about working with powers of 2 is that there will only ever be a single 1 in the binary representation, so not much else really matters. In VS, the following does give the right answer for all the iterations of N.

Code:
Nfactor = __lzcnt(N >> 1 );

Either way, I'm glad someone is interested in improving this portion of the miner.
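One caveat worth checking, sketched below under stated assumptions: a true leading-zero count of N >> 1 is 32 - k for N = 2^k, not k - 1, so the value would normally need subtracting from 31. On CPUs without the LZCNT instruction the same opcode is generally reported to execute as BSR, which returns the index of the highest set bit; that may be why the one-liner appears to give the right answer on some machines. Both helpers here are software models written for illustration, not the MSVC intrinsics themselves.

```c
#include <stdint.h>

/* Software model of a true leading-zero count (what LZCNT returns). */
static uint32_t lzcnt_model(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Software model of BSR: index of the highest set bit.
 * Real BSR leaves the result undefined for 0; this model returns 0. */
static uint32_t bsr_model(uint32_t x)
{
    return (x == 0) ? 0 : 31 - lzcnt_model(x);
}

/* Nfactor for a power-of-two N using a real leading-zero count. */
static uint32_t nfactor_from_lzcnt(uint32_t N)
{
    return 31 - lzcnt_model(N >> 1);
}
```

For N = 32768 the leading-zero count of N >> 1 is 17, the BSR-style result is 14, and `nfactor_from_lzcnt` also returns 14. The subtracted form should behave the same whether or not the hardware has a real LZCNT, provided the intrinsic actually counts leading zeros.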