
Topic: NSGminer v0.9.4: The Fastest NeoScrypt GPU Miner - page 26. (Read 221582 times)

member
Activity: 81
Merit: 1002
It was only the wind.
I have compiled and uploaded the 64-bit Windows binaries per numerous requests. Tested fine on a Win7 notebook I've got recently.

https://github.com/ghostlander/nsgminer/releases/tag/nsgminer-v0.9.0

Wolf0, your kernel distributed by NiceHash with their miner is also well done, could use an idea or two out of it. One of my primary concerns was to get rid of scratch register usage to run more than one wavefront concurrently, but FastKDF seems to be too complicated to fit VRegs and SRegs alone. Although I've cut the scratch reg usage down by half which also helps.


Thanks for the compilation mate!

But I am getting these speeds with R9 280Xs using NSGminer (Win7, 64-bit, Catalyst 14.6):



Not really an improvement compared to what I'm getting with sgminer5-2-1-general:



Any suggestions?


5.2.1 general uses my code. You'll want different settings for his - also, intensity can't be tuned well on his, it's very coarse-grained.

As you can see partly in the top of the screenshots the settings are very different. For NSGminer I use the same command as stated in the OP: nsgminer --neoscrypt -g 1 -w 128 -I 16 (so not using the engine 1000 and memory 1500 option).

And for sgminer I use: sgminer.exe --algorithm neoscrypt --nfactor 10 --xintensity 2 --thread-concurrency 8192 --gpu-threads 2

When using 2 threads and an intensity of 13 with NSGminer a GPU immediately hangs, which is very unusual for R9 280Xs with these settings.

So for now I'll keep using your code, Wolf0 !


Nfactor is useless with mine - mine doesn't use TC, either. Feel free to omit them.
hero member
Activity: 935
Merit: 1001
I don't always drink...
Attempting to run miner on win 7-64 get the libwinpthread-1.dll is missing error.

My bad, have forgotten about static linking this time. It's fine now.

Code:
$ objdump -x nsgminer.exe | grep "DLL Name:"
        DLL Name: KERNEL32.dll
        DLL Name: msvcrt.dll
        DLL Name: msvcrt.dll
        DLL Name: USER32.dll
        DLL Name: WS2_32.dll


Thanks, I am running some tests on your latest Win 64 miner now with various AMD drivers on four R9 280x AMDs.  I'll report back later.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
Attempting to run miner on win 7-64 get the libwinpthread-1.dll is missing error.

My bad, have forgotten about static linking this time. It's fine now.

Code:
$ objdump -x nsgminer.exe | grep "DLL Name:"
        DLL Name: KERNEL32.dll
        DLL Name: msvcrt.dll
        DLL Name: msvcrt.dll
        DLL Name: USER32.dll
        DLL Name: WS2_32.dll
member
Activity: 81
Merit: 1002
It was only the wind.
I have compiled and uploaded the 64-bit Windows binaries per numerous requests. Tested fine on a Win7 notebook I've got recently.

https://github.com/ghostlander/nsgminer/releases/tag/nsgminer-v0.9.0

Wolf0, your kernel distributed by NiceHash with their miner is also well done, could use an idea or two out of it. One of my primary concerns was to get rid of scratch register usage to run more than one wavefront concurrently, but FastKDF seems to be too complicated to fit VRegs and SRegs alone. Although I've cut the scratch reg usage down by half which also helps.


Thanks for the compilation mate!

But I am getting these speeds with R9 280Xs using NSGminer (Win7, 64-bit, Catalyst 14.6):



Not really an improvement compared to what I'm getting with sgminer5-2-1-general:



Any suggestions?


5.2.1 general uses my code. You'll want different settings for his - also, intensity can't be tuned well on his, it's very coarse-grained.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
There is also CL_DEVICE_MAX_MEM_ALLOC_SIZE, the largest single buffer, which may be as low as 25% of CL_DEVICE_GLOBAL_MEM_SIZE (the OpenCL specification guarantees no more than a quarter of global memory, or 128 MB if that is larger). It's used for actual allocation by the miner. Maybe the driver adjusts it according to the number of GPUs available in the system to make sure they don't run out of memory in the worst case.

The goal of NeoScrypt was to create something both computationally and memory intensive yet suitable for practical use. It wasn't a design objective to make it GPU resistant. Quite the opposite actually. I didn't want it CPU only or ASIC only. Botnets rule in the 1st case and ASIC manufacturers with their farms in the 2nd one. The GPUs are optimal for decentralisation. If I increased memory hardness by designing a complicated tree structure resistant to TMTO attacks, it could phase the GPUs out or reduce their efficiency to an intolerable level. I also wanted NeoScrypt to be backward compatible with Scrypt for ease of deployment. Although NeoScrypt is memory strong anyway, it's a more balanced solution somewhere between computationally intensive algorithms like SHA-256 and memory dependent like Scrypt. I think neither SHA-256 nor Scrypt ASICs can support NeoScrypt easily as a by-product and its market share is too low to produce a customised ASIC design. Maybe in a few years, who knows.
hero member
Activity: 935
Merit: 1001
I don't always drink...
Attempting to run miner on win 7-64 get the libwinpthread-1.dll is missing error.

Try copying it from any other miner.

I just did that; now I get another error: The application was unable to start correctly (0xc000007b)
member
Activity: 81
Merit: 1002
It was only the wind.
win7-64-r9-280x-15.7 driver


nsgminer --neoscrypt -g 1 -w 128 -I 16  -o stratum+tcp://strat.dnb.io:3028 -O ygi.1:1234


what's wrong?


Catalysts above 14.7 don't work currently. Even if I patch the kernel to work around the issue, the performance is likely to decrease.


You might not be freeing some GPU memory or doing something odd with the driver - the miner seems to cause some instability, especially if restarted a few times without rebooting the system.

It doesn't (or shouldn't) allow increasing the intensity once started, because that requires allocating more memory, which it cannot currently do. However, it's possible to decrease it and then increase it back up to the starting limit. I'll check the code in case I've missed something, but the driver is supposed to destroy all buffers when the miner quits.


I halved the memory requirement in NeoScrypt by exploiting the fact that the scratchpad isn't updated as you walk it: I store only every second value and recompute the others on demand. The code is more descriptive:

Code:

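// Fill pass: store only every second state, so 64 entries stand in for 128.
for(int i = 0; i < 64; ++i)
{
    Scratchpad[i] = X;        // state after 2*i mixes
    X = BlkMix(BlkMix(X));    // advance two steps per stored entry
}

// Read pass: 128 lookups, same count as the unmodified algorithm.
for(int i = 0; i < 128; ++i)
{
    uint idx = ((uint *)X)[48] & 0x7F;    // index into the virtual 128-entry pad

    if(idx & 1)
    {
        // Odd entries were never stored: recompute from the even neighbour.
        Block tmp = Scratchpad[idx >> 1];
        X ^= BlkMix(tmp);
    }
    else
    {
        X ^= Scratchpad[idx >> 1];
    }

    X = BlkMix(X);
}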
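
To check that the halved scratchpad gives the same result as storing all 128 entries, here is a small host-side sketch in plain C. It is not the kernel code: toy_blkmix is a made-up stand-in for the real block mix, the 64-word block layout is an assumption, and smix_full, smix_half and halved_matches_full are hypothetical names used only for this comparison.

```c
#include <stdint.h>
#include <string.h>

#define WORDS 64    /* assumed: one block viewed as 64 x 32-bit words */
#define N     128   /* scratchpad entries in the unmodified loop */

typedef struct { uint32_t w[WORDS]; } Block;

/* Toy stand-in for BlkMix; NOT the real Salsa20/ChaCha20 mixing. */
static Block toy_blkmix(Block x)
{
    for (int i = 0; i < WORDS; ++i) {
        uint32_t v = x.w[i] + x.w[(i + 1) % WORDS] + 0x9E3779B9u;
        x.w[i] = (v << 7) | (v >> 25);   /* rotate left by 7 */
    }
    return x;
}

static Block blk_xor(Block a, Block b)
{
    for (int i = 0; i < WORDS; ++i)
        a.w[i] ^= b.w[i];
    return a;
}

/* Reference variant: every intermediate state is stored. */
static Block smix_full(Block x)
{
    static Block pad[N];
    for (int i = 0; i < N; ++i) { pad[i] = x; x = toy_blkmix(x); }
    for (int i = 0; i < N; ++i) {
        uint32_t idx = x.w[48] & 0x7F;
        x = toy_blkmix(blk_xor(x, pad[idx]));
    }
    return x;
}

/* Halved variant: store even states only, recompute odd ones on lookup. */
static Block smix_half(Block x)
{
    static Block pad[N / 2];
    for (int i = 0; i < N / 2; ++i) { pad[i] = x; x = toy_blkmix(toy_blkmix(x)); }
    for (int i = 0; i < N; ++i) {
        uint32_t idx = x.w[48] & 0x7F;
        Block v = pad[idx >> 1];
        if (idx & 1)
            v = toy_blkmix(v);           /* one extra mix for odd indices */
        x = toy_blkmix(blk_xor(x, v));
    }
    return x;
}

/* Returns 1 when both variants agree for a seed-derived input block. */
static int halved_matches_full(uint32_t seed)
{
    Block in;
    for (int i = 0; i < WORDS; ++i)
        in.w[i] = seed * 2654435761u + (uint32_t)i;
    Block a = smix_full(in), b = smix_half(in);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

The two variants agree for any deterministic mix function, because entry 2k+1 of the full pad is always BlkMix of entry 2k. The price is one extra BlkMix on every odd lookup, which fits with the lower hashrate reported later in the thread.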
hero member
Activity: 935
Merit: 1001
I don't always drink...
Attempting to run miner on win 7-64 get the libwinpthread-1.dll is missing error.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
EDIT: How are you running intensity 16 and 17? Not even my 290X or Fury will enqueue the kernel with that setting.

Hope your OS is 64-bit? It's possible if you have a lot of free system memory. The driver allocates huge buffers there for swapping. CL_DEVICE_GLOBAL_MEM_SIZE should tell how much is available. clinfo outputs it under "Global memory size".
member
Activity: 81
Merit: 1002
It was only the wind.
win7-64-r9-280x-15.7 driver


nsgminer --neoscrypt -g 1 -w 128 -I 16  -o stratum+tcp://strat.dnb.io:3028 -O ygi.1:1234


what's wrong?


Catalysts above 14.7 don't work currently. Even if I patch the kernel to work around the issue, the performance is likely to decrease.


You might not be freeing some GPU memory or doing something odd with the driver - the miner seems to cause some instability, especially if restarted a few times without rebooting the system.

My 7950s seem to like doing the two SMix calls in parallel more than your original. However, I tried reducing the memory usage in both your design (sequential SMix calls) and mine (parallel SMix calls), and at least Tahiti is having none of that shit. Halving the memory usage allows higher intensities but gives a lower hashrate - the only gain is from the new SMix.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
NSGminer v0.9.1 released with my NeoScrypt OpenCL kernel v6. Should be compatible with the latest AMD Catalyst drivers. Also delivers a little performance improvement over the previous release.

Tried to use it, but this one complains about libcurl not supporting stratum+tcp, and if I remove it, the miner just tries http...

EDIT: Sorry, I'm an idiot; my test pool is down, but their website is up...

You have debug enabled probably. Of course libcurl doesn't support stratum+tcp. Disregard it.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
NSGminer v0.9.1 released with my NeoScrypt OpenCL kernel v6. Should be compatible with the latest AMD Catalyst drivers. Also delivers a little performance improvement over the previous release.
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
For NSGminer, 4 GB of video RAM should allow -I 17 with minor swapping over PCIe, as some driver-specific buffers also consume space. Even an R9 280X can do -I 17 with only 3 GB of video RAM, but bus swapping is heavy, the performance gain over -I 16 is negligible, and power consumption rises noticeably.
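
The arithmetic behind those figures can be sketched roughly as below. Two assumptions here are not stated in this thread: that intensity I launches 2^I work items, and that each work item needs a 32 KiB scratchpad (128 entries of 256 bytes). scratch_bytes and fits_in_vram are hypothetical helper names, not part of NSGminer.

```c
#include <stdint.h>

/* Assumed per-hash scratchpad: 128 entries x 256 bytes = 32 KiB. */
#define PAD_BYTES (128u * 256u)

/* Total scratchpad memory for a given intensity, assuming 2^I work items. */
static uint64_t scratch_bytes(unsigned intensity)
{
    return ((uint64_t)1 << intensity) * PAD_BYTES;
}

/* 1 if the scratchpads alone fit in the given amount of video RAM.
 * Driver-side buffers are ignored, so a real card has less headroom. */
static int fits_in_vram(unsigned intensity, uint64_t vram_bytes)
{
    return scratch_bytes(intensity) <= vram_bytes;
}
```

Under these assumptions -I 16 needs 2 GiB and -I 17 needs 4 GiB of scratchpad, which would explain why a 3 GB card has to swap over the bus at -I 17 and a 4 GB card only barely avoids it.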

As I've mentioned in the OP, use Catalyst 14.7 RC3 for Windows or 14.6 beta for Linux (fglrx-14.200).

NSGminer has fewer rejects than SGminer even with stale share submission enabled. You shouldn't care much about WU because you can see the share-based hash rate for every card. It looks like this:

OCL 1:  67.5C 2970RPM | 341.8 337.6 360.3 KH/s | A:658 R:14 HW:0 U:0.9/m
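
If you want a reject percentage out of a status line like the one above, a minimal sketch in C follows. It assumes the A:, R: and HW: fields always appear in that order, as in the sample line; parse_shares and reject_percent are hypothetical helpers, not something the miner provides.

```c
#include <stdio.h>
#include <string.h>

/* Extracts accepted, rejected and hardware-error counts from a status line.
 * Returns 0 on success, -1 if the "A:.. R:.. HW:.." fields are not found. */
static int parse_shares(const char *line, unsigned *accepted,
                        unsigned *rejected, unsigned *hw_errors)
{
    const char *p = strstr(line, "A:");
    if (p == NULL)
        return -1;
    if (sscanf(p, "A:%u R:%u HW:%u", accepted, rejected, hw_errors) != 3)
        return -1;
    return 0;
}

/* Rejected shares as a percentage of all submitted shares. */
static double reject_percent(unsigned accepted, unsigned rejected)
{
    unsigned total = accepted + rejected;
    return total ? 100.0 * rejected / total : 0.0;
}
```

For the sample line (A:658 R:14) this comes to roughly 2.1% rejects.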
member
Activity: 81
Merit: 1002
It was only the wind.
I have compiled and uploaded the 64-bit Windows binaries per numerous requests. Tested fine on a Win7 notebook I've got recently.

https://github.com/ghostlander/nsgminer/releases/tag/nsgminer-v0.9.0

Wolf0, your kernel distributed by NiceHash with their miner is also well done, could use an idea or two out of it. One of my primary concerns was to get rid of scratch register usage to run more than one wavefront concurrently, but FastKDF seems to be too complicated to fit VRegs and SRegs alone. Although I've cut the scratch reg usage down by half which also helps.


I know a bit of GCN assembly, but I usually stick to modifying disassembly from the OCL compiler to patch in things like GCN lane shuffle, as it's a pain to write a whole kernel in it - might be possible to dump the scratch regs using that, though.

I'm currently working on doing the SMix calls concurrently, but I'll see if I can coerce the compiler to let go of some scratch regs in FastKDF later.
newbie
Activity: 5
Merit: 0
Hi guys,
I'm new here.

Just a few questions:

For NSGminer 0.9.0:
- What's the best driver for an R9 290 on Win 7 64-bit?
- What's the minimum system RAM? (4G+4G=8G or 8G+8G=16G)
- Should I use a .conf file or a .bat file? (What are the optimal parameters?)

For SGminer 5.2.1:
- What's the best driver for an R9 290 on Win 7 64-bit?
- What's the minimum system RAM? (4G+4G=8G or 8G+8G=16G)
- Should I use a .conf file or a .bat file? (What are the optimal parameters?)

Sorry if these questions have been asked before.
The Bitcointalk forum keeps getting bigger, with thousands of posts, and it's hard to find the exact post to use as a reference.

Happy New Year anyway.


P.S. Could someone kindly post a screenshot of NSGminer 0.9.0 running successfully on NeoScrypt (Feathercoin) that shows the KH/s, R: and WU: numbers?
full member
Activity: 132
Merit: 100
I have compiled and uploaded the 64-bit Windows binaries per numerous requests. Tested fine on a Win7 notebook I've got recently.

https://github.com/ghostlander/nsgminer/releases/tag/nsgminer-v0.9.0

Wolf0, your kernel distributed by NiceHash with their miner is also well done, could use an idea or two out of it. One of my primary concerns was to get rid of scratch register usage to run more than one wavefront concurrently, but FastKDF seems to be too complicated to fit VRegs and SRegs alone. Although I've cut the scratch reg usage down by half which also helps.


Thanks for the compilation mate!

But I am getting these speeds with R9 280Xs using NSGminer (Win7, 64-bit, Catalyst 14.6):



Not really an improvement compared to what I'm getting with sgminer5-2-1-general:



Any suggestions?


5.2.1 general uses my code. You'll want different settings for his - also, intensity can't be tuned well on his, it's very coarse-grained.

As you can see partly in the top of the screenshots the settings are very different. For NSGminer I use the same command as stated in the OP: nsgminer --neoscrypt -g 1 -w 128 -I 16 (so not using the engine 1000 and memory 1500 option).

And for sgminer I use: sgminer.exe --algorithm neoscrypt --nfactor 10 --xintensity 2 --thread-concurrency 8192 --gpu-threads 2

When using 2 threads and an intensity of 13 with NSGminer a GPU immediately hangs, which is very unusual for R9 280Xs with these settings.

So for now I'll keep using your code, Wolf0 !
full member
Activity: 132
Merit: 100
I have compiled and uploaded the 64-bit Windows binaries per numerous requests. Tested fine on a Win7 notebook I've got recently.

https://github.com/ghostlander/nsgminer/releases/tag/nsgminer-v0.9.0

Wolf0, your kernel distributed by NiceHash with their miner is also well done, could use an idea or two out of it. One of my primary concerns was to get rid of scratch register usage to run more than one wavefront concurrently, but FastKDF seems to be too complicated to fit VRegs and SRegs alone. Although I've cut the scratch reg usage down by half which also helps.


Thanks for the compilation mate!

But I am getting these speeds with R9 280Xs using NSGminer (Win7, 64-bit, Catalyst 14.6):



Not really an improvement compared to what I'm getting with sgminer5-2-1-general:



Any suggestions?
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
win7-64-r9-280x-15.7 driver


nsgminer --neoscrypt -g 1 -w 128 -I 16  -o stratum+tcp://strat.dnb.io:3028 -O ygi.1:1234


what's wrong?


Catalysts above 14.7 don't work currently. Even if I patch the kernel to work around the issue, the performance is likely to decrease.


You might not be freeing some GPU memory or doing something odd with the driver - the miner seems to cause some instability, especially if restarted a few times without rebooting the system.

It doesn't (or shouldn't) allow increasing the intensity once started, because that requires allocating more memory, which it cannot currently do. However, it's possible to decrease it and then increase it back up to the starting limit. I'll check the code in case I've missed something, but the driver is supposed to destroy all buffers when the miner quits.
member
Activity: 81
Merit: 1002
It was only the wind.
Well done on this one. The only ideas I have left are parallelism in doing FastKDF and the SMix calls - FastKDF could be done in parallel for multiple nonces in one kernel, and the SMix calls are independent (at the cost of doubling the scratchpad.)
legendary
Activity: 1239
Merit: 1020
No surrender, no retreat, no regret.
win7-64-r9-280x-15.7 driver


nsgminer --neoscrypt -g 1 -w 128 -I 16  -o stratum+tcp://strat.dnb.io:3028 -O ygi.1:1234


what's wrong?


Catalysts above 14.7 don't work currently. Even if I patch the kernel to work around the issue, the performance is likely to decrease.