
Topic: Wolf's XMR/BCN/DSH CPUMiner - 2x speed compared to LucasJones' - NEW 06/20/2014 - page 16. (Read 547096 times)

tgt
newbie
Activity: 19
Merit: 0
It's fully utilizing the cores (minus one for some reason... maybe the scheduler thread is included in the thread limit?) so I can't see a huge gain other than running it on bare metal.

The Opteron cores are not as efficient per GHz as the Haswells.  The only advantage is that you get physical cores without any hyperthreading.  I'll try to compile the miner on 12.04 to run it on bare metal and see what advantages, if any, show up vs. running it in KVM with CPU passthrough.

The server is about $3300, so it's not exactly cost effective.  Just fun to see.

edit: just noticed the instructions to install it on 12.04, derp.

http://i.imgur.com/iH7vs4r.png
Is that the 16 core Intel system or the AMD 48 core?

Methinks it's the Opteron machine.

Wolf0, thanks for sharing.  I should not be surprised that the coder of the miner is very effective at managing his rented rigs, and really I'm not, but I gotta say that your ~64 H/s/core is quite impressive on the c3.8xlarge instances.

Actually, I can get 70 H/s/core on c3.8xlarge; 64 H/s/core is just with this miner...

I get 64 H/s/core on my Opteron... the speed actually tops out at 26 cores (26*64 = 1664 H/s, same as in the screenshot).  I guess that's the L3 bandwidth limit.  It's amusing to see how Intel and AMD are virtually identical per core due to this similar cache speed limitation.  Using numactl to force workloads onto physical core/cache pairs didn't make a difference.
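One way to locate that plateau empirically is to sweep thread counts and note the reported hashrate at each step. A minimal sketch follows; the thread counts are illustrative, and the actual miner invocation (pool URL, wallet) is left commented out as a placeholder so the loop runs dry:

```shell
#!/bin/sh
# Sweep thread counts to find where the hashrate stops scaling
# (the cache-bandwidth plateau described above).
for t in 16 20 24 26 32 48; do
    echo "testing $t threads"
    # For each count, run the miner for a fixed window and record the rate, e.g.:
    # timeout 120 ./minerd -a cryptonight -o stratum+tcp://POOL:PORT -u WALLET -p x -t "$t"
done
```

Plotting hashrate against thread count should show the knee around the point where L3 bandwidth saturates.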
slb
hero member
Activity: 598
Merit: 501
Hi,
I am trying to compile on Mac, but I receive this error when I 'make'
Quote
make[3]: Nothing to be done for `all-am'.
gcc -DHAVE_CONFIG_H -I.  -pthread   -Qunused-arguments -falign-loops=16 -falign-functions=16 -falign-jumps=16 -falign-labels=16  -Ofast -flto -fuse-linker-plugin -funroll-loops -fvariable-expansion-in-unroller -ftree-loop-if-convert-stores -fmerge-all-constants -fbranch-target-load-optimize2 -fsched2-use-superblocks -maes   -MT minerd-cpu-miner.o -MD -MP -MF .deps/minerd-cpu-miner.Tpo -c -o minerd-cpu-miner.o `test -f 'cpu-miner.c' || echo './'`cpu-miner.c
clang: error: unknown argument: '-falign-loops=16' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-falign-jumps=16' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-falign-labels=16' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fuse-linker-plugin' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fvariable-expansion-in-unroller' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-ftree-loop-if-convert-stores' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fbranch-target-load-optimize2' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
clang: error: unknown argument: '-fsched2-use-superblocks' [-Wunused-command-line-argument-hard-error-in-future]
clang: note: this will be a hard error (cannot be downgraded to a warning) in the future
make[2]: *** [minerd-cpu-miner.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

I have the dependencies installed. 'autogen.sh' and 'configure' run without errors.
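For reference, every flag clang rejects above is a GCC-only optimization. A minimal workaround on a Mac (an assumption on my part, not an official fix from the miner's author) is to override CFLAGS with a subset that clang does understand before configuring:

```shell
#!/bin/sh
# GCC-only flags clang rejects: -falign-*, -fuse-linker-plugin,
# -fvariable-expansion-in-unroller, -ftree-loop-if-convert-stores,
# -fbranch-target-load-optimize2, -fsched2-use-superblocks.
# A clang-safe subset (assumption) keeps the flags clang accepts:
CLANG_CFLAGS="-Ofast -funroll-loops -fmerge-all-constants -maes"
echo "$CLANG_CFLAGS"
# Then rebuild with:
#   ./autogen.sh && CFLAGS="$CLANG_CFLAGS" ./configure && make
```

You lose some of the GCC-specific tuning, so expect a somewhat lower hashrate than the Linux/GCC builds quoted in this thread.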
hero member
Activity: 644
Merit: 502
It's fully utilizing the cores (minus one for some reason... maybe the scheduler thread is included in the thread limit?) so I can't see a huge gain other than running it on bare metal.

The opteron cores are not as efficient per ghz as the haswells.  The only advantage is that you have physical cores without any hyperthreading.  I'll try and compile the miner on 12.04 to run it on bare metal and see what/if any advantages are seen vs running it kvm with cpu passthrough.

The server is about $3300, so it's not exactly cost effective.  Just fun to see.

edit: just noticed the instructions to install it on 12.04, derp.


Is that the 16 core Intel system or the AMD 48 core?

methinks its the Opteron machine.

Wolf0, thanks for sharing.  I should not be surprised that the coder of the miner is very effective at managing his rented rigs, and I am not really, but I gotta say that your ~64 h/s/core is quite impressive on the c3.8xlarge instances. 
hero member
Activity: 979
Merit: 510
It's fully utilizing the cores (minus one for some reason... maybe the scheduler thread is included in the thread limit?) so I can't see a huge gain other than running it on bare metal.

The opteron cores are not as efficient per ghz as the haswells.  The only advantage is that you have physical cores without any hyperthreading.  I'll try and compile the miner on 12.04 to run it on bare metal and see what/if any advantages are seen vs running it kvm with cpu passthrough.

The server is about $3300, so it's not exactly cost effective.  Just fun to see.

edit: just noticed the instructions to install it on 12.04, derp.


Is that the 16 core Intel system or the AMD 48 core?
legendary
Activity: 1092
Merit: 1000
It's fully utilizing the cores (minus one for some reason... maybe the scheduler thread is included in the thread limit?) so I can't see a huge gain other than running it on bare metal.

The opteron cores are not as efficient per ghz as the haswells.  The only advantage is that you have physical cores without any hyperthreading.  I'll try and compile the miner on 12.04 to run it on bare metal and see what/if any advantages are seen vs running it kvm with cpu passthrough.

The server is about $3300, so it's not exactly cost effective.  Just fun to see.

edit: just noticed the instructions to install it on 12.04, derp.



What do you get when running on only 24 cores?
tgt
newbie
Activity: 19
Merit: 0
It's fully utilizing the cores (minus one for some reason... maybe the scheduler thread is included in the thread limit?) so I can't see a huge gain other than running it on bare metal.

The opteron cores are not as efficient per ghz as the haswells.  The only advantage is that you have physical cores without any hyperthreading.  I'll try and compile the miner on 12.04 to run it on bare metal and see what/if any advantages are seen vs running it kvm with cpu passthrough.

The server is about $3300, so it's not exactly cost effective.  Just fun to see.

edit: just noticed the instructions to install it on 12.04, derp.

http://i.imgur.com/iH7vs4r.png
hero member
Activity: 644
Merit: 502
OK.  On what instance type did you get 1030 hash/sec with 16 cores?
hero member
Activity: 644
Merit: 502
1600 H/s:

4x Opteron 6344 (48 physical cores @ 2.6ghz static)
64G ram (12x4GB ddr3-
14.04, built from wolf's git source

inside KVM with cpu-passthrough (couldn't get it to run on 12.04 which is host's bare metal OS, could potentially be faster)
sysctl -w vm.nr_hugepages=144

pool reporting > 2 kH/s, miner at 1600-1650 H/s.



Could probably get better performance by a) running it on baremetal, b) using numactl and a better task scheduler.



I hit 1030 or so with 23 threads on AWS - you probably need to reduce your thread count.

Wolf, why would he want to REDUCE his hashrate?

He reports 1600 hash/sec and you report 1030.  Huh

Because he has 48 physical cores, I had 16.

OK. What AWS instance type has Opterons?
member
Activity: 81
Merit: 1002
It was only the wind.
If you consider cpu mining, you should consider the whole PC consumption, not just CPU.
Making a "traditional" desktop computer with a 4770K will cost more than a GPU.

You're thinking from a single minded perspective. You are actually seeing the INTENTIONAL limitation of this algorithm.

My kids have a 2500k each and they get 110H/s when the CPU is at 50% the whole time they're on it. They use this miner in Windows. Measured AT THE WALL the power consumption goes up by 30w when the miner starts if the PC was at idle, when I have Hearthstone running in window mode, it is only going up by 20w with the miner.

So effectively, regular crappy $300 computers that I bought for my kids are getting me 110H/s for somewhere between 20w and 30w depending on what they're doing. An R9 280x draws around 300w from the wall at full power, if Claymore's miner is only using half their power, it would be 150w.

To break even in H/s you'd need to be getting closer to 660H/s per card, your results show 460 per card.
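That break-even figure follows directly from the numbers in the post; a quick back-of-envelope check (25 W is the midpoint of the stated 20-30 W range):

```shell
#!/bin/sh
# CPU: 110 H/s for ~25 W at the wall; GPU power budget: 150 W.
cpu_hs=110
cpu_w=25
gpu_w=150
# Scale the CPU's H/s-per-watt up to the GPU's power draw:
breakeven=$(( cpu_hs * gpu_w / cpu_w ))
echo "a GPU at ${gpu_w} W must exceed ${breakeven} H/s to match"
```

110 H/s / 25 W = 4.4 H/s per watt, and 4.4 × 150 W = 660 H/s, matching the figure above.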

This means that people can't just buy a crap tonne of equipment and own the coin. It was intentionally made to be this way.

EDIT: Forgot to mention that the kids don't think it affects their gameplay. They play mostly Hearthstone, Path of Exile, Diablo 3, League of Legends and DotA 2.

The CryptoNight algo was not designed to be more CPU-friendly than GPU-friendly. It just turns out to be more CPU-friendly in practice.
I'm not complaining. I have some CPUs at home (a dual Xeon 2687W and a [email protected]), and I'm neither for nor against CPU mining. I have a few GPUs and some CPUs.
But just measuring the difference between when your kids' computers are mining and when they are not is not, well, a good measure.
Such a computer should draw ~250W while mining (measured at the wall). Maybe I'm wrong; I'll let you take the measurement.
A simple rig designed for GPU mining, with a small CPU (ga2016/2020), draws 80W at idle and 250W when mining XMR with one R9 280X.
With the 5 R9 280X: 2300 H/s, 1000W measured at the wall.
OK, my dual Xeon gives me 960 H/s for less power, but I think we will see a lot of optimization (for both, I hope) in the near future.

Quote
This means that people can't just buy a crap tonne of equipment and own the coin. It was intentionally made to be this way.
Why do you think it was intentionally designed this way? To be fair?
GPU-friendly coins bring GPU farms and multipools; CPU-only or CPU-friendly coins bring botnets and Amazon EC2 instances (see the Boolberry thread, where DGA talks about running 200 EC2 instances himself, and he is far from the biggest one). In either case I'm still a very small miner.

EDIT :
GPU miner coming to nvidia card (not released yet)
https://bitcointalksearch.org/topic/m.7458872

First test, with 6 x 750Ti: 270W at the wall (something like 35w per card), ~160 H/s per card

It's not released, and it's not going to be released. Trust me.


It was released early this morning..
https://github.com/tsiv/ccminer-cryptonight

Well, I stand corrected. But it sucks on AWS - I assumed it'd be great there. Probably why it was released.
sr. member
Activity: 378
Merit: 250
Ok, well I guess every day I have is a "bad day" compared to the world you live in, where six figures USD worth of hardware is at your disposal and you nonchalantly work them like rented mules, mining XMR like it's going out of style.  Kick a brother down a Xeon server or two, will ya?  Wink

It's more in the region of 7 figures USD worth of hardware. At the moment I'm just playing around on a portion of the development cluster.
Need to make sure things are running optimally before XMR mining is moved to production. I'm responsible that way Tongue
hero member
Activity: 644
Merit: 502
1600 H/s:

4x Opteron 6344 (48 physical cores @ 2.6ghz static)
64G ram (12x4GB ddr3-
14.04, built from wolf's git source

inside KVM with cpu-passthrough (couldn't get it to run on 12.04 which is host's bare metal OS, could potentially be faster)
sysctl -w vm.nr_hugepages=144

pool reporting > 2 kH/s, miner at 1600-1650 H/s.



Could probably get better performance by a) running it on baremetal, b) using numactl and a better task scheduler.



I hit 1030 or so with 23 threads on AWS - you probably need to reduce your thread count.

Wolf, why would he want to REDUCE his hashrate?

He reports 1600 hash/sec and you report 1030.  Huh
hero member
Activity: 644
Merit: 502

really, because 10,000/528 = ~19.  19 is a "few machines"?

edit: and that diff is pathetically low.  444 is for a Pentium 2 or some shit.

Yes, 19 is a "few machines". There are plenty more to use across different pools / coins.
Difficulty is assigned by the network and will scale up and down. 444 is the starting difficulty for that pool, and it's gotten considerably higher now.

Lighten up, you sound like you're having a bad day Wink

Ok, well I guess every day I have is a "bad day" compared to the world you live in, where six figures USD worth of hardware is at your disposal and you nonchalantly work them like rented mules, mining XMR like it's going out of style.  Kick a brother down a Xeon server or two, will ya?  Wink
member
Activity: 81
Merit: 1002
It was only the wind.
I'm getting around 180 H/s from my Xeon E5-2620 using this miner on Windows 7 64-bit. My hashrate went up from 150 H/s by switching to this miner. Seems a bit small, since I'm under the impression this processor should be rather good? I'm new to CPU mining; I did some mining before with my Quadro K4000, but that's just a waste of time, so I decided to try the CPU.

OT: I guess you are the same Wolf that runs the pool. I'm seeing only 70 H/s in the pool statistics, is that normal?

If it's only been a short time, yes, it's normal, as the pool-side estimate takes about 10 minutes to become accurate.
legendary
Activity: 1092
Merit: 1000
I'm a Firestarter!
1600 H/s:

4x Opteron 6344 (48 physical cores @ 2.6ghz static)
64G ram (12x4GB ddr3-
14.04, built from wolf's git source

inside KVM with cpu-passthrough (couldn't get it to run on 12.04 which is host's bare metal OS, could potentially be faster)
sysctl -w vm.nr_hugepages=144

pool reporting > 2 kH/s, miner at 1600-1650 H/s.



Could probably get better performance by a) running it on baremetal, b) using numactl and a better task scheduler.

Reduce it to sysctl -w vm.nr_hugepages=48
That will give at least 20% more hashing power.
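The 48-page suggestion follows from CryptoNight's memory layout: each mining thread uses a 2 MB scratchpad, which fits exactly one 2 MB huge page, so the page count should match the thread count. A quick sanity calculation (sketch; the sysctl line is left as a comment since it needs root):

```shell
#!/bin/sh
# One 2 MB scratchpad per mining thread, one 2 MB huge page each.
threads=48
scratchpad_mb=2
hugepage_mb=2
pages=$(( threads * scratchpad_mb / hugepage_mb ))
echo "vm.nr_hugepages=$pages"
# Apply with: sudo sysctl -w vm.nr_hugepages=$pages
# Verify with: grep HugePages /proc/meminfo
```

Allocating far more pages than threads (e.g. 144 for 48 threads) just wastes pinned memory the miner never touches.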
tgt
newbie
Activity: 19
Merit: 0
1600 H/s:

4x Opteron 6344 (48 physical cores @ 2.6ghz static)
64G ram (12x4GB ddr3-
14.04, built from wolf's git source

inside KVM with cpu-passthrough (couldn't get it to run on 12.04 which is host's bare metal OS, could potentially be faster)
sysctl -w vm.nr_hugepages=144

pool reporting > 2 kH/s, miner at 1600-1650 H/s.

http://i.imgur.com/qdQkiBi.png

Could probably get better performance by a) running it on baremetal, b) using numactl and a better task scheduler.

member
Activity: 81
Merit: 1002
It was only the wind.
If you consider cpu mining, you should consider the whole PC consumption, not just CPU.
Making a "traditional" desktop computer with a 4770K will cost more than a GPU.

You're thinking from a single minded perspective. You are actually seeing the INTENTIONAL limitation of this algorithm.

My kids have a 2500k each and they get 110H/s when the CPU is at 50% the whole time they're on it. They use this miner in Windows. Measured AT THE WALL the power consumption goes up by 30w when the miner starts if the PC was at idle, when I have Hearthstone running in window mode, it is only going up by 20w with the miner.

So effectively, regular crappy $300 computers that I bought for my kids are getting me 110H/s for somewhere between 20w and 30w depending on what they're doing. An R9 280x draws around 300w from the wall at full power, if Claymore's miner is only using half their power, it would be 150w.

To break even in H/s you'd need to be getting closer to 660H/s per card, your results show 460 per card.

This means that people can't just buy a crap tonne of equipment and own the coin. It was intentionally made to be this way.

EDIT: Forgot to mention that the kids don't think it affects their gameplay. They play mostly Hearthstone, Path of Exile, Diablo 3, League of Legends and DotA 2.

The CryptoNight algo was not designed to be more CPU-friendly than GPU-friendly. It just turns out to be more CPU-friendly in practice.
I'm not complaining. I have some CPUs at home (a dual Xeon 2687W and a [email protected]), and I'm neither for nor against CPU mining. I have a few GPUs and some CPUs.
But just measuring the difference between when your kids' computers are mining and when they are not is not, well, a good measure.
Such a computer should draw ~250W while mining (measured at the wall). Maybe I'm wrong; I'll let you take the measurement.
A simple rig designed for GPU mining, with a small CPU (ga2016/2020), draws 80W at idle and 250W when mining XMR with one R9 280X.
With the 5 R9 280X: 2300 H/s, 1000W measured at the wall.
OK, my dual Xeon gives me 960 H/s for less power, but I think we will see a lot of optimization (for both, I hope) in the near future.

Quote
This means that people can't just buy a crap tonne of equipment and own the coin. It was intentionally made to be this way.
Why do you think it was intentionally designed this way? To be fair?
GPU-friendly coins bring GPU farms and multipools; CPU-only or CPU-friendly coins bring botnets and Amazon EC2 instances (see the Boolberry thread, where DGA talks about running 200 EC2 instances himself, and he is far from the biggest one). In either case I'm still a very small miner.

EDIT :
GPU miner coming to nvidia card (not released yet)
https://bitcointalksearch.org/topic/m.7458872

First test, with 6 x 750Ti: 270W at the wall (something like 35w per card), ~160 H/s per card

It's not released, and it's not going to be released. Trust me.
legendary
Activity: 1092
Merit: 1000

Hm... try this:

Code:
make distclean
./autogen.sh
CFLAGS="-I/tmp/curl/include" LDFLAGS="-static -L/tmp/curl/lib" ./configure --with-libcurl=/tmp/curl

Same error,
checking for libcurl >= version 7.15.2... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.15.2


Dynamic compilation works just fine...

Pastebin config.log.




http://pastebin.com/UMLA4K6B
You built libcurl with zlib support, but you did not build a static copy of zlib. Because you forced static compilation, the linker looks for a static libz, cannot find it, and the libcurl check fails.
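Given that diagnosis, one way out (a sketch, not the only fix) is to rebuild a static libcurl with zlib disabled, so no static libz is needed at all. The /tmp/curl prefix mirrors the one used earlier in this thread:

```shell
#!/bin/sh
# Rebuild a static libcurl without zlib, then point the miner's configure at it.
wget http://curl.haxx.se/download/curl-7.34.0.tar.gz
tar xzf curl-7.34.0.tar.gz
cd curl-7.34.0
./configure --disable-shared --enable-static --without-zlib --prefix=/tmp/curl
make && make install
# Then in the miner's source tree:
#   ./autogen.sh
#   CFLAGS="-static" ./configure --with-libcurl=/tmp/curl
```

The alternative is to build a static zlib first and keep --with-zlib; either way the static link requirement must be satisfiable.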


PFF, stupid curl configure script, version 7.37 is buggy as hell!!

I got it compiled now with 7.34 :

Static CURL :
Code:
wget http://curl.haxx.se/download/curl-7.34.0.tar.gz
tar xzf curl-7.34.0.tar.gz && cd curl-7.34.0
./configure --disable-shared --enable-static
make -j 4 && make install

cpu miner
Code:
./autogen.sh
./configure CFLAGS="-static"
make -j 4

legendary
Activity: 1092
Merit: 1000

Hm... try this:

Code:
make distclean
./autogen.sh
CFLAGS="-I/tmp/curl/include" LDFLAGS="-static -L/tmp/curl/lib" ./configure --with-libcurl=/tmp/curl

Same error,
checking for libcurl >= version 7.15.2... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.15.2


Dynamic compilation works just fine...

Pastebin config.log.




http://pastebin.com/UMLA4K6B
legendary
Activity: 1092
Merit: 1000

Hm... try this:

Code:
make distclean
./autogen.sh
CFLAGS="-I/tmp/curl/include" LDFLAGS="-static -L/tmp/curl/lib" ./configure --with-libcurl=/tmp/curl

Same error,
checking for libcurl >= version 7.15.2... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.15.2


Dynamic compilation works just fine...
legendary
Activity: 1092
Merit: 1000
Curl 7.31.1 compiled with :
./configure --disable-shared --enable-static --prefix=/usr/local --disable-ldap --disable-sspi
make -j 4;make install;

miner still errors out:
./autogen.sh
./configure CFLAGS="-static"
checking for the version of libcurl... 7.37.1
checking for libcurl >= version 7.15.2... yes
checking whether libcurl is usable... no
configure: error: Missing required libcurl >= 7.15.2

Any ideas?

./configure is looking for curl under its default prefix in /usr (depending on the distro). You might want to configure curl like this:
./configure --disable-shared --enable-static --prefix=/tmp/curl --disable-ldap --disable-sspi
make ; make install

then the miner:
./autogen.sh
./configure CFLAGS="-static" --with-libcurl=/tmp/curl
make



Tried it, no luck. Curl author should be hanged in public!!