Topic: Wolf's XMR/BCN/DSH CPUMiner - 2x speed compared to LucasJones' - NEW 06/20/2014 - page 20. (Read 547105 times)

member
Activity: 81
Merit: 1002
It was only the wind.
Just a quick check: I am getting roughly 220 H/s with my QA82 ES Xeon processor with 12 GB of RAM. Would more RAM make any difference?

No, but try a different number of threads.

Can I run this miner to merge mine on Minergate?


Yes.

Another question: I'm running multiple VPSes.

On Ubuntu, I get around 100 H/s.
On the same hardware, but running a Windows image, I get 280. Why?

You're probably forgetting hugepages.

Mining with 9 threads for some reason produces 284 H/s, compared to running with, say, 16. Why would that be the case? I take it hyperthreading makes no difference? Also, on your front page you said the Amazon EC2 instance manages 1.2 KH/s. Which CPUs would those be?

Hyperthreading is pretty much useless here. Yep, and those are dual Xeons that cost $6500+ each.
legendary
Activity: 1610
Merit: 1000
Crackpot Idealist
This might vary a little depending on which Ubuntu version you are using, and I have not run this in a while, so it might also have a few unneeded steps, but it should get you started. This is tested to work on Ubuntu 12.04 x64 with an AES-capable CPU. Everything is done in your terminal. The files are downloaded to your home folder (i.e. ~/cpuminer-multi).

First your dependencies:
Code:
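# add-apt-repository is provided by python-software-properties on Ubuntu 12.04, so install that first
sudo apt-get install python-software-properties -y
sudo add-apt-repository ppa:costamagnagianfranco/autoconf -y
sudo apt-get update
sudo apt-get install build-essential screen automake m4 openssl libssl-dev git libjson0 libjson0-dev libcurl4-openssl-dev autoconf -y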

Next get the miner source and compile:

Code:
git clone https://github.com/wolf9466/cpuminer-multi.git
cd cpuminer-multi
./autogen.sh
./configure CFLAGS="-march=native"
make

Optimize the system for mining:
Code:
# Reserve huge pages: (logical CPUs - 1) threads, times 3 pages per thread
threads=$(nproc)
sub=1
mul=3
thr=$((threads - sub))
thruse=$((thr * mul))
sudo sysctl -w vm.nr_hugepages=$thruse
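
To confirm that the reservation above actually took effect (the kernel can hand out fewer huge pages than requested when memory is fragmented), a quick check along these lines might help. This is only a sketch: `hugepages_total` is a helper name made up for this example, and all it does is read the standard `HugePages_Total` field from /proc/meminfo (or from another meminfo-style file passed as an argument).

```shell
# Hypothetical helper: print the HugePages_Total value from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
hugepages_total() {
    awk '/^HugePages_Total:/ {print $2}' "${1:-/proc/meminfo}"
}

# Compare this number with the value passed to vm.nr_hugepages.
echo "Huge pages reserved: $(hugepages_total)"
```

If the number printed is lower than what you asked for, running the sysctl command again shortly after a reboot usually has a better chance of succeeding, since memory is less fragmented then.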

Start your miner with screen:

Code:
sudo screen ./minerd -a cryptonight -o stratum+tcp://mine.moneropool.com:3333 -u 485VWHzZMe8hGJCXDArFVf1KUqjbwUnKqCEfPTzcU8zt6geeCMaaXECEMYsxFwQdgKi9LVX3HNMb1FQoMLKpbFRE5rUGCNe -p x
full member
Activity: 183
Merit: 100
Can anyone link me to a guide suitable for a Linux beginner who doesn't know his github from his twatspoke on how to obtain and compile this miner, starting from a fresh ubuntu install?

I'd like to run some side by side comparisons.
member
Activity: 81
Merit: 1002
It was only the wind.
What .bat file should I write to merge mine on MinerGate?
Has anybody done that?

The Fantomcoin website has instructions, I think.
full member
Activity: 168
Merit: 100
CN is not CPU-only by a long shot. GPUs will be able to do better than CPUs; in fact, they already do. BBR, on the other hand, might actually be GPU-resistant. That totally fucked blockchain-in-mah-scratchpad thing forces a painfully slow copy from host memory to GPU memory very often. On top of that, I don't think you can always interrupt a running kernel, so you'd have to wait for the kernel to finish wasting its time hashing old data.

Nope, BBR is not GPU-resistant and is being mined with GPUs.

Yes, it's being mined with GPUs. But the GPU implementation is slower than a good CPU one.

Let's just say I'll respectfully disagree. :)
full member
Activity: 168
Merit: 100
CN is not CPU-only by a long shot. GPUs will be able to do better than CPUs; in fact, they already do. BBR, on the other hand, might actually be GPU-resistant. That totally fucked blockchain-in-mah-scratchpad thing forces a painfully slow copy from host memory to GPU memory very often. On top of that, I don't think you can always interrupt a running kernel, so you'd have to wait for the kernel to finish wasting its time hashing old data.

Nope, BBR is not GPU-resistant and is being mined with GPUs.
member
Activity: 81
Merit: 1002
It was only the wind.
Just a quick check: I am getting roughly 220 H/s with my QA82 ES Xeon processor with 12 GB of RAM. Would more RAM make any difference?

No, but try a different number of threads.

Can I run this miner to merge mine on Minergate?


Yes.

Another question: I'm running multiple VPSes.

On Ubuntu, I get around 100 H/s.
On the same hardware, but running a Windows image, I get 280. Why?

You're probably forgetting hugepages.
full member
Activity: 139
Merit: 100

Oh, I see. Odd... maybe the older CPUs did hyperthreading better?

Maybe. Newer CPUs definitely don't like the default -t option.

I'm also getting a 10-20% difference on an i7-4770 (210-230 when using -t 7 compared to 265-280 when using -t 4).

On one i5 laptop I get more hashes with -t 1 than with -t 3; on AES-NI CPUs, fewer threads means more performance.

That could well be correct. If your processor has only 3 MB of cache, then -t 1 will be faster (see my previous post for the explanation). You can check how much cache your processor has here, for example: http://ark.intel.com/products/family/75024/4th-Generation-Intel-Core-i5-Processors#@All

Makes a lot of sense. There's no reason why you couldn't run more threads, but it would run a lot slower since it has to traipse out to main memory for the scratchpad all the damned time... but I wonder if the hardware prefetcher is smart enough to get that it should probably stuff the whole scratchpad in cache?


I do not know exactly how it works, but this is one of the reasons why CryptoNight is described as CPU-only and ASIC-resistant. I am a noob; what I posted here is information I got from the BCN devs, as you can see from my question on GitHub (I thought this was a bug): https://github.com/amjuarez/bytecoin/issues/23
... and Wolfi, thanks for the miner. On my i5-2500K @ 4.3 GHz it is about 15% faster (v. 05-30-2014) than the Minergate client (v3.0) on the Minergate pool ;)
full member
Activity: 139
Merit: 100

Oh, I see. Odd... maybe the older CPUs did hyperthreading better?

Maybe. Newer CPUs definitely don't like the default -t option.

I'm also getting a 10-20% difference on an i7-4770 (210-230 when using -t 7 compared to 265-280 when using -t 4).

On one i5 laptop I get more hashes with -t 1 than with -t 3; on AES-NI CPUs, fewer threads means more performance.

That could well be correct. If your processor has only 3 MB of cache, then -t 1 will be faster (see my previous post for the explanation). You can check how much cache your processor has here, for example: http://ark.intel.com/products/family/75024/4th-Generation-Intel-Core-i5-Processors#@All
full member
Activity: 139
Merit: 100

Oh, I see. Odd... maybe the older CPUs did hyperthreading better?

Maybe. Newer CPUs definitely don't like the default -t option.

I'm also getting a 10-20% difference on an i7-4770 (210-230 when using -t 7 compared to 265-280 when using -t 4).

If you are using it for the CryptoNight hash algorithm, this could be the answer: CryptoNight requires 2 MB of cache for each mining thread. My i5-2500K has 4 cores and 6 MB of cache, so I can only use -t 3 properly; any additional thread beyond 3 will only slow things down.
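
The rule of thumb above (about 2 MB of cache per CryptoNight thread, and never more threads than cores) is easy to turn into a small calculation. What follows is only a sketch under those assumptions: `threads_for_cache` is a name invented for this example, and you pass it the cache size in KB and the core count yourself. On many Linux systems `getconf LEVEL3_CACHE_SIZE` reports the L3 size in bytes, but treat that as an assumption about your platform and check the result against your CPU's spec sheet.

```shell
# Hypothetical helper: suggest a -t value from the cache size (in KB) and the core count,
# assuming each CryptoNight thread wants about 2 MB (2048 KB) of cache.
threads_for_cache() {
    cache_kb=$1
    cores=$2
    t=$((cache_kb / 2048))
    # Never suggest more threads than there are cores.
    if [ "$t" -gt "$cores" ]; then
        t=$cores
    fi
    # Always run at least one thread, even with a small cache.
    if [ "$t" -lt 1 ]; then
        t=1
    fi
    echo "$t"
}

# The i5-2500K from the post: 6 MB of cache, 4 cores.
threads_for_cache 6144 4
```

For the i5-2500K this suggests -t 3, and for a 3 MB part it suggests -t 1, which matches what was reported in this thread. It is a starting point for testing, not a guarantee of the best hashrate.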
legendary
Activity: 1092
Merit: 1000

Oh, I see. Odd... maybe the older CPUs did hyperthreading better?

Maybe. Newer CPUs definitely don't like the default -t option.

I'm also getting a 10-20% difference on an i7-4770 (210-230 when using -t 7 compared to 265-280 when using -t 4).

legendary
Activity: 1092
Merit: 1000
I'm getting different results when running the miner on all threads vs only physical cores.

On newer CPUs I'm getting up to 50% more hashrate when running only on physical CPU cores. On older CPUs (also with AES) I get 50% less when using only physical cores.
Why is this?



That isn't obvious? Newer CPUs tend to be faster, and I don't mean in terms of clock speeds. They improve shit like out of order execution, speculative execution, branch prediction...

I think you misunderstood my post. Example below:

Dual E5-2620 (12 cores/24 threads)

-t 23 (default when run without the -t flag) does around 310 H/s
-t 12 (physical cores) does around 390 H/s

Dual E5620 (8 cores/16 threads)
-t 15 (default when run without the -t flag) does around 150 H/s
-t 8 (physical cores) does around 100 H/s

EDIT: The dual E5620 figures might not be correct; I ran those tests a couple of hours ago and no longer have access. The difference was 50%.
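
To repeat this kind of comparison on another machine, you need the number of physical cores as well as the logical CPU count that `nproc` reports. Here is a sketch, assuming `lscpu` from util-linux is installed; `count_physical_cores` is a name made up for this example, and it simply counts the unique core/socket pairs in the parseable output of `lscpu -p=CORE,SOCKET`.

```shell
# Hypothetical helper: count unique "core,socket" pairs read from stdin.
# Input is expected in `lscpu -p=CORE,SOCKET` format; lines starting with '#' are comments.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Usage on a live system (assumes lscpu is available):
#   lscpu -p=CORE,SOCKET | count_physical_cores
```

Running the miner once with -t set to that number and once with the default gives the two figures being compared in the post above.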
legendary
Activity: 1092
Merit: 1000
I'm getting different results when running the miner on all threads vs only physical cores.

On newer CPUs I'm getting up to 50% more hashrate when running only on physical CPU cores. On older CPUs (also with AES) I get 50% less when using only physical cores.
Why is this?

newbie
Activity: 38
Merit: 0
The download links are broken (ottrbutt.com and 46.105.182.112 didn't respond).
member
Activity: 81
Merit: 1002
It was only the wind.
Em... I'm a newb at that :(

No, I mean, I think there are Windows binaries on the github.
full member
Activity: 183
Merit: 100
I meant that if yours is quicker than his on AES-NI systems it must be down to that part of the code that uses AES-NI.

I'm running 2008 R2 Enterprise.
member
Activity: 81
Merit: 1002
It was only the wind.
Any links to Lucas' software for non-AES-NI CPUs on Windows, other than the GitHub?


Why? What's wrong with the github?
full member
Activity: 183
Merit: 100
Similar results here. I ran some tests with a dual E5520 server.

cpu-multi / LucasJones (these are the same thing, right?) v2.3.3 gets max 148 H/s and is pretty much always over 140.

Wolf's newly provided non-AES version gets 118 H/s max, with a range of 100 - 118.

Looks like any advantages of Wolf's miner come from the AES-NI code.
legendary
Activity: 3668
Merit: 6382
In case that this is of any interest:

I did some tests with W0lf's latest non-AES miner (yeah, unfortunately I cannot "install" AES for now).

On my i7-920, with -t 8, it shows an average of 60 H/s.
Cpuminer-multi (1.0.3) shows about 70 H/s.

The numbers on the pool vary widely because I quite often get connection errors, so I don't have a reliable way to compare.

Is there any chance that the miner hash reporting is wrong?
Is there anything I can do to avoid the connection problems?
Are there some details I am missing?

----------

I've read your explanation about 32bit vs 64bit on XMR thread... hats off to you, sir.
hero member
Activity: 1274
Merit: 556
Probably a compatibility issue.
Testing on crypto-pool.fr gives more consistent results. Using -t 2 and -t 4 gives similar results of between 40 and 50 H/s on both the miner and the website, which is also consistent with the figure I would get from the Minergate client, except that there I'd get 20% more from merged mining.