
Topic: [ANN]: cpuminer-opt v3.8.8.1, open source optimized multi-algo CPU miner - page 96. (Read 444043 times)

legendary
Activity: 1470
Merit: 1114
Hmq1725 Algo isn't working. Says stack smashing detected.

It works for me. Can you provide more details?
sr. member
Activity: 430
Merit: 250
Hmq1725 Algo isn't working. Says stack smashing detected.
full member
Activity: 224
Merit: 100
CryptoLearner
v3.5.8 is out.

The main thing is that lyra2re is fixed on Windows. I also ported the cryptonight optimization from 3.5.7
to the non-AES version, but the results were disappointing. Also disappointing were the results of precalculating
the first function's midstate hash for xevan and veltor; the technique seems to have a bigger impact on high hash rate algos.

Gonna test it on my good old CPU, thanks  Grin

Edit: Yes, it's working, now on par with the minergate private miner, yay \o/, good job!
legendary
Activity: 1470
Merit: 1114
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

If that's the case maybe I don't need to change anything in cpuminer.

Edit: It looks like transparent large pages is the way to go, and should have been from the start.
It wasn't so complicated when disk drives increased their block size.

my impression was that transparent/automatic huge pages won't increase performance as much as using the hugepage calls directly in the program, but if this is wrong that would be great news, as implementing huge pages would then become easier by far (i suppose)

From my understanding of MMU and TLB from another architecture...
From the application perspective accessing huge pages is just dereferencing a pointer like any other data
access. The magic happens when the CPU executes the load instruction and translates that pointer to a physical
memory address. The only difference is the accesses are faster because the translation is cached and
the same cached translation can be used for the entire larger page before a new mapping is required for the next page.
At a higher level, fewer TLB entries can cover more memory.

Reserving a section of memory for large pages and using the file system to access it may provide benefits beyond
large pages, though I don't see how. I would think going through the file system would add extra overhead.


interesting

maybe only a real-world test will clear up the performance questions

Some of my old theory is coming back so while I'm on a roll...

Fragmentation should be less of an issue with pure HP because each page is guaranteed to be contiguous,
and, all pages being the same size, a fit can always be found. However, memory bloat will occur because every app's
VM size will be rounded up to the next page.

In a hybrid environment the huge page is just a container (pool) for many small pages. As long as small pages
are preallocated from the pool in groups they will not cause problems for the huge pages. Things only go bad
when a scattering of small pages results in difficulty finding a contiguous block big enough for a huge page.
As long as containers are used they all fit together nicely, just like on a ship.

Edit: Still rolling

Some of the large pages will be nearly empty if they belong to apps with low memory requirements. Being able
to allocate small pages for these apps means they can share a big page container, reducing gaps and increasing
memory efficiency.
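
To make the container idea concrete, here is a toy sketch (hypothetical code, not from cpuminer): one huge page acts as the pool and small allocations are bumped out of it in groups, so they never fragment the surrounding huge page memory. pool_t and pool_alloc() are made-up names for illustration.

Code:
#include <stddef.h>
#include <stdint.h>

/* One huge page acting as a container for many small allocations. */
typedef struct {
    uint8_t *base;   /* start of the huge page container */
    size_t   size;   /* container size, e.g. 2 MB */
    size_t   used;   /* bump pointer */
} pool_t;

static void *pool_alloc(pool_t *pool, size_t n)
{
    n = (n + 15) & ~(size_t)15;    /* round up for alignment */
    if (pool->used + n > pool->size)
        return NULL;               /* container is full */
    void *p = pool->base + pool->used;
    pool->used += n;
    return p;                      /* stays inside the huge page */
}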
hero member
Activity: 700
Merit: 500
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

If that's the case maybe I don't need to change anything in cpuminer.

Edit: It looks like transparent large pages is the way to go, and should have been from the start.
It wasn't so complicated when disk drives increased their block size.

my impression was that transparent/automatic huge pages won't increase performance as much as using the hugepage calls directly in the program, but if this is wrong that would be great news, as implementing huge pages would then become easier by far (i suppose)

From my understanding of MMU and TLB from another architecture...
From the application perspective accessing huge pages is just dereferencing a pointer like any other data
access. The magic happens when the CPU executes the load instruction and translates that pointer to a physical
memory address. The only difference is the accesses are faster because the translation is cached and
the same cached translation can be used for the entire larger page before a new mapping is required for the next page.
At a higher level, fewer TLB entries can cover more memory.

Reserving a section of memory for large pages and using the file system to access it may provide benefits beyond
large pages, though I don't see how. I would think going through the file system would add extra overhead.


interesting

maybe only a real-world test will clear up the performance questions
legendary
Activity: 1470
Merit: 1114
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

If that's the case maybe I don't need to change anything in cpuminer.

Edit: It looks like transparent large pages is the way to go, and should have been from the start.
It wasn't so complicated when disk drives increased their block size.

my impression was that transparent/automatic huge pages won't increase performance as much as using the hugepage calls directly in the program, but if this is wrong that would be great news, as implementing huge pages would then become easier by far (i suppose)

From my understanding of MMU and TLB from another architecture...
From the application perspective accessing huge pages is just dereferencing a pointer like any other data
access. The magic happens when the CPU executes the load instruction and translates that pointer to a physical
memory address. The only difference is the accesses are faster because the translation is cached and
the same cached translation can be used for the entire larger page before a new mapping is required for the next page.
At a higher level, fewer TLB entries can cover more memory.

Reserving a section of memory for large pages and using the file system to access it may provide benefits beyond
large pages, though I don't see how. I would think going through the file system would add extra overhead.
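
For reference, here is a minimal sketch of the "direct hugepage calls" route on Linux (an assumption on my part: 2 MB pages already reserved, e.g. via /proc/sys/vm/nr_hugepages). The file system route would instead open a file on a hugetlbfs mount and mmap that.

Code:
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define HUGE_SIZE (2UL * 1024 * 1024)   /* one 2 MB huge page */

int main(void)
{
    /* MAP_HUGETLB requests a huge-page-backed mapping directly;
     * no hugetlbfs file is needed for this anonymous variant. */
    void *p = mmap(NULL, HUGE_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");    /* fails if no huge pages are free */
        return 1;
    }

    /* From here on it's an ordinary pointer; only the TLB behaviour
     * described above differs. */
    memset(p, 0, HUGE_SIZE);

    munmap(p, HUGE_SIZE);
    return 0;
}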
hero member
Activity: 700
Merit: 500
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

If that's the case maybe I don't need to change anything in cpuminer.

Edit: It looks like transparent large pages is the way to go, and should have been from the start.
It wasn't so complicated when disk drives increased their block size.

my impression was that transparent/automatic huge pages won't increase performance as much as using the hugepage calls directly in the program, but if this is wrong that would be great news, as implementing huge pages would then become easier by far (i suppose)
legendary
Activity: 1470
Merit: 1114
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.

If that's the case maybe I don't need to change anything in cpuminer.

Edit: It looks like transparent large pages is the way to go, and should have been from the start.
It wasn't so complicated when disk drives increased their block size.
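
If transparent huge pages do pan out, the application side could be as small as this sketch (my assumption: Linux with THP enabled at least in madvise mode; alloc_thp() is a hypothetical helper, not cpuminer code).

Code:
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>

#define HUGE_ALIGN (2UL * 1024 * 1024)  /* align to the 2 MB page size */

void *alloc_thp(size_t size)
{
    void *p;
    /* An aligned, page-multiple region gives the kernel the chance
     * to back it with huge pages. */
    if (posix_memalign(&p, HUGE_ALIGN, size) != 0)
        return NULL;
    /* Advisory only: the kernel may silently fall back to 4 KB pages,
     * e.g. when memory is fragmented, so no error handling is needed. */
    madvise(p, size, MADV_HUGEPAGE);
    return p;
}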
sr. member
Activity: 561
Merit: 255
Going stealthy
Is this version with fees or fee-free? I hope the OP can be clearer on this. Thanks

It's open source, isn't that clear enough?

Never mind, I thought this was optiminer; as we know, optiminer comes with fees.
I didn't pay much attention to the title.
hero member
Activity: 700
Merit: 500
i don't know if this is the desired behaviour, but in the HOWTO the following is stated:

Quote
libhugetlbfs can be used to make an existing application use hugepages
for all its malloc() calls.  This works on an existing (dynamically
linked) application binary without modification.
legendary
Activity: 1470
Merit: 1114

you will probably need to write a wrapper function for the allocation part to handle linux/windows differences

I don't want to mess with the OS part. I will work with someone to integrate cpuminer into a mining distro
or, if not too difficult,  I can provide a standalone cpuminer package which will work with an OS preconfigured
for large pages. My last post suggested the latter was difficult.

KopiemTu is advertised as an Nvidia mining distro, but that doesn't mean it can't be more, unless it's
exclusively sponsored by Nvidia.

Either way I think a mining distro preconfigured for large pages with cpuminer included is the best approach.
It's just not something I can, or am willing to, do all myself.

I will continue, to see what exactly is involved in making a cpuminer package that supports large pages,
as long as it's transparent to desktop users and requires no OS-specific mods to the application code.


afaik the system-specific code (that being hugetlbfs and large pages for ms) has to be written anyway if you want it to be included in any distro; the distro just takes away the os setup (which is easy), or have i missed something?

i meant distros with cpumining option already available

The article I quoted says it's easier if pre-loaded and pre-linked.
Supporting a standalone large page enabled cpuminer application would require taking the difficult route.

I was talking in general terms about a distro. It could be an existing one or a new one, doesn't matter. A large
page enabled miner built into a large page enabled distro seems like the simplest route.

Edit: the fragmentation issue is also a concern. It may require rebooting when changing algos and modifying
algos not to use dynamic allocation/free while mining.

My bottom line concern is that this feature is intended, and optimized, for single-process servers and will not work
well in a desktop environment. I'm trying to identify all the pitfalls and develop an approach before starting
any real work.
hero member
Activity: 700
Merit: 500

you will probably need to write a wrapper function for the allocation part to handle linux/windows differences

I don't want to mess with the OS part. I will work with someone to integrate cpuminer into a mining distro
or, if not too difficult,  I can provide a standalone cpuminer package which will work with an OS preconfigured
for large pages. My last post suggested the latter was difficult.

KopiemTu is advertised as an Nvidia mining distro, but that doesn't mean it can't be more, unless it's
exclusively sponsored by Nvidia.

Either way I think a mining distro preconfigured for large pages with cpuminer included is the best approach.
It's just not something I can, or am willing to, do all myself.

I will continue, to see what exactly is involved in making a cpuminer package that supports large pages,
as long as it's transparent to desktop users and requires no OS-specific mods to the application code.


afaik the system-specific code (that being hugetlbfs and large pages for ms) has to be written anyway if you want it to be included in any distro; the distro just takes away the os setup (which is easy), or have i missed something?

i meant distros with cpumining option already available
legendary
Activity: 1470
Merit: 1114

you will probably need to write a wrapper function for the allocation part to handle linux/windows differences

I don't want to mess with the OS part. I will work with someone to integrate cpuminer into a mining distro
or, if not too difficult,  I can provide a standalone cpuminer package which will work with an OS preconfigured
for large pages. My last post suggested the latter was difficult.

KopiemTu is advertised as an Nvidia mining distro, but that doesn't mean it can't be more, unless it's
exclusively sponsored by Nvidia.

Either way I think a mining distro preconfigured for large pages with cpuminer included is the best approach.
It's just not something I can, or am willing to, do all myself.

I will continue, to see what exactly is involved in making a cpuminer package that supports large pages,
as long as it's transparent to desktop users and requires no OS-specific mods to the application code.
hero member
Activity: 700
Merit: 500
Large pages look like a great feature for a specialist mining distribution. Both the OS and the apps
could be preconfigured for large pages. User friendly and plug and play.

Thoughts

enabling large pages in a linux pre-built image is easy, should be doable, but i'm not aware of any cpumining distros/images, anybody?
obviously one can always ssh into the system and install cpuminer himself, but then he can also enable large pages himself, and it's not user-friendly/plug&play


Here is the best known (to me) mining distro.

https://bitcointalksearch.org/topic/m.5764866

I also found this article, which describes how to enable an app to use large pages. It includes the following...

https://lwn.net/Articles/375096/
Quote
While applications can be modified to use any of the interfaces, it imposes a significant burden on the application developer.
 To make life easier, libhugetlbfs can back a number of memory region types automatically when it is either pre-linked or pre-loaded.
 This process is described in the HOWTO documentation and manual pages that come with libhugetlbfs.

Didn't read the HOWTO yet, but what I infer from that is that the OS and application are tightly coupled, which would make it
more challenging to build as a standalone application.


that's a gpu distro (nvidia), right? no oob cpuminer support afaik

i have also found the following statement from microsoft about large pages:

Large-page memory regions may be difficult to obtain after the system has been running for a long time because the physical space for each large page must be contiguous, but the memory may have become fragmented. Allocating large pages under these conditions can significantly affect system performance. Therefore, applications should avoid making repeated large-page allocations and instead allocate all large pages one time, at startup.

The memory is always read/write and nonpageable (always resident in physical memory).

The memory is part of the process private bytes but not part of the working set, because the working set by definition contains only pageable memory.

Large-page allocations are not subject to job limits.


that might also be the case for linux, where it's often advised to set the hugepage size on boot (in grub)

you will probably need to write a wrapper function for the allocation part to handle linux/windows differences
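
something like this sketch maybe (hypothetical large_page_alloc(), assuming VirtualAlloc with MEM_LARGE_PAGES on windows and mmap with MAP_HUGETLB on linux), and per the microsoft advice above it should be called once at startup:

Code:
#define _GNU_SOURCE
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
void *large_page_alloc(size_t size)
{
    /* Needs the "Lock pages in memory" (SeLockMemoryPrivilege)
     * privilege; size should be a multiple of GetLargePageMinimum(). */
    return VirtualAlloc(NULL, size,
                        MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                        PAGE_READWRITE);
}
#else
#include <sys/mman.h>
void *large_page_alloc(size_t size)
{
    /* size should be a multiple of the huge page size (2 MB). */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
#endif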
legendary
Activity: 1470
Merit: 1114
Welcome to the 3.5.8 version :-) !

You bombed my announcement. Grin
legendary
Activity: 1470
Merit: 1114
v3.5.8 is out.

The main thing is that lyra2re is fixed on Windows. I also ported the cryptonight optimization from 3.5.7
to the non-AES version, but the results were disappointing. Also disappointing were the results of precalculating
the first function's midstate hash for xevan and veltor; the technique seems to have a bigger impact on high hash rate algos.
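
For anyone curious, the midstate trick looks roughly like this generic SHA-256 sketch (illustration only, assuming OpenSSL's SHA256_* functions; the actual first functions of xevan and veltor are different): hash the constant first 64-byte block of the header once, then reuse the saved context for every nonce.

Code:
#include <stdint.h>
#include <string.h>
#include <openssl/sha.h>

/* Hash an 80-byte block header for a range of nonces, computing the
 * midstate of the constant first 64 bytes only once. */
void scan_nonces(const unsigned char header[80], uint32_t max_nonce,
                 unsigned char hash[32])
{
    SHA256_CTX midstate, ctx;
    unsigned char tail[16];

    SHA256_Init(&midstate);
    SHA256_Update(&midstate, header, 64);   /* one-time midstate */

    memcpy(tail, header + 64, 16);          /* only these bytes vary */

    for (uint32_t nonce = 0; nonce < max_nonce; nonce++) {
        memcpy(tail + 12, &nonce, 4);       /* nonce is the last 4 bytes */
        ctx = midstate;                     /* restore, don't rehash */
        SHA256_Update(&ctx, tail, 16);
        SHA256_Final(hash, &ctx);
        /* ...compare hash against the target here... */
    }
}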
legendary
Activity: 1260
Merit: 1046
Welcome to the 3.5.8 version :-) !
legendary
Activity: 1470
Merit: 1114
Large pages look like a great feature for a specialist mining distribution. Both the OS and the apps
could be preconfigured for large pages. User friendly and plug and play.

Thoughts

enabling large pages in a linux pre-built image is easy, should be doable, but i'm not aware of any cpumining distros/images, anybody?
obviously one can always ssh into the system and install cpuminer himself, but then he can also enable large pages himself, and it's not user-friendly/plug&play


Here is the best known (to me) mining distro.

https://bitcointalksearch.org/topic/m.5764866

I also found this article, which describes how to enable an app to use large pages. It includes the following...

https://lwn.net/Articles/375096/
Quote
While applications can be modified to use any of the interfaces, it imposes a significant burden on the application developer.
 To make life easier, libhugetlbfs can back a number of memory region types automatically when it is either pre-linked or pre-loaded.
 This process is described in the HOWTO documentation and manual pages that come with libhugetlbfs.

Didn't read the HOWTO yet, but what I infer from that is that the OS and application are tightly coupled, which would make it
more challenging to build as a standalone application.
hero member
Activity: 700
Merit: 500
Large pages look like a great feature for a specialist mining distribution. Both the OS and the apps
could be preconfigured for large pages. User friendly and plug and play.

Thoughts

enabling large pages in a linux pre-built image is easy, should be doable, but i'm not aware of any cpumining distros/images, anybody?
obviously one can always ssh into the system and install cpuminer himself, but then he can also enable large pages himself, and it's not user-friendly/plug&play

legendary
Activity: 1470
Merit: 1114
Large pages look like a great feature for a specialist mining distribution. Both the OS and the apps
could be preconfigured for large pages. User friendly and plug and play.

Thoughts