
Topic: OFFICIAL CGMINER mining software thread for linux/win/osx/mips/arm/r-pi 4.11.0 - page 497. (Read 5805546 times)

legendary
Activity: 1386
Merit: 1097
You still don't fully understand it. Actually, the pool sends much less data than with getwork. You should really read the "real world example" on the Stratum documentation page. If you're a miner developer, you'll surely find the howto familiar.

The new protocol really doesn't change transaction confirmation times. Nothing changes in this area. The protocol just optimizes the overhead at every level and gives the miner a great opportunity to iterate not only over ntime and nonce, but also over the extranonce - at a cost of only one double-SHA256 hash per new coinbase. Isn't that great? :-)


Ah OK, so you are dramatically increasing the amount of data sent - whenever there is a need to send it - the diagonal line of the merkle tree?

As I mentioned - wouldn't the number of getworks still be significant?
You want miners to change their work whenever a txn change is appropriate (new transactions or high-fee transactions).
Otherwise the result of this will be to slow down transaction confirm times - transactions will be more likely to miss getting into the current block?

So it's either increase transaction times, or dramatically increase the amount of data sent in fewer getworks?
legendary
Activity: 1750
Merit: 1007
Ah OK, so you are dramatically increasing the amount of data sent - whenever there is a need to send it - the diagonal line of the merkle tree?

As I mentioned - wouldn't the number of getworks still be significant?
You want miners to change their work whenever a txn change is appropriate (new transactions or high-fee transactions).
Otherwise the result of this will be to slow down transaction confirm times - transactions will be more likely to miss getting into the current block?

So it's either increase transaction times, or dramatically increase the amount of data sent in fewer getworks?

By default, my pool sends new work once every 60 seconds.  That is not significantly different from how often many miners get new work (especially when nTime rolling is involved).

The payload of new work is ~1 KB.  It's sent over an always-open asynchronous TCP connection - no HTTP overhead, and no opening of new connections either.  This is roughly equivalent to a single getwork today.  That 1 KB is enough to include ~1,000 transactions (which only requires 10 merkle branches to be sent), and provides enough work for 18 Exahash/s.  Add an extra 65 bytes to the work push to double that to 1,025-2,047 transactions.  Another 65 for 2,048-4,096.  As you can see, this means that even at an astronomical network size where a block would contain hundreds of thousands of transactions, the payload for the mining protocol will never exceed 2 KB.

Work submissions take only a few bytes.  The new protocol drastically reduces bandwidth, to the point that a 56k modem provides enough bandwidth to run a mining farm more than 1 billion times the size of the current network.

Additionally, the 18 Exahash figure uses a local ExtraNonce adjustment size of 4 bytes.  That can be increased to 8 bytes quite easily (the protocol defines the ExtraNonce size as variable, and provided by the pool server).  If the ExtraNonce size were increased to 8 bytes, you could run 4.2 billion times more work per push, and the only increase in bandwidth is that it takes 8 more bytes to submit a share to the server.
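The arithmetic in that post can be sanity-checked with a quick sketch (the function names are mine; the sizes come from the post itself):

```python
import math

def merkle_branch_count(n_txs: int) -> int:
    """How many merkle branches the pool must send so the miner can
    rebuild the root after editing the coinbase (one per power of 2)."""
    return math.ceil(math.log2(n_txs)) if n_txs > 1 else 0

def hashes_per_push(extranonce_bytes: int) -> int:
    """Total work in one push: each extranonce value opens a fresh
    2^32 nonce space."""
    return (2 ** (8 * extranonce_bytes)) * (2 ** 32)

print(merkle_branch_count(1000))     # 10 branches cover ~1,000 txs
print(merkle_branch_count(200000))   # 18 - hundreds of thousands of txs
print(hashes_per_push(4))            # 2^64 ~= 1.8e19, i.e. ~18 EH of work
print(hashes_per_push(8) // hashes_per_push(4))   # 4294967296x more
```

So the branch list grows logarithmically with block size, while the extranonce grows the work space exponentially with each added byte.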
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
Since the miner is now doing all the work of the pool other than counting shares, if I were to implement such a protocol I would include a miner option to specify a fee in the coinbase to be paid directly to the miner - and probably also a donation option to the miner devs too.

This moves the miner into being a partial bitcoind that must fully process transactions, block changes, orphan blocks and orphan transactions, generate merkle trees, and decide which transactions are worth including in blocks - along with the risks involved in that.

I don't think you understand the protocol entirely.  The pool is still doing ALL of the bitcoind work.  It is creating the merkle branch list (picking the transactions and building the block essentially).  All it's doing is allowing the miner to adjust a piece of the coinbase transaction, which creates a new merkleroot, thus allowing it to continue hashing without interacting with the pool server once it runs out of nonce space or ntime rolling.
Ah OK, so you are dramatically increasing the amount of data sent - whenever there is a need to send it - the diagonal line of the merkle tree?

As I mentioned - wouldn't the number of getworks still be significant?
You want miners to change their work whenever a txn change is appropriate (new transactions or high-fee transactions).
Otherwise the result of this will be to slow down transaction confirm times - transactions will be more likely to miss getting into the current block?

So it's either increase transaction times, or dramatically increase the amount of data sent in fewer getworks?
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
I've already discussed this with ckolivas in the past.

cgminer C code being executed on a CPU is ... quite ... fast.

The delay is simply the verification of the share and the determination of the next getwork data (which is not a network getwork, cgminer already has the next piece of work necessary in most cases, and in the other cases it is not avoidable - getwork send and receive is already asynchronous)

The performance gain would be small and possibly not even noticeable at the scale of data reported by cgminer ... though on a slow old crappy CPU, that may or may not be the case.
To put that into context: if you have spent tens of thousands of dollars on hashing hardware, get a normal CPU to run cgminer :P

My timing code already shows you information about that delay in the API stats command ...

Not an issue of speed, but of approach.
Delays are messy. Seen the number of loops and sleeps in cgminer? Blurgh.
I don't see how those delays are removed by removing code that is running and instead having it wait ... :)

If on the other hand you are referring specifically to the BFL code - then discuss that with him when he returns.
legendary
Activity: 1795
Merit: 1208
This is not OK.
I've already discussed this with ckolivas in the past.

cgminer C code being executed on a CPU is ... quite ... fast.

The delay is simply the verification of the share and the determination of the next getwork data (which is not a network getwork, cgminer already has the next piece of work necessary in most cases, and in the other cases it is not avoidable - getwork send and receive is already asynchronous)

The performance gain would be small and possibly not even noticeable at the scale of data reported by cgminer ... though on a slow old crappy CPU, that may or may not be the case.
To put that into context: if you have spent tens of thousands of dollars on hashing hardware, get a normal CPU to run cgminer :P

My timing code already shows you information about that delay in the API stats command ...

Not an issue of speed, but of approach.
Delays are messy. Seen the number of loops and sleeps in cgminer? Blurgh.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
Just to put that into context:
a 1TH/s device can of course hash 1,000,000,000,000 hashes a second.
However, since the nonce size is 2^32, that means it will need to interact with the work source 232.8 times a second (without using roll-n-time)

Does no one see that ANY change is a VERY short-term solution while it ignores fixing the size of the internal nonce?

That's why we're suggesting a new protocol which adds extremely light overhead.  The miner only needs to perform 1 extra hash per power of 2 transactions being included in the block.  There is no interacting with the work source if the miner supports it natively.  It can prepare the next piece of work in advance on the CPU (on average this would be between 7 and 10 hashes, and it will likely never go beyond 12-14, i.e. 2^12 - 2^14 transactions in one block).
... yes the work source will need to do this work load.

Be the work source remote, local, or even included in the code.

My point is that the solution seems to be short term ... its resource requirements do not seem to be many orders of magnitude smaller than the resources available.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
And while you're at it, change the mining API to event-driven :)

What do you mean exactly?

At the moment, hashing works in a loop... start work -> wait for results -> get results - > go back to start. The mining thread issues work, then collects it.

Whereas each mining device could be asynchronous. The device gets work from the queue when it's idle, and submits it when it's done. The device calls the shots... calls the routine to get work, calls the routine to submit work.
I've already discussed this with ckolivas in the past.

cgminer C code being executed on a CPU is ... quite ... fast.

The delay is simply the verification of the share and the determination of the next getwork data (which is not a network getwork, cgminer already has the next piece of work necessary in most cases, and in the other cases it is not avoidable - getwork send and receive is already asynchronous)

The performance gain would be small and possibly not even noticeable at the scale of data reported by cgminer ... though on a slow old crappy CPU, that may or may not be the case.
To put that into context: if you have spent tens of thousands of dollars on hashing hardware, get a normal CPU to run cgminer :P

My timing code already shows you information about that delay in the API stats command ...
legendary
Activity: 1750
Merit: 1007
Since the miner is now doing all the work of the pool other than counting shares, if I were to implement such a protocol I would include a miner option to specify a fee in the coinbase to be paid directly to the miner - and probably also a donation option to the miner devs too.

This moves the miner into being a partial bitcoind that must fully process transactions, block changes, orphan blocks and orphan transactions, generate merkle trees, and decide which transactions are worth including in blocks - along with the risks involved in that.

I don't think you understand the protocol entirely.  The pool is still doing ALL of the bitcoind work.  It is creating the merkle branch list (picking the transactions and building the block essentially).  All it's doing is allowing the miner to adjust a piece of the coinbase transaction, which creates a new merkleroot, thus allowing it to continue hashing without interacting with the pool server once it runs out of nonce space or ntime rolling.


Just to put that into context:
a 1TH/s device can of course hash 1,000,000,000,000 hashes a second.
However, since the nonce size is 2^32, that means it will need to interact with the work source 232.8 times a second (without using roll-n-time)

Does no one see that ANY change is a VERY short-term solution while it ignores fixing the size of the internal nonce?

That's why we're suggesting a new protocol which adds extremely light overhead.  The miner only needs to perform 1 extra hash per power of 2 transactions being included in the block.  There is no interacting with the work source if the miner supports it natively.  It can prepare the next piece of work in advance on the CPU (on average this would be between 7 and 10 hashes, and it will likely never go beyond 12-14, i.e. 2^12 - 2^14 transactions in one block).
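The local rebuild described above can be sketched roughly like this - placeholder coinbase and branch data for illustration only (real Stratum hex-encodes these fields):

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def rebuild_merkle_root(coinbase: bytes, branches: list) -> bytes:
    """Fold the pool-supplied branch hashes over the (modified)
    coinbase: one hash for the new coinbase, plus one per branch."""
    h = dsha256(coinbase)
    for b in branches:
        h = dsha256(h + b)
    return h

# Placeholder data: a fake coinbase with the extranonce embedded, and
# ten fake 32-byte branches (enough for ~1,000 transactions).
branches = [bytes([i]) * 32 for i in range(10)]
root_0 = rebuild_merkle_root(b"coinbase|extranonce=0", branches)
root_1 = rebuild_merkle_root(b"coinbase|extranonce=1", branches)

# Bumping the extranonce yields a brand-new merkle root with only
# 11 double-SHA256 calls and no round trip to the pool.
assert root_0 != root_1 and len(root_0) == 32
```

The hash count (1 for the coinbase plus one per branch) is exactly the "7 to 10 hashes on average" figure quoted above.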
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
Hello ckolivas,

any chance to add native support for Stratum mining protocol?

http://mining.bitcoin.cz/stratum-mining

There are already two pools supporting it (mine and BtcGuild) and I expect that others will join us soon. I also have support from the Python miner developers (poclbm and Guiminer), so we will add native support to their code. Unfortunately I'm not a C++ programmer, so I can only offer some consultation on the protocol itself; I cannot provide you any code.

The major improvement in all this is that the miner can produce unique coinbases locally, so creating block headers is done locally, without asking the server. The network layer is also improved significantly, so you need only up to 10 kB/minute of bandwidth, even for 18 ExaHash/s (10**18) rigs.

Basically you could just bundle the mining proxy (pure Python) which I provide together with cgminer and run it on demand, but that is quite an ugly solution and native support would be much better.

Let me know what you think about it!
Until ckolivas returns and gives his opinion, which may be completely different to mine ...
As an aside ... I will repeat comments I've made (here and elsewhere) about this generic idea, and add to them:

Since the miner is now doing all the work of the pool other than counting shares, if I were to implement such a protocol I would include a miner option to specify a fee in the coinbase to be paid directly to the miner - and probably also a donation option to the miner devs too.

This moves the miner into being a partial bitcoind that must fully process transactions, block changes, orphan blocks and orphan transactions, generate merkle trees, and decide which transactions are worth including in blocks - along with the risks involved in that.

I'm certainly not saying yes or no about implementing getmemorypool (which is where your idea actually comes from); however, noting the implications of it, I've wondered whether anyone has considered them in detail (I've also had a quick click on the web link, and the very first item has an issue in my opinion - the pool deciding when the miner should be given work ...)

I will also make comment on performance:
There are a bunch of simple improvements to the getwork protocol that could easily extend the life of the current protocol.
Simplest one would be reference information so that a share returned only requires 3 things: reference, nonce, time
and the getwork would only need to send: reference, version, prevblk, merkle, time, diff
There is a LOT of blank space in the getwork data - so comparing sizes is rather meaningless unless the blanks were removed ...
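That compact-getwork idea might look something like this on the wire - all field widths here are my own assumptions for illustration, not any real protocol:

```python
import struct

# Hypothetical compact framing for the suggestion above.
# Work push: reference id, version, prevblk, merkle root, ntime, nbits.
WORK_FMT = "<II32s32sII"     # 80 bytes total
# Share submission: reference id, nonce, ntime.
SHARE_FMT = "<III"           # 12 bytes total

work = struct.pack(WORK_FMT, 1, 2, b"\x00" * 32, b"\xff" * 32,
                   1346000000, 0x1a05db8b)
share = struct.pack(SHARE_FMT, 1, 0xdeadbeef, 1346000000)

# Compare with today's getwork: a 128-byte padded data field,
# hex-encoded to 256 characters, wrapped in JSON over HTTP.
print(len(work), len(share))   # 80 12
```

Even this naive framing is smaller than the getwork data field alone, which is the point about the blank space.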
LP sux due to its design (was Tycho drunk the night he designed it?) - and there are other options already available.

The roll-n-time hack only exists coz the bitcoin devs are wimps and can't do a hard fork (since they can't even smoothly do a soft fork) and fix the MSDOS restriction of the nonce size.
A nonce size increase would mean that the timing of when the miner would talk to the pool (or the local "Stratum" program) would only be at a frequency decided necessary to include new transactions or maybe also via notification due to a large fee transaction.
The simple problem is that with any such protocol using the current nonce size, the miner MUST talk to the work source (be it part of the same program, a local "Stratum" program, or a pool) every ~4 billion hashes - which is not a big number when the miner could EASILY have multiple TeraHash/second - i.e. plug in a few of BFL's so-called soon-to-come 1TH/s devices.
I seriously don't understand why anyone coming up with these similar 'Stratum' ideas doesn't spot the obvious problem there, that is directly caused by the nonce size.
Moving to using an extra-nonce in the coinbase transaction is again only needed because the block header nonce is too small - yet there is PLENTY of space to increase its size in the sha256 empty space ... to 128 bits without any issue.
The miner would never run out of work, before it needs to consider a work change, with a 128-bit nonce (well, maybe not for a few years at least :) )
Using the 'extra-nonce' idea also means that the mining software is spending a LOT of unnecessary effort talking to the work source (and the work source is doing way more work than necessary) that a larger nonce size would remove.

Just to put that into context:
a 1TH/s device can of course hash 1,000,000,000,000 hashes a second.
However, since the nonce size is 2^32, that means it will need to interact with the work source 232.8 times a second (without using roll-n-time)

Does no one see that ANY change is a VERY short-term solution while it ignores fixing the size of the internal nonce?
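The 1 TH/s arithmetic above is easy to check:

```python
NONCE_BITS = 32

def work_requests_per_second(hashrate: float, nonce_bits: int = NONCE_BITS) -> float:
    """How often a device exhausts its nonce space and must go back to
    the work source (ignoring roll-n-time)."""
    return hashrate / (2 ** nonce_bits)

# A 1 TH/s device against the current 32-bit nonce:
print(round(work_requests_per_second(1e12), 1))    # 232.8 times a second

# With the 128-bit nonce suggested above, the same device would take
# roughly 1e19 years to exhaust a single piece of work.
years = (2 ** 128) / 1e12 / (365.25 * 24 * 3600)
print(years > 1e18)                                # True
```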
legendary
Activity: 1386
Merit: 1097
At the moment, hashing works in a loop... start work -> wait for results -> get results - > go back to start. The mining thread issues work, then collects it.

Great. I expected it to be related to the protocol proposal. I was surprised because it actually *is* event-driven :-)

Quote from: eleuthria
That would be 75,600,000,000,000,000 TH/s (pretty sure I calculated that right).  Also known as future proof Smiley.

:-)
hero member
Activity: 896
Merit: 1000
At the moment, hashing works in a loop... start work -> wait for results -> get results - > go back to start. The mining thread issues work, then collects it.

Whereas each mining device could be asynchronous. The device gets work from the queue when it's idle, and submits it when it's done. The device calls the shots... calls the routine to get work, calls the routine to submit work.
@ckolivas could better answer this, but I think a thread per mining device is not such a bad thing: one benefit is that when a GPU crashes, if the driver goes nuts it often doesn't bring all the cards down - only the one GPU.
With an event-based system, a single process will access all devices: a blocked syscall would freeze the entire process.
Usually I prefer event-based systems for handling asynchronous tasks, but here threads might well be the better option.
legendary
Activity: 1795
Merit: 1208
This is not OK.
And while you're at it, change the mining API to event-driven Smiley

What do you mean exactly?

At the moment, hashing works in a loop... start work -> wait for results -> get results - > go back to start. The mining thread issues work, then collects it.

Whereas each mining device could be asynchronous. The device gets work from the queue when it's idle, and submits it when it's done. The device calls the shots... calls the routine to get work, calls the routine to submit work.
legendary
Activity: 1750
Merit: 1007
Hello ckolivas,

any chance to add native support for Stratum mining protocol?

http://mining.bitcoin.cz/stratum-mining

There are already two pools supporting it (mine and BtcGuild) and I expect that others will join us soon. I also have support from the Python miner developers (poclbm and Guiminer), so we will add native support to their code. Unfortunately I'm not a C++ programmer, so I can only offer some consultation on the protocol itself; I cannot provide you any code.

The major improvement in all this is that the miner can produce unique coinbases locally, so creating block headers is done locally, without asking the server. The network layer is also improved significantly, so you need only up to 10 kB/minute of bandwidth, even for 18 ExaHash/s (10**18) rigs.

Basically you could just bundle the mining proxy (pure Python) which I provide together with cgminer and run it on demand, but that is quite an ugly solution and native support would be much better.

Let me know what you think about it!

Just to build on slush's request:  I previously sent you a PM about a new protocol months ago, and another one just the other day because my protocol proposal was posted.  I've withdrawn that proposal and have backed slush's because they accomplish the same task in the same way, just different markup.

Also, the 18 ExaHash/s number slush is posting isn't a limit of the protocol.  It can easily go beyond that with a simple change in the initial handshake message.  18 EH/s not enough?  How about 4.2 billion 18 EH/s farms with only an extra 8 bytes per share submission?  That would be 75,600,000,000,000,000 TH/s (pretty sure I calculated that right).  Also known as future proof :).
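For what it's worth, that back-of-envelope figure checks out (using integer arithmetic to keep it exact):

```python
# 4.2 billion farms (an 8-byte ExtraNonce gives ~2^32 of them),
# each at 18 EH/s, expressed in TH/s.
EH = 10 ** 18   # hashes/s in one exahash/s
TH = 10 ** 12   # hashes/s in one terahash/s

total_th = 4_200_000_000 * 18 * EH // TH
print(f"{total_th:,} TH/s")   # 75,600,000,000,000,000 TH/s
```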
legendary
Activity: 1386
Merit: 1097
And while you're at it, change the mining API to event-driven :)

What do you mean exactly?
legendary
Activity: 1795
Merit: 1208
This is not OK.
Hello ckolivas,

any chance to add native support for Stratum mining protocol?

http://mining.bitcoin.cz/stratum-mining

There are already two pools supporting it (mine and BtcGuild) and I expect that others will join us soon. I also have support from the Python miner developers (poclbm and Guiminer), so we will add native support to their code. Unfortunately I'm not a C++ programmer, so I can only offer some consultation on the protocol itself; I cannot provide you any code.

The major improvement in all this is that the miner can produce unique coinbases locally, so creating block headers is done locally, without asking the server. The network layer is also improved significantly, so you need only up to 10 kB/minute of bandwidth, even for 18 ExaHash/s (10**18) rigs.

Basically you could just bundle the mining proxy (pure Python) which I provide together with cgminer and run it on demand, but that is quite an ugly solution and native support would be much better.

Let me know what you think about it!

And while you're at it, change the mining API to event-driven :)
legendary
Activity: 1386
Merit: 1097
Hello ckolivas,

any chance to add native support for Stratum mining protocol?

http://mining.bitcoin.cz/stratum-mining

There are already two pools supporting it (mine and BtcGuild) and I expect that others will join us soon. I also have support from the Python miner developers (poclbm and Guiminer), so we will add native support to their code. Unfortunately I'm not a C++ programmer, so I can only offer some consultation on the protocol itself; I cannot provide you any code.

The major improvement in all this is that the miner can produce unique coinbases locally, so creating block headers is done locally, without asking the server. The network layer is also improved significantly, so you need only up to 10 kB/minute of bandwidth, even for 18 ExaHash/s (10**18) rigs.

Basically you could just bundle the mining proxy (pure Python) which I provide together with cgminer and run it on demand, but that is quite an ugly solution and native support would be much better.

Let me know what you think about it!
sr. member
Activity: 369
Merit: 250
how often does it check if the main pool is back up?

Is there an option?
legendary
Activity: 1484
Merit: 1005
A bug exists in the scrypt mining algorithm in which thread concurrencies above 8192 (required to get better performance on 79xx cards) fail, preventing the GPU from using most of its memory.  This results in a 25% performance hit for 79xx cards mining the scrypt algorithm.
legendary
Activity: 2688
Merit: 1240
I tried an older version of libusb but I'm having the same trouble...

also there is no libudev for mac os x (at least in the macports repositories..):

sudo port search libudev
No match for libudev found

I'll try to google it this evening, maybe there is something special with the mac port of libusb..

Everything works fine, btw, if I do not use "--enable-ztex"... ztex seems to use libusb exclusively.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
hey kano,

I'm talking about Mac OS X :)
... and I'm showing you the version numbers on linux.

OSX is VERY similar to linux.
If you ssh into an OSX box it's almost the same.
(and a lot of the software running the non-gui side of OSX comes from linux)
OSX is just unix with a window manager to make it look like an old mac ...

The libusb version number may be the issue.
As you can see, on linux it's 0.1.12 ... not 1.x

Also, you are missing libudev-dev