
Topic: [ATTN: POOL OPERATORS] PoolServerJ - scalable java mining pool backend - page 8. (Read 31153 times)

sr. member
Activity: 266
Merit: 254
.maxConcurrentDownloadRequests
This setting is the number of miners that can make a get work request or your thread making get work request against the bitcoin daemon. I will assume you create threads here and they do the get work to fill a cache or a request.  20 is a safe maximum but if your system is large and on fiber optics You could do more.

This refers to psj -> bitcoind requests.  This side is completely asynchronous with miner -> psj requests.  You should increase this value if your network latency between psj and bitcoind is higher than normal.  This ensures there are fewer idle gaps at each end while the packets are traversing the network.

Quote
.maxConcurrentUpstreamSubmits
This is the same as above but for submits to the bitcoin daemon, which are not really cached but use a thread for each request.

Yes, basically the same as above but going in the other direction.  Unless you've set forceAllSubmitsUpstream=true this can be low, as you'd only expect to be sending submits when you win a block.  Thanks to Eleuthria's testing I'm fairly confident now that it's safe to set forceAllSubmitsUpstream=false

Quote
.maxWorkAgeToFlush
When a thread makes a get work request to store in a cache, this is how long in seconds that work is valid.  Should no worker grab the cached work, it will be flushed and a new get work request is made.  The bitcoin network updates the work every 60 seconds, or longer if there are no new transactions (people sending coins) to be placed in the work load, so this value should not be too high or the pool will have a lot of stales? (<- I'm guessing here, need confirmation ->)  Too small and the bitcoin daemon receives more requests than necessary, reducing performance?

Yes, this is the cache 'expiry' time.  However, when a new block is detected the entire cache is dumped, so it's not really an issue with stales.  It just ensures that work is relatively fresh so new transactions are included.  As long as the work is from the correct block you could in theory dish out work that is 10 mins old and it wouldn't break anything.
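
To make the two ways work leaves the cache concrete, here is a minimal sketch in Java. It is not PSJ's actual code: the class and method names (WorkCacheSketch, expire, onNewBlock) are made up for illustration, and it assumes the age limit is expressed in milliseconds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a work cache with age-based expiry and a full flush on block change. */
class WorkCacheSketch {
    /** A cached work unit: the block it builds on and when it was fetched. */
    static final class Work {
        final String prevBlockHash;
        final long fetchedAtMillis;

        Work(String prevBlockHash, long fetchedAtMillis) {
            this.prevBlockHash = prevBlockHash;
            this.fetchedAtMillis = fetchedAtMillis;
        }
    }

    private final Deque<Work> queue = new ArrayDeque<>();
    private final long maxWorkAgeMillis; // assumed unit: milliseconds
    private String currentBlock;

    WorkCacheSketch(long maxWorkAgeMillis, String currentBlock) {
        this.maxWorkAgeMillis = maxWorkAgeMillis;
        this.currentBlock = currentBlock;
    }

    /** Only accept work built on the current block. */
    void add(Work w) {
        if (w.prevBlockHash.equals(currentBlock)) {
            queue.addLast(w);
        }
    }

    /** Drop entries older than the age limit; the oldest entries sit at the head. */
    int expire(long nowMillis) {
        int dropped = 0;
        while (!queue.isEmpty()
                && nowMillis - queue.peekFirst().fetchedAtMillis > maxWorkAgeMillis) {
            queue.removeFirst();
            dropped++;
        }
        return dropped;
    }

    /** A new block invalidates everything, regardless of age. */
    void onNewBlock(String newBlock) {
        currentBlock = newBlock;
        queue.clear();
    }

    int size() {
        return queue.size();
    }
}
```

The point of the sketch is that stale protection comes from onNewBlock, while expire only keeps work fresh enough to include recent transactions.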

Quote
.minIntervalBetweenHttpRequests
A pool receives work requests from miners this settings is the number of milliseconds between ANY request can be made to poolserverj?  This means that poolserverj ignores or places in queue any request made under this time threshold?  If too many requests are made to the poolserverj they may timeout waiting for poolserverj to accept their request?  For example a value of 100 means only 10 requests can be made per second to poolserverj?
NOTE: If the above and bottom settings are what I think it is and you coded this right you are one BAD ASS DEVELOPER.

No, this is relevant to the psj -> bitcoind side.  It should be set to 0 unless you have an unpatched bitcoind.  It just spaces out the requests a little so as not to overwhelm the bitcoind.  This does improve things a little with an unpatched daemon but will only slow you down if you have the 4diff patch.

The client side throttling you're talking about is actually handled by the QoS filter.  And the number of requests it will service concurrently before the filtering kicks in is set by 'QoSMaxRequestsToServiceConcurrently=55'
If the server is under extreme load it will begin prioritizing requests.  Priority is determined from lowest to highest:

worker not found
worker found but password is bad
worker found and authenticated
worker found and has submitted at least 1 valid work
worker found and has submitted at least 10 valid works.

Low priority requests will probably never get serviced until the server load drops to a more bearable level.
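
The priority ladder above could be modelled roughly like this. It is only a sketch in Java: the enum and method names are hypothetical and this is not how the QoS filter is actually written, it just shows the ordering from lowest to highest.

```java
/** Hypothetical sketch of the request priority ladder, lowest to highest. */
class QosPrioritySketch {
    /** Declared in ascending priority, so ordinal() and compareTo() reflect the ladder. */
    enum Priority {
        WORKER_NOT_FOUND,
        BAD_PASSWORD,
        AUTHENTICATED,
        ONE_VALID_WORK,
        TEN_VALID_WORKS
    }

    /** Classify a request from what is known about the worker. */
    static Priority classify(boolean workerFound, boolean passwordOk, int validWorks) {
        if (!workerFound) {
            return Priority.WORKER_NOT_FOUND;
        }
        if (!passwordOk) {
            return Priority.BAD_PASSWORD;
        }
        if (validWorks >= 10) {
            return Priority.TEN_VALID_WORKS;
        }
        if (validWorks >= 1) {
            return Priority.ONE_VALID_WORK;
        }
        return Priority.AUTHENTICATED;
    }
}
```

Under load, a filter built on this would service requests with the highest Priority first and leave the lowest ones waiting.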

Quote
.minIntervalBetweenHttpRequestsWhenFrantic
When a new block is found on the network all previous work becomes invalid and workers need new work thus minIntervalBetweenHttpRequests limit should be lifted to allow a burst of requests.  This setting should be lower than minIntervalBetweenHttpRequests if not zero if your system can handle it.

Pretty much... This should be set to 0 in 99.9% of cases.
Quote
.maxCacheSize
This number is meaningless by itself; you need to figure out how much load your server can handle, then this value is a calculation of that number plus headroom for Frantic requests.  The data in this cache is removed by workers or timed out by the value of the maxWorkAgeToFlush setting; however, it's all meaningless without knowledge of your hardware and network limits.  SEE optimal settings example below

Yes, it's a key tuning parameter.  If it's too high you will have a lot of work wasted and burn CPU cycles unnecessarily.  Too low and you will not have burst capacity.  The right number is essentially dependent on your getwork load and how much burst capacity you want to have.
Quote

.cacheWaitTimeout
Your docs say...
### maximum time in milliseconds to wait for the cache to return work before giving up and returning an error result.
Huh?  The cache is a separate application or thread that poolserverj waits for when miners make a work request?  Thus this the length of time it waits to get that data?  I would like to know is this memcache or something else internal to java like .NET cached objects?


It basically looks like this: bitcoind <-> work fetcher -> [cache: work queue] -> work server <-> miner.

However the fetch and serve sides are async and barely interact.  The cache is basically just a queue of works with some metadata attached.

When a getwork request is received the server thread polls the queue.  If no work is available it sends a wake-up call to the fetcher controller in case it's sleeping (very unlikely), then goes to sleep for cacheWaitTimeout.  Whenever the fetcher puts new work in the queue it notifies the sleeping server threads.  They wake up and try to poll again.  They may get beaten by another server thread, in which case they go back to sleep again.  If the server thread can't get a work from the queue for cacheWaitTimeout milliseconds it gives up and returns a JSON-RPC error message.
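
A rough Java sketch of the serve side, for anyone who wants to picture it.  Note this swaps in a java.util.concurrent BlockingQueue with a timed poll instead of the explicit sleep/notify described above, and the class and method names are made up, so treat it as an illustration of the behaviour rather than PSJ's implementation.

```java
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: fetcher fills a queue, server threads wait on it with a timeout. */
class CacheWaitSketch {
    private final BlockingQueue<String> workQueue = new LinkedBlockingQueue<>();
    private final long cacheWaitTimeoutMillis;

    CacheWaitSketch(long cacheWaitTimeoutMillis) {
        this.cacheWaitTimeoutMillis = cacheWaitTimeoutMillis;
    }

    /** Fetcher side: adding work wakes up one waiting server thread. */
    void offerWork(String work) {
        workQueue.offer(work);
    }

    /**
     * Server side: wait up to cacheWaitTimeout for work.
     * An empty result means the caller should answer the miner with an error.
     */
    Optional<String> serveGetwork() {
        try {
            String work = workQueue.poll(cacheWaitTimeoutMillis, TimeUnit.MILLISECONDS);
            return Optional.ofNullable(work);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and treat it like a timeout.
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}
```

If several server threads are waiting, only one gets each piece of work and the rest keep waiting until their own timeout runs out, which matches the "beaten by another server thread" case.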

Quote

Optimal Settings Example
Let's say your system has bitcoind installed on it and can only handle 200 requests per second before falling down. [NOTE: this is an example, your results may vary, please call your government official who knows best about everything to regulate what you do.]  So to be safe you say 100 requests per second is your limit.  The first thing you would set is minIntervalBetweenHttpRequests, and it's easy to figure out: at 100 requests per second, with 1000 milliseconds in a second, the value should be 10.  The next setting is (you guessed it) minIntervalBetweenHttpRequestsWhenFrantic, and we know our max is 200, so you could put 5 but 6 would be safe.  The next settings are maxWorkAgeToFlush and maxCacheSize, which should satisfy (maxCacheSize / (maxWorkAgeToFlush / 1000)) = 200, your system's maximum requests.  Keep in mind that maxWorkAgeToFlush should not be too low or too high.

Firstly I would set minIntervalBetweenHttpRequests and minIntervalBetweenHttpRequestsWhenFrantic to 0.  If your bitcoind does not have the 4diff patch then apply it.

(maxCacheSize / (maxWorkAgeToFlush / 1000))  <- setting this equal to your request rate will ensure that on average you waste very little work; you will still waste some due to variance in the rate of requests.  However, you have no headroom for burst capacity.  I would recommend you at least double this number.  In a high load environment increase it quite a bit more.  I hope I'm not giving away trade secrets here but I think BTC Guild have theirs set to about 10 times this number.
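
As a worked example of that arithmetic, here is a small Java sketch.  It assumes maxWorkAgeToFlush is given in milliseconds (which is what the division by 1000 implies), and the helper names are made up for illustration.

```java
/** Hypothetical helper: size the cache from request rate and work age, then add headroom. */
class CacheSizingSketch {
    /**
     * Rearranges maxCacheSize / (maxWorkAgeToFlush / 1000) = requestsPerSecond
     * into maxCacheSize = requestsPerSecond * maxWorkAgeToFlush / 1000.
     */
    static long baselineCacheSize(double requestsPerSecond, long maxWorkAgeToFlushMillis) {
        return Math.round(requestsPerSecond * maxWorkAgeToFlushMillis / 1000.0);
    }

    /** Multiply the baseline by a burst factor, e.g. 2 for the suggested minimum. */
    static long withHeadroom(long baseline, double burstFactor) {
        return Math.round(baseline * burstFactor);
    }

    public static void main(String[] args) {
        // Example numbers only: 200 requests/sec and a 5000 ms work age.
        long baseline = baselineCacheSize(200, 5000);
        System.out.println("baseline = " + baseline);
        System.out.println("doubled  = " + withHeadroom(baseline, 2.0));
        System.out.println("x10      = " + withHeadroom(baseline, 10.0));
    }
}
```

With those example numbers the baseline is 1000 works, 2000 when doubled, and 10000 at ten times the baseline.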
sr. member
Activity: 266
Merit: 254
Still on 2.9 and it works WAY WAY better than pushpool, I could kiss you shadders

HOWEVER I have only got it to go with half my load before it falls down, because it sucks up 100% of the amazon micro CPU.  This is not good, as amazon only gives you a burst of 100% usage then throttles the VM down.

Pushpool will take 100% of my load but I get these errors...

Code:
2011-09-19 14:05:09: Listener for "coinserver3 test": 19/09/2011 14:05:09, Problems communicating with bitcoin RPC 0 2
2011-09-19 14:07:01: Listener for "coinserver3 test": 19/09/2011 14:07:01, Problems communicating with bitcoin RPC 0 2

Still trying to fine tune it so that I get the best performance; I need to upgrade and code in a control panel for my web app.  Man, I knew when I read your web site that my 10 BTC donation was worth every BTC cent.

If I can get poolserverj handling 25 GHash/s on one micro server (with bitcoind and namecoind running) I will kick you some more BTC or, one better, hire you on as a part time consultant, but we will see.

Davinci

The CPU burst profile of an EC2 micro instance really isn't suitable for PSJ.  PSJ will always have a baseline level of load due to keeping the cache full.  This is an advantage when load increases, but an EC2 micro is designed for scenarios where the load is near zero most of the time and bursts occasionally.  If you have a constant baseline load it won't allow the CPU to burst.  So I think, from memory, you end up with about 0.2 of an EC2 compute unit as your constant capacity.  In pushpool's case it basically doesn't do anything until it gets a request, so it's probably using the full 2 EC2 compute unit burst capacity.  For low load pools pushpool will probably run better on a micro than psj.  You would be better off running it on a small instance to get a constant 1 EC2 compute unit, then moving up to a large as your capacity needs increase.

Amazon have a very detailed article on this but I can't find it... This talks about the same sort of thing though : http://huanliu.wordpress.com/2010/09/10/amazon-ec2-micro-instances-deeper-dive/
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
I would like to clarify some settings.  Instead of assuming, I will ask even if I think I am right; can you just confirm or correct my assertions and move on.


.maxConcurrentDownloadRequests
This setting is the number of miners that can make a get work request or your thread making get work request against the bitcoin daemon. I will assume you create threads here and they do the get work to fill a cache or a request.  20 is a safe maximum but if your system is large and on fiber optics You could do more.

.maxConcurrentUpstreamSubmits
This is the same as above but for submits to the bitcoin daemon, which are not really cached but use a thread for each request.

.maxWorkAgeToFlush
When a thread makes a get work request to store in a cache, this is how long in seconds that work is valid.  Should no worker grab the cached work, it will be flushed and a new get work request is made.  The bitcoin network updates the work every 60 seconds, or longer if there are no new transactions (people sending coins) to be placed in the work load, so this value should not be too high or the pool will have a lot of stales? (<- I'm guessing here, need confirmation ->)  Too small and the bitcoin daemon receives more requests than necessary, reducing performance?


.minIntervalBetweenHttpRequests
A pool receives work requests from miners this settings is the number of milliseconds between ANY request can be made to poolserverj?  This means that poolserverj ignores or places in queue any request made under this time threshold?  If too many requests are made to the poolserverj they may timeout waiting for poolserverj to accept their request?  For example a value of 100 means only 10 requests can be made per second to poolserverj?
NOTE: If the above and bottom settings are what I think it is and you coded this right you are one BAD ASS DEVELOPER.

.minIntervalBetweenHttpRequestsWhenFrantic
When a new block is found on the network all previous work becomes invalid and workers need new work thus minIntervalBetweenHttpRequests limit should be lifted to allow a burst of requests.  This setting should be lower than minIntervalBetweenHttpRequests if not zero if your system can handle it.

 
.maxCacheSize
This number is meaningless by itself; you need to figure out how much load your server can handle, then this value is a calculation of that number plus headroom for Frantic requests.  The data in this cache is removed by workers or timed out by the value of the maxWorkAgeToFlush setting; however, it's all meaningless without knowledge of your hardware and network limits.  SEE optimal settings example below

.cacheWaitTimeout
Your docs say...
### maximum time in milliseconds to wait for the cache to return work before giving up and returning an error result.
Huh?  The cache is a separate application or thread that poolserverj waits for when miners make a work request?  Thus this the length of time it waits to get that data?  I would like to know is this memcache or something else internal to java like .NET cached objects?


Optimal Settings Example
Let's say your system has bitcoind installed on it and can only handle 200 requests per second before falling down. [NOTE: this is an example, your results may vary, please call your government official who knows best about everything to regulate what you do.]  So to be safe you say 100 requests per second is your limit.  The first thing you would set is minIntervalBetweenHttpRequests, and it's easy to figure out: at 100 requests per second, with 1000 milliseconds in a second, the value should be 10.  The next setting is (you guessed it) minIntervalBetweenHttpRequestsWhenFrantic, and we know our max is 200, so you could put 5 but 6 would be safe.  The next settings are maxWorkAgeToFlush and maxCacheSize, which should satisfy (maxCacheSize / (maxWorkAgeToFlush / 1000)) = 200, your system's maximum requests.  Keep in mind that maxWorkAgeToFlush should not be too low or too high.


Shadders if any of my assertions are incorrect please correct them.

Looked at my poolserverj worker connection and it has collapsed for some reason. so maybe I'm wrong about the settings.
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
Still on 2.9 and it works WAY WAY better than pushpool, I could kiss you shadders

HOWEVER I have only got it to go with half my load before it falls down, because it sucks up 100% of the amazon micro CPU.  This is not good, as amazon only gives you a burst of 100% usage then throttles the VM down.

Pushpool will take 100% of my load but I get these errors...

Code:
2011-09-19 14:05:09: Listener for "coinserver3 test": 19/09/2011 14:05:09, Problems communicating with bitcoin RPC 0 2
2011-09-19 14:07:01: Listener for "coinserver3 test": 19/09/2011 14:07:01, Problems communicating with bitcoin RPC 0 2

Still trying to fine tune it so that I get the best performance; I need to upgrade and code in a control panel for my web app.  Man, I knew when I read your web site that my 10 BTC donation was worth every BTC cent.

If I can get poolserverj handling 25 GHash/s on one micro server (with bitcoind and namecoind running) I will kick you some more BTC or, one better, hire you on as a part time consultant, but we will see.

Davinci
legendary
Activity: 1750
Merit: 1007
BTC Guild's testing has found 15 blocks with PoolServerJ so far, all of them were detected as proper difficulty by PSJ's internal test prior to sending to bitcoind.  No false positives, and no false negatives.
sr. member
Activity: 266
Merit: 254
Hi shads,

just downloaded the release. Is this the updated release?

poolserverj.jar file size is 246kb

The one I'm looking at is 245.6kb so yes.  The wrong one was about 220kb I think.
full member
Activity: 142
Merit: 100
Hi shads,

just downloaded the release. Is this the updated release?

poolserverj.jar file size is 246kb
sr. member
Activity: 266
Merit: 254
Apologies to anyone who's downloaded 0.3.0rc1 and tried to use it.  Due to a series of unamusing screw-ups the distribution contained an earlier version of poolserverj.jar.  A new distro is uploading right now and should be done in a couple of minutes.
sr. member
Activity: 266
Merit: 254
Hi Shadders

My testing of the merged-mine-proxy application created for nmc and btc merged mining was interesting.  I was able to get poolserverj running it was not hard but it sure is a CPU pig with just 1 miner, I think that's over the top for the default settings.  Anyhow that's not what I wanted to talk about, as I was able to merge mine on the test net with just merged-mine-proxy app and find block then when I switched poolserverj I found none!   Shocked

Then I tried pushpoold and same thing... NO blocks after 10 times the shares needed based on the difficulty.  Mind you I did the test over and over again and in some cases I did find a block, but not within the normal variance of the difficulty.  Only directly mining the merged-mine-proxy gave me blocks within normal variance.

So I changed pushpoold's server.json file to disable rewrite of the difficulty, since it seemed that merged-mine-proxy already did that.  BAM! Pushpool worked as it should and is now finding blocks within normal variance, and it posted the shares to the database.

Switching back to poolserverj and I tried disabling the rewrite but I am not sure if I got the right setting. I'm not on the computer with setting so I will edit this post with the setting I changed but it did not work.  Poolserverj did not find blocks. 

So is there a setting you know offhand that will stop poolserverj from rewriting the difficulty and post the shares to the DB?

Thanks

Davinci
BTW I will download the new version and try that one as well.

Hi Davinci,

To disable rewrite difficulty you will find the following line in the sample properties file:
#source.local.1.rewriteDifficulty=false

just remove the comment marker ('#')

also ensure you've got:
useEasiestDifficulty=false

This is a testing/debugging setting specifically for load testing with the stress test client.

I was under the impression that pool servers needed some modifications to work with merged mining.  I could be wrong though.

Regarding poolserverj being a CPU pig, the default settings in the sample config are probably more suited to a larger pool.  So yes, by default it will use a lot of CPU and memory... This is advantageous to a larger pool but a little wasteful for a smaller one.

To rectify this please look at the articles in the documentation section: http://poolserverj.org/documentation/

Specifically: Performance & Memory tuning and Troubleshooting Tips.

cheers

hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
Hi Shadders

My testing of the merged-mine-proxy application created for nmc and btc merged mining was interesting.  I was able to get poolserverj running it was not hard but it sure is a CPU pig with just 1 miner, I think that's over the top for the default settings.  Anyhow that's not what I wanted to talk about, as I was able to merge mine on the test net with just merged-mine-proxy app and find block then when I switched poolserverj I found none!   Shocked

Then I tried pushpoold and same thing... NO blocks after 10 times the shares needed based on the difficulty.  Mind you I did the test over and over again and in some cases I did find a block, but not within the normal variance of the difficulty.  Only directly mining the merged-mine-proxy gave me blocks within normal variance.

So I changed pushpoold's server.json file to disable rewrite of the difficulty, since it seemed that merged-mine-proxy already did that.  BAM! Pushpool worked as it should and is now finding blocks within normal variance, and it posted the shares to the database.

Switching back to poolserverj and I tried disabling the rewrite but I am not sure if I got the right setting. I'm not on the computer with setting so I will edit this post with the setting I changed but it did not work.  Poolserverj did not find blocks. 

So is there a setting you know offhand that will stop poolserverj from rewriting the difficulty and post the shares to the DB?

Thanks

Davinci
BTW I will download the new version and try that one as well.
sr. member
Activity: 266
Merit: 254
This is a major milestone release for PoolServerJ.  Many of the fixes and improvements were a result of an extensive stress testing process by BTC Guild while they were migrating the pool over.  To celebrate I've changed the license to fully open source.

Notable changes include:

    * Now licensed under GPL v3
    * complete rewrite of longpolling code which has improved longpolling performance markedly
    * alpha implementation of native longpolling listener with the aid of an intermediate daemon built by Caesium from rfcpool
    * numerous stability fixes.

Complete changelog (including the changes from the unreleased 0.2.10):

[0.3.0rc1]

- license change to GPL v3.0
- added a temporary logger to record all stages of submission where a real solution is found.
- fix from : convert submitted data to lowercase, some miners return in uppercase resulting in failed work lookup from map.
- new mgmt interface methods:
?method=setCacheSize&source=&value=
?method=setMaxConcurrentDl&source=&value=
?method=setMaxWorkAgeToFlush&source=&value=
?method=setAllCacheSize&value=
?method=setAllMaxConcurrentDl&value=
?method=setAllMaxWorkAgeToFlush&value=
?method=listWorkerCache

- force cache to be trimmed if shrinking cache size.  If left untrimmed this can result in a lag when all the work expires.  Normally as work is requested it will open up a slot for fresh work to be fetched.  However due to the cache being oversized this doesn't happen.  It's possible for the entire cache to end up getting purged so the server has to catch up filling the cache from the daemon while servicing requests.
- fix: WorkerProxy was case sensitive.  In cases where database case was different to user supplied case of worker name this would cause a cache miss and force a db query every time.  Thanks to for finding.
- complete rewrite of longpolling code
- added async LP dispatch.
- restructured repo to include dependencies as source projects instead of binaries
- fix: clean up longpoll shutdown.  Missed a threadpool executor which was preventing JVM exit.
- fix: longpoll connections are now all explicitly closed on shutdown.  Some miners were not registering the closed socket so don't attempt to reconnect to LP when the server restarts.
- fix: prevent work being served from a source until it's confirmed on the new block
- pause work fetching for source until it's confirmed on the new block
- add prev_block_hash checks to each incoming work as a new block indicator.  This reduces the amount of polling needed when not in native longpoll mode.
- fix: set autoReconnect=true on JDBC connections.  worker caching can leave long intervals between uses of the connection causing sql server to time it out.
- add fixed time worker cache eviction strategy.  resolves issue #6
- implementation of longpoll connection counting and enforcement of optional limits
- trace logging with target groups for granular tracing.
- clean up of sample properties file add new config options
- fix: nullpointer if share output file not specified.
- fix: nullpointer if request output file not specified.
- addresses issue #7.  When setting worker IP first check X-Forwarded-For header then falls back to remoteAddr.  This covers situations where the server is behind a load balancing proxy.
- fix: use username.intern() to gain a per user canonical sync lock object. Prevents an obscure bug where two near simultaneous initial connections from one worker can result in multiple db lookups where one hasn't been put into the cache before the other is looked up.
- add keep-alive header to json-rpc client requests.
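
For the X-Forwarded-For item above, the fallback could look something like this in Java.  This is only a sketch: the class and method names are made up, and taking the first entry of a comma-separated header is an assumption about how proxies fill it in, not something stated in the changelog.

```java
/** Hypothetical sketch: pick the worker IP from X-Forwarded-For, else fall back to remoteAddr. */
class WorkerIpSketch {
    /**
     * @param xForwardedFor value of the X-Forwarded-For header, or null if absent
     * @param remoteAddr    address of the directly connected peer (e.g. the load balancer)
     */
    static String resolveWorkerIp(String xForwardedFor, String remoteAddr) {
        if (xForwardedFor != null) {
            // Assumption: the header may list several hops; the first is the original client.
            int comma = xForwardedFor.indexOf(',');
            String first = (comma >= 0 ? xForwardedFor.substring(0, comma) : xForwardedFor).trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        // No usable header, so use the socket's remote address.
        return remoteAddr;
    }
}
```

Bear in mind the header is supplied by whoever connects, so it is only trustworthy when the server really does sit behind a proxy you control.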

improve worksource syncing on block change
- prevent entries entering cache during out of sync period
- synchronize the change of sync status process
- remove redundant sourcesOnCurrentBlockArray
- change NotifyLongpollClients thread to a Runnable task to avoid having to start up a new thread.
- acceptNotifyBlockChange now has a double check inside sync block to prevent double accepts.
- prestart LP executor threads
- fix mismatched sync objects for block change syncing
- workSource resync moved inside sync block

native longpolling
- add registration of native-enabled sources to native lp listener
- improved debug logging
- address handling using canonical host name
- enable verification request to report success or failure

[0.2.10] - unreleased

- fix: WorkSource request throttling was only activating for HTTP level failures.  TCP failures (e.g. connection refused) would not activate throttling resulting in thousands of requests/sec and high CPU usage.
- convert blocknum from string to int
- refactor common elements to bitcoin-poolserverj-core
- fix missing semi-colons in sample sql  scripts

 
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
Thanks for the fix for my pool, it helped a lot.  There is no need for me to switch from pushpool right now; I must focus on adding more power to my pool so that it can handle more hashes.

Once that's done I can upgrade to PoolServerJ.
member
Activity: 118
Merit: 10
BTCServ Operator
hey, nice work so far, keep it up

What I missed in my first test was that I was not able to connect my miners without supplying a password.  Would be awesome if that could be implemented too.  Maybe I did not find out how it works, but is it possible?

take a look at the guide to building plugins:
http://poolserverj.org/documentation/plugin-guide/

It just so happens the example I used in the tutorial was an AnyPasswordWorkerAuthenticator plugin.  That example plugin is included in the build so you just need to do step 2 to enable it.

great! thanks. also sorry for not noticing it
sr. member
Activity: 266
Merit: 254
hey, nice work so far, keep it up

What I missed in my first test was that I was not able to connect my miners without supplying a password.  Would be awesome if that could be implemented too.  Maybe I did not find out how it works, but is it possible?

take a look at the guide to building plugins:
http://poolserverj.org/documentation/plugin-guide/

It just so happens the example I used in the tutorial was an AnyPasswordWorkerAuthenticator plugin.  That example plugin is included in the build so you just need to do step 2 to enable it.
member
Activity: 118
Merit: 10
BTCServ Operator
hey, nice work so far, keep it up

What I missed in my first test was that I was not able to connect my miners without supplying a password.  Would be awesome if that could be implemented too.  Maybe I did not find out how it works, but is it possible?
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
What's happening to namecoins?


I'm sending you 10 btc at 1LezqRatQz7MeNoCVziYwcdwtqeEbvrdAq for your work even if you don't help.
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
What's happening to namecoins?


Someone is planning an attack.  My pool can't handle the hash rates.  I can give you an amazon server or servers with a pushpool compatible schema tonight (I'm at work) and you can set up PoolServerJ on it.  Just tell me what you need.

read about it here...
Namecoin is Prime for a 51% attack
http://dot-bit.org/forum/viewtopic.php?f=2&t=292

sr. member
Activity: 266
Merit: 254
What's happening to namecoins?
hero member
Activity: 780
Merit: 510
Bitcoin - helping to end bankster enslavement.
Want to be a hero with your software?

Help me set it up on Amazon servers and save Namecoin; PoolServerJ will be the hero.
legendary
Activity: 1750
Merit: 1007
3 blocks have been found by BTC Guild using PoolServerJ so far.  All 3 had the latest patch where it logs shares detected as matching current difficulty before submission and tracks them.

All 3 blocks were identified by PSJ as full difficulty shares before submission, and all 3 were accepted.  No false positives/false negatives so far.