
Topic: [1500 TH] p2pool: Decentralized, DoS-resistant, Hop-Proof pool - page 718. (Read 2591919 times)

hero member
Activity: 896
Merit: 1000
I just noticed a very high amount of stales (50+%) since today 11AM GMT. My p2pool node doesn't show any sign of trouble (cpu, memory, disk and network are OK).
All the miners connected to it seem to have trouble, see : http://linode.bouton.name:9332/static/graphs.html

The node runs current git master. Is anyone seeing similar behavior?
hero member
Activity: 516
Merit: 643
p2pool randomly freezes up (freezing my Mac for about 10 seconds) every half an hour or so. Any idea what's causing this? Should I use a different version of Python?

Code:
2012-04-11 07:47:16.501037 > Watchdog timer went off at:
2012-04-11 07:47:16.501107 >   File "run_p2pool.py", line 5, in
2012-04-11 07:47:16.501141 >     main.run()
2012-04-11 07:47:16.501172 >   File "/Users/christian/p2pool/p2pool/main.py", line 1005, in run
2012-04-11 07:47:16.501203 >     reactor.run()
2012-04-11 07:47:16.501234 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/base.py", line 1169, in run
2012-04-11 07:47:16.501267 >     self.mainLoop()
2012-04-11 07:47:16.501297 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/base.py", line 1178, in mainLoop
2012-04-11 07:47:16.501331 >     self.runUntilCurrent()
2012-04-11 07:47:16.501361 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/base.py", line 800, in runUntilCurrent
2012-04-11 07:47:16.501394 >     call.func(*call.args, **call.kw)
2012-04-11 07:47:16.501424 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/defer.py", line 368, in callback
2012-04-11 07:47:16.501456 >     self._startRunCallbacks(result)
2012-04-11 07:47:16.501487 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/defer.py", line 464, in _startRunCallbacks
2012-04-11 07:47:16.501520 >     self._runCallbacks()
2012-04-11 07:47:16.501550 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/defer.py", line 551, in _runCallbacks
2012-04-11 07:47:16.501583 >     current.result = callback(current.result, *args, **kw)
2012-04-11 07:47:16.501614 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/defer.py", line 1101, in gotResult
2012-04-11 07:47:16.501647 >     _inlineCallbacks(r, g, deferred)
2012-04-11 07:47:16.501677 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/Twisted-12.0.0-py2.7-macosx-10.5-intel.egg/twisted/internet/defer.py", line 1045, in _inlineCallbacks
2012-04-11 07:47:16.501710 >     result = g.send(result)
2012-04-11 07:47:16.501740 >   File "/Users/christian/p2pool/p2pool/main.py", line 799, in status_thread
2012-04-11 07:47:16.501770 >     print this_str
2012-04-11 07:47:16.501799 >   File "/Users/christian/p2pool/p2pool/util/logging.py", line 81, in write
2012-04-11 07:47:16.501830 >     self.inner_file.write(data)
2012-04-11 07:47:16.501860 >   File "/Users/christian/p2pool/p2pool/util/logging.py", line 69, in write
2012-04-11 07:47:16.501891 >     self.inner_file.write('%s %s\n' % (datetime.datetime.now(), line))
2012-04-11 07:47:16.501921 >   File "/Users/christian/p2pool/p2pool/util/logging.py", line 55, in write
2012-04-11 07:47:16.501951 >     output.write(data)
2012-04-11 07:47:16.501981 >   File "/Users/christian/p2pool/p2pool/util/logging.py", line 46, in write
2012-04-11 07:47:16.502011 >     self.inner_file.write(data)
2012-04-11 07:47:16.502041 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 691, in write
2012-04-11 07:47:16.502073 >     return self.writer.write(data)
2012-04-11 07:47:16.502103 >   File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 352, in write
2012-04-11 07:47:16.502134 >     self.stream.write(data)
2012-04-11 07:47:16.558924 >   File "/Users/christian/p2pool/p2pool/main.py", line 702, in
2012-04-11 07:47:16.559463 >     sys.stderr.write, 'Watchdog timer went off at:\n' + ''.join(traceback.format_stack())

It looks like it's getting stuck for a while writing something to the log file. Is it possible that your computer went to sleep and woke up? Is P2Pool's folder on a slow disk (network share, etc)?
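For context, the watchdog in that traceback fires when p2pool's event loop stops making progress for too long, then prints where every thread is stuck. A minimal sketch of the same idea (hypothetical names, not p2pool's actual implementation):

```python
import sys
import threading
import time
import traceback

def start_watchdog(timeout=5.0):
    """Start a daemon thread that dumps all thread stacks to stderr
    if the main loop stops calling the returned heartbeat for
    `timeout` seconds."""
    last_beat = [time.time()]  # mutable cell shared with the watcher

    def beat():
        last_beat[0] = time.time()

    def watch():
        while True:
            time.sleep(timeout / 2.0)
            if time.time() - last_beat[0] > timeout:
                sys.stderr.write('Watchdog timer went off at:\n')
                for frame in sys._current_frames().values():
                    sys.stderr.write(''.join(traceback.format_stack(frame)))

    t = threading.Thread(target=watch)
    t.daemon = True
    t.start()
    return beat
```

The event loop calls `beat()` periodically; anything that blocks the loop for longer than `timeout` (a stalled log write to a slow disk, a machine waking from sleep) triggers the stack dump, which is exactly the trace shown above.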
legendary
Activity: 2126
Merit: 1001
Did you edit the bitcoin.conf file? More precisely, did you enter the username and password you use into that config?

Ente
legendary
Activity: 2912
Merit: 1060
This looks like a password problem, not a blockchain problem:
twisted.web.error.Error: 401 Authorization Required
sr. member
Activity: 445
Merit: 250
try bitcoind getinfo and see if it is really up
This. Looks like bitcoind is still downloading the blockchain, and refusing rpc connections until it is up to date.
legendary
Activity: 2912
Merit: 1060
try bitcoind getinfo and see if it is really up

try my free node until yours is set up to support p2pool
newbie
Activity: 13
Merit: 0
I have been mining with another pool for months and have decided to finally give p2pool a try. I am having problems getting it to work. I have the bitcoin client setup and in server mode with the conf file setup as indicated with a username and password I chose.

When I try to run the p2pool software I keep getting the following errors:

03:29:12.743000 > --- ---
2012-04-12 03:29:12.743000 >   File "p2pool\util\deferral.pyc", line 30, in f
2012-04-12 03:29:12.744000 >     
2012-04-12 03:29:12.744000 >   File "twisted\internet\defer.pyc", line 1018, in _inlineCallbacks
2012-04-12 03:29:12.744000 >     
2012-04-12 03:29:12.744000 >   File "twisted\python\failure.pyc", line 350, in throwExceptionIntoGenerator
2012-04-12 03:29:12.745000 >     
2012-04-12 03:29:12.745000 >   File "p2pool\main.pyc", line 30, in getwork
2012-04-12 03:29:12.745000 >     
2012-04-12 03:29:12.746000 >   File "twisted\internet\defer.pyc", line 1018, in _inlineCallbacks
2012-04-12 03:29:12.746000 >     
2012-04-12 03:29:12.746000 >   File "twisted\python\failure.pyc", line 350, in throwExceptionIntoGenerator
2012-04-12 03:29:12.746000 >     
2012-04-12 03:29:12.747000 >   File "p2pool\util\jsonrpc.pyc", line 54, in callRemote
2012-04-12 03:29:12.747000 >     
2012-04-12 03:29:12.747000 > twisted.web.error.Error: 401 Authorization Required

Can someone tell me what I am doing wrong? I am using Windows 7 64-bit if that helps.
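A 401 means bitcoind rejected the credentials p2pool sent, so the rpcuser/rpcpassword in bitcoin.conf don't match what p2pool was given (or bitcoin.conf isn't being read at all). bitcoind checks a standard HTTP Basic auth header built from those two values, so as a sanity check you can construct the header yourself (placeholder credentials below; substitute the values from your own bitcoin.conf):

```python
import base64

def basic_auth_header(rpcuser, rpcpassword):
    """Build the HTTP Basic 'Authorization' header value that
    bitcoind's JSON-RPC server compares against rpcuser/rpcpassword."""
    token = base64.b64encode(('%s:%s' % (rpcuser, rpcpassword)).encode('ascii'))
    return 'Basic ' + token.decode('ascii')

# Placeholder values -- use the ones from your bitcoin.conf:
print(basic_auth_header('myuser', 'mypassword'))
```

If the header computed from bitcoin.conf differs from the one p2pool is sending, you've found the mismatch. Also remember that bitcoind only re-reads bitcoin.conf on startup, so restart it after editing the file.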
hero member
Activity: 682
Merit: 500

It can take 24+ hours with the same IP to get any incoming connections.

alright i'll just let it do its thing and see what happens


you are already below pool levels

averaged over the last hour i have 10.4% dead vs the pool with 6.3% dead. not a major difference but seems the pool on average is doing a bit better?

The pool usually gives a range, last I checked it was like 3-12%. You should be fine.
hero member
Activity: 896
Merit: 1000

It can take 24+ hours with the same IP to get any incoming connections.

alright i'll just let it do its thing and see what happens


you are already below pool levels

averaged over the last hour i have 10.4% dead vs the pool with 6.3% dead. not a major difference but seems the pool on average is doing a bit better?
legendary
Activity: 2058
Merit: 1452
after over 12 hours still had no incoming connections to p2pool

have verified port forwarding works so not sure what the issue could be especially with bitcoin getting connections just fine



added two x6500s to previous two ztex boards today

averages over the last hour
local
1270mh with 133mh dead (10.4%)
p2pool overall average 323gh with 20.4gh dead (6.3%)


anyone have suggestions for bringing down local a bit to p2pool levels? or is that not going to happen with 6500/ztex devices? overall it's not bad, within a few percent, just curious
you are already below pool levels
hero member
Activity: 591
Merit: 500
after over 12 hours still had no incoming connections to p2pool

have verified port forwarding works so not sure what the issue could be especially with bitcoin getting connections just fine
It can take 24+ hours with the same IP to get any incoming connections.
hero member
Activity: 896
Merit: 1000
after over 12 hours still had no incoming connections to p2pool

have verified port forwarding works so not sure what the issue could be especially with bitcoin getting connections just fine



added two x6500s to previous two ztex boards today

averages over the last hour
local
1270mh with 133mh dead (10.4%)
p2pool overall average 323gh with 20.4gh dead (6.3%)


anyone have suggestions for bringing down local a bit to p2pool levels? or is that not going to happen with 6500/ztex devices? overall it's not bad, within a few percent, just curious
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
so it's better high intensity and single thread than low intensity and double thread on a 5970?

Intensity is proportional to the uninterruptible time spent mining (as I said above and D&T above also)
So increasing the intensity doubles the amount of time that the GPU is working without replying
(i.e. double the average time working after an LP occurs)

Reducing the intensity is what you want to do, not increase (as the README says - decrease it by 1)

The reason why it is only 1, and not many is that there is the overhead of reducing the intensity:
Each time the intensity goes down 1, the amount of work required to setup the nonce-ranges and send them to the GPU doubles.

The simplest test is to just reduce it by 1 and see, then again by 1 and see again

The same with threads - the calculation of the size of the nonce range is actually (1 << (15 + Intensity)) * threads
(though that is just the base calculation and there are some limits applied to it)
So doubling the threads also doubles the size of the nonce range - same as above for intensity.

If you apply both, you'd need to see how the combination affects your rig.

You may find that it works best at 1 thread and intensity 8; however, 2 threads and intensity 7 may work the same
(but it may be better or even worse) - you just need to try it and see
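kano's base calculation above can be written out directly. This sketch (ignoring the extra limits cgminer applies on top of the base formula) shows why adding 1 to intensity and doubling the thread count are interchangeable as far as batch size goes:

```python
def nonce_range_size(intensity, threads=1):
    """Base batch size per kano's formula: (1 << (15 + intensity)) * threads."""
    return (1 << (15 + intensity)) * threads

def batch_time_ms(intensity, threads, hashrate_mhs):
    """Wall-clock milliseconds to hash one batch at hashrate_mhs MH/s."""
    return nonce_range_size(intensity, threads) / (hashrate_mhs * 1e6) * 1e3

# Adding 1 to intensity and doubling the threads have the same effect:
assert nonce_range_size(8, 2) == nonce_range_size(9, 1)
```

Since a long poll can invalidate at most one in-flight batch, the time wasted per LP scales with this batch size, which is why the README's advice is to step intensity down by 1 and measure.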
donator
Activity: 1218
Merit: 1079
Gerald Davis
so it's better high intensity and single thread than low intensity and double thread on a 5970?



I have found Intensity 8, 1 thread to work well for 5970. <0.2% DOA and <5% orphan without too much of a hit in throughput.  I did try Intensity 8 and Intensity 7 w/ 2 threads and got worse results.  Intensity 9, 1 thread also worked but had a higher stale rate.  There is some variance in stales so I opted for the I:8, T:1 instead.  I:9, T:2 was downright horrible with stales, 10%+.
legendary
Activity: 1064
Merit: 1000
so it's better high intensity and single thread than low intensity and double thread on a 5970?

donator
Activity: 1218
Merit: 1079
Gerald Davis
good to know, so what advantage do you get using a single thread per GPU?

The thread completes faster so it has less stale work due to LP.

Intensity determines the # of hashes in a workload (batch).
hashes in workload = 2^(15+Intensity)

With 1 thread a 500 MH/s GPU simply runs at 500 MH/s.
With 2 threads the GPU is split into two threads each running at 250 MH/s.

For a given intensity a GPU will finish faster with less threads.
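Putting numbers to that: with the 2^(15+Intensity) workload formula above, a hypothetical 500 MH/s GPU at intensity 9 finishes a batch in about 34 ms with one thread, while each of two threads (250 MH/s apiece) needs about 67 ms, so single-thread work goes stale less often. A quick sketch:

```python
def workload_hashes(intensity):
    """Hashes per workload: 2^(15 + intensity)."""
    return 2 ** (15 + intensity)

def completion_time_ms(intensity, threads, gpu_mhs):
    """Time for one thread to finish its workload when the GPU's
    throughput is split evenly across `threads` threads."""
    per_thread_mhs = gpu_mhs / float(threads)
    return workload_hashes(intensity) / (per_thread_mhs * 1e6) * 1e3

print(completion_time_ms(9, 1, 500.0))  # one thread: ~33.6 ms per workload
print(completion_time_ms(9, 2, 500.0))  # two threads: ~67.1 ms each
```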

hero member
Activity: 896
Merit: 1000
kano,

good to know, so what advantage do you get using a single thread per GPU?

spiccioli.
What kano described matches my understanding, so I assume using a single thread instead of two makes the single thread complete twice as fast: statistically the GPU latency should be halved. In terms of latency gains, it should be nearly identical to decrementing intensity by one (which halves the space explored by the GPU on each work). Maybe with low intensities decreasing thread count lowers hashrate less than decreasing intensity further, and is thus preferred with p2pool?
legendary
Activity: 1379
Merit: 1003
nec sine labore
You can use two threads per GPU, though, so that when a long poll comes in, one thread can start fetching new data while the other is waiting for the GPU to finish.
Are you sure? The way I understand cgminer's threads, they should all try to keep working in parallel (for n threads each thread should be using 1/n of the processing power), and fetching work is done asynchronously so that it is ready as soon as a GPU thread is available. So with a given intensity, the more threads you have, the more time you should spend working on a workbase invalidated by a long poll. This is how I understood the advice ckolivas gives in the cgminer README to use only one thread.

gyverlb,

the one thread per GPU was a work-around for old versions of cgminer.

As I understand it, while a GPU is processing a batch, the thread that submitted it is blocked waiting for the answer, so, if you have a single thread it cannot fetch new work before the GPU completes its batch.

Using two threads makes it possible to have the second thread starting to fetch new work while the first one is still waiting for the GPU to finish its work.

I'm using two threads without problems (stales are around 1-2% lower than p2pool ones).

spiccioli.


Nope. It doesn't wait for work to finish before getting new work from the pool.
A separate thread deals with getting the work before it is needed so the GPU isn't idle for that long amount of time that would be spent sending out a work request and getting a reply.

kano,

good to know, so what advantage do you get using a single thread per GPU?

spiccioli.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
You can use two threads per GPU, though, so that when a long poll comes in, one thread can start fetching new data while the other is waiting for the GPU to finish.
Are you sure? The way I understand cgminer's threads, they should all try to keep working in parallel (for n threads each thread should be using 1/n of the processing power), and fetching work is done asynchronously so that it is ready as soon as a GPU thread is available. So with a given intensity, the more threads you have, the more time you should spend working on a workbase invalidated by a long poll. This is how I understood the advice ckolivas gives in the cgminer README to use only one thread.

gyverlb,

the one thread per GPU was a work-around for old versions of cgminer.

As I understand it, while a GPU is processing a batch, the thread that submitted it is blocked waiting for the answer, so, if you have a single thread it cannot fetch new work before the GPU completes its batch.

Using two threads makes it possible to have the second thread starting to fetch new work while the first one is still waiting for the GPU to finish its work.

I'm using two threads without problems (stales are around 1-2% lower than p2pool ones).

spiccioli.


Nope. It doesn't wait for work to finish before getting new work from the pool.
A separate thread deals with getting the work before it is needed so the GPU isn't idle for that long amount of time that would be spent sending out a work request and getting a reply.
legendary
Activity: 4592
Merit: 1851
Linux since 1997 RedHat 4
Except ... it isn't correct.

There IS wasted hashing power.

The problem is not P2Pool specifically, but rather that people believe it is OK to get a crappy high reject rate (9%) because someone here said that rate was OK while they themselves were getting a much lower one.

If you use a good miner program and configure it correctly you will not get a high crappy 9% reject rate.

The cause is actually that the miners are not by default configured to handle the ridiculously high share rate (10 seconds)
So P2Pool is the cause, but the solution is simply to configure your miner to handle that issue.

Aside: if you have one or more BFL FPGA Singles, you cannot mine on P2Pool without wasting a large % of your hash rate.

Except reject rate means nothing, delta of average reject rate is what you need to pay attention to.
Well, yes, but that is of course what I meant by saying that 9% is bad and you can get a lower %

Quote
Also, BFL's firmware is broken, they won't return shares until it's done 2^32 hashes, and any attempt to force it to update on long poll dumps valid shares. BFL needs to fix their shit before they sell any more FPGAs.
Yep but to put it more specifically, the time to do 2^32 hashes at 830MH/s is 5.17s
Thus each BFL device will complete, on average, 1 nonce range, and then abort the 2nd one for each average 10 second share.
Thus, on average, it would only mine 5.17s out of every 10s, or 51.7%, thus wasting 48.3% of its hashes ... yep it's that bad Tongue
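kano's arithmetic is easy to reproduce, assuming the 830 MH/s hashrate and 10-second average share interval quoted above:

```python
SHARE_INTERVAL_S = 10.0   # p2pool's average time between shares (long polls)
BFL_HASHRATE = 830e6      # hashes/second, per the figure quoted above
NONCE_RANGE = 2 ** 32     # BFL firmware finishes the full range before returning

time_per_range = NONCE_RANGE / BFL_HASHRATE          # ~5.17 s per completed range
useful_fraction = time_per_range / SHARE_INTERVAL_S  # ~51.7% of each share interval
wasted_fraction = 1.0 - useful_fraction              # ~48.3% of hashes thrown away

print('%.2fs per range, %.1f%% wasted' % (time_per_range, wasted_fraction * 100))
```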

Oddly enough, that is more similar to GPU mining than Icarus FPGA mining ...
GPU mining cannot be aborted for each nonce sub-range sent to the GPU, but of course as long as the processing time of the sub-range is small, then you aren't wasting much time waiting after an LP occurs

In this case each LP, you waste 1/2 of the expected processing time for a sub-nonce range (which is very small - but higher as you increase the intensity - each increase in intensity in cgminer increases it 2x)
On cgminer, an intensity of 9 usually means a nonce range of 2^24 or 2^25, which is of the order of 45 to 90 ms at ~370Mh/s (e.g. ~ ATI 6950) and of course different on other hardware

Thus with GPUs, reducing the intensity by one reduces the amount of time wasted each LP and since there are 60 times the number of LP's with P2Pool vs normal network LP's then of course that makes sense.

With the Icarus FPGA it aborts when it finds a share and returns it immediately.
This means that if you have to abort the work due to an LP, you know the hashes being thrown away contain no shares (there is a very tiny window afterwards where there could be shares - until the new work is sent)
So being able to abort and restart is very advantageous
Approx time is less than 0.014s on my hardware.
~0.014s is the overhead when processing a work (job start time after sending the work to the FPGA and the time to return the result if there is one)