Topic: Phoenix - Efficient, fast, modular miner - page 14. (Read 760895 times)

full member
Activity: 219
Merit: 120
August 13, 2011, 04:44:58 AM

Did another git pull this morning, and the stale shares issue is gone now, back to well below 1% for the last few hours. Good job.

The miner hasn't yet locked up again so I can't tell if whatever problem caused it is also solved. I'll add a -v as soon as I'm back at my machine tonight and then we will see.

About failover... I'm pointing the miner at my local mining proxy (cdhowie) which lives on the same machine. What do you think, should I use the same address as backup url, so even if not actually switching servers it will still reset the protocol if it's idle? Of course that wouldn't help with the proxy itself going down, but in my experience the proxy is very well behaved once you've figured out how to configure the underlying Apache for more concurrent threads and longer timeouts.

EDIT: One more thing, the phatk2 kernel produces a warning about variable t1 being defined but never used in the kernel (line 153 I think) on start-up, but then works fine as it seems. Is that normal and expected behaviour? Or am I missing out on an optimization that way?

The t1 variable in phatk2 is a dummy to make the compiler behave a certain way. I'm not quite sure why defining a dummy variable results in better performance, but if you need more information about the OpenCL level tweaks I recommend asking in the phatk thread:

https://bitcointalksearch.org/topic/modified-kernel-for-phoenix-15-7964

From kernel.cl:
Code:
u W[124];
u Vals[8];

//Dummy Variable to prevent compiler from reordering between rounds
u t1;

//Vals[0]=state0;
Vals[1]=B1;
Vals[2]=C1;
Vals[3]=D1;
member
Activity: 78
Merit: 10
August 13, 2011, 03:37:18 AM
Could you run Phoenix with the -v option and post the last few log entries before the miner stopped working? The "idle timeout" simply ensures that ask() always returns after no more than 15 seconds. The idle issue on RPC was caused by a combination of the memory leaks and ask() never running its callbacks. By fixing the memory leaks and adding a timeout to ask() it should prevent the RPC protocol from hanging.

The increase in stale shares might be due to your internet connection not working normally. It also might be due to the keep-alive issue I just fixed in 1.6.1. Try 1.6.1 out once your internet connection is working normally again and see if you still have a high stale count.

Also, I am considering forcing a failover if the miner goes idle for more than 1 minute. This will make getting stuck idling impossible (assuming the backup server is up), since the entire protocol object is destroyed and re-created when switching servers.

Did another git pull this morning, and the stale shares issue is gone now, back to well below 1% for the last few hours. Good job.

The miner hasn't yet locked up again so I can't tell if whatever problem caused it is also solved. I'll add a -v as soon as I'm back at my machine tonight and then we will see.

About failover... I'm pointing the miner at my local mining proxy (cdhowie) which lives on the same machine. What do you think, should I use the same address as backup url, so even if not actually switching servers it will still reset the protocol if it's idle? Of course that wouldn't help with the proxy itself going down, but in my experience the proxy is very well behaved once you've figured out how to configure the underlying Apache for more concurrent threads and longer timeouts.

EDIT: One more thing, the phatk2 kernel produces a warning about variable t1 being defined but never used in the kernel (line 153 I think) on start-up, but then works fine as it seems. Is that normal and expected behaviour? Or am I missing out on an optimization that way?
sr. member
Activity: 378
Merit: 250
August 12, 2011, 05:53:52 PM
There appears to be a problem running the phatk2 kernel on an ATI HD5450 under Windows.  It crashes the program once it starts the kernel.  I can run the kernel on my CPU (only as a test) and it works without crashing.  Any thoughts on this?  Phatk kernel works okay on it, but I would like the extra speed boost.

Have you tried specifying a WORKSIZE? The problem with phatk2 is that the kernel needs to know the worksize before it can be compiled. This prevents automatically setting the worksize to the maximum supported because that can only be detected once the kernel has been compiled. I added some code to phatk2 which checks if the worksize is supported by the device after compiling the kernel. If the worksize specified is too large then it will show an error. If setting a smaller worksize fixes the problem for you, then I will include this change in the next version.

See if specifying WORKSIZE=64 allows the kernel to run.
We have ignition!  Thank you so much!  I'll try to toss a donation your way...well, once I get enough to donate.  Worksize of 128 was max.
full member
Activity: 219
Merit: 120
August 12, 2011, 05:41:19 PM
There appears to be a problem running the phatk2 kernel on an ATI HD5450 under Windows.  It crashes the program once it starts the kernel.  I can run the kernel on my CPU (only as a test) and it works without crashing.  Any thoughts on this?  Phatk kernel works okay on it, but I would like the extra speed boost.

Have you tried specifying a WORKSIZE? The problem with phatk2 is that the kernel needs to know the worksize before it can be compiled. This prevents automatically setting the worksize to the maximum supported because that can only be detected once the kernel has been compiled. I added some code to phatk2 which checks if the worksize is supported by the device after compiling the kernel. If the worksize specified is too large then it will show an error. If setting a smaller worksize fixes the problem for you, then I will include this change in the next version.

See if specifying WORKSIZE=64 allows the kernel to run.
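
The check described above could look roughly like the sketch below. This is not the code that went into phatk2: `check_worksize` and its parameter names are hypothetical, and the kernel's limit is passed in as a plain number so the example runs without a GPU. With PyOpenCL, that limit would normally be read after the build with `kernel.get_work_group_info(cl.kernel_work_group_info.WORK_GROUP_SIZE, device)`.

```python
def check_worksize(requested, kernel_max):
    """Return the requested WORKSIZE, or raise if the device cannot run it.

    kernel_max is assumed to be the work-group size limit reported for the
    compiled kernel on the target device (hypothetical parameter name).
    """
    if requested < 1:
        raise ValueError("WORKSIZE must be a positive integer")
    if requested > kernel_max:
        raise ValueError(
            "WORKSIZE=%d is too large for this device (maximum %d)"
            % (requested, kernel_max)
        )
    return requested
```

With a limit of 128, as reported for the HD5450 in this thread, `check_worksize(64, 128)` passes and `check_worksize(256, 128)` raises an error instead of crashing later when the kernel is launched.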
sr. member
Activity: 378
Merit: 250
August 12, 2011, 05:24:03 PM
There appears to be a problem running the phatk2 kernel on an ATI HD5450 under Windows.  It crashes the program once it starts the kernel.  I can run the kernel on my CPU (only as a test) and it works without crashing.  Any thoughts on this?  Phatk kernel works okay on it, but I would like the extra speed boost.
full member
Activity: 219
Merit: 120
August 12, 2011, 04:46:16 PM
Version 1.6.1 released.

Changes:
1. BFI_INT now enabled by default for phatk/phatk2
2. Added automatic failover to backup server if specified with -b
3. Additional memory leak fix for client3420.py
4. Fixed persistent connections

Download:
Windows binaries
Source code/Linux release (requires Python, Twisted, and PyOpenCL)


Hello jedi95,

first I'd like to thank you for going from SVN to Git and for today's fixes and addition.

With the current (as of the time of this posting) RPC code in Git I don't get the constant disconnects and reconnects as I had with the last SVN version; but I still get far more rejected shares, up from 1 or 2% to about 10%, compared to the code from SVN r101.

Also with the current RPC code the miner at some point just stopped mining, possibly because my Internet connection is behaving a bit like a moody child today. However this time it didn't reconnect when the line came back a short time later. Maybe your new idle timeout at work here? Or is this a misunderstanding on my part, I have to admit I haven't yet checked the code.

Keep up the good work!

Could you run Phoenix with the -v option and post the last few log entries before the miner stopped working? The "idle timeout" simply ensures that ask() always returns after no more than 15 seconds. The idle issue on RPC was caused by a combination of the memory leaks and ask() never running its callbacks. By fixing the memory leaks and adding a timeout to ask() it should prevent the RPC protocol from hanging.

The increase in stale shares might be due to your internet connection not working normally. It also might be due to the keep-alive issue I just fixed in 1.6.1. Try 1.6.1 out once your internet connection is working normally again and see if you still have a high stale count.

Also, I am considering forcing a failover if the miner goes idle for more than 1 minute. This will make getting stuck idling impossible (assuming the backup server is up), since the entire protocol object is destroyed and re-created when switching servers.
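
To illustrate the forced-failover idea, here is a minimal sketch in plain Python. It is not Phoenix's actual code: `FailoverWatchdog`, `make_protocol`, `work_received` and `check` are made-up names, and the real miner is built on Twisted rather than a hand-rolled clock. The point is only that switching servers throws away the old protocol object and builds a fresh one, which also happens when the backup URL is the same as the primary.

```python
import time


class FailoverWatchdog:
    """Hypothetical helper: re-creates the protocol object after an idle period."""

    def __init__(self, urls, make_protocol, idle_limit=60.0, clock=time.monotonic):
        self.urls = list(urls)              # primary first, then backups
        self.make_protocol = make_protocol  # factory: url -> new protocol object
        self.idle_limit = idle_limit        # seconds of idling before failover
        self.clock = clock                  # injectable so the sketch is testable
        self.index = 0
        self.protocol = make_protocol(self.urls[0])
        self.last_work = clock()

    def work_received(self):
        """Call whenever the server hands out work; resets the idle timer."""
        self.last_work = self.clock()

    def check(self):
        """Call periodically; returns True if a failover was forced."""
        if self.clock() - self.last_work < self.idle_limit:
            return False
        # Drop the old protocol object entirely and build a new one for the
        # next URL in the list (wrapping around to the primary).
        self.index = (self.index + 1) % len(self.urls)
        self.protocol = self.make_protocol(self.urls[self.index])
        self.last_work = self.clock()
        return True
```

In a Twisted program, `check()` would typically be driven by a periodic timer; that wiring is left out here.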
member
Activity: 78
Merit: 10
August 12, 2011, 02:58:57 PM
Hello jedi95,

first I'd like to thank you for going from SVN to Git and for today's fixes and addition.

With the current (as of the time of this posting) RPC code in Git I don't get the constant disconnects and reconnects as I had with the last SVN version; but I still get far more rejected shares, up from 1 or 2% to about 10%, compared to the code from SVN r101.

Also with the current RPC code the miner at some point just stopped mining, possibly because my Internet connection is behaving a bit like a moody child today. However this time it didn't reconnect when the line came back a short time later. Maybe your new idle timeout at work here? Or is this a misunderstanding on my part, I have to admit I haven't yet checked the code.

Keep up the good work!
full member
Activity: 219
Merit: 120
August 12, 2011, 02:09:46 PM

Thank you very very much, I am gonna send you a bitcoin ;)
But I am seeing a high rate of rejects now, not sure what is going on, but usually they were ~1%.

[12/08/2011 20:22:07] [317.82 Mhash/sec] [654 Accepted] [72 Rejected] [RPC (+LP)]
[12/08/2011 20:22:07] [318.03 Mhash/sec] [674 Accepted] [53 Rejected] [RPC (+LP)]


The 1.6.0 version crashes my Ubuntu on one of my 5850s. When I use the kernel on the other card, everything works fine. I now mine with phatk2 on one card and phatk on the other. This combination works.
I think it has something to do with the video output. I have the screen plugged in on the card that I mine with phatk2. The other one, the one that crashes, doesn't have a screen or dummy plug.

Interesting, I too have had some problems for a few days on my 2x 5850 rig: one always locks up, sometimes after 2 minutes,
sometimes after 12 hours (maybe since I changed to phatk 2.x? That was also a few days ago... hmm).
I was already thinking I had hardware problems. I've switched the 5850s to phatk 1 now; let's see if this is stable again.


I currently have 8 5870s running on phatk 2.2. (same version included in Phoenix 1.6.0)

I have not seen any crashes or lockups since switching, so I don't think the kernel itself is broken. It might be that it stresses the GPU more than the older phatk kernel, which could explain the stability issues.

Either way, I didn't write phatk 2.2 so you may want to consider posting in the phatk thread for help.
sr. member
Activity: 313
Merit: 250
August 12, 2011, 01:36:50 PM
[...]
3. Fixed RPC memory leaks and various other bugs
[...]

Thank you very very much, I am gonna send you a bitcoin ;)
But I am seeing a high rate of rejects now, not sure what is going on, but usually they were ~1%.

[12/08/2011 20:22:07] [317.82 Mhash/sec] [654 Accepted] [72 Rejected] [RPC (+LP)]
[12/08/2011 20:22:07] [318.03 Mhash/sec] [674 Accepted] [53 Rejected] [RPC (+LP)]
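
For anyone who wants to turn status lines like these into a percentage, here is a small sketch. The helper name `reject_rate` is made up for this example and is not part of Phoenix; it assumes the line contains the `[N Accepted] [M Rejected]` counters exactly as printed above.

```python
import re

# Matches the two counters in a Phoenix status line, e.g.
# "[654 Accepted] [72 Rejected]".
STATUS_RE = re.compile(r"\[(\d+) Accepted\] \[(\d+) Rejected\]")


def reject_rate(status_line):
    """Return rejected / (accepted + rejected) for one status line."""
    match = STATUS_RE.search(status_line)
    if match is None:
        raise ValueError("no Accepted/Rejected counters found")
    accepted, rejected = (int(group) for group in match.groups())
    total = accepted + rejected
    return rejected / total if total else 0.0
```

Applied to the two lines above, this gives roughly 9.9% and 7.3%, against the usual ~1%.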

The 1.6.0 version crashes my Ubuntu on one of my 5850s. When I use the kernel on the other card, everything works fine. I now mine with phatk2 on one card and phatk on the other. This combination works.
I think it has something to do with the video output. I have the screen plugged in on the card that I mine with phatk2. The other one, the one that crashes, doesn't have a screen or dummy plug.

Interesting, I too have had some problems for a few days on my 2x 5850 rig: one always locks up, sometimes after 2 minutes,
sometimes after 12 hours (maybe since I changed to phatk 2.x? That was also a few days ago... hmm).
I was already thinking I had hardware problems. I've switched the 5850s to phatk 1 now; let's see if this is stable again.




newbie
Activity: 40
Merit: 0
August 12, 2011, 08:08:18 AM
The 1.6.0 version crashes my Ubuntu on one of my 5850s. When I use the kernel on the other card, everything works fine. I now mine with phatk2 on one card and phatk on the other. This combination works.
I think it has something to do with the video output. I have the screen plugged in on the card that I mine with phatk2. The other one, the one that crashes, doesn't have a screen or dummy plug.
full member
Activity: 219
Merit: 120
August 12, 2011, 02:16:25 AM
Version 1.6.0 has been released.

Changes:
1. Added the new phatk 2.2 kernel under the name phatk2 (use -k phatk2)
2. Default kernel changed to phatk2
3. Fixed RPC memory leaks and various other bugs
4. Added a timeout to ask() within the RPC code. This will definitively eliminate idling due to protocol bugs.
5. Moved project to GitHub
6. Small change to the version number scheme

The new GitHub URL for the project is here:
https://github.com/jedi95/Phoenix-Miner

Download:
Windows binaries
Source code/Linux release (requires Python, Twisted, and PyOpenCL)
full member
Activity: 219
Merit: 120
August 11, 2011, 06:29:29 PM
The RPC code in the newest SVN version r115 is still not working right. Connections go up and down all the time with lots of idles. Maybe revert it back for the time being? I can do that by hand of course if need be but it's a bit of a hassle to maintain.

FWIW, the phatk2 kernel is exactly as fast as the r115 phatk kernel on my 6990. I measured over 24 hours with one of them on the first and one on the second GPU core of that card; there was no significant deviation in the number of shares produced by each core, less than 10 shares difference.


I think I found the source of the memory leak in the previous RPCClient. If it works in my testing then I will be switching it back.

As for phatk2, it's faster than the phatk kernel for me using my 5870s, but keep in mind that the 69xx cards use VLIW4 instead of VLIW5.
member
Activity: 78
Merit: 10
August 11, 2011, 05:27:04 PM
The RPC code in the newest SVN version r115 is still not working right. Connections go up and down all the time with lots of idles. Maybe revert it back for the time being? I can do that by hand of course if need be but it's a bit of a hassle to maintain.

FWIW, the phatk2 kernel is exactly as fast as the r115 phatk kernel on my 6990. I measured over 24 hours with one of them on the first and one on the second GPU core of that card; there was no significant deviation in the number of shares produced by each core, less than 10 shares difference.
full member
Activity: 219
Merit: 120
August 11, 2011, 03:57:05 PM
Is the kernel 2.2 included in the latest phoenix-miner (r111)? I am getting the same results with miner r111 and r111+2.2: 403MHash on XFX HD 5850 @ 970/300

The phatk 2.2 kernel is included with the latest SVN revision under the name phatk2. It is also used as the default kernel in the absence of the -k argument.
member
Activity: 238
Merit: 10
August 11, 2011, 05:31:23 AM
By the way:
I am selling two XFX 5850;)
https://bitcointalksearch.org/topic/m.434195
member
Activity: 238
Merit: 10
August 11, 2011, 05:25:57 AM
Is the kernel 2.2 included in the latest phoenix-miner (r111)? I am getting the same results with miner r111 and r111+2.2: 403MHash on XFX HD 5850 @ 970/300
newbie
Activity: 18
Merit: 0
August 10, 2011, 10:59:01 AM
new to phoenix and linux mining

just got a linuxcoin .2b worker up

phoenix.py -v -k poclbm BFI_INT FASTLOOP=false AGGRESSION=11 DEVICE=0 WORKSIZE=128

on my 6950 2gb @ 840/1000 is pulling 315 hash.

seems a little low, my 6950 on my win7 box and guiminer is doing 330 with lower mem settings(and a lower fan speed in a hotter room)

is it low or am I just too used to staring at my 5870 miners?
sr. member
Activity: 313
Merit: 250
August 10, 2011, 08:02:18 AM
I noticed the memory leak does not happen on bitcoins.lc and they use longpolling.
Looking at the network traffic, it seems that bitcoins.lc uses keep-alive (slush also has keep-alive)
while the other pools (which are leaking memory) do not seem to use keep-alive!

Also there is something else which is weird.
Everything looks normal until phoenix sends out 2 similar looking packets.
The server answers this with an error 400 bad request, which is followed by 2 RST.
This happens after every communication between miner and server.

This here is btcguild, but arsbitcoin behaves exactly the same.
Code:
14:04:20.907806 IP (tos 0x0, ttl 64, id 11848, offset 0, flags [DF], proto TCP (6), length 52)
    10.23.42.151.33398 > 69.42.216.173.8332: Flags [.], cksum 0x52ac (incorrect -> 0x91b8), seq 481, ack 182, win 123, options [nop,nop,TS val 718992981 ecr 132504115], length 0
        0x0000:  0014 bfa5 8817 0024 1d8a cc03 0800 4500  .......$......E.
        0x0010:  0034 2e48 4000 4006 b9f6 0a17 2a97 452a  .4.H@.@.....*.E*
        0x0020:  d8ad 8276 208c f8a3 dcc3 b6ca 5f86 8010  ...v........_...
        0x0030:  007b 52ac 0000 0101 080a 2ada f655 07e5  .{R.......*..U..
        0x0040:  da33                                     .3
14:04:21.179311 IP (tos 0x0, ttl 64, id 11849, offset 0, flags [DF], proto TCP (6), length 52)
    10.23.42.151.33398 > 69.42.216.173.8332: Flags [F.], cksum 0x52ac (incorrect -> 0x90a7), seq 481, ack 182, win 123, options [nop,nop,TS val 718993253 ecr 132504115], length 0
        0x0000:  0014 bfa5 8817 0024 1d8a cc03 0800 4500  .......$......E.
        0x0010:  0034 2e49 4000 4006 b9f5 0a17 2a97 452a  .4.I@.@.....*.E*
        0x0020:  d8ad 8276 208c f8a3 dcc3 b6ca 5f86 8011  ...v........_...
        0x0030:  007b 52ac 0000 0101 080a 2ada f765 07e5  .{R.......*..e..
        0x0040:  da33                                     .3
14:04:21.353823 IP (tos 0x0, ttl 54, id 34045, offset 0, flags [DF], proto TCP (6), length 316)
    69.42.216.173.8332 > 10.23.42.151.33398: Flags [P.], cksum 0x2324 (correct), seq 182:446, ack 482, win 54, options [nop,nop,TS val 132504159 ecr 718993253], length 264
        0x0000:  0024 1d8a cc03 0014 bfa5 8817 0800 4500  .$............E.
        0x0010:  013c 84fd 4000 3606 6c39 452a d8ad 0a17  .<..@.6.l9E*....
        0x0020:  2a97 208c 8276 b6ca 5f86 f8a3 dcc4 8018  *....v.._.......
        0x0030:  0036 2324 0000 0101 080a 07e5 da5f 2ada  .6#$........._*.
        0x0040:  f765 4854 5450 2f31 2e31 2034 3030 2042  .eHTTP/1.1.400.B
        0x0050:  6164 2052 6571 7565 7374 0d0a 436f 6e74  ad.Request..Cont
        0x0060:  656e 742d 5479 7065 3a20 7465 7874 2f68  ent-Type:.text/h
        0x0070:  746d 6c0d 0a43 6f6e 6e65 6374 696f 6e3a  tml..Connection:
        0x0080:  2063 6c6f 7365 0d0a 4461 7465 3a20 5765  .close..Date:.We
        0x0090:  642c 2031 3020 4175 6720 3230 3131 2031  d,.10.Aug.2011.1
        0x00a0:  323a 3033 3a33 3320 474d 540d 0a43 6f6e  2:03:33.GMT..Con
        0x00b0:  7465 6e74 2d4c 656e 6774 683a 2031 3334  tent-Length:.134
        0x00c0:  0d0a 0d0a 3c48 544d 4c3e 3c48 4541 443e  ....<HTML><HEAD>
        0x00d0:  0a3c 5449 544c 453e 3430 3020 4261 6420  .<TITLE>400.Bad.
        0x00e0:  5265 7175 6573 743c 2f54 4954 4c45 3e0a  Request</TITLE>.
        0x00f0:  3c2f 4845 4144 3e3c 424f 4459 3e0a 3c48  </HEAD><BODY>.<H
        0x0100:  313e 4d65 7468 6f64 204e 6f74 2049 6d70  1>Method.Not.Imp
        0x0110:  6c65 6d65 6e74 6564 3c2f 4831 3e0a 496e  lemented</H1>.In
        0x0120:  7661 6c69 6420 6d65 7468 6f64 2069 6e20  valid.method.in.
        0x0130:  7265 7175 6573 743c 503e 0a3c 2f42 4f44  request<P>.</BOD
        0x0140:  593e 3c2f 4854 4d4c 3e0a                 Y></HTML>.
14:04:21.353861 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    10.23.42.151.33398 > 69.42.216.173.8332: Flags [R], cksum 0xe4ef (correct), seq 4171488452, win 0, length 0
        0x0000:  0014 bfa5 8817 0024 1d8a cc03 0800 4500  .......$......E.
        0x0010:  0028 0000 4000 4006 e84a 0a17 2a97 452a  .(..@.@..J..*.E*
        0x0020:  d8ad 8276 208c f8a3 dcc4 0000 0000 5004  ...v..........P.
        0x0030:  0000 e4ef 0000                           ......
14:04:21.354062 IP (tos 0x0, ttl 54, id 34046, offset 0, flags [DF], proto TCP (6), length 52)
    69.42.216.173.8332 > 10.23.42.151.33398: Flags [F.], cksum 0x8fb7 (correct), seq 446, ack 482, win 54, options [nop,nop,TS val 132504159 ecr 718993253], length 0
        0x0000:  0024 1d8a cc03 0014 bfa5 8817 0800 4500  .$............E.
        0x0010:  0034 84fe 4000 3606 6d40 452a d8ad 0a17  .4..@.6.m@E*....
        0x0020:  2a97 208c 8276 b6ca 608e f8a3 dcc4 8011  *....v..`.......
        0x0030:  0036 8fb7 0000 0101 080a 07e5 da5f 2ada  .6..........._*.
        0x0040:  f765                                     .e
14:04:21.354068 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
    10.23.42.151.33398 > 69.42.216.173.8332: Flags [R], cksum 0xe4ef (correct), seq 4171488452, win 0, length 0
        0x0000:  0014 bfa5 8817 0024 1d8a cc03 0800 4500  .......$......E.
        0x0010:  0028 0000 4000 4006 e84a 0a17 2a97 452a  .(..@.@..J..*.E*
        0x0020:  d8ad 8276 208c f8a3 dcc4 0000 0000 5004  ...v..........P.
        0x0030:  0000 e4ef 0000                           ......


I am not sure, but this does not look normal to me, with the bad request and all.
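
As a reading aid for the trace, the TCP flags can be decoded straight from the hex dump: with a 14-byte Ethernet header and a 20-byte IP header, the flags byte sits at offset 0x2f, i.e. the low byte of the last group on each 0x0020 line (`8010`, `8011`, `8018`, `5004`). The sketch below uses the standard TCP flag bits; `decode_tcp_flags` is just a name picked for this example.

```python
# Standard TCP flag bits, lowest bit first.
TCP_FLAGS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
]


def decode_tcp_flags(flags_byte):
    """Return the names of the flags set in a TCP flags byte."""
    return [name for bit, name in TCP_FLAGS if flags_byte & bit]
```

For the packets above, 0x10 decodes to ACK, 0x11 to FIN+ACK, 0x18 to PSH+ACK and 0x04 to RST, which matches the `[.]`, `[F.]`, `[P.]` and `[R]` markers tcpdump prints.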

full member
Activity: 140
Merit: 100
August 04, 2011, 06:29:25 AM

Very curious. I switched to:

Code:
#!/bin/sh
export DISPLAY=:0
while true; do
    timeout -k 60 24h python phoenix.py -v \
-u http://user:pass@localhost:8332 \
-k phatk_dia DEVICE=0 AGGRESSION=13 BFI_INT WORKSIZE=128 \
VECTORS FASTLOOP=false
done

to force a restart every 24 hours, so I can't collect the data anymore.
sr. member
Activity: 313
Merit: 250
August 03, 2011, 10:47:07 PM

I don't get the same result.  When I mine solo against the standard client (RPC only, no long polling) memory fills up.

Hmm, that's weird. Look at my memory graph: most of it is slush, and at the end is arsbitcoin :)


Maybe there is some other difference between those pools that triggers this, oh well...





