Topic: DiabloMiner GPU Miner - page 48. (Read 866596 times)

legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 07, 2011, 12:02:37 PM
Now, the somewhat interesting thing is, LP does not actually control when getworks happen. 100 mhash/sec causes nonce saturation in 45 seconds on a single getwork, and you have 2 getworks per GPU worked in parallel.

LP locking up due to network/pool stupidity only causes higher actual stale rates because LP isn't triggering getworks early. Any shares created until post-new block getwork are stale. It stops generating stales when the new getwork triggers.
I'm not sure I'm understanding this correctly - does this mean you don't use the LP response to update the current work? As I understood it, the LP response is the same thing as a normal getwork response and should be used as the current work, not trigger a new getwork request.

Anyway, not all the shares found after LP failure are rejected, only about 2/3 (which is annoyingly high, and probably caused both by my dodgy network connection and the low hashing speed). So normal getwork still works, that isn't a problem.

You're partly understanding it correctly. Look at the math I posted: a getwork lasts 45 seconds at 100 mhash/sec, and 2 are in use per GPU. On my 5850@918, that works out to a getwork every 12 seconds on average (they come in groups of 2 every 24 seconds). An LP only causes a getwork up to 24 seconds early.

LP returns a getwork and it is reused, but that doesn't help the other threads, so LP also causes a getwork flush. For my 5850, this means I can only produce stales for up to 24 seconds. A terminally stuck LP thread will just cause spans of up to 24 seconds to be stale-happy. A block happens roughly every 8 minutes (although the network targets 10).
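The arithmetic in this post can be checked with a short sketch. It assumes only that a getwork covers the full 32-bit nonce range (2^32 hashes). The class and method names are made up for illustration, and the 358 Mhash/s figure is not a measured number from the thread, just the total rate implied by "2 getworks every 24 seconds". At exactly 100 Mhash/s the result comes out near 43 seconds, which the post rounds to 45.

```java
final class NonceMath {
    // The nonce field is 32 bits wide, so one getwork covers 2^32 hashes.
    static final long NONCE_SPACE = 1L << 32;

    /** Seconds until a single getwork runs out of nonces at the given hash rate. */
    static double secondsToSaturation(double hashesPerSecond) {
        return NONCE_SPACE / hashesPerSecond;
    }

    /**
     * Average seconds between getwork requests when the GPU's total rate is
     * split evenly across several getworks worked in parallel.
     */
    static double averageGetworkInterval(double totalHashesPerSecond, int parallelGetworks) {
        double perGetworkRate = totalHashesPerSecond / parallelGetworks;
        return secondsToSaturation(perGetworkRate) / parallelGetworks;
    }

    public static void main(String[] args) {
        System.out.printf("100 Mhash/s, one getwork: %.1f s%n", secondsToSaturation(100e6));

        // Hypothetical total rate, chosen to match the 24 second figure in the post.
        double assumedRate = 358e6;
        System.out.printf("Each getwork lasts: %.1f s%n", secondsToSaturation(assumedRate / 2));
        System.out.printf("Average getwork interval: %.1f s%n", averageGetworkInterval(assumedRate, 2));
    }
}
```

The per-getwork lifetime is also the upper bound on how long a miner keeps hashing old work when LP never fires, which is where the "up to 24 seconds" of stales comes from.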
full member
Activity: 373
Merit: 100
July 07, 2011, 11:02:59 AM
Now, the somewhat interesting thing is, LP does not actually control when getworks happen. 100 mhash/sec causes nonce saturation in 45 seconds on a single getwork, and you have 2 getworks per GPU worked in parallel.

LP locking up due to network/pool stupidity only causes higher actual stale rates because LP isn't triggering getworks early. Any shares created until post-new block getwork are stale. It stops generating stales when the new getwork triggers.
I'm not sure I'm understanding this correctly - does this mean you don't use the LP response to update the current work? As I understood it, the LP response is the same thing as a normal getwork response and should be used as the current work, not trigger a new getwork request.

Anyway, not all the shares found after LP failure are rejected, only about 2/3 (which is annoyingly high, and probably caused both by my dodgy network connection and the low hashing speed). So normal getwork still works, that isn't a problem.
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 07, 2011, 06:34:59 AM
Yeah, 60 minutes makes no sense. Bitcoin aims to create a new block every ten minutes. Generally, it's closer to every 6 to 8. If 10 minutes have passed and LP hasn't returned, it's obvious we're most likely in a new block by now.

I'd say that 20 minutes makes more sense since the 10 minutes is supposed to be a statistical average, not some upper limit. It happens quite frequently (especially after a few quick blocks) that a longer block comes along to balance things out. OTOH I don't really see the harm in forcing more frequent updates.

Yeah, given the lack of harm, I don't really see an issue with just making both 10. Or even 15. Extremely few blocks are past 15 from what I can tell. The worst case is that the connection times out and we cause getworks early.

setReadTimeout doesn't seem to do what the javadocs say it does, at least not from what I can see in the source. I'll look at it later.

I haven't really looked at setReadTimeout's source, but it seems to be working fine here. I have several "ERROR: Cannot connect to Bitcoin: Read timed out" messages in my log now, and after each one, long polling messages start coming in again (that's one thing the 1 hour timeout is good for: I can see with a high degree of certainty whether long polling stopped working).

Ahh, now, those are the magic words I was looking for, I think. I should add this to both timeout and no timeout.

Now, the somewhat interesting thing is, LP does not actually control when getworks happen. 100 mhash/sec causes nonce saturation in 45 seconds on a single getwork, and you have 2 getworks per GPU worked in parallel.

LP locking up due to network/pool stupidity only causes higher actual stale rates because LP isn't triggering getworks early. Any shares created until post-new block getwork are stale. It stops generating stales when the new getwork triggers.
full member
Activity: 373
Merit: 100
July 07, 2011, 04:37:01 AM
Yeah, 60 minutes makes no sense. Bitcoin aims to create a new block every ten minutes. Generally, it's closer to every 6 to 8. If 10 minutes have passed and LP hasn't returned, it's obvious we're most likely in a new block by now.

I'd say that 20 minutes makes more sense since the 10 minutes is supposed to be a statistical average, not some upper limit. It happens quite frequently (especially after a few quick blocks) that a longer block comes along to balance things out. OTOH I don't really see the harm in forcing more frequent updates.

setReadTimeout doesn't seem to do what the javadocs say it does, at least not from what I can see in the source. I'll look at it later.

I haven't really looked at setReadTimeout's source, but it seems to be working fine here. I have several "ERROR: Cannot connect to Bitcoin: Read timed out" messages in my log now, and after each one, long polling messages start coming in again (that's one thing the 1 hour timeout is good for: I can see with a high degree of certainty whether long polling stopped working).
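To put rough numbers on the "statistical average, not some upper limit" point, here is a small sketch. It assumes block intervals follow an exponential distribution, which is the usual idealized model for a steady hash rate and not something measured in this thread. The class name and the chosen timeouts are illustrative only.

```java
final class BlockIntervalSketch {
    /**
     * Probability that the gap between two blocks exceeds the given number of
     * minutes, assuming exponentially distributed gaps with the given mean.
     */
    static double probLongerThan(double minutes, double meanMinutes) {
        return Math.exp(-minutes / meanMinutes);
    }

    public static void main(String[] args) {
        double[] timeouts = {10, 15, 20, 60};
        double[] means = {10, 7}; // the 10 minute target, and a value in the 6 to 8 range mentioned above
        for (double mean : means) {
            for (double timeout : timeouts) {
                System.out.printf("mean %.0f min: P(gap > %.0f min) = %.1f%%%n",
                        mean, timeout, 100 * probLongerThan(timeout, mean));
            }
        }
    }
}
```

Under this model with a 10 minute mean, a bit over a third of block gaps run past 10 minutes and roughly one in seven past 20, while gaps past 60 minutes are rare. A shorter read timeout therefore fires often on a healthy connection, but the cost is only an early getwork.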
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 06, 2011, 08:58:17 PM
I just had a look at the code, and you're only setting a timeout on the connection attempt, not on reads from the established connection. This patch solves my problem:
Code:
--- DiabloMiner-git/src/main/java/com/diablominer/DiabloMiner/DiabloMiner.java  2011-07-06 16:43:07.344491654 +0200
+++ DiabloMiner_build/src/main/java/com/diablominer/DiabloMiner/DiabloMiner.java        2011-07-06 22:54:21.861192261 +0200
@@ -1104,9 +1104,14 @@
         connection = (HttpURLConnection) bitcoind.openConnection(proxy);
 
       if(timeout)
+      {
         connection.setConnectTimeout(15000);
+      }
       else
+      {
         connection.setConnectTimeout(10 * 60 * 1000);
+        connection.setReadTimeout(60 * 60 * 1000);
+      }
I deliberately set the timeout to an hour so that the number of false alarms stays low. You may want to implement a lower read timeout, I'm not sure what a decent limit is.

Yeah, 60 minutes makes no sense. Bitcoin aims to create a new block every ten minutes. Generally, it's closer to every 6 to 8. If 10 minutes have passed and LP hasn't returned, it's obvious we're most likely in a new block by now.

setReadTimeout doesn't seem to do what the javadocs say it does, at least not from what I can see in the source. I'll look at it later.
full member
Activity: 373
Merit: 100
July 06, 2011, 03:59:12 PM
I just had a look at the code, and you're only setting a timeout on the connection attempt, not on reads from the established connection. This patch solves my problem:
Code:
--- DiabloMiner-git/src/main/java/com/diablominer/DiabloMiner/DiabloMiner.java  2011-07-06 16:43:07.344491654 +0200
+++ DiabloMiner_build/src/main/java/com/diablominer/DiabloMiner/DiabloMiner.java        2011-07-06 22:54:21.861192261 +0200
@@ -1104,9 +1104,14 @@
         connection = (HttpURLConnection) bitcoind.openConnection(proxy);
 
       if(timeout)
+      {
         connection.setConnectTimeout(15000);
+      }
       else
+      {
         connection.setConnectTimeout(10 * 60 * 1000);
+        connection.setReadTimeout(60 * 60 * 1000);
+      }
I deliberately set the timeout to an hour so that the number of false alarms stays low. You may want to implement a lower read timeout, I'm not sure what a decent limit is.
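The difference between the two timeouts can be reproduced without a pool. The sketch below is not DiabloMiner code: it opens a local socket that accepts connections but never answers, standing in (as an assumption) for a long poll that never returns. The class and method names are invented for the example. The connect succeeds immediately, so only the read timeout can end the request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

final class ReadTimeoutSketch {
    /**
     * Sends a request to a local socket that never replies and reports
     * whether the read timeout ended the request.
     */
    static boolean readTimesOut(int readTimeoutMillis) {
        // The listening socket completes TCP handshakes but nobody ever writes a response.
        try (ServerSocket silentServer = new ServerSocket(0)) {
            URL url = new URL("http://127.0.0.1:" + silentServer.getLocalPort() + "/");
            HttpURLConnection connection = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            connection.setConnectTimeout(15000);          // limits only the connection attempt
            connection.setReadTimeout(readTimeoutMillis); // limits the wait for response data
            try {
                connection.getResponseCode(); // blocks until data arrives or the read timeout fires
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Read timeout fired: " + readTimesOut(500));
    }
}
```

With only setConnectTimeout set, the same request would wait indefinitely, since the default read timeout of 0 means no limit.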
full member
Activity: 373
Merit: 100
July 06, 2011, 09:59:52 AM
Thanks a bunch, this should significantly reduce the number of stales I get! ;D
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 06, 2011, 09:04:54 AM
Update: If you are having problems on pools with extremely broken LP (such as connections never returning), it should be fixed now (it times out after 10 minutes instead of never). I've also upped the stack size so you shouldn't see OOM errors when the queues get overloaded (at least, it will take much longer). The bug with slushpool where multiple execution threads would not be brought up properly has also been fixed.
full member
Activity: 373
Merit: 100
July 05, 2011, 11:50:29 AM
OSX's OpenCL implementation is a disaster.

Tell me about it. And it's an Nvidia card...
Anyhow, at least with this configuration, there is a serious performance regression and you should know about it... ;)

At any rate, the LP bug you're having doesn't make sense. It almost sounds like the pool is intentionally not returning on LP. Look at the code yourself: if the LP either returns or errors out, it tells all the mining threads to go refresh their getworks, then goes back to starting another LP.

It almost sounds like you're mining on a pool that is down, and my miner (obviously) isn't shutting down. The only way you'd realistically run out of memory (since I tell Java to limit it to a small size) is if the sendwork queue fills up.

It's not intentional (and for reference, I usually mine at deepbit). Our phone somehow causes some interference with the DSL connection, triggering a reconnect on every phone call and therefore usually a new IP address. LP obviously can't work properly after this.

Also note that I said there is a high chance it doesn't recover - sometimes it does. Anyway, I'm logging the current run, and will send you the complete log once the OOM occurs. So far there has been no longpolling debug message for >5 hours and there were some before the first reconnect. 1 share and three stales found so far.

Here's the log up to this point (using swepool to see whether it would make a difference; so far it hasn't):
Code:
[05.07.11 10:28:26] Started                                                  
[05.07.11 10:28:26] Connecting to: http://swepool.net:8337/                 
[05.07.11 10:28:26] Using Apple OpenCL 1.0 (Dec 23 2010 17:30:26)           
[05.07.11 10:28:40] Added GeForce 9400 (#1) (2 CU, local work size of 128)   
[05.07.11 10:28:41] DEBUG: Enabling long poll support                       
[05.07.11 10:35:42] DEBUG: Long poll returned                               
[05.07.11 10:41:46] DEBUG: Long poll returned                               
[05.07.11 13:07:06] DEBUG: Attempt 1 found on GeForce 9400 (#1)             
[05.07.11 13:07:07] Rejected block 1 found on GeForce 9400 (#1)             
[05.07.11 14:17:05] DEBUG: Attempt 2 found on GeForce 9400 (#1)             
[05.07.11 14:17:05] Rejected block 2 found on GeForce 9400 (#1)             
[05.07.11 15:54:03] DEBUG: Forcing getwork update due to nonce saturation   
[05.07.11 16:07:54] DEBUG: Attempt 3 found on GeForce 9400 (#1)             
[05.07.11 16:07:54] Accepted block 1 found on GeForce 9400 (#1)             
[05.07.11 17:35:49] DEBUG: Forcing getwork update due to nonce saturation   
[05.07.11 17:45:00] DEBUG: Attempt 4 found on GeForce 9400 (#1)             
[05.07.11 17:45:01] Rejected block 3 found on GeForce 9400 (#1)             
mhash 1,3/1,3 | a/r/hwe: 1/3/0 | ghash: 38,9 | fps: 21,0

(I just had a quick look at the code and long polling doesn't time out, which is obviously problematic. This would also fail if the network connection broke long enough for the long polling answer to fail to reach the client. jgarzik's cpuminer manages to revive the long polling connection after it is broken, so it can't be only the pool.)
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 05, 2011, 07:50:27 AM
8 GPUs in one box. I can't say exactly what causes it, be it pool downtime or something else, but I was seeing OOM errors after a few hours. I changed the maximum heap from 16 to 32 MB and haven't seen an OOM since.

Yeah, I might make that change myself. 16 is just cutting it too thin.
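For anyone wanting to confirm which heap limit their JVM actually picked up, a minimal check looks like this. The maximum heap is set with the standard -Xmx JVM option (for example -Xmx32m). How DiabloMiner's own launcher passes that option is not shown in this thread, so the class below is a standalone illustration with a made-up name.

```java
final class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Run as: java -Xmx32m HeapCheck
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The reported value can differ slightly from the -Xmx figure, depending on the JVM and garbage collector.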
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 05, 2011, 07:49:02 AM
Hey all.

I decided to give DiabloMiner a try as it has an option I like.

Anyway, every time I try to mine using it I get a popup window saying Java has shit itself. I bet it is a simple fix as well. Any ideas?

  Problem Event Name:   BEX
  Application Name:   java.exe
  Application Version:   6.0.260.3
  Application Timestamp:   4dc11607
  Fault Module Name:   atiocl.dll
  Fault Module Version:   2.3.451.0
  Fault Module Timestamp:   4cfd9c01
  Exception Offset:   004245d3
  Exception Code:   c0000417
  Exception Data:   00000000
  OS Version:   6.1.7601.2.1.0.256.1
  Locale ID:   2057

EDIT: Again it's not working. I reinstalled and it worked once, everything fine. Then I closed it down to change a setting and it's crashing again.

Found out what was causing the crashes: I had app 2.3. Installed 2.4 and it's running fine, getting 502 mhash/s out of a 5770 + 6870. Don't know how this was causing the issue, but never mind.

Love the fact you can set all your miners away with 1 window.

Ahh, weird. 2.3 should have just refused to work altogether. 2.3 does not support 68xx afaict, only 2.4.
sr. member
Activity: 464
Merit: 250
July 05, 2011, 07:16:55 AM
Hey all.

I decided to give DiabloMiner a try as it has an option I like.

Anyway, every time I try to mine using it I get a popup window saying Java has shit itself. I bet it is a simple fix as well. Any ideas?

  Problem Event Name:   BEX
  Application Name:   java.exe
  Application Version:   6.0.260.3
  Application Timestamp:   4dc11607
  Fault Module Name:   atiocl.dll
  Fault Module Version:   2.3.451.0
  Fault Module Timestamp:   4cfd9c01
  Exception Offset:   004245d3
  Exception Code:   c0000417
  Exception Data:   00000000
  OS Version:   6.1.7601.2.1.0.256.1
  Locale ID:   2057

EDIT: Again it's not working. I reinstalled and it worked once, everything fine. Then I closed it down to change a setting and it's crashing again.

Found out what was causing the crashes: I had app 2.3. Installed 2.4 and it's running fine, getting 502 mhash/s out of a 5770 + 6870. Don't know how this was causing the issue, but never mind.

Love the fact you can set all your miners away with 1 window.
legendary
Activity: 1428
Merit: 1000
https://www.bitworks.io
July 05, 2011, 05:57:45 AM
8 GPUs in one box. I can't say exactly what causes it, be it pool downtime or something else, but I was seeing OOM errors after a few hours. I changed the maximum heap from 16 to 32 MB and haven't seen an OOM since.
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 05, 2011, 05:51:27 AM
Machine stats:
Mac Mini with OSX 10.6.7
uname -a: Darwin zentrale1.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386
Apple OpenCL 1.0 (Dec 23 2010 17:30:26)
Nvidia GeForce 9400

Update: Finish async networking, now everything is async. getwork, sendwork, and LP now use 1 thread per miner instance instead of 3 per GPU.
Ever since this update, whenever long polling fails (because of network errors), there is a high chance that it won't pick up again (I've had >1h go by without a long polling message in debug mode). Pretty soon after the nonce saturation message, I get an OOM Exception (will paste it in here as soon as I get it again).


Update: Behold, the frankenkernel. A mix of DiabloKernel and phatk.

Ever since this update, the hashing speed went from "mhash 2,2/2,2" to "mhash 1,4/1,4".


The latest git updates don't change anything here.

OSX's OpenCL implementation is a disaster.

At any rate, the LP bug you're having doesn't make sense. It almost sounds like the pool is intentionally not returning on LP. Look at the code yourself: if the LP either returns or errors out, it tells all the mining threads to go refresh their getworks, then goes back to starting another LP.

It almost sounds like you're mining on a pool that is down, and my miner (obviously) isn't shutting down. The only way you'd realistically run out of memory (since I tell Java to limit it to a small size) is if the sendwork queue fills up.
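The long poll behaviour described in this post (return or error, then tell every mining thread to refresh, then start the next LP) can be sketched as below. This is not the actual DiabloMiner code: the class, the method, and the use of one AtomicBoolean flag per mining thread are assumptions made for illustration, and the blocking request is passed in as a Callable so the sketch stays independent of any network code.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicBoolean;

final class LongPollSketch {
    /**
     * Runs one long poll cycle. Blocks on the request, then flags every mining
     * thread to refresh its getwork, whether the request returned or failed.
     * Returns the work from the LP response, or null if the request failed.
     */
    static String pollOnce(Callable<String> longPollRequest, List<AtomicBoolean> refreshFlags) {
        String work = null;
        try {
            work = longPollRequest.call(); // blocks until the pool answers or a timeout fires
        } catch (Exception e) {
            // A failed LP is treated like a returned one: the flush below still happens.
        } finally {
            for (AtomicBoolean flag : refreshFlags) {
                flag.set(true); // each mining thread checks its flag and fetches fresh work
            }
        }
        return work;
    }
}
```

A caller would invoke pollOnce in a loop. The weak point is a request that neither returns nor fails: without a read timeout, call() never comes back, so no flush happens and no new LP is started.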
full member
Activity: 373
Merit: 100
July 05, 2011, 03:58:31 AM
Machine stats:
Mac Mini with OSX 10.6.7
uname -a: Darwin zentrale1.local 10.7.0 Darwin Kernel Version 10.7.0: Sat Jan 29 15:17:16 PST 2011; root:xnu-1504.9.37~1/RELEASE_I386 i386
Apple OpenCL 1.0 (Dec 23 2010 17:30:26)
Nvidia GeForce 9400

Update: Finish async networking, now everything is async. getwork, sendwork, and LP now use 1 thread per miner instance instead of 3 per GPU.
Ever since this update, whenever long polling fails (because of network errors), there is a high chance that it won't pick up again (I've had >1h go by without a long polling message in debug mode). Pretty soon after the nonce saturation message, I get an OOM Exception (will paste it in here as soon as I get it again).


Update: Behold, the frankenkernel. A mix of DiabloKernel and phatk.

Ever since this update, the hashing speed went from "mhash 2,2/2,2" to "mhash 1,4/1,4".


The latest git updates don't change anything here.
hero member
Activity: 772
Merit: 500
July 05, 2011, 12:09:08 AM
Update: Removed a lot of dead code that the compiler should remove, and I think it might have been missing some.

Did you take a look at my kernel mod ;)?

Dia

Your mod looks almost identical to the changes I made when I unmacro'ed frankenkernel.

I guess it is some kind of logical evolution. I didn't look into any kernel other than the original phatk. But I like this open exchange of ideas with a little feeling of competition :).

Dia
legendary
Activity: 1162
Merit: 1000
DiabloMiner author
July 04, 2011, 08:55:55 PM
Update: Removed a lot of dead code that the compiler should remove, and I think it might have been missing some.

Did you take a look at my kernel mod ;)?

Dia

Your mod looks almost identical to the changes I made when I unmacro'ed frankenkernel.
sr. member
Activity: 464
Merit: 250
July 04, 2011, 03:25:34 PM
Hey all.

I decided to give DiabloMiner a try as it has an option I like.

Anyway, every time I try to mine using it I get a popup window saying Java has shit itself. I bet it is a simple fix as well. Any ideas?

  Problem Event Name:   BEX
  Application Name:   java.exe
  Application Version:   6.0.260.3
  Application Timestamp:   4dc11607
  Fault Module Name:   atiocl.dll
  Fault Module Version:   2.3.451.0
  Fault Module Timestamp:   4cfd9c01
  Exception Offset:   004245d3
  Exception Code:   c0000417
  Exception Data:   00000000
  OS Version:   6.1.7601.2.1.0.256.1
  Locale ID:   2057

EDIT: Again it's not working. I reinstalled and it worked once, everything fine. Then I closed it down to change a setting and it's crashing again.
newbie
Activity: 27
Merit: 0
July 04, 2011, 01:56:25 PM
About 3:30am CDT all my miners went belly up except 1. The triggering event was something between me and btcguild's central server. The reason one remained up was that it was connected to the east server. Anyway, here's the best I could copy with pen and paper from the ones that went down; they all showed the same major message.

Code:
Exception in thread "DiabloMiner LongPollAsync"
java.lang.OutOfMemoryError: Java heap space
what came after that varied a bit from nothing to

either
at java.util.Formatter.parse(Formatter.java:2480)
at java.util.Formatter.parse(Formatter.java:2414)
at java.util.Formatter.parse(Formatter.java:2367)

or it referred to Getwork Parser.

I suspect this was triggered by the long poll not being able to connect to btcguild central, but I suppose it could be coincidental.

Installed the version with the .jar file dated 7/2/11 yesterday around 8pm CDT

PS:  btw, they were disconnected from central for about 8 hours because I decided to sleep in today :-(
hero member
Activity: 772
Merit: 500
July 04, 2011, 11:35:28 AM
Update: Removed a lot of dead code that the compiler should remove, and I think it might have been missing some.

Did you take a look at my kernel mod ;)?

Dia

Hehe, I was just going to suggest that he take a look at it :)
Here's the link Diablo, if you need it: http://forum.bitcoin.org/index.php?topic=25860.0
Hope you can use it

Btw, my nick is a long-standing one and no offense is meant to you, Diablo :D. Well, some changes look like we've got similar ideas, or he already took a look ^^.

Dia