SOLVED BITCHES!!!! well,.. mostly. I guess
"FIXED BITCHES!!!!" would be more accurate....
I'd need feedback and some network engineers to "solve" it but I *think* this is the "FIX".
I posted this to reddit as a response to the same question and wanted to make sure I got it out here as well, since this seems to be the single most annoying thing you can run into with these things.
Initially my blades would also do the "bounce every 2min 30sec" thing. As we all know by now, this is normal: it just means that the blade isn't getting work, has switched to the 2nd server, STILL isn't getting work, and resets itself when the 2:30 timer runs out.
I set up a proxy for slush's pool (via the instructions everyone under the sun posts as the fix for an issue that obviously isn't the proxy).
I would see the proxy Get requests from the blades, the blades would crush the work, then stop (i.e. ramp up, then stop doing anything). If you look at the web dashboard you see everything ramp up to 2-6GH/s, then start dropping immediately. If you change the port to 3333 you'll see the proxy server start bitching the second you save the config, so between that and the proxy showing a burst of requests, you can see that the blade is talking to the proxy.
On the other side of the equation, the proxy screen will tell you if it's talking to the pool. If it is, chances are that side is OK.
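If you want a quick sanity check on both legs (blade to proxy, proxy to pool) before staring at dashboards, you can at least confirm the TCP ports answer. This is only a rough Python sketch, not something from the blade or proxy documentation: `port_open` is a made-up helper name, and the host and port you pass in are whatever your own setup uses.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves that something is listening and reachable; it says
    nothing about whether work is actually flowing.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `port_open("192.168.1.50", 3333)` would test the proxy port from another box on the LAN (the address is a placeholder). A `True` here only rules out the dumbest failures, it does not explain the stalls.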
The strange thing is that this would happen both at my workplace (nice fat 20M connection to the internet, free electric and a cooled data room) AND on my home ISP (3M DSL, *yeah I live in the middle of a forest*, nice box fan on it and paying for those kWh).
Same results in both places: one insanely complex network and one super simple network.
This of course made me worry about hardware, but I have 5 blades and they were all acting identically.
I did have the HP power supply and the backplane, but I VOM'd it out and it was well within range, plus I could access the web dashboards on :8000 so everything *seemed* to be working to that point.
I'm pretty sure the problems with the blades are almost NEVER electrical unless you really screw something up and don't follow the setup directions (put the wires on backwards, let your cat chew on the cords, etc.).
So I set up my blades for bitminter's getwork. Normal config (pool,pool 80,80 etc.), bypassing the proxy server entirely.
It worked *better* at this point, but using the two ports 80 and 8333 I could see that it was still pool bouncing. (With 80,80 you can't tell on the dashboard, so if you have the same port listed twice it *looks* like it's mining OK but slow.) This got all my blades running at 50% (hey, at least they were running!), so I put them all on 80,80 after that and started looking at other things while they eked out a little coinage.
At this point I did a LOT of stupid shooting in the dark (different proxy programs, firewall rules, etc.), all to no avail whenever I pointed the blades back at the proxy.
What's happening: The blade is requesting work, then it stops because of some interference (more about that later), and it either manages to restart when it switches pools or it hits the 2:30 mark and resets again (usually the latter on proxy, the former on getwork).
The proxy server is fine and doing its job... "Here take all this work.... uh,.. do you want any more... Hello?,.. Hello? OH!,.. here's a ton more of work,.. (what took you so long?!)" but the blade isn't requesting work consistently.
For some reason setting it up for getwork either resets less frequently or picks up more reliably when it bounces between pools. Not 100% sure on that, but again, just pool bouncing on getwork gave me about 5GH/s per blade. Plus this also proves to you that your stuff is capable of working, and that lets you start to believe you can make them work.
But stratum via proxy is so much faster/efficient!! There HAD to be a fix (despite what the internet might unhelpfully say) so we tried some "other stuff".
If you're running your proxy, you can see them work "kinda" if you hit the "Switch server" button twice in a row. It's like this forces the blade to do that burst request to the proxy w/out giving it time to hit the 2:30 mark.
(Actually had a mouse macro that was going to each config browser page and hitting this button twice every 20 seconds and it got me up to about 80-85%).
Yeah I know,... stupid solution, but it was getting us closer to the target!!!!
Still not good enough though, these things weren't cheap after all, so we plowed on...
I had some free time over the Thanksgiving holiday, so I sat down and created a new isolated network on my firewall (SonicWall; set up a new interface and configured it like a DMZ). I set up the cards 100% identically to the way I had them on my work LAN and my home system and
BTCOOM! They came up and are between 10-12GH/s constant now.
So,.... What did I do on this config that made it different from my work LAN and my Home LAN?
First of all (based on some posts I had seen about cell phones, VOIP, stuff...), I made sure there were NO wireless devices on the new LAN. Not a WAP, not a laptop, not a router, not a phone, NOTHING using wireless. LAN to WAN, 6 devices (5 blades, one proxy, NO cross talk to other LANs).
(My work LAN has a ton of wifi stuff and my home router was a wired/wireless router, so initially neither environment was free of wifi stuff.)
Then I turned off ALL QoS (http://www.draytek.com/.upload/pdffiles/4f9250d3a31bffbc72250beffc568d63.pdf) and checking/filtering options on the firewall. I allowed *everything* in and out w/no restrictions. *My thought being that any type of high speed/queuing/checking/filtering/virus scanning etc. could potentially lead to stutters.* As this was on an isolated LAN I wasn't really worried about hacking for the hour or so I was testing, but I wouldn't suggest this option for the long haul.
After it was running smoothly I then looked to see what IP range and ports were being used to send/receive work on the proxy, excluded everything else and they're humming along perfectly now, and are secure once more.
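The lock-down step above boils down to "default deny, plus a short allowlist of the addresses and ports the proxy actually uses". A minimal sketch of that logic in Python, purely for illustration: the subnet and ports below are placeholders (not the real values from this setup), `is_allowed` is a made-up name, and a real firewall would express this in its own rule syntax.

```python
import ipaddress

# Placeholder values: substitute the range and ports you actually observed.
ALLOWED_NETS = [ipaddress.ip_network("192.168.50.0/24")]
ALLOWED_PORTS = {3333}


def is_allowed(ip: str, port: int) -> bool:
    """Default deny: pass traffic only if both the address and port are listed."""
    addr = ipaddress.ip_address(ip)
    in_range = any(addr in net for net in ALLOWED_NETS)
    return in_range and port in ALLOWED_PORTS
```

The point of writing it this way is that anything you did not explicitly see during the wide-open test stays blocked afterwards.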
Not sure WHY wireless is an issue, but in some cases it seems that "it just is". At some point I'm going to throw a blade on the home LAN and watch the stutters and see what's being reported but for now I think this is the fix for this issue.
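For anyone who wants to watch the stutters on their own network, one low-tech approach is to note the times the proxy sees requests from a blade and look for long silences. A minimal sketch, assuming you already have those times as seconds (how you get them depends on your proxy's output, which isn't covered here). `find_stalls` and the 30-second default threshold are made up for illustration.

```python
def find_stalls(timestamps, max_gap=30.0):
    """Return (last_seen, next_seen, gap) for every pause longer than max_gap.

    timestamps: request times in seconds, in any order.
    """
    ordered = sorted(timestamps)
    stalls = []
    # Compare each request time with the one that follows it.
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur - prev
        if gap > max_gap:
            stalls.append((prev, cur, gap))
    return stalls
```

A blade that bursts, goes quiet and only picks up again after a reset should show up as a regular string of gaps near the 2:30 mark, while a healthy one should return an empty list.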
I've talked to one other person who switched from a wireless router at home to a crappy wired router and his blade came up as well; turning off QoS added speed overall. I would love to hear back from anyone else who tries any/all of these steps and can tell us if they have similar success, or new issues.
The QoS/filter type services I can understand (you have to inspect and scan packets to know what to do with them and/or if they're all virused up, so having that process inject some kind of lag/break/etc. that an ASIC just might not like is believable), but "just having wireless" doesn't seem like it should cause any issues. My understanding is that it's just going to report itself as a standard network device on the ethernet side so... I'm baffled. Maybe some hard core networking guys could chime in on this one?
Spent about 5 days of my life testing/trying/shooting in the dark on this, but now I'm hitting as high as 18GH/s so if you're having this issue I REALLY HOPE THIS HELPS!!!!! Let us know if it does and/or if you go down some other path that gets you to the solution.
BTC 19h2PkTKsPcMxtwGspcpnQSEuVCefSBjmp
LTC LSTjG357aKAwJEMFzNUAm2RY4wyUyvKL59