The problem was basically that I was without good internet access again for about 7-8 hours. I was in a car with people from my "real" job for about 4 hours and then I had to go out and socialize with work people. I was working feverishly to fix the site via my smartphone the entire time, but had to limit my time spent on the phone.
I'll be returning home later today and then it's back to "normal" support until the next time I need to travel.
What's ironic is that the root cause of all this was me trying to bring down the stales on DOGE by twiddling with the DB thresholds on the DOGE pools. Rather than committing shares every 30 seconds, I brought that down to 10 seconds. The old DB just could not handle 3x the number of connections and started falling behind on shares. I didn't notice this until about 2 pm the next day, and that's when I finally decided to bring down the website to try to allow the DB to catch up. But when I saw that there were over 60 million shares in the DOGE shares table, I knew that it was going to take hours and hours, so I thought I could speed things up by bringing down the old DB and migrating it to a larger system. At the time, DOGE blocks could not even be scored because the queries were taking so long that they were timing out.
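To put rough numbers on the 30-second versus 10-second change, here is a minimal sketch. Only the two intervals come from the post above; the function name and the idea of counting one connection per commit per writer process are assumptions made for illustration, not a description of the actual pool software.

```python
# Illustrative arithmetic only. The 30 s and 10 s intervals are from the post;
# "one DB connection per commit per writer" is an assumption.

def commits_per_hour(interval_seconds: int, writers: int = 1) -> float:
    """Commit connections opened per hour by `writers` pool processes."""
    return writers * 3600 / interval_seconds

old_rate = commits_per_hour(30)  # original setting
new_rate = commits_per_hour(10)  # setting after the change

print(f"30 s interval: {old_rate:.0f} commits/hour per writer")
print(f"10 s interval: {new_rate:.0f} commits/hour per writer")
print(f"load multiplier: {new_rate / old_rate:.0f}x")
```

The multiplier does not depend on how many writers there are, which is why cutting the interval to a third triples the connection load regardless of pool size.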
The provider took over 2 hours to make a snapshot of a 160GB VPS and then another 45 minutes to bring up the new DB. I posted the update that was visible on the site while the snapshot was running. When the new DB server finally came up I was still out with my colleagues and had to wait until I got back to my hotel room to finish bringing things back up.
Right now all 3 pools are under DDoS attack, I suppose from either people who are pissed or competitors who want to maximize the impact of this downtime to get people to switch pools.
DDoS-protected US-West and EU pools are coming early next week. Those will be the high-difficulty pools I spoke about last week. I already have the EU server set up and am just waiting till I get home to configure things.
As far as support expectations go, it's all best effort. One thing I know I did wrong was that I should have brought down the pools before I started the DB rebuild. But I had no idea it was going to take almost 3 hours to redeploy that VPS. If I had known that, I definitely would have brought down the pools first. But in the grand scheme of things, people should not be depending on just one pool (even if it's the awesome Multipool).
I haven't read most of the comments because they'll probably just bum me out and distract me from finishing the work that's needed, which is to get all of yesterday's DOGE shares into the DB and get all the blocks scored and paid out. But rest assured that all the shares that were submitted will be accounted for, even shares that were submitted during the downtime. If any shares are missing I may need to use an average over multiple blocks, but the blocks should be paid fairly in any case.
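For anyone wondering what "an average over multiple blocks" could look like, here is a rough sketch of one possible approach. It is not the actual Multipool scoring code: the function names, the miner names and the proportional payout rule are all hypothetical. The idea is to take each miner's fraction of the shares in several blocks with complete data, average those fractions, and use the result to split the reward of a block whose share data is incomplete.

```python
from collections import defaultdict

def average_share_fractions(blocks):
    """blocks: list of dicts mapping miner -> share count for one block.
    Returns miner -> mean fraction of the shares across the usable blocks."""
    usable = [b for b in blocks if sum(b.values()) > 0]  # skip empty blocks
    if not usable:
        return {}
    totals = defaultdict(float)
    for shares in usable:
        block_total = sum(shares.values())
        for miner, count in shares.items():
            totals[miner] += count / block_total
    # A miner absent from a block contributes a fraction of 0 for that block.
    return {miner: total / len(usable) for miner, total in totals.items()}

def split_reward(reward, fractions):
    """Divide a block reward in proportion to the averaged fractions."""
    return {miner: reward * fraction for miner, fraction in fractions.items()}

# Hypothetical data: two neighbouring blocks with complete share counts.
neighbours = [
    {"alice": 60, "bob": 40},
    {"alice": 80, "bob": 20},
]
fractions = average_share_fractions(neighbours)
payouts = split_reward(100.0, fractions)
print(payouts)
```

Averaging fractions rather than raw share counts keeps one unusually long block from dominating the estimate, but that is a design choice in this sketch and not something the post states.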
Again, I apologize for this downtime. Please help me make Multipool even more successful so that I can leave this "day job" and hire some more support people.
Glad to hear, great news!
So I guess this means I didn't lose my shares for 15 hours of mining at 8MH/s and payment will follow later on today or tomorrow?
Good luck with the share counting, dude. I hope this is a fully or almost fully automated task :)