They also seem extremely flaky despite always having quality staff around. The hashrate graphs are usually all over the place. I have seen them operate at their advertised hashrate on occasion, though.
It isn't one or the other. "High quality datacenter with quality GPUs" and "shitty fpgas" are NOT the only two options. It can simply be shitty gpus in a shitty warehouse with no cooling.
It seems highly likely these are actual FPGAs or ASICs from a custom manufacturer that haven't hit the market yet. Maybe they figure they could make more simply by using them for 'private use' or selling them to one individual. Either way, given the miner market, it doesn't seem like miners would tolerate x11 going the way of scrypt. Perhaps they know that too, and instead of selling them and having everyone abandon the algo, they're trying to milk it.
This is irresponsible conjecture at best.
Each one of those miners would probably be around 120k in classic new GPUs, so it's not 'that' much if you have a lot of money to burn and don't really care if it's not recoverable.
I can build x11 GPU farms at $40/MH with the right parts, a third of what you quoted. And I've done it, too.
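A quick sketch of the arithmetic behind those two figures. The $120/MH quoted price is only what the "3x" comparison with $40/MH implies, and the 1000 MH/s miner size is a hypothetical value picked so that it lines up with the ~120k estimate; neither number is stated in the thread.

```python
def farm_cost(hashrate_mh, dollars_per_mh):
    """Hardware cost in dollars for a farm of the given x11 hashrate (in MH/s)."""
    return hashrate_mh * dollars_per_mh

# Assumptions, not figures from the thread:
MINER_MH = 1000       # hypothetical hashrate of one "miner", in MH/s
QUOTED_PER_MH = 120   # implied by "3x" the $40/MH build price
BUILD_PER_MH = 40     # the build price claimed above

quoted = farm_cost(MINER_MH, QUOTED_PER_MH)
built = farm_cost(MINER_MH, BUILD_PER_MH)
print(f"quoted: ${quoted}, self-built: ${built}, ratio: {quoted / built:.1f}x")
```

Swap in a real hashrate for MINER_MH if one is known; the ratio between the two prices stays the same either way.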
Even though the burden of proof is not on me, look at the following thread and subsequent quotes:
https://darkcointalk.org/threads/darkcoin-fpga-mining-co-op.836/page-4
It was determined that the sha3 candidates are too complicated for affordable fpgas. Even single hash pow functions using sha3 candidates can't be sufficiently unrolled to produce high enough hash rates. Implementing all 11 used in x11 would require a massive fpga, and the results would be poor. IIRC testing showed that skein running on a stratix board only managed to produce about 10% of the hashrate of a radeon 7950, so there really isn't any profit to be made unless you're already sitting on a large fpga farm of very expensive boards.
Space is at a premium on fpgas. Unlike gpus, which process the same instruction on many cores at a time, each part of the fpga only operates on one thing at a time, so you essentially have a stream of data flowing through the logic. To get good performance, the code has to be unrolled wherever possible; otherwise one piece of data loops through the same section multiple times and holds up the next. The 11 algos would be separate sections and would run in parallel like you were told; the problem is fitting them unrolled enough to get decent performance. You end up having to make major performance/size tradeoffs, and the performance goes to shit. There might be high-end fpgas that could fit it, but they would be extremely costly.
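To make the unrolling tradeoff concrete, here is a toy model, not a real fpga estimate. It assumes each algo is a loop of identical rounds, that unrolling a round costs a fixed amount of area, that the area budget is split evenly between the chained algos, and that at least one round of each algo always fits. Every number and name in it is hypothetical.

```python
def stage_throughput(clock_hz, rounds, unroll):
    """Hashes/s for one algo: each hash makes ceil(rounds / unroll) passes through the logic."""
    passes = -(-rounds // unroll)  # ceiling division
    return clock_hz / passes

def chain_throughput(clock_hz, stages, area_budget):
    """Hashes/s for chained algos sharing one chip.

    stages: list of (rounds, area_per_round) tuples, one per algo.
    The area is split evenly, and the slowest stage limits the whole chain.
    """
    per_stage = area_budget / len(stages)
    rates = []
    for rounds, area_per_round in stages:
        # Unroll as far as the area share allows, capped at full unrolling.
        # Assumes at least one round always fits, which may not hold in practice.
        unroll = max(1, min(rounds, int(per_stage // area_per_round)))
        rates.append(stage_throughput(clock_hz, rounds, unroll))
    return min(rates)

# Hypothetical chip: 100 MHz clock, 11 algos of 10 rounds each, 1 area unit per round.
algos = [(10, 1.0)] * 11
small = chain_throughput(100e6, algos, area_budget=11)    # room for 1 round per algo
large = chain_throughput(100e6, algos, area_budget=110)   # room to fully unroll every algo
print(small, large)
```

In this model the small chip loops each hash through the same logic ten times per algo, so it runs at a tenth of the rate of the chip that fits everything unrolled, which is the size/performance tradeoff described above.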
I don't know very much about asic designs, but I'm guessing you have more flexibility regarding code size, and also your cost per chip would be much lower, and you'd draw less electricity, so those problems likely wouldn't affect asics as much.
Space isn't just a problem for fitting all the algorithms on one board. Even the individual algorithms (512-bit sha3 candidates) get poor performance on their own, since they can't be unrolled enough and still fit on affordable boards.