In discussions about the Cloudflare disaster, I've seen people saying that protecting against denial of service attacks without Cloudflare is basically impossible. While it is much more difficult, it's not impossible.
First of all, there's the major issue of sites giving their HTTPS keys to Cloudflare. One of the main reasons that Cloudflare needs to see into HTTPS is so they can detect application-level attacks, such as slow or large HTTP requests, SQL injections, etc. These sorts of attacks can all be fixed server-side, and furthermore Cloudflare cannot fix these attacks in general. Relying on Cloudflare to protect against all such attacks is really "security through obscurity", and will not stand up against a serious attacker. If you handle application-level attacks on your own, then security-wise, there's very little reason for Cloudflare to be able to see into HTTPS. They should be able to work purely at the TCP/IP level.
(Looking into HTTPS is also needed for caching at the DDoS-protection server, but for dynamic web applications, caching can't be used all that much anyway. Cloudflare also uses their ability to man-in-the-middle HTTPS connections in order to display a CAPTCHA, but I think that this is unbelievably annoying and unnecessary.)
Aside: Cloudflare offers a piece of software which they claim allows you to use their service without giving them your HTTPS key. While it allows you not to give them your key, the software sends each HTTPS session key to Cloudflare, allowing them to do pretty much everything they could do if they had the key. Maybe it's a little better than nothing, but it's mostly a false sense of security.
Where DDoS providers are really useful is against attacks at the network and transport layers. Assuming application-level attacks are already addressed, the two most powerful DDoS strategies are:
- TCP SYN and UDP floods, which work by overwhelming the server's ability to track and process connections at a software level.
- Traffic floods, which work by overwhelming the networking hardware either leading into the machine or upstream of the machine.
Most sites today will be taken down by even a small SYN or UDP flood. However, up to a moderate size, these attacks can be mitigated quite efficiently by blocking UDP somewhere upstream of the server (most servers have no use for unsolicited UDP traffic) and having a SYNPROXY iptables setup on the server. If all sites did this and handled their application-level DoS attack vectors, then the common Cloudflare subscription types would be completely redundant, since Cloudflare will not protect against large attacks without a very expensive subscription anyway.
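The reason a SYNPROXY-style setup holds up under a SYN flood is that it keeps no per-connection state until the client proves it can receive packets at its claimed address. Here is a minimal Python sketch of that idea, using an HMAC as the cookie. It is not how the kernel encodes real SYN cookies (those also pack things like MSS information into the 32-bit sequence number); the function names, the secret, and the 64-second window are all assumptions for illustration.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)  # hypothetical per-server secret, never sent to clients
WINDOW = 64              # seconds per time slot (assumed value)


def _cookie(src_ip, src_port, dst_ip, dst_port, client_isn, slot):
    """Derive a 32-bit value from the connection details and a time slot."""
    msg = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{client_isn}|{slot}".encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")  # fits in a TCP sequence number


def make_syn_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, now):
    """Sequence number to put in the SYN-ACK. Nothing is stored for this SYN."""
    return _cookie(src_ip, src_port, dst_ip, dst_port, client_isn, int(now // WINDOW))


def check_ack(src_ip, src_port, dst_ip, dst_port, client_isn, ack_number, now):
    """True if the final ACK acknowledges a cookie issued in this or the previous slot."""
    slot = int(now // WINDOW)
    for s in (slot, slot - 1):
        expected = (_cookie(src_ip, src_port, dst_ip, dst_port, client_isn, s) + 1) % 2**32
        if expected == ack_number:
            return True
    return False
```

A spoofed SYN costs the proxy one hash and one outgoing packet, and the SYN-ACK goes to an address the attacker does not control, so the matching ACK never arrives. Only clients that pass `check_ack` would be forwarded to the real server.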
For large attacks, you need to handle each type separately:
- SYN/UDP floods: Have multiple SYNPROXY servers in front of the application server, or have several reverse proxy servers in front of the application server, or have multiple application servers which can be used in parallel. The main point of this is to filter out bad packets before they can clog up the application server, not to cache content.
- Traffic floods: Ensure that the complete path between the Internet and your outermost server(s) can handle the bandwidth of the attack. Oftentimes, some Ethernet switch just outside of your server will be the bottleneck. Again, having multiple outermost servers is beneficial because it splits up the attacker's traffic.
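The traffic-flood point comes down to simple arithmetic: a path carries only as much as its slowest link, and splitting the flood across several outermost servers divides the load on each path. A rough Python sketch follows; the link capacities in the usage note are made-up numbers, and the even split across servers is an assumption that real traffic will not match exactly.

```python
def path_capacity_gbps(link_capacities_gbps):
    """A path carries only as much traffic as its slowest link."""
    return min(link_capacities_gbps)


def survives_flood(attack_gbps, paths):
    """Check whether every path can absorb its share of the flood.

    paths holds one list of link capacities (in Gbit/s) per outermost server.
    Assumes the attack traffic splits evenly across those servers.
    """
    per_server = attack_gbps / len(paths)
    return all(path_capacity_gbps(path) >= per_server for path in paths)
```

For example, with a hypothetical path of `[100, 10, 1]` (transit link, router, and a 1 Gbit/s switch next to the server), `survives_flood(5, [[100, 10, 1]])` is false because of the switch, while eight such servers in parallel would each see 0.625 Gbit/s and hold up. Legitimate traffic needs headroom on top of this.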
These large attacks are where you really want a DDoS-protection company; while it is completely possible to set up the above-mentioned protection against large attacks by yourself, it is expensive and difficult. And because the actually-dangerous attacks are at the network or transport layer, there's really no reason for the DDoS protection company to need to see into the underlying HTTPS traffic (though most of them want to do this anyway...). However, although there are many DDoS protection companies that can protect you from large DDoS attacks, they are very expensive; if you can get it for less than a few hundred dollars per month, then the DDoS protection company is probably lying about what you're actually getting.
The sub-$200/month Cloudflare subscription that most people use is a cheap magic bullet for driving off only the weakest attackers, while also selling your soul to Cloudflare.
If I were going to design something like Cloudflare, I'd have it do just two things:
1. Across all sites hosted on the service, I'd monitor the behavior of client IPs to classify IPs as "probably a real person", "probably an attacker", or "new/unknown". Based on these classifications, I'd apply rate-limits and blocks per IP address in order to help sites screen out abusive traffic. This sort of wide view of IP address behavior is something that a service-provider for many sites can do better than a single site.
2. Using a bunch of Internet-facing servers and super-high-capacity connections, I'd block UDP and handle the TCP handshake on behalf of the real server, only forwarding TCP connections which complete the handshake. This increases latency somewhat, but probably not that badly, and it could activate only when a DoS attack actually seems to be happening. It could also take into consideration the IP address classifications of the previous point (keeping in mind that source IP addresses can be spoofed until the TCP handshake completes).
This'd protect against DDoS attacks without significantly hurting security.
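To make the first point a bit more concrete, here is a minimal Python sketch of per-IP classification feeding a token-bucket rate limit. Everything specific in it is an assumption: the class name `IpReputation`, the thresholds in `classify`, and the limits in `LIMITS` are placeholders, and a real service would base the classification on much richer behavior across many sites.

```python
from collections import Counter

# Hypothetical per-class limits: (tokens refilled per second, burst size).
LIMITS = {
    "trusted": (50.0, 100.0),
    "unknown": (5.0, 10.0),
    "attacker": (0.0, 0.0),  # effectively blocked
}


class IpReputation:
    def __init__(self):
        self.good = Counter()  # well-behaved requests seen per IP, across all sites
        self.bad = Counter()   # abusive requests seen per IP, across all sites
        self.buckets = {}      # ip -> (tokens left, time of last update)

    def record(self, ip, abusive):
        """Note one observed request from ip, flagged abusive or not."""
        (self.bad if abusive else self.good)[ip] += 1

    def classify(self, ip):
        good, bad = self.good[ip], self.bad[ip]
        if bad >= 5 and bad > good:
            return "attacker"
        if good >= 20 and bad == 0:
            return "trusted"
        return "unknown"

    def allow(self, ip, now):
        """Token bucket: spend one token per request, refill at the class rate."""
        rate, burst = LIMITS[self.classify(ip)]
        tokens, last = self.buckets.get(ip, (burst, now))
        tokens = min(burst, tokens + (now - last) * rate)
        allowed = tokens >= 1.0
        self.buckets[ip] = (tokens - 1.0 if allowed else tokens, now)
        return allowed
```

The `now` argument is a timestamp in seconds (for example from `time.monotonic()`). During a SYN flood this would only be trusted for addresses that have completed the handshake, since earlier packets can carry spoofed source addresses.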
It's bad that DDoS attacks are something that needs to be worried about. It is especially bad for anonymity, since currently websites must be able to block IP addresses in order to protect against application-level DDoS attacks; IP addresses are the only scarce resource that attackers can be forced to burn through. If a site allows connections via Tor, then during a DDoS via the Tor network, all of Tor is going to need to be blocked in order to block the attacker. If a site allows connections via a Tor hidden service, then during a DDoS via the hidden service, the site is going to have to shut down the hidden service. By design, there's no way of distinguishing good Tor users from bad Tor users, so if serious abuse is coming via Tor, you just have to block Tor entirely.
This is a serious flaw in the Internet which needs to be fixed. One solution would be to add some sort of cost to packets and/or connections, such as a proof-of-work or micropayment. Ideally, proof-of-work would be added to IP packets, and backbone routers would drop any packet whose PoW is insufficient, though I'm not sure how you'd coordinate the correct PoW difficulty. Maybe you could have a minimal PoW difficulty checked by backbone routers and then a much larger PoW difficulty at the destination.
Concerning Tor and similar darknets, I also think that it would be useful to have an HTTP extension for expensive, long-term proofs-of-work. For example, maybe you'd have to work on the PoW for several hours before you could post on a Tor-based forum, but then you could reuse that PoW at any later time on that site. This way, there'd be some cost associated with people having their account/PoW banned for abusive behavior: they'd have to recompute the PoW. This would replace the currently-common mechanism of banning IP addresses, which is impossible on Tor hidden services.
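As a rough illustration of the reusable proof-of-work idea, here is a hashcash-style sketch in Python. No such HTTP extension exists as far as this post is concerned, so the `Forum` class, the `site|identity|nonce` encoding, and the choice of SHA-256 are all assumptions. The difficulty in the usage note is tiny so the example runs instantly; hours of work would need far more bits.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Number of zero bits at the front of a hash digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def pow_digest(site, identity, nonce):
    # Binding the hash to the site name keeps the work from being reused elsewhere.
    return hashlib.sha256(f"{site}|{identity}|{nonce}".encode()).digest()


def mint(site, identity, bits):
    """Brute-force a nonce; expected cost is about 2**bits hashes."""
    for nonce in count():
        if leading_zero_bits(pow_digest(site, identity, nonce)) >= bits:
            return nonce


class Forum:
    """Hypothetical site that accepts a proof-of-work in place of IP-based bans."""

    def __init__(self, site, bits):
        self.site, self.bits, self.banned = site, bits, set()

    def accepts(self, identity, nonce):
        """Cheap check: one hash plus a ban-list lookup."""
        if (identity, nonce) in self.banned:
            return False
        return leading_zero_bits(pow_digest(self.site, identity, nonce)) >= self.bits

    def ban(self, identity, nonce):
        self.banned.add((identity, nonce))
```

A user would call `mint("example.onion", "alice", 12)` once and present the same nonce on every later visit. After `ban`, that nonce is worthless and the user has to pay the full minting cost again, which is the whole point.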
I've also been thinking recently that the whole structure of the Internet may be largely incorrect. If the Internet were a distributed data store like Freenet instead of a point-to-point system, DoS attacks would be impossible unless you managed to attack the entire network as a whole. It'd also be much easier to achieve anonymity, since all low-latency point-to-point anonymity systems such as Tor are terribly vulnerable to intersection and timing attacks, but these attacks don't really work in the distributed-data-store model.
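Here is a toy Python sketch of why a data store is harder to knock over than a single server: content is addressed by its hash, copies live on several nodes, and any surviving copy can be fetched and verified. This is not Freenet's actual protocol (its routing and key types are far more involved); the `Node` class, the placement rule, and the replica count are assumptions for illustration.

```python
import hashlib


class Node:
    """Hypothetical storage node; online=False models one that was attacked."""

    def __init__(self):
        self.blobs = {}
        self.online = True


def put(nodes, data, replicas=3):
    """Store data on several nodes under the SHA-256 hash of its content."""
    key = hashlib.sha256(data).hexdigest()
    start = int(key, 16) % len(nodes)  # simple deterministic placement (assumed)
    for i in range(min(replicas, len(nodes))):
        nodes[(start + i) % len(nodes)].blobs[key] = data
    return key


def get(nodes, key):
    """Fetch from any online node, checking that the content matches its key."""
    for node in nodes:
        if not node.online:
            continue
        data = node.blobs.get(key)
        if data is not None and hashlib.sha256(data).hexdigest() == key:
            return data
    return None
```

Taking down one or two of the nodes holding a page does nothing; the attacker has to take down every node holding a copy, and in a large network with caching that approaches attacking the whole network.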
I think that ~99% of things done on the Internet can, with work, be converted to the distributed data store model. Streaming content, static websites, and email are pretty easy. Dynamic Web applications like forums and Facebook-type sites can be done using static pages with fancy JavaScript applications which work within the distributed-data-store framework. (This would be quite a change in how people write web applications, though; there'd certainly be no way to quickly/easily translate some PHP code directly into the data-store model, for example.) The only thing that this model is not very good at is one-time one-to-few communication such as video/voice calling; for these few use-cases, the point-to-point model could still be used.