Hmmm... one of my machines is at 5% stales, the other is at 19.5% stales. I'm beginning to wonder if it is some sort of network congestion, since one is connected to a higher-priority switch on the network. Also, the one with lower stales is on cgminer 2.3.1; the one with higher stales is on the newer 2.3.2.
A couple of things. For p2pool you want to be on 2.3.3 or higher. Some of the older versions have a bug where cgminer ignores the submit-stale flag set by the pool.
Discarded % is irrelevant. It is the amount of work cgminer requested and then discarded BEFORE starting to work on it. It simply reflects the relationship between GPU hashing power and LP interval.
If your GPU is 300 MH/s (per-GPU speed is what matters) then it will complete a getwork in ~15s, but LP will occur earlier than that, so most queued-up work will never be started. You can reduce the amount of discarded work by setting the queue param to 1 (and threads = 1), but you will still have a lot of discarded work.
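For anyone who wants to check the ~15s figure for their own cards, here is a rough sketch of the arithmetic. It assumes a getwork covers the full 32-bit nonce range (2^32 hashes). The 10s LP interval is only an illustrative assumption, so plug in whatever your pool actually does. The function names are made up for this example and are not part of cgminer.

```python
# Rough arithmetic only; not part of cgminer or p2pool.
NONCES_PER_GETWORK = 2 ** 32  # one getwork spans the full 32-bit nonce range


def seconds_per_getwork(hashrate_mhs):
    """Time for one GPU to exhaust a single getwork at the given MH/s."""
    return NONCES_PER_GETWORK / (hashrate_mhs * 1e6)


def fraction_done_before_lp(hashrate_mhs, lp_interval_s):
    """Share of a getwork finished before a long poll (LP) arrives.

    lp_interval_s is an assumed average LP interval, not a measured value.
    """
    return min(1.0, lp_interval_s / seconds_per_getwork(hashrate_mhs))


if __name__ == "__main__":
    rate = 300.0          # MH/s, per GPU
    assumed_lp = 10.0     # seconds, illustrative assumption
    print(f"getwork time: {seconds_per_getwork(rate):.1f} s")
    print(f"done before LP: {fraction_done_before_lp(rate, assumed_lp):.0%}")
```

If the GPU needs longer than the LP interval to finish even one getwork, anything sitting in the queue behind it gets thrown away at the next LP, which is why the discarded count stays high no matter what.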
One other thing to consider: some routers have an issue with a high number of open connections (especially multiple rigs all running cgminer and opening dozens of simultaneous connections). Newer versions of cgminer help with this, but some routers still lag under the load (it is the number of connections, not bandwidth).