In case I wasn't clear enough: The difference is in that when I have to guess SHA256(K) I have to guess 256 pseudorandom bits. But when I only have to guess K, then maybe a dictionary is enough (or a short bruteforce run). Also, if I can guess P1 and thus get K = C1 ^ P1, I learn something about you and about how you choose your secrets. However, if all I get from it is SHA256(K), it is a lot less useful. These are clearly weaknesses, given that a tiny modification avoids them completely.
K[n] = SHA256(SHA256(k[n-1])+P[n])
... then you can't decrypt anymore.
The more variants you come up with per hour, the less attention (and relevant attempts of analysis) they will grab. Why not think about it for a week, attempt to crack it yourself, and when it stands, then post?
The difference is that you call "K" what I call "K'", and call "j" what I call "K", and you calculate j(n+1) = j(n)+1 while I stay nearer to the GP by calculating j(n+1) = SHA256(j(n)).
Using a counter is better for random access (think seek), and multithreaded implementation.
However the security properties are doubtful, because you mix the counter and the key material. The longer the counter range (=max message stream size), the lower the maximum key entropy. It's advisable to calculate SHA256(counter || key) instead of SHA256(counter). Counter is then just an IV or NONCE, and you can set it to 0 at the start of the stream instead of keeping it secret. You can also apply the "classic" knowlegde of cryptography. I.e. don't use any NONCE/KEY pair twice, may reveal NONCE but not KEY, etc.
There are many possible algorithms, each with different properties. It's a matter of what you expect from it, and the GP isn't particularily clear at this.