Is S is a string which remains constant over multiple iterations of the PoW function then yes, you can do the state-reusing-attack (Already used in bitcoin mining and termed the "mid-state").
But if we assume S to be a mix of previous block and input to the current PoW function and the different Ci being the different buffers that result from this particular input, we should be fine going with either SHA-256(S.Ci) or SHA-256(Ci.S) ... or do I miss something?
Yes, the attack I described later in the post. S is the buffer, which the attacker can keep constant.
Regarding the proof, SHA-256(Ci.S) has the same complexity as SHA-256(S). If Ci has constant length this should be "almost" equivalent to a pure SHA-256(S) with a starting-state that is different (and influenced by Ci).
You seem to be confusing SHA-256 the algorithm, with SHA-256 the function. There are an infinite number of algorithms which compute SHA-256 the function, but we assume none are supra-linear because if such existed, collisions could be produced.
Consider the following generalisation of the "mid-state" optimisation. Let T be a string of length(s+c). Suppose s bits of T are fixed, while c of them vary randomly. These c bits must be in well-defined (i.e., computable) positions within T. You lose nothing if you assume they are all a fixed distance from either the beginning or the end of the string. The idea is that we find a function f(T) which reads only the fixed bits of T and produces some fixed length output, and another function g(T,f(T)) which reads only the randomly varying bits of T, as well as the output of f. Finally suppose g(T,f(T)) = SHA-256(T) for all T.
If such f and g exit, then g is particularly well-behaved. the number of bits in its input that it actually reads does not depend upon s, so a machine with random access to memory will have an algorithm to compute it whose running time is O(1). A look-up table would work. The complexity of f will be at least O(s).
If we can find algorithms to compute f and g, then we compute f once, (It may take a long time), and g (which is quick) over and over, so to be provably secure, we need to prove one of the following statements:
1. No such f and g exist.
2. f is not computable.
3. No feasible algorithm for f exists
4. It is not feasible to find a feasible algorithm for f.
Can you prove any of these things? I can't.
Having said all that, this is of purely theoretical interest. An attacker does not need the additional speedup that this construction would give him. "My scheme" is dead already from the attacks I have already described.