The basic idea of mining is that you prove that you did something. You don't actually do anything useful while mining: you don't solve any problems or anything like that. You are just doing work that may produce a result which others can verify, and from which they can see how difficult it was to get (on average). That's needed because without that mechanism anyone could create thousands of blocks per second and the blockchain would be useless.
That's accomplished by hashing data. Hash functions are rather complicated internally. Just think of one as a machine that takes data of any size and gives you a fixed-length result. If you give the machine the same input again, it will give you the same output again. We are unable to compute the input from the output of a hash function (at least if it's a good one like SHA256).
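If you want to see the "machine" in action, here is a minimal sketch using Python's standard hashlib module. The helper name sha256_hex and the sample inputs are just made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 10_000)

# Both digests have the same length (64 hex characters = 256 bits),
# no matter how long the input was.
print(len(short_digest), len(long_digest))

# Same input, same output.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))

# A tiny change in the input gives a completely different digest.
print(sha256_hex(b"hello"))
print(sha256_hex(b"hellp"))
```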
For the sake of simplicity, let's say a block only contains some transactions, the bitcoin address of whoever gets paid the block reward, and some random data.
A block is considered valid if its hash is lower than a certain value (the target), which is determined by the current difficulty.
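As a rough sketch, the validity rule could look like this in Python. The function names and the single SHA-256 over raw bytes are assumptions for the simplified block described here. Real Bitcoin hashes a block header (and does so twice), so treat this as a toy.

```python
import hashlib

def block_hash_as_int(block_bytes: bytes) -> int:
    """Interpret the SHA-256 digest of the block as one big integer."""
    digest = hashlib.sha256(block_bytes).digest()
    return int.from_bytes(digest, "big")

def is_valid_block(block_bytes: bytes, target: int) -> bool:
    """A block counts as valid if its hash is lower than the target."""
    return block_hash_as_int(block_bytes) < target

toy_block = b"some transactions|payout address|random data"

# With the largest possible target every block passes,
# with a target of 0 no block can ever pass.
print(is_valid_block(toy_block, 2**256))
print(is_valid_block(toy_block, 0))
```

The lower the target, the fewer hashes fall below it, so the more tries you need on average.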
Now let's get back to your understanding:
We don't need to try all variants of 4 specific bytes. We can change as much data in the block as we want, and we don't have to go through everything. No need to try 1 then 2 then 3 then 4 etc.
You can change the nonce or the extra nonce, you can add more transactions, or you could change the payout address. Any of these changes the hash, since it's all part of the input that goes to the hash function. Usually the nonce and extra nonce are incremented until a solution is found, but you don't have to do it like that.
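Here is a minimal sketch of the usual "increase the nonce until it fits" approach, for the simplified block from above. The serialization in toy_block, the names, and the very easy TOY_TARGET are all made up so that it finishes in a moment. It is not how a real miner lays out a block.

```python
import hashlib
from itertools import count

# Hypothetical easy target: roughly 1 in 4096 hashes falls below it.
TOY_TARGET = 1 << 244

def toy_block(transactions, payout_address, nonce):
    """Serialize the simplified block into bytes (made-up format)."""
    return "|".join(transactions + [payout_address, str(nonce)]).encode()

def toy_hash(block):
    """SHA-256 of the block, read as an integer."""
    return int.from_bytes(hashlib.sha256(block).digest(), "big")

def mine(transactions, payout_address, target=TOY_TARGET):
    """Try nonce 0, 1, 2, ... until the block hash is below the target."""
    for nonce in count():
        if toy_hash(toy_block(transactions, payout_address, nonce)) < target:
            return nonce

txs = ["alice->bob:1", "carol->dave:2"]
nonce = mine(txs, "my-address")
print("found nonce", nonce)

# Any other part of the block changes the hash too, e.g. the payout address.
mine_hash = toy_hash(toy_block(txs, "my-address", nonce))
other_hash = toy_hash(toy_block(txs, "other-address", nonce))
print(mine_hash != other_hash)
```

Counting upwards is only a convenience. Picking nonces at random, or changing the transactions instead, would work just as well.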
Well, if you're solo mining you don't need to care: you are using your very own payout address, so other people are giving a different input to the hash function even if they use the same nonce.
But what if someone tried that very combination already somehow?
That's so unlikely that we don't consider it a possibility. Imagine leaving your home with your shovel and teleporting yourself to a random spot on a random planet in the universe. How likely is it that someone else has already dug there? It's pretty much the same with your bitcoin block.
If you are mining in a pool, the pool gives you the work. It can give you a certain nonce range, or it could give every member a different payout address (which is still owned by the pool), or it could add extra transactions for every member. Get creative.
Imagine you found a solution. Now you change the payout address in the block to get the bitcoins for yourself. You have now changed the block and therefore the input of the hash function -> your solution doesn't fit that input anymore -> it's an invalid block.
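You can try the "steal the block" idea yourself with a small sketch. The block format, the address strings and the TARGET are assumptions for illustration only.

```python
import hashlib
from itertools import count

# Hypothetical target: about 1 in 65536 hashes is below it.
TARGET = 1 << 240

def block_hash(payout_address, nonce):
    """Hash a made-up block that contains the payout address and a nonce."""
    data = f"some transactions|{payout_address}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

# Search for a nonce that works with the pool's payout address.
nonce = next(n for n in count() if block_hash("pool-address", n) < TARGET)

honest = block_hash("pool-address", nonce)
stolen = block_hash("my-own-address", nonce)  # same nonce, address swapped

print("pool block valid:  ", honest < TARGET)
print("edited block valid:", stolen < TARGET)
```

The first line prints True. The second one is a fresh lottery ticket: with this toy target it has about a 1 in 65536 chance of being valid, and with a realistic target the chance is practically zero.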
You submit solutions to the pool which aren't real ones. They are easier to find, but you still give them to the pool to prove that you are working. In themselves they are useless; their only purpose is to prove that you should get something if anyone else finds a real block.
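The difference between an easy share and a real block is just which limit the hash stays under. A minimal sketch, where both target values are hypothetical numbers picked for illustration:

```python
# Hypothetical targets: the pool's share target is much higher (easier)
# than the network's block target.
NETWORK_TARGET = 1 << 200
SHARE_TARGET = 1 << 240

def classify(hash_value: int) -> str:
    """Say what a hash is worth: a real block, only a share, or nothing."""
    if hash_value < NETWORK_TARGET:
        return "block"    # low enough for the network (and for the pool)
    if hash_value < SHARE_TARGET:
        return "share"    # only proves to the pool that you are working
    return "nothing"

print(classify(1 << 100))  # block
print(classify(1 << 220))  # share
print(classify(1 << 250))  # nothing
```

Every real block is automatically a share as well, since it is below the easier limit too.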
If someone finds a real block (it contains the payout address of the pool and the user is unable to change it, see question 2), he can send it directly to the bitcoin network or to the pool, which will send it to the network. That block earns the pool a reward, which it distributes to all the pool members according to how many of the easier shares it received from each member. (That's why those shares are needed: they have no value except to show the pool that a member deserves something.)
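One simple way a pool could split the reward is strictly in proportion to the submitted shares. This is only a sketch under that assumption. The function name, the member names and the amounts are invented, and real pools may use other payout schemes and take fees.

```python
from typing import Dict

def split_reward(reward: int, shares: Dict[str, int]) -> Dict[str, int]:
    """Split reward (in the smallest currency unit) in proportion to shares.

    Integer division rounds down, so a few units may stay with the pool.
    """
    total = sum(shares.values())
    if total == 0:
        return {member: 0 for member in shares}
    return {member: reward * n // total for member, n in shares.items()}

# Hypothetical numbers: three members and a reward of 1,000,000 units.
payouts = split_reward(1_000_000, {"alice": 600, "bob": 300, "carol": 100})
print(payouts)  # {'alice': 600000, 'bob': 300000, 'carol': 100000}
```

Note that it doesn't matter which member found the real block. Only the number of shares counts.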
It's paid to the payout address contained in the block. That address doesn't have to be the solver's address; it could also be your grandma's or some address you found on the internet. If that sounds stupid to you, that's exactly how the reward in pooled mining goes to the pool, see question 3.
I have to go now so I'm unable to check my post for typos etc.
Sorry.
I hope it's useful to you though.