Yes. This mechanism is commonly called "proof of work".
It consists of:
- All participants agreeing to only accept blocks that have a hash value less than the current target
- All participants agreeing to adjust that target after every 2,016 blocks
- All participants agreeing that the new target value is the previous target scaled by the ratio of the time the previous 2,016 blocks actually took to the expected 20,160 minutes
- All participants agreeing that a single adjustment can raise the target to no more than four times the previous target, and lower it to no less than one-fourth of the previous target.
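The adjustment rules in the list above can be sketched in a few lines of Python. This is a simplified illustration, not the actual node code: the function name `next_target` and the use of minutes are my own choices, and real implementations work in seconds, store the target in a compact encoding, and also cap it at a network-wide maximum, all of which is left out here.

```python
EXPECTED_MINUTES = 20_160  # 2,016 blocks at 10 minutes each


def next_target(old_target: int, actual_minutes: int) -> int:
    """Return the new target after a 2,016-block period (simplified sketch)."""
    # Limit the measured time so the target can change by at most a factor of 4.
    clamped = max(EXPECTED_MINUTES // 4, min(actual_minutes, EXPECTED_MINUTES * 4))
    # Blocks came too fast -> ratio below 1 -> smaller target -> harder to hit.
    return old_target * clamped // EXPECTED_MINUTES


# Blocks took half the expected time, so the target is halved.
print(next_target(1_000_000, 10_080))
```

Note that a smaller target means higher difficulty, so "blocks came too fast" shrinks the number.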
Mining bitcoin has two functional pieces. The first is to build the unsolved block header. This is typically accomplished by mining pools. They choose a set of transactions, calculate a merkle root, connect the header to the previous block by including the hash of that block in the current header, and set a timestamp. The second piece is to find a hash for that block header which is lower than the current target. Mining pool participants running ASIC equipment typically perform this work for the mining pool. The pool hands off the headers to the participants, and they repeatedly adjust the nonce value and hash the resulting header to see if they end up with a low enough hash value. As soon as a participant finds a low enough hash value, the block is broadcast to the entire network so work can begin on the next block.
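Here is a toy version of that second piece, the nonce search. It is only a sketch: `header_prefix` stands in for the rest of a real 80-byte header (version, previous block hash, merkle root, timestamp, and encoded target), the function name `mine` is made up for illustration, and the target used is far easier than anything the real network would accept. The double SHA-256 and the little-endian interpretation of the hash do match what Bitcoin uses.

```python
import hashlib


def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the double SHA-256 of the header is below target."""
    for nonce in range(max_nonce):
        # Append the 4-byte nonce to the rest of the (toy) header.
        header = header_prefix + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        # Treat the 32-byte hash as one big number and compare it to the target.
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    # Nonce space exhausted: a different header is needed.
    return None


# Very easy toy target: roughly 1 hash in 4,096 qualifies.
result = mine(b"toy header", 2**244)
if result is not None:
    nonce, digest = result
    print(nonce, digest[::-1].hex())
```

Returning `None` corresponds to the case where a header has no valid solution and the pool has to build a different one.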
Before mining? I don't understand the question. That's a bit like asking if swallowing is required before eating. It is part of the process. Mining is the process of finding a hash value that is less than the current difficulty target. Without a difficulty target, there is nothing to find a hash lower than.
That depends on what you mean when you say "the puzzle".
If, when you say "the puzzle", you mean: Finding a valid hash value for the block header (one that's lower than the current target value). That never changes. Finding valid hash values has been the puzzle since the first block and continues to be the puzzle for as long as Bitcoin exists.
If, when you say "the puzzle", you mean: The current target value that the hash value must be smaller than. That changes every 2,016 blocks (which, at approximately 10 minutes per block, works out to be just about every 2 weeks).
If, when you say "the puzzle", you mean: The exact nonce value that will result in the next block being valid. That changes thousands of times per second. Mining pools are constantly creating new block headers, looking for one that can hash to a low enough value. Each block header is slightly different from the others. Most of the headers generated have no valid solution and are simply forgotten once the pool participants have tried every possible nonce and determined that a different header will be needed. Mining pool participants worldwide attempt a combined total of more than 300,000,000,000,000,000,000 solutions every second. While some solutions are found after only a few tries, and some take many, many more, the average number of attempted "puzzles" needed for each solved block every 10 minutes or so is more than 180,000,000,000,000,000,000,000.
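If you want to check where that second number comes from, it is just the hash rate multiplied by the roughly 600 seconds between blocks. The variable names below are mine, and the hash rate is the figure quoted above, which changes over time.

```python
hashes_per_second = 300 * 10**18   # "more than 300,000,000,000,000,000,000" per second
seconds_per_block = 10 * 60        # roughly 10 minutes between blocks

attempts_per_block = hashes_per_second * seconds_per_block
print(f"{attempts_per_block:,}")   # 180,000,000,000,000,000,000,000
```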
Imagine I give you a device that generates a random number between 1 and 1,000 once every second. Now imagine that I tell you that you can "succeed" if your device generates any number that is less than 900. You'll "succeed" pretty quickly since MOST of the numbers that you generate will be less than 900. Now, imagine that I adjust the target difficulty and tell you that you can only succeed if the number you generate is less than 500. Sometimes you'll "succeed" right away. Sometimes it will take a bit longer, because some of your random numbers will be 500 or larger. The AVERAGE amount of time per success will be longer than it was when the target was 900. Now imagine that I adjust the difficulty again and tell you that you'll only "succeed" when your random number is less than 100 (or less than 10, or less than 3). Perhaps you can see that as the target number gets smaller, the average amount of time that it takes to "succeed" increases.
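You can try that thought experiment yourself with a short simulation. This is just an illustration of the analogy, not anything from Bitcoin itself; the function name `average_tries`, the number of trials, and the fixed seed are arbitrary choices of mine.

```python
import random


def average_tries(threshold: int, trials: int = 2000, seed: int = 1) -> float:
    """Average number of draws from 1..1000 until one is below threshold."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tries = 1
        # Keep drawing until the device produces a number below the threshold.
        while rng.randint(1, 1000) >= threshold:
            tries += 1
        total += tries
    return total / trials


for threshold in (900, 500, 100, 10):
    print(threshold, round(average_tries(threshold), 2))
```

The averages should land near 1000 / (threshold - 1), so each time the threshold drops, the expected wait grows.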
This is how the mining difficulty regulates the period of time between blocks. The target number is somewhere between 0 and 115792089237316195423570985008687907853269984665640564039457584007913129639936 (that is, 2^256). If the blocks are coming too fast, then the target is moved closer to 0. If the blocks are coming too slowly, then the target is moved closer to 115792089237316195423570985008687907853269984665640564039457584007913129639936. The exact size of the adjustment is set by the ratio of the actual amount of time the previous 2,016 blocks took to the expected 20,160 minutes.
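To connect this back to the random number device: a hash behaves like a random number between 0 and 2^256 - 1, so the chance that any single hash falls below the target is target / 2^256, and the average number of hashes per success is the inverse of that. A rough sketch, with function names of my own choosing:

```python
HASH_SPACE = 2**256  # number of possible SHA-256 outputs


def success_probability(target: int) -> float:
    """Chance that one hash attempt lands below the target."""
    return target / HASH_SPACE


def expected_hashes(target: int) -> float:
    """Average number of hash attempts needed per valid block."""
    return HASH_SPACE / target


# Halving the target doubles the expected work.
print(expected_hashes(2**224))  # 4294967296.0
print(expected_hashes(2**223))  # 8589934592.0
```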
Hope that helped.