i.e. instead of the metric being "is a real person completing the task", you could sidestep the hassle of involving cameras and just change the metric to "is the task being completed to a satisfactory level", which is much easier to check and won't involve messy privacy stuff.
I appreciate your continued interest; it's helping me flesh out ideas.
I need to verify that the user is unique - they cannot complete the task successfully if I don't know they are unique. When I get into mediation, for example - a person could paste an address that isn't correct. The system won't know, but a person can look at the evidence and say "yes, that is his address" or "no, that is not his address". If you get one person running dozens of tasks, he can easily skew the results in his favor.
If I want to be able to pay per hour, I need to make sure the person is the same as they were before, and that they are in front of the computer.
Ok, I get you. You need to know that each user is unique as they are completing tasks.
Just as a suggestion, there are two points at which you can do this. The 1st is the point you're looking at now, at the time of the task being completed. Using a camera you can see that they are indeed someone unique. You do this every time they complete a task.
The 2nd option, though, is checking that they are unique a single time, before they are eligible for tasks, and then attaching that check to their account. Logically, if they wanted to set up lots of accounts, the subsequent attempts would fail because each new account has to pass the same one-time check.
The way many platforms achieve this is with a unique government ID, plus a photo with a message on a piece of paper. Once they complete that and send it to you, you can assign "that face" to "that account" and never have to worry about it again.
The reason this is superior is two-fold: one, people are used to doing this already with other platforms so it sounds way less weird than having a camera watching them. No one will bat an eye if you ask for a photo with their passport and a message saying "vod's website name, date, signature".
Two, it's computationally less expensive for you, and it reduces processing time as well. Incorporating cameras into your system means they have to run autonomously at the right time and record things for you to review later, or, even worse (computationally), you add some kind of deep learning model that checks people's faces on the fly each time they complete a task. That adds latency and more steps that have to complete. It's a huge reduction in complexity to check once, store that image, and then, every time a new person creates an account, run the deep learning model on their image against all the images you already have stored for existing accounts.
Besides, it's a super low-effort method of achieving this. You can use file uploading and an extra page in your workflow, and you can even run the deep learning model for face recognition/comparison on an AWS instance without needing to connect it to the site's own system. The alternative is building camera systems into your processes and then further complicating things with on-the-fly facial recognition (unless you're going to do it all manually, which is a bad idea for a few reasons, and would be easier with the system I suggested anyway).
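To make the check-once idea concrete, here's a rough sketch in Python. It assumes each uploaded photo has already been turned into an embedding vector by whatever face-recognition model you pick (that part isn't shown). The function names and the 0.8 threshold are made up for illustration, so treat it as a starting point rather than the way to do it.

```python
import math

# Assumption: some face-recognition model (not shown) has already converted
# each verification photo into a fixed-length embedding vector.
SIMILARITY_THRESHOLD = 0.8  # hypothetical cutoff; needs tuning on real data


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicate_account(new_embedding, stored_embeddings,
                           threshold=SIMILARITY_THRESHOLD):
    """Return the id of the best-matching existing account, or None."""
    best_id, best_score = None, threshold
    for account_id, embedding in stored_embeddings.items():
        score = cosine_similarity(new_embedding, embedding)
        if score >= best_score:
            best_id, best_score = account_id, score
    return best_id


def register_account(account_id, new_embedding, stored_embeddings):
    """One-time check at signup: store the face only if it looks new."""
    duplicate = find_duplicate_account(new_embedding, stored_embeddings)
    if duplicate is not None:
        return False, duplicate  # face already belongs to another account
    stored_embeddings[account_id] = new_embedding
    return True, None
```

The comparison only runs when an account is created, never while a task is being completed, so it can live on a separate machine from the site. In practice you'd probably want a person to review anything flagged as a match instead of rejecting it automatically.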