In our country, our government is banning crypto, so if I posted my information here and if tomorrow some government agency wanted to track me down - then your tool is going to enable them to do that. And possibly land a person in jail for 10 years - just for being associated with crypto.
Now, I'd like to hear your thoughts about these 2 situations. What do you intend to do to avoid / solve such issues?
There are limits on how many tweets can be distributed to any one entity per day and per month. The daily number of tweets that can be sent to a third party is large (50,000), but it is a small fraction (about 0.01%) of the total tweets posted every day (500 million). If someone like DPR had been posting on Twitter instead of bitcointalk, they probably would still have gotten caught, while a HK protester would probably be safe on Twitter. If that same protester were posting on bitcointalk, however, the HK government (a sockpuppet of the Chinese government) might be able to investigate them in the way legendster describes.
LoyceV did not invent the scraping of forum posts, nor is he the only one actively scraping them. There are many ways to download forum posts via automated means, and it is not difficult to get posts into a DataFrame that can later be analyzed.
The root cause of the problem is that the administration allows too much access to posts. A straightforward solution is to limit how many page views an individual IP address/range can make on an hourly and daily basis, with the limits set somewhat above what a *person* would see in the normal course of reading, but well below the number of page views required to view all posts. The scraping of posts for non-academic use should also be explicitly prohibited by the administration.
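
To make the rate limit idea concrete, here is a rough sketch of how a per-IP hourly and daily cap could work. This is only an illustration, not how the forum software actually does anything: the class name `PageViewLimiter`, the `allow` method and the limit values are all made up for the example, and a real deployment would need to pick limits from actual reading patterns and decide how to group IP ranges.

```python
import time
from collections import defaultdict, deque

HOUR = 3600          # seconds
DAY = 24 * HOUR      # seconds


class PageViewLimiter:
    """Hypothetical sliding-window limiter keyed by IP address or range."""

    def __init__(self, hourly_limit, daily_limit, clock=time.time):
        # Limit values are assumptions; they are not taken from any real forum.
        self.hourly_limit = hourly_limit
        self.daily_limit = daily_limit
        self.clock = clock
        # key -> timestamps of page views that were allowed
        self.views = defaultdict(deque)

    def allow(self, key):
        """Return True and record the view if the key is under both limits."""
        now = self.clock()
        stamps = self.views[key]

        # Forget views that are a day old or older.
        while stamps and now - stamps[0] >= DAY:
            stamps.popleft()

        # Daily cap: everything still in the deque is within the last day.
        if len(stamps) >= self.daily_limit:
            return False

        # Hourly cap: count only the views from the last hour.
        recent = sum(1 for t in stamps if now - t < HOUR)
        if recent >= self.hourly_limit:
            return False

        stamps.append(now)
        return True
```

The forum would call something like `limiter.allow(ip)` before serving each page and return an error page when it comes back `False`. A normal reader never gets near the cap, while someone trying to pull every post runs into it almost immediately.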