Just found this project as I was searching through and I know Theymos used to get staff to mod the patrol links.
As far as I know, that's the reason patrol exists. But it's a terrible amount of messages, mostly low quality, and much more than any human should deal with. At best, you can check a very small percentage.
And that was part of what they had to do. They were special staff, separate from the moderators, as they only had the ability to delete replies from newbies (I think) and not from higher ranks (though I'm not entirely sure what the patrol link does).
The patrol page shows all recent posts made by Newbies. That's about it.
As far as I know, all Mods can nuke Newbies. (Nuking them means a permban that deletes all their posts)
If you downloaded the entire patrol pages, can I have a copy?
I download patrol every 5 minutes. So far I've seen at least 10 minutes history on patrol, so several of my downloads overlap. I don't keep them permanently, I have 2856 files now (10 days history, 1.5 GB, 200 MB compressed). I can upload them somewhere if you're interested. Just so you know: it's way too much data to process manually!
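Since the 5-minute downloads overlap, the same post will turn up in several files. A minimal sketch of merging them, assuming each file has already been parsed into (post_id, text) pairs; the parsing step and the names `merge_snapshots` and `post_id` are hypothetical and not taken from any existing script:

```python
def merge_snapshots(snapshots):
    """Merge overlapping patrol snapshots into one dict of unique posts.

    `snapshots` is an iterable of lists of (post_id, text) tuples,
    ordered oldest first. The first copy seen of each post is kept.
    """
    merged = {}
    for snapshot in snapshots:
        for post_id, text in snapshot:
            # setdefault keeps the earliest copy and ignores later duplicates
            merged.setdefault(post_id, text)
    return merged


if __name__ == "__main__":
    first = [(101, "hello"), (102, "nice project")]
    second = [(102, "nice project"), (103, "to the moon")]
    print(len(merge_snapshots([first, second])))
```

Keeping the first copy is only one choice; keeping the last copy instead would pick up edits made between downloads.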
Yeah, I was going/hoping to automate everything in Python or C# (C# is faster). I think (when I was exploring with SMF 2 years ago) every post head is in a similar HTML structure.
I want to run analytics on it to view stats on small posts (posts with fewer than 5, 10, 20, 30, 40 and 50 chars, to check how insubstantial they are and to see if there is anything else that can be gathered from that trend) if you want to do it yourself
I highly doubt this will produce anything useful, but by all means: prove me wrong! It would be interesting to see.
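For what it's worth, the short-post count itself is only a few lines once the post text is extracted. A rough sketch, assuming the posts are already plain strings; the function name `short_post_counts` is made up for illustration, and the thresholds are the ones mentioned above:

```python
THRESHOLDS = (5, 10, 20, 30, 40, 50)


def short_post_counts(posts, thresholds=THRESHOLDS):
    """Count how many posts are shorter than each threshold.

    Lengths are measured in characters after stripping surrounding
    whitespace. The counts are cumulative: a 3-character post is
    counted under every threshold.
    """
    lengths = [len(post.strip()) for post in posts]
    return {t: sum(1 for n in lengths if n < t) for t in thresholds}


if __name__ == "__main__":
    sample = ["ok", "nice project", "x" * 45, "x" * 60]
    for threshold, count in short_post_counts(sample).items():
        print(f"under {threshold} chars: {count}")
```

Character counts alone say nothing about quoted text, links or images, so this is only a first pass at "insubstantial".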
(or if you know a way of downloading them from archive.org as this forum's data ends up there daily as a mirror but I think it's just the public stuff).
Archive.org won't archive patrol as often as it regenerates.
That part is purely an interest thing. I'm not sure where you'd put it. I think most suggest pastebin for plain text and I assume it's raw html that you have gathered. Is there a script you have used to gather this?
I know archive.org won't be that quick either, but I don't really want to kill the forum's server by sending pings every 5 minutes. It's also a user-only privilege as far as I can gather. It would be interesting, though, to try to download and parse every post on the forum as a whole to check it. I'm still trying to work out how to download from archive.org to do this. I did have a script that attempted to clone every page on this forum directly from the forum's server (starting with 1), but it didn't incorporate the 1 second request limit, so it just kept on getting a 502/503 server too busy page. It also decided to clone every link in pages because it was programmed by me, so I had a copy of everyone's signature websites for the first few signatures it picked up.
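The missing piece in that script was the pause between requests. A minimal sketch of a fetch loop that waits at least 1 second between requests and backs off on 502/503, using only the Python standard library; the function name, the retry count and the doubling back-off are my own assumptions, not anything the forum documents:

```python
import time
import urllib.error
import urllib.request


def fetch_politely(urls, min_interval=1.0, max_retries=3,
                   opener=urllib.request.urlopen, sleep=time.sleep):
    """Yield (url, body) pairs, pausing min_interval seconds after each page.

    A 502 or 503 response is retried up to max_retries times, waiting
    longer before each retry. Any other HTTP error is raised as-is.
    `opener` and `sleep` are parameters only so they can be swapped out.
    """
    for url in urls:
        for attempt in range(max_retries + 1):
            try:
                with opener(url) as response:
                    body = response.read()
                break
            except urllib.error.HTTPError as err:
                if err.code not in (502, 503) or attempt == max_retries:
                    raise
                # Server too busy: wait 2x, 4x, 8x the base interval
                sleep(min_interval * 2 ** (attempt + 1))
        yield url, body
        sleep(min_interval)
```

This only fetches the URLs it is given, so it won't wander off into signature links the way the old script did.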
Still, it would be interesting to get these general stats - especially for the patrol where everything starts (as a newbie).
The special staff members were ones who weren't moderators and were just regular members (sort of). They weren't assigned to a specific board, which moderators generally are, and were assigned to "Patrol" the forum as a whole. I think they could hand out tempbans but not completely nuke accounts.
EDIT:
Found these two:
https://bitcointalksearch.org/user/gekkondev-1835142
https://bitcointalksearch.org/user/punipuni7-992109
These were within the first 200 objects on patrol. I was just quickly scrolling and thought they looked a bit interesting: essentially the same thing written on all of them.
Also went back for a little nostalgia and:
https://bitcointalksearch.org/user/jackg-543626 - not particularly short and spammy, however I did remember that part where I picked on -ck at the start (just interesting to look back on - and some of it was a bit spammy, or where I recommended a ponzi that stole all the coins I had then, which wasn't that many). I was really after a comparison to these newbies' posts. Also, it is worth noting that I signed up here 3 months after starting with Bitcoin.