I scraped the
earned merit data of 315 people participating in bitcoin paying signature campaigns, just for fun and to have a comparison versus my
earlier dataset of 503 altcoin paying signature campaign participants.
The raw data is available in the format
Username | Earned Merit | Profile URL. The data is separated by campaign name, so you could compare between campaigns if you wish.
Data setData was obtained from 15 campaigns. I aimed to scrape all of them, but was not able to due to formatting, or copying restrictions. Some campaigns blocked copying, so I respected their wishes and didn't take their data. Those campaigns were:
- Crypto-Games.net (oddly, all other Lutpin run campaigns were scrape-able)
- NitrogenSports
- Coinpayments
Other campaigns did not have a public dataset:
- YoBit.net
- Coinroll
Finally, some campaigns had weird formatting, making it too time consuming for me to grab the profile data. Those campaigns were:
- BitDice
- OneHash
- FortuneJack
There are potential errors in the data, especially for campaigns where "Profile links may be incorrect" is noted on the spreadsheet. Campaigns without that string had their profile links taken directly from what the manager had, while campaigns with the string did not have links
[1]. Those without links had their profile link scraped
[2], which could result in errors.
Statistics - Average of 16.2 merits each
- Median of 4 merits
- Excluding the 75 people with no merits, the average person earned 21.3 merits
Possible Interpretations- People participating in Bitcoin paying campaigns generally have a better post quality than those without a campaign or in altcoin bounties, as managers are usually very picky about what they accept. You could infer that the merit system is working, as people with good posts are getting many merits.
How?I'm scraping merit data using Google Sheets, with this formula:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(MID(IMPORTXML(D2,"//*[@id=""bodyarea""]"), FIND("Merit:", IMPORTXML(D2,"//*[@id=""bodyarea""]")) + 6, 4), "P", ""), "o", ""), "s", "")-LOOKUP(MID(IMPORTXML(D2,"//*[@id=""bodyarea""]"), FIND("Position", IMPORTXML(D2,"//*[@id=""bodyarea""]"))+9, 2),{"Co","Fu","He","Jr","Le","Me","Ne","Sr"},{"0","100","500","0","1000","10","0","250"})-0
It grabs their merits, subtracts how much merit they would have been given at the start of the system based on current rank, and returns the result. Does not work if they have ranked up, ranked down, are Hero Members who were given 1000 initally or with Copper Members. You need to subtract/add the correction manually, and it's pretty easy to spot errors. This formula grabs their current merit count:
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(MID(IMPORTXML(D2,"//*[@id=""bodyarea""]"), FIND("Merit:", IMPORTXML(D2,"//*[@id=""bodyarea""]")) + 6, 4), "P", ""), "o", ""), "s", "")
Be warned: You will get rate limited eventually. My original plan was to grab 1000 users worth of data, but it was getting really slow.
The
D2 in the formula is a link to the user's profile. Data will automatically update as long as the formula is present.
[1] Legit question, why? Seems kinda annoying to do post counting with that
[2] Scraped using a DuckDuckGo
[3] with the below formula. A3 represents the username.
=IMPORTXML("https://duckduckgo.com/?q=%22View+the+profile+of+"&A3&"%22+site%3Abitcointalk.org", "(//div)[9]")
[3] Google wouldn't let me scrape Google search, so DuckDuckGo it is!
Side note: Bing search is absolute trash.[4][4] I literally searched for "View the profile of XXXX", in quotations so that it returns exact phrase matches, yet I still got completely unrelated results very often