
Topic: Additional sMerit Analysis (smerit.txt) (Read 255 times)

legendary
Activity: 2338
Merit: 10802
There are lies, damned lies and statistics. MTwain
April 17, 2018, 12:44:59 PM
#6
<...>
So, if I'm not wrong, sending merit seems to be limited to a few users. I mean: in the forum, there are almost 2 million users (according to the stats from last week). Even though your analysis was made a month ago, the number of users sending/receiving merit is not comparable to the total.
So, can it show us how many users are truly active?
<...>

Phew, at last an easy one...

Based on the complete merit file with all awarded sMerit until the 13/04/2018, there are:

Nº of sMerit awarded in a Tx:                   131.060   
Nº of distinct users that Sent sMerit:         12.518   
Nº of distinct users that Received sMerit:   13.807

The count of forum accounts created just went over 2 million, as you say, but many of them are bot-created and even more are inactive.
According to Vod's website (http://dev.martinlawrence.ca/bpip/), there are somewhere in the range of 700-800K active users, if my memory doesn't fail me (Vod's website is being updated right now, so we'll have to wait to see his count today).

The definition of an active user can vary depending on what one considers (I can't see Vod's criteria now due to the update, but it's something like having logged in within the last 3 months). There are apparently many bot accounts, though.

All in all, the proportion of merited users versus active accounts is very small. That is why in some of my posts this week I suggested that sMerit awarding needs to be incentivized, since there seem to be many active accounts that are not in the game.
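
For a sense of scale, the proportion can be worked out from the figures quoted in this post. This is only a back-of-the-envelope sketch: the 2 million and 700-800K values are the rough numbers mentioned above (the latter from memory), not exact counts, and `pct` is just a helper name chosen for the example.

```python
# Figures quoted in this post; the account totals are rough, not exact counts.
receivers = 13_807                 # distinct users that received sMerit
senders = 12_518                   # distinct users that sent sMerit
accounts_total = 2_000_000         # "just went over 2 million"
active_low, active_high = 700_000, 800_000  # remembered range of active users

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)

print("receivers vs all accounts:", pct(receivers, accounts_total))
print("receivers vs active range:", pct(receivers, active_high), "-", pct(receivers, active_low))
print("senders vs active range:  ", pct(senders, active_high), "-", pct(senders, active_low))
```

Under those assumptions, merited users come out at well under 1% of all accounts and around 2% of the active ones.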

It would be interesting to see the ranks of all active users and their merit, so as to see the maximum potential sMerit that could be placed into circulation (to do that, maybe Vod could add these aggregates to his website to give overall insights). The same goes for inactive users.
I can't get hold of that much data without dedicating a lot more time to it, but hopefully some of it could be supplied by theymos in the new files he's considering.

But all in all yes, the number of people playing the game seems small to say the least.
legendary
Activity: 3304
Merit: 3096
April 17, 2018, 12:16:45 PM
#5
Awesome, again, thanks.
It is kind of difficult to understand at first, but after taking a while it is a clear analysis.
So, if I'm not wrong, sending merit seems to be limited to a few users. I mean: in the forum, there are almost 2 million users (according to the stats from last week). Even though your analysis was made a month ago, the number of users sending/receiving merit is not comparable to the total.
So, can it show us how many users are truly active?

P.S. Take a deserved vacation if you need it; I think all of us here will be impatiently awaiting your return!!
(You should be paid for this awesome work :P)
legendary
Activity: 2338
Merit: 10802
There are lies, damned lies and statistics. MTwain
March 20, 2018, 08:14:59 AM
#4
You have obviously done a lot of work, but the post is far too long and complex, unless one is attempting a research project.

Can you give us a brief summary in fewer than 10 lines please.

I edited my original post to include a Summary at the top. The general idea is that what little info the smerit.txt file contains can be used to speed up locating potential cases of account farming, where sMerit transfers are either circular or one-directional but split over many transactions (i.e. transferring 1 or 2 sMerit multiple times to the same account).
legendary
Activity: 1203
Merit: 1000
March 20, 2018, 07:31:09 AM
#3
Nice work. I agree that it could be made clearer and easier to understand.
legendary
Activity: 2814
Merit: 2472
https://JetCash.com
March 20, 2018, 07:13:43 AM
#2
You have obviously done a lot of work, but the post is far too long and complex, unless one is attempting a research project.

Can you give us a brief summary in fewer than 10 lines please.
legendary
Activity: 2338
Merit: 10802
There are lies, damned lies and statistics. MTwain
March 20, 2018, 06:58:16 AM
#1
This is going to be kind of long … but it’s interesting.

Summary:
Using the downloadable sMerit file alone, thin as it is in terms of data columns, we can see that over a seven-week period:
-   Most users that give away sMerit do so only a few times during the seven-week period, mostly one or two sMerit in total.

-   Likewise, most users that receive sMerit receive it fewer than 3 times in that period, with approx. 50% receiving one or two sMerit.

-   The fun part is the counterpart of the above, since we can see all the users that have given lots of sMerit and how many users received it.

-   The best part is the given/received sMerit analysis, since it can be used (by whoever is in charge of it) to get a view of cases where:

o   Users give away a lot of sMerit to the same single user on multiple occasions (i.e. many small transactions).
o   Users give away a fair bit of sMerit to a user and end up getting a similar amount back (circular sMerit).

These cases are where we can find candidates for account farming, i.e. users that pass sMerit around their own accounts. My Google Sheet has all the data needed to spot these cases:  https://docs.google.com/spreadsheets/d/1Su1vulMzsbkYPOsU8llr8PrQXRPvy4oM_E218klwHh8/edit#gid=968451010.


Introduction:
The following set of cooked data is not intended to focus on any specific set of users, nor to justify or question the sMerit Txs they perform. That is, in any case, a task for whomever the Admins see fit, since it would require manual work on the provided data. It could help, though, to locate clusters of farmed accounts.

Also note that the analysis is quantitative, not qualitative. That is, data is counted but the underlying messages are not analyzed.

Having said that, I downloaded yesterday's version of Merit.txt to see what cooked data can be derived from the raw data provided. The raw data records every sMerit transfer from user A to user B in a given timeframe (just under 2 months of transactions).

Data columns provided are Date/Time of sMerit Tx, userIds involved in Tx, amount of sMerit and MessageId.
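
As a rough illustration of how such a file could be loaded, here is a minimal Python sketch. The tab-separated layout, the column order (time, amount, message id, sender, receiver), the epoch timestamps and the name `parse_merit_lines` are all assumptions made for the example, and the sample rows are made up rather than taken from the file.

```python
from collections import namedtuple

# Hypothetical record layout; the real column order in the file may differ.
MeritTx = namedtuple("MeritTx", "time amount msg_id user_from user_to")

def parse_merit_lines(lines):
    """Turn tab-separated text lines into MeritTx records.

    Blank lines, a header row and malformed records are skipped.
    """
    txs = []
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 5 or not parts[0].isdigit():
            continue  # blank line, header row or malformed record
        time, amount, msg_id, user_from, user_to = parts
        txs.append(MeritTx(int(time), int(amount), msg_id, int(user_from), int(user_to)))
    return txs

# Made-up sample rows, only to show the shape of the data.
sample = [
    "time\tnumber\tmsg\tuser_from\tuser_to",
    "1516831941\t1\t100.msg1000\t101\t202",
    "1516832000\t2\t100.msg1001\t101\t303",
]
txs = parse_merit_lines(sample)
print(len(txs), "transactions,", sum(t.amount for t in txs), "sMerit in total")
```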

The complete cooked data is available as a google sheet: https://docs.google.com/spreadsheets/d/1Su1vulMzsbkYPOsU8llr8PrQXRPvy4oM_E218klwHh8/edit#gid=968451010. A few charts are also included and tabbing is much better than on my post, so it's better to look at the data on the google sheet.

(It's not easy to get it all into Google Sheets, since it's rather easy to reach the product's 2 million cell limit.)

1) File Summary (Tab 1):

File:                                                            Merit.txt
Downloaded:                                                      19/03/2018
Period of data within:                                           24/01/2018 .. 16/03/2018   (beginning and end dates may be incomplete - not 24 hours)
Nº Records (sMerit Transactions):                                45.115
Sum sMerit:                                                      104.569
Nº Distinct Meriter Users (givers):                              10.835
Nº Distinct Merited Users (receivers):                           11.775
Nº Distinct Pairs of Users in sMerit Txs (givers -> receivers):  36.573
      
Only users with a merit Tx are included in the above file (no information on users with no merit Tx)      
No Rank is provided in file, so Rank Analysis not available      

There are 45.115 sMerit transactions, involving 10.835 givers and 11.775 receivers.

I don't know the total number of members in the BitcoinTalk Forum, but contrasting the above data against total members would give an interesting set of ratios (possibly contrasting both with total users and total 'active' users).

The average sMerit given away per week by each user (with an sMerit Tx) in the timeframe is 1,38, while the average sMerit received per week by a user that receives sMerit is 1,27. This doesn't mean much. What is far more interesting are the extreme cases, as we'll see later on.
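
The two weekly averages can be reproduced from the File Summary figures. A small sketch, assuming the period is treated as exactly 7 weeks (as the rest of the post does):

```python
# Figures taken from the File Summary (Tab 1).
total_smerit = 104_569
givers = 10_835
receivers = 11_775
weeks = 7  # assumption: the period is rounded to seven weeks

avg_given_per_week = total_smerit / givers / weeks
avg_received_per_week = total_smerit / receivers / weeks

print(round(avg_given_per_week, 2), round(avg_received_per_week, 2))
```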

The remaining analysis is performed considering the 7 week data timeframe as a whole, without breaking data down by week or day (that analysis has already been done nicely in the “Merit Stat from theymos data” thread by Zentdex).

Remember, the context of the universe studied is that of users that give or receive sMerit, so all averages and data refer to this subset of users, not all forum users (users that do not have an sMerit Tx in the timeframe are not part of the analytical universe).

2) Given sMerit (Tab 2):


NumTimesGivenGroup is a variable that counts the number of times a user has given away sMerit.

Most users have given sMerit away once within the 7 week timeframe (52,77%), while 3,09% have done it 20 times or more (1,83% have done it 30 times or more).

NumTimesGivenGroup   NumUsers   % Total   % Total Aggregate
1   5.718   52,77%   52,77%
2   1.680   15,51%   68,28%
3   1.006   9,28%   77,56%
4   562   5,19%   82,75%
5   391   3,61%   86,36%
6   263   2,43%   88,79%
7   190   1,75%   90,54%
8   137   1,26%   91,80%
9   110   1,02%   92,82%
[10 .. 19)   443   4,09%   96,91%
[20 .. 29)   137   1,26%   98,17%
[30 .. 39)   65   0,60%   98,77%
[40 .. 49)   39   0,36%   99,13%
[50 .. 59)   15   0,14%   99,27%
[60 .. 69)   14   0,13%   99,40%
[70 .. 79)   13   0,12%   99,52%
[80 .. 89)   7   0,06%   99,58%
[90 .. 99)   6   0,06%   99,64%
[>= 100)   39   0,36%   100,00%

Total   10.835   100,00%
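
The grouping in the table above could be produced along these lines. This is a sketch only: the function names are made up for the example, the sender ids are invented, and the group labels simply copy the ones used in the table (single values up to 9, bands of ten up to 99, then everything from 100 up).

```python
from collections import Counter

def times_group(n):
    """Label a count the way the table does: 1..9 as-is, bands of ten, then >= 100."""
    if n < 10:
        return str(n)
    if n >= 100:
        return "[>= 100)"
    low = (n // 10) * 10
    return f"[{low} .. {low + 9})"

def group_distribution(senders):
    """senders: one sender id per transaction. Returns {group label: number of users}."""
    per_user = Counter(senders)  # number of transactions per sender
    return Counter(times_group(n) for n in per_user.values())

# Made-up sender ids, one entry per transaction.
senders = [1, 1, 2, 3, 3, 3] + [4] * 12 + [5] * 150
dist = group_distribution(senders)
total_users = sum(dist.values())
for label, users in dist.items():
    print(label, users, f"{100 * users / total_users:.2f}%")
```

The same function works for MeritGivenGroup if the per-user sum of sMerit is grouped instead of the per-user transaction count.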
   

MeritGivenGroup is a variable that counts the number of sMerit a user has given away (regardless of the number of Txs involved).

Most users have given away 3 sMerit or fewer within the 7 week timeframe (54,20%), while 11,38% have given away 20 sMerit or more (7% have given away 30 or more).

What we don't have yet is raw data on how much sMerit each user had at the beginning of the timeframe, which would show the proportion of sMerit given out.

Logically, the sMerit available is related to rank and function (Admin, Moderator, sMerit Source, etc.), but that data is not in the raw dataset to cross-reference.

MeritGivenGroup   NumUsers   % Total   % Total Aggregate
1   3.983   36,76%   36,76%
2   1.190   10,98%   47,74%
3   700   6,46%   54,20%
4   858   7,92%   62,12%
5   1.006   9,28%   71,41%
6   428   3,95%   75,36%
7   278   2,57%   77,92%
8   154   1,42%   79,34%
9   123   1,14%   80,48%
[10 .. 19)   882   8,14%   88,62%
[20 .. 29)   475   4,38%   93,00%
[30 .. 39)   214   1,98%   94,98%
[40 .. 49)   115   1,06%   96,04%
[50 .. 59)   114   1,05%   97,09%
[60 .. 69)   53   0,49%   97,58%
[70 .. 79)   41   0,38%   97,96%
[80 .. 89)   51   0,47%   98,43%
[90 .. 99)   35   0,32%   98,75%
[>= 100)   135   1,25%   100,00%

Total   10.835   100,00%


The top sMerit givers would be:

user_from   NumTimesGiven   sumMeritGiven   NumDistinctUsersReceived   AverageGivenToUser   NumTimesGivenGroup   MeritGivenGroup   AverageGivenGroup
72795   116   1.212   103   11,77   [>= 100)   [>= 100)   [10 .. 20)
30747   326   725   185   3,92   [>= 100)   [>= 100)   [03 .. 04)
98986   274   649   165   3,93   [>= 100)   [>= 100)   [03 .. 04)
140584   371   616   199   3,10   [>= 100)   [>= 100)   [03 .. 04)
153634   248   462   122   3,79   [>= 100)   [>= 100)   [03 .. 04)
55384   194   409   70   5,84   [>= 100)   [>= 100)   [05 .. 06)
18321   297   399   240   1,66   [>= 100)   [>= 100)   [01 .. 02)
507936   177   384   116   3,31   [>= 100)   [>= 100)   [03 .. 04)
290195   208   378   97   3,90   [>= 100)   [>= 100)   [03 .. 04)
51173   194   349   125   2,79   [>= 100)   [>= 100)   [02 .. 03)
1192397   291   332   228   1,46   [>= 100)   [>= 100)   [01 .. 02)
976210   204   302   75   4,03   [>= 100)   [>= 100)   [04 .. 05)
234771   198   301   147   2,05   [>= 100)   [>= 100)   [02 .. 03)
347141   56   291   40   7,28   [50 .. 59)   [>= 100)   [07 .. 08)
24140   221   274   66   4,15   [>= 100)   [>= 100)   [04 .. 05)
113670   57   261   43   6,07   [50 .. 59)   [>= 100)   [06 .. 07)
252510   249   256   207   1,24   [>= 100)   [>= 100)   [01 .. 02)
487418   180   251   125   2,01   [>= 100)   [>= 100)   [02 .. 03)
553678   111   248   82   3,02   [>= 100)   [>= 100)   [03 .. 04)

These users give away lots of sMerit in many Txs, with user 72795 having the highest total sMerit given away.
The complete list is provided in the referenced google sheet.
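
The per-giver columns can be derived from the raw transactions roughly as follows. The figures in the table are consistent with AverageGivenToUser being sumMeritGiven divided by NumDistinctUsersReceived (e.g. 1.212 / 103 = 11,77), so the sketch uses that definition; the function name and the sample transactions are made up for the example.

```python
from collections import defaultdict

def giver_stats(txs):
    """txs: iterable of (user_from, user_to, amount). Returns per-sender aggregates."""
    times = defaultdict(int)       # number of Txs per sender
    total = defaultdict(int)       # sum of sMerit per sender
    receivers = defaultdict(set)   # distinct receivers per sender
    for user_from, user_to, amount in txs:
        times[user_from] += 1
        total[user_from] += amount
        receivers[user_from].add(user_to)
    return {
        u: {
            "NumTimesGiven": times[u],
            "sumMeritGiven": total[u],
            "NumDistinctUsersReceived": len(receivers[u]),
            "AverageGivenToUser": round(total[u] / len(receivers[u]), 2),
        }
        for u in times
    }

# Made-up transactions: (sender, receiver, amount).
stats = giver_stats([(1, 2, 50), (1, 2, 50), (1, 3, 20), (4, 5, 1)])

# Order senders by average given to a user, descending.
ranked = sorted(stats, key=lambda u: stats[u]["AverageGivenToUser"], reverse=True)
for u in ranked:
    print(u, stats[u])
```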

What’s interesting is to order the data by average given to a user (descending). The top 20 cases would be:

user_from   NumTimesGiven   sumMeritGiven   NumDistinctUsersReceived   AverageGivenToUser   NumTimesGivenGroup   MeritGivenGroup   AverageGivenGroup
402519   2   100   1   100,00   2   [>= 100)   [>= 100)
679185   2   78   1   78,00   2   [70 .. 79)   [70 .. 80)
105730   2   72   1   72,00   2   [70 .. 79)   [70 .. 80)
990983   3   60   1   60,00   3   [60 .. 69)   [60 .. 70)
216987   3   100   2   50,00   3   [>= 100)   [50 .. 60)
377987   42   100   2   50,00   [40 .. 49)   [>= 100)   [50 .. 60)
224248   2   100   2   50,00   2   [>= 100)   [50 .. 60)
163375   2   100   2   50,00   2   [>= 100)   [50 .. 60)
188453   2   100   2   50,00   2   [>= 100)   [50 .. 60)
316552   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
102607   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
343531   2   50   1   50,00   2   [50 .. 59)   [50 .. 60)
80045   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
928416   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
905665   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
37924   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
400908   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
821846   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)
127521   1   50   1   50,00   1   [50 .. 59)   [50 .. 60)

Note that for the users in the list above, the number of merit recipients is small (1 or 2 sMerit receivers), but the average sMerit given per recipient is 50 or above. One case (user 377987) even sends its 100 sMerit in 42 different Txs to just 2 users.

3) Received sMerit (Tab 3):

Symmetrical to the Given sMerit analysis, I performed a Received sMerit analysis.

NumTimesReceivedGroup is a variable that counts the number of times a user has received sMerit.

Most users have received sMerit once or twice within the 7 week timeframe (67,97%), while nearly 7% have received sMerit 10 times or above.

I won't bore you with detailed data here (that’s what the google sheet is for).

MeritReceivedGroup is a variable that counts the amount of sMerit a user has received (regardless of the number of Txs involved).
Most users have received between 1 and 3 sMerit within the 7 week timeframe (58,25%), while 22,8% have received 10 sMerit or above.

Top 20 receivers are:

user_to   NumTimesReceived   sumMeritReceived   NumDistinctUsersFrom   AverageReceivedFromUser   NumTimesReceivedGroup   MeritReceivedGroup   AverageReceivedGroup
35   451   1.636   354   4,62   [>= 100)   [>= 100)   [04 .. 05)
3   87   736   79   9,32   [80 .. 89)   [>= 100)   [09 .. 10)
520313   144   686   121   5,67   [>= 100)   [>= 100)   [05 .. 06)
976210   305   642   113   5,68   [>= 100)   [>= 100)   [05 .. 06)
1076869   77   478   58   8,24   [70 .. 79)   [>= 100)   [08 .. 09)
101872   126   457   80   5,71   [>= 100)   [>= 100)   [05 .. 06)
98986   143   372   90   4,13   [>= 100)   [>= 100)   [04 .. 05)
459836   204   339   114   2,97   [>= 100)   [>= 100)   [02 .. 03)
1108331   186   338   180   1,88   [>= 100)   [>= 100)   [01 .. 02)
30747   161   334   115   2,90   [>= 100)   [>= 100)   [02 .. 03)
1038794   77   298   68   4,38   [70 .. 79)   [>= 100)   [04 .. 05)
886352   12   283   11   25,73   [10 .. 19)   [>= 100)   [20 .. 30)
26945   47   281   47   5,98   [40 .. 49)   [>= 100)   [05 .. 06)
915311   53   274   37   7,41   [50 .. 59)   [>= 100)   [07 .. 08)
225292   15   265   11   24,09   [10 .. 19)   [>= 100)   [20 .. 30)
479624   192   261   70   3,73   [>= 100)   [>= 100)   [03 .. 04)
925926   20   255   16   15,94   [20 .. 29)   [>= 100)   [10 .. 20)
944905   23   251   13   19,31   [20 .. 29)   [>= 100)   [10 .. 20)
905442   17   244   17   14,35   [10 .. 19)   [>= 100)   [10 .. 20)

There is a wide variety in the average sMerit received from a single user within this top 20, ranging from 1,88 to 25,73.

Again, sorting the data set in the google sheet by average received, and combining it with the number of times or users, may lead to something interesting.


4) Given/Received sMerit I (Tab 4):

This is also interesting, even more so I think. It is based on something I started last week and now update here.

The idea is to see how much sMerit a user A gives to a user B, and how much B gives back to user A. My google sheet lists all the combinations aggregated by each pair of users, counting the total sMerit given (SumMeritGiven) and received back (sumMeritReceivedBack) in the timeframe, and the number of transactions used in the process (numTimesGiven, numTimesReceivedBack).
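
A minimal sketch of that pair aggregation, assuming the transactions are available as (sender, receiver, amount) tuples. The function name and the sample data are made up; the output keys reuse the column names mentioned above.

```python
from collections import defaultdict

def pair_flows(txs):
    """txs: iterable of (user_from, user_to, amount).

    Returns one row per ordered pair (A, B) with what A gave to B
    and what B gave back to A.
    """
    given = defaultdict(lambda: [0, 0])  # (A, B) -> [sum of sMerit, number of Txs]
    for a, b, amount in txs:
        given[(a, b)][0] += amount
        given[(a, b)][1] += 1

    rows = []
    for (a, b), (merit, times) in given.items():
        # .get() avoids creating new keys while iterating over the dict
        back_merit, back_times = given.get((b, a), (0, 0))
        rows.append({
            "user_from": a,
            "user_to": b,
            "SumMeritGiven": merit,
            "numTimesGiven": times,
            "sumMeritReceivedBack": back_merit,
            "numTimesReceivedBack": back_times,
        })
    return rows

# Made-up transactions: users 1 and 2 exchange sMerit, user 3 gives one way only.
rows = pair_flows([(1, 2, 5), (1, 2, 5), (2, 1, 9), (3, 4, 1)])
for row in rows:
    print(row)
```

Note that a pair exchanging sMerit in both directions shows up twice here, once from each side.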

NumTimesGivenGroup is a variable that counts the number of times a user A has given away sMerit to user B (note that this is a between-users analysis).

Most users A have given sMerit away to user B once or twice within the 7 week timeframe (95,69%), while 1,1% (407 cases) have done it 5 times or above, giving sMerit to the same user. This 1,1% may be interesting to observe in more detail.

MeritGivenGroup is a variable that counts the sMerit a user A has given away to user B.

Most users A have given between 1 and 5 sMerit away to user B within the 7 week timeframe (91,14%), while 2,26% (825 cases) have given 20 sMerit or above.

NumTimesReceivedBackGroup is a variable that counts the number of times a user A has received sMerit from user B (having user A given sMerit to user B too).

Most users (82,77%) do not receive any sMerit back, which leaves 17,23% that do to some degree.
This is not necessarily an indicator of wrongdoing at all, but it is among this 17,23% that we'll find the farms.

The detailed table is interesting and can help track down farms from the quantitative point of view (apply sort and filters).

I checked cases reported on other threads as potential sMerit exchange farms (e.g. the thread “Suspected users that are abusing merit 3.0”) and they mostly came up on the list. What the list does not account for is the potential cases people report of sMerit given away to low-quality posts.
 
5) Given/Received sMerit II (Tab 5):

This is the same as the previous list, but narrowed down to all cases where the sMerit given from A to B is the same as the sMerit received back from B, within a range of +/- 20%.

This list also removes “duplicate” cases (in the previous list, users A and B potentially appear twice; once as sender and once as receiver).

There are only 160 cases in this second list where users send 5 sMerit or above from A to B and get roughly the same back.
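
One possible way to express this filter in code is sketched below. How the +/- 20% is measured and where the 5 sMerit threshold applies are assumptions: here the difference is compared against the larger of the two amounts (so the result does not depend on which user is called A), and the threshold is applied to that larger amount. The function name and sample data are made up.

```python
def circular_pairs(given, tolerance=0.20, min_merit=5):
    """given: {(user_from, user_to): total sMerit sent}.

    Returns (A, B, A->B, B->A) for pairs whose two directions are within
    `tolerance` of each other, with each unordered pair listed only once.
    """
    result = []
    for (a, b), a_to_b in given.items():
        if a >= b:
            continue  # visit each unordered pair once, from its lower user id
        b_to_a = given.get((b, a), 0)
        larger, smaller = max(a_to_b, b_to_a), min(a_to_b, b_to_a)
        if larger >= min_merit and larger - smaller <= tolerance * larger:
            result.append((a, b, a_to_b, b_to_a))
    return result

# Made-up totals per ordered pair of users.
given = {
    (1, 2): 10, (2, 1): 9,   # roughly the same in both directions
    (3, 4): 20, (4, 3): 5,   # both directions, but far apart
    (5, 6): 3,  (6, 5): 3,   # equal, but below the 5 sMerit threshold
    (7, 8): 6,               # one-directional only
}
print(circular_pairs(given))
```

A one-directional pair never qualifies, since nothing received back is always more than 20% away from the amount given.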




