
Topic: Visualizing shilling. (Read 127 times)

November 17, 2017, 10:57:21 PM
#1
Hey guys.  I work in data mining and visualization and have come up with a fun way to visualize and detect shilling, at least on our toy data set.  We look forward to using it to analyze BCT threads.

First we trained a Naive Bayes classifier to predict the sentiment of movie reviews, using a corpus of reviews that were tagged as either negative or positive.  It basically estimates the probability of a text being negative or positive given the frequency of different keywords.

We can use this to establish the probability of some unknown text being positive and the probability of it being negative.
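For anyone who wants to see the idea in code, here is a minimal sketch of a Naive Bayes sentiment scorer in plain Python.  It is not the script we used: the six training sentences in TRAIN are a made-up stand-in for the tagged review corpus, and the names train and predict_proba are just for illustration.  It assumes add-one smoothing and simply ignores words it never saw in training.

```python
import math
from collections import Counter

# Hypothetical stand-in for the tagged movie-review corpus (not the real data).
TRAIN = [
    ("wonderful film great acting", "pos"),
    ("loved it best movie", "pos"),
    ("amazing story favorite actor", "pos"),
    ("terrible plot awful acting", "neg"),
    ("worst movie bad script", "neg"),
    ("boring and bad it sucks", "neg"),
]

def train(examples):
    """Count word frequencies per class and documents per class."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    return word_counts, doc_counts, vocab

def predict_proba(text, model):
    """Return {"pos": p, "neg": q} with p + q == 1."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        for word in text.lower().split():
            if word not in vocab:
                continue  # assumption: words unseen in training are skipped
            # Add-one (Laplace) smoothed log likelihood of the word.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalise in log space so the two probabilities sum to 1.
    top = max(log_scores.values())
    unnorm = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

model = train(TRAIN)
print(predict_proba("best movie of the year", model))
print(predict_proba("sucks bad", model))
```

With a corpus this small the numbers are only illustrative, but the first text should come out mostly positive and the second mostly negative.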

Then we wrote a toy data set of movie reviews to simulate the introduction of shilling.  The first six were positive, and in the next six we simulate the entrance of a shiller by alternating positive and negative.

         "wonderful wonderful wonderful",
         "real neat",
         "best movie of the year",
         "the movie was great I loved it",
         "best acting favorite",
         "awful",
         "the movie was amazing",
         "terrible movie",
         "favorite actor",
         "worst movie bad acting",
         "most amazing movie ever",
         "sucks bad"

Then we used our classifier to predict negative and positive sentiment scores for each review and visualized them.  Each review is plotted as a point where one axis represents the probability of the text being negative and another the probability of it being positive.  A third axis represents time.
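A rough sketch of how the points for such a plot could be built, again in Python.  This is an illustration, not our script: build_points, plot_points and toy_score are hypothetical names, and toy_score is a crude keyword placeholder standing in for the trained classifier so the example runs on its own.  Time is assumed to be just the order of the reviews.  The plotting part assumes matplotlib is installed.

```python
def build_points(reviews, score):
    """Return one (time_index, p_negative, p_positive) tuple per review.

    `score` is any callable mapping a text to (p_negative, p_positive).
    """
    points = []
    for t, text in enumerate(reviews):
        p_neg, p_pos = score(text)
        points.append((t, p_neg, p_pos))
    return points

def plot_points(points):
    """Draw the points in 3D; needs matplotlib (imported only when called)."""
    import matplotlib.pyplot as plt
    ts, negs, poss = zip(*points)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot(negs, poss, ts, marker="o")
    ax.set_xlabel("P(negative)")
    ax.set_ylabel("P(positive)")
    ax.set_zlabel("time (review index)")
    plt.show()

# Placeholder scorer, NOT the Naive Bayes model: a fixed keyword list
# that returns 0.9 / 0.1 so the example is self-contained.
NEGATIVE_WORDS = {"awful", "terrible", "worst", "bad", "sucks"}

def toy_score(text):
    words = text.lower().split()
    p_neg = 0.9 if any(w in NEGATIVE_WORDS for w in words) else 0.1
    return p_neg, 1.0 - p_neg

reviews = [
    "wonderful wonderful wonderful",
    "real neat",
    "best movie of the year",
    "the movie was great I loved it",
    "best acting favorite",
    "awful",
    "the movie was amazing",
    "terrible movie",
    "favorite actor",
    "worst movie bad acting",
    "most amazing movie ever",
    "sucks bad",
]

points = build_points(reviews, toy_score)
for point in points:
    print(point)
```

Calling plot_points(points) opens the 3D view.  Once the negative reviews start arriving, the line should zigzag between the positive corner and the negative corner as it moves along the time axis.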



Thought some of you might appreciate this.  If you are interested in the script we used, send me a message and I will send it to you.
