
Topic: Finding coins intended to be short-lived (Read 940 times)

sr. member
Activity: 420
Merit: 263
let's make a deal.
January 24, 2014, 12:54:24 AM
#6
i retained the quotes for the word clouds; i figure people quote the things they're talking about, so certain words or users popping up repeatedly from quotes would be indicative of the thread activity.  i do think quotes increase the frequency of certain words disproportionately.  This may (or may not!) reflect thread participants fixating on specific words (e.g. premine, fork), so it would give more weight to words that are nestled in a pyramid of quotes.

we can also try to strip a thread of quotes and run it alongside the same thread with quotes to see how it affects frequency.  
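
A rough sketch of that side-by-side comparison, assuming each post has already been split into its own text ("body") and a list of quoted passages ("quotes"). That post layout, the tiny stop-word list, and the function names are all made up for illustration; how the forum markup or wordle actually handles this is not known here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; wordle's real list is not known here.
STOP_WORDS = {"the", "and", "a", "is", "to", "of", "it", "this"}

def tokenize(text):
    """Lowercase the text and split it into word-like tokens (keeps e.g. mhash/usd)."""
    return re.findall(r"[a-z0-9/]+", text.lower())

def word_counts(posts, include_quotes):
    """Count non-stop-words across posts, optionally including quoted text."""
    counts = Counter()
    for post in posts:
        texts = [post["body"]]
        if include_quotes:
            texts.extend(post["quotes"])
        for text in texts:
            counts.update(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts

def quote_inflation(posts):
    """Map each word that quotes inflate to (count_without_quotes, count_with_quotes)."""
    without = word_counts(posts, include_quotes=False)
    with_q = word_counts(posts, include_quotes=True)
    return {w: (without[w], with_q[w]) for w in with_q if with_q[w] > without[w]}

# Hypothetical three-post thread with a small quote pyramid.
posts = [
    {"body": "this coin has a premine", "quotes": []},
    {"body": "is the premine a scam", "quotes": ["this coin has a premine"]},
    {"body": "fork it", "quotes": ["is the premine a scam", "this coin has a premine"]},
]
inflation = quote_inflation(posts)
```

Words that only ever appear in fresh text (like 'fork' in the toy thread) drop out of the result, so what is left is exactly the set of words the quote pyramid is boosting.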

note the wordle discards common words like 'the' and 'and', so there are some artifacts in the output that skew normal linguistic distribution.  however, the content of specific jargon (scrypt, mhash/USD) is retained so we do get the meat of the thread distilled in keyword form.

Frequency of many words may not be as informative as the frequency of usernames.  in the various threads i do recognize usernames that i associate with certain postings (e.g. trolls, crusader, dump-and-pumper, ideologue), which gives me some idea of the ongoing discussions.  

The frequency of certain words may track trends:  I could see that the 280x is by far the predominant video card being used, dwarfing 270x and 290x discussions.  

Warning words like "fork" or "difficulty" (esp. in a young coin) are red flags where even a low frequency would cause concern.   if we're just looking for these red flags, the wordle is acting like a turbocharger for your bullshit detector.

We could eventually quantify intuition, or gut instinct, by weighting words with importance, priority, and alarm.  However that sounds like a lot of work.  At a glance, the word cloud can give us a feel for what trends are within certain threads, but i think we should be able to get more information by improving our sort method (e.g. only username analysis, or dividing the wordle into 24 hour intervals).
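
As a minimal sketch of both ideas at once (weighting alarm words and slicing into 24 hour intervals), assuming posts are available as (timestamp, text) pairs. The weights below are placeholders picked for illustration, not calibrated values, and `alarm_score_by_day` is a made-up name.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical alarm weights; the numbers are placeholders, not calibrated values.
ALARM_WEIGHTS = {"scam": 3.0, "fork": 2.0, "premine": 2.0, "difficulty": 1.0}

def alarm_score_by_day(posts):
    """Sum the weights of alarm words per calendar day (one 24 hour bucket each)."""
    scores = defaultdict(float)
    for when, text in posts:
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            # Words without a weight contribute 0, so quiet days still get an entry.
            scores[when.date()] += ALARM_WEIGHTS.get(token, 0.0)
    return dict(scores)

# Hypothetical posts: (timestamp, text).
posts = [
    (datetime(2014, 1, 23, 21, 0), "wallet synced fine"),
    (datetime(2014, 1, 23, 22, 30), "difficulty jumped after the fork"),
    (datetime(2014, 1, 24, 0, 54), "this is a scam, total scam"),
]
daily = alarm_score_by_day(posts)
```

A day-by-day score like this would show when the red flags start appearing in a thread, which a single word cloud over the whole thread cannot.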


legendary
Activity: 1862
Merit: 1114
WalletScrutiny.com
January 24, 2014, 12:09:20 AM
#5
kalus, thanks for your effort! I guess quoting has a strong effect on word frequency, or did you filter that out?
sr. member
Activity: 420
Merit: 263
let's make a deal.
January 23, 2014, 10:52:27 PM
#4
out of interest i threw a couple of threads into wordle to see what keyword frequency could say:

visacoin:  (lol 'scam' 'op' 'ipo')

amerocoin: ('addnode' having wallet troubles on the launch?)

best $25,000 gpu for the job:  (270x, 280x, 290x, khash)

catcoin:  (meow)


based purely on word frequency, there are words that do ring alarm bells (e.g. visacoin thread - scam, ipo, whitepaper).  as the thread gets longer, the keywords still show up strongly (e.g. scam, coingen, expensive, america).  the frequency of specific, technical words (e.g. best video card thread - 280x, kwh, hash) indicates more focused topics.  weasel words/marketing words also stick out like a sore thumb (e.g. value, relaunch, everyone).  I was worried that serial quotes or quote pyramids would change the distribution of the usernames, but the username counts stay proportional to the activity of each user in the thread.  of interest are the threads where wallet addresses show up in the wordle.  there's a lot of interesting info to be had from researching distribution and frequency alone.  

someone could throw the words into a T-test to determine statistically significant correlation between pairs of words.  It would also be interesting to track correlation over time (e.g. 'scam' appears from these dates onwards, or this user appears for this number of days only) to see if there are trends in posting, posters, opinion, or information.  
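
One possible reading of the T-test idea, sketched under assumptions: mark for each post whether a word is present (0/1), compute the Pearson correlation r between two such vectors, and turn it into the usual t statistic t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sample posts and helper names are made up, and this is only one way to interpret "T-test" here.

```python
import math

def presence(posts, word):
    """Return a 0/1 vector: does each post contain the word as a whole token?"""
    return [1 if word in post.lower().split() else 0 for post in posts]

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric vectors (None if undefined)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a word that is always or never present has no defined correlation
    return cov / math.sqrt(var_x * var_y)

def correlation_t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 degrees of freedom)."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical posts, already reduced to their keywords.
posts = [
    "scam ipo", "scam ipo whitepaper", "wallet sync", "addnode wallet",
    "scam", "meow", "ipo launch", "meow meow",
]
xs = presence(posts, "scam")
ys = presence(posts, "ipo")
r = pearson_r(xs, ys)
t = correlation_t_statistic(r, len(posts))
```

Getting a p-value out of t still needs the t distribution with n - 2 degrees of freedom, which is left out here. With 0/1 data and short threads the result should be treated as a rough hint rather than proof of anything.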

legendary
Activity: 1862
Merit: 1114
WalletScrutiny.com
January 23, 2014, 10:11:06 PM
#3
going by post count alone isn't a good metric for users, and it isn't a good measure of a coin's worth either.  Why would a coin thread where 80% of the comments are "my wallet doesn't sync" and "cryptsy stole my mooncoin!" warrant recognition?

i've seen scam threads that go on for over 20 pages.  that doesn't mean they should be recognized or lauded.  

I did *not* advocate giving any value to long threads. The opposite is true: I have the feeling that if people gave a f*** about some coin for 3000 posts, it must be more than just another scam. I'm sure others have the same impulse, and I'm sure yet others try to engineer long threads to gain relevance. Discovering via easy metrics whether a thread was engineered would help to call out these scammers.
sr. member
Activity: 420
Merit: 263
let's make a deal.
January 23, 2014, 09:22:40 PM
#2
going by post count alone isn't a good metric for users, and it isn't a good measure of a coin's worth either.  Why would a coin thread where 80% of the comments are "my wallet doesn't sync" and "cryptsy stole my mooncoin!" warrant recognition?

i've seen scam threads that go on for over 20 pages.  that doesn't mean they should be recognized or lauded.  

How many posters are there per topic (how small is the group of supporters)?

an index for postcount:user would certainly be insightful.
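
A minimal sketch of such an index, assuming the thread has been reduced to a list with one author name per post. The author names are placeholders and the choice of fields (posts per user, share of the busiest poster) is only one way to define it.

```python
from collections import Counter

def thread_index(authors):
    """Summarize a thread from its list of post authors (one entry per post)."""
    if not authors:
        raise ValueError("thread has no posts")
    counts = Counter(authors)
    total = len(authors)
    unique = len(counts)
    top_author, top_posts = counts.most_common(1)[0]
    return {
        "posts": total,
        "users": unique,
        "posts_per_user": total / unique,  # high value = few people doing most of the talking
        "top_author": top_author,
        "top_share": top_posts / total,    # fraction of posts from the single busiest user
    }

# Hypothetical thread: 8 posts from 3 users.
authors = ["a", "b", "a", "a", "c", "a", "b", "a"]
idx = thread_index(authors)
```

A long thread with a high posts_per_user and a high top_share would be a candidate for the "20 pages kept alive by a handful of people" case.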
legendary
Activity: 1862
Merit: 1114
WalletScrutiny.com
January 23, 2014, 09:13:21 PM
#1
Watching the endless inflow of new alt-coins on the """list of all cryptocoins""", and seeing how even coins I never heard of were discussed for over 3k posts without making it onto the list, makes me wonder: who are these people that pump just about any alt coin, trying to gain legitimacy by being listed on whatever list of important whatevers? Could somebody please parse all the respective announcement threads and get some intelligence on the topic? I would be interested in:

How many posters are there per topic (how small is the group of supporters)?
How many of them discuss more than one/two/three alt coins (get rich quick with whatevers)?
How many posters in the alt-coin threads post only in alt coins (prime sock puppet suspects)?

I'm sure whoever parsed all these threads would get some interesting insights.
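
The three questions above could be sketched roughly like this, assuming the threads have already been parsed into (board, thread, user) tuples. The board label "alt", the user names, and the function name are assumptions for illustration; the actual scraping is not shown.

```python
from collections import defaultdict

def poster_metrics(posts, alt_board="alt"):
    """posts: iterable of (board, thread, user) tuples, one per post."""
    users_per_thread = defaultdict(set)
    threads_per_user = defaultdict(set)
    boards_per_user = defaultdict(set)
    for board, thread, user in posts:
        boards_per_user[user].add(board)
        if board == alt_board:
            users_per_thread[thread].add(user)
            threads_per_user[user].add(thread)

    # 1. How many distinct posters per alt-coin topic.
    posters_per_topic = {t: len(users) for t, users in users_per_thread.items()}
    # 2. How many users post in more than one / two / three alt-coin threads.
    multi = {k: sum(1 for ts in threads_per_user.values() if len(ts) > k)
             for k in (1, 2, 3)}
    # 3. Users whose every post is on the alt-coin board.
    alt_only = {u for u, boards in boards_per_user.items() if boards == {alt_board}}
    return posters_per_topic, multi, alt_only

# Hypothetical parsed posts.
posts = [
    ("alt", "visacoin", "u1"), ("alt", "visacoin", "u2"),
    ("alt", "amerocoin", "u1"), ("alt", "catcoin", "u1"),
    ("alt", "catcoin", "u3"), ("main", "bitcoin", "u2"),
    ("alt", "amerocoin", "u3"),
]
posters_per_topic, multi, alt_only = poster_metrics(posts)
```

Posting only in alt coins is at best a hint, since plenty of real users only read one board, so the alt_only set is a list of candidates to look at and not a list of sock puppets.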