Topic: Forum Word Clouds - Which words are most common in Topics (Read 348 times)

legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
I've tried to replicate the exercise in the OP, but instead of focusing on topic titles (which is what I did before), I went for the whole content of merited posts. I had hopes that it would bring something meaningful, but I'm not convinced now, and I actually prefer the results of the first approach (trending topics) to the latter (merited posts), although the context is different.

The WordCloud I've created is for Meta. What I've done is count the number of occurrences of every word across all the merited Meta posts, then manually group words with a common root (e.g. Merit and Merits) from the top of the list, and finally select by hand, at my discretion, the words I considered relevant (i.e. discarding words like 'the', 'to', 'this' and so on).
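
For anyone wanting to try something similar, the count / group / filter steps above could be sketched roughly like this in Python. It is only an illustration, not the tooling actually used: the function name, the sample posts, the stop-word set and the grouping map are all hypothetical, and the manual selection is approximated with a fixed exclusion list.

```python
import re
from collections import Counter

# Hypothetical stop words; in the post the selection was done by hand.
STOP_WORDS = {"the", "to", "this", "a", "and", "of", "is", "for"}

# Hypothetical grouping of variants under one tag (e.g. Merit and Merits).
GROUPS = {"merits": "merit", "posts": "post", "ranks": "rank"}


def count_words(posts):
    """Count word occurrences across posts, merging variants and dropping stop words."""
    counts = Counter()
    for text in posts:
        for word in re.findall(r"[a-z']+", text.lower()):
            word = GROUPS.get(word, word)  # fold known variants into one tag
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts


# Made-up sample posts, for illustration only.
sample_posts = [
    "Merit is given to the post, not to the rank.",
    "Merits reward good posts in this forum.",
]
counts = count_words(sample_posts)
for tag, n_times in counts.most_common(3):
    print(f"{tag:<40}{n_times}")
```

A fixed grouping map only catches the variants listed in it, so a pass by eye over the top of the list is still needed.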

The result for Meta is as follows:


It varies from the OP WordCloud for Meta, but I don’t see a clear relation between using a certain set of keywords and being merited. This goes to show, as I see it, that it’s not just the words that matter, but proper construction and context. Of course these are not retrievable from just a word count, nor is the post’s overall tone and style.

The number of occurrences of the above words across all the merited Meta posts is as follows:
Code:
tag                                     nTimes
---------------------------------------------
rank                                    5900
Post                                    9978
Merit                                   7631
thread                                  1924
give                                    4565
forum                                   2071
people                                  2062
system                                  2017
sMerit                                  1680
account                                 1304
spam                                    1764
members                                 1764
User                                    1999
think                                   1331
topic                                   1454
time                                    1082
board                                   1214
quality                                 891
need                                    811
know                                    801
receive                                 898
Bitcoin                                 665
trust                                   659
newbies                                 877
source                                  598
campaign                                831
activity                                533
read                                    465
help                                    454
signature                               432
find                                    392
take                                    392
theymos                                 390
work                                    381
better                                  370
list                                    369
Jr                                      359
earn                                    358
Discussion                              349
section                                 349
Hero                                    326
problem                                 319
idea                                    312
Legendary                               299
bitcointalk                             296
English                                 289
trying                                  286
rules                                   280
community                               268
Please                                  268
person                                  267
example                                 265
understand                              259
believe                                 255
bad                                     254
bounty                                  254
reason                                  254
negative                                251
great                                   243
data                                    242
link                                    239
hope                                    238
unique                                  237
agree                                   234
different                               234
OP                                      230
money                                   229
profile                                 228
local                                   226
change                                  224
create                                  218
opinion                                 214
check                                   212
question                                201
meta                                    200
learn                                   197

I'm not sure it's a line of work worth pursuing...
legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
Seems legit. What tool did you use to make these?
Have you manually examined all of the topic titles and then created a Photoshop (or MSPaint) image?

It's based on the recent 200 titles alright, and the images display just about half of them [but the whole thing (except the center) looks uninteresting].
I want to see most of them.

I think if this was a phrase cloud and not limited to thread titles, the whole forum's center and largest would be: "I think"

This initial approach is a rather simple one. What I was looking for was trending words, thus the focus on 200 topic titles per forum section/subsection. I'm working on a more complex one, but it takes quite some time and it has to be done in off hours.
The painting part is done using a tool called Worditout. It's simple, but for this scenario it got the job done.
newbie
Activity: 322
Merit: 0
Using this knowledge everyone is able to create an attractive and very interesting thread Grin in any part of bitcointalk forum.
As for me, using these words to write sentences can be really funny!
hero member
Activity: 1022
Merit: 564
Need some spare btc for a new PC
I'm surprised there is no mention of prostitution, alia and cute little slaves. .

haha that's why I'd like to see one made for politics and society.  Grin

Well, those pics once again show how bad the Economics section is. The most popular keywords mostly consist of common things that could also be popular in other sections. I see a strong lack of any kind of "intellectual" economic discussion there, while people are trying to spam this section mostly with the crypto shit.

Well not that much. Economics is a wide term and pretty much everything is on topic there. Tho, I'm surprised with that "Philippines" and that the words like "end" or "die" aren't a little more involved, since people there tend to create lots of "is bitcoin dead" topics. Even more now when the price's lower. Grin
staff
Activity: 3248
Merit: 4110
I was hoping to construct a reply based on the words used in this section, but my creativity isn't good enough. Quite enjoy these statistics that you keep releasing though. The good thing about these word clouds is that most of these words are ones I would expect to appear in their given sections. I guess that shows that most of the topics/threads in each section are at least somewhat on topic.
full member
Activity: 715
Merit: 220
Nice job! It does indeed give a good overview of the different parts of the forum.
As stated before, filtering out common words (even if it is manual filtering) would be great for making it more accurate.
But again, congrats on your work!
full member
Activity: 1736
Merit: 121
what a pointless thread, op just fishing for merit.

Your post is amplifying hate for innovative work. Again, if you are creative then show it because users will appreciate it too by either giving some merit or thumbs up. But if you do a shit stunt,  Grin.....I trust the users that you must surely get a spank on the ass
full member
Activity: 924
Merit: 148
Well, those pics once again show how bad the Economics section is. The most popular keywords mostly consist of common things that could also be popular in other sections. I see a strong lack of any kind of "intellectual" economic discussion there, while people are trying to spam this section mostly with the crypto shit.
legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
I'm surprised there is no mention of prostitution, alia and cute little slaves. .

I figure they were not yesterday night's trending topics ... (the graphics are based on last 200 topic headers at time of editing).
copper member
Activity: 1330
Merit: 899
🖤😏
I'm surprised there is no mention of prostitution, alia and cute little slaves. .
legendary
Activity: 2450
Merit: 2190
I have created a word cloud for 9 section/subsections selected at my discretion, using the Topic names only, for the most recent 200 Topics in each of these exercises. Topics get moved around often, so the actual words could change from hour to hour really, but it's not a bad starting point for this analysis.
Good work, but these cloud images are not convenient to look at. It would be better to post the most frequently used words as text, in some order. And as mentioned above, the common words and particles that have nothing to do with the crypto world should be omitted.

And I am not surprised that one of the most frequently used words in the current section is "merit". Smiley
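
Posting the words as ordered text could look something like the sketch below. The helper name and column width are hypothetical, and the sample input just borrows a few figures from the Meta count posted elsewhere in this thread.

```python
def format_counts(counts, width=40):
    """Return a two-column text table of words sorted by descending count."""
    lines = [f"{'tag':<{width}}nTimes", "-" * (width + 5)]
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for tag, n_times in ordered:
        lines.append(f"{tag:<{width}}{n_times}")
    return "\n".join(lines)


# Sample input only: a few figures from the Meta count posted in this thread.
sample = {"rank": 5900, "Post": 9978, "Merit": 7631, "thread": 1924}
table = format_counts(sample)
print(table)
```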
legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
<...>

Sure, what you say makes sense. Bear in mind that the tool I used is extremely simple, and going only into subject names is not that profound. Whole threads should give us a better underlying view of favoured concepts/terms. What this does open up is an interesting possibility for anyone with access to better tools, and a greater mastery of them, to perform this kind of analysis applying content/context-based filters to group or exclude terms and so forth.

Even more interesting would be to do stuff such as what I wrote in my reply to vlom a few posts above, with social network analysis of Bitcointalk as a brand. That is out of my personal scope though.
full member
Activity: 420
Merit: 182
Interesting work, but I'd filter out some of the common words which are ambiguous or meaningless without more context like: more, again, never, just, world, take, only, now, etc...

Honestly, upon further consideration I don't know that much can be inferred even from seemingly unambiguous words like, "recommend," as someone could either be recommending for or against a course of action.

To be somewhat more constructive rather than just point out crap I don't like - because who likes a Debbie Downer, right? - I would filter on just coin names for the altcoin sections as that would show relative interest levels and actually be quite useful as a gauge of sentiment/popularity for various coins.
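
A rough sketch of that coin-name filter, in Python. The watch list and the titles are made up for illustration, and the function name is hypothetical.

```python
import re
from collections import Counter

# Hypothetical watch list; a real one would need every coin name and ticker of interest.
COIN_NAMES = {"bitcoin", "ethereum", "litecoin", "monero"}


def count_coin_mentions(titles):
    """Count how many titles mention each coin on the watch list (once per title)."""
    counts = Counter()
    for title in titles:
        words = set(re.findall(r"[a-z0-9]+", title.lower()))
        counts.update(words & COIN_NAMES)  # keep only watch-list words
    return counts


# Made-up titles, for illustration only.
sample_titles = [
    "Is Ethereum still worth mining?",
    "Ethereum vs Litecoin fees",
    "Monero wallet help",
]
mentions = count_coin_mentions(sample_titles)
for coin, n in mentions.most_common():
    print(coin, n)
```

Counting titles rather than raw occurrences keeps one repetitive title from dominating. It still only measures how often a coin comes up, not whether the mention is positive or negative.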

legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
Oh, just realized this is only for the topic titles; I thought at first it was for the content of the last 200 topics.
That one would be interesting....

Yes, it would be great to actually go into the threads, but for now topic titles are the deepest we can go without better access to forum content data.
The idea was to create these word clouds for fun, and see if any weird stuff cropped up on the subjects themselves.

Note: Added bounties..
legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
@OP: could you please provide the figures too. not just the pictures.

Danke.. Unfortunately, the software I used is very simple and only gives me the actual word count if I go and click word by word .. It'd take me ages with this tool to get a word count....

What I really had in mind was doing something similar but based on social network traffic such as Twitter, Facebook and so on, just to see what could be retrieved in relation to Bitcointalk. The idea was to treat the forum as a brand and see what the associated word cloud looked like. This is often done in marketing, along with obtaining the sentiment, temperature and tone of the related messages, for example.
Alas, what I found around for free didn't convince me, and it was rather partial and thus non-representative. This kind of study can be performed by social media agencies, but I lack direct access to the tools.

Another interesting thing to do would be the same as in the OP but going in deeper: not just the Subject name as I have done, but going into the actual threads and getting their content too. That, from outside the forum software, is also too cumbersome to do, but conceptually interesting.
legendary
Activity: 2828
Merit: 6108
Blackjack.fun
Hmm, pretty interesting..
Segwit missing (or I'm missing it) from Bitcoin Discussion, guess nobody cares about discussing it
Philippines and China the top countries in Economics

Edit.

Another interesting thing to do would be the same as in the OP but going in deeper: not just the Subject name as I have done, but going into the actual threads and getting their content too. That, from outside the forum software, is also too cumbersome to do, but conceptually interesting.

Oh, just realized this is only for the topic titles; I thought at first it was for the content of the last 200 topics.
That one would be interesting....
legendary
Activity: 1498
Merit: 1113
what a pointless thread, op just fishing for merit.

not so pointless as your comment. and have a look how many merits OP received. just two from me. the OP did something and i like what he did. that's why i sent him two. btw: i am generous in giving merits. soon i won't have any left.

@OP: could you please provide the figures too. not just the pictures.
legendary
Activity: 2310
Merit: 10758
There are lies, damned lies and statistics. MTwain
what a pointless thread, op just fishing for merit.

You’re pretty dumb for a newbie. Bring something new to the table or lurk in the gutters
member
Activity: 240
Merit: 10
what a pointless thread, op just fishing for merit.
legendary
Activity: 2982
Merit: 2681
Top Crypto Casino
Well, it is good to see that the Bitcoin forum still has Bitcoin as the main topic.
When I first looked at your thread I was expecting "Merit" and "Scam" to be the most used words, due to the rise in scamming accusations regarding merit, as is happening at least on the Meta board. I'm glad to see "Bitcoin" is still the center of the matter in here, as well as "Ethereum" in the Altcoin Section, instead of "Bounty", "ICO" or "Airdrop".

Nevertheless, the Meta activity is still somewhat concerning, for maybe the correct principal subject should be "Forum", as it probably used to be.
Once more, thanks for your work, great, as usual.