Topic: Plagiarism: Where Do We Draw the Line? (Read 874 times)

legendary
Activity: 2940
Merit: 7892
September 10, 2021, 08:48:06 PM
#63
We can use all sorts of tricks to catch plagiarists and they'll just make their posts more and more obscure - copying from outside sources, translating from other languages, etc - as long as there is a financial incentive to do so.

The lengths some people will go to in order to avoid having an original thought are pretty amazing. I mean, how hard is it to just think of a sentence in your head and transcribe it into a post?

The other day a WO semi-regular was banned for copying sports articles written in Italian and Google Translating them. He was a legendary too. Possibly the worst part is he's not even Italian.
legendary
Activity: 3654
Merit: 8909
https://bpip.org
September 09, 2021, 09:53:24 AM
#62
You'd first need a set of master data with all possible 6 word snippets of text from all the existing posts. (provided someone is copying only from existing Bitcoin posts). This would then have to be compared with the set of snippets formed from every new post. While this could be done, I believe the space and memory requirements would be pretty huge. Though, doesn't google do it for like, all of the internet? And Altavista used to do it at one time. Now, google has humungous capacity of course but I don't think that the old sites like Altavista had those.

There are clever indexing methods that make this kind of search relatively quick and also can match slight variations, even word spinning to an extent.
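
To make the indexing idea a bit more concrete, here is a minimal sketch of one such structure: an inverted index that maps every 6-word snippet to the posts containing it. The class and function names are made up for illustration, the whole index sits in memory, and it only finds exact snippet matches. Catching word spinning would need extra normalization that is not shown here.

```python
from collections import defaultdict


def snippets(text, size=6):
    """Return the set of all runs of `size` consecutive words in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


class SnippetIndex:
    """Hypothetical in-memory index: 6-word snippet -> ids of posts containing it."""

    def __init__(self):
        self.posts_by_snippet = defaultdict(set)

    def add(self, post_id, text):
        for snippet in snippets(text):
            self.posts_by_snippet[snippet].add(post_id)

    def matches(self, text):
        """Return {post_id: number of shared snippets} for a new text."""
        hits = defaultdict(int)
        for snippet in snippets(text):
            # One dictionary lookup per snippet, however many posts are indexed.
            for post_id in self.posts_by_snippet.get(snippet, ()):
                hits[post_id] += 1
        return dict(hits)


index = SnippetIndex()
index.add(1, "bitcoin fees rise when the mempool is full of pending transactions")
index.add(2, "always back up your seed phrase on paper and store it somewhere safe")
print(index.matches(
    "as we know fees rise when the mempool is full of pending transactions today"
))  # prints {1: 5}
```

With a structure like this, the work for a new post depends on how many snippets it contains and how many posts share them, not on scanning every stored post in turn.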

I'm not really sure what you're proposing (checking your posts against all other posts? why exactly?) but the plagiarism problem in general is not a technical one. We can use all sorts of tricks to catch plagiarists and they'll just make their posts more and more obscure - copying from outside sources, translating from other languages, etc - as long as there is a financial incentive to do so.
copper member
Activity: 1610
Merit: 1898
Amazon Prime Member #7
September 09, 2021, 09:25:40 AM
#61
The problem is that it is really not possible to check every new post for plagiarism because the cost of checking an additional post will grow for every additional post written. For example, if there are 100 posts that exist on the forum, the cost of checking a new post against all existing posts is 100 units. Once there are 1000 posts on the forum, the cost of checking a single new post against all existing posts is 1000 units. For each additional post made, it costs one additional unit to check a single additional post. This is obviously not sustainable.
Thanks for chiming in. Discussing these things is always interesting. You are talking about the time complexity of such a search and match algorithm.
Right. As the number of posts increases, so does the amount of time it takes to check one additional post.

You'd first need a set of master data with all possible 6 word snippets of text from all the existing posts. (provided someone is copying only from existing Bitcoin posts). This would then have to be compared with the set of snippets formed from every new post. While this could be done, I believe the space and memory requirements would be pretty huge.
You are describing one way in which all current posts could be checked for plagiarism (at least plagiarism by copying other users' posts).

What you describe is missing two things. Existing posts would not be checked for plagiarism, and if a post is written in the future and is subsequently plagiarized, the setup you describe would not catch it.
legendary
Activity: 2730
Merit: 7065
Farewell, Leo. You will be missed!
September 09, 2021, 07:02:42 AM
#60
AFAIK, in such cases, if the user has ample/sufficient proof that he owns the account that initially published/posted the article in another platform, social media or whatever, then they prolly wouldn't be banned, the only problem here is the means by which such user is going to prove beyond reasonable doubts to be reposting an already existing work of his.
That's not too difficult to do and prove. One way you can prove the article is yours and that the account that posted it belongs to you is to put up links in the source someone claims you stole it from that point to all the other sites where you posted the article. That should include Bitcointalk. That's sufficient proof that you are the same person. Or you know the original poster and/or paid him to help you. 
legendary
Activity: 1946
Merit: 1224
'Life's but a walking shadow'!
September 08, 2021, 07:55:36 AM
#59
I personally came across a case where someone got banned only for publishing his own articles. Mods ban for plagiarism, but I don't understand why they banned him for publishing his own article. He was a crypto blogger who used to publish the same article on different platforms.
If you don't mind, can you lead us to this case through a link or something. AFAIK, in such cases, if the user has ample/sufficient proof that he owns the account that initially published/posted the article in another platform, social media or whatever, then they prolly wouldn't be banned, the only problem here is the means by which such user is going to prove beyond reasonable doubts to be reposting an already existing work of his. Having said that though, copying your article exactly the same way and posting on numerous platforms is somewhat unnecessary, since you wrote the initial topic yourself, you can just basically do a brief overview of it, and obviously it'll come in different words, then at the end of it, you just simply put a link to of course the original source which still happens to be your work.
newbie
Activity: 1
Merit: 0
September 08, 2021, 07:44:44 AM
#58
Plagiarism is simply copying and pasting existing work, which everyone is aware of. Members who plagiarise think only of their own advancement or benefit. The reason the majority of them behave this way is that they want to be mentioned and known, and to make their work look perfect. Here in the Bitcointalk community, the objective of users who indulge in it is to earn a lot of merits that will elevate them. They don't realise that plagiarism is not original work; from my understanding, plagiarism can only be called a setback.
legendary
Activity: 1876
Merit: 1157
September 08, 2021, 01:18:20 AM
#57
The problem is that it is really not possible to check every new post for plagiarism because the cost of checking an additional post will grow for every additional post written. For example, if there are 100 posts that exist on the forum, the cost of checking a new post against all existing posts is 100 units. Once there are 1000 posts on the forum, the cost of checking a single new post against all existing posts is 1000 units. For each additional post made, it costs one additional unit to check a single additional post. This is obviously not sustainable.
Thanks for chiming in. Discussing these things is always interesting. You are talking about the time complexity of such a search and match algorithm. I read some of this stuff back when I took a course in Python. It was enlightening to read about algorithms and make small enumeration programs. Programming, I guess, is all about practice and building upon existing complexity. I did write a program to sort some very poorly formatted data that was fed into Excel as CSV. But being busy with other stuff left no room to continue learning.

You'd first need a set of master data with all possible 6 word snippets of text from all the existing posts. (provided someone is copying only from existing Bitcoin posts). This would then have to be compared with the set of snippets formed from every new post. While this could be done, I believe the space and memory requirements would be pretty huge. Though, doesn't google do it for like, all of the internet? And Altavista used to do it at one time. Now, google has humungous capacity of course but I don't think that the old sites like Altavista had those.
newbie
Activity: 7
Merit: 0
September 07, 2021, 05:18:32 AM
#56
Plagiarism destroys one’s reputation, the penalty is an immediate ban of the user. IMHO many newbies tend to plagiarise because they don't read the rules and guidelines.
“No idea is completely new, only evolving.” Many writers get ideas and inspiration from other writers, but you cannot copy a chunk of sentences from another writer word for word without being accused of plagiarism. Paraphrasing doesn't help either. Plagiarism can be accidental or intentional. An example of intentional plagiarism is when a post written in one language is rewritten in another and posted on a local board without citing the original poster. I don’t quite know where the forum stands on accidental plagiarism. When newbies (I use newbies as a case study because I believe they are more likely to plagiarise, although I do not have the stats to prove that) are interested in a particular topic, they gather all sorts of information. This may eventually cause problems distinguishing between common knowledge, facts and information that needs citation.
I personally came across a case where someone got banned only for publishing his own articles. Mods ban for plagiarism, but I don't understand why they banned him for publishing his own article. He was a crypto blogger who used to publish the same article on different platforms. If other platforms don't object, getting banned here is hard to understand. I think moderators should update the system so that it is fair to members.
A user works day by day to contribute to the community and builds a reputable profile, but one flag can destroy that dignity.
Note: it's all my own opinion.
copper member
Activity: 1610
Merit: 1898
Amazon Prime Member #7
September 07, 2021, 01:56:48 AM
#55
This could be an interesting exercise if done on the, say, 100 most prolific and discernibly original posters on the forum. I think I will make the cut in at least the top 200, if not 100. Anyways, that's an idea right there for the OP to check "Where to draw the line".
Go for it if you want (I don't know how good your researching skills are), or maybe someone like LoyceV or one of the statistics gurus will do it.  If someone does do it though, I do hope it doesn't result in good members getting banned--unless they obviously deserve to be.
If I had anywhere near the skills needed to do this, I would probably be a software dev myself and not installing propulsion equipment in train engines, LOL. It's more of an idea for someone with the dev skills to do it. I can then give myself one of those pompous managerial designations like "Research design consultant" or something.

A general algorithm would probably parse through all of the post history and compare it with everyone else's in snippets of 6 words each finding a match percentage. Lots of enumeration, which I was never good at. Then you'd have to root out the edge cases like quotes and references. It can be done and could even be an interesting open source project.

EDIT: Maybe we could have a sort of hackathon bounty for building this and some other tools on a platform like Gitcoin. Would be great to see a Bitcointalk Tribe in Gitcoin. See, there I go giving away "ideas" again. What say ya? @Theymos
It is not terribly difficult to remove things such as quotes from posts. Markup (things such as bold, and links) can also be removed trivially.
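
As a rough sketch of what that stripping could look like, assuming posts are stored with BBCode-style tags such as [quote ...]...[/quote], [b] and [url=...] (the forum's actual storage format may differ, and the function name is made up):

```python
import re

# Assumption: BBCode-style markup. A quote block is matched only when it
# contains no further opening [quote tag, so nested quotes go innermost first.
QUOTE_RE = re.compile(
    r"\[quote[^\]]*\]((?!\[quote).)*?\[/quote\]", re.IGNORECASE | re.DOTALL
)
TAG_RE = re.compile(r"\[/?[a-z]+[^\]]*\]", re.IGNORECASE)


def strip_quotes_and_markup(post):
    """Drop quoted blocks, then remove the remaining tags but keep their text."""
    previous = None
    while previous != post:          # repeat until no quote block is left
        previous = post
        post = QUOTE_RE.sub(" ", post)
    post = TAG_RE.sub("", post)      # [b]word[/b] -> word
    return " ".join(post.split())    # collapse leftover whitespace


raw = (
    "[quote author=alice]first [quote author=bob]inner[/quote] outer[/quote]"
    "My [b]own[/b] reply with a [url=https://example.com]link[/url]."
)
print(strip_quotes_and_markup(raw))  # prints: My own reply with a link.
```

Note that the tag pattern is deliberately crude and would also remove bracketed words like [sic], so a real tool would want a whitelist of known tags.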

Splitting up the text of posts into sets of 6 words will be expensive, but is doable. A text with n words (n ≥ 6) will have n - 5 sets of 6 consecutive words.
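
A minimal sketch of the splitting step, assuming words are simply whitespace-separated and case is ignored (the function name is made up). Counting the windows directly, a text of n words gives n - 6 + 1 = n - 5 snippets:

```python
def six_word_snippets(text, size=6):
    """Return every run of `size` consecutive words in `text`, in order."""
    words = text.lower().split()
    # n words give n - size + 1 windows, and none if the text is shorter than size.
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


sample = "the quick brown fox jumps over the lazy dog"   # 9 words
result = six_word_snippets(sample)
print(len(result))   # prints 4
print(result[0])     # prints: the quick brown fox jumps over
```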

The problem is that it is really not possible to check every new post for plagiarism because the cost of checking an additional post will grow for every additional post written. For example, if there are 100 posts that exist on the forum, the cost of checking a new post against all existing posts is 100 units. Once there are 1000 posts on the forum, the cost of checking a single new post against all existing posts is 1000 units. For each additional post made, it costs one additional unit to check a single additional post. This is obviously not sustainable.
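
The arithmetic above can be written out as a small sketch. It assumes the same model as the paragraph (one unit of cost per existing post compared) and describes only a naive post-by-post comparison; the function names are made up.

```python
def naive_check_cost(existing_posts):
    """Cost of checking one new post, at one unit per existing post."""
    return existing_posts


def total_naive_cost(total_posts):
    """Cumulative cost if every post is checked against all earlier ones."""
    return sum(naive_check_cost(k) for k in range(total_posts))


print(naive_check_cost(100), naive_check_cost(1000))  # prints 100 1000
print(total_naive_cost(1000))                         # prints 499500
```

So under this model the cost per post grows linearly, and the cumulative cost for N posts is N(N-1)/2.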
jr. member
Activity: 56
Merit: 37
September 07, 2021, 01:46:56 AM
#54
Plagiarism destroys one’s reputation, the penalty is an immediate ban of the user. IMHO many newbies tend to plagiarise because they don't read the rules and guidelines.
“No idea is completely new, only evolving.” Many writers get ideas and inspiration from other writers, but you cannot copy a chunk of sentences from another writer word for word without being accused of plagiarism. Paraphrasing doesn't help either. Plagiarism can be accidental or intentional. An example of intentional plagiarism is when a post written in one language is rewritten in another and posted on a local board without citing the original poster. I don’t quite know where the forum stands on accidental plagiarism. When newbies (I use newbies as a case study because I believe they are more likely to plagiarise, although I do not have the stats to prove that) are interested in a particular topic, they gather all sorts of information. This may eventually cause problems distinguishing between common knowledge, facts and information that needs citation.
In fact, in a sense, everything we say and all the knowledge we have learned is a summary of what our predecessors said. It is the result of what we learned since childhood, including some technical language, and it shows when we express our views. Is this plagiarism? Plagiarism should be defined as copying throughout. If quoting a few famous sayings gets you judged a plagiarist, that is a bit nitpicky. Plagiarism is prohibited and originality is encouraged; there is nothing wrong with that. I have no objection to banning accounts for plagiarism, but I find it hard to accept bans that come from nitpicking, like looking for bones in an egg.
legendary
Activity: 1876
Merit: 1157
September 07, 2021, 01:24:16 AM
#53
This could be an interesting exercise if done on the, say, 100 most prolific and discernibly original posters on the forum. I think I will make the cut in at least the top 200, if not 100. Anyways, that's an idea right there for the OP to check "Where to draw the line".
Go for it if you want (I don't know how good your researching skills are), or maybe someone like LoyceV or one of the statistics gurus will do it.  If someone does do it though, I do hope it doesn't result in good members getting banned--unless they obviously deserve to be.
If I had anywhere near the skills needed to do this, I would probably be a software dev myself and not installing propulsion equipment in train engines, LOL. It's more of an idea for someone with the dev skills to do it. I can then give myself one of those pompous managerial designations like "Research design consultant" or something.

A general algorithm would probably parse through all of the post history and compare it with everyone else's in snippets of 6 words each finding a match percentage. Lots of enumeration, which I was never good at. Then you'd have to root out the edge cases like quotes and references. It can be done and could even be an interesting open source project.
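
One possible reading of "match percentage", sketched under the assumption that it means the share of one post's 6-word snippets that also appear in another post. The function names are made up, and quotes and references are assumed to have been removed beforehand.

```python
def snippets(text, size=6):
    """Return the set of all runs of `size` consecutive words in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def match_percentage(candidate, source, size=6):
    """Percentage of the candidate's snippets that also occur in the source."""
    candidate_snippets = snippets(candidate, size)
    if not candidate_snippets:       # candidate shorter than one snippet
        return 0.0
    shared = candidate_snippets & snippets(source, size)
    return 100.0 * len(shared) / len(candidate_snippets)


source = "a b c d e f g h"
candidate = "a b c d e f g x"        # last word changed
print(round(match_percentage(candidate, source), 1))  # prints 66.7
```

Where to set the threshold on that percentage is exactly the "where to draw the line" question, and the sketch does not answer it.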

EDIT: Maybe we could have a sort of hackathon bounty for building this and some other tools on a platform like Gitcoin. Would be great to see a Bitcointalk Tribe in Gitcoin. See, there I go giving away "ideas" again. What say ya? @Theymos
member
Activity: 143
Merit: 17
September 06, 2021, 08:59:00 AM
#52
Research is all about reading and compiling other people's work into your own; this is what a literature review is all about. If you check the Bitcoin whitepaper, you can see lots of references made by Satoshi. Invention is a gradual process and you can't do it all alone: you take other people's work, look for its weaknesses and offer a solution to them, and that's it, but a reference must be made to the existing project you drew on. Merit fishing or not, do not steal people's work without citing them.

Anyways, that an idea right there for the OP to check "Where to draw the line".
He has a couple of hours until he gets banned, since his idea is only to divert attention.
I don't really know why some people engage in plagiarism. If someone wants to impart knowledge to another person through someone else's work or existing articles, I think the best option is to add the source after using the work, to indicate that the work does not belong to you. Some people claim that they are the author of the work, and that is one of the major reasons they get into trouble.
legendary
Activity: 2730
Merit: 7065
Farewell, Leo. You will be missed!
September 06, 2021, 06:55:03 AM
#51
I remember that there was an instance of plagiarism even against someone like "Lauda", who I believe was never at a loss for words and for whom it would have been silly to use someone else's words.
There was, yeah. I think Lauda even said something like if someone were to go through all his/her posts, they could probably find cases where there are missing sources. I am not 100% sure about this, but I think that's how I remember it.

Then again, when writing 2000 plus posts over 4 years while reading articles and gathering information from multiple sources, it is possible that a person's words may seem plagiarized. I think it is difficult to come to a conclusion without some real world exercise.
I thought and wrote about that yesterday.

If someone browses this forum, reads books about Bitcoin, and watches YouTube videos on that subject, that person has probably acquired all kinds of info and knowledge that he will remember. Naturally, it's hard to remember what you heard or read where. But when making a post or replying to someone on Bitcointalk, you can always say something like: "I read", "I remember seeing", "Member X said in one of his posts", "I can't remember the source, but...." 

All that relieves you of the responsibility that you are trying to make it look like what was written are your words. You don't have to cite the source for everything if you don't know it, but make sure to mention that someone else said it. I haven't looked into what Mpamaegbu and Pokapoka124 are blamed for, so this is only a general recommendation.

Imagine you like helping others with their issues regarding Bitcoin, wallets, security, etc. Now imagine there are 5, 10, 20, X number of other people doing the same. All of you have similar interests, you are all writing similar recommendations, use similar terminology, phrases, etc. Sooner or later, there will be 2 or 3 posts that look alike. But that's why admins take their time and probably consider many factors before applying the ban hammer.
legendary
Activity: 2170
Merit: 3858
Farewell o_e_l_e_o
September 05, 2021, 10:41:31 PM
#50
Even if such software exists, it should use the famous "five consecutive words" rule.
Text spinning and paraphrasing are considered plagiarism. If you intentionally spin text around, paraphrase original content and intentionally (surely, in such cases) skip the source of the content, you are a plagiarist.

In academic space, plagiarism is treated more seriously than in the forum.

Software is a tool to detect potential cases; moderators then handle them with support from software and community reports.
legendary
Activity: 1078
Merit: 1022
Hello Leo! You can still win.
September 05, 2021, 07:53:09 PM
#49
Isn't there some advanced software that academic institutions use to detect plagiarism? Wonder if we can get an overview of the algorithm used by that software. That could easily govern the set of rules that should form plagiarism.

Even if such software exists, it should use the famous "five consecutive words" rule. Also, it could be some set of plagiarism rules implemented by the software developers. I think the plagiarism rules frequently adopted are not universal, but rather based on ethical considerations. I think some plagiarism rules in this forum are not cast in stone but revolve around the intention and the level of prior positive contributions.
If the strictness were applied to explaining the rules rather than to enforcing them, that would be fine.
legendary
Activity: 3248
Merit: 3098
September 05, 2021, 07:11:48 PM
#48
He has a couple of hours until he gets banned, since his idea is only to divert attention.
How do you know that? 

he also found Pokapoka124's plagiarism. It seems to me to be a more serious case than Mpamaegbu (I see you followed the whole thing there), Pokapoka124 even tried to avoid everything by subsequently adding source links.

After I discovered Pokapoka124's plagiarism, he quickly added a link to the source and escaped the ban. But he is indeed a deliberate plagiarist, as evidenced by the second report. The original is dated January 19, 2021; the plagiarized post is dated January 21, 2021 (no links provided). If you check his message history, you will see that he forms most of his texts by copying clippings from news sites.

Plagiarism
User: Pokapoka124
Post link: https://bitcointalksearch.org/topic/--5311064 (https://ninjastic.space/topic/5311064)
Additional link to third source: https://cointelegraph.com/news/bible-quotation-found-in-bitcoin-block-number-666-666

People usually encrypt messages on Bitcoin blockchain through a sent transaction. These messages varies from different individual and organisation records. The first message recorded on Bitcoin blockchain network was in 2009. It was encrypted in the bitcoin blockchain’s genesis block.
Recently a certain biblical message was seen encrypted in a transaction in block number 666,666 of the Bitcoin (BTC) blockchain. The message is a quotation from the sixth book in the New Testament, St.Paul’s epistle to the Romans. It reads:
“Do not be overcome by evil, but overcome evil with good – Romans 12:21.”

The individual who sent the transaction paid over $50 in fees. This is quite expensive compared to the normal peak average fee on Bitcoin blockchain in a day via bitinfocharts.com records. This shows that the sender’s interest was to have the transaction, and the message, imbedded in block 666,666. Why pay such a high fee just to save a message on the blockchain?

hero member
Activity: 1400
Merit: 655
Bitcoin is achievement
September 05, 2021, 05:37:57 PM
#47
Anyways, that an idea right there for the OP to check "Where to draw the line".
He has a couple of hours until he gets banned, since his idea is only to divert attention.
I don't really know why some people engage in plagiarism. If someone wants to impart knowledge to another person through someone else's work or existing articles, I think the best option is to add the source after using the work, to indicate that the work does not belong to you. Some people claim that they are the author of the work, and that is one of the major reasons they get into trouble.
legendary
Activity: 3234
Merit: 6706
Proudly Cycling Merits for Foxpup
September 05, 2021, 11:58:11 AM
#46
He has a couple of hours until he gets banned, since his idea is only to divert attention.
How do you know that? 

One more thing to take note is that most users who get onto good campaigns start getting scrutinized by a lot of people.
You think?  If that's true, I wonder if it's being done by members who didn't get accepted into the campaign, other forum members, or the campaign manager.  Whatever the case may be, it's probably a good idea (though Mpamaegbu got caught in such an investigation, and personally I think it was a stupid mistake on his part).  Plagiarists certainly don't need to be rewarded for stealing other people's work.

This could be an interesting exercise if done on the, say, 100 most prolific and discernibly original posters on the forum. I think I will make the cut in at least the top 200, if not 100. Anyways, that's an idea right there for the OP to check "Where to draw the line".
Go for it if you want (I don't know how good your researching skills are), or maybe someone like LoyceV or one of the statistics gurus will do it.  If someone does do it though, I do hope it doesn't result in good members getting banned--unless they obviously deserve to be.
full member
Activity: 148
Merit: 102
September 05, 2021, 11:38:11 AM
#45
Anyways, that an idea right there for the OP to check "Where to draw the line".
He has a couple of hours until he gets banned, since his idea is only to divert attention.
legendary
Activity: 1876
Merit: 1157
September 05, 2021, 11:34:14 AM
#44
Well this issue really isn't just one for the forum but probably one that linguists and proof-readers must also struggle with. Isn't there some advanced software that academic institutions use to detect plagiarism? Wonder if we can get an overview of the algorithm used by that software. That could easily govern the set of rules that should form plagiarism.

One more thing to take note of is that most users who get onto good campaigns start getting scrutinized by a lot of people. I remember that there was an instance of plagiarism even against someone like "Lauda", who I believe was never at a loss for words and for whom it would have been silly to use someone else's words.

Then again, when writing 2000 plus posts over 4 years while reading articles and gathering information from multiple sources, it is possible that a person's words may seem plagiarized. I think it is difficult to come to a conclusion without some real world exercise. Why not do a challenge here at the forum? I think there are a lot of original posters here with 2000-4000 samples of their writing available. I can emphatically claim that I have NEVER copied anything verbatim or intentionally. If I use an article, I just put it in quotes. Yet who is to say that I never missed adding the quotes in making 2000 posts, or that my train of thought was not somehow affected by what I may have read somewhere else, in a way that even if I am writing those words on my own, they are subconsciously mirroring something I have read the same day or a few days before.

This could be an interesting exercise if done on the, say, 100 most prolific and discernibly original posters on the forum. I think I will make the cut in at least the top 200, if not 100. Anyways, that's an idea right there for the OP to check "Where to draw the line".