Topic: [FIXED] Homographs are fixed. Thank you theymos, again. See my report. (Read 724 times)

legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
member
Activity: 164
Merit: 35
Earn 20% ref commission https://bit.ly/2MaHCEr
Copied text from the search results and tested here >
The URL is missing; however, I guess it's the same as the URL below:
https://www.textmagic.com/free-tools/unicode-detector
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
Could you point me to the original content for the first message, or to the source it was copied from?

No problem >

The original :
~ snip~

Well forum may not be the best solution for publishing news but a lot of people are simply accustomed to forums ... anyway thank you for your efforts and answers here.
The copy:
Greetings  Well, forum may not be the best solution for publishing news but a lot of people are simply accustomed to forums ... anyway thank you for your efforts and answers here.

BTW, I'll copy this to the plagiarism accusation thread.
hero member
Activity: 784
Merit: 1416
Could you point me to the original content for the first message, or to the source it was copied from?
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
I have tested it now and I can say that this is great!! I'm able to search for homographs as before and I get results highlighted in yellow as before, but now the actual posts are converted to Latin, so it's easier to search directly for plagiarism.

Test Example >

Here is one from today >
Greetings  Well, forum may not be the best solution for publishing news but a lot of people are simply accustomed to forums ... anyway thank you for your efforts and answers here.
This is how it looks in the search engine, same as before>


Copied text from the search results and tested here >


Copied text from the actual post and tested here >




Here is the effect of the fix on the Cyrillic posting outside the local section >

я paд чтo yзнaл o пpoeктe пoчти в caмoм нaчaлe eгo paбoты, нpaвитcя чтo paзpaбoтчики внeдpяют cиcтeмy пocтeпeннo, нe фopcиpyя coбытия!лyчшe cдeлaть внaчaлe пpoдyкт и yжe пoтoм выxoдить c ним, a нe дopaбaтывaть eгo yжe в пpoцecce.yдaчи вaм!


legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
The English sections should only contain English. If a post is written in Russian in one of the English sections, it is off topic and should be reported.

Yes, I report every post I find written in languages other than English.

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.

Great, I'll be monitoring over the next few days to see how it goes Smiley
copper member
Activity: 2996
Merit: 2374
The English sections should only contain English. If a post is written in Russian in one of the English sections, it is off topic and should be reported.
legendary
Activity: 4536
Merit: 3188
Vile Vixen and Miss Bitcointalk 2021-2023
Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.
What does this mean for Russian text that is legitimately posted in English sections?

For reference, the correct translation of "ктo-нибyдь" is "someone" or "somebody", not "who - нибyдь". Come on, even Google Translate gets that one right. Roll Eyes
Nope, Google Translate can't make heads or tails of it now. Sad This could be a problem (though whether it's a bigger problem than plagiarism remains to be seen).
administrator
Activity: 5222
Merit: 13032
Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.
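A minimal sketch of what "done at display time" could look like. This is not the forum's actual code, which is not shown in this thread; `render_post`, `is_english_board`, and `CYRILLIC_TO_LATIN` are hypothetical names, and the mapping is only a small subset of the look-alikes listed elsewhere in the thread. The stored text is never modified, which is why the effect is retroactive.

```python
# Hypothetical sketch: names and the mapping subset are assumptions, not forum code.
CYRILLIC_TO_LATIN = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
})


def render_post(stored_text: str, is_english_board: bool) -> str:
    """Return the text to display; the stored post itself is left untouched."""
    if not is_english_board:
        # Local (non-English) boards are shown exactly as stored.
        return stored_text
    return stored_text.translate(CYRILLIC_TO_LATIN)
```

Because only the look-alike letters are swapped, genuine Russian text shown on an English board would come out as a Latin/Cyrillic mix under this sketch, which matches the concern raised about translation tools.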
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(

To be more serious:
Don't these spammers who use special characters realize that mods will easily find those posts and delete them? Their activity will decrease and they won't get paid... It's the kind of thing that can be spotted easily, so it isn't worth the effort, but this is just my opinion... Do you think they don't read the Meta section at all?

I asked many times to add those to the rules but got no support from theymos. They get away with only deleted posts instead of a ban: they are hiding plagiarism, but it's difficult to prove.
Oh, they read Meta for sure. When I suggested banning everyone who has more than 1 changed character in a post, they started posting with only one homograph: the popular "a very" spam.
hero member
Activity: 1442
Merit: 629
Vires in Numeris
Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.
Correct. That's why theymos wants to auto-replace them only on the English boards.
I'm not sure if that's going to help though; plagiarism by homograph attacks is much easier to detect than plagiarism through text spinners.
Is it also possible to auto-replace some other kinds of strings, like 'good project' etc., with something like this: 'please ban me I'm a bounty hunter'? Cheesy
Also, you have to wait; report badges were first in line to be implemented Smiley

To be more serious:
Don't these spammers who use special characters realize that mods will easily find those posts and delete them? Their activity will decrease and they won't get paid... It's the kind of thing that can be spotted easily, so it isn't worth the effort, but this is just my opinion... Do you think they don't read the Meta section at all?

legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
It's good that you guys created a list with them but to be realistic, I don't think theymos will fix any of these.

I think he has more important items on his agenda.

He said that if the homograph problem became more serious, he would implement this "fix". I think 80 homographs per day is a serious thing.
sr. member
Activity: 616
Merit: 279
It's good that you guys created a list with them but to be realistic, I don't think theymos will fix any of these.

I think he has more important items on his agenda.
legendary
Activity: 3290
Merit: 16489
Thick-Skinned Gang Leader and Golden Feather 2021
Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.
Correct. That's why theymos wants to auto-replace them only on the English boards.
I'm not sure if that's going to help though; plagiarism by homograph attacks is much easier to detect than plagiarism through text spinners.
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
If I understood correctly, they are mixing letters from different alphabets. This could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then manually checking the message

I mean, I don't see them getting very far with this trick.

I think Quickseller was maybe hinting at an automated program that can check the different language sections for valid and invalid characters.


What I meant was: if your message contains a small % of Cyrillic characters, because some ordinary characters were substituted to hide the plagiarism, that can be spotted quite easily by checking the text.

Regular expressions, if I remember correctly, can do that quite easily for characters from other languages.



The easiest way to spot it is by searching for a single Cyrillic character, for example "a", and excluding the local sections.
Then you get all the posts listed; often there are posts in Russian, which I also report. I wish I had a report button in the search results, but... no.


Great, thanks everyone for the help. Now we're going to sit and wait for a reaction from headquarters.

A bump to attract theymos' attention, I think I have to hire a bumping bot here Cheesy jk.
hero member
Activity: 784
Merit: 1416
If I understood correctly, they are mixing letters from different alphabets. This could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then manually checking the message

I mean, I don't see them getting very far with this trick.

I think Quickseller was maybe hinting at an automated program that can check the different language sections for valid and invalid characters.


What I meant was: if your message contains a small % of Cyrillic characters, because some ordinary characters were substituted to hide the plagiarism, that can be spotted quite easily by checking the text.

Regular expressions, if I remember correctly, can do that quite easily for characters from other languages.
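The regular-expression idea can be sketched roughly as follows. This is only an illustration: the function names and the 30% threshold are assumptions, and only the basic Cyrillic block (U+0400 to U+04FF) is checked, not Greek or Armenian look-alikes.

```python
import re

# Basic Cyrillic block and plain ASCII letters.
CYRILLIC = re.compile(r"[\u0400-\u04FF]")
LATIN = re.compile(r"[A-Za-z]")


def cyrillic_share(text: str) -> float:
    """Fraction of the Latin + Cyrillic letters in `text` that are Cyrillic."""
    cyr = len(CYRILLIC.findall(text))
    lat = len(LATIN.findall(text))
    total = cyr + lat
    return cyr / total if total else 0.0


def looks_like_homograph_post(text: str, max_share: float = 0.3) -> bool:
    """Flag mostly-Latin text that contains a few Cyrillic letters.

    The threshold is an arbitrary assumption; a post written entirely in
    Russian has a share near 1.0 and is not flagged by this check.
    """
    share = cyrillic_share(text)
    return 0.0 < share <= max_share
```

A post flagged this way would still need the manual check described above, since a legitimate post may quote a Russian word or name.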

legendary
Activity: 1288
Merit: 1926
฿ear ride on the rainbow slide
If I understood correctly, they are mixing letters from different alphabets. This could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then manually checking the message

I mean, I don't see them getting very far with this trick.

I think Quickseller was maybe hinting at an automated program that can check the different language sections for valid and invalid characters.

Personally I favor posting unpleasant messages on ICOs that employ Bots and shills to promote their product. (Like I have done before)
I make them all different so I can't be reported for multi posts.

If others start doing that then eventually it will be pointless to use shills to promote ICOs.

I read the white paper. I’ve also stayed around after the countless delays and dates being changed

I think that the potential of the Кrios to take advantage of the computing power of the entire Internet destroys the fictitious belief that the cryptocurrency has no value, is a bubble or the latest fashion of technology.

Кrios provides new forms of financing to companies wishing to raise funds for their startup projects. The Кrios platform includes a centralized exchange of listings with decentralized interconnection.

This is a great project, bringing great benefits to the community. Not only that, it also brings new development for all of us. It is pride and happiness to be able to participate and I've known about ico for a long time, but this is probably the first time I've been so surprised to see the benefits and the advancement of your ideas.

Go for it guys this is a great project!! This is an amazing project, When I read your white sheet, I was totally delighted with how this would change our life. I think that it depends on each of us.

Just some of the fake comments by new shills posting on this thread.

A shill is a confidence trickster or swindler who poses as a genuine customer to entice or encourage others.




Is it wise to trust an ICO or coin that uses dishonesty to attract investors?


hero member
Activity: 784
Merit: 1416
If I understood correctly, they are mixing letters from different alphabets. This could be quite easy to spot by:

  • parsing the message
  • reporting the message
  • then manually checking the message

I mean, I don't see them getting very far with this trick.
copper member
Activity: 2996
Merit: 2374
Some of these are "legit" symbols in various languages, correct? For example Russian and I believe Hebrew use different symbols than English does.

Maybe someone can compile a list of symbols used in each language in the local section (along with English), and those symbols can be all that is allowed to be used.
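A rough sketch of the per-section allow-list idea, assuming the script of a letter can be read from the first word of its Unicode character name. The board names and the `ALLOWED_SCRIPTS` table are hypothetical; a real list would have to be compiled per local section as suggested.

```python
import unicodedata

# Hypothetical board -> allowed script prefixes, as they appear in Unicode names.
ALLOWED_SCRIPTS = {
    "english": ("LATIN",),
    "russian": ("LATIN", "CYRILLIC"),
}


def disallowed_letters(text: str, board: str) -> list:
    """Return the letters in `text` whose script is not allowed on `board`."""
    allowed = ALLOWED_SCRIPTS[board]
    bad = []
    for ch in text:
        if not ch.isalpha():
            continue  # digits, punctuation and spaces are not checked
        name = unicodedata.name(ch, "")
        if not name.startswith(allowed):
            bad.append(ch)
    return bad
```

An empty result means every letter belongs to an allowed script; anything else could be reported or rejected at posting time.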

Edit : 🔑
staff
Activity: 2436
Merit: 2347
Characters that look the same in Latin and Cyrillic:

Code:
a, A, c, C, e, E, p, P, o, O, y, x, X, B, H, K, T, M


Latin

a  a  -->
A  A  -->
c  c  -->
C  C  -->
e  e -->
E  E  -->
K  K  -->
p  p -->
P  P  -->
o  o -->
O  O  -->
y  y -->
x  x -->
X  X  -->
B  B  -->
H  H  -->
T  T  -->
M  M  -->
....................
Cyrillic

a  &‌#1072;
A  &‌#1040;
c  &‌#1089;
C  &‌#1057;
e  &‌#1077;
E  &‌#1045;
К  &‌#1050;
p  &‌#1088;
P  &‌#1056;
o  &‌#1086;
O  &‌#1054;
y  &‌#1091;
x  &‌#1093;
X  &‌#1061;
B  &‌#1042;
H  &‌#1053;
T  &‌#1058;
M  &‌#1052;
....................
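The numbers in the Cyrillic column are decimal HTML character references. A small sketch for checking what each one really is, assuming the references are copied as text; `decode_reference` and `describe` are illustrative names. The copies in this thread appear to carry an invisible zero-width character after the `&` (assumed here to be U+200B or U+200C) so that they display literally, and the sketch strips it first.

```python
import html
import unicodedata


def decode_reference(ref: str) -> str:
    """Turn a numeric character reference such as '&#1072;' into its character."""
    # Assumption: an invisible zero-width character may sit between '&' and '#'.
    for zero_width in ("\u200b", "\u200c"):
        ref = ref.replace(zero_width, "")
    return html.unescape(ref)


def describe(ref: str) -> str:
    """Return the code point and Unicode name behind a single reference."""
    ch = decode_reference(ref)
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"
```

For example, `describe("&#1050;")` reports the Cyrillic capital KA rather than the Latin K it resembles.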





member
Activity: 66
Merit: 62
I think I got most of them. They come from the Cyrillic, Greek, and Armenian alphabets. Info from this Wikipedia page: https://en.wikipedia.org/wiki/IDN_homograph_attack

Homograph Character -> Regular Latin Character

Uppercase

Code:
A -> A
A -> A
B -> B
B -> B
C -> C
E -> E
E -> E
Ғ -> F
G -> G
H -> H
H -> H
I -> I
I -> I
J -> J
К -> K
K -> K
Լ -> L
M -> M
M -> M
N -> N
O -> O
O -> O
O -> O
P -> P
P -> P
S -> S
S -> S
T -> T
T -> T
U -> U
X -> X
X -> X
Y -> Y
Y -> Y
Z -> Z

Lowercase

Code:
a -> a
c -> c
d -> d
e -> e
ε -> e
g -> g
h -> h
h -> h
h -> h
i -> i
ι -> i
j -> j
κ -> k
Ӏ -> l
յ -> j
n -> n
η -> n
n -> n
o -> o
o -> o
o -> o
o -> o
p -> p
ρ -> p
q -> q
զ -> q
s -> s
τ -> t
υ -> u
u -> u
u -> u
ѵ -> v
ν -> v
w -> w
ω -> w
x -> x
χ -> x
y -> y
γ -> y

Accents & Other Marks

Code:
Ӓ -> Ä
Ё -> Ë
Ї -> Ï
Ӧ -> Ö
ӓ -> ä
ё -> ë
ї -> ï
ӧ -> ö

Numbers

Code:
Ձ -> 2
շ -> 2
З -> 3
Յ -> 3
Ч -> 4
б -> 6

CJK Compatibility (not used as much b/c it doesn't look as similar, but might as well add it to the list anyway)
https://en.wikipedia.org/wiki/CJK_Compatibility

Code:
㍲ -> da
㍳ -> AU
㍴ -> bar
㍶ -> pc
㍷ -> dm
㍺ -> IU
㎅ -> KB
㎆ -> MB
㎇ -> GB
㎎ -> mg
㎏ -> kg
㎙ -> fm
㎚ -> nm
㎜ -> mm
㎝ -> cm
㎞ -> km
㎩ -> Pa
㎭ -> rad
㎰ -> ps
㎱ -> ns
㎳ -> ms
㎹ -> MV
㎿ -> MW
㏄ -> cc
㏅ -> cd
㏊ -> ha
㏌ -> in
㏐ -> lm
㏑ -> ln
㏒ -> log
㏓ -> lx
㏕ -> mil
㏖ -> mol
㏚ -> PR
㏛ -> sr
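A plaintext "X -> Y" table like the ones above is easy to load by machine. The sketch below is only one possible way to do it; `parse_table` and `apply_table` are made-up names. Rows whose two sides are identical are skipped, since they carry no information (several rows above read the same on both sides, presumably because the left-hand character was itself converted when the post was displayed).

```python
def parse_table(lines):
    """Parse 'homograph -> replacement' lines into a dict, skipping anything else."""
    mapping = {}
    for line in lines:
        left, sep, right = line.partition("->")
        left, right = left.strip(), right.strip()
        if not sep or not left or not right:
            continue  # headings such as 'Code:' and blank lines
        if left != right:
            mapping[left] = right
    return mapping


def apply_table(text, mapping):
    """Replace every homograph found in `text` with its plain counterpart."""
    for src, dst in mapping.items():
        text = text.replace(src, dst)
    return text
```

Since the replacements here can be longer than one character (for example the CJK compatibility entries), plain `str.replace` is used instead of a one-to-one character table.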
copper member
Activity: 630
Merit: 420
We are Bitcoin!
Reference: http://jrgraphix.net/r/Unicode/0400-04FF

Version one:
Ѐ = 0400
Ё = 0401
Ђ = 0402
Ѓ = 0403
Є = 0404
S = 0405
I = 0406
Ї = 0407
J = 0408
Љ = 0409
Њ = 040a
Ћ = 040b
Ќ = 040c
Ѝ = 040d
Ў = 040e
Џ = 040f
A = 0410
Б = 0411
B = 0412
Г = 0413
Д = 0414
E = 0415
Ж = 0416
З = 0417
И = 0418
Й = 0419
К = 041a
Л = 041b
M = 041c
H = 041d
O = 041e
П = 041f
P = 0420
C = 0421
T = 0422
У = 0423
Ф = 0424
X = 0425
Ц = 0426
Ч = 0427
Ш = 0428
Щ = 0429
Ъ = 042a
Ы = 042b
Ь = 042c
Э = 042d
Ю = 042e
Я = 042f
a = 0430
б = 0431
в = 0432
г = 0433
д = 0434
e = 0435
ж = 0436
з = 0437
и = 0438
й = 0439
к = 043a
л = 043b
м = 043c
н = 043d
o = 043e
п = 043f
p = 0440
c = 0441
т = 0442
y = 0443
ф = 0444
x = 0445
ц = 0446
ч = 0447
ш = 0448
щ = 0449
ъ = 044a
ы = 044b
ь = 044c
э = 044d
ю = 044e
я = 044f
ѐ = 0450
ё = 0451
ђ = 0452
ѓ = 0453
є = 0454
s = 0455
i = 0456
ї = 0457
j = 0458
љ = 0459
њ = 045a
ћ = 045b
ќ = 045c
ѝ = 045d
ў = 045e
џ = 045f
Ѡ = 0460
ѡ = 0461
Ѣ = 0462
ѣ = 0463
Ѥ = 0464
ѥ = 0465
Ѧ = 0466
ѧ = 0467
Ѩ = 0468
ѩ = 0469
Ѫ = 046a
ѫ = 046b
Ѭ = 046c
ѭ = 046d
Ѯ = 046e
ѯ = 046f
Ѱ = 0470
ѱ = 0471
Ѳ = 0472
ѳ = 0473
Ѵ = 0474
ѵ = 0475
Ѷ = 0476
ѷ = 0477
Ѹ = 0478
ѹ = 0479
Ѻ = 047a
ѻ = 047b
Ѽ = 047c
ѽ = 047d
Ѿ = 047e
ѿ = 047f
Ҁ = 0480
ҁ = 0481
҂ = 0482
҃ = 0483
҄ = 0484
҅ = 0485
҆ = 0486
҇ = 0487
҈ = 0488
҉ = 0489
Ҋ = 048a
ҋ = 048b
Ҍ = 048c
ҍ = 048d
Ҏ = 048e
ҏ = 048f
Ґ = 0490
ґ = 0491
Ғ = 0492
ғ = 0493
Ҕ = 0494
ҕ = 0495
Җ = 0496
җ = 0497
Ҙ = 0498
ҙ = 0499
Қ = 049a
қ = 049b
Ҝ = 049c
ҝ = 049d
Ҟ = 049e
ҟ = 049f
Ҡ = 04a0
ҡ = 04a1
Ң = 04a2
ң = 04a3
Ҥ = 04a4
ҥ = 04a5
Ҧ = 04a6
ҧ = 04a7
Ҩ = 04a8
ҩ = 04a9
Ҫ = 04aa
ҫ = 04ab
Ҭ = 04ac
ҭ = 04ad
Y = 04ae
ү = 04af
Ұ = 04b0
ұ = 04b1
Ҳ = 04b2
ҳ = 04b3
Ҵ = 04b4
ҵ = 04b5
Ҷ = 04b6
ҷ = 04b7
Ҹ = 04b8
ҹ = 04b9
Һ = 04ba
h = 04bb
Ҽ = 04bc
ҽ = 04bd
Ҿ = 04be
ҿ = 04bf
Ӏ = 04c0
Ӂ = 04c1
ӂ = 04c2
Ӄ = 04c3
ӄ = 04c4
Ӆ = 04c5
ӆ = 04c6
Ӈ = 04c7
ӈ = 04c8
Ӊ = 04c9
ӊ = 04ca
Ӌ = 04cb
ӌ = 04cc
Ӎ = 04cd
ӎ = 04ce
ӏ = 04cf
Ӑ = 04d0
ӑ = 04d1
Ӓ = 04d2
ӓ = 04d3
Ӕ = 04d4
ӕ = 04d5
Ӗ = 04d6
ӗ = 04d7
Ә = 04d8
ә = 04d9
Ӛ = 04da
ӛ = 04db
Ӝ = 04dc
ӝ = 04dd
Ӟ = 04de
ӟ = 04df
Ӡ = 04e0
ӡ = 04e1
Ӣ = 04e2
ӣ = 04e3
Ӥ = 04e4
ӥ = 04e5
Ӧ = 04e6
ӧ = 04e7
Ө = 04e8
ө = 04e9
Ӫ = 04ea
ӫ = 04eb
Ӭ = 04ec
ӭ = 04ed
Ӯ = 04ee
ӯ = 04ef
Ӱ = 04f0
ӱ = 04f1
Ӳ = 04f2
ӳ = 04f3
Ӵ = 04f4
ӵ = 04f5
Ӷ = 04f6
ӷ = 04f7
Ӹ = 04f8
ӹ = 04f9
Ӻ = 04fa
ӻ = 04fb
Ӽ = 04fc
ӽ = 04fd
Ӿ = 04fe
ӿ = 04ff

Without the equals sign and the newline symbol:
Code:
Ѐ 0400
Ё 0401
Ђ 0402
Ѓ 0403
Є 0404
S 0405
I 0406
Ї 0407
J 0408
Љ 0409
Њ 040a
Ћ 040b
Ќ 040c
Ѝ 040d
Ў 040e
Џ 040f
A 0410
Б 0411
B 0412
Г 0413
Д 0414
E 0415
Ж 0416
З 0417
И 0418
Й 0419
К 041a
Л 041b
M 041c
H 041d
O 041e
П 041f
P 0420
C 0421
T 0422
У 0423
Ф 0424
X 0425
Ц 0426
Ч 0427
Ш 0428
Щ 0429
Ъ 042a
Ы 042b
Ь 042c
Э 042d
Ю 042e
Я 042f
a 0430
б 0431
в 0432
г 0433
д 0434
e 0435
ж 0436
з 0437
и 0438
й 0439
к 043a
л 043b
м 043c
н 043d
o 043e
п 043f
p 0440
c 0441
т 0442
y 0443
ф 0444
x 0445
ц 0446
ч 0447
ш 0448
щ 0449
ъ 044a
ы 044b
ь 044c
э 044d
ю 044e
я 044f
ѐ 0450
ё 0451
ђ 0452
ѓ 0453
є 0454
s 0455
i 0456
ї 0457
j 0458
љ 0459
њ 045a
ћ 045b
ќ 045c
ѝ 045d
ў 045e
џ 045f
Ѡ 0460
ѡ 0461
Ѣ 0462
ѣ 0463
Ѥ 0464
ѥ 0465
Ѧ 0466
ѧ 0467
Ѩ 0468
ѩ 0469
Ѫ 046a
ѫ 046b
Ѭ 046c
ѭ 046d
Ѯ 046e
ѯ 046f
Ѱ 0470
ѱ 0471
Ѳ 0472
ѳ 0473
Ѵ 0474
ѵ 0475
Ѷ 0476
ѷ 0477
Ѹ 0478
ѹ 0479
Ѻ 047a
ѻ 047b
Ѽ 047c
ѽ 047d
Ѿ 047e
ѿ 047f
Ҁ 0480
ҁ 0481
҂ 0482
҃ 0483
҄ 0484
҅ 0485
҆ 0486
҇ 0487
҈ 0488
҉ 0489
Ҋ 048a
ҋ 048b
Ҍ 048c
ҍ 048d
Ҏ 048e
ҏ 048f
Ґ 0490
ґ 0491
Ғ 0492
ғ 0493
Ҕ 0494
ҕ 0495
Җ 0496
җ 0497
Ҙ 0498
ҙ 0499
Қ 049a
қ 049b
Ҝ 049c
ҝ 049d
Ҟ 049e
ҟ 049f
Ҡ 04a0
ҡ 04a1
Ң 04a2
ң 04a3
Ҥ 04a4
ҥ 04a5
Ҧ 04a6
ҧ 04a7
Ҩ 04a8
ҩ 04a9
Ҫ 04aa
ҫ 04ab
Ҭ 04ac
ҭ 04ad
Y 04ae
ү 04af
Ұ 04b0
ұ 04b1
Ҳ 04b2
ҳ 04b3
Ҵ 04b4
ҵ 04b5
Ҷ 04b6
ҷ 04b7
Ҹ 04b8
ҹ 04b9
Һ 04ba
h 04bb
Ҽ 04bc
ҽ 04bd
Ҿ 04be
ҿ 04bf
Ӏ 04c0
Ӂ 04c1
ӂ 04c2
Ӄ 04c3
ӄ 04c4
Ӆ 04c5
ӆ 04c6
Ӈ 04c7
ӈ 04c8
Ӊ 04c9
ӊ 04ca
Ӌ 04cb
ӌ 04cc
Ӎ 04cd
ӎ 04ce
ӏ 04cf
Ӑ 04d0
ӑ 04d1
Ӓ 04d2
ӓ 04d3
Ӕ 04d4
ӕ 04d5
Ӗ 04d6
ӗ 04d7
Ә 04d8
ә 04d9
Ӛ 04da
ӛ 04db
Ӝ 04dc
ӝ 04dd
Ӟ 04de
ӟ 04df
Ӡ 04e0
ӡ 04e1
Ӣ 04e2
ӣ 04e3
Ӥ 04e4
ӥ 04e5
Ӧ 04e6
ӧ 04e7
Ө 04e8
ө 04e9
Ӫ 04ea
ӫ 04eb
Ӭ 04ec
ӭ 04ed
Ӯ 04ee
ӯ 04ef
Ӱ 04f0
ӱ 04f1
Ӳ 04f2
ӳ 04f3
Ӵ 04f4
ӵ 04f5
Ӷ 04f6
ӷ 04f7
Ӹ 04f8
ӹ 04f9
Ӻ 04fa
ӻ 04fb
Ӽ 04fc
ӽ 04fd
Ӿ 04fe
ӿ 04ff


Version two:

Ѐ = 0400
Ё = 0401
Ђ = 0402
Ѓ = 0403
Є = 0404
Ѕ = 0405
І = 0406
Ї = 0407
Ј = 0408
Љ = 0409
Њ = 040a
Ћ = 040b
Ќ = 040c
Ѝ = 040d
Ў = 040e
Џ = 040f
А = 0410
Б = 0411
В = 0412
Г = 0413
Д = 0414
Е = 0415
Ж = 0416
З = 0417
И = 0418
Й = 0419
К = 041a
Л = 041b
М = 041c
Н = 041d
О = 041e
П = 041f
Р = 0420
С = 0421
Т = 0422
У = 0423
Ф = 0424
Х = 0425
Ц = 0426
Ч = 0427
Ш = 0428
Щ = 0429
Ъ = 042a
Ы = 042b
Ь = 042c
Э = 042d
Ю = 042e
Я = 042f
а = 0430
б = 0431
в = 0432
г = 0433
д = 0434
е = 0435
ж = 0436
з = 0437
и = 0438
й = 0439
к = 043a
л = 043b
м = 043c
н = 043d
о = 043e
п = 043f
р = 0440
с = 0441
т = 0442
у = 0443
ф = 0444
х = 0445
ц = 0446
ч = 0447
ш = 0448
щ = 0449
ъ = 044a
ы = 044b
ь = 044c
э = 044d
ю = 044e
я = 044f
ѐ = 0450
ё = 0451
ђ = 0452
ѓ = 0453
є = 0454
ѕ = 0455
і = 0456
ї = 0457
ј = 0458
љ = 0459
њ = 045a
ћ = 045b
ќ = 045c
ѝ = 045d
ў = 045e
џ = 045f
Ѡ = 0460
ѡ = 0461
Ѣ = 0462
ѣ = 0463
Ѥ = 0464
ѥ = 0465
Ѧ = 0466
ѧ = 0467
Ѩ = 0468
ѩ = 0469
Ѫ = 046a
ѫ = 046b
Ѭ = 046c
ѭ = 046d
Ѯ = 046e
ѯ = 046f
Ѱ = 0470
ѱ = 0471
Ѳ = 0472
ѳ = 0473
Ѵ = 0474
ѵ = 0475
Ѷ = 0476
ѷ = 0477
Ѹ = 0478
ѹ = 0479
Ѻ = 047a
ѻ = 047b
Ѽ = 047c
ѽ = 047d
Ѿ = 047e
ѿ = 047f
Ҁ = 0480
ҁ = 0481
҂ = 0482
҃ = 0483
҄ = 0484
҅ = 0485
҆ = 0486
҇ = 0487
҈ = 0488
҉ = 0489
Ҋ = 048a
ҋ = 048b
Ҍ = 048c
ҍ = 048d
Ҏ = 048e
ҏ = 048f
Ґ = 0490
ґ = 0491
Ғ = 0492
ғ = 0493
Ҕ = 0494
ҕ = 0495
Җ = 0496
җ = 0497
Ҙ = 0498
ҙ = 0499
Қ = 049a
қ = 049b
Ҝ = 049c
ҝ = 049d
Ҟ = 049e
ҟ = 049f
Ҡ = 04a0
ҡ = 04a1
Ң = 04a2
ң = 04a3
Ҥ = 04a4
ҥ = 04a5
Ҧ = 04a6
ҧ = 04a7
Ҩ = 04a8
ҩ = 04a9
Ҫ = 04aa
ҫ = 04ab
Ҭ = 04ac
ҭ = 04ad
Ү = 04ae
ү = 04af
Ұ = 04b0
ұ = 04b1
Ҳ = 04b2
ҳ = 04b3
Ҵ = 04b4
ҵ = 04b5
Ҷ = 04b6
ҷ = 04b7
Ҹ = 04b8
ҹ = 04b9
Һ = 04ba
һ = 04bb
Ҽ = 04bc
ҽ = 04bd
Ҿ = 04be
ҿ = 04bf
Ӏ = 04c0
Ӂ = 04c1
ӂ = 04c2
Ӄ = 04c3
ӄ = 04c4
Ӆ = 04c5
ӆ = 04c6
Ӈ = 04c7
ӈ = 04c8
Ӊ = 04c9
ӊ = 04ca
Ӌ = 04cb
ӌ = 04cc
Ӎ = 04cd
ӎ = 04ce
ӏ = 04cf
Ӑ = 04d0
ӑ = 04d1
Ӓ = 04d2
ӓ = 04d3
Ӕ = 04d4
ӕ = 04d5
Ӗ = 04d6
ӗ = 04d7
Ә = 04d8
ә = 04d9
Ӛ = 04da
ӛ = 04db
Ӝ = 04dc
ӝ = 04dd
Ӟ = 04de
ӟ = 04df
Ӡ = 04e0
ӡ = 04e1
Ӣ = 04e2
ӣ = 04e3
Ӥ = 04e4
ӥ = 04e5
Ӧ = 04e6
ӧ = 04e7
Ө = 04e8
ө = 04e9
Ӫ = 04ea
ӫ = 04eb
Ӭ = 04ec
ӭ = 04ed
Ӯ = 04ee
ӯ = 04ef
Ӱ = 04f0
ӱ = 04f1
Ӳ = 04f2
ӳ = 04f3
Ӵ = 04f4
ӵ = 04f5
Ӷ = 04f6
ӷ = 04f7
Ӹ = 04f8
ӹ = 04f9
Ӻ = 04fa
ӻ = 04fb
Ӽ = 04fc
ӽ = 04fd
Ӿ = 04fe
ӿ = 04ff


Without the equals sign and the newline symbol:
Code:
Ѐ 0400
Ё 0401
Ђ 0402
Ѓ 0403
Є 0404
Ѕ 0405
І 0406
Ї 0407
Ј 0408
Љ 0409
Њ 040a
Ћ 040b
Ќ 040c
Ѝ 040d
Ў 040e
Џ 040f
А 0410
Б 0411
В 0412
Г 0413
Д 0414
Е 0415
Ж 0416
З 0417
И 0418
Й 0419
К 041a
Л 041b
М 041c
Н 041d
О 041e
П 041f
Р 0420
С 0421
Т 0422
У 0423
Ф 0424
Х 0425
Ц 0426
Ч 0427
Ш 0428
Щ 0429
Ъ 042a
Ы 042b
Ь 042c
Э 042d
Ю 042e
Я 042f
а 0430
б 0431
в 0432
г 0433
д 0434
е 0435
ж 0436
з 0437
и 0438
й 0439
к 043a
л 043b
м 043c
н 043d
о 043e
п 043f
р 0440
с 0441
т 0442
у 0443
ф 0444
х 0445
ц 0446
ч 0447
ш 0448
щ 0449
ъ 044a
ы 044b
ь 044c
э 044d
ю 044e
я 044f
ѐ 0450
ё 0451
ђ 0452
ѓ 0453
є 0454
ѕ 0455
і 0456
ї 0457
ј 0458
љ 0459
њ 045a
ћ 045b
ќ 045c
ѝ 045d
ў 045e
џ 045f
Ѡ 0460
ѡ 0461
Ѣ 0462
ѣ 0463
Ѥ 0464
ѥ 0465
Ѧ 0466
ѧ 0467
Ѩ 0468
ѩ 0469
Ѫ 046a
ѫ 046b
Ѭ 046c
ѭ 046d
Ѯ 046e
ѯ 046f
Ѱ 0470
ѱ 0471
Ѳ 0472
ѳ 0473
Ѵ 0474
ѵ 0475
Ѷ 0476
ѷ 0477
Ѹ 0478
ѹ 0479
Ѻ 047a
ѻ 047b
Ѽ 047c
ѽ 047d
Ѿ 047e
ѿ 047f
Ҁ 0480
ҁ 0481
҂ 0482
҃ 0483
҄ 0484
҅ 0485
҆ 0486
҇ 0487
҈ 0488
҉ 0489
Ҋ 048a
ҋ 048b
Ҍ 048c
ҍ 048d
Ҏ 048e
ҏ 048f
Ґ 0490
ґ 0491
Ғ 0492
ғ 0493
Ҕ 0494
ҕ 0495
Җ 0496
җ 0497
Ҙ 0498
ҙ 0499
Қ 049a
қ 049b
Ҝ 049c
ҝ 049d
Ҟ 049e
ҟ 049f
Ҡ 04a0
ҡ 04a1
Ң 04a2
ң 04a3
Ҥ 04a4
ҥ 04a5
Ҧ 04a6
ҧ 04a7
Ҩ 04a8
ҩ 04a9
Ҫ 04aa
ҫ 04ab
Ҭ 04ac
ҭ 04ad
Ү 04ae
ү 04af
Ұ 04b0
ұ 04b1
Ҳ 04b2
ҳ 04b3
Ҵ 04b4
ҵ 04b5
Ҷ 04b6
ҷ 04b7
Ҹ 04b8
ҹ 04b9
Һ 04ba
һ 04bb
Ҽ 04bc
ҽ 04bd
Ҿ 04be
ҿ 04bf
Ӏ 04c0
Ӂ 04c1
ӂ 04c2
Ӄ 04c3
ӄ 04c4
Ӆ 04c5
ӆ 04c6
Ӈ 04c7
ӈ 04c8
Ӊ 04c9
ӊ 04ca
Ӌ 04cb
ӌ 04cc
Ӎ 04cd
ӎ 04ce
ӏ 04cf
Ӑ 04d0
ӑ 04d1
Ӓ 04d2
ӓ 04d3
Ӕ 04d4
ӕ 04d5
Ӗ 04d6
ӗ 04d7
Ә 04d8
ә 04d9
Ӛ 04da
ӛ 04db
Ӝ 04dc
ӝ 04dd
Ӟ 04de
ӟ 04df
Ӡ 04e0
ӡ 04e1
Ӣ 04e2
ӣ 04e3
Ӥ 04e4
ӥ 04e5
Ӧ 04e6
ӧ 04e7
Ө 04e8
ө 04e9
Ӫ 04ea
ӫ 04eb
Ӭ 04ec
ӭ 04ed
Ӯ 04ee
ӯ 04ef
Ӱ 04f0
ӱ 04f1
Ӳ 04f2
ӳ 04f3
Ӵ 04f4
ӵ 04f5
Ӷ 04f6
ӷ 04f7
Ӹ 04f8
ӹ 04f9
Ӻ 04fa
ӻ 04fb
Ӽ 04fc
ӽ 04fd
Ӿ 04fe
ӿ 04ff


Let me know how much it helps.
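For anyone who wants to regenerate or extend these lists, a table of this shape can be produced directly from the code points instead of being copied from a chart. A minimal sketch, with `cyrillic_block_table` as an illustrative name and the range 0400-04FF taken from the reference above:

```python
def cyrillic_block_table(start=0x0400, end=0x04FF, sep=" = "):
    """Return one 'character<sep>hex code point' line per code point in the range."""
    return [f"{chr(cp)}{sep}{cp:04x}" for cp in range(start, end + 1)]


if __name__ == "__main__":
    # Print the table in the 'X = 0400' layout; pass sep=" " for the plain layout.
    print("\n".join(cyrillic_block_table()))
```

Generating the characters from their code points also avoids the risk of a look-alike being silently swapped for its Latin twin when the list is copied from a rendered page.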
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
See here :

http://sites.psu.edu/symbolcodes/languages/europe/cyrillic/cyrillicchart/

I guess there are other alphabets which can be used too, but Cyrillic is what is mainly used in the homograph attacks here.

There are some more resources and info in the quoted thread :

Does someone have a table of these characters? I can automatically convert non-standard characters to ASCII.

This is the one I use:
http://sites.psu.edu/symbolcodes/languages/europe/cyrillic/cyrillicchart/
~
There are more characters (see the link I posted), not only the main Cyrillic ones, like:
CYRILLIC CAPITAL LETTER DZE   S   &‌#1029;   &‌#x0405;


copper member
Activity: 630
Merit: 420
We are Bitcoin!
Does anyone have the time to do it?
I'm on mobile and it's awful.
Give me some resources to start with (URLs, keywords, or similar). I have time to make the list if it does not take too much time, like a day or two.
legendary
Activity: 2240
Merit: 3150
₿uy / $ell ..oeleo ;(
The problem with the homographs is finally solved.

Done. I only did the ones that look really similar to Latin characters, and it only applies to English sections. It's done at display time, so it's retroactive.

I've tested it and it's better than ever. See my conclusion here > https://bitcointalksearch.org/topic/m.44859677
It won't affect the legit Cyrillic posts outside the local section, with the only exception that if you copy/quote the text from a Cyrillic post the changed/fixed letters will remain in Latin. See an example in the conclusion.


So the homographs are back and are more active than before.
In the last 24 hours alone I got 82 cases of homograph attacks.

Homographs from the last 24 hours:

Had a look through the whitepaper and have to say it looks great. It explained a few things around masternodes as well.
~
... and many more, up to 82 cases

What we need to do is to make one simple list of all the characters and theymos will fix it.

~
BTW, the main blocker for me taking action was that I never got around to compiling the table of homographic characters and their ASCII counterparts. If this crops up again, it'd be helpful if someone would compile a nice plaintext " -> " table.

Does anyone have the time to do it?
I'm on mobile and it's awful.



[UPDATE] We have some finished lists below; now we just need to decide which one to use.

I kind of liked the homographs: they make plagiarism pretty easy to spot. Maybe it would be a good idea to just color them red or put a dot after each homograph so we can see them easily.
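The highlighting alternative could be sketched like this, as a display-side helper rather than anything the forum actually does. The function names and the inline style are assumptions, and only the basic Cyrillic block is marked.

```python
import html
import re

CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def mark_with_dot(text: str) -> str:
    """Append a middle dot after each Cyrillic character."""
    return CYRILLIC.sub(lambda m: m.group(0) + "\u00b7", text)


def mark_in_red(text: str) -> str:
    """Escape the text for HTML, then wrap each Cyrillic character in a red span."""
    escaped = html.escape(text)
    return CYRILLIC.sub(
        lambda m: f'<span style="color:red">{m.group(0)}</span>', escaped
    )
```

Unlike replacement, this keeps the original characters visible, so a copied post would still contain the homographs.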