The OP is very similar to something I once proposed:
If the real Satoshi has published a lot of text under his real name, statistical analysis of his writing style should make it possible to figure out who he most likely is.
I'm surprised nobody has attempted this. A general solution to this problem would have some pretty cool uses in computer forensics.
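To make the idea concrete, here is a minimal sketch of one common stylometric approach: build a profile of function-word frequencies for each text and rank candidate authors by cosine similarity to the anonymous text. The word list, the function names, and the choice of cosine similarity are all assumptions made for illustration. This is not how Signature, JGAAP, or any published study works, and a real analysis would need far longer texts and many more features.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative list of English function words.
# Real studies use lists of a hundred or more.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is",
                  "it", "for", "with", "as", "but", "on", "not"]

def profile(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(anonymous_text, candidates):
    """Return (name, similarity) pairs, most similar candidate first.

    `candidates` maps a hypothetical author name to a sample of that
    author's known writing.
    """
    target = profile(anonymous_text)
    scores = [(name, cosine(target, profile(text)))
              for name, text in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A high score only says that two texts use common words at similar rates. On its own it is not proof of shared authorship.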
I first suggested this be done several months ago, and twice more after that. The approach has already been used successfully once, to link an old paper to William Shakespeare.
~Bruno~
Won't work if "Satoshi Nakamoto" is another Nicolas Bourbaki - if your training texts are written by many people, they won't be much use to an algorithm that assumes a single author.
There's an article that said its authors, after going through everything Satoshi wrote, were surprised to find so few grammatical errors. Two people with the same writing style and avoiding errors--maybe. Three or more--I lean toward no.
The following is what I was alluding to:
http://en.wikipedia.org/wiki/Shakespeare_authorship_question#Evidence_for_Shakespeare.27s_authorship_from_his_works
Beginning in 1987, Ward Elliott, who was sympathetic to the Oxfordian theory, and Robert J. Valenza supervised a continuing stylometric study that used computer programs to compare Shakespeare's stylistic habits to the works of 37 authors who had been proposed as the true author. The study, known as the Claremont Shakespeare Clinic, was last held in the spring of 2010. The tests determined that Shakespeare's work shows consistent, countable, profile-fitting patterns, suggesting that he was a single individual, not a committee, and that he used fewer relative clauses and more hyphens, feminine endings, and run-on lines than most of the writers with whom he was compared. The result determined that none of the other tested claimants' work could have been written by Shakespeare, nor could Shakespeare have been written by them, eliminating all of the claimants whose known works have survived—including Oxford, Bacon, and Marlowe—as the true authors of the Shakespeare canon.
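As a rough illustration of what "countable" patterns can look like in practice, the sketch below computes two simple rates per 1,000 words: hyphenated words and relative-clause markers. The function name, the marker list, and the per-1,000 scaling are assumptions made for this example, not the Claremont clinic's actual tests. Counting words such as "that" and "who" is only a crude proxy for relative clauses, because those words have other uses too.

```python
import re

# Assumption: a crude proxy list; these words do not always introduce relative clauses.
RELATIVE_MARKERS = {"which", "who", "whom", "whose", "that"}

def style_rates(text):
    """Return simple countable style features as rates per 1,000 words."""
    # A word is a run of letters, optionally joined by hyphens or apostrophes.
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    n = len(words) or 1  # avoid division by zero on empty text
    hyphenated = sum(1 for w in words if "-" in w)
    relative = sum(1 for w in words if w.lower() in RELATIVE_MARKERS)
    return {
        "hyphenated_per_1000": 1000 * hyphenated / n,
        "relative_markers_per_1000": 1000 * relative / n,
    }
```

Comparing these rates across a candidate's known writing and the disputed text gives a profile. Whether the differences mean anything depends on having enough text from each author.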
I believe the following is what I had in mind (or something similar):
http://www.philocomp.net/humanities/signature
Welcome to the home page of Signature, a program designed to facilitate "stylometric" analysis and comparison of texts, with a particular emphasis on author identification.
Signature used to investigate claims that Obama's book was written by an ex-terrorist.
Signature used to support Coleridge's authorship of an anonymous 1821 translation of Goethe's Faustus.
Signature used to test authorship of famous cyphers.
http://en.wikipedia.org/wiki/Stylometry
Stylometry is the application of the study of linguistic style, usually to written language, but it has successfully been applied to music and to fine-art paintings as well.
Stylometry is often used to attribute authorship to anonymous or disputed documents. It has legal as well as academic and literary applications, ranging from the question of the authorship of Shakespeare's works to forensic linguistics.
...
Modern stylometry draws heavily on the aid of computers for statistical analysis, artificial intelligence and access to the growing corpus of texts available via the Internet. Software systems such as Signature (freeware produced by Dr Peter Millican of Oxford University) and JGAAP (the Java Graphical Authorship Attribution Program—freeware produced by Dr Patrick Juola of Duquesne University) make its use increasingly practicable, even for the non-expert.
Do you believe Signature or JGAAP can find Satoshi Nakamoto?