An algorithm for TC Mendenhall's 1901 stylometry analysis using word length.

crispin-cas9/word-length-authorship

Stylometry analysis using word length

Based on a technique by TC Mendenhall.

Disclaimer: This algorithm is not an accurate method of determining authorship, since its output varies greatly between verse and prose. It is still pretty interesting to compare different authors in this way, and the method does have some merits.
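As a rough illustration of the idea, the sketch below computes a word-length frequency curve for a text and a simple distance between two such curves. It is not the code in this repository: the function names `word_length_distribution` and `curve_distance`, the tokenizing regex, the `max_length` cutoff, and the choice of summed absolute difference as the comparison are all assumptions made for the example. It uses only the Python standard library.

```python
import re
from collections import Counter


def word_length_distribution(text, max_length=15):
    """Return the relative frequency of each word length from 1 to max_length.

    Hypothetical helper for illustration; words longer than max_length
    are counted in the max_length bin.
    """
    # Treat runs of letters, optionally joined by apostrophes, as words.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    # Count letters only, so a contraction such as "o'er" has length 3.
    lengths = [sum(ch.isalpha() for ch in word) for word in words]
    counts = Counter(min(n, max_length) for n in lengths)
    total = len(lengths)
    if total == 0:
        return {n: 0.0 for n in range(1, max_length + 1)}
    return {n: counts.get(n, 0) / total for n in range(1, max_length + 1)}


def curve_distance(dist_a, dist_b):
    """Sum of absolute differences between two curves (0.0 means identical).

    Assumes both curves were built with the same max_length.
    """
    return sum(abs(dist_a[n] - dist_b[n]) for n in dist_a)


if __name__ == "__main__":
    sample_a = "To be, or not to be, that is the question"
    sample_b = "Whether 'tis nobler in the mind to suffer"
    curve_a = word_length_distribution(sample_a)
    curve_b = word_length_distribution(sample_b)
    print(curve_distance(curve_a, curve_b))
```

With full-length texts from the data sources, the per-length frequencies could be plotted as one line per author to compare the curves visually. Samples as short as the ones above are far too small to say anything about authorship.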

Required libraries: numpy, pandas, matplotlib, seaborn

Data sources

http://cs.stanford.edu/people/karpathy/char-rnn/shakespeare_input.txt

http://shakespeare.mit.edu/

http://www.gutenberg.org/ebooks/author/410

http://www.elizabethanauthors.org/king-leir-1605-1-16.htm

http://extra.shu.ac.uk/emls/iemls/renplays/miseries.htm

http://www.gutenberg.org/files/1962/1962-h/1962-h.htm

http://www.gutenberg.org/ebooks/author/296

https://en.wikipedia.org/wiki/Edward_de_Vere,_17th_Earl_of_Oxford#Literary_reputation
