It's not completely random. I've just tried pasting in passages by authors mentioned in this thread to see whether it can identify them, and it's not doing badly.
I gave it a slice of Finnegans Wake, and it told me it sounded like James Joyce. It would be a pretty bad algorithm if it couldn't identify James Joyce.
Then I got it to correctly identify passages from Dan Brown and Mario Puzo. Quite impressive.
One possibility is that it's just matching word frequencies. To check this, I tried a few strings of my own devising:
"mafia mafia don don mafia mafia" --> Mario Puzo
"vatican conspiracy vatican conspiracy vatican conspiracy" --> Dan Brown
"oh woe is me, life sucks, everything is crap" --> Chuck Palahniuk