Identifying authors through a computer program.

Ask questions about projects relating to computer science or pure mathematics (such as probability, statistics, geometry, etc.).

Moderators: AmyCowen, kgudger, bfinio, MadelineB, Moderators

whitesa
Posts: 2
Joined: Mon Nov 08, 2010 5:49 pm
Occupation: Student 7th grade
Project Question: What are the similarities between binary, decimal, and hexadecimal notation?
Project Due Date: January 2011
Project Status: I am just starting

Post by whitesa »

I am an eighth grader working on background research for a project idea: building a computer app that identifies authors by their writing. I was wondering if you had any recommendations for sources.
hhemken
Former Expert
Posts: 266
Joined: Mon Oct 03, 2005 3:16 pm

Re: Identifying authors through a computer program.

Post by hhemken »

whitesa,

Have you looked at the Science Buddies version?
https://www.sciencebuddies.org/science- ... p022.shtml

The trick is to count various features of the text, as described there, and use those counts to distinguish between authors. For more ideas, try a web search for something like this:

text analysis identifying authors

For a huge sample of texts, try Project Gutenberg:
http://www.gutenberg.org/
I recommend using the plain-text versions. You will have to cut out the extraneous material at the beginning and end of each text before analyzing it.
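
One possible way to trim that extra material is sketched below in Python. It assumes the file contains marker lines beginning with "*** START OF THE PROJECT GUTENBERG EBOOK" and "*** END OF THE PROJECT GUTENBERG EBOOK" (or the "THIS" variant); the exact wording varies between files, so check a few downloads by hand. The function name strip_gutenberg_boilerplate is made up for this example.

```python
import re

# Assumed marker lines; the exact wording differs between files,
# so these patterns may need adjusting for a particular download.
START_RE = re.compile(
    r"^\*\*\*\s*START OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
END_RE = re.compile(
    r"^\*\*\*\s*END OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(text):
    """Return the text between the start and end markers.

    If a marker is missing, nothing is cut on that side.
    """
    start = START_RE.search(text)
    begin = start.end() if start else 0
    # Look for the end marker only after the start marker.
    end = END_RE.search(text, begin)
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()
```

To use it, read a downloaded file into a string (for example with open(path, encoding="utf-8").read()) and pass that string to the function.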

If you can run your program against many large texts, you may also be able to classify them by rough date of publication, author gender, and who knows what else.
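
To make the counting idea concrete, here is a minimal Python sketch. The particular features (average word length, average sentence length, and how often a few common words appear) and the simple distance measure are my own assumptions for illustration, not necessarily what the Science Buddies project uses. All function names are hypothetical.

```python
import re
from collections import Counter

# A small, arbitrary set of common words chosen for this example.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "it", "but", "with"]


def style_features(text):
    """Compute a few simple counts that describe a writing style."""
    words = re.findall(r"[a-z']+", text.lower())
    # Treat ., ! and ? as sentence endings (a rough approximation).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text needs at least one word and one sentence")
    counts = Counter(words)
    features = {
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "avg_sentence_length": len(words) / len(sentences),
    }
    for fw in FUNCTION_WORDS:
        features["freq_" + fw] = counts[fw] / len(words)
    return features


def style_distance(a, b):
    """Sum of absolute differences between two feature dictionaries."""
    return sum(abs(a[key] - b[key]) for key in a)


def closest_author(unknown_text, known_samples):
    """Return the author whose sample is most similar to unknown_text.

    known_samples maps an author name to a sample of that author's writing.
    """
    unknown = style_features(unknown_text)
    return min(
        known_samples,
        key=lambda author: style_distance(unknown, style_features(known_samples[author])),
    )
```

Note that the features are on very different scales here, so sentence length will tend to dominate the distance. Deciding how to weight or rescale the counts, and which counts actually help, would be a good part of the experiment.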

Good luck!
Heinz Hemken
Mentor
Science Buddies Expert Forum