Computer Sleuth: Questions about Project

Ask questions about projects relating to: computer science or pure mathematics (such as probability, statistics, geometry, etc...).

Moderators: AmyCowen, kgudger, bfinio, MadelineB, Moderators

[Div]
Posts: 1
Joined: Wed Sep 02, 2015 7:23 pm
Occupation: Student, 8th Grade
Project Question: http://www.sciencebuddies.org/science-f ... shtml#help

Looking to get information and research for this topic.
Project Due Date: 26th of October
Project Status: Not applicable

Computer Sleuth: Questions about Project

Post by [Div] »

https://www.sciencebuddies.org/science- ... shtml#help

Hi, I'm planning to work on the project above, but I'm wondering how I should do my research and what I need to account for when writing my code so that I can collect data and later compare it to other works. I was thinking of finding the average word length, the number of 4-6 letter words, and the number of words longer than 7 or 8 letters, but I don't know if I should also have a table of common words or count the number of sentences and their lengths. I'm also wondering what else I could measure, and whether there is a reference showing what other people have used to make identification easier.

Thanks, would mean a lot if I can get some help on this and then start right away.
HowardE
Posts: 496
Joined: Thu Nov 20, 2014 1:35 pm
Occupation: Science Buddies content developer
Project Question: N/A
Project Due Date: N/A
Project Status: Not applicable

Re: Computer Sleuth: Questions about Project

Post by HowardE »

You might want to do some web searching on stylometry and read some of the research papers on how the specific algorithms work. What you're suggesting, though, is a great start. Keeping counts of the number of words of each length, how often words are repeated, and how many appear in a common word list: those are all really good ideas.
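To make those counts concrete, here is a minimal Python sketch of how they might be collected from one text. The function name `style_features`, the tiny `COMMON_WORDS` set, and the 4-6 and 7+ letter cutoffs are illustrative assumptions taken from the ideas in this thread, not a standard stylometry recipe.

```python
import re
from collections import Counter

# Assumption: a tiny made-up list for illustration only.
# A real project would load a much longer common-word list.
COMMON_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that"}


def style_features(text, common_words=COMMON_WORDS):
    """Return a dictionary of simple style measurements for one text."""
    # Words are runs of letters, so a contraction like "don't"
    # is split into "don" and "t" by this simple pattern.
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        raise ValueError("text contains no words")

    # Rough sentence split on ., ! and ? (abbreviations will fool it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    length_counts = Counter(len(w) for w in words)  # word length -> count
    word_counts = Counter(words)                    # word -> count

    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "length_counts": dict(length_counts),
        # Words with 4 to 6 letters.
        "mid_length_words": sum(n for k, n in length_counts.items() if 4 <= k <= 6),
        # Words with 7 or more letters.
        "long_words": sum(n for k, n in length_counts.items() if k >= 7),
        # Distinct words divided by total words; lower means more repetition.
        "distinct_ratio": len(word_counts) / len(words),
        # Share of words that appear in the common-word list.
        "common_word_ratio": sum(1 for w in words if w in common_words) / len(words),
        "avg_sentence_length": len(words) / len(sentences),
    }


if __name__ == "__main__":
    sample = "The cat sat. The dog ran away quickly!"
    for name, value in style_features(sample).items():
        print(name, value)
```

Ratios are used alongside the raw counts so that a short text and a long one can still be compared.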

You can also take advantage of the many classic books available through Project Gutenberg (https://www.gutenberg.org/). If your methods work, you should be able to compare a couple of Shakespeare works and see similarities, but see differences in style when you compare them to Jules Verne.
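One simple way to put a number on "similar" versus "different" is to compare the word-length profiles of two texts. The sketch below is only one possible measure: the names `length_profile` and `profile_distance` and the cap of 12 letters are assumptions made for illustration. It takes plain strings, so it assumes the books have already been saved locally as text.

```python
import re
from collections import Counter


def length_profile(text, max_len=12):
    """Fraction of words at each length 1..max_len.

    Words longer than max_len are counted in the last bin.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        raise ValueError("text contains no words")
    counts = Counter(min(len(w), max_len) for w in words)
    return [counts[n] / len(words) for n in range(1, max_len + 1)]


def profile_distance(text_a, text_b, max_len=12):
    """Half the summed absolute difference between two profiles.

    0.0 means the word-length profiles are identical;
    1.0 means they share no word lengths at all.
    """
    a = length_profile(text_a, max_len)
    b = length_profile(text_b, max_len)
    return sum(abs(x - y) for x, y in zip(a, b)) / 2


if __name__ == "__main__":
    # Short made-up strings stand in for whole books here.
    print(profile_distance("a bb ccc", "x yy zzz"))
    print(profile_distance("a a a", "bb bb bb"))
```

If the method works as hoped, two works by the same author would give a smaller distance than works by different authors, but that is something to test with real texts rather than assume.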

This is a really fun area of study. I worked on a commercial product some years back that was intended to help people in a company learn that company's specific style when they write documents. I think you'll enjoy the project. Please write back if you have questions or just to tell us about your progress.

Howard