I assume you're talking about [[this project]].
Some good starting hypotheses might include ones about the frequencies of certain letters or words. Would you expect 'e' or 'o' to show up the most often? Is 'and' used more commonly than 'the'? You could even link this to a more complicated idea if time allows. For example, I've often heard it said that 'ivy' is the hardest Hangman word, and definitely harder than 'and' or 'the'. Based on the relative frequencies of those letters, does that seem to be true?
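To give you a feel for how you might test the letter-frequency hypotheses, here's a minimal sketch of a letter counter. The class name `LetterFrequency`, the method `countLetters`, and the sample sentence are all just made up for illustration, and it only handles the plain letters a to z, so treat it as a starting point rather than a finished program.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class LetterFrequency {
    // Counts how often each letter a-z appears, ignoring case.
    // Digits, spaces, and punctuation are skipped.
    static Map<Character, Integer> countLetters(String text) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : text.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts.merge(c, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; a real experiment would read a whole book or article.
        String sample = "The quick brown fox jumps over the lazy dog and the cat";
        Map<Character, Integer> counts = countLetters(sample);
        System.out.println("e: " + counts.getOrDefault('e', 0));
        System.out.println("o: " + counts.getOrDefault('o', 0));
    }
}
```

With counts like these from a large enough text, you could compare the letters in 'ivy' against the letters in 'and' or 'the' and see whether the rarer letters line up with the "hardest Hangman word" claim.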
Google searches for "text analysis with Java" and "word count with Java" should turn up some interesting examples and tutorials. If you decide to do this project, I'd also recommend looking at the [[Javadoc]], as it's enormously helpful.
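For the word questions (like 'and' versus 'the'), a word count might look roughly like the sketch below. Again, `WordCount` and `countWords` are names I've invented, and splitting on anything that isn't a letter is a simplifying assumption: it will break contractions like "don't" into two pieces.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class WordCount {
    // Counts how often each word appears, ignoring case.
    // A "word" here is any run of the letters a-z (a simplifying assumption).
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; swap in the text you actually want to analyze.
        Map<String, Integer> counts = countWords("The cat and the dog. THE END");
        System.out.println("the: " + counts.getOrDefault("the", 0));
        System.out.println("and: " + counts.getOrDefault("and", 0));
    }
}
```

Both `Map.merge` and `Map.getOrDefault` are described in the Javadoc for `java.util.Map`, which is a good place to get used to reading it.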
I hope this helped, and if you have more questions, feel free to ask!