
Paragraph Stats: Writing a JavaScript Program to 'Measure' Text

Difficulty
Time Required: Short (2-5 days)
Prerequisites: An understanding of the material covered in "ABC's of Programming: Writing a Simple "Alphabetizer" with JavaScript"
Material Availability: Readily available
Cost: Very Low (under $20)
Safety: No issues

Abstract

This is a more challenging first-time programming project. You'll learn how to use JavaScript to create a simple program to analyze one or more paragraphs of text. Your program will count sentences, words and letters, and report the resulting statistics. You'll be able to run your program in your Web browser.

Objective

The objective of this project is to write a JavaScript program to make some basic measurements on a block of text:

  • number of sentences contained in the text,
  • number of words in each sentence,
  • number of letters in each word,
  • average number of words per sentence, and
  • average word length.

Credits

Andrew Olson, Ph.D., Science Buddies

Cite This Page

MLA Style

Science Buddies Staff. "Paragraph Stats: Writing a JavaScript Program to 'Measure' Text" Science Buddies. Science Buddies, 18 Mar. 2014. Web. 25 July 2014 <http://www.sciencebuddies.org/science-fair-projects/project_ideas/CompSci_p003.shtml>

APA Style

Science Buddies Staff. (2014, March 18). Paragraph Stats: Writing a JavaScript Program to 'Measure' Text. Retrieved July 25, 2014 from http://www.sciencebuddies.org/science-fair-projects/project_ideas/CompSci_p003.shtml

Last edit date: 2014-03-18

Introduction

This is an example of a more advanced first-time programming project. You will be writing a simple program to gather statistics on a block of text. The project uses JavaScript, an interpreted programming language supported by most Web browsers.

Prerequisites

It is assumed that you know how to use a text editor to write an HTML file containing your program, and that you know how to load and run the file in your browser.

You will also need to know the basics of how JavaScript functions can be written to work with HTML FORM elements. Specifically, you will be working with HTML TEXTAREA and INPUT elements, and using JavaScript String and Array objects in your functions.

If you need help with any of this, read the Introduction section in: "ABC's of Programming: Writing a Simple "Alphabetizer" with JavaScript."

New Material

In this project, you will learn some important methods of program control logic, which you will use again and again as you write more programs. These methods are used in just about every programming language (though the actual syntax may vary slightly from language to language). Specifically, you will learn about the JavaScript "for" and "while" loop control statements, and "if...else" conditional statements. You will also learn about 2-dimensional arrays (lists of lists).

Writing a JavaScript Program to "Measure" a Block of Text

The goal is to write a program that can take a block of text and calculate:

  1. the number of sentences contained in the text,
  2. the number of words in each sentence,
  3. the number of letters in each word,
  4. the average number of words per sentence, and
  5. the average word length.

To tackle this, as with any programming project, a good way to start is to break it down into manageable sub-tasks. So for starters, let's try counting the number of letters in each word.

Step One: Loops for Counting Letters in Each Word

We'll assume that we've retrieved the text from the form, split it into the separate words, and stored the words in an array. To make it specific, let's say that the input text was: "the quick brown fox jumped over the lazy dog", and our array is called "textWords". Our array, then, looks like this:
textWords[0]=the
textWords[1]=quick
textWords[2]=brown
textWords[3]=fox
textWords[4]=jumped
textWords[5]=over
textWords[6]=the
textWords[7]=lazy
textWords[8]=dog
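As a reminder of where such an array might come from, here is a minimal sketch that builds textWords with the String object's .split() method. It assumes the words are separated by single spaces and contain no punctuation; the variable name inputText is made up for this illustration, and the console.log() calls are only there to display results in the browser's console.

```javascript
// Assumed input: words separated by single spaces, no punctuation.
var inputText = "the quick brown fox jumped over the lazy dog";

// .split(" ") breaks the string at each space and returns an array of words.
var textWords = inputText.split(" ");

console.log(textWords.length); // 9
console.log(textWords[4]);     // jumped
```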
To count the number of letters per word, we need to step through the array, and get the length of each element. To do this, we use a common programming control technique: a loop.

JavaScript (like many other languages) offers two common ways of writing a loop: the "for" statement and the "while" statement. The following code snippet uses a while loop to record the length of each word and to total the number of letters in our text input:
var nWords = textWords.length;
var nLetters = new Array(nWords); // array for word lengths
var totalLetterCount = 0;
var nLettersPerWord = 0;
var i = 0; // loop counter

while(i < nWords)
{
   nLetters[i] = textWords[i].length;
   totalLetterCount += textWords[i].length; // running total of word lengths
   ++i; // increment the loop counter
}
nLettersPerWord = totalLetterCount/nWords;

The number of words, nWords, is equal to the length of our array (remember that the Array object's .length property is equal to the number of elements in the array). We create an array, nLetters, to store the length of each word. nLetters has the same number of elements as our textWords array. We also create a variable to keep track of the running total of the letter count, totalLetterCount, and a counter variable, i, to keep track of where we are in the array. Notice how we named the variables to reflect their function. We also initialized (assigned initial values to) all of the variables, which is a good habit to get into.

The while loop works like this: as long as the condition in parentheses (i < nWords) evaluates to true, the program will continue to execute the statements contained in the following set of braces. So as long as the counter has not reached the end of the array, the program will continue to loop. The three statements within the braces instruct the program to:

  1. store the current word length in the corresponding position of our nLetters array;
  2. add the current word length to the variable totalLetterCount;
  3. increment (add 1 to) the loop counter.

Notice that the statements within the while loop are indented, so that we can easily see they belong to the loop. These statements use two handy operators you may not have seen before, += and ++. Both operators save you some typing.
totalLetterCount += textWords[i].length;
is equivalent to:
totalLetterCount = totalLetterCount + textWords[i].length;
and
++i;
is equivalent to:
i = i + 1;
Which would you rather type?

When the program has iterated through the loop 9 times, the loop counter, i, will be equal to nWords. The while condition will now evaluate to false and the program will proceed to the first statement following the closing brace.
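To watch the while loop at work, here is a self-contained version that can be run as-is, using the sample sentence from above. The array is written out directly rather than read from a form, and the console.log() calls are additions that only display the results.

```javascript
var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];
var nWords = textWords.length;
var nLetters = new Array(nWords); // array for word lengths
var totalLetterCount = 0;
var i = 0; // loop counter

while (i < nWords) {
   nLetters[i] = textWords[i].length;  // store this word's length
   totalLetterCount += nLetters[i];    // running total of word lengths
   ++i;                                // increment the loop counter
}
var nLettersPerWord = totalLetterCount / nWords;

console.log(nLetters.join(","));  // 3,5,5,3,6,4,3,4,3
console.log(totalLetterCount);    // 36
console.log(nLettersPerWord);     // 4
console.log(i);                   // 9, which equals nWords, so the loop stopped
```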

The for loop construction (below) is similar, but it keeps track of all the counter bookkeeping in one place. Here is the same code snippet, using a for loop:
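var nWords = textWords.length;
var nLetters = new Array(nWords); // array for word lengths
var totalLetterCount = 0;
var nLettersPerWord = 0;

for(var i = 0; i < nWords; ++i)
{
   nLetters[i] = textWords[i].length;
   totalLetterCount += textWords[i].length; // running total of word lengths
}
nLettersPerWord = totalLetterCount/nWords;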

The for construction gathers three things inside its parentheses: the statement that initializes the counter (var i = 0;), the condition that is tested to decide whether the statements between the braces should be executed (i < nWords;), and the statement that increments the counter (++i). (Note that there is no semicolon after the increment statement, just as there is no semicolon following the condition in the while statement.)

How do you choose which loop method to use? If your program has enough information to calculate how many times it needs to iterate (go through) the loop, then a for statement is usually clearest. It's easier to keep track of the loop counter if all of the statements are together in one place. So, for iterating through an array, the for loop is usually the best choice. In other cases, say when you are repeatedly processing input from the user and waiting for a special input to signal that it is time to go on to something else, then a while loop makes more sense.
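As a small illustration of the second case, here is a sketch in which the number of iterations is not known in advance: the loop scans a string one character at a time until it reaches the first period. The sample string and the variable names are made up for this example. The condition uses two operators from the Terms and Concepts list, != (not equal) and && (logical AND).

```javascript
var text = "The fox ran. The dog slept.";
var pos = 0; // current position in the string

// Keep going while there are characters left AND the current one is not a period.
while (pos < text.length && text.charAt(pos) != '.') {
   ++pos;
}

// Everything before position pos belongs to the first sentence.
var firstSentence = text.substring(0, pos);

console.log(firstSentence); // The fox ran
console.log(pos);           // 11
```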

With what you know now, you should be able to come up with something like this:

Simple Word Length Counter

Type or paste a list of words, separated by spaces, into the box below, then press the "Count Word Length" button to count the number of letters in each word.

 

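[Interactive form not shown: a text box, a "Count Word Length" button, and an output field labeled "Average number of letters per word".]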

It's a start, and works well enough for word lists separated by spaces. But what happens if you paste in ordinary text, punctuation and all? Aha! Punctuation marks get counted as letters. Read on to learn how to eliminate those pesky commas, periods, question marks and the like from your word lists.
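A minimal sketch of the function behind such a counter might look like the following. The function name countWordLengths and the object it returns are assumptions made for illustration, and the code that connects it to a TEXTAREA and a button is left out. The second call shows the punctuation problem just described: the period is counted as a letter of "dog.".

```javascript
// Hypothetical helper: splits on single spaces and measures each piece.
function countWordLengths(text) {
   var words = text.split(" ");
   var lengths = new Array(words.length);
   var total = 0;

   for (var i = 0; i < words.length; ++i) {
      lengths[i] = words[i].length;
      total += lengths[i];
   }
   return { lengths: lengths, average: total / words.length };
}

var clean = countWordLengths("lazy dog");
console.log(clean.lengths.join(","), clean.average); // 4,3 3.5

var messy = countWordLengths("lazy dog.");
console.log(messy.lengths.join(","), messy.average); // 4,4 4
```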

Step Two: Nested Loops and Conditional Statements

Suppose that instead of counting all the letters in the textWords array, you only wanted to count how many times the letter "e" was used. We can use a conditional statement to modify our previous letter-counting loop so that only e's are counted. Here's how:
var eCount = 0;

for(var i = 0; i < textWords.length; ++i){ // outer loop
   for(var j = 0; j < textWords[i].length; ++j){ // inner loop
      if(textWords[i].toString().charAt(j) == 'e' ||
         textWords[i].toString().charAt(j) == 'E'){ // if
         ++eCount;
      } // end of if
   } // end of inner loop
} // end of outer loop

Now we have two nested loops: the first one, with counter i, steps through the array of words, and the second one, with counter j, steps through the individual characters within each word. (Notice how we've used increasing indentation to identify the loops; this is where a text editor specifically for programming comes in handy.) Each element of textWords is already a string, so the .toString() call simply guarantees that we are working with a String; we then use the String object's .charAt() method to step through the individual characters. The statement:
if(textWords[i].toString().charAt(j) == 'e' || textWords[i].toString().charAt(j) == 'E')
means "if this character is a lowercase 'e' OR if this character is an uppercase 'E', then execute the statements between the following braces (otherwise, skip them)". The == is a comparison operator (like < or >). If the two operands are equal, the comparison evaluates to true; if they are not equal, it evaluates to false. (Note that the equality comparison operator, ==, is not the same as the assignment operator, =, which assigns the value of the right-hand operand to the left-hand operand.) The || is a logical operator, in this case logical OR.
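Here is the same idea packaged as a small function so that it can be run directly. The function name countLetterE is hypothetical, and storing the current character in a variable (ch) is a convenience added here; the logic is the nested loop described above.

```javascript
// Counts how many times 'e' or 'E' appears in an array of words.
function countLetterE(words) {
   var eCount = 0;
   for (var i = 0; i < words.length; ++i) {         // outer loop: each word
      for (var j = 0; j < words[i].length; ++j) {   // inner loop: each character
         var ch = words[i].charAt(j);
         if (ch == 'e' || ch == 'E') {
            ++eCount;
         }
      }
   }
   return eCount;
}

var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];
console.log(countLetterE(textWords));       // 4
console.log(countLetterE(["Eel", "tree"])); // 4
```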

The previous example showed how for loops can be nested. Conditional statements can be nested as well. Suppose that you wanted to watch your P's and Q's in the sample text. You would first check to see if the character was a "P", then check to see if it was a "Q", then go on to the next letter. Here's how:
var pCount = 0;
var qCount = 0;

for(var i = 0; i < textWords.length; ++i){
   for(var j = 0; j < textWords[i].length; ++j){
      if(textWords[i].toString().charAt(j) == 'p' ||
         textWords[i].toString().charAt(j) == 'P'){
         ++pCount;
      }
      else{
         if(textWords[i].toString().charAt(j) == 'q' ||
            textWords[i].toString().charAt(j) == 'Q'){
            ++qCount;
         }
      }
   }
}

Here we've used the else construction to tell the program what to do when the first if statement evaluates to false (i.e., when the character is not a lowercase or uppercase 'p'). In that case, there is a nested if statement, which checks to see if the character is a 'q'. If it is, qCount is incremented; otherwise, the loop simply continues.
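The same logic is often written with "else if", which avoids the extra level of braces and indentation. The sketch below should behave the same way as the nested version; the sample array and the ch variable are additions for the sake of a runnable example.

```javascript
var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];
var pCount = 0;
var qCount = 0;

for (var i = 0; i < textWords.length; ++i) {
   for (var j = 0; j < textWords[i].length; ++j) {
      var ch = textWords[i].charAt(j);
      if (ch == 'p' || ch == 'P') {
         ++pCount;
      }
      else if (ch == 'q' || ch == 'Q') { // only tested when ch is not a 'p'
         ++qCount;
      }
   }
}

console.log(pCount, qCount); // 1 1
```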

The project "ABC's of Programming: Writing a Simple "Alphabetizer" with JavaScript" shows how to use the JavaScript String object's .split() method to break a block of text into separate "words". However, when it is called with a plain string separator, the .split() method is quite limited: there can be only a single pattern for word breaks, but words in ordinary text are separated by both spaces and punctuation marks. You should be able to apply what you've learned about loops and conditional statements to write your own splitWords() function. It should take a paragraph of ordinary text as input and create an array of words as output. You can use nested for loops to iterate through each character in the text. Within the inner for loop you can use nested if...else statements to evaluate each character and decide what to do with it.
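One possible starting point for splitWords() is sketched below; it is not the only way to do it. It makes several assumptions that you may want to change: only the unaccented letters a-z and A-Z count as part of a word, so digits and apostrophes end a word (for example, "don't" would become "don" and "t"), and a single loop over the characters is used rather than nested loops. The helper name isLetter is made up for this sketch.

```javascript
// Assumption: only unaccented English letters are part of a word.
function isLetter(ch) {
   return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
}

function splitWords(text) {
   var words = [];   // finished words
   var current = ""; // the word being built

   for (var i = 0; i < text.length; ++i) {
      var ch = text.charAt(i);
      if (isLetter(ch)) {
         current += ch;            // still inside a word
      }
      else {
         if (current.length > 0) { // a space or punctuation mark ends the word
            words.push(current);
            current = "";
         }
      }
   }
   if (current.length > 0) {       // do not lose a word at the very end
      words.push(current);
   }
   return words;
}

console.log(splitWords("Hello, world! Is it you?").join("|"));
// Hello|world|Is|it|you
```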

Step Three: Using Two-Dimensional Arrays

You've already learned that an array is a list, for example, a list of all of the words in a sentence. A two-dimensional array is a list of lists. Following the previous example, our two-dimensional array would be a list of sentences, with each of the sentences consisting of a list of words.

Here's another example. Instead of sentences, just think of lines of text. Each line is an array of words, and each block of text is an array of lines. If our lines of text were:
the quick brown fox
jumped over the lazy dog
then a two-dimensional array containing all the words, line-by-line, would look like this:
lines[0][0]=the
lines[0][1]=quick
lines[0][2]=brown
lines[0][3]=fox
lines[1][0]=jumped
lines[1][1]=over
lines[1][2]=the
lines[1][3]=lazy
lines[1][4]=dog
The first index number indicates the line number, and the second index number indicates the position of the word within that line. You can use nested loops to iterate through all of the lines (outer loop), and all of the words within each line (inner loop).
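The listing above can be reproduced with a pair of nested loops. In this sketch the two-dimensional array is written out by hand as an array of arrays; in your own program it would be built from the input text.

```javascript
// Each inner array is one line of text; the outer array holds the lines.
var lines = [
   ["the", "quick", "brown", "fox"],
   ["jumped", "over", "the", "lazy", "dog"]
];

for (var i = 0; i < lines.length; ++i) {         // outer loop: each line
   for (var j = 0; j < lines[i].length; ++j) {   // inner loop: each word in the line
      console.log("lines[" + i + "][" + j + "]=" + lines[i][j]);
   }
}

console.log(lines[1][3]); // lazy
```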

As you might imagine, it keeps on going: arrays can have three (or even more) dimensions. Continuing our example, the next dimension would be lists of lines (paragraphs, maybe, or pages), and the next dimension would be lists of these (chapters or books), and so on.

You can use two-dimensional arrays in your project to make lists of sentences, which, in turn, are made up of lists of words.
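As a rough sketch of how the averages might be computed once such an array exists, the example below starts from a hand-written two-dimensional array of sentences. The sample sentences and all of the variable names are assumptions for illustration; splitting real text into sentences and words is still up to you.

```javascript
// Each inner array holds the words of one sentence.
var sentences = [
   ["the", "fox", "ran"],
   ["the", "lazy", "dog", "slept", "soundly"]
];

var wordsPerSentence = new Array(sentences.length);
var totalWords = 0;
var totalLetters = 0;

for (var i = 0; i < sentences.length; ++i) {
   wordsPerSentence[i] = sentences[i].length;       // words in this sentence
   totalWords += sentences[i].length;
   for (var j = 0; j < sentences[i].length; ++j) {
      totalLetters += sentences[i][j].length;       // letters in this word
   }
}

var avgWordsPerSentence = totalWords / sentences.length;
var avgWordLength = totalLetters / totalWords;

console.log(wordsPerSentence.join(",")); // 3,5
console.log(avgWordsPerSentence);        // 4
console.log(avgWordLength);              // 3.875
```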

We've covered a lot of ground! If you've gotten this far, you should be ready to write a program to "measure" some text.

Note for JavaScript files with Internet Explorer:
If you experience difficulty running your JavaScript code in Internet Explorer, we strongly suggest that you install the Firefox web browser and use it instead of Internet Explorer. For more information or to download the Firefox installer, see: http://www.mozilla.com/firefox/.

If you want to continue to use Internet Explorer, try adding the following line at the beginning of your file:
<!-- saved from url=(0014)about:internet -->
This line will cause Internet Explorer to run your file according to the security rules for the Internet zone on your computer. In our experience this may work, or it may not. For more information see: http://msdn.microsoft.com/library/default.asp?url=/workshop/author/dhtml/overview/motw.asp and http://www.phdcc.com/xpsp2.htm.

Terms and Concepts

To write a simple program in JavaScript, you should do research that enables you to understand the following terms and concepts:

  • HTML concepts:
    • start tags and end tags,
    • comments,
    • the <HEAD> section,
    • the <SCRIPT> section,
    • the <BODY> section,
    • the <FORM> section,
    • the <INPUT> tag,
    • the <TEXTAREA> tag.
  • JavaScript concepts:
    • functions,
    • variables,
    • objects,
    • properties,
    • methods,
    • events,
    • arrays (including 2-dimensional arrays).
  • specific JavaScript methods:
    • .split(), a String object method for separating text into arrays,
  • general programming concepts:
    • reserved words
    • control statements (e.g., "if...else" statements, "for" and "while" loops)
    • operators:
      • assignment operators, (e.g., =,+=,-=,*=,/=)
      • comparison operators, (e.g., <,>,<=,>=,==)
      • logical operators (AND, OR and NOT, i.e., &&,||,!)
    • multi-dimensional arrays

Bibliography

  • You can find a step-by-step JavaScript tutorial at the link below. After studying the tutorial, you should be able to explain the terms and concepts listed above, and you'll be ready to write your text-measuring program.
    Webteacher Software. (2006). JavaScript Tutorial for the Total Non-Programmer. Webteacher Software, LLC. Retrieved June 6, 2006, from http://www.webteacher.com/javascript/index.html
  • Introduction to Programming by Matt Gemmell describes what programming actually is:
    Gemmell, M. (2007). Introduction to Programming. Dean’s Director Tutorials & Resources. Retrieved March 14, 2014, from http://www.deansdirectortutorials.com/Lingo/IntroductionToProgramming.pdf
  • HTML Forms reference:
    W3C. (1999). Forms in HTML Documents, HTML 4.01 Specification. World Wide Web Consortium: Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University. Retrieved June 6, 2006, from http://www.w3.org/TR/REC-html40/interact/forms.html
  • A list of reserved words in JavaScript (you cannot use these words for function or variable names in your program, because they are reserved for the programming language itself):
    JavaScript Kit. (2006). JavaScript Reserved Words. Retrieved June 6, 2006, from http://www.JavaScriptkit.com/jsref/reserved.shtml
  • A JavaScript reference:
    Mozilla Developer Network and Individual Contributors. (2013, December 14). JavaScript reference. Retrieved March 14, 2014, from https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference
  • If you get interested and start doing a lot of web development, you may want to use a text editor that is a little more sophisticated than Notepad. An editor designed for web development and programming can help with formatting so that your code is more readable, but still produce plain text files. This type of editor can also fill in web-specific HTML coding details and do "syntax highlighting" (e.g., automatic color-coding of HTML), which can help you find errors. One option is the CoffeeCup HTML Editor.

Materials and Equipment

To do this experiment you will need the following materials and equipment:

Experimental Procedure

  1. Using what you have learned about programming in JavaScript, create a program that will take a block of text as input and that can count:
    • the number of sentences contained in the text,
    • the number of words per sentence,
    • the number of letters per word,
    • the average number of words per sentence, and
    • the average word length.
  2. There are several sub-tasks here. Take the sub-tasks one at a time, and gradually build up the capabilities of your program.
  3. Verify that the code for each sub-task is working properly before moving on to the next sub-task.
  4. Test your program by copying and pasting paragraphs of text from various sources. Do counts by hand to test the accuracy of your program.

Here are some generally applicable programming tips to keep in mind as you get started with this project.
  1. Plan your work.
    • Methodically think through all of the steps to solve your programming problem.
    • Try to parcel the tasks out into short, manageable functions.
    • Think about the interface for each function: what arguments need to be passed to the function so that it can do its job?
  2. Use careful naming, good formatting, and descriptive comments to make your code understandable.
    • Give your functions and variables names that reflect their purpose in the program. A good choice of names makes the code more readable.
    • Indent the statements in the body of a function so it is clear where the function code begins and ends.
    • Indent the statements following an "if", "else", "for", or "while" control statement. That way you can easily see what statements are executed for a given branch or loop in your code.
    • Descriptive comments are like notes to yourself. Oftentimes in programming, you'll run into a problem similar to one you've solved before. If your code is well-commented, it will make it easier to go back and re-use pieces of it later. Your comments will help you remember how you solved the previous problem.
  3. Work incrementally.
    • When you are creating a program, it is almost inevitable that along the way you will also create bugs. Bugs are mistakes in your code that either cause your program to behave in ways that you did not intend, or cause it to stop working altogether. The more lines of code, the more chances for bugs. So, especially when you are first starting, it is important to work incrementally. Make just one change at a time, make sure that it works as expected and then move on.
    • With JavaScript, this is easy to do. To check your code, all you need to do is use your web browser to open the HTML file containing your code. After you've made a change in the code with your text editor, just save the file, then switch to your browser and hit the re-load page button to see the change in action.
    • Test to make sure that your code works as expected. If your code has branch points that depend on user input, make sure that you test each of the possible branch points to make sure that there are no surprises.
    • Also, it's a good idea to back up your file once in a while under a different name. That way, if something goes really wrong and you can't figure it out, you don't need to start over from scratch. Instead you can go back to an earlier version that worked, and start over from there.
    • As you gain more experience with a particular programming environment, you'll be able to write larger chunks of code at one time. Even then, it is important to remember to test each new section of code to make sure that it works as expected before moving on to the next piece. By getting in the habit of working incrementally, you'll reduce the amount of time you spend identifying and fixing bugs.
  4. When debugging, work methodically to isolate the problem.
    • We told you above that bugs are inevitable, so how do you troubleshoot them? Well, the first step is to isolate the problem: which line caused the program to stop working as expected? If you are following the previous tip and working incrementally, you can be pretty sure that the problem is with the line you just wrote.
    • Check for simple typos first. A misspelled function name will not be recognized. In JavaScript, assigning a value to a misspelled variable name can silently create a new variable instead of reporting an error. This is an easy mistake to make, and can be hard to find.
    • Avoid using reserved words as variable or function names.
  5. Test your program thoroughly.
    • You should test the program incrementally as you write it, but you should also test the completed program to make sure that it behaves as expected.

Variations

Related Links

If you like this project, you might enjoy exploring these related careers:

Computer Programmer

Computers are essential tools in the modern world, handling everything from traffic control, car welding, movie animation, shipping, aircraft design, and social networking to book publishing, business management, music mixing, health care, agriculture, and online shopping. Computer programmers are the people who write the instructions that tell computers what to do.

Computer Software Engineer

Are you interested in developing cool video game software for computers? Would you like to learn how to make software run faster and more reliably on different kinds of computers and operating systems? Do you like to apply your computer science skills to solve problems? If so, then you might be interested in the career of a computer software engineer.

Software Quality Assurance Engineer & Tester

Software quality assurance engineers and testers oversee the quality of a piece of software's development over its entire life cycle. Their goal is to see to it that the final product meets the customer's requirements and expectations in both performance and value. During the software life cycle, they verify (officially state) that it is possible for the software to accomplish certain tasks. They detect problems that exist in the process of developing the software, or in the product itself. They try and make things not work (try to "break" the software) by creating errors or combinations of errors that a user might make. For example, if a user enters a period or a pound sign for a password, will that break the software? They seek to anticipate potential issues with the software before they become visible. At the end of the life cycle, they reflect upon how problems or bugs arose, and figure out ways to make the software development process better in the future.
