Paragraph Stats: Writing a JavaScript Program to 'Measure' Text

Abstract

This is a more challenging first-time programming project. You'll learn how to use JavaScript to create a simple program to analyze one or more paragraphs of text. Your program will count sentences, words and letters, and report the resulting statistics. You'll be able to run your program in your Web browser.

Summary

Time Required
Short (2-5 days)
Prerequisites
An understanding of the material covered in "ABC's of Programming: Writing a Simple 'Alphabetizer' with JavaScript"
Material Availability
Readily available
Cost
Very Low (under $20)
Safety
No issues
Credits
Andrew Olson, Ph.D., Science Buddies

Objective

The objective of this project is to write a JavaScript program to make some basic measurements on a block of text.

Introduction

This is an example of a more advanced first-time programming project. You will be writing a simple program to gather statistics on a block of text. The project uses JavaScript, an interpreted programming language supported by most Web browsers.

Prerequisites

It is assumed that you know how to use a text editor to write an HTML file containing your program, and that you know how to load and run the file in your browser.

You will also need to know the basics of how JavaScript functions can be written to work with HTML FORM elements. Specifically, you will be working with HTML TEXTAREA and INPUT elements, and using JavaScript String and Array objects in your functions.

If you need help with any of this, read the Introduction section in: "ABC's of Programming: Writing a Simple 'Alphabetizer' with JavaScript"

New Material

In this project, you will learn some important methods of program control logic, which you will use again and again as you write more programs. These methods are used in just about every programming language (though the actual syntax may vary slightly from language to language). Specifically, you will learn about the JavaScript "for" and "while" loop control statements, and "if...else" conditional statements. You will also learn about 2-dimensional arrays (lists of lists).

Writing a JavaScript Program to "Measure" a Block of Text

The goal is to write a program that can take a block of text and calculate:

  1. the number of sentences contained in the text,
  2. the number of words in each sentence,
  3. the number of letters in each word,
  4. the average number of words per sentence, and
  5. the average word length.

To tackle this, as with any programming project, a good way to start is to break it down into manageable sub-tasks. So for starters, let's try counting the number of letters in each word.

Step One: Loops for Counting Letters in Each Word

We'll assume that we've retrieved the text from the form, split it into the separate words, and stored the words in an array. To make it specific, let's say that the input text was: "the quick brown fox jumped over the lazy dog", and our array is called "textWords". Our array, then, looks like this:
textWords[0]=the
textWords[1]=quick
textWords[2]=brown
textWords[3]=fox
textWords[4]=jumped
textWords[5]=over
textWords[6]=the
textWords[7]=lazy
textWords[8]=dog
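As a minimal sketch of how such an array might be built, the snippet below uses the String object's .split() method with a single space as the separator. The variable name inputText is made up for this illustration; in your program the text would come from the form.

```javascript
// Hypothetical variable standing in for the text retrieved from the form.
var inputText = "the quick brown fox jumped over the lazy dog";

// Break the text into words wherever a single space appears.
var textWords = inputText.split(" ");

console.log(textWords.length); // 9
console.log(textWords[4]);     // jumped
```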
To count the number of letters per word, we need to step through the array, and get the length of each element. To do this, we use a common programming control technique: a loop.

JavaScript (and many other languages) has two common ways of writing a loop, the "for" statement and the "while" statement. The following code snippet uses a while loop to count the number of letters in our text input:
var nWords = textWords.length;
var nLetters = new Array(nWords); // array for word lengths
var totalLetterCount = 0;
var nLettersPerWord = 0;
var i = 0; // loop counter

while(i < nWords)
{
   nLetters[i] = textWords[i].length;
   totalLetterCount += textWords[i].length; // running total of word lengths
   ++i; // increment the loop counter
}
nLettersPerWord = totalLetterCount/nWords;

The number of words, nWords, is equal to the length of our array (remember that the Array object's .length property is equal to the number of elements in the array). We create an array, nLetters, to store the length of each word. nLetters has the same number of elements as our textWords array. We also create variables to keep track of the running total of the letter count, totalLetterCount, and a counter variable, i, to keep track of where we are in the array. Notice how we named the variables to reflect their function. We also initialized (assigned initial values to) all of the variables, which is a good habit to get into.

The while loop works like this: as long as the condition in parentheses (i < nWords) evaluates to true, the program will continue to execute the statements contained in the following set of braces. So as long as the counter has not reached the end of the array, the program will continue to loop. The three statements within the braces instruct the program to:

  1. store the current word length in the corresponding position of our nLetters array;
  2. add the current word length to the variable totalLetterCount;
  3. increment (add 1 to) the loop counter.

Notice that the statements within the while loop are indented, so that we can easily see they belong to the loop. These statements use two handy operators you may not have seen before, += and ++. Both operators save you some typing.
totalLetterCount += textWords[i].length;
is equivalent to:
totalLetterCount = totalLetterCount + textWords[i].length;
and
++i;
is equivalent to:
i = i + 1;
Which would you rather type?

When the program has iterated through the loop 9 times, the loop counter, i, will be equal to nWords. The while condition will now evaluate to false and the program will proceed to the first statement following the closing brace.
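To see the whole while loop run from start to finish, here is a self-contained sketch that fills in the sample sentence by hand. The array contents are taken from the example above; the console.log() calls are added only so you can check the results.

```javascript
// Sample words from the example sentence, typed in directly.
var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];

var nWords = textWords.length;
var nLetters = new Array(nWords); // array for word lengths
var totalLetterCount = 0;
var i = 0;                        // loop counter

while (i < nWords) {
   nLetters[i] = textWords[i].length;
   totalLetterCount += nLetters[i]; // running total of word lengths
   ++i;                             // increment the loop counter
}
var nLettersPerWord = totalLetterCount / nWords;

console.log(nLetters.join(",")); // 3,5,5,3,6,4,3,4,3
console.log(totalLetterCount);   // 36
console.log(nLettersPerWord);    // 4
console.log(i);                  // 9 (equal to nWords, so the loop stopped)
```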

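The for loop construction (below) is similar, but it keeps track of all the counter bookkeeping in one place. Here is the same calculation using a for loop (this shorter version keeps only the running total, here named letterCount, and does not store the individual word lengths):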
var nWords = textWords.length;
var letterCount = 0;
var nLettersPerWord = 0;

for(var i = 0; i < nWords; ++i)
{
   letterCount += textWords[i].length;
}
nLettersPerWord = letterCount/nWords;

The for construction gathers three things inside its parentheses: the statement that initializes the counter (var i = 0;), the condition that is tested to decide whether the statements between the braces should be executed (i < nWords;), and the statement that increments the counter (++i). (Note that there is no semicolon after the increment statement, just as there is no semicolon following the condition in the while statement.)

How do you choose which loop method to use? If your program has enough information to calculate how many times it needs to iterate (go through) the loop, then a for statement is usually clearest. It's easier to keep track of the loop counter if all of the statements are together in one place. So, for iterating through an array, the for loop is usually the best choice. In other cases, say when you are repeatedly processing input from the user and waiting for a special input to signal that it is time to go on to something else, then a while loop makes more sense.
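The second case can be sketched as follows. Real user input is not available in a short snippet, so the array entries is a hypothetical stand-in for input arriving one item at a time, and the word "done" is an assumed signal that it is time to stop.

```javascript
// Hypothetical stand-in for user input arriving one entry at a time.
var entries = ["fox", "dog", "done", "cat"];
var collected = [];
var k = 0;

// We do not know in advance how many entries come before "done",
// so a while loop fits better than a for loop here.
while (k < entries.length && entries[k] != "done") {
   collected.push(entries[k]);
   ++k;
}

console.log(collected.join(",")); // fox,dog
```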

With what you know now, you should be able to come up with something like this:

Simple Word Length Counter

Type or paste a list of words, separated by spaces, into the box below, then press the "Count Word Length" button to count the number of letters in each word.

 

Average number of letters per word:

It's a start, and works well enough for word lists separated by spaces. But what happens if you paste in ordinary text, punctuation and all? Aha! Punctuation marks get counted as letters. Read on to learn how to eliminate those pesky commas, periods, question marks and the like from your word lists.
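The punctuation problem is easy to reproduce. The sketch below is one possible version of the counting logic, without the HTML form; the function name averageWordLength is made up for this illustration.

```javascript
// Hypothetical helper: average number of characters per
// space-separated "word" in the text.
function averageWordLength(text) {
   var words = text.split(" ");
   var letterCount = 0;
   for (var i = 0; i < words.length; ++i) {
      letterCount += words[i].length;
   }
   return letterCount / words.length;
}

console.log(averageWordLength("lazy dog"));   // 3.5
console.log(averageWordLength("lazy, dog.")); // 4.5 (comma and period were counted)
```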

Step Two: Nested Loops and Conditional Statements

Suppose that instead of counting all the letters in the textWords array, you only wanted to count how many times the letter "e" was used. We can use a conditional statement to modify our previous letter-counting loop so that only e's are counted. Here's how:
var eCount = 0;

for(var i = 0; i < textWords.length; ++i){ // outer loop
   for(var j = 0; j < textWords[i].length; ++j){ // inner loop
      if(textWords[i].toString().charAt(j) == 'e' ||
         textWords[i].toString().charAt(j) == 'E'){ // if
         ++eCount;
      } // end of if
   } // end of inner loop
} // end of outer loop

Now we have two nested loops: the first one, with counter i, steps through the array of words, and the second one, with counter j, steps through the individual characters within each word. (Notice how we've used increasing indentation to identify the loops; this is where a text editor designed for programming comes in handy.) Each element of the textWords array is a word. We call .toString() on it to make sure we are working with a String, and then use the String object's .charAt() method to step through the individual characters. The statement:
if(textWords[i].toString().charAt(j) == 'e' || textWords[i].toString().charAt(j) == 'E')
means "if this character is a lowercase 'e' OR if this character is an uppercase 'E', then execute the statements between the following braces (otherwise, skip them)". The == is a comparison operator (like < or >). If the two operands are equal, the comparison evaluates to true; if they are not equal, it evaluates to false. (Note that the equality comparison operator, ==, is not the same as the assignment operator, =, which assigns the value of the right-hand operand to the left-hand operand.) The || is a logical operator, in this case logical OR.
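Here is a runnable sketch of the e-counting loops applied to the sample sentence. The array is typed in by hand, and the temporary variable ch is an addition that simply avoids repeating the .charAt() call.

```javascript
var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];
var eCount = 0;

for (var i = 0; i < textWords.length; ++i) {       // outer loop: words
   for (var j = 0; j < textWords[i].length; ++j) { // inner loop: characters
      var ch = textWords[i].charAt(j);             // current character
      if (ch == 'e' || ch == 'E') {
         ++eCount;
      }
   }
}

console.log(eCount); // 4 (the, jumped, over, the)
```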

The previous example showed how for loops can be nested. Conditional statements can be nested as well. Suppose that you wanted to watch your P's and Q's in the sample text. You would first check to see if the character was a "P", then check to see if it was a "Q", then go on to the next letter. Here's how:
var pCount = 0;
var qCount = 0;

for(var i = 0; i < textWords.length; ++i){
   for(var j = 0; j < textWords[i].length; ++j){
      if(textWords[i].toString().charAt(j) == 'p' ||
         textWords[i].toString().charAt(j) == 'P'){
         ++pCount;
      }
      else{
         if(textWords[i].toString().charAt(j) == 'q' ||
            textWords[i].toString().charAt(j) == 'Q'){
            ++qCount;
         }
      }
   }
}

Here we've used the else construction to tell the program what to do when the first if statement evaluates to false (i.e., when the character is not a lowercase or uppercase 'p'). In that case, there is a nested if statement, which checks to see if the character is a 'q'. If it is, qCount is incremented; otherwise, the loop simply continues.
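JavaScript also lets you write the nested if directly after the else, as "else if", which does the same thing with less indentation. The sketch below applies that form to the sample sentence; the variable ch is an addition for readability.

```javascript
var textWords = ["the", "quick", "brown", "fox", "jumped",
                 "over", "the", "lazy", "dog"];
var pCount = 0;
var qCount = 0;

for (var i = 0; i < textWords.length; ++i) {
   for (var j = 0; j < textWords[i].length; ++j) {
      var ch = textWords[i].charAt(j);
      if (ch == 'p' || ch == 'P') {
         ++pCount;
      }
      else if (ch == 'q' || ch == 'Q') { // only checked when ch is not a 'p'
         ++qCount;
      }
   }
}

console.log(pCount); // 1 (jumped)
console.log(qCount); // 1 (quick)
```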

The project "ABC's of Programming: Writing a Simple 'Alphabetizer' with JavaScript" shows how to use the JavaScript String object's .split() method to break a block of text into separate "words". However, used with a simple separator string, the .split() method is quite limited. For example, it breaks the text only where that one separator appears, but words in ordinary text are separated by both spaces and punctuation marks. You should be able to apply what you've learned about loops and conditional statements to write your own splitWords() function. It should take a paragraph of ordinary text as input and create an array of words as output. You can use nested for loops to iterate through each character in the text. Within the inner for loop you can use nested if...else statements to evaluate each character and decide what to do with it.
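As a starting point, here is one possible sketch of such a function, not the only way to write it. It makes a simplifying assumption: only the letters a-z and A-Z count as part of a word, and every other character ends the current word. It uses a single loop with if...else statements rather than nested loops.

```javascript
// One possible splitWords(): letters build up a word,
// anything else (space, comma, period, ...) ends it.
function splitWords(text) {
   var words = [];
   var currentWord = "";
   for (var i = 0; i < text.length; ++i) {
      var ch = text.charAt(i);
      if ((ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')) {
         currentWord += ch;            // letter: extend the current word
      }
      else {
         if (currentWord.length > 0) { // separator: store the finished word
            words.push(currentWord);
            currentWord = "";
         }
      }
   }
   if (currentWord.length > 0) {       // keep a word that ends the text
      words.push(currentWord);
   }
   return words;
}

console.log(splitWords("The quick, brown fox. Jumped?").join("|"));
// The|quick|brown|fox|Jumped
```

Because of the letters-only assumption, a word such as "don't" would be split into "don" and "t". Deciding how to treat apostrophes, hyphens and digits is part of the design work for your own version.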

Step Three: Using Two-Dimensional Arrays

You've already learned that an array is a list, for example, a list of all of the words in a sentence. A two-dimensional array is a list of lists. Following the previous example, our two-dimensional array would be a list of sentences, with each of the sentences consisting of a list of words.

Here's another example. Instead of sentences, just think of lines of text. Each line is an array of words, and each block of text is an array of lines. If our lines of text were:
the quick brown fox
jumped over the lazy dog
then a two-dimensional array containing all the words, line-by-line, would look like this:
lines[0][0]=the
lines[0][1]=quick
lines[0][2]=brown
lines[0][3]=fox
lines[1][0]=jumped
lines[1][1]=over
lines[1][2]=the
lines[1][3]=lazy
lines[1][4]=dog
The first index number indicates the line number, and the second index number indicates the position of the word within that line. You can use nested loops to iterate through all of the lines (outer loop), and all of the words within each line (inner loop).
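The sketch below builds the lines array by hand and uses nested loops to reproduce the listing above. The listing array is an addition that collects the output so it can be printed in one go.

```javascript
// A two-dimensional array: each element of lines is itself an array of words.
var lines = [
   ["the", "quick", "brown", "fox"],
   ["jumped", "over", "the", "lazy", "dog"]
];
var listing = [];

for (var i = 0; i < lines.length; ++i) {       // outer loop: lines
   for (var j = 0; j < lines[i].length; ++j) { // inner loop: words in line i
      listing.push("lines[" + i + "][" + j + "]=" + lines[i][j]);
   }
}

console.log(listing.join("\n"));
```

Note that the inner loop's limit is lines[i].length, not a fixed number, because the two lines have different numbers of words.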

As you might imagine, it keeps on going: arrays can have three (or even more) dimensions. Continuing our example, the next dimension would be lists of lines (paragraphs, maybe, or pages), and the next dimension would be lists of these (chapters or books), and so on.

You can use two-dimensional arrays in your project to make lists of sentences, which, in turn, are made up of lists of words.
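As a rough sketch of where this leads, the snippet below assumes the text has already been split into a two-dimensional array named sentences (a hypothetical name, filled in by hand here) and computes the two averages from the project goals.

```javascript
// Assumed input: an array of sentences, each an array of words.
var sentences = [
   ["the", "quick", "brown", "fox"],
   ["jumped", "over", "the", "lazy", "dog"]
];

var nSentences = sentences.length;
var wordCount = 0;
var letterCount = 0;

for (var i = 0; i < nSentences; ++i) {
   wordCount += sentences[i].length;               // words in sentence i
   for (var j = 0; j < sentences[i].length; ++j) {
      letterCount += sentences[i][j].length;       // letters in word j
   }
}

var avgWordsPerSentence = wordCount / nSentences;
var avgWordLength = letterCount / wordCount;

console.log(nSentences);          // 2
console.log(avgWordsPerSentence); // 4.5
console.log(avgWordLength);       // 4
```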

We've covered a lot of ground! If you've gotten this far, you should be ready to write a program to "measure" some text.

Note for JavaScript files with Internet Explorer:
If you experience difficulty running your JavaScript code in Internet Explorer, we strongly suggest that you install the Firefox web browser and use it instead of Internet Explorer. For more information or to download the Firefox installer, see: http://www.mozilla.com/firefox/.

If you want to continue to use Internet Explorer, try adding the following line at the beginning of your file:
<!-- saved from url=(0014)about:internet -->
This line will cause Internet Explorer to run your file according to the security rules for the Internet zone on your computer. In our experience this may work, or it may not. For more information see http://www.phdcc.com/xpsp2.htm.

Terms and Concepts

To write a simple program in JavaScript, you should do research that enables you to understand the following terms and concepts:

Bibliography

  • You can find a step-by-step JavaScript tutorial at the link below. After studying the tutorial, you should be able to answer all of the questions listed above, and you'll be ready to write a simple calculator. Webteacher Software. (2006). JavaScript Tutorial for the Total Non-Programmer. Webteacher Software, LLC. Retrieved June 6, 2006.
  • Introduction to Programming by Matt Gemmell describes what programming actually is: Gemmell, M. (2007). Introduction to Programming. Dean's Director Tutorials & Resources. Retrieved March 14, 2014.
  • HTML Forms reference: W3C. (1999). Forms in HTML Documents, HTML 4.01 Specification. World Wide Web Consortium: Massachusetts Institute of Technology, Institut National de Recherche en Informatique et en Automatique, Keio University. Retrieved June 6, 2006.
  • A list of reserved words in JavaScript (you cannot use these words for function or variable names in your program, because they are reserved for the programming language itself): JavaScript Kit. (2006). JavaScript Reserved Words. Retrieved June 6, 2006.
  • A JavaScript reference: Mozilla Developer Network and Individual Contributors. (2013, December 14). JavaScript reference. Retrieved March 14, 2014.
  • If you get interested and start doing a lot of web development, you may want to use a text editor that is a little more sophisticated than Notepad. An editor designed for web development and programming can help with formatting so that your code is more readable, but still produce plain text files. This type of editor can also fill in web-specific HTML coding details and do "syntax highlighting" (e.g., automatic color-coding of HTML), which can help you find errors. To find one, search online for "free HTML editor."

Materials and Equipment

To do this experiment you will need the following materials and equipment:

Experimental Procedure

  1. Using what you have learned about programming in JavaScript, create a program that will take a block of text as input and that can count:
    • the number of sentences contained in the text,
    • the number of words per sentence,
    • the number of letters per word,
    • the average number of words per sentence, and
    • the average word length.
  2. There are several sub-tasks here. Take the sub-tasks one at a time, and gradually build up the capabilities of your program.
  3. Verify that the code for each sub-task is working properly before moving on to the next sub-task.
  4. Test your program by copying and pasting paragraphs of text from various sources. Do counts by hand to test the accuracy of your program.
Here are some generally applicable programming tips to keep in mind as you get started with this project.
  1. Plan your work.
    • Methodically think through all of the steps to solve your programming problem.
    • Try to parcel the tasks out into short, manageable functions.
    • Think about the interface for each function: what arguments need to be passed to the function so that it can do its job?
  2. Use careful naming, good formatting, and descriptive comments to make your code understandable.
    • Give your functions and variables names that reflect their purpose in the program. A good choice of names makes the code more readable.
    • Indent the statements in the body of a function so it is clear where the function code begins and ends.
    • Indent the statements following an "if", "else", "for", or "while" control statement. That way you can easily see what statements are executed for a given branch or loop in your code.
    • Descriptive comments are like notes to yourself. Oftentimes in programming, you'll run into a problem similar to one you've solved before. If your code is well-commented, it will make it easier to go back and re-use pieces of it later. Your comments will help you remember how you solved the previous problem.
  3. Work incrementally.
    • When you are creating a program, it is almost inevitable that along the way you will also create bugs. Bugs are mistakes in your code that either cause your program to behave in ways that you did not intend, or cause it to stop working altogether. The more lines of code, the more chances for bugs. So, especially when you are first starting, it is important to work incrementally. Make just one change at a time, make sure that it works as expected and then move on.
    • With JavaScript, this is easy to do. To check your code, all you need to do is use your web browser to open the HTML file containing your code. After you've made a change in the code with your text editor, just save the file, then switch to your browser and hit the re-load page button to see the change in action.
    • Test to make sure that your code works as expected. If your code has branch points that depend on user input, make sure that you test each of the possible branch points to make sure that there are no surprises.
    • Also, it's a good idea to back up your file once in a while with a different name. That way, if something goes really wrong and you can't figure it out, you don't need to start over from scratch. Instead you can go back to an earlier version that worked, and start over from there.
    • As you gain more experience with a particular programming environment, you'll be able to write larger chunks of code at one time. Even then, it is important to remember to test each new section of code to make sure that it works as expected before moving on to the next piece. By getting in the habit of working incrementally, you'll reduce the amount of time you spend identifying and fixing bugs.
  4. When debugging, work methodically to isolate the problem.
    • We told you above that bugs are inevitable, so how do you troubleshoot them? Well, the first step is to isolate the problem: which line caused the program to stop working as expected? If you are following the previous tip and working incrementally, you can be pretty sure that the problem is with the line you just wrote.
    • Check for simple typos first. A misspelled function name will not be recognized. In JavaScript (unless strict mode is turned on), assigning a value to a misspelled variable name simply creates a new variable. This is an easy mistake to make, and can be hard to find.
    • Avoid using reserved words as variable or function names.
  5. Test your program thoroughly.
    • You should test the program incrementally as you write it, but you should also test the completed program to make sure that it behaves as expected.
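The typo tip above describes JavaScript's default behavior. One optional safeguard, which the project itself does not use, is the "use strict" directive: with it, assigning to an undeclared name raises a ReferenceError instead of silently creating a new variable. The function and variable names below are made up for this sketch.

```javascript
function countWithTypo() {
   "use strict";        // turn on strict mode for this function only
   var letterCount = 0;
   lettrCount = 5;      // typo: undeclared name, so strict mode throws here
   return letterCount;
}

var caught = "";
try {
   countWithTypo();
}
catch (err) {
   caught = err.name;   // remember what kind of error was raised
}

console.log(caught); // ReferenceError
```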
Global Connections

The United Nations Sustainable Development Goals (UNSDGs) are a blueprint to achieve a better and more sustainable future for all.

This project explores topics key to Industry, Innovation and Infrastructure: Build resilient infrastructure, promote sustainable industrialization and foster innovation.

Cite This Page

General citation information is provided here. Be sure to check the formatting, including capitalization, for the method you are using and update your citation, as needed.

MLA Style

Science Buddies Staff. "Paragraph Stats: Writing a JavaScript Program to 'Measure' Text." Science Buddies, 14 July 2023, https://www.sciencebuddies.org/science-fair-projects/project-ideas/CompSci_p003/computer-science/paragraph-stats-writing-a-javascript-program-to-measure-text. Accessed 19 Mar. 2024.

APA Style

Science Buddies Staff. (2023, July 14). Paragraph Stats: Writing a JavaScript Program to 'Measure' Text. Retrieved from https://www.sciencebuddies.org/science-fair-projects/project-ideas/CompSci_p003/computer-science/paragraph-stats-writing-a-javascript-program-to-measure-text


Last edit date: 2023-07-14