Abstract
Sunspot activity has been monitored continuously since about 1700. The historical data shows that sunspot activity rises and falls in a roughly 11-year cycle. This project shows you how you can use both graphical and statistical analysis to look for patterns in cyclical data.Objective
Do sunspot cycles consistently have a faster rise time and a slower decay time? The goal of this project is to find out, using historical data and basic statistical analysis. As an added bonus, you'll learn the basics of working with a spreadsheet program.
Credits
Andrew Olson, Ph.D., Science Buddies
Sources
- The idea for this project sprang from a discussion of the aspect ratio in statistical displays on page 25 in:
Tufte, E.R., 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. Cheshire, CT: Graphics Press. - The original source which Tufte cites is:
Cleveland, W.S., 1994. The Elements of Graphing Data, revised edition. Murray Hill, NJ: AT&T Bell Laboratories, pp. 66 - 79. - Monthly sunspot data was obtained from:
Van der Linden, R.A.M. and the SIDC team, 1749-2006, "Online Catalogue of the Sunspot Index," SIDC, Royal Observatory of Belgium [accessed February 20, 2006] http://sidc.oma.be/. - Galileo's sunspot drawing was reproduced, with permission, from Professor Owen Gingerich's copy of the first edition of Istoria e Dimostrazioni Intorno Alle Macchie Solari e Loro Accidenti Rome (History and Demonstrations Concerning Sunspots and their Properties, published 1613), and published online here:
Van Helden, A. and E. Burr, 1995. "Galileo's Sunspot Drawings," The Galileo Project, Rice University [accessed February 20, 2006] http://galileo.rice.edu/sci/observations/sunspot_drawings.html.
Cite This Page
MLA Style
APA Style
Share your story with Science Buddies!
I Did This Project! Please log in and let us know how things went.Last edit date: 2015-02-23
Introduction
Sunspots were first discovered by Galileo Galilei in 1612. Galileo made regular observations of sunspots and was able to prove that he was seeing features on the surface of the sun, which moved as the sun rotated. (Another interesting project involving sunspots is to recreate Galileo's experiments using satellite imagery that you can collect online. See Using the Solar & Heliospheric Observatory Satellite (SOHO) to Determine the Rotation of the Sun.)
What is a sunspot, anyway? The SOHO Explore Glossary defines a sunspot this way: "a temporary disturbed area in the solar photosphere that appears dark because it is cooler than the surrounding areas. Sunspots consist of concentrations of strong magnetic flux. They usually occur in pairs or groups of opposite polarity that move in unison across the face of the Sun as it rotates." (SOHO Explore Glossary, 2006)
To see what sunspots looks like, here are two images of the sun's photosphere, taken by the Solar and Heliospheric Observatory (a joint project of NASA and the European Space Agency). The one on the left was taken on November 15, 1999. The one on the right was taken on February 20, 2006.
For more solar images, check out the SOHO links in the Bibliography. The EIT (Extreme ultraviolet Imaging Telescope) images show the sun's atmosphere for specific wavelengths in the ultraviolet region of the spectrum. For example, at 171 angstroms (one angstrom is one ten-billionth of a meter, or 10^{-10} m) the UV light is mostly emitted by Fe IX and X (iron ionized 8 or 9 times) at 1 million degrees Kelvin. Iron emissions provide visualization of the magnetic field lines. Here are two examples of these amazing images, corresponding to the same dates as the visible-light images:
We've come a long way from Galileo's telescope in 1612! But as you'll see, there is still value in data from hundreds of years ago. We have annual data on sunspot numbers going back to 1700, and monthly data to 1749. The sunspot number for an observation is equal to the number of individual sunspots observed plus ten times the number of groups of sunspots observed. The reason for doing this is that viewing conditions are not always ideal, and an average group has about ten sunspots. This way, the data is reliable even when small spots are hard to visualize. The monthly sunspot number is the average of all the daily numbers for the month. Here are two graphs of the monthly data.
The data is the same in both graphs; it is just shown with two different vertical scales. It is obvious from both graphs that sunspot activity is cyclical, with the numbers regularly rising and falling. The tick marks on the horizontal axis are at 11-year intervals, the approximate length of the solar cycle. For example, if you compare the tick marks over the period from 1838 to 1893, you see that they fall at about the peak of each of those cycles.
The reason for showing two graphs is to show you that sometimes you can discover something new in your data just by looking at it differently. The top graph is how the data would appear with typical default settings of a graphing program. Typical default settings will produce a graph with an aspect ratio (horizontal length divided by vertical length) of about 1.6, and a y-axis scale chosen so that the data fills the graph region. These settings produce a reasonable-looking graph, and the cyclical nature of sunspot activity is readily apparent.
Compare the two graphs carefully, though, and see if the lower graph shows you anything more. The lower graph is made following an idea from William Cleveland. He used a computer algorithm to select an aspect ratio so that selected line segments in the data would have a slope of ±45° (Cleveland, 1994, cited in Tufte, 1997). In the upper graph, the sunspot cycles all rise and fall steeply. This makes it difficult for your eye to notice any subtle patterns in the cycles. Cleveland's idea is to select an aspect ratio so that the rising and falling slopes are closer to a ±45° angle, on average. Showing the data this way makes it much easier for your eye to see subtle patterns in the cycles. For example, compare the onset of each cycle to the decay. The onset time is the time from the beginning of the cycle to the maximum of the cycle. The decay time is the time from the maximum of the cycle to the end of the cycle. Are they about equal, or is one longer than the other? Which has a steeper slope? This comparison is much easier to make in the lower graph than in the upper one.
For some of the cycles, the onset time and decay time look fairly similar. For others, it appears that the onset time is shorter than the decay time. Is there a way to measure how strong the effect is? By answering these kinds of questions, we can get a better understanding of the solar physics underlying sunspots.
In this project, you will learn how to use basic statistical analysis of the historical data to test the hypothesis that sunspot cycles consistently have a faster rise time and a slower decay time. As an added bonus, you'll learn the basics of working with a spreadsheet program.
Terms and Concepts
To do this project, you should do research that enables you to understand the following statistical terms and concepts:
- sunspots,
- parts of the sun:
- core,
- radiative zone,
- convective zone,
- chromosphere,
- photosphere,
- corona;
- sunspot cycle,
- statistical terms:
- population,
- sample,
- mean,
- standard deviation,
- variance,
- null hypothesis,
- t-test,
- statistical significance.
Bibliography
- Van Helden, A. and E. Burr, 1995. "Galileo's Sunspot Drawings," The Galileo Project, Rice University [accessed February 20, 2006] http://galileo.rice.edu/sci/observations/sunspot_drawings.html.
- Here are several links to the ESA/NASA SOHO site:
- The SOHO home page is:
ESA & NASA, 2006. "SOHO Home Page," European Space Agency and NASA [accessed February 20, 2006] http://sohowww.nascom.nasa.gov/. - For background information on the Solar and Heliospheric Observatory, see:
ESA & NASA, 2006. "SOHO Explore," European Space Agency and NASA [accessed February 20, 2006] http://sohowww.nascom.nasa.gov/explore/. - For a glossary of solar terms, see:
ESA & NASA, 2006. "Key Terminology for Understanding the Sun," European Space Agency and NASA [accessed February 20, 2006] http://sohowww.nascom.nasa.gov/explore/glossary.
- The SOHO home page is:
- Hathaway, D.H., 2005. "The Sunspot Cycle," NASA Science Directorate, Marshall Space Flight Center [accessed February 20, 2006] http://science.nasa.gov/solar/sunspots.htm.
- Historical sunspot data is available from:
Van der Linden, R.A.M. and the SIDC team, 2006, "Online Catalogue of the Sunspot Index," SIDC, Royal Observatory of Belgium [accessed February 20, 2006] http://sidc.oma.be/. - Here is an Excel tutorial to get you started with using a spreadsheet program:
Excel Easy. (n.d.). Excel Easy: #1 Excel tutorial on the net. Retrieved March 10, 2014, from http://www.excel-easy.com - This website has descriptions and calculators for several statistical tests, including the paired t-test that you'll use in this project:
Kirkman, T., date unknown. "Student's t-Tests," Department of Physics, College of St. Benedict & St. John's University [accessed February 23, 2006] http://www.physics.csbsju.edu/stats/t-test.html.
News Feed on This Topic
Materials and Equipment
To do this experiment you will need the following materials and equipment:
- computer with Internet access and a spreadsheet program.
Share your story with Science Buddies!
I Did This Project! Please log in and let us know how things went.Experimental Procedure
- Do your background research so that you are knowledgeable about the terms and concepts.
- If you are not familiar with using a spreadsheet program, be sure to take the time to go through the Excel tutorial listed in the Bibliography.
- Here is a short version of the data analysis steps you'll be following in order to test the hypothesis that the sunspot cycle onset times are faster than the decay times. The links following each step will jump you to the detailed instructions for that step. You can use your browser's "Back" button to return to these brief instructions.
- Import the historical data into Excel. (Downloading and Importing Data)
- Graph the data so that you can identify the approximate beginning and end of each cycle. (Graph the Data)
- From the spreadsheet data, identify the beginning, end, and maximum of each cycle. Make a table of these values. (Extracting the Cycle Data)
- Use the spreadsheet functions to calculate the onset time and decay time for each cycle. Also, calculate the difference between onset time and decay time for each cycle. (Calculating Onset Time and Decay Time)
- Compare the results. Is there a difference between onset time and decay time? Is the difference statistically significant? You will be using a "paired t-test" for this calculation. The t-test tells you how confident you can be that your results are not simply due to random chance. You can use the t-test calculator given in the Bibliography, or you can put the formula in your spreadsheet, if you know it. (Testing for Statistical Significance)
- If spreadsheets are something new for you, then the detailed explanations that follow should help. If you are already comfortable with using spreadsheets, then you should be in good shape on your own.
Downloading and Importing Data
Note: the spreadsheet details provided are based on Microsoft Excel 2003. For other versions of Excel, or other spreadsheets, you can use the instructions here as a guideline and consult the help function of your spreadsheet program for more specifics.
- Download a table of historical sunspot numbers from http://sidc.oma.be/. Click on the link for "Sunspot archive & graphics," then select one of the ASCII data files. To download, you can right-click on the file link, and select "Save As...". Save the file with a ".txt" extension. We recommend that you use either the "monthly and monthly smoothed sunspot number" or the "yearly sunspot number" file.
- The "Readme" file (available on the same page as the historical data) explains the format of the data files.
- Import the sunspot data into your spreadsheet program. Here's how to do it in Excel:
- From the menu, select File/Open.... You'll see a dialog:
- At the bottom of the File Open dialog, under "Files of type:" use the drop-down list to select "Text Files (*.prn, *.txt, *.csv)".
- Navigate to the directory where you saved your sunspot file, select it, and click "Open."
- Excel now takes you through the "Text Import Wizard," a series of three dialogs. The first dialog looks like this (monthly sunspot data file shown):
- Make sure "Fixed width" file type is selected, then click "Next."
- The second Text Import Wizard dialog is used to set the field widths. It looks like this:
- For the yearly sunspot data file, you can accept the defaults (skip down to step 3j).
- For the monthly sunspot data file, there is a fourth column of numbers, which starts at the seventh row. (This is the monthly smoothed sunspot number, which is a running average of the preceeding and following six months.) Use the arrow to scroll down to row seven, as shown:
- To import the fourth data column, click next to it as shown:
- Click "Next." For the yearly sunspot data, click "Finish" and go on to the section "Graph the Data".
- For the monthly sunspot data, scroll down to the last few rows of data in the fourth column, and you will see some asterisks (indicating that these averages were computed with provisional data). Make sure that the asterisks are in the fifth data column, as shown here:
- You can choose to skip import of the asterisk column if you want. Click "Finish" to import the monthly data.
- From the menu, select File/Open.... You'll see a dialog:
- Before you graph your data, insert a new row at the top of the spreadsheet and add labels to identify the data in each column. (If you don't know how to do this, the Excel tutorial can you show you, or you can use the Help function in Excel.)
- Also, it is useful to know the range of values you'll be graphing, so that you can select the appropriate scale for your graph. You can use the "min" and "max" functions to have the spreadsheet find these values for you. (Again, the Excel tutorial shows how to do this.)
- Use the Chart Wizard to create a graph of the data. You'll use the graph to identify the approximate location for the beginning of each cycle.
- Select Insert/Chart... to open Excel's Chart Wizard.
- Select the "Area" Chart type, and the top-left Chart sub-type as shown:
- Click "Next," then select the "Series" tab in the "Source Data" dialog as shown:
- In this example, we're going to be plotting two series of data: the monthly numbers, and then the smoothed monthly numbers. The dialog shows all of the information filled in for the first series.
- We gave it a name: "Monthly".
- We provided the range of values to plot on the y-axis (the monthly sunspot numbers from column "C", rows 3–3087).
- We provided the range of labels for the x-axis (the year, from column "B", rows 3–3087).
- When your first data series is all set up, click the "Add" button to add the next data series, like this:
- When your second data series is all set up, click the "Next" button for step 3, Chart Options. Add a title for the chart, and labels for the x- and y-axes:
- We think the chart looks cleaner without heavy gridlines, so turned them off using the Gridlines tab:
- Click "Next", then select where you want the chart to appear (we've chosen to place it on a new worksheet), then click "Finish."
- The graph created by the Chart Wizard is OK, but there are still some additional tweaks to make it better. For example, we can choose an interval for labeling the years on the x-axis that relates to the 11-year period of the solar cycle. We can change the scale of the y-axis to more closely match the range of the data. We can change the color of the data series, and show the smoothed data as a line over the solid monthly average.
- To make changes to the x-axis, double-click on the x-axis labels. You should see a "Format Axis" dialog:
- Change the "Major tick mark type" to "Outside", as shown above. Then click on the "Scale" tab:
- We'd like our x-axis labels to relate to the 11-year period of the solar cycle. A reasonable choice is to have labels appear every 55 years, with tick marks in between at 11-year intervals. So we've changed the "Number of categories between tick-mark labels" to 660 (= 55 years × 12 months/year). We've changed "Number of categories between tick marks" to 132 (= 11 years × 12 months/year). When you've made these changes, click "OK."
- To make changes to the y-axis, double-click on the y-axis labels. You should see a "Format Axis" dialog again. Set the "Major tick mark type" to "Outside", as we did for the x-axis. Then select the "Scale" tab:
- Here we've set the minimum and maximum values are set closer to the actual range of the data (0–253.8, in our case). When you've made these changes, click "OK."
- Finally, to change options for the appearance of the data (like color), double click on a data point. (If you "hover" the mouse over the graph, you should see a tooltip that tells you which data series you're over.) Here we double-clicked on one of the "Smoothed" data points:
- We want the "Smoothed" data set to appear as a line, on top of the "Monthly" data set. So we selected "None" for the Area, and chose a "Custom" border, which we set to a brown color and a medium-thick line. For the "Monthly" data, we chose "None" for the border, and a black-filled Area. Here is how the finished graph looks:
- The colors are chosen to go well with one another and with the background. Jarring contrasts are avoided. We also don't bother with effects like flashy gradient fills that have no relation to the data. Avoid meaningless clutter in your graphs. You should strive to present the data clearly, so it can speak for itself.
- Now that you have a graph of the data, you can use it to help identify the beginning and end of each solar cycle. In order to measure onset time and decay time accurately, we need complete solar cycles. Using the monthly data, there were 22 complete cycles between 1775 and 1996.
- You should construct a second table in your spreadsheet that shows the beginning, peak and end of each cycle. We can build this table by finding the row numbers for the beginning of each solar cycle in the data table.
- Use the graph to find the approximate dates for the beginning of each cycle (the lowest point in the "trough" between cycles). When the mouse "hovers" over a data point, a tooltip pops up with information about that point, as shown:
- Then, switch to the data table and scroll down to that approximate date (in this case, 1755.877). Scan the sunspot numbers around that date and find the smallest one. That's the data row that you want. In the column at the right, write in the cycle number (starting with zero, as shown).
- Then copy the row number to your new data table, as shown:
- Next, you'll use Excel's "value" function to copy the data values for the beginning of the cycle.
- Enter the formula:
- Hit "Enter" and Excel copies the value for you:
- Then copy your formula:
- And paste it in to the next four columns:
- Excel fills in the remaining values for you:
- Enter the formula:
- Repeat steps 3–6 for each of the 22 cycles, leaving a blank row in your new data table (for filling in the maximum point of each cycle later). Remember to increase the cycle number by 1 each time.
- Now you're ready to find the maximum for each cycle. Again, you can have Excel do some of the work for you. Using the row number information from your table, you'll use the "max" formula to have Excel find the maximum sunspot number within each cycle.
- Column "C" is the monthly sunspot number, and the first cycle begins at row 79 and ends at row 212. Here's the formula for the maximum monthly sunspot number for the first cycle:
- Click "Enter" and Excel finds the value for you:
- Column "C" is the monthly sunspot number, and the first cycle begins at row 79 and ends at row 212. Here's the formula for the maximum monthly sunspot number for the first cycle:
- Now you need to find the date that corresponds to the maximum sunspot number for the cycle. You can use Excel's "Find" feature for this.
- First click on column "C" (the monthly sunspot number) to select only this column for searching. Then select Edit/Find... (or use Ctrl+F) and fill in the maximum value that Excel just found for you (in this case, "107.2"):
- Click on "Find Next." Excel finds the first occurrence of this value in column "C." You can copy the date information:
- and then paste it in your data table:
- Note that the value you are searching for may occur more than once in the data table. If the date is too early, just click on "Find Next" again until you find the date for the right solar cycle.
- First click on column "C" (the monthly sunspot number) to select only this column for searching. Then select Edit/Find... (or use Ctrl+F) and fill in the maximum value that Excel just found for you (in this case, "107.2"):
- Repeat steps 8 and 9 for each of the 22 solar cycles.
Calculating Onset Time and Decay Time
- Now you have all the information you need to calculate the onset time and decay time for each solar cycle. With what you've learned about spreadsheets so far, you should be able to do this on your own.
- In addition, you should have Excel calculate the difference between the onset time and decay time for each solar cycle.
- Finally, for all three values (onset time, decay time and difference), calculate the average and standard deviation for all 22 solar cycles.
- Is there a difference?
Testing for Statistical Significance
- How can you tell if a trend in your data is real? Is there some way for you to know how reliable the data is? Yes: the answer is to apply the proper tests of statistical significance.
- Here is a quick summary of how such a test works:
- Formulate a "null hypothesis" (essentially the converse of what you want to prove). In this case, the null hypothesis would be that there is no difference between onset time and decay time.
- Select the appropriate statistical test for the data. In this case, the onset time and decay time are paired values, since each one is describing a different feature of the same solar cycle. The test to use in this case is "Student's paired t-test.
- The result of the t-test will tell us, under certain assumptions, the probability that the null hypothesis is true. By convention, the probability is referred to as a "p value" and is given as an inequality. For example, if p < 0.01, this means that the probability of the null hypothesis is true is less than 0.01, or less than 1 in 100. Another way to put it is that we are 99% certain that our desired hypothesis (the converse of the null hypothesis) is, in fact, true.
- You can use this link: http://www.physics.csbsju.edu/stats/t-test.html to run a paired t-test on your data. Select the "paired t-test" choice. First you'll have to enter the number of data pairs (22). Next you'll get a page with a form where you can enter your pairs of onset and decay times for each solar cycle. When you've entered all the data, click on the "Calculate Now" button at the top of the page to get the results (which you can copy and paste for your lab notebook).
- Include the p value with your results on your display board. You should be prepared to explain what it means.
Share your story with Science Buddies!
I Did This Project! Please log in and let us know how things went.Variations
- In addition to testing the onset and decay times, test the rate of rise and decay (includes both time and magnitude). Is there a correlation between the magnitude of sunspot activity and the a difference in onset vs. decay rate?
Share your story with Science Buddies!
I Did This Project! Please log in and let us know how things went.Ask an Expert
The Ask an Expert Forum is intended to be a place where students can go to find answers to science questions that they have been unable to find using other resources. If you have specific questions about your science fair project or science fair, our team of volunteer scientists can help. Our Experts won't do the work for you, but they will make suggestions, offer guidance, and help you troubleshoot.Ask an Expert
Related Links
If you like this project, you might enjoy exploring these related careers:
Statistician
Statisticians use the power of math and probability theory to answer questions that affect the lives of millions of people. They tell educators which teaching method works best, tell policy-makers what levels of pesticides are acceptable in fresh fruit, tell doctors which treatment works best, and tell builders which type of paint is the most durable. They are employed in virtually every type of industry imaginable, from engineering, manufacturing, and medicine to animal science, food production, transportation, and education. Everybody needs a statistician! Read moreAstronomer
Astronomers think big! They want to understand the entire universe—the nature of the Sun, Moon, planets, stars, galaxies, and everything in between. An astronomer's work can be pure science—gathering and analyzing data from instruments and creating theories about the nature of cosmic objects—or the work can be applied to practical problems in space flight and navigation, or satellite communications. Read moreNews Feed on This Topic
Looking for more science fun?
Try one of our science activities for quick, anytime science explorations. The perfect thing to liven up a rainy day, school vacation, or moment of boredom.
Find an Activity