Can Machine Learning Forecast Future Sea Levels?
Abstract
Global sea levels are rising, but modeling and predicting them is challenging, especially as extreme weather events become more frequent. Can we use machine learning to track and predict sea level changes? In this science project, you will use previously collected data and machine learning to see if you can predict future relative sea level changes.
Summary
- Prerequisites: None
- Material Availability: Readily available
- Safety: No issues
Objective
To use a machine learning algorithm to predict future sea level changes.
Introduction
The world is constantly changing! Large global changes affect our environment, even though we may not always see or notice them directly. The large tectonic plates of the world are constantly shifting, and when these plates slide past one another on land, we feel the movement as earthquakes. Abnormal changes to our oceans are also taking place. On land, extremely large glaciers and ice sheets are melting, and the water runs off into the ocean. Additionally, the water in the ocean is thermally expanding due to global warming. Together, these processes raise ocean levels, which significantly impacts coastlines: coastal environments can change drastically through increased flooding, erosion, and destruction of wetland habitats.
A satellite image contrasts the effects of extreme temperatures on Antarctica. The two panels show the same coastal area one month apart. In the top panel (January 13, 2020), the land is largely covered by a thick layer of bright white snow and ice. In the bottom panel (February 12, 2020), the landscape is significantly different. The white ice has receded, exposing large areas of dark land. A clear increase in meltwater is also visible, appearing as vibrant blue patches on the remaining ice. The image serves as a powerful visual record of the rapid melting caused by a record heatwave.
Scientists are observing that most local relative sea levels (RSL) – the sea level measured relative to the land surface – are rising, and at the same time, there is a global increase in overall sea levels. How do we know that sea levels are rising? Scientists use instruments like satellite altimeters, which send radar signals to measure ocean height, and they also collect ocean temperature readings. These physical measurements can then be combined with physics-based models. If you've taken a physics class, you've probably seen how equations can be used to predict motion–like figuring out the path of a thrown ball or the speed of an accelerating car. In a similar way, scientists can write equations that describe how oceans and the atmosphere behave, and then use those equations to try to predict complex things like weather or sea level changes. However, predicting something as complicated as the entire ocean system is much harder than predicting where a ball will land!
That’s where machine learning can offer a different approach. Instead of relying on equations that describe the physics of the ocean, machine learning algorithms look at large amounts of past data (such as recorded sea levels and temperatures) and try to find patterns. Then they use those patterns to make predictions about the future. In other words, physics-based models explain why the system behaves the way it does, while machine learning models focus on making predictions directly from the data–even if they don’t explain the underlying physics of how the ocean is changing.
In this science project, you will use a machine learning algorithm, the Prophet model, to predict future local relative sea level (RSL) changes at a specific U.S. tide gauge station. You will test whether the model can identify complex patterns in the historical data that are difficult to capture with physics-based equations, and explore how well it can make predictions of local sea level changes.
To learn more about the Prophet model, watch the video by Aric LaBarr listed in the Bibliography.
Terms and Concepts
- Tectonic plates
- Earthquake
- Glacier
- Ice sheet
- Thermal expansion
- Global warming
- Wetland habitats
- Relative sea levels (RSL)
- Physics-based models
- Machine learning
- Prophet model
- Mean Absolute Error (MAE)
- Symmetric Mean Absolute Percentage Error (SMAPE)
Questions
- What are the consequences of glacier ice melting on the environment?
- What is the difference between relative and global sea level changes?
- What are the dangers of the global rise in sea levels to ecosystems? Which habitats will be affected the most initially?
- What instruments do scientists use to measure sea levels?
- What data does the machine learning algorithm use to predict future sea levels?
Bibliography
To learn more about sea level trends:
- Lindsey, Rebecca. (2023, August 22). Climate Change: Global Sea Level. Retrieved October 1, 2025.
- National Oceanic and Atmospheric Administration. (n.d.) Relative Sea Level Trends. Retrieved October 1, 2025.
- National Oceanic and Atmospheric Administration. (n.d.) U.S. Linear Relative Sea Level (RSL) trends and 95% Confidence Intervals (CI). Retrieved October 1, 2025.
To learn more about the Prophet model:
- LaBarr, Aric. (2022, February 16). What is the Prophet Model. YouTube. Retrieved October 1, 2025.
- Prophet. (n.d.). Prophet Quick Start. Facebook. Retrieved October 1, 2025.
Materials and Equipment
- Computer with Internet access
Experimental Procedure

Setting Up the Google Colab Environment
- You will need a Google account. If you do not have one, make one when prompted.
- Download the sea_level_prediction.ipynb file from Science Buddies. This is the code you will need to process your data.
- Within your Google Drive, click on ‘My Drive,’ then create a new folder and rename it sea_level_prediction. Inside the folder, upload the sea_level_prediction.ipynb file.
- Double-click on the sea_level_prediction.ipynb file. This should automatically open in Google Colab.
- Read the Troubleshooting Tips and How to Use This Notebook sections. Follow the instructions you find in those sections.
- Run the block under Importing Libraries to ensure you have access to all the functions we will use for this project.
Collecting the Data
- Navigate to the NOAA Sea Level Trends page.
- Browse the table and choose a location you would like to use for your sea level prediction.
- Some locations have longer data records than others – for best results, pick one with at least 100 years of data.
- Click the Station ID number (in the left column) for the location you selected.
- On the station’s page, click “Export to CSV” under the graph to download the data file.
- Open the .csv file in Microsoft Excel or Google Sheets (the steps are the same for both).
- Select the first five cells by clicking on the first cell, holding shift on your keyboard, then clicking on the fifth cell (just before the column names). Right-click and choose Delete.
- Review the column headers and make sure there are no extra spaces before or after each column name. If you find any, remove them.
- Save the updated file.
- Rename the downloaded file to sea_level_predicton.csv.
- Upload this file to your sea_level_prediction folder in Google Drive.
Loading the Data into a Pandas DataFrame
- (Code Block 1A) Run this code block to create a DataFrame, like a table, that will be used to load and manipulate the data in the notebook. The data from your .csv file will populate a table below the code block.
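As a rough illustration of what loading the file into a DataFrame involves, the sketch below reads a small made-up sample shaped like a cleaned NOAA export. The sample rows, the reduced set of columns, and the use of skipinitialspace and str.strip() are assumptions for illustration only; the notebook's own Code Block 1A may do this differently.

```python
import io

import pandas as pd

# Made-up sample rows imitating a cleaned NOAA export (not real measurements).
sample_csv = """Year, Month, Monthly_MSL
1990, 1, -0.052
1990, 2, -0.031
1990, 3, -0.044
"""

# In the notebook you would pass the path of your own .csv file instead of StringIO.
# skipinitialspace=True drops the space that follows each comma.
df = pd.read_csv(io.StringIO(sample_csv), skipinitialspace=True)

# Remove any stray spaces left around the column names.
df.columns = df.columns.str.strip()

print(df.head())
```

Stripping the column names in code is an alternative to removing the extra spaces by hand in a spreadsheet.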
Preprocessing
- (Code Block 2A) This code block creates a new column ds by combining the Month and Year columns, converting them into proper datetime objects. The Prophet model requires the input data to follow a specific format: the ds column must contain datestamps (time information). The ds column tells Prophet when each observation occurred, which is essential for building and projecting the time series model. Run this code block.
- Note: Since Month and Year are stored as integers, they must be converted to strings before forming a valid date string.
- (Code Block 2B) This code block sets the ds column as the DataFrame index for easier date-based operations and removes the now redundant Month and Year columns. Run this code block.
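The two preprocessing steps above can be sketched as follows. This is a minimal example on made-up values, not the notebook's actual code; in particular, setting every date to the first day of the month and zero-padding the month are assumptions made here so the date strings parse cleanly.

```python
import pandas as pd

# Made-up sample values; Month and Year are integers, as in the NOAA file.
df = pd.DataFrame({
    "Year": [1990, 1990, 1990],
    "Month": [1, 2, 3],
    "Monthly_MSL": [-0.052, -0.031, -0.044],
})

# Convert the integers to strings and build date strings such as "1990-01-01".
date_strings = (
    df["Year"].astype(str) + "-" + df["Month"].astype(str).str.zfill(2) + "-01"
)

# Parse the strings into datetime objects and store them in the ds column.
df["ds"] = pd.to_datetime(date_strings, format="%Y-%m-%d")

# Use ds as the index and drop the now redundant Month and Year columns.
df = df.set_index("ds").drop(columns=["Month", "Year"])

print(df)
```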
Visualizing the Data
Plotting the data before modeling with the Prophet model is important because it lets you visually inspect trends, seasonality, and anomalies in the time series. For example, you can see if the sea level generally rises over time, detect any recurring seasonal patterns, or spot outliers that might need cleaning. Understanding these characteristics helps you choose the right model settings in the Prophet model and anticipate how well the model may perform.
- (Code Block 3A) This code block creates a scatter plot of the Monthly_MSL column to visualize monthly sea level measurements over time. The plot allows you to quickly observe trends, seasonal patterns, and potential outliers, which is useful for understanding the data before building a time series model.
- Has the rate of sea level rise changed over the years? Does it seem to accelerate, slow down, or stay constant?
- Do you see any points that stand out as unusually high or low compared to the rest of the data? What might cause these anomalies?
- Based on what you see, what do you think might happen to sea levels in the next few decades?
- (Code Block 3B) This code block defines a function plot_msl_by_year_range that plots monthly sea level (Monthly_MSL) data for a chosen year range and overlays a 12-month rolling average to highlight long-term trends. Run this code block.
- A rolling average (or moving average) smooths the data by calculating the mean over a fixed window (in this case, 12 months), so short-term fluctuations are reduced and underlying patterns become clearer.
- (Code Block 3C) Under the #TODO comment, you can choose the time range you want to analyze by setting values for start_year and end_year. This code block will generate a plot showing monthly sea level values within the chosen years, using the function defined in Code Block 3B.
- Example: start_year = 1900 and end_year = 1910 will display data from 1900 through 1910.
- (Code Block 3D) This code block defines a function plot_msl_by_year that creates a line graph of monthly sea level (Monthly_MSL) data for a single year. This lets you zoom in on the seasonal variation of sea level within one specific year. Run this code block.
- (Code Block 3E) Under the #TODO comment, you can choose a single year you want to analyze by setting the value of year.
- Example: year = 2010 will show sea level data for the year 2010 only.
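To make the rolling average and the year-range selection more concrete, here is a small sketch that works on a synthetic series (a slow rise plus a yearly cycle) rather than real tide gauge data. The helper name select_year_range is hypothetical and is not the notebook's plot_msl_by_year_range; plotting is left out so the example only needs pandas and NumPy.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: a slow rise plus a yearly cycle (not real data).
dates = pd.date_range("1900-01-01", "1910-12-01", freq="MS")
months = np.arange(len(dates))
msl = pd.Series(
    0.001 * months + 0.05 * np.sin(2 * np.pi * months / 12),
    index=dates,
    name="Monthly_MSL",
)


def select_year_range(series, start_year, end_year):
    """Return the values from start_year through end_year, inclusive."""
    return series.loc[str(start_year):str(end_year)]


subset = select_year_range(msl, 1900, 1905)

# 12-month rolling mean: each value averages that month and the 11 before it,
# so the first 11 entries are NaN because the window is not yet full.
rolling = subset.rolling(window=12).mean()

print(len(subset))
print(rolling.dropna().head())
```

Because the window spans exactly one year, the yearly cycle in this synthetic series averages out and only the slow rise remains, which is the effect the rolling average is meant to show.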
Splitting the Data into Train and Test
- (Code Block 4A) Separating the Dataset into Inputs and Target: This code block separates the dataset into two parts:
- X contains all the feature columns except Monthly_MSL.
- y contains the Monthly_MSL column, which we will predict based on the other features.
- (Code Block 4B) Splitting the Training and Test Data: This code block splits the sea level dataset into training and testing sets based on a chosen date range. You can update the dates under the #TODO comment to control how much data is used for training and how far ahead you want to test.
- start_date: The first date to include in the training set.
- split_date: The cutoff date that separates training and testing. All data from start_date up to and including this date will be in the training set.
- end_date: The last date to include in the testing set. All data after the split_date up to and including this date will be in the testing set.
- By default, the training set will include all data from 1975-01-01 through 2000-01-01, and the testing set will include all data after 2000-01-01 through 2025-01-01.
- Important: When changing the dates, always use the format ‘YYYY-MM-DD’ (for example, ‘2015-07-01’).
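The date-based split described above can be sketched with boolean masks on a date index. The series below is synthetic, and the mask approach is one possible way to implement the rule (training up to and including split_date, testing strictly after it); the notebook may implement it differently.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data from 1975 through 2025 (not real measurements).
dates = pd.date_range("1975-01-01", "2025-01-01", freq="MS")
y = pd.Series(np.linspace(-0.10, 0.10, len(dates)), index=dates, name="Monthly_MSL")

start_date = pd.Timestamp("1975-01-01")
split_date = pd.Timestamp("2000-01-01")
end_date = pd.Timestamp("2025-01-01")

# Training set: start_date up to and including split_date.
train_mask = (y.index >= start_date) & (y.index <= split_date)
# Testing set: strictly after split_date, up to and including end_date.
test_mask = (y.index > split_date) & (y.index <= end_date)

y_train = y[train_mask]
y_test = y[test_mask]

print(y_train.index.max(), y_test.index.min())
print(len(y_train), len(y_test))
```

Keeping the test set strictly after the cutoff means no month appears in both sets, so the model is never scored on a value it was trained on.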
Training the Model
- (Code Block 5A) We have provided the code to train the Prophet model. Run this code block.
- Note: The %%time command prints the time it takes to run the entire cell. This is useful for estimating the time it takes to train a model.
- (Code Block 5B) This code block uses the trained Prophet model to forecast sea levels for the test dates, producing predicted values (yhat) with lower and upper bounds (yhat_lower and yhat_upper). The .head() function shows the first few rows for a quick inspection.
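For readers curious about what training and forecasting with Prophet look like in code, here is a minimal sketch. It assumes the prophet Python package is installed and uses default model settings; the helper names to_prophet_frame and fit_and_forecast are hypothetical and are not taken from the notebook. Prophet expects a table with a ds column of dates and a y column of values, which is what the first helper builds.

```python
import pandas as pd


def to_prophet_frame(series):
    """Convert a date-indexed Series into the two-column ds/y table Prophet expects."""
    return pd.DataFrame({"ds": series.index, "y": series.to_numpy()})


def fit_and_forecast(y_train, test_dates):
    """Fit Prophet with default settings and forecast the given dates."""
    # Imported here so the helper above can be used without Prophet installed.
    from prophet import Prophet

    model = Prophet()
    model.fit(to_prophet_frame(y_train))

    # Prophet predicts for whatever dates appear in the ds column.
    future = pd.DataFrame({"ds": test_dates})
    forecast = model.predict(future)
    return forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]


# Tiny made-up series, only to show the table format.
dates = pd.date_range("1999-01-01", periods=3, freq="MS")
example = pd.Series([0.01, 0.02, 0.03], index=dates, name="Monthly_MSL")
print(to_prophet_frame(example))
```

In Colab, a call such as fit_and_forecast(y_train, y_test.index).head() would display the first few forecast rows, assuming y_train and y_test are date-indexed Series.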
Evaluating the Model
- (Code Block 6A) This code block displays the Prophet model forecast plot, which shows the predicted monthly sea levels over time. The main blue line represents the model’s predictions, while the shaded area around it shows the uncertainty interval, meaning the range where the true sea level values are likely to fall. This helps you see both the expected trend and how confident the model is in its predictions.
Figure 2. Example of the Prophet forecast. The black dots represent historical data points, showing the monthly sea levels over time from 1975 through 2025 (x-axis). The blue shaded region represents the forecast for the future, with the darker blue line indicating the predicted trend and the lighter blue area showing the uncertainty range of the prediction.
Image Credit: Science Buddies
- How does the forecasted trend compare to the historical data? Does it follow the same pattern?
- What factors could cause the actual sea levels to differ from the predictions shown in the graph?
- (Code Block 6B) This code block plots the actual sea levels for the test data against Prophet model forecasts, including a 12-month rolling average and a shaded confidence interval.
- How closely do the forecasted values match the actual sea levels? Are there periods where the model over- or under-predicts?
- Based on the forecast, what might you expect for future sea level trends?
- (Code Block 6C) This code block calculates the Mean Absolute Error (MAE) to assess the accuracy of the model’s predictions.
- The MAE shows how close predictions are to actual values by averaging the size of errors. A lower MAE means better prediction accuracy.
- For example, an MAE of 0.05 indicates that, on average, the model’s predictions differ from the actual values by 0.05 units. That means that if the model predicts the mean sea level to be 0.10, the actual sea level is typically about 0.05 units away (for example, near 0.05 or 0.15), although individual errors can be smaller or larger than the average.
- (Code Block 6D) This code block calculates the Symmetric Mean Absolute Percentage Error (SMAPE) to evaluate the accuracy of the model’s predictions.
- The SMAPE measures the average percentage difference between predicted and actual values in a symmetric way, accounting for both over- and under-predictions. A lower SMAPE indicates that the model’s predictions are closer to the true values relative to their magnitude.
- For instance, an SMAPE of 50% means that, on average, the model’s predictions differ from the actual values by 50% of their combined average size.
- (Code Block 6E) This code block generates a long-term sea level forecast up to 2100 using the trained Prophet model. It plots the historical sea levels as well as the predicted values from the specified split_date defined in Code Block 4B onward.
- On the page for your selected location, reached from the NOAA Sea Level Trends page you visited earlier, click on the “Regional Scenarios” option. A graph will appear showing different scientists' projections for future sea level rise. Compare your model's predictions to these projections: How closely do they align, and where do they differ?
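The two error metrics used in Code Blocks 6C and 6D can be written out in a few lines of NumPy. This is a sketch with made-up numbers; the SMAPE variant shown (absolute error divided by the average of the absolute actual and predicted values, expressed as a percentage) is one common definition and is an assumption here, since the notebook's exact formula is not shown.

```python
import numpy as np


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, in the same units as the data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(predicted - actual))


def smape(actual, predicted):
    """SMAPE in percent: |error| relative to the average of |actual| and |predicted|."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    denominator = (np.abs(actual) + np.abs(predicted)) / 2
    return 100 * np.mean(np.abs(predicted - actual) / denominator)


# Made-up values for illustration only.
actual = [0.10, 0.20, 0.30]
predicted = [0.15, 0.15, 0.30]

print("MAE:", mean_absolute_error(actual, predicted))
print("SMAPE (%):", smape(actual, predicted))
```

Note that this SMAPE sketch divides by zero when an actual value and its prediction are both exactly zero, and it can become very large when values are close to zero, which is worth keeping in mind for sea level data that fluctuates around zero.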
Experimenting with the Prophet Model
- Experiment with Time Interval Sizes: Change the dates in Code Block 4B.
- 25-year rolling windows: For example, train from 1975-2000 and test on 2000-2025. Then move backward: train on 1950-1975, test on 1975-2000, and so on.
- Varying training lengths: Compare performance when training with longer histories vs. shorter ones. For example:
- Train from 1900-2000, then test on 2000-2025 (long history).
- Train on just 10 years of data (e.g., 1990-2000), then predict into the present (2000-2025).
- By experimenting this way, you can see whether the model is more accurate with more historical data, or if shorter windows can capture recent trends better.
- Questions for Analysis:
- How does the model’s accuracy change when you use more historical data compared to less?
- Does the model tend to underpredict, overpredict, or stay flat when forecasting far into the future?
- Do shorter training windows capture sudden changes better, or do they miss longer-term trends?
- If you were tasked with making real-world sea level predictions, which training strategy would you trust more–long history or recent trends? Why?
- What might be some limitations of the Prophet model when predicting far into the future?
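To keep track of the 25-year rolling windows described above, it can help to generate the date triples in one place and then copy each set into Code Block 4B. The helper below is a hypothetical convenience function, not part of the notebook; it assumes every window starts on January 1 and that training and testing spans have the same length.

```python
def rolling_windows(latest_end_year, earliest_start_year, span_years=25):
    """Return (start_date, split_date, end_date) strings, newest window first."""
    windows = []
    end_year = latest_end_year
    # Keep stepping backward while a full training + testing window still fits.
    while end_year - 2 * span_years >= earliest_start_year:
        split_year = end_year - span_years
        start_year = split_year - span_years
        windows.append(
            (f"{start_year}-01-01", f"{split_year}-01-01", f"{end_year}-01-01")
        )
        end_year -= span_years
    return windows


windows = rolling_windows(2025, 1900)
for start_date, split_date, end_date in windows:
    print(start_date, split_date, end_date)
```

Recording the MAE and SMAPE for each window in a table makes it easier to answer the analysis questions about long versus short training histories.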
Variations
- Experiment with changing parameters in the model to see if it can follow the trend better or more closely match the other scientists' projections. You can find more information about the different parameters of the Prophet model in the Prophet documentation (see the Prophet Quick Start in the Bibliography).
- Instead of the Prophet model, try models like ARIMA, LSTM (a type of neural network), or Random Forest regression. How do their predictions compare to the Prophet model and to NOAA’s scientific projections?
- Use data from multiple NOAA stations (e.g., east coast vs. west coast). Do some areas show faster sea level rise than others? Can the model capture regional differences?
- Check if the model can capture recurring seasonal variations, such as higher sea levels during certain months, changes with the tides, or astronomical effects (for example, the location of the moon or Jupiter). Compare predictions with and without seasonality components.
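As a starting point for the parameter and seasonality variations, the sketch below builds several Prophet models that differ in one setting each. It assumes the prophet package is installed; yearly_seasonality and changepoint_prior_scale are constructor arguments of Prophet, but the specific values and the names SETTINGS and build_models are illustrative assumptions, not recommendations from the project.

```python
# Each entry changes one Prophet constructor argument; values are illustrative.
SETTINGS = {
    "with_seasonality": {"yearly_seasonality": True},
    "without_seasonality": {"yearly_seasonality": False},
    # A larger changepoint_prior_scale lets the trend bend more easily.
    "flexible_trend": {"changepoint_prior_scale": 0.5},
}


def build_models(settings=SETTINGS):
    """Create one untrained Prophet model per named group of settings."""
    # Imported here so the settings can be inspected without Prophet installed.
    from prophet import Prophet

    return {name: Prophet(**kwargs) for name, kwargs in settings.items()}


for name, kwargs in SETTINGS.items():
    print(name, kwargs)
```

Each model returned by build_models() would then be trained and evaluated on the same training and testing split, so that any difference in MAE or SMAPE can be attributed to the changed setting.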