Table of contents
Introduction to Pastas
Pastas is an open-source Python framework designed for processing, simulation and analysis of hydrogeological time series models. It has built-in tools for statistically analyzing, visualizing and optimizing such models. It was introduced by Raoul A. Collenteur, Mark Bakker, Ruben Calje, Stijn A. Klop and Frans Schaars in an article published by the National Groundwater Association (NGWA) in July 2019.
Before going into Pastas’ details, let us first understand the core term ‘hydrogeology’ – an area in which the Pastas library marks its significant contribution.
What is hydrogeology?
The word ‘hydrogeology’ is a combination of two terms: ‘hydro’ meaning water and ‘geology’ which means studying the Earth. Hydrogeology is thus a branch of geology dealing with movement and distribution of groundwater in soil and rocks of Earth’s crust (also known as ‘aquifers’). You can read in-depth about this domain here.
Why Pastas?
The two major objectives of this state-of-the-art library are:
- To provide a scientific software package for the development and testing of new hydrogeological methods using a few lines of Python code
- To provide an efficient and easy-to-handle library for groundwater practitioners.
Pastas employ Transfer Function Noise (TFN) time series models using predefined response functions that define how groundwater level responds to evaporation, rainfall etc..
Installation of Pastas
Pastas can be installed using pip or conda command (using Anaconda Distribution of Python is recommended for the installation step)
pip install pastas
-or- conda install pastas
If the library has already been installed, it can be updated to use its latest version as:
pip install pastas --upgrade
Prerequisite: Python 3.6 or higher version
Dependencies:
- jupyter >= 1.0.0
- numpy >=1.16.5
- pandas >= 1.0
- matplotlib >= 1.0
- scipy >= 1.1 (
- lmfit >= 1.0.0 (for non-linear optimizations and curve fitting problems in Python)
- numba (a JIT (Just-In-Time) compiler that converts Python code into optimized machine code)
These dependencies get automatically installed when you instal pastas using pip install manager.
Practical implementation
Here, we have shown two use cases of Pastas library. The demonstrations have been coded in Google colab with Python version 3.7.10.
Step-wise explanation of both the examples is as follows:
Time series analysis using Pastas
- Install pastas library
!pip install pastas
- Import required libraries
import pandas as pd #for data manipulation import pastas as ps import matplotlib.pyplot as plt #for visualization
- Check the versions of the dependencies using show_versions().
ps.show_versions()
Sample output:
Python version: 3.7.10 (default, Feb 20 2021, 21:17:23) [GCC 7.5.0] Numpy version: 1.19.5 Scipy version: 1.4.1 Pandas version: 1.1.5 Pastas version: 0.16.0 Matplotlib version: 3.2.2
- Load head observations
The term ‘head’ here refers to the hydraulic head (also known as ‘static level’) of groundwater flow i.e. the elevation to which water will naturally rise. The file containing the head observations can be downloaded from here.
head_data = pd.read_csv('https://raw.githubusercontent.com/pastas/ pastas/master/examples/data/head_nb1.csv', parse_dates=['date'], index_col='date', squeeze=True)
‘parse_dates’ enables parsing of the ‘date’ column into datetime. ‘index_col’ sets the ‘date’ column as index and ‘squeeze’ parameter squeezes the head observations into a pandas Series object.
5. Check the initial records and datatype of head_data.
print(‘Data type of head_data:’,type(head_data))
Output: Data type of head_data: <class 'pandas.core.series.Series'>
head(head_data)
Output:
date 1985-11-14 27.61 1985-11-28 27.73 1985-12-14 27.91 1985-12-28 28.13 1986-01-13 28.32 Name: head, dtype: float64
6. Plot the head observations
head_data.plot(style='.', figsize=(12, 4)) #Size of plot plt.ylabel('Head (metres)'); #Y-axis label plt.xlabel('Year'); #X-axis label
Output:
7. Load the stresses
The causes of head variations are believed to be rain and evaporation of water. We load the .csv files having data of measured rainfall and measured potential evaporation.
#Load rain data rain = pd.read_csv('https://raw.githubusercontent.com/pastas/ pastas/master/examples/data/rain_nb1.csv', parse_dates=['date'], index_col='date', squeeze=True) #Initial records rain.head()
Output:
date 1980-01-01 0.0033 1980-01-02 0.0025 1980-01-03 0.0003 1980-01-04 0.0075 1980-01-05 0.0080 Name: rain, dtype: float64
#Load evaporation data evap = pd.read_csv('https://raw.githubusercontent.com/pastas/ pastas/master/examples/data/evap_nb1.csv', parse_dates=['date'], index_col='date', squeeze=True) #Initial records evap.head()
Output:
date 1980-01-01 0.0002 1980-01-02 0.0003 1980-01-03 0.0002 1980-01-04 0.0001 1980-01-05 0.0001 Name: evap, dtype: float64
8. Plot the rain and evaporation time series
plt.figure(figsize=(12, 4)) #Size of plot rain.plot(label='rain') #Title for rain plot evap.plot(label='evap') #Title for evaporation plot plt.xlabel(‘Year') #X-axis label plt.ylabel('Rainfall/Evaporation (m/d)') #Y-axis label plt.legend(loc='best'); #Location of legend
Output:
9. Calculate and plot recharge
The term ‘groundwater recharge’ (also known as ‘deep percolation’ or ‘deep drainage’) refers to the hydraulic process in which water moves downwards from surface water to groundwater.
We approximate recharge as the difference between rain and evaporation level.
recharge = rain - evap #Plot the recharge time series plt.figure(figsize=(12, 4)) #Size of plot recharge.plot() plt.xlabel('Year') #X-axis label plt.ylabel('Recharge (m/d)') #Y-axis label
10. Time series model creation
Initiate a pastas time series model using Model class
model = ps.Model(head_data)
Create a stress model which translates input time series (here #‘recharge’) into a contribution explaining part of the output time series
stressModel = ps.StressModel(recharge, ps.Gamma, name='groundwater recharge', settings='prec')
For each stress, a StressModel object is created. The first parameter is the name of the time series of the stress. The second argument is the response function used, ‘name’ specifies just a name given to the series and ‘prec’(stands for ‘precipitation’) is the kind of series.
Add the stress model to the time series model
model.add_stressmodel(stressModel)
Solve model for specified time period (tmin to tmax).Different available solvers can be found here.
model.solve(tmin='1990', tmax='2015')
Sample output:
11. Plot the model
- Code source
- Google colab notebook of above implementation can be found here.
Analyze effect of pumping well using Pastas
Here, a TFN model is created first with the net groundwater recharge as the single stress explaining the observed heads.Then, the model is extended to include the effect of adding a pumping well on the heads by adding another stress model. A comparison is made between the simulated heads in presence and absence of the pumping well.The addition of the pumping well is clearly found to have improved the simulation of the heads.
(1)-(3) Perform steps (1) to (3) as explained in the above time series analysis demo first. Then, proceed as follows:
4. Load the datasets containing information about head observations, rainfall, evaporation and pumping extraction rate available here.
#Head observations head = pd.read_csv("https://raw.githubusercontent.com/pastas/ pastas/master/examples/notebooks/data_notebook_5/head_wellex.csv", index_col="Date", parse_dates=True, squeeze=True) #Rainfall data rain = pd.read_csv("https://raw.githubusercontent.com/pastas/ pastas/master/examples/notebooks/data_notebook_5/prec_wellex.csv", index_col="Date", parse_dates=True) #Evaporation data evap = pd.read_csv("https://raw.githubusercontent.com/pastas/ pastas/master/examples/notebooks/data_notebook_5/evap_wellex.csv", index_col="Date", parse_dates=True) #Pumping well extraction rate well = pd.read_csv("https://raw.githubusercontent.com/pastas/ pastas/master/examples/notebooks/data_notebook_5/well_wellex.csv", index_col="Date", parse_dates=True)
- Plot the four time series defined in the previous step
fig, ax = plt.subplots(4, 1, sharex=True, figsize=(10,5)); ax[0].plot(head, label=head.name, linestyle=" ", marker=".", markersize=2) ax[0].legend() ax[1].plot(rain, label="rain") ax[1].legend() ax[2].plot(evap, label="evap") ax[2].legend() ax[3].plot(well, label="well") ax[3].legend()
Output:
- Pastas time series model creation
model = ps.Model(head, name="groundwater")
Add the stress model for the net groundwater recharge simulated using RechargeModel class
rechargeModel = ps.RechargeModel(rain, evap, name="groundwater recharge",rfunc=ps.Exponential, recharge=ps.rch.Linear())
‘rfunc’ is the response function used. ps.rch.Linear() is the linear model used for computing precipitation excess.
model.add_stressmodel(rechargeModel) model.solve(noise=True) #Solve the model model.plot(figsize=(10,4)) #Plot the model
Output plot:
- Add the stress model for pumping well
stressModel = ps.StressModel(well/1e6, rfunc=ps.Gamma, name="pumping well", settings="well", up=False) model.add_stressmodel(stressModel)
Solve the model
model.solve()
Plot the decomposition of the time series in various stresses
axes = model.plots.decomposition(figsize=(10,8)) axes[0].plot(simulation) # Add the previously simulated values to the plot
Output plot:
- Residuals analysis
Compute residuals time series using residuals() method
model.residuals().plot(figsize=(20, 8))
Plot the residuals series
residual.plot() plt.legend(["With well", "Without well"]) #Legend of plot
Output:
- Code source
- You can access the Google colab notebook here.
References
To dive deeper into the Pastas framework, refer to the following sources: