Analyzing Climate Change Using Earth Surface Temperature DataSet


With each passing day, the threat of climate change becomes a more pressing concern. Emissions of greenhouse gases, mainly carbon dioxide and methane, drive global warming and drastic changes in weather. Their main sources are the burning of fossil fuels, deforestation and industrial effluents. In recent years, Earth’s surface temperature has risen sharply and heat waves have intensified. At the same time, glaciers are melting, which raises sea levels and shrinks the land area. Not only humans but also plants and animals are severely affected.

Scientists say this will continue to harm the Earth unless action is taken soon. Large organisations are now working together on decisions to limit climate change for future generations. Bodies such as WHO and NASA track climate change indicators and publish guidance for countries.


About the Dataset

The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature records. The data is well packaged and offers interesting subsets (by country, by city, and so on). Berkeley Earth has published the source data and the transformations applied to it. Its methods also allow weather observations from short time spans to be included. The dataset consists of several files. The main one, Global Land and Ocean-and-Land Temperatures, holds records from 1750 to 2015.


The other files are: Global Average Land Temperature by Country, Global Average Land Temperature by State, Global Land Temperatures by Major City, and Global Land Temperatures by City.

Time Series

The raw data collected by Berkeley Earth has been processed and cleaned by many developers into a usable dataset, so that researchers can work on it and draw further insights. Dataset Used – Link. We will demonstrate time series analysis on this dataset.

# importing libraries

 import pandas as pd
 import seaborn as sns
 import numpy as np
 import matplotlib.pyplot as plt
 %matplotlib inline
 from plotly.offline import download_plotlyjs, init_notebook_mode, iplot 

# read dataset

 temp = pd.read_csv('../input/climate-change-earth-surface-temperature-data/GlobalTemperatures.csv', parse_dates=["dt"], index_col="dt")
 # the index holds one timestamp per month
 temp.index
 DatetimeIndex(['1750-01-01', '1750-02-01', '1750-03-01', '1750-04-01',
                 '1750-05-01', '1750-06-01', '1750-07-01', '1750-08-01',
                 '1750-09-01', '1750-10-01',
                 ...
                 '2015-03-01', '2015-04-01', '2015-05-01', '2015-06-01',
                 '2015-07-01', '2015-08-01', '2015-09-01', '2015-10-01',
                 '2015-11-01', '2015-12-01'],
                dtype='datetime64[ns]', name='dt', length=3192, freq=None) 


 # shape of the frame, followed by column types and non-null counts
 temp.shape, temp.info()
 <class 'pandas.core.frame.DataFrame'>
 DatetimeIndex: 3192 entries, 1750-01-01 to 2015-12-01
 Data columns (total 8 columns):
  #   Column                                     Non-Null Count  Dtype  
 ---  ------                                     --------------  -----  
  0   LandAverageTemperature                     3180 non-null   float64
  1   LandAverageTemperatureUncertainty          3180 non-null   float64
  2   LandMaxTemperature                         1992 non-null   float64
  3   LandMaxTemperatureUncertainty              1992 non-null   float64
  4   LandMinTemperature                         1992 non-null   float64
  5   LandMinTemperatureUncertainty              1992 non-null   float64
  6   LandAndOceanAverageTemperature             1992 non-null   float64
  7   LandAndOceanAverageTemperatureUncertainty  1992 non-null   float64
 dtypes: float64(8)
 memory usage: 224.4 KB
 ((3192, 8), None) 
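
The non-null counts above differ between columns: the land average has 3180 values, while the max, min and land-and-ocean columns have 1992. A minimal sketch of how one might check where each column's coverage begins, run on a small synthetic frame in place of `temp` (the helper name `coverage_summary` and the toy values are illustrative assumptions, not part of the original notebook):

```python
import numpy as np
import pandas as pd


def coverage_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column count of non-null values plus first and last valid timestamps."""
    return pd.DataFrame({
        "non_null": df.notna().sum(),
        "first_valid": df.apply(lambda col: col.first_valid_index()),
        "last_valid": df.apply(lambda col: col.last_valid_index()),
    })


# Toy monthly frame (made-up values): the second column only starts in the fourth month.
idx = pd.date_range("1750-01-01", periods=6, freq="MS")
toy = pd.DataFrame(
    {
        "LandAverageTemperature": [3.0, 3.1, np.nan, 8.5, 11.5, 12.9],
        "LandMaxTemperature": [np.nan, np.nan, np.nan, 14.0, 17.2, 18.9],
    },
    index=idx,
)
summary = coverage_summary(toy)
print(summary)
```

Calling `coverage_summary(temp)` on the real data would list, for each column, the first and last month that has a value.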

# generating heatmap
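
A heatmap for this kind of data typically shows how the columns correlate with one another. A minimal sketch, assuming the heatmap displays pairwise Pearson correlations between the numeric columns of `temp`; the function names and the synthetic stand-in data are illustrative assumptions:

```python
import numpy as np
import pandas as pd


def correlation_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Pearson correlation of the numeric columns."""
    return df.select_dtypes("number").corr()


def plot_correlation_heatmap(df: pd.DataFrame):
    """Draw the correlation matrix as an annotated heatmap and return the axes."""
    import seaborn as sns  # imported lazily; only needed for drawing

    ax = sns.heatmap(correlation_matrix(df), annot=True, fmt=".2f",
                     cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_title("Correlation between temperature attributes")
    return ax


# Synthetic stand-in for `temp`: two perfectly related columns and one unrelated one.
rng = np.random.default_rng(0)
base = rng.normal(8.5, 4.0, size=200)
demo = pd.DataFrame({
    "LandAverageTemperature": base,
    "LandMaxTemperature": base + 6.0,  # a constant offset, so correlation is 1
    "LandAverageTemperatureUncertainty": rng.uniform(0.1, 3.0, size=200),
})
corr = correlation_matrix(demo)
print(corr.round(2))
```

On the real data, `plot_correlation_heatmap(temp)` would draw the same kind of matrix for all eight columns.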



# visualisation for all the attributes
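
One simple way to view every attribute at once is to average each column per year and draw one subplot per column. A minimal sketch under that assumption; `yearly_means` and `plot_all_attributes` are hypothetical helper names, and the demo frame holds made-up values:

```python
import pandas as pd


def yearly_means(df: pd.DataFrame) -> pd.DataFrame:
    """Average every column per calendar year of a DatetimeIndex-ed frame."""
    out = df.groupby(df.index.year).mean()
    out.index.name = "year"
    return out


def plot_all_attributes(df: pd.DataFrame):
    """One subplot per column of the yearly means; returns the array of axes."""
    return yearly_means(df).plot(subplots=True, figsize=(10, 12), sharex=True)


# Made-up monthly data covering two years.
idx = pd.date_range("2000-01-01", periods=24, freq="MS")
demo = pd.DataFrame(
    {
        "LandAverageTemperature": [1.0] * 12 + [3.0] * 12,
        "LandMaxTemperature": [7.0] * 12 + [9.0] * 12,
    },
    index=idx,
)
yearly = yearly_means(demo)
print(yearly)
```

With the real data, `plot_all_attributes(temp)` would produce eight stacked panels sharing the year axis.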

# Yearly Average Land Temperature


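 # reload the file and average the monthly land temperature per year
 new_df = pd.read_csv('../input/climate-change-earth-surface-temperature-data/GlobalTemperatures.csv')
 new_df['year'] = pd.to_datetime(new_df['dt']).dt.year
 by_new = new_df.groupby('year')['LandAverageTemperature'].mean().reset_index()
 new_pivot = by_new.pivot_table(values='LandAverageTemperature', index='year')
 # one possible view: a line plot of the yearly averages
 new_pivot.plot(figsize=(12, 5), title='Yearly Average Land Temperature')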

After 1900, the yearly average land temperature rises steeply.
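
One way to put a number on such a rise is to compare least-squares slopes of the yearly means before and after 1900. A minimal sketch on synthetic data; the split year comes from the text, while the helper name `trend_per_decade` and the demo values are assumptions and not results from the dataset:

```python
import numpy as np
import pandas as pd


def trend_per_decade(yearly: pd.Series, start: int, end: int) -> float:
    """Least-squares slope of a year-indexed series over [start, end], per decade."""
    window = yearly.loc[start:end].dropna()
    slope_per_year = np.polyfit(
        window.index.to_numpy(dtype=float), window.to_numpy(), deg=1
    )[0]
    return float(slope_per_year * 10)


# Synthetic yearly series: flat until 1900, then rising by 0.01 degrees per year.
years = np.arange(1850, 1951)
values = np.where(years <= 1900, 8.0, 8.0 + 0.01 * (years - 1900))
demo = pd.Series(values, index=years)

before = trend_per_decade(demo, 1850, 1900)
after = trend_per_decade(demo, 1900, 1950)
print(f"before 1900: {before:.3f} per decade, after 1900: {after:.3f} per decade")
```

For the real data, the same function could be applied to `by_new.set_index('year')['LandAverageTemperature']` with the year ranges of interest.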

# highest temperature dates

 # ten dates with the highest land average temperature, plotted in ascending order
 ax = temp.groupby(['dt'])['LandAverageTemperature'].last().sort_values(ascending=False).head(10).sort_values().plot(kind='barh')
 ax.set_xlabel("avg temp")
 plt.title("Date Wise Highest Average Temperature")

# Average Temperature in all Seasons

 ax.set_ylabel('Average temperature')
 ax.set_title('Average temperature in each season')
 legend = plt.legend(loc='center left', bbox_to_anchor=(1, 0.5), frameon=True, borderpad=1, borderaxespad=1) 
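
The labelling calls above need per-season averages to plot. A minimal sketch of how those averages could be computed, assuming Northern Hemisphere meteorological seasons and assigning December to the winter of its own calendar year; `SEASON_BY_MONTH`, `seasonal_means` and `plot_seasonal_means` are hypothetical names, and the demo values are made up:

```python
import pandas as pd

# Assumption: Northern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_means(series: pd.Series) -> pd.DataFrame:
    """Mean per (year, season): one row per year, one column per season."""
    years = pd.Index(series.index.year, name="year")
    seasons = pd.Index(series.index.month.map(SEASON_BY_MONTH), name="season")
    return series.groupby([years, seasons]).mean().unstack()


def plot_seasonal_means(series: pd.Series):
    """Line plot with one line per season, labelled as in the article."""
    ax = seasonal_means(series).plot(figsize=(10, 6))
    ax.set_ylabel("Average temperature")
    ax.set_title("Average temperature in each season")
    ax.legend(loc="center left", bbox_to_anchor=(1, 0.5), frameon=True)
    return ax


# Made-up data: each month's value equals its month number.
idx = pd.date_range("2000-01-01", periods=12, freq="MS")
demo = pd.Series([float(m) for m in range(1, 13)], index=idx)
table = seasonal_means(demo)
print(table)
```

On the real data this would be called as `plot_seasonal_means(temp['LandAverageTemperature'])`.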

# Countries with Highest temperature Differences

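 temp_country = pd.read_csv('../input/climate-change-earth-surface-temperature-data/GlobalLandTemperaturesByCountry.csv')
 # drop missing readings so countries without data do not produce NaN differences
 temp_country = temp_country.dropna(subset=['AverageTemperature'])
 countries = temp_country['Country'].unique()
 # (max, min) average temperature recorded for each country
 max_min_list = []
 for country in countries:
     curr_temps = temp_country[temp_country['Country'] == country]['AverageTemperature']
     max_min_list.append((curr_temps.max(), curr_temps.min()))
 # spread between the warmest and coldest reading, sorted from largest to smallest
 diff = [t_max - t_min for t_max, t_min in max_min_list]
 diff, countries = (list(x) for x in zip(*sorted(zip(diff, countries), key=lambda pair: pair[0], reverse=True)))
 f, ax = plt.subplots(figsize=(8, 8))
 sns.barplot(x=diff[:15], y=countries[:15], palette=sns.color_palette("coolwarm", 25), ax=ax) 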
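
The same ranking can also be computed without an explicit loop. A minimal sketch using `groupby`, offered as an alternative and not as the notebook's own method; `temperature_ranges` is a hypothetical helper and the toy frame holds made-up values:

```python
import numpy as np
import pandas as pd


def temperature_ranges(df: pd.DataFrame, top: int = 15) -> pd.Series:
    """Largest (max - min) of AverageTemperature per Country, sorted descending."""
    stats = df.groupby("Country")["AverageTemperature"].agg(["max", "min"])
    spread = (stats["max"] - stats["min"]).dropna()  # countries with no data drop out
    return spread.sort_values(ascending=False).head(top)


# Toy frame: country C has no readings at all.
toy = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C"],
    "AverageTemperature": [-10.0, 20.0, 25.0, 27.0, np.nan, np.nan],
})
ranges = temperature_ranges(toy)
print(ranges)
```

The result of `temperature_ranges(temp_country)` could be passed to `sns.barplot` as `x=ranges.values, y=ranges.index`.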

For the complete notebook, visit the link here.


In this article, we have shown some time series trends in the climate change dataset over 265 years (1750-2015). Many more insights can be drawn from this data, and the analysis can be cross-checked against other similar datasets.


Copyright Analytics India Magazine Pvt Ltd
