How ML Models Can Predict Hazardous Gas Dispersion To Avert Disasters

One of the most challenging problems that manufacturing companies and other industries face is the generation of hazardous gases during production and operation. Although modern toxic gas detection systems can sense and help contain dangers such as gas leaks, they cannot predict disasters with much accuracy.



With machine learning (ML) and its subfields now pervasive in real-world applications, it has become possible to flag gas hazards early through automated, ML-driven dispersion prediction. In this article, we look at a recent research study that proposes two standard ML models for efficient gas dispersion prediction.

Atmospheric Dispersion Modeling And ML

Atmospheric Dispersion (AD) Modeling is the process of mathematically describing how gaseous particles (for example, pollutants) disperse in the air. AD models are generally implemented in computer software, since they involve solving the mathematical equations that govern dispersion. Once set up, these models predict how particles disperse in the surrounding environment.

Conventional methods such as the Gaussian model, Lagrangian stochastic (LS) models and Computational Fluid Dynamics (CFD) models are generally used in AD modelling. However, these models tend to suffer from one or both of two problems: high computational costs and inaccurate results in complex environments.
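To make the Gaussian model concrete, here is a minimal sketch of the textbook steady-state Gaussian plume equation for a continuous point source with ground reflection. It is not taken from the study; the function name and parameter names are illustrative, and the dispersion coefficients sigma_y and sigma_z are assumed to be supplied by the caller for the downwind distance of interest.

```python
import math


def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Concentration (g/m^3) at a receptor, textbook Gaussian plume form.

    q: emission rate (g/s); u: mean wind speed (m/s)
    y: crosswind offset (m); z: receptor height (m)
    h: effective release height (m)
    sigma_y, sigma_z: dispersion coefficients (m) at the receptor's
        downwind distance (assumed given; not computed here)
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second exponential is the image source that models
    # reflection of the plume off the ground.
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical
```

The formula is cheap to evaluate, but it assumes steady wind and flat, uniform terrain, which is one reason its accuracy drops in complex environments.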

This is why most industries are reluctant to adopt conventional AD models and instead rely on traditional emergency response actions. To address the high computational cost, researchers and industry experts began exploring inexpensive yet powerful alternatives such as ML, which has proved its mettle in various applications, to predict hazardous gas dispersion.

The first study to apply ML to gas dispersion prediction was carried out in 1992 by Marija Božnar and team from the Jožef Stefan Institute, Slovenia, who used a neural network-based method to predict sulphur dioxide (SO2) concentrations at a power plant in Šoštanj. As ML developed in the following years, supervised learning models gained traction, and techniques such as support vector machines (SVM) slowly made their way into studies of hazardous gas dispersion prediction.

But these implementations have faced setbacks, whether from difficulty in training the models or from a loss of accuracy tied to the choice of input parameters. A recent study has tried to overcome this problem by proposing two ML models: a backpropagation (BP) network and a support vector regression (SVR) model.


In the study, Rongxiao Wang and a team of researchers from China test the proposed models on two datasets: Project Prairie Grass and the Indianapolis tracer and meteorological data.

Project Prairie Grass is an experimental dataset from a gas emission study carried out at O’Neill, Nebraska, in the US in 1956. It mainly measured the diffusion of SO2 emitted from a continuous point source at ground level. Meteorological data such as mean wind direction, mean wind speed and air temperature were also collected.

Similarly, the Indianapolis tracer dataset contains gas concentrations collected and measured in Indianapolis, US, in 1985. Unlike Project Prairie Grass, this study collected data in a complex terrain setting.

In total, Wang and team considered 23,900 tracer data samples drawn from these two datasets.

Unravelling The Models

Backpropagation (BP) Network Model: Backpropagation (BP) is the most widely used technique for training artificial neural networks in supervised ML. In the study, Wang and the team briefly explain why BP is preferred for predicting dispersion:

“The inputs of the BP network are usually parameters associated with atmospheric dispersion. These parameters usually include the meteorological parameters, the parameters related to the points of interest, and the source terms. With appropriate neuron numbers of hidden layers, the BP network can perform well on both accuracy and the convergence speed,” they said.
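As a rough illustration of what such a network involves, the following is a minimal sketch of a one-hidden-layer regression network trained with backpropagation in plain Python. It is not the authors' implementation; the function name, the tanh activation, the hidden-layer size, the learning rate and the per-sample update scheme are all assumptions made for the example.

```python
import math
import random


def train_bp_network(inputs, targets, hidden=4, lr=0.1, epochs=500, seed=0):
    """Train a one-hidden-layer network with backpropagation (online SGD).

    Returns (predict, history); history holds the mean squared error
    accumulated during each epoch. All hyperparameters are illustrative.
    """
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Hidden layer uses tanh; the output neuron is linear (regression).
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in zip(inputs, targets):
            h, y = forward(x)
            err = y - t  # gradient of 0.5 * (y - t)^2 with respect to y
            total += err * err
            for j in range(hidden):
                # Propagate the error back through the tanh unit
                # before the output weight w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
        history.append(total / len(inputs))

    def predict(x):
        return forward(x)[1]

    return predict, history
```

In a dispersion setting, each input row would hold normalised meteorological, receptor and source parameters and each target a measured concentration. The `hidden` argument corresponds to the "neuron numbers of hidden layers" mentioned in the quote.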

The BP network is designed for two cases: original monitoring input parameters (the meteorological data mentioned earlier) and Gaussian inputs. All of the input parameters from both datasets are normalised to form the input vectors, and the network is trained using the MATLAB Neural Network Toolbox. Furthermore, cross-validation is used to split the datasets into subsets for validation. (Detailed information on this process can be found in the study.)
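The normalisation and cross-validation steps can be sketched as follows. This is a plain-Python illustration, not the MATLAB workflow the authors used; min-max scaling to [0, 1] and a shuffled k-fold split are assumptions, since the article does not say which normalisation or cross-validation variant the study applied.

```python
import random


def min_max_normalise(rows):
    """Scale each column of `rows` to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx
```

In practice the scaling bounds are usually computed on the training fold only and then applied to the validation fold, so that no information from the held-out samples leaks into training.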

Support Vector Regression (SVR) Model: Using the regression method derived from the support vector machine (SVM), the researchers build an SVR model, which takes a different approach from the BP model. In this case, the input parameters are mapped through kernel functions. The radial basis function (RBF) is the kernel selected for this SVR model.

Just like the BP network described above, the model is built for both the original monitoring input parameters and the Gaussian inputs. Detailed information on this model can be found in the study.
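For readers unfamiliar with kernels, here is a minimal sketch of the standard RBF kernel and of how a trained SVR turns kernel values into a prediction. The function names, the `gamma` default and the argument layout are illustrative assumptions; the support vectors, dual coefficients and bias are taken as given, since fitting them requires a quadratic programming solver that is outside this sketch.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma=1.0):
    """Evaluate f(x) = sum_i coef_i * K(sv_i, x) + bias for a fitted SVR.

    dual_coefs holds one signed coefficient per support vector
    (the difference of the two Lagrange multipliers in the SVR dual).
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias
```

The kernel equals 1 when the two points coincide and decays towards 0 as they move apart, so each support vector mainly influences predictions for nearby inputs; `gamma` controls how quickly that influence falls off.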

The Result

To measure dispersion prediction performance, the BP and SVR models are evaluated on two statistical measures: the coefficient of determination (R2) and the normalised mean square error (NMSE). This also shows how well each choice of inputs serves the learning model.
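Both measures are simple to compute. The sketch below uses common definitions: R2 as one minus the ratio of residual to total sum of squares, and NMSE as the mean squared error divided by the product of the mean observed and mean predicted values, as often used in dispersion-model evaluation. The article does not state the exact formulas used in the study, so treat these definitions and the function names as assumptions.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nmse(observed, predicted):
    """Normalised mean square error: mean((o - p)^2) / (mean(o) * mean(p))."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    return mse / (mean_obs * mean_pred)
```

A perfect model gives an R2 of 1 and an NMSE of 0, so higher R2 and lower NMSE both indicate better agreement between predicted and measured concentrations.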

Both models performed very satisfactorily, with a high R2 and a low NMSE on the Project Prairie Grass as well as the Indianapolis tracer datasets. Differences between the two ML models appear only when the training data size is very small. Otherwise, both fare very well compared to traditional AD models.


The models have proved to be very effective in gas dispersion prediction. However, they have yet to be explored across varied environmental settings and across different types of industry and process. On top of this, the models fall short in performance on smaller datasets. Then again, they are quicker than traditional AD models. With further improvements, the approach in this study could become a standard for AD prediction.


Abhishek Sharma
I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.
