One of the most challenging problems many manufacturing companies and industries face is the generation of hazardous gases in their production and operation processes. Although modern toxic gas detection systems can sense and help contain dangers such as gas leaks, they cannot predict disasters very accurately.
With machine learning (ML) and its subfields now pervasive in real-world applications, it is possible to anticipate gas hazards through automated dispersion prediction. In this article, we look at a recent research study that proposes two standard ML models for efficient gas dispersion prediction.
Atmospheric Dispersion Modeling And ML
Atmospheric Dispersion (AD) modeling is the process of mathematically describing how gaseous particles (for example, pollutants) disperse in the air. AD models are generally implemented in computer software, since they involve solving the mathematical equations that govern dispersion. Once set up, these models predict particle dispersion in the surrounding environment.
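As one concrete example of such an equation, here is a minimal sketch of the textbook steady-state Gaussian plume formula with ground reflection. It is not taken from the study. The function name, argument names and units are our own choices, and the dispersion coefficients sigma_y and sigma_z are assumed to be supplied by the caller for the downwind distance of interest.

```python
import math

def gaussian_plume(q, u, sigma_y, sigma_z, y, z, h=0.0):
    """Steady-state Gaussian plume concentration at a receptor.

    q: emission rate (g/s); u: mean wind speed (m/s);
    sigma_y, sigma_z: dispersion coefficients (m) at the receptor's
    downwind distance (assumed to be given by the caller);
    y: crosswind offset (m); z: receptor height (m);
    h: effective release height (m).
    Returns the concentration in g/m^3.
    """
    if u <= 0 or sigma_y <= 0 or sigma_z <= 0:
        raise ValueError("u, sigma_y and sigma_z must be positive")
    crosswind = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second term mirrors the source below the ground so that the
    # plume is reflected at the surface instead of passing through it.
    vertical = (math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2)))
    return q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical
```

A single evaluation like this is cheap. The difficulty in practice lies in choosing the dispersion coefficients and in the formula's simplifying assumptions, such as steady wind and flat terrain.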
Conventional methods such as Gaussian models, Lagrangian stochastic (LS) models and Computational Fluid Dynamics (CFD) models are generally used in AD modeling. However, between them these models pose two problems: high computational cost (most notably for CFD) and inaccurate results in complex environments (most notably for simple Gaussian models).
This is why many industries show little interest in conventional AD models and rely on traditional emergency response actions instead. To address the computational cost, researchers and industry experts began exploring inexpensive yet powerful alternatives such as ML, which has proved its mettle in various applications, to predict hazardous gas dispersion.
The first study to apply ML to gas dispersion prediction was by Marija Božnar and team from the Jožef Stefan Institute, Slovenia, in 1992; they used a neural network-based method to predict sulphur dioxide (SO2) concentration at a power plant in Šoštanj. As ML developed in subsequent years, supervised learning models gained traction, and techniques such as support vector machines (SVM) gradually made their way into studies of hazardous gas dispersion prediction.
But these implementations faced setbacks, whether from difficulty in training the models or from a loss of accuracy tied to the input parameters used for gas dispersion. A recent study by Wang and team has tried to overcome these problems by proposing two ML models: a back-propagation (BP) network and a support vector regression (SVR) model.
The study draws on two field datasets. Project Prairie Grass is an experimental dataset from a gas emission study carried out at O’Neill, Nebraska, in the US in 1956. It mainly measured the diffusion of SO2 emitted from a continuous point source at ground level. In addition, meteorological data such as mean wind direction, mean wind speed and air temperature were also collected.
Similarly, the Indianapolis tracer dataset contains gas concentrations measured at Indianapolis in the US in 1985. Unlike Project Prairie Grass, this study collected data in a complex terrain setting.
A total of 23,900 tracer data samples were considered by Wang and team, based on these two datasets.
Unravelling The Models
Backpropagation (BP) Network Model: In artificial neural networks, backpropagation (BP) is the most widely used training technique for supervised learning. In the study, Wang and team briefly explain why a BP network suits dispersion prediction:
“The inputs of the BP network are usually parameters associated with atmospheric dispersion. These parameters usually include the meteorological parameters, the parameters related to the points of interest, and the source terms. With appropriate neuron numbers of hidden layers, the BP network can perform well on both accuracy and the convergence speed,” they said.
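To make the idea concrete, here is a minimal sketch of a BP network with a single hidden layer, written in plain Python. It is an illustration only and not the network from the study, which was built in MATLAB. The number of hidden neurons, the learning rate, the tanh activation and the toy target function are all assumptions made for this example.

```python
import math
import random

def train_bp(samples, n_hidden=4, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network with back-propagation (plain SGD).

    samples: list of (inputs, target) pairs, inputs scaled to about [0, 1].
    Returns a predict(inputs) function and the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Each hidden row holds n_in weights followed by one bias term.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
           for _ in range(n_hidden)]
    # Output weights: one per hidden neuron, followed by one bias term.
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in w_h]
        out = sum(w * h for w, h in zip(w_o, hidden)) + w_o[-1]
        return hidden, out

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            hidden, out = forward(x)
            err = out - target
            total += err * err
            for j, h in enumerate(hidden):
                # Propagate the output error back through neuron j
                # (tanh'(a) = 1 - tanh(a)^2) before updating w_o[j].
                delta_h = err * w_o[j] * (1.0 - h * h)
                for i, v in enumerate(x):
                    w_h[j][i] -= lr * delta_h * v
                w_h[j][-1] -= lr * delta_h
                w_o[j] -= lr * err * h
            w_o[-1] -= lr * err
        history.append(total / len(samples))
    return (lambda x: forward(x)[1]), history

# Toy data standing in for (normalised inputs, concentration) pairs.
# The target function below is hypothetical, not a dispersion model.
_rng = random.Random(1)
data = []
for _ in range(40):
    a, b = _rng.random(), _rng.random()
    data.append(([a, b], a * (1.0 - b)))

predict, history = train_bp(data)
```

In a real setting the two toy inputs would be replaced by the meteorological, receptor and source parameters described in the quote, and the hidden layer size would be tuned rather than fixed at four.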
The BP network is designed for two cases: original monitoring input parameters (the meteorological data mentioned earlier) and Gaussian inputs. All input parameters from both datasets are normalised to form the input vectors, and the networks are trained using the MATLAB Neural Network Toolbox. Furthermore, cross-validation is used to split the datasets into subsets for validation. (Detailed information on this process can be found in the study.)
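The normalisation and cross-validation steps can be sketched in a few lines of Python. The article does not say which normalisation scheme or how many folds the study used, so the min-max scaling and the five folds below are assumptions, and both function names are our own.

```python
import random

def min_max_normalise(rows):
    """Scale each column of a list of equal-length rows to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        # A constant column carries no information, so map it to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        train = [j for f_i, fold in enumerate(folds) if f_i != i
                 for j in fold]
        yield train, val
```

In a stricter setup, the column minima and maxima would be computed on each training fold only and then applied to the matching validation fold, so that no information leaks from validation data into training.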
Support Vector Regression (SVR) Model: Using the regression method derived from the support vector machine (SVM), the researchers build an SVR model, which takes a different approach from the BP network. Here, the input parameters are mapped into a feature space through kernel functions; the Radial Basis Function (RBF) is the kernel selected for this SVR model.
Just like the BP network described above, the SVR model is built for both original monitoring input parameters and Gaussian inputs. Detailed information on this model can be found in the study.
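Two building blocks of a standard SVR are easy to show in code: the RBF kernel and the epsilon-insensitive loss. The sketch below covers only those two pieces and does not solve the SVR optimisation problem. The gamma and epsilon values are placeholders, since the article does not report the hyperparameters used in the study.

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def kernel_matrix(points, gamma=1.0):
    """Pairwise RBF similarities between all input vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger ones cost linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

The kernel returns 1 for identical inputs and decays towards 0 as they move apart, with gamma controlling how quickly. The loss ignores small deviations, which is why an SVR fit depends only on a subset of the training points (the support vectors).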
To assess dispersion prediction performance, the BP and SVR models are evaluated on two statistical measures: the coefficient of determination (R²) and the normalised mean square error (NMSE). These also show how well each set of inputs serves the learning model.
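Both measures are simple to compute. The article does not spell out the exact formulas used in the study, so the sketch below assumes the common definitions: R² as one minus the ratio of residual to total sum of squares, and NMSE as the mean squared error divided by the product of the mean observed and mean predicted values, a convention often used when evaluating air quality models.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def nmse(observed, predicted):
    """Mean squared error normalised by mean(observed) * mean(predicted)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return mse / (mean_obs * mean_pred)
```

A perfect prediction gives an R² of 1 and an NMSE of 0, so a better model pushes R² up and NMSE down.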
Both models performed very satisfactorily, with a high R² and a low NMSE on the Project Prairie Grass and Indianapolis tracer datasets. Differences between the two ML models appear only when the training set is very small. Otherwise, both fare very well compared with traditional AD models.
The models have proved to be very effective in gas dispersion prediction. However, they are yet to be explored across varied environmental settings, regardless of the type of industry or process. On top of this, the models fail to make a significant mark in performance on smaller datasets. Then again, they are quicker than traditional AD models. With further improvements, this approach could well become a standard for AD prediction.