Ajinkya Bhave, Country Head (India) at Siemens Engineering Services, discussed the rising significance of simulated data in his talk at the MLDS conference, titled “Simulation-driven Machine Learning”. He described how simulated data can be used to train machine learning models in situations where collecting physical data is impossible. “At Siemens, the tool connects simulation models and data to train frameworks for ML to train the model at scale using the digital twin,” he explained.
He outlined the challenges of generating and labelling real-world data and how industries can overcome these hurdles using a digital twin and simulation data. He referred to the Reduced Order Model (ROM), which simplifies a high-fidelity static or dynamic model while preserving its essential behaviour and dominant effects, reducing the solution time or storage capacity required compared to the more complex model.
ROM, simulation and digital twins
The reduced-order model helps organisations convert data into models, extend their scope and compute faster. A ROM can run a digital twin on embedded devices, in the cloud or on-site. “The basic idea is that the ROM is the catalyst of the digital twin, enabling more applications that weren’t possible in the past,” he explained.
There are multiple ways to create a ROM, depending on the application area, the data available and the system being modelled. The model can range from fully data-driven, using machine learning and deep learning, through hybrid models combining statistics and physics, to fully physics-based. “You cannot create a model without domain knowledge that you encapsulate in it. But, equally important, the data matters. All the models require some amount of data,” he said.
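One common recipe for the data-driven end of that spectrum is Proper Orthogonal Decomposition (POD): take snapshots of the full-order model, find the dominant modes with an SVD, and keep only those. The talk does not name Siemens' specific method, so the sketch below is a generic, minimal POD example with made-up dimensions:

```python
# Sketch: a data-driven ROM via Proper Orthogonal Decomposition (POD).
# Dimensions and the random "snapshot" data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Snapshot matrix: each column is one full-order state of the system
# (here, 1000 spatial degrees of freedom sampled at 50 time steps).
n_dof, n_snapshots = 1000, 50
snapshots = rng.standard_normal((n_dof, 5)) @ rng.standard_normal((5, n_snapshots))

# SVD of the snapshots; the leading left-singular vectors span the
# dominant behaviour of the full model.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)

# Keep enough modes to capture ~99.9% of the "energy" (squared singular values).
energy = np.cumsum(s**2) / np.sum(s**2)
r = int(np.searchsorted(energy, 0.999)) + 1
basis = U[:, :r]                      # reduced basis, n_dof x r

# Project a full state down to r coordinates and reconstruct it:
# the digital twin then evolves r numbers instead of 1000.
x_full = snapshots[:, 0]
x_reduced = basis.T @ x_full
x_approx = basis @ x_reduced
```

This is what makes the "compute faster" claim concrete: the reduced state has only `r` coordinates, which is what allows a twin to run on an embedded device.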
When creating ROMs with a neural network approach, the data can be either a stopping point or an advantage. At Siemens, the team either augments existing physical data, creates synthetic data, or cleans and labels existing data.
Simulation plays a huge role in connecting machine learning to the digital twin model. Ajinkya explored this ability through interesting real-world case studies.
Case study 1: Applying synthetic data to deploy machine learning in real-world scenarios
Ajinkya walked the audience through a case study of a Siemens client that makes gearboxes for wind turbines. The turbines break down due to failures in the gearboxes and ball bearings, so the company turned to predictive monitoring to minimise downtime. While the customer had tons of data, it did not have the distribution needed: most of the data described healthy operation, with only one-off fault anomalies. To balance the distribution, Siemens leveraged 1D and 3D tools to model the gearbox and the ball bearings around the gears in its multiphysics tool for 1D modelling. The model and its parts were simulated as a nonlinear spring-mass-damper system, with some parameters based on real data and others tuned. Fault injections were then applied to the model, reproducing the faults the customer was looking for and outputting a synthetic time series. Next, statistical noise was injected to bring the output closer to real-world measurements. Siemens ran the resulting noisy time series through a neural network to identify faults.
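The pipeline above (simulate, inject a fault, inject noise, label) can be sketched end to end. All parameter values, the fault shape and the excitation below are illustrative assumptions, not the tuned Siemens model:

```python
# Sketch of the synthetic-data pipeline: simulate a nonlinear spring-mass-
# damper, optionally inject a bearing-style fault, then add statistical noise.
import numpy as np

rng = np.random.default_rng(1)

def simulate(fault=False, dt=1e-3, steps=5000):
    m, c, k = 1.0, 0.5, 200.0        # mass, damping, stiffness (hand-tuned stand-ins)
    x, v = 0.0, 0.0                  # displacement / velocity
    out = np.empty(steps)
    for i in range(steps):
        t = i * dt
        force = np.sin(2 * np.pi * 10 * t)            # gear-mesh-like excitation
        if fault:
            # Fault injection: short periodic impulses, as from a damaged bearing
            force += 5.0 * (np.sin(2 * np.pi * 3 * t) > 0.99)
        a = (force - c * v - k * x - 50.0 * x**3) / m  # cubic spring term = nonlinearity
        v += a * dt
        x += v * dt
        out[i] = x
    # Statistical noise injection so the series looks closer to field data
    return out + rng.normal(0.0, 1e-4, steps)

healthy = simulate(fault=False)
faulty = simulate(fault=True)
# 'healthy' and 'faulty' become labelled training series for the neural network.
```

The fault impulses ring the lightly damped system, so the faulty series carries visibly more energy than the healthy one, which is the signature a classifier learns to pick out.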
“The idea was that we created synthetic training data, which was then used to train a neural network on a digital twin of the model. Then we tested that on the real faults which occur in the ball bearings of the gearboxes with the physical data. The graph showed us the prediction was pretty accurate,” he said. “The model trained on synthetic data with a well-tuned simulation model was able to create good training data for the machine learning algorithm for it to be able to predict those faults in real-world data in a real-world deployment.”
Case study 2: Model predictive control
Model predictive control (MPC) algorithms need accurate, high-fidelity plant models, but these are not always available. To that end, a virtual model of the plant is created using a black-box or grey-box approach. The model represents either the complete plant or a sub-system, in the form of a virtual sensor for the parts of the plant that cannot be measured. The neural network-based sensor infers, from the physical measurements and a model of the subsystem, the states the controller needs; these are then given to the MPC. “You have augmented the physical plant along with the unobservable data using a simulation-based approach to help the controller to do better than what it would have with only the physical model of the plant,” he said.
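The virtual-sensor idea reduces to learning a map from measurable outputs to an unmeasurable internal state, trained offline on simulation data. A minimal sketch, with a linear least-squares fit standing in for the neural network and purely synthetic stand-in signals:

```python
# Sketch of a virtual sensor: learn a map from measurable plant outputs to
# an unmeasurable internal state, then feed the estimate to the controller.
# Least squares stands in for the neural network; all data is synthetic.
import numpy as np

rng = np.random.default_rng(2)

# Simulated plant logs: 3 physical measurements per sample, plus the hidden
# state, which the simulation knows but a field deployment cannot measure.
n = 2000
measured = rng.standard_normal((n, 3))
true_map = np.array([0.7, -1.2, 0.4])          # assumed ground-truth relation
hidden_state = measured @ true_map + 0.01 * rng.standard_normal(n)

# "Train" the virtual sensor offline on the simulation data
w, *_ = np.linalg.lstsq(measured, hidden_state, rcond=None)

def virtual_sensor(y_measured):
    """Estimate the hidden state the MPC needs from live measurements."""
    return y_measured @ w

estimate = virtual_sensor(measured)
```

In deployment, `virtual_sensor` runs on live measurements and its output augments the physical sensors in the state vector handed to the MPC.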
The ROM and synthetic data can also be applied to the neural network model of the plant inside the MPC, for model-based reinforcement learning, autonomous driving and factory robots, giving the controller a fast, reduced-order model of the plant to optimise against.
Case study 3: Predictive maintenance of pole-mounted transformers
The last case study concerned pole-mounted transformers, which take high-tension lines and step the voltage down to 230 V for the safe operation of household appliances. However, given India’s diverse temperature conditions, such transformers are a fire risk. One identified cause is the oil level between the coils of the transformer dropping, causing it to overheat or spark. To monitor the oil levels of different transformers, the ‘normalised twin concept’ is used. Siemens retrofitted the transformer infrastructure with a Siemens box containing four temperature sensors and a cloud-based router that sends the measurements periodically to the cloud.
This allowed Siemens to infer the oil level, specialise the normalised digital twin for that transformer model, and use the live twin to virtually estimate the oil levels. Although this is still an ongoing project, the digital twin built on simulated data “was parameterised and finetuned with real parameters from the field”.
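One simple way such a live twin can estimate an unmeasurable quantity is to run the parameterised twin offline at several known oil levels to build a calibration curve, then invert that curve for each live temperature reading. The numbers below are illustrative assumptions, not transformer physics:

```python
# Sketch of the live-twin estimation step: a twin-generated calibration
# curve maps coil temperature to oil level, and field readings are
# inverted through it. All values are illustrative assumptions.
import numpy as np

# Offline: twin-simulated steady-state coil temperature (deg C) at known
# oil fill levels (%); less oil means worse cooling, so hotter coils.
twin_temps = np.array([65.0, 72.0, 81.0, 93.0, 110.0])
oil_levels = np.array([100.0, 80.0, 60.0, 40.0, 20.0])

def estimate_oil_level(measured_temp_c):
    """Invert the twin's temperature/level curve for a live sensor reading."""
    # np.interp expects ascending x; twin_temps already is.
    return float(np.interp(measured_temp_c, twin_temps, oil_levels))

reading = estimate_oil_level(86.0)   # a live field measurement
```

A reading falling between calibration points is linearly interpolated, so e.g. 86 °C maps to an oil level between the 60% and 40% calibration runs.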
Lastly, Ajinkya discussed a generative design case study focusing on CFD simulations, where ML can adaptively learn the success certainty of simulation runs, cutting a process that takes hours down to mere minutes.
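The "success certainty" idea amounts to training a classifier on past runs so that parameter combinations unlikely to converge can be skipped before burning CFD hours. The talk gives no implementation details, so this is a generic sketch with synthetic features and labels, using plain logistic regression:

```python
# Sketch: predict which parameter combinations are likely to yield a
# successful (converged) CFD run. Logistic regression via gradient descent;
# the features, labels and "success" rule are synthetic assumptions.
import numpy as np

rng = np.random.default_rng(3)

# Each row: normalised design/solver parameters of one past simulation run
X = rng.standard_normal((500, 4))
# Synthetic ground truth: a run "succeeds" when a weighted score is positive
y = (X @ np.array([1.5, -2.0, 0.5, 0.0]) > 0).astype(float)

w = np.zeros(4)
for _ in range(500):                       # gradient descent on the log-loss
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (p - y) / len(y)

def success_certainty(params):
    """Estimated probability that a run with these parameters converges."""
    return 1.0 / (1.0 + np.exp(-(params @ w)))

accuracy = np.mean((success_certainty(X) > 0.5) == (y == 1))
```

Runs scoring low on `success_certainty` would be deprioritised or dropped, which is where the hours-to-minutes saving comes from.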