“Human beings rarely make mistakes. ML model gets 95% precision, whereas most human beings would have a 99.9% precision,” said Devashish Shankar, Principal Architect at Drishti. California-headquartered Drishti provides AI-powered video analytics technology to optimise manual assembly processes in the manufacturing industry. With over eight years of experience in AI, Shankar has been instrumental in developing Drishti’s data platform and core action recognition model.
During his talk at the MLDS Conference, ‘New developments in Deep Learning for unlikely industries’, Shankar outlined Drishti’s industrial applications of AI in manufacturing. The company uses deep learning and computer vision to automate the analysis of factory floor videos: cameras installed on assembly lines capture video, on which Drishti runs object detection, anomaly detection and action recognition. The resulting data is then sent to industrial engineers to improve the line. During the talk, Shankar discussed Drishti’s major AI use-cases.
Cycle detection and action detection
The first use-case is in cycle and action detection in the assembly line. Essentially, the cycle is a unit of work. “(The unit) comes into the station, some actions are performed on it, and then that unit leaves. The entire sequence is called a cycle,” Shankar said. Action detection is done within these cycles. Shankar illustrated how repetitive cycles are accompanied by data on the actions being performed, time taken to complete the cycle and time taken to complete each action.
A cycle starts when the unit arrives at the station and ends when it leaves, so finding cycle boundaries is handled with standard object detection techniques. “The neural network is operating at a frame level. Each frame detects if the unit is present and where it is present. On top of that, you do heuristics to define the cycle,” he added.
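The "heuristics on top of frame-level detection" idea can be sketched roughly as follows. This is a minimal illustration, not Drishti's implementation: the function name `detect_cycles` and the `min_frames` and `max_gap` thresholds are assumptions chosen for the example, and the per-frame flags stand in for the output of an object detector.

```python
from typing import List, Tuple


def detect_cycles(present: List[bool], min_frames: int = 3,
                  max_gap: int = 1) -> List[Tuple[int, int]]:
    """Turn per-frame 'unit present' flags into (start, end) frame ranges.

    Hypothetical heuristics (assumptions, not from the talk):
      max_gap    - dropouts of up to this many frames are bridged,
                   to tolerate detector flicker.
      min_frames - runs shorter than this are discarded as spurious.
    """
    cycles = []
    start = None      # first frame of the run currently being tracked
    last_seen = None  # most recent frame in which the unit was detected
    for i, flag in enumerate(present):
        if flag:
            if start is None:
                start = i
            last_seen = i
        elif start is not None and i - last_seen > max_gap:
            # The unit has been missing for too long: close the run.
            if last_seen - start + 1 >= min_frames:
                cycles.append((start, last_seen))
            start = None
    # Close a run that is still open when the video ends.
    if start is not None and last_seen - start + 1 >= min_frames:
        cycles.append((start, last_seen))
    return cycles


# Example detector output: one flag per frame (1 = unit detected).
flags = [bool(v) for v in [0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1]]
cycles = detect_cycles(flags)
```

Dividing the length of each range by the camera's frame rate would give the cycle time that the engineers review.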
Anomaly detection through configurable heuristics FSM
Manufacturers are expected to follow ‘standardised work’, a specific sequence of actions, and an incorrect sequence or missed steps can lead to product defects. Anomaly detection identifies outliers in the system. The neural network tracks the unit in its field of view, detects the actions and attaches them to the cycle. Shankar explained the machine’s responses to the cycles through a toy experiment, where, if the cycle is correct, the actions are shown as ‘successful’ and tagged in green. In the opposite case, the device red flags it.
Because cycles vary in complexity, Drishti’s finite state machine (FSM) can be reconfigured through configuration changes to define a custom rule engine for each station. Additionally, the semi-supervised model allows training on minimal object data.
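A per-station rule engine of this kind can be pictured as a small transition table that is checked against the detected actions of each cycle. The sketch below is illustrative only: the station configuration, the action names and the `validate_cycle` helper are all hypothetical and were not shown in the talk.

```python
from typing import Dict, List, Tuple

# Hypothetical station config: state -> {detected action: next state}.
# Changing this table is all it takes to define rules for another station.
STATION_FSM: Dict[str, Dict[str, str]] = {
    "start": {"pick_part": "picked"},
    "picked": {"place_part": "placed"},
    "placed": {"tighten_screw": "tightened"},
    "tightened": {"inspect": "done"},
}


def validate_cycle(actions: List[str], fsm: Dict[str, Dict[str, str]],
                   start: str = "start",
                   accept: str = "done") -> Tuple[bool, str]:
    """Walk the FSM over one cycle's actions and report the first anomaly."""
    state = start
    for idx, action in enumerate(actions):
        next_state = fsm.get(state, {}).get(action)
        if next_state is None:
            return False, f"unexpected '{action}' at step {idx} in state '{state}'"
        state = next_state
    if state != accept:
        return False, f"cycle ended in state '{state}', expected '{accept}'"
    return True, "ok"


good = validate_cycle(["pick_part", "place_part", "tighten_screw", "inspect"],
                      STATION_FSM)
missed = validate_cycle(["pick_part", "place_part", "inspect"], STATION_FSM)
```

Here `good` passes, while `missed` is flagged because the operator skipped the tightening step, which is the kind of case that would be shown in red.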
Variation in unit detection
A common issue in manufacturing is variation on the factory floor: unit sizes and locations, irregular trajectories, multiple units in the field of view, differences in operators’ hands and clothes, background changes, lighting changes and more. As a result, different frames in the video can show conflicting images. The deep learning model is therefore trained on different variations and sampling to reach the desired level of accuracy.
Action detection through 3D convolution models
Beyond the problems of unit detection, action detection is hard because the meaning of an action often cannot be understood from a single frame, without motion. Additional factors, such as left-handed versus right-handed operators or different ways of picking things up, make detection even more difficult.
Multiple frames are required to mitigate this problem. Drishti leverages 3D convolution models, in which several frames are stacked together into a spatiotemporal cube and processed by 3D convolutions. This architecture is built for action localisation over video classification through semi-supervised models.
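To see why stacking frames helps, consider a toy example in plain Python. It is only a sketch of the general 3D convolution idea, not Drishti's model: the clips, the kernel and the helper names are made up for illustration. Two clips share an identical middle frame, yet a kernel that spans two frames responds differently to rightward and leftward motion.

```python
from typing import List

Clip = List[List[List[float]]]  # indexed as clip[t][y][x]


def conv3d_valid(clip: Clip, kernel: Clip) -> Clip:
    """Naive 'valid' 3D cross-correlation over time, height and width."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def max_response(volume: Clip) -> float:
    """Largest activation anywhere in the output volume."""
    return max(v for plane in volume for row in plane for v in row)


# A bright pixel moving right, and the same pixel moving left.
# The middle frame [[0, 1, 0]] is identical in both clips.
moving_right = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
moving_left = [[[0, 0, 1]], [[0, 1, 0]], [[1, 0, 0]]]

# Hand-written kernel spanning 2 frames: pixel at x now, at x+1 next frame.
rightward_kernel = [[[1, 0]], [[0, 1]]]

right_score = max_response(conv3d_valid(moving_right, rightward_kernel))
left_score = max_response(conv3d_valid(moving_left, rightward_kernel))
```

In a trained network the kernels are learned rather than hand-written, and a deep learning framework would replace the nested loops, but the principle is the same: the temporal axis of the kernel is what lets the model tell the two motions apart.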
Overcoming machine inaccuracies
To alert operators to errors, a tablet is placed right in front of them. “While the workers are performing a cycle, if a mistake happens, they are warned right there,” Shankar said. “This is the way in which Drishti is augmenting human beings. We are making them more efficient.” However, humans are extremely accurate at their work, whereas machines tend to raise many false alarms.
To overcome the issue, the FSM-based validation engine is used to define these anomalies. The neural network’s accuracy is increased through continuous retraining loops, in which human annotators correct the data and feed it back into the network. Another technique uses beam search, borrowed from NLP: the model’s predicted actions are split into ‘confident’ and ‘non-confident’, and possible combinations are then tried to pick the one closest to the standard sequence. The language model then takes the sequence with the highest log probability as the final sequence. “This allows us to tune anomaly precision versus anomaly recall,” he added.
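The beam search step can be sketched as below. Everything specific here is an assumption made for illustration: the 0.9 confidence threshold, the bigram "language model" built around a standard sequence, the fallback probability for unseen transitions, and the `lm_weight` knob. The talk did not describe the exact scoring, so treat this as one plausible reading rather than Drishti's method.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical bigram language model around the standard sequence
# pick -> place -> screw; "<s>" marks the start of a cycle.
BIGRAM_LOGP: Dict[Tuple[str, str], float] = {
    ("<s>", "pick"): math.log(0.98),
    ("pick", "place"): math.log(0.98),
    ("place", "screw"): math.log(0.98),
}
UNSEEN_LOGP = math.log(0.01)  # assumed penalty for any other transition


def beam_search(slots: List[Dict[str, float]], beam_width: int = 3,
                confident: float = 0.9,
                lm_weight: float = 1.0) -> Tuple[List[str], float]:
    """Decode one cycle; each slot maps candidate actions to model probabilities."""
    beams = [(0.0, ["<s>"])]
    for probs in slots:
        top_action, top_p = max(probs.items(), key=lambda kv: kv[1])
        # Confident slots keep the top prediction; others branch over all candidates.
        candidates = {top_action: top_p} if top_p >= confident else probs
        expanded = []
        for score, seq in beams:
            for action, p in candidates.items():
                lm = BIGRAM_LOGP.get((seq[-1], action), UNSEEN_LOGP)
                expanded.append((score + math.log(p) + lm_weight * lm,
                                 seq + [action]))
        expanded.sort(key=lambda b: b[0], reverse=True)
        beams = expanded[:beam_width]
    best_score, best_seq = beams[0]
    return best_seq[1:], best_score


# The second slot is 'non-confident': the model slightly prefers "screw".
slots = [
    {"pick": 0.97, "place": 0.03},
    {"place": 0.45, "screw": 0.55},
    {"screw": 0.95, "place": 0.05},
]
with_lm, _ = beam_search(slots, lm_weight=1.0)
without_lm, _ = beam_search(slots, lm_weight=0.0)
```

With the language model switched on, the uncertain second slot is resolved to the standard step and no alarm is raised; with `lm_weight=0.0` the raw prediction stands and the cycle would be flagged. A weight of this kind is one way such a system could trade anomaly precision against anomaly recall.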
ML infrastructure at Drishti
Drishti’s in-house infrastructure, ML Gym, is responsible for continuous training and evaluation. The pipeline has three steps: raw video collected from the factory is randomly fed into ML Gym as datasets; the datasets then go to the ‘Drishti Annotation Service’ for data labelling; and the labelled datasets finally go through ML training pipelines.