Facebook AI recently launched an open-source machine learning library, PyTouch, to process touch sensing signals. It provides state-of-the-art touch processing capabilities as a service to unify the tactile sensing community, and help build scalable, proven, performance-validated modules. The library is currently available on GitHub.
With the increased availability of tactile sensors, the sense of touch is becoming a new paradigm in robotics and machine learning. However, ready-to-use touch processing software is limited, resulting in a high entry barrier for budding developers. The processing of raw sensor measurements into high-level features is challenging.
Computer vision, on the other hand, has well-established algorithmic and programmatic methods for understanding images and videos. Popular open-source libraries such as Google’s TensorFlow, PyTorch, Caffe and OpenCV have further accelerated research by providing unified interfaces, algorithms and platforms.
Even though tools like PyTorch and Caffe can be used for touch processing, precursor development is needed before they can support the algorithms that touch experiments and research require. PyTouch provides an entry point here. The library has been designed to support beginners as well as experts.
With PyTouch, Facebook aims to help researchers develop machine learning models that seamlessly process touch sensing signals. “Sensing the world through touch opens exciting new challenges and opportunities to measure, understand and interact with the world around us,” said Facebook.
“We believe that similar to computer vision, the availability of open-source and maintained software libraries for processing touch reading would lessen the barrier of entry to tactile based tasks, experimentation, and research in the touch sensing domain,” said Facebook.
In a paper called ‘PyTouch: A Machine Learning Library for Touch Processing,’ co-authored by Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell, and Roberto Calandra, the researchers describe the architectural choices behind the library and demonstrate its capabilities and benefits through several experiments.
The image depicts PyTouch architecture, where tactile touch processing is delivered to the end application ‘as a service’ through released pre-trained models. (Source: arXiv.org)
As shown in the image above, the software library modularises a set of commonly used tactile-processing functions valuable for various downstream tasks like tactile manipulation, object recognition based on touch, slip detection, etc. With this architecture, PyTouch is dialling up the efforts to standardise robotics and machine learning research for better benchmarks and more reproducible results.
Most importantly, the library aims to standardise how touch-based experiments are designed and looks to reduce the amount of individual software developed, keeping the PyTouch library as a foundation for expanding future research applications.
- PyTouch is built on the machine learning framework PyTorch.
- Built on a library of pre-trained models, PyTouch provides real-time touch processing functionalities.
- Provides functions such as contact classification, slip detection and contact area estimation, as well as interfaces for training and transfer learning.
- The library can train models using data from other vision or non-vision based tactile sensors.
- PyTouch allows performance benchmarking in real-world experiments, helping to create baselines for tactile tasks.
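To give a feel for what functions like contact classification and contact area estimation compute on a vision-based tactile sensor, here is a minimal sketch in plain Python. It is not PyTouch's API or method: the function names, the thresholds and the frame-differencing approach are all assumptions chosen for illustration, whereas PyTouch itself relies on pre-trained models.

```python
# Illustrative only: hypothetical helpers, NOT part of the PyTouch API.
# Technique: compare a tactile image against a no-contact reference frame.

def contact_mask(reference, frame, threshold=30):
    """Mark pixels whose intensity differs from the no-contact reference."""
    return [
        [abs(f - r) > threshold for f, r in zip(frame_row, ref_row)]
        for frame_row, ref_row in zip(frame, reference)
    ]


def contact_area_fraction(mask):
    """Fraction of pixels flagged as being in contact."""
    total = sum(len(row) for row in mask)
    touched = sum(sum(row) for row in mask)
    return touched / total if total else 0.0


def is_touching(reference, frame, threshold=30, min_fraction=0.05):
    """Binary touch detection: contact if enough pixels have changed."""
    mask = contact_mask(reference, frame, threshold)
    return contact_area_fraction(mask) >= min_fraction


# Toy 4x4 grayscale frames (values 0-255), invented for this example.
reference = [[100] * 4 for _ in range(4)]
pressed = [row[:] for row in reference]
for r in (1, 2):
    for c in (1, 2):
        pressed[r][c] = 180  # a 2x2 patch deformed by contact

print(contact_area_fraction(contact_mask(reference, pressed)))  # 0.25
print(is_touching(reference, pressed))    # True
print(is_touching(reference, reference))  # False
```

A fixed threshold like this is sensitive to lighting and sensor differences, which is one reason learned models are attractive for these tasks.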
“Finally, in hand with the framework, we have released a set of pre-trained models which PyTouch uses in the background for tactile based tasks,” said Facebook.
Facebook has evaluated the performance of machine learning models trained across different models of vision-based tactile sensors, including DIGIT, OmniTact and GelSight.
The above table shows the classification accuracy [%] of touch detection (mean and standard deviation) using cross-validation (k = 5). The joint models are trained with data from all three sensors: DIGIT, OmniTact and GelSight. The cross-validation accuracy with varying train dataset size for single and joint models is shown below.
The experiments showed that, for the same amount of data, training a joint model using data across multiple sensors (DIGIT, OmniTact and GelSight) results in better model performance than training on data from a single sensor.
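The evaluation procedure described here, k-fold cross-validation with k = 5 reported as a mean and standard deviation, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's code: the threshold "model", the helper names and the per-sensor data are all invented, and a joint dataset is approximated simply by pooling samples from two hypothetical sensors.

```python
import statistics


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_threshold(samples):
    """Toy 'model': midpoint between the mean feature of each class."""
    touch = [x for x, y in samples if y == 1]
    no_touch = [x for x, y in samples if y == 0]
    return (statistics.mean(touch) + statistics.mean(no_touch)) / 2


def cross_validate(samples, k=5):
    """Return (mean, standard deviation) of accuracy [%] over k folds."""
    accuracies = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train = [s for i, s in enumerate(samples) if i not in held_out]
        test = [samples[i] for i in fold]
        threshold = fit_threshold(train)
        correct = sum((x > threshold) == bool(y) for x, y in test)
        accuracies.append(100.0 * correct / len(test))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Invented (feature, label) pairs; label 1 = touch, 0 = no touch.
sensor_a = [(0.2, 0), (0.8, 1)] * 5
sensor_b = [(0.3, 0), (0.9, 1)] * 5
joint = sensor_a + sensor_b  # pooled data from both hypothetical sensors

for name, data in [("single", sensor_a), ("joint", joint)]:
    mean_acc, std_acc = cross_validate(data, k=5)
    print(f"{name}: {mean_acc:.1f} +/- {std_acc:.1f}")
```

With real data the samples would normally be shuffled before splitting, and the toy classifier would be replaced by a trained network; the toy data here is trivially separable, so it says nothing about how single and joint models actually compare.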
Examples of data used in training touch prediction models. The dataset includes data across several DIGIT, OmniTact and GelSight sensors, showing different lighting conditions and objects of various spatial resolutions. (Source: arXiv.org)
Facebook is looking to create an extendable library for touch processing similar to what PyTorch and OpenCV are for computer vision.
PyTouch is still in the early days. With multiple pre-trained models in place, it will allow researchers to focus on rapid prototyping. “We believe that this would beneficially impact the robotic and machine learning community by enabling new capabilities and accelerate research,” concluded Facebook.
Amit Raja Naik is a senior writer at Analytics India Magazine, where he dives deep into the latest technology innovations. He is also a professional bass player.