In recent years, deep learning has driven breakthroughs in domains like audio processing, computer vision and robotics. Popular deep learning frameworks like TensorFlow and Keras, among others, have immensely eased the development and training of complex neural network models.
According to the researchers, most deep learning models require several tensor manipulations for data processing, custom loss functions and accuracy metrics. Deep learning frameworks offer a large number of functionalities, which makes them powerful but often difficult to navigate.
For instance, TensorFlow has about 2,000 distinct symbols including aliases, of which around 500 are tensor-manipulating operations, so finding the right ones for a given task can be a challenge for beginners and even experienced software developers.
To mitigate such issues, the researchers created the TF-Coder tool to automatically synthesise tensor manipulation programs. TF-Coder uses a bottom-up weighted enumerative search, along with value-based pruning of equivalent expressions and flexible type- and value-based filtering, to ensure that expressions adhere to the various requirements imposed by the TensorFlow library.
Behind TF-Coder
TF-Coder is a synthesis tool for automatically generating tensor manipulation programs in TensorFlow from examples and natural language. The synthesis algorithm in TF-Coder is built upon the bottom-up enumerative algorithm proposed earlier in Transit, a language and prototype implementation for distributed protocols.
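A task of this kind might be specified by an input/output example. The sketch below is a hypothetical illustration (plain Python lists stand in for tensors): given two vectors, produce the matrix of their pairwise sums.

```python
# Hypothetical task: given two vectors, produce the matrix of their pairwise
# sums. TF-Coder would search for a TensorFlow expression such as
# tf.add(tf.expand_dims(rows, 1), cols), which relies on broadcasting.
rows = [10, 20, 30]   # input 1
cols = [1, 2, 3]      # input 2
expected = [[11, 12, 13],
            [21, 22, 23],
            [31, 32, 33]]  # desired output

# Pure-Python reference of the broadcasted addition, for illustration only:
result = [[r + c for c in cols] for r in rows]
assert result == expected
```

Given just the inputs and the expected output (and optionally a description like "pairwise sums of two vectors"), the tool searches for a TensorFlow expression that maps one to the other.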
TF-Coder uses two machine learning models to predict which operations are needed, one conditioned on features of the input/output tensors and the other on a natural language description of the task. These predictions are then combined within a general framework that modifies operation weights to customise the search for the given task.
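A minimal sketch of this combination, assuming a hypothetical multiplicative discount per endorsing model (not the paper's exact scheme):

```python
# Hypothetical sketch (not the paper's exact scheme): each model proposes a
# set of likely operations, and an operation's base search weight is scaled
# down once per model that endorses it, so endorsed operations are tried
# earlier in the enumerative search.
base_weights = {"tf.add": 2, "tf.expand_dims": 3, "tf.gather": 4}
tensor_model_picks = {"tf.expand_dims", "tf.add"}  # from input/output features
nl_model_picks = {"tf.add"}                        # from the task description
DISCOUNT = 0.75  # hypothetical per-model multiplier

weights = {}
for op, base in base_weights.items():
    w = base
    for picks in (tensor_model_picks, nl_model_picks):
        if op in picks:
            w *= DISCOUNT
    weights[op] = max(1, round(w))  # weights stay positive integers
```

Because both models are independent, an operation endorsed by both (here `tf.add`) ends up markedly cheaper than one endorsed by neither (`tf.gather`).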
The researchers introduced three key ideas in the synthesis algorithm. Firstly, they introduced per-operation weights to the prior algorithm, allowing TF-Coder to enumerate over TensorFlow expressions in order of increasing complexity. Secondly, they introduced a novel, flexible, and efficient type- and value-based filtering system that handles arbitrary constraints imposed by the TensorFlow library, such as “the two tensor arguments must have broadcastable shapes.” Finally, they developed a framework to combine predictions from multiple independent machine learning models that choose operations to prioritise during the search, conditioned on features of the input and output tensors and a natural language description of the task.
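The core of the first idea can be sketched in a few lines. The toy below (over integer arithmetic rather than tensors, and far simpler than TF-Coder's actual implementation) enumerates expressions in order of increasing total weight and prunes any expression whose value has already been seen:

```python
# Toy sketch of bottom-up weighted enumerative search with value-based
# pruning. Each operation carries a weight; expressions are enumerated in
# order of total weight, and any expression whose value has been produced
# before is discarded as equivalent to a cheaper one.

def synthesize(inputs, target, ops, max_weight=8):
    # inputs: dict mapping value -> variable name; each input has weight 1.
    # ops: list of (name, op_weight, binary function).
    values_by_weight = {1: dict(inputs)}
    seen = set(values_by_weight[1])          # value-based pruning set
    if target in seen:
        return values_by_weight[1][target]
    for w in range(2, max_weight + 1):
        level = {}
        for name, op_weight, fn in ops:
            # Split the remaining weight budget between the two arguments.
            for w1 in range(1, w - op_weight):
                w2 = w - op_weight - w1
                for v1, e1 in values_by_weight.get(w1, {}).items():
                    for v2, e2 in values_by_weight.get(w2, {}).items():
                        v = fn(v1, v2)
                        if v in seen:
                            continue          # prune equivalent expression
                        expr = f"{name}({e1}, {e2})"
                        if v == target:
                            return expr
                        seen.add(v)
                        level[v] = expr
        values_by_weight[w] = level
    return None

# Example: from inputs x = 5 and y = 3, find an expression evaluating to 40.
solution = synthesize({5: "x", 3: "y"}, 40,
                      [("add", 1, lambda a, b: a + b),
                       ("mul", 2, lambda a, b: a * b)])
# solution is "mul(x, add(x, y))", i.e. (x + y) * x
```

Because `mul` is assigned a higher weight than `add` in this sketch, multiplication-heavy expressions are reached later, mirroring how per-operation weights let TF-Coder explore simpler TensorFlow expressions first.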
The researchers evaluated TF-Coder on 70 real-world tensor transformation tasks from StackOverflow and from an industrial setting. They also trained models that predict TensorFlow operations from features of the input and output tensors and from natural language descriptions of tasks, and used these models to prioritise relevant operations during the search.
Contributions Of This Research
According to the researchers, a major contribution of this work is demonstrating that a highly optimised enumerative search strategy does scale to solve real-world tensor manipulation problems within seconds.
The contributions of this research are mentioned below:
- The researchers introduced TF-Coder, the first programming-by-example system for synthesising tensor manipulations in TensorFlow from input/output examples.
- They presented a weighted enumerative search algorithm that uses a novel two-stage filtering approach to efficiently enforce arbitrary preconditions required by the operations.
- They demonstrated a framework in which multiple independent machine learning models, each trained to predict useful TensorFlow operations for a given problem, are combined to guide the weighted enumerative search.
- TF-Coder is evaluated on real-world tasks taken from StackOverflow and an industrial setting, showing that TF-Coder outperforms prior synthesis techniques and even professional programmers.
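The two-stage filtering mentioned above can be illustrated with a small sketch (hypothetical helper names, not TF-Coder's code; the broadcastability rule follows NumPy/TensorFlow semantics):

```python
# Toy sketch of two-stage filtering. Stage 1 checks each candidate argument
# in isolation; stage 2 checks the full argument tuple, enforcing constraints
# such as "the two tensor arguments must have broadcastable shapes".

def is_shape(v):
    # Stage 1 filter: the argument must be a tuple of positive ints.
    return isinstance(v, tuple) and all(isinstance(d, int) and d > 0 for d in v)

def broadcastable(shape_a, shape_b):
    # NumPy/TensorFlow broadcasting: align trailing dimensions; each aligned
    # pair of sizes must be equal or contain a 1.
    for a, b in zip(reversed(shape_a), reversed(shape_b)):
        if a != b and a != 1 and b != 1:
            return False
    return True

def combination_ok(args):
    # Stage 2 filter for a broadcasting binary op such as tf.add.
    return broadcastable(args[0], args[1])

candidates = [(2, 3), (3,), (4,), "not a shape"]
valid = [v for v in candidates if is_shape(v)]                            # stage 1
pairs = [(a, b) for a in valid for b in valid if combination_ok((a, b))]  # stage 2
```

The cheap per-argument check runs once per candidate value, while the more expensive combination check runs only on argument tuples that survive stage 1, which keeps the filtering overhead low during the search.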
Benefits of TF-Coder
Some of the benefits of using this tool are:
- TF-Coder can automatically find relevant TensorFlow operations, thus reducing the need to search through TensorFlow’s extensive documentation.
- The techniques in TF-Coder are particularly effective for synthesising programs in the domain of tensor manipulation.
- TF-Coder often produces solutions that are simpler than those written by TensorFlow experts.
- TF-Coder supports multi-modal specifications such as input-output examples and natural language descriptions.
- TF-Coder can help users find elegant solutions for difficult tensor transformations.
Wrapping Up
According to the researchers, the TF-Coder tool solves 63 of the 70 real-world tensor transformation tasks drawn from StackOverflow and an industrial setting, and these 63 solutions were synthesised in 17 seconds on average.
They also showed that the tool achieved superhuman performance on a range of real problems from StackOverflow, finding solutions that are simpler than those written by TensorFlow experts, and in less time.
Furthermore, the framework for incorporating multiple trained models leads to significantly faster synthesis, 35.4% faster on average, and to solutions more elegant than those written by TensorFlow experts.
Read the paper here.