Machine learning engineers and data scientists are the backbone of any organisation, and even more so for blue-chip companies like Netflix, Facebook and Google. These companies hold vast amounts of data that they mine for insights to make informed decisions. That is why they continuously strive to give their data scientists better environments for data analysis.
To give data scientists a more efficient platform, Netflix has launched Polynote, a polyglot notebook with multi-language interoperability, first-class Scala support, Apache Spark integration, Python support, and more.
The notebook was developed because of the difficulties existing notebooks have in supporting Scala. It is designed to let data scientists integrate seamlessly with Netflix’s JVM-based machine learning platform, which uses Scala.
While the notebook will help data scientists solve problems specific to Netflix, it should also ease similar challenges in real-world use cases outside Netflix.
The Polyglot Notebook
The notebook sets out to reimagine the Scala notebook experience and speed up ML innovation by removing complexity, so that developers can concentrate on analysis. Polynote is available from polynote.org and is designed to run without package-related environment conflicts. Python developers often work within a specific environment, whereas Scala developers tend to rely on cluster computing, and combining the two frequently complicates analysis. With Polynote, Netflix aims to mitigate the challenges around conflicting environments.
Polynote also provides reproducibility, the lack of which has been one of the longest-standing problems with existing notebooks. In addition, Polynote gives developers more flexibility by supporting multiple languages in the same notebook. In other words, developers can use more than one programming language within a single notebook for their analysis.
Polynote Features At A Glance
- Editing improvements
- Dependency management
- Data visualisation
Polynote allows developers to use a different programming language in each cell, so a data scientist can use Scala for data wrangling and Python for building ML models. Such flexibility could reshape the data analysis process, as developers are free to switch between programming languages based on their proficiency.
To facilitate multi-language programming, each language's interpreter passes its cell's input and output values to the kernel, which manages the data regardless of the language used.
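The idea of a kernel that holds values on behalf of several interpreters can be illustrated with a minimal Python sketch. This is not Polynote's implementation or API: the `Kernel` class, the `run_cell` method and the toy second "language" are all hypothetical names invented for illustration, and the toy language simply stands in for a real one such as Scala.

```python
from typing import Any, Callable, Dict

# Hypothetical type: an interpreter takes cell source plus the shared values
# and returns the values the cell produced.
Interpreter = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def python_interpreter(source: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Run a Python cell with the shared values as its starting namespace."""
    namespace = dict(inputs)
    exec(source, namespace)
    # Drop private names such as __builtins__, which exec adds to the namespace.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


def toy_interpreter(source: str, inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Toy stand-in for a second language; each line is 'target = func name'."""
    functions = {"sum": sum, "max": max, "min": min, "len": len}
    outputs: Dict[str, Any] = {}
    for line in source.strip().splitlines():
        target, expression = (part.strip() for part in line.split("="))
        func_name, arg_name = expression.split()
        outputs[target] = functions[func_name](inputs[arg_name])
    return outputs


class Kernel:
    """Hypothetical kernel: owns the shared values, whatever language made them."""

    def __init__(self) -> None:
        self.values: Dict[str, Any] = {}
        self.interpreters: Dict[str, Interpreter] = {}

    def register(self, language: str, interpreter: Interpreter) -> None:
        self.interpreters[language] = interpreter

    def run_cell(self, language: str, source: str) -> Dict[str, Any]:
        # Hand the interpreter a copy of the current values, then store its outputs.
        outputs = self.interpreters[language](source, dict(self.values))
        self.values.update(outputs)
        return outputs


kernel = Kernel()
kernel.register("python", python_interpreter)
kernel.register("toy", toy_interpreter)

kernel.run_cell("python", "ratings = [3, 5, 4]")          # cell 1: Python
kernel.run_cell("toy", "best = max ratings")              # cell 2: toy language
kernel.run_cell("python", "summary = f'best rating: {best}'")  # cell 3: Python
print(kernel.values["summary"])  # prints "best rating: 5"
```

In this sketch the value `ratings` defined in a Python cell is read by a cell in another language, whose result `best` is then visible to the next Python cell. A real polyglot notebook also has to translate data types between language runtimes, which this example sidesteps.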
The editing improvements were also crucial, as existing notebooks are not intuitive when writing or changing code. Polynote provides parameter hints, error highlighting, interactive autocomplete, and more, much like an IDE such as IntelliJ. This saves developers keystrokes and helps them write code quickly.
Dependency management is another strength: Polynote lets packages be configured for each notebook. Developers can manage dependencies directly in the notebook through its user-friendly configuration section.
While we have covered some of the most important advantages, Polynote has other features such as visibility and data visualisation, with plots using matplotlib and Vega. Together these make the notebook a strong option for machine learning work, as it addresses many problems developers have faced with existing notebooks.
Companies are committed to improving the developer experience by delivering robust tools that streamline analysis. Uber and Facebook, for example, have released Ludwig and Pythia respectively. While the former is built on Google’s TensorFlow framework, the latter is used for deep learning on image and language models.
Now, with Netflix’s Polynote, data scientists can improve their workflow and concentrate on exploring data to make informed decisions. Polynote also gives insight into the internal state of the notebook, showing critical information about the kernel.
Netflix is optimistic about the launch and expects Polynote to become foundational for the machine learning community, helping developers reach new heights and democratising data science for business growth.