Autonomous driving is still at a nascent stage, but it has immense potential to deliver safe and efficient transportation in the future. Self-driving cars combine a range of sensors and techniques, such as Lidar, GPS and computer vision, among others. The sector is clearly on an upward trajectory: it is estimated to grow at a CAGR of 38.6% over the forecast period of 2017-2027.
Last year, Lyft open-sourced a large-scale dataset known as the Level 5 dataset, which features raw camera and Lidar sensor inputs as perceived by a fleet of high-end autonomous vehicles in a restricted geographic area. This time, the San Francisco-based ridesharing company has open-sourced Flyte, a cloud-native machine learning and data processing platform.
Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for machine learning and data processing. It uses protocol buffers as the specification language for workflows and tasks. Flyte comes with Flytekit, a Python SDK for developing applications on Flyte, whether that means authoring workflows or tasks.
According to the developers at Lyft, Flyte has been serving production model training and data processing at the company for over three years and has become the de facto platform for teams such as pricing, locations, estimated time of arrival (ETA), mapping and self-driving (L5), among others.
Features of Flyte
The platform has five key features:
- Hosted, Multi-Tenant & Serverless: Flyte is built directly on Kubernetes, which brings portability, scalability and reliability, among other benefits. As a multi-tenant service, it lets developers deploy, scale and work on their own repos.
- Elastic Scale: With a fully distributed, fault-tolerant control plane, Flyte can scale to multiple clusters, thousands of nodes and thousands of concurrent workflows.
- Parameters, Data Lineage & Caching: All tasks and workflows in Flyte have strongly typed inputs and outputs, which makes it possible to parameterise workflows, maintain rich data lineage and reuse cached versions of pre-computed artefacts.
- Versioned, Reproducible & Shareable: Every entity in Flyte is immutable, and every change is explicitly captured as a new version, which makes it easy and efficient for a developer to iterate, experiment and roll back workflows.
- Dynamic and Extensible: Flyte is a framework-agnostic system with a collection of plugins that cover a wide range of workflow needs, such as Spark on K8s, AWS Batch, Array Jobs, Hive Qubole, Containers, Pods and more.
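The caching idea in the feature list can be illustrated with a small, self-contained sketch. It does not use Flytekit and is not how Flyte implements caching. The decorator name `cached_task`, the in-memory `_CACHE` store and the example task `scale_fare` are all hypothetical, and the type check assumes plain annotations such as `int` or `float`. The point is only that typed inputs plus an explicit cache version are enough to decide whether a pre-computed result can be reused.

```python
import functools
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory artefact store; a real platform would persist results.
_CACHE: Dict[str, Any] = {}


def cached_task(cache_version: str) -> Callable:
    """Memoise a task on (task name, cache version, keyword inputs)."""

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(**inputs: Any) -> Any:
            # Check each input against the type annotation declared on the task.
            for name, value in inputs.items():
                expected = fn.__annotations__[name]
                if not isinstance(value, expected):
                    raise TypeError(f"{name} must be {expected.__name__}")
            # Build a deterministic key; sort_keys makes argument order irrelevant.
            key_source = json.dumps([fn.__name__, cache_version, inputs], sort_keys=True)
            key = hashlib.sha256(key_source.encode()).hexdigest()
            if key not in _CACHE:
                wrapper.misses += 1  # count how often the task body really runs
                _CACHE[key] = fn(**inputs)
            return _CACHE[key]

        wrapper.misses = 0
        return wrapper

    return decorator


@cached_task(cache_version="1.0")
def scale_fare(base_fare: float, multiplier: float) -> float:
    return base_fare * multiplier


first = scale_fare(base_fare=10.0, multiplier=1.5)   # computed
second = scale_fare(base_fare=10.0, multiplier=1.5)  # served from the cache
```

Changing `cache_version` or any input produces a different key, so the task body runs again; this is the same reasoning that lets a versioned platform decide when an old artefact is still valid.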
How It Works
The platform manages over 7,000 unique workflows at Lyft, for a total of over 100,000 executions every month, 1 million tasks and 10 million containers. Flyte also handles all the work involved in executing complex workflows, such as hardware provisioning, scheduling, data storage and monitoring. Workflows in Flyte are expressed as graphs, so to create one, a developer first needs to compose a graph with precise specifications. When a workflow executes, its tasks are launched in Docker containers.
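To make the "workflow as a graph" idea concrete, here is a minimal sketch in plain Python. It is not Flyte's specification format or execution engine: the function `run_workflow`, the task names and the fare values are invented for illustration, and the tasks run in-process rather than in Docker containers. It only shows how declaring dependencies up front lets a scheduler work out a valid execution order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List


def run_workflow(
    tasks: Dict[str, Callable[..., object]],
    deps: Dict[str, List[str]],
) -> Dict[str, object]:
    """Run every task after its upstream tasks, passing their outputs as arguments."""
    results: Dict[str, object] = {}
    # static_order() yields each node only after all of its predecessors,
    # and raises graphlib.CycleError if the graph contains a cycle.
    for name in TopologicalSorter(deps).static_order():
        upstream = [results[d] for d in deps.get(name, [])]
        results[name] = tasks[name](*upstream)
    return results


# Hypothetical four-task workflow: one source, two parallel branches, one sink.
tasks = {
    "load_trips": lambda: [12.0, 8.0, 20.0],
    "mean_fare": lambda trips: sum(trips) / len(trips),
    "max_fare": lambda trips: max(trips),
    "report": lambda mean, peak: f"mean={mean:.2f} max={peak:.2f}",
}
deps = {
    "load_trips": [],
    "mean_fare": ["load_trips"],
    "max_fare": ["load_trips"],
    "report": ["mean_fare", "max_fare"],
}

results = run_workflow(tasks, deps)
print(results["report"])
```

Because `mean_fare` and `max_fare` depend only on `load_trips`, a distributed scheduler would be free to run them concurrently; this sketch simply runs them one after the other.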
Organisations working on autonomous driving have been releasing a growing number of tools, datasets and techniques. Earlier on the same day that Lyft announced Flyte, its ride-hailing rival Uber announced that it had open-sourced Manifold, a model-agnostic visual debugging tool for machine learning. The tool helps identify performance issues across ML data slices and models, and then diagnose their root causes by surfacing feature distribution differences between subsets of data.
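The phrase "feature distribution differences between subsets of data" can be illustrated with a toy example. The sketch below is not Manifold's algorithm or API. It assumes a much simpler stand-in: split rows into a high-loss and a low-loss slice, then rank features by the gap between their slice means. The function `rank_feature_gaps`, the feature names and the numbers are all made up for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple

Row = Dict[str, float]


def rank_feature_gaps(
    rows: List[Row], loss_key: str, threshold: float
) -> List[Tuple[str, float]]:
    """Rank features by the absolute gap between high-loss and low-loss slice means."""
    high = [r for r in rows if r[loss_key] > threshold]
    low = [r for r in rows if r[loss_key] <= threshold]
    if not high or not low:
        raise ValueError("both slices must be non-empty")
    features = [k for k in rows[0] if k != loss_key]
    gaps = [
        (f, abs(mean([r[f] for r in high]) - mean([r[f] for r in low])))
        for f in features
    ]
    # Largest gap first: these features differ most between the two slices.
    return sorted(gaps, key=lambda g: g[1], reverse=True)


# Hypothetical per-row model losses for an ETA-style model.
rows = [
    {"trip_km": 2.0, "hour": 9.0, "loss": 0.1},
    {"trip_km": 3.0, "hour": 18.0, "loss": 0.2},
    {"trip_km": 25.0, "hour": 10.0, "loss": 0.9},
    {"trip_km": 30.0, "hour": 17.0, "loss": 0.8},
]

ranked = rank_feature_gaps(rows, loss_key="loss", threshold=0.5)
```

In this made-up data the poorly predicted rows are all long trips, so `trip_km` ranks first while `hour` shows no gap at all, which is the kind of signal that points towards a root cause.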