After Uber’s ‘Manifold’, Lyft Open Sources Its Cloud-Native Machine Learning Platform ‘Flyte’

Autonomous driving is still at a nascent stage, but it has immense potential to offer a safe and efficient transportation system in the future. Self-driving cars combine various techniques and tools, such as Lidar, computer vision and GPS sensors, among others. The sector is clearly on an upward trajectory: it is estimated to grow at a CAGR of 38.6% during the forecast period of 2017-2027.

Last year, Lyft open-sourced a large-scale dataset known as the Level 5 dataset. It featured the raw camera and Lidar sensor inputs as perceived by a fleet of multiple high-end autonomous vehicles in a restricted geographic area. This time, the San Francisco-based ridesharing company has open-sourced a cloud-native machine learning and data processing platform known as Flyte.

About Flyte

Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for machine learning and data processing. The platform uses protocol buffers as the specification language for workflows and tasks. Flyte comes with Flytekit, a Python SDK for developing applications on Flyte, whether that means authoring workflows or tasks.

According to the developers at Lyft, Flyte has been serving production model training and data processing at the company for over three years and has become the de facto platform for teams like pricing, locations, estimated time of arrival (ETA), mapping, self-driving (L5) and more.

Features of Flyte

The platform has five key features, described below:

  1. Hosted, Multi-Tenant & Serverless: Flyte is built directly on Kubernetes, which provides benefits such as portability, scalability and reliability. As a multi-tenant service, it lets a developer deploy, scale and work on their own repo.
  2. Elastic Scale: With a fully distributed, fault-tolerant control plane, Flyte can scale to multiple clusters, thousands of nodes and thousands of concurrent workflows.
  3. Parameters, Data Lineage & Caching: All tasks and workflows in Flyte have strongly typed inputs and outputs, which makes it possible to parameterise workflows, maintain rich data lineage and reuse cached versions of pre-computed artefacts.
  4. Versioned, Reproducible & Shareable: Every entity in Flyte is immutable, with every change explicitly captured as a new version, which makes it easy and efficient for a developer to iterate, experiment and roll back workflows.
  5. Dynamic and Extensible: Flyte is a framework-agnostic system and has a collection of plugins to assist with workflow needs, such as Spark on K8s, AWS Batch, Array Jobs, Hive Qubole, Containers, Pods and more.
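To make the caching and versioning features more concrete, here is a minimal sketch in plain Python. It is not Flyte's or Flytekit's API: the names `cache_key`, `run_cached`, `total_fare` and the in-memory `_CACHE` are hypothetical and exist only for illustration. The idea is that a task's output can be looked up by its name, a cache version and its inputs, so a repeated call with the same values reuses the stored result, while a new version forces a fresh run.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory store; a real platform would persist artefacts elsewhere.
_CACHE: Dict[str, Any] = {}


def cache_key(task_name: str, cache_version: str, inputs: Dict[str, Any]) -> str:
    """Build a deterministic key from the task name, cache version and inputs."""
    payload = json.dumps(
        {"task": task_name, "version": cache_version, "inputs": inputs},
        sort_keys=True,  # same inputs in any order produce the same key
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_cached(task: Callable[..., Any], cache_version: str, **inputs: Any) -> Any:
    """Run the task only when no output is stored for this exact key."""
    key = cache_key(task.__name__, cache_version, inputs)
    if key not in _CACHE:
        _CACHE[key] = task(**inputs)
    return _CACHE[key]


# Counter used only to show how many times the task body really executes.
CALLS = {"count": 0}


def total_fare(base: float, surge: float) -> float:
    """Illustrative task with typed inputs and a typed output."""
    CALLS["count"] += 1
    return base * surge
```

Calling `run_cached(total_fare, "1.0", base=10.0, surge=1.5)` twice would execute `total_fare` once and serve the second call from the cache; changing the version string to `"2.0"` would trigger a new execution, leaving the earlier result untouched.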

How It Works

The platform manages over 7,000 unique workflows at Lyft, totalling over 100,000 executions every month, 1 million tasks, and 10 million containers. Flyte also handles all the work involved in executing complex workflows, such as hardware provisioning, scheduling, data storage and monitoring. In Flyte, workflows are expressed as graphs: to create a workflow, a developer must first compose a graph with precise specifications. When a workflow is executed, its tasks are launched in Docker containers.
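The graph-based model can be illustrated with a small sketch, again in plain Python rather than Flytekit. The `Workflow` type, `run_workflow` and the three example tasks are hypothetical names, and the tasks run as ordinary in-process function calls instead of Docker containers. The sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Tuple

# Each node name maps to (task function, names of the upstream nodes it consumes).
Workflow = Dict[str, Tuple[Callable[..., Any], List[str]]]


def run_workflow(workflow: Workflow) -> Dict[str, Any]:
    """Run every task after all of the tasks it depends on."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (_, deps) in workflow.items()}
    results: Dict[str, Any] = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = workflow[name]
        # Pass upstream outputs positionally, in the declared order.
        results[name] = func(*(results[d] for d in deps))
    return results


def load_trips() -> List[float]:
    """Stand-in data source returning trip distances in km."""
    return [4.0, 6.0, 8.0]


def mean_distance(trips: List[float]) -> float:
    return sum(trips) / len(trips)


def report(mean: float) -> str:
    return f"mean trip distance: {mean:.1f} km"


# Declaration order does not matter; the dependencies define the execution order.
EXAMPLE: Workflow = {
    "report": (report, ["mean"]),
    "mean": (mean_distance, ["load"]),
    "load": (load_trips, []),
}
```

Running `run_workflow(EXAMPLE)` executes `load`, then `mean`, then `report`, whatever order the nodes were declared in. A cycle in the graph would make `TopologicalSorter` raise `graphlib.CycleError`, which is one reason to require a precise graph specification up front.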

Wrapping Up

Organisations are releasing a growing number of tools, datasets and techniques for autonomous driving. On the same day, shortly before Lyft announced the release of Flyte, its ride-hailing rival Uber announced that it was open-sourcing Manifold, a model-agnostic visual debugging tool for machine learning. The tool helps identify performance issues across ML data slices and models, and diagnose their root causes by surfacing feature distribution differences between subsets of data.

Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
