LinkedIn Open-Sources Dagli, A Machine Learning Library For Java

LinkedIn has recently announced the open-sourcing of Dagli, a machine learning library for Java and other JVM languages. The library is intended to make it easier for developers to write model pipelines that are bug-resistant, readable, modifiable, maintainable, and deployable without incurring technical debt.

According to one industry report, even as machine learning matures and finds new applications, about 50% of companies spend between 8 and 90 days deploying a single machine learning model, and 18% take longer than 90 days. Much of this can be attributed to difficulty scaling, challenges with model reproducibility, a lack of executive buy-in, and poor tooling.

With Dagli, the model pipeline is defined as a directed acyclic graph (DAG): a set of vertices connected by edges, each directed from one vertex to another. The same graph is used for both training and inference. Dagli's environment provides developers with pipeline definitions, near-ubiquitous immutability, and static typing.
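To make the directed-acyclic-graph idea concrete, here is a minimal plain-Java sketch. It does not use Dagli's API: every name in it (`DagPipelineSketch`, `Node`, `Input`, `Unary`, `Binary`, `buildPipeline`, `run`) is hypothetical and chosen only for illustration. It shows vertices with statically typed outputs, immutable vertex definitions, and edges expressed as references from a vertex to its parents.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

/** Illustrative only: hypothetical names, NOT Dagli's API. */
public class DagPipelineSketch {

    /** A vertex whose output type T is checked at compile time. */
    interface Node<T> {
        T eval(Map<Node<?>, Object> cache, String input);
    }

    /** Root vertex: stands in for the raw input text. */
    static final class Input implements Node<String> {
        @Override
        public String eval(Map<Node<?>, Object> cache, String input) {
            return input;
        }
    }

    /** Vertex with one incoming edge; all fields are final (immutable). */
    static final class Unary<A, T> implements Node<T> {
        private final Node<A> parent;
        private final Function<A, T> fn;

        Unary(Node<A> parent, Function<A, T> fn) {
            this.parent = parent;
            this.fn = fn;
        }

        @Override
        @SuppressWarnings("unchecked")
        public T eval(Map<Node<?>, Object> cache, String input) {
            // Memoize so a vertex shared by several children is computed once per run.
            if (!cache.containsKey(this)) {
                cache.put(this, fn.apply(parent.eval(cache, input)));
            }
            return (T) cache.get(this);
        }
    }

    /** Vertex with two incoming edges. */
    static final class Binary<A, B, T> implements Node<T> {
        private final Node<A> left;
        private final Node<B> right;
        private final BiFunction<A, B, T> fn;

        Binary(Node<A> left, Node<B> right, BiFunction<A, B, T> fn) {
            this.left = left;
            this.right = right;
            this.fn = fn;
        }

        @Override
        @SuppressWarnings("unchecked")
        public T eval(Map<Node<?>, Object> cache, String input) {
            if (!cache.containsKey(this)) {
                cache.put(this, fn.apply(left.eval(cache, input), right.eval(cache, input)));
            }
            return (T) cache.get(this);
        }
    }

    /** Builds a small graph: text -> tokens -> {hits, total} -> "hits/total". */
    static Node<String> buildPipeline() {
        Node<String> text = new Input();
        Node<List<String>> tokens =
            new Unary<>(text, s -> Arrays.asList(s.toLowerCase(Locale.ROOT).split("\\s+")));
        Node<Integer> total = new Unary<>(tokens, List::size);
        Node<Integer> hits =
            new Unary<>(tokens, t -> (int) t.stream().filter("java"::equals).count());
        return new Binary<>(hits, total, (h, n) -> h + "/" + n);
    }

    /** Evaluates the graph for one input, with a fresh per-run cache. */
    static <T> T run(Node<T> output, String input) {
        return output.eval(new HashMap<>(), input);
    }

    public static void main(String[] args) {
        System.out.println(run(buildPipeline(), "Java loves java")); // prints 2/3
    }
}
```

Because each vertex declares its input and output types, wiring a vertex to a parent of the wrong type fails at compile time rather than at run time. This is the kind of logic bug that static typing in a pipeline definition is meant to catch.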

In a blog post, Jeff Pasternack, an NLP research scientist at LinkedIn, wrote that models are typically part of an integrated pipeline, and that constructing, training, and deploying these pipelines to production remains a challenging task. “Duplicated or extraneous work is often required to accommodate both training and inference, engendering brittle ‘glue’ code that complicates future evolution and maintenance of the model,” stated Pasternack.

Dagli works on servers, Hadoop, command-line interfaces, IDEs, and other typical JVM contexts. It also comes with many ready-to-use built-in pipeline components, including neural networks, gradient boosted decision trees, logistic regression, FastText, cross-validation, feature selection, cross-training, data readers, evaluation, and feature transformations.

For experienced data scientists, Dagli offers a path to production-ready AI models that are maintainable and extensible in the long term and that can leverage an existing JVM technology stack. For less experienced software engineers, the library provides an API that can be used with a JVM language and its tooling to avoid typical logic bugs.

According to Pasternack, Dagli was created to make efficient, production-ready models easier to write, revise, and deploy, while avoiding technical debt and long-term maintenance challenges. Dagli also leverages modern, highly multicore processors and powerful graphics cards for effective single-machine training of real-world models.

The launch of Dagli comes after LinkedIn released the LinkedIn Fairness Toolkit (LiFT), an open-source software library designed to enable the measurement of fairness in AI and machine learning workflows. LinkedIn also debuted DeText, an open-source framework for NLP-related ranking, classification, and language generation tasks. DeText leverages semantic matching, using deep neural networks to understand member intents in search and recommender systems.


Sejuti Das

Sejuti currently works as Associate Editor at Analytics India Magazine (AIM). Reach out at sejuti.das@analyticsindiamag.com