
DeepLearning.AI Launches New Course on Handling Unstructured Data for LLMs

Taught by Matt Robinson, head of product at Unstructured, the course is free for a limited time and takes about an hour to complete. 

Andrew Ng has rolled out a new course called “Preprocessing Unstructured Data for LLM Applications,” this time in collaboration with San Francisco-based startup Unstructured. Unstructured essentially captures unstructured data wherever it is stored and transforms it into AI-friendly JSON files for companies eager to incorporate AI into their business.

You’ll learn to extract and standardise content from various document types, such as PDF, PowerPoint, Word, and HTML files, as well as tables and images, into a common JSON format. This broadens the range of information available to your LLM applications. Enriching that content with metadata improves retrieval-augmented generation (RAG) results and enables more nuanced search capabilities.
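To make the idea of a common JSON format concrete, here is a minimal sketch that uses only the Python standard library. It is not Unstructured's API or schema: the element type names, the `TAG_TO_TYPE` mapping, the metadata keys, and the helper functions `html_to_elements` and `filter_by_type` are all assumptions made for illustration. It turns a small HTML snippet into a list of elements, each with a type, its text, and metadata that a retrieval step could later filter on.

```python
import json
from html.parser import HTMLParser

# Hypothetical mapping from HTML tags to element types (an assumption,
# not the schema used by Unstructured or taught in the course).
TAG_TO_TYPE = {"h1": "Title", "h2": "Title", "p": "NarrativeText", "li": "ListItem"}


class ElementExtractor(HTMLParser):
    """Collects the text of selected tags as normalised element dicts."""

    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.elements = []
        self._tag = None    # tag currently being captured, if any
        self._chunks = []   # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        # Start capturing only when we are not already inside a tracked tag.
        if tag in TAG_TO_TYPE and self._tag is None:
            self._tag = tag
            self._chunks = []

    def handle_data(self, data):
        if self._tag is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace so every source yields tidy text.
            text = " ".join("".join(self._chunks).split())
            if text:
                self.elements.append({
                    "type": TAG_TO_TYPE[tag],
                    "text": text,
                    "metadata": {
                        "filename": self.filename,
                        "filetype": "text/html",
                        "tag": tag,
                    },
                })
            self._tag = None


def html_to_elements(html, filename):
    """Parse an HTML string into a list of JSON-serialisable elements."""
    parser = ElementExtractor(filename)
    parser.feed(html)
    parser.close()
    return parser.elements


def filter_by_type(elements, element_type):
    """Metadata-style filtering: keep only elements of one type."""
    return [e for e in elements if e["type"] == element_type]


if __name__ == "__main__":
    sample = "<h1>Quarterly Report</h1><p>Revenue grew  in Q3.</p><ul><li>New markets</li></ul>"
    elements = html_to_elements(sample, "report.html")
    print(json.dumps(elements, indent=2))
    print(filter_by_type(elements, "Title"))
```

A parser for PDFs or slides could emit the same element shape, so whatever sits downstream (chunking, embedding, search filters) would only need to understand one format.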

The course covers techniques for document image analysis, including layout detection, vision transformers, and table transformers. You’ll discover how to apply these methods to preprocess PDFs, images, and tables. It is suitable for anyone who wants to process diverse data types and formats effectively in order to build high-performing LLM RAG systems.

Shritama Saha

Shritama (she/her) is a technology journalist at AIM who is passionate about exploring the influence of AI on domains including fashion, healthcare, and banking.