OpenAI Unveils Data Partnerships Program to Propel AGI Ambitions

OpenAI is interested in datasets that reflect human society, encompassing various modalities such as text, images, audio, or video

OpenAI has introduced ‘OpenAI Data Partnerships,’ inviting organisations to collaborate in producing both public and private datasets for training AI models. The initiative aims to enhance AI’s understanding of various subjects, industries, cultures, and languages, facilitating the development of AGI, as stated by OpenAI.

The ChatGPT creator is interested in large-scale datasets that reflect human society, encompassing various modalities such as text, images, audio, or video. The focus is on data expressing human intention, including long-form writing or conversations across different languages, topics, and formats.

The company has already partnered with several organisations to incorporate curated datasets into AI training. Notable collaborations include working with the Icelandic Government and Miðeind ehf to enhance GPT-4’s ability to comprehend the Icelandic language.

Additionally, OpenAI joined forces with the non-profit organisation Free Law Project, incorporating their extensive collection of legal documents into AI training, aiming to democratize access to legal understanding.

OpenAI offers organisations two ways to participate. The first involves creating an open-source dataset for training language models, promoting collaboration within the wider AI community. The second allows organisations to contribute private datasets, ensuring the confidentiality of sensitive information while enabling OpenAI’s models to gain a deeper understanding of specific domains.

Interestingly, at its first-ever DevDay, OpenAI launched the Copyright Shield program, which aims to provide financial support and legal defense to enterprise-level users of ChatGPT against copyright infringement claims.

While unveiling the program, Sam Altman emphasised the company’s efforts to ensure copyright compliance within its AI systems, which are trained on a combination of licensed and publicly available data sources.

Siddharth Jindal

Siddharth is a media graduate who loves to explore tech through journalism and to put forward ideas worth pondering in the era of artificial intelligence.