
Now, OpenAI API has text and code embeddings


OpenAI has introduced embeddings, a new endpoint in the OpenAI API, to assist in semantic search, clustering, topic modeling, and classification.

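OpenAI’s embeddings outperform top models in three standard benchmarks, including a 20% relative improvement in code search. Embeddings are useful for working with natural language and code because other machine learning models and algorithms, such as clustering or search, can readily consume and compare them.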

Embeddings that are numerically similar are also semantically similar. For example, the embedding vector of “canine companions say” will be more similar to the embedding vector of “woof” than to that of “meow.” The new endpoint by OpenAI uses neural network models to map text and code to a vector representation, “embedding” them in a high-dimensional space. Each dimension captures some aspect of the input.
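To make "numerically similar" concrete, here is a minimal sketch of cosine similarity, one common way to compare embedding vectors. The three-dimensional vectors are invented purely for illustration; real embeddings come from the model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumed values, not real model output).
canine = [0.9, 0.1, 0.2]  # stands in for "canine companions say"
woof = [0.8, 0.2, 0.1]    # stands in for "woof"
meow = [0.1, 0.9, 0.2]    # stands in for "meow"

sim_woof = cosine_similarity(canine, woof)
sim_meow = cosine_similarity(canine, meow)
print(sim_woof > sim_meow)  # True for these toy vectors
```

A score near 1 means the vectors point in almost the same direction; a score near 0 means they are close to unrelated.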

The company has released three families of embedding models for different functionalities including text similarity, text search, and code search. The models take either text or code as input and return an embedding vector.

Text similarity models

The text similarity models provide embeddings that capture the semantic similarity of pieces of text. These models are useful for many tasks including clustering, data visualization, and classification.
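As one illustration of how similarity embeddings might feed a classifier, the sketch below assigns a new vector to the label whose average embedding (centroid) it most resembles. The nearest-centroid approach, the labels, and the two-dimensional vectors are all assumptions chosen for brevity, not something the announcement prescribes.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical labeled embeddings (assumed values, not real model output).
labeled = {
    "sports": [[0.9, 0.1], [0.8, 0.3]],
    "finance": [[0.1, 0.9], [0.2, 0.8]],
}
centroids = {label: centroid(vecs) for label, vecs in labeled.items()}

def classify(vector):
    """Pick the label whose centroid is most similar to the vector."""
    return max(centroids, key=lambda label: cosine_similarity(vector, centroids[label]))

predicted = classify([0.7, 0.2])
print(predicted)  # "sports" for these toy vectors
```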

Text search models

The text search models provide embeddings that enable large-scale search tasks, such as finding a relevant document among a collection of documents given a text query. The documents and the query are embedded separately, and then cosine similarity is used to compare the query with each document. Such embedding-based search generalizes better than the word-overlap techniques used in classical keyword search, as it captures the semantic meaning of the text and is less sensitive to exact phrases or words.
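The ranking step described above can be sketched as follows. The document IDs, the query vector, and the `search` helper are hypothetical; in practice the vectors would come from the embeddings endpoint, with documents embedded once ahead of time and the query embedded at search time.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, doc_vecs, top_k=2):
    """Return the IDs of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:top_k]]

# Hypothetical precomputed document embeddings (assumed values).
doc_vecs = {
    "doc_dogs": [0.9, 0.1, 0.0],
    "doc_cats": [0.1, 0.9, 0.0],
    "doc_cars": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.3, 0.1]  # hypothetical embedding of the user's query

results = search(query_vec, doc_vecs)
print(results)  # ['doc_dogs', 'doc_cats'] for these toy vectors
```

Because the comparison works on vectors rather than shared words, a query can match a document that uses entirely different vocabulary, provided the two embeddings land close together.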

Code search models

Code search models provide code and text embeddings for code search tasks. Given a collection of code blocks, the task is to find the relevant code block for a natural language query. 

Find the embeddings documentation here.


Meeta Ramnani

Meeta’s interest lies in finding real, practical applications of technology. At AIM, she writes stories that question new inventions and the need to develop them. She believes that technology has changed, and will continue to change, the world very fast, and that it is no longer ‘cool’ to be ‘old-school’. If people don’t keep up with technology, they will surely be left behind.