What BloombergGPT Brings to the Finance Table

The latest LLM by Bloomberg, trained on about 700 billion tokens, is an ingredient model meant to enhance the Bloomberg Terminal service

Published on April 4, 2023

by Vandana Nair

Last week, Bloomberg released a research paper on its large language model, BloombergGPT. With 50 billion parameters, the LLM is billed as a first-of-its-kind generative AI model catering to the finance industry. While the move may set a precedent for other companies, for now, the announcement sounds like a push by the data and news company to seem relevant in the AI space.

Interestingly, Bloomberg already has Bloomberg Terminal, which employs NLP and ML-trained models for offering financial data. So, naturally, the question that arises is: how much of a value-add is BloombergGPT and where does it stand in comparison to other GPT models?

Training and Parameters

Bloomberg’s vast repository of financial data, collected over the past forty years, was used to train the model. The training corpus combines a 363-billion-token proprietary dataset of financial documents from Bloomberg with a 345-billion-token public dataset, for a total of just over 700 billion tokens.
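As a quick check on those figures, the two token counts sum to 708 billion, which the headline number rounds to 700 billion. The short sketch below uses only the counts quoted in this article; the variable names are illustrative.

```python
# Token counts as reported in the article, in billions.
proprietary_tokens = 363  # Bloomberg's own financial documents
public_tokens = 345       # public datasets

total_tokens = proprietary_tokens + public_tokens
proprietary_share = proprietary_tokens / total_tokens
public_share = public_tokens / total_tokens

print(f"total: {total_tokens}B tokens")                # total: 708B tokens
print(f"proprietary share: {proprietary_share:.1%}")  # proprietary share: 51.3%
print(f"public share: {public_share:.1%}")            # public share: 48.7%
```

In other words, the training mix is split roughly evenly between Bloomberg's own financial data and public text.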

The company claims that the new model will help improve its existing NLP tasks, such as sentiment analysis (a method that helps predict market prices), news classification, headline generation, question-answering, and other query-related tasks.

Here is an example of BloombergGPT being used to generate valid Bloomberg Query Language. As we have seen with other models like GPT-3, this model can, with a few examples in input prompt, utilize knowledge about stock tickers and financial terms to compose queries for data… pic.twitter.com/tMumrgnzX3
— elvis (@omarsar0) March 31, 2023
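The few-shot pattern described in the tweet above can be sketched roughly as follows. This is a minimal illustration of how such a prompt might be assembled, not Bloomberg's actual pipeline: the function name `build_prompt`, the `Input:`/`Output:` layout, and the example query strings are assumptions made for illustration, and the queries should not be taken as verified Bloomberg Query Language.

```python
# Hypothetical few-shot examples pairing plain-English requests with
# BQL-style queries. The query strings are illustrative only and are
# not verified Bloomberg Query Language.
FEW_SHOT_EXAMPLES = [
    ("Get me the last price and market cap for Apple",
     "get(px_last,cur_mkt_cap) for(['AAPL US Equity'])"),
    ("Tesla price",
     "get(px_last) for(['TSLA US Equity'])"),
]


def build_prompt(request, examples=FEW_SHOT_EXAMPLES):
    """Lay out the worked examples, then the new request with an empty Output slot."""
    blocks = [f"Input: {question}\nOutput: {query}" for question, query in examples]
    blocks.append(f"Input: {request}\nOutput:")
    return "\n\n".join(blocks)


prompt = build_prompt("Get the last price for Microsoft")
print(prompt)
```

The model is expected to continue the text after the final `Output:`, using the earlier pairs as a guide to the query format.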

On the face of it, the new LLM appears impressive, but it is still limited in its approach: it is not multilingual, it is prone to bias and toxicity, and it is a closed model.

Multilingual

BloombergGPT, a 50-billion-parameter ‘decoder-only causal language model’, is not trained on multilingual data. Its training dataset, called FinPile, includes news, filings, press releases, web-scraped financial documents, and social media drawn from the Bloomberg archives, all in English. For instance, the data from company press conferences consists of English-language transcripts produced through speech recognition. The lack of other languages limits the training data the model can draw on.
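For readers unfamiliar with the term, "causal" means that each token can only look at itself and the tokens before it when the model predicts what comes next. The sketch below is a generic illustration of that masking rule in plain Python, not BloombergGPT's implementation; the helper name `causal_mask` and the sample tokens are made up for the example.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


tokens = ["Shares", "of", "XYZ", "rose"]  # made-up sample sentence
mask = causal_mask(len(tokens))

for i, row in enumerate(mask):
    visible = [tok for tok, allowed in zip(tokens, row) if allowed]
    print(f"{tokens[i]!r} sees {visible}")
```

The first token sees only itself, while the last token sees the whole sequence, which is what lets a decoder-only model generate text one token at a time.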

BLOOM, which has a similar model architecture and software stack to BloombergGPT, is multilingual, and at 176 billion parameters it is also a much larger model. The same goes for GPT-3, a 175-billion-parameter model whose training data includes text in multiple languages.

Biases and Toxicity

Bloomberg has mentioned that the possibility of the “generation of harmful language remains an open question”. LLMs are known for their biases and hallucinations, a problem that other large models, such as ChatGPT, are also combatting. LLM bias can be highly detrimental when utilised in finance models, since predicting market sentiment correctly depends on accurate and factual information. However, BloombergGPT does not address this concern completely. The company is still evaluating the model and believes that “existing test procedures, risk and compliance controls” will help reduce the problem. Bloomberg is also studying its FinPile dataset, which it says contains less biased and toxic language than general web data, something that could help curb the generation of inappropriate content.

Closed Model

BloombergGPT is a closed model. The research paper describes the parameter count and general details of the model, but the weights themselves have not been released. Since the model is built on decades of Bloomberg data, much of it sensitive, it is unlikely that the LLM will be open sourced. Besides, the model is aimed at Bloomberg Terminal users, who already pay a subscription for the service. However, the company does have plans to release the training logs of the model.

In a conversation with AIM, Anju Kambadur, head of AI Engineering at Bloomberg, said: “BloombergGPT is about empowering and augmenting human professionals in finance with new capabilities to deal with numerical and computational concepts in a more accessible way.” Bloomberg has been using AI, machine learning and NLP for more than a decade, but each application required a custom model. “With BloombergGPT, we will be able to develop new applications quicker and faster, some of which have been thought about for years and not developed yet,” he said.

“Conversational English can be used to post queries using Bloomberg Query Language (BQL) to pinpoint data, which can then be imported into data science and portfolio management tools.”

Kambadur clarified that BloombergGPT is not a chatbot. “It is an ingredient model that we are using internally for product development and feature enhancement.” The model will help power AI-enabled applications like the Bloomberg Terminal, as well as back-end workflows within the company’s data operations. Clients may not engage with the model directly but will use it through Terminal functions in the future.

Comparison

Below is a comparison with two other models, GPT-NeoX (20B parameters) and FLAN-T5-XXL (11B parameters). BloombergGPT, which was trained on more recent information, answers the questions accurately where the other LLMs do not.

Source: arxiv.org

BloombergGPT fared better on financial tasks than open models of similar size. It was also evaluated on Bloomberg’s internal benchmarks and on general-purpose NLP benchmarks such as BIG-bench Hard, knowledge assessments, reading comprehension and linguistic tasks.

Vandana Nair

With a rare blend of engineering, MBA, and journalism degrees, Vandana Nair brings a unique combination of technical know-how, business acumen, and storytelling skills to the table. Her insatiable curiosity for all things startups, businesses, and AI technologies ensures that there's always a fresh and insightful perspective to her reporting.