Recently, Together, an open-source community led by researchers and technologists, released a new version of GPT-JT, a language model with six billion parameters. The model was built using recently published open-source techniques and datasets, and was trained with a decentralised approach on the Together Research Computer, a distributed compute network developed by the team.
Unlike GPT-3, which is available only on request through OpenAI's API, GPT-JT is available as open source.
Click here to access the code and datasets.
GPT-JT
GPT-JT came very close to GPT-3’s text-davinci-002 (175B) on classification benchmarks such as RAFT, evaluated under the Holistic Evaluation of Language Models (HELM) protocol.
Read: OpenAI Turns to Davinci to Make GPT-3 Better
Together claimed that the new model is a variant of its earlier GPT-J (6B) and performs well on text classification and other tasks. Banking on the power of open-source AI, the team said that this would not have been possible without the open-source work published by several organisations, including EleutherAI (GPT-J-6B, GPT-NeoX), Google Research (UL2, CoT), AllenAI’s Natural-Instructions (NI) dataset, BigScience’s Public Pool of Prompts (P3) dataset, Ought (RAFT), and Stanford CRFM (HELM).
Open. Scalable. Together
Founded in 2022, Together is a decentralised cloud for artificial intelligence that enables researchers, developers, and companies to leverage and improve AI with an intuitive platform combining data, models, and computation. The team said that the GPT-JT model is inspired by lo-fi and ProxSkip, developed by Ludwig Schmidt, Mitchell Wortsman, Peter Richtárik, and others. The company has hosted the model on Hugging Face, a hub for the open-source AI ecosystem.
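Because the weights are open, the model can be pulled directly with the Hugging Face transformers library. The following is a minimal sketch, assuming the checkpoint is published under the "togethercomputer/GPT-JT-6B-v1" identifier (an assumption, not confirmed by this article), showing a simple zero-shot classification-style prompt of the kind the model is reported to handle well.

```python
# Minimal sketch: load an open GPT-JT checkpoint from Hugging Face.
# The model ID below is an assumption; check the Together release page
# for the exact repository name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-JT-6B-v1"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A simple prompt in the style of the text-classification tasks
# mentioned above (sentiment as a completion).
prompt = "The restaurant was slow and the food was cold.\nSentiment:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that a six-billion-parameter model needs a machine with substantial memory; the hosted demo linked below avoids that requirement.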
Check out the live demo here.