Meta, the uncrowned king of open source, is going through tough times. The recent releases of Llama and Llama 2, praised as open-source language models, were followed by the departure of several scientists and engineers who had worked on Llama.
The reason behind their departure was an internal battle for computing resources with another Meta research team developing a rival model.
While the tech giant is grappling with internal issues, it is facing stiff competition from others willing to contribute to open-source.
Open-source LLMs have welcomed a new king, and it comes from the Middle East: TII's latest rendition of its Falcon model is leading the charts.
With 180 billion parameters and training on a massive 3.5-trillion-token dataset, Falcon 180B has forced the community to take notice. In terms of performance, it has secured its position, dominating the leaderboard for open-access models. While definitive rankings are hard to establish at this early stage, Falcon 180B is already drawing comparisons to PaLM-2, a testament to its prowess.
It is high time Meta rushed the release of Llama 3 if it wants to keep pace with the competition and not be left behind.
And Then There Were Two
Meta, however, isn't competing with anyone but OpenAI, which is touting multimodal capabilities and looking to integrate the latest iteration of its image-generation model, DALL·E 3.
In such an environment, the discussion surrounding Llama 3 is filled with diverse expectations and predictions. Many anticipate Llama 3 relying on high-quality training data, as Phi-1.5 did, to enhance its performance. There is also excitement about the potential for more training tokens and further exploration of scaling laws. Additionally, there is talk of a mixture-of-experts architecture, which routes inputs across several expert submodels and could outperform any individual expert or submodel.
Llama 3 is also expected to bring multimodal capabilities to open source. Meta could tap into the ecosystem of multimodal models built on Llama, such as mPLUG-Owl, LLaVA, MiniGPT-4, and BLIP-2.
Open Source Banks on Llama
Meta has become a pivotal player for smaller initiatives that depend on open-source LLMs. The open-source LLM leaderboard is filled with models fine-tuned on Llama; at least six of the top entries are Llama-based, from Uni-TianYan, FashionGPT, and sheep-duck to Orca and the GenZ model by Indian developers.
While Falcon presents a powerful alternative, there are doubts over its licensing.
In essence, these clauses mean that the Licensor reserves the right to modify the Acceptable Use Policy without explicitly notifying users, who are expected to adapt their usage to conform to the latest version of the policy. Failure to do so could potentially constitute a breach of the licence terms.
Forums from Reddit to Hacker News agreed on the significance of Meta's role in open-source LLM development and reacted with disappointment to the delay of Llama 3. According to a WSJ article, the company has not even started training it yet and will kick off the project in early to mid-2024.
The delay also means the open-source community will lag behind. Commenters reiterated that Meta's actions have a substantial impact on the availability of such models: if Meta chooses not to release Llama 3 as open source, it is unlikely that any other experienced and well-funded team, bar TII with Falcon, would give away a model that cost millions to develop.