Microsoft’s Thought-Out Plan for LLM Problems

Microsoft has (again) released an approach to make AI reason better.


As language models, for all their irrational moments, continue to influence every aspect of our lives, Microsoft has released an approach to make AI reason better. It’s called ‘Everything of Thoughts’ (XOT). The methodology draws inspiration from Google DeepMind’s AlphaZero, which pairs compact neural networks with tree search to outperform far larger systems.

The new XOT method was developed in collaboration with the Georgia Institute of Technology and East China Normal University. It blends reinforcement learning with Monte Carlo Tree Search (MCTS), techniques renowned for their effectiveness in complex decision-making.

These techniques together allow language models to generalise efficiently to unknown problems, the researchers said. The researchers’ trials on a variety of challenging tasks, such as the Game of 24, the 8-Puzzle, and the Pocket Cube, yielded impressive results.
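To make the search half of that recipe concrete, below is a minimal MCTS with UCT selection applied to a simplified Game of 24 (repeatedly combine two numbers with an arithmetic operation until one remains, aiming for 24). This is an illustrative sketch only, not Microsoft’s implementation: XOT guides the search with policy and value networks trained via reinforcement learning, whereas this sketch falls back on random rollouts, and the state encoding is my own assumption.

```python
import math
import random
from itertools import combinations

def children(state):
    """Every state reachable by combining two numbers with +, -, *, /."""
    out = []
    for i, j in combinations(range(len(state)), 2):
        rest = [state[k] for k in range(len(state)) if k not in (i, j)]
        a, b = state[i], state[j]
        results = {a + b, a * b, a - b, b - a}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        out.extend(tuple(sorted(rest + [r])) for r in results)
    return out

def reward(state):
    """1.0 if the single remaining number is 24, else 0.0."""
    return 1.0 if len(state) == 1 and abs(state[0] - 24) < 1e-6 else 0.0

def rollout(state):
    """Random playout to a terminal state (XOT would use a learned policy)."""
    while len(state) > 1:
        state = random.choice(children(state))
    return reward(state)

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.kids, self.N, self.W = [], 0, 0.0
        self.untried = children(state) if len(state) > 1 else []

def mcts(root_state, iters=2000, c=1.4):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # 1. Selection: descend via the UCT score while fully expanded.
        while not node.untried and node.kids:
            node = max(node.kids, key=lambda k:
                       k.W / k.N + c * math.sqrt(math.log(node.N) / k.N))
        # 2. Expansion: add one previously unexplored child.
        if node.untried:
            node.kids.append(Node(node.untried.pop(), node))
            node = node.kids[-1]
        # 3. Simulation: estimate the node's value with a random playout.
        r = rollout(node.state)
        # 4. Backpropagation: update visit counts and values up to the root.
        while node:
            node.N += 1
            node.W += r
            node = node.parent
    best = max(root.kids, key=lambda k: k.N)  # most-visited first move
    return best.state, root
```

On a trivial instance such as (3, 8), the search quickly concentrates its visits on the multiplication branch that reaches 24. XOT additionally lets the language model revise flawed thought trajectories with further calls, a step this sketch omits.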

XOT has outshone its contemporaries in tackling problems that have stymied other methods. This superiority, however, is not without limitations. The system, despite its advancements, has not reached a state of 100% reliability.

However, the research team views the framework as an effective approach for incorporating external knowledge into language model inference. They maintain that it improves performance, efficiency, and flexibility simultaneously, a combination they say is not attainable through alternative methods.

Trying to Reason

Researchers have been eyeing games as the next thing to integrate into language models because these models can form sentences with impressive accuracy, yet fall short in an aspect critical to human-like thinking: the ability to reason logically.

The academic and tech communities have delved into this conundrum for years. However, despite their efforts to augment AI with more layers, parameters, and attention mechanisms, a solution remains elusive. They have also been exploring multimodality, but nothing notably advanced and dependable has come out of that yet either.

Earlier this year, a collaborative team at Virginia Tech and Microsoft released an approach titled ‘Algorithm of Thoughts’ (AoT) to refine AI’s algorithmic reasoning, an area where ChatGPT was famously weak at launch, particularly in maths. The work also suggested that, trained this way, large language models could integrate their intuition into searches optimised for better outcomes.

Moreover, a little over a month ago, Microsoft also put the moral reasoning of these models under the microscope, proposing a new framework to assess their ethical decision-making. In the results, the 70-billion-parameter LlamaChat model outperformed its larger counterparts.

The result challenged the long-held belief that bigger always equates to better, and the community’s over-reliance on parameter count.

As big tech companies continue to face the consequences of their models’ flawed reasoning, Microsoft’s strategy appears to be one of careful progress. Rather than racing to add complexity to its models, it is picking its battles one by one.

More Possibilities

Microsoft has not disclosed plans for implementing the XOT method in its products. Meanwhile, Google DeepMind, led by CEO Demis Hassabis, is considering integrating AlphaGo-inspired concepts into its Gemini project, as he mentioned in an interview.

Meta’s CICERO, named with a nod to the famed Roman orator, also entered the fray a year ago, raising eyebrows across the AI community for its skill at the complex board game Diplomacy. The game, demanding not just strategic awareness but also the art of negotiation, has long been considered a challenge for AI.

Yet, CICERO navigated these waters, showing an ability to engage in nuanced, human-like conversations. This discovery did not go unnoticed, especially in light of the benchmarks set by DeepMind. For years, the UK-based research lab has championed the use of games for developing and refining neural networks.

Their feats with AlphaGo set a high bar, one that Meta met by borrowing elements from DeepMind’s playbook: combining strategic reasoning algorithms, like those behind AlphaGo, with a natural language processing model, like GPT-3.

Meta’s model stood out because, to play Diplomacy, an AI agent must not only understand the rules of the game but also accurately gauge the possibility of betrayal by other human players. The agent’s capability to hold natural-sounding conversations with humans makes it a prime candidate for integration as Meta continues to build Llama-3.

The integration of CICERO’s capabilities with Meta’s broader AI initiatives could mark the beginning of true conversational AI.


Tasmia Ansari

Tasmia is a tech journalist at AIM, looking to bring a fresh perspective to emerging technologies and trends in data science, analytics, and artificial intelligence.