As irrational language models continue to influence every aspect of our lives, Microsoft has released an approach to make AI reason better, called ‘Everything of Thought’ (XOT). The methodology draws inspiration from Google DeepMind’s AlphaZero, which pairs compact neural networks with search to outperform far larger models.
The new XOT method was developed in collaboration with the Georgia Institute of Technology and East China Normal University. It uses a blend of reinforcement learning and Monte Carlo Tree Search (MCTS) — techniques renowned for their effectiveness in complex decision-making.
These techniques together allow language models to generalise efficiently to unknown problems, the researchers said. The researchers’ trials on a variety of challenging tasks, such as the Game of 24, the 8-Puzzle, and the Pocket Cube, yielded impressive results.
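Microsoft has not published XOT’s implementation alongside the announcement, but the search component it builds on is standard MCTS. The sketch below is a minimal, generic illustration of that loop on a hypothetical toy puzzle (reach a target number using ‘+1’ and ‘*2’ moves), standing in for tasks like the Game of 24; the names and the puzzle are illustrative assumptions, and the reinforcement-learning-trained policy and value components the researchers pair with the search are omitted here.

```python
import math
import random

# Hypothetical toy planning task standing in for puzzles like the Game of 24:
# starting from 1, apply "+1" or "*2" moves to reach TARGET in at most
# MAX_DEPTH moves. All names here are illustrative, not from the XOT paper.
TARGET, MAX_DEPTH = 10, 6
ACTIONS = ("+1", "*2")

def step(value, action):
    return value + 1 if action == "+1" else value * 2

class Node:
    def __init__(self, value, depth, parent=None, action=None):
        self.value, self.depth = value, depth
        self.parent, self.action = parent, action
        self.children, self.visits, self.total_reward = [], 0, 0.0

    def expand(self):
        for a in ACTIONS:
            self.children.append(Node(step(self.value, a), self.depth + 1, self, a))

    def uct_child(self, c=1.4):
        # Upper Confidence bound for Trees: trade off exploitation vs exploration.
        return max(self.children, key=lambda ch:
                   (ch.total_reward / ch.visits if ch.visits else float("inf"))
                   + c * math.sqrt(math.log(self.visits + 1) / (ch.visits + 1)))

def rollout(value, depth, rng):
    # Random playout; reward 1.0 only if the target is hit exactly.
    while depth < MAX_DEPTH and value < TARGET:
        value = step(value, rng.choice(ACTIONS))
        depth += 1
    return 1.0 if value == TARGET else 0.0

def mcts(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(1, 0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend via UCT until reaching a leaf.
        while node.children:
            node = node.uct_child()
        # 2. Expansion: grow the tree at non-terminal leaves.
        if node.depth < MAX_DEPTH and node.value < TARGET:
            node.expand()
            node = rng.choice(node.children)
        # 3. Simulation: estimate the leaf's value with a random rollout.
        reward = rollout(node.value, node.depth, rng)
        # 4. Backpropagation: update statistics along the selected path.
        while node:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    # Read off the most-visited line of play as the extracted "thought" sequence.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.action)
        if node.value == TARGET:
            break
    return path, node.value
```

Running `mcts()` returns an action sequence such as `['*2', '*2', '+1', '*2']`, which reaches 10 from 1. In XOT’s framing, a solution path found by search like this is handed to the language model as a structured ‘thought’ to verify and express in language, rather than asking the model to reason out every step on its own.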
XOT has outshone its contemporaries in tackling problems that have stymied other methods. This superiority, however, is not without limitations. The system, despite its advancements, has not reached a state of 100% reliability.
However, the research team views the framework as an effective approach for incorporating external knowledge into language model inference. They are sure that it improves performance, efficiency, and flexibility simultaneously — a combination not attainable through alternative methods.
Trying to Reason
Researchers are eyeing games as the next thing to integrate into language models because these models can form sentences with impressive accuracy, yet they fall short in an aspect critical to human-like thinking — the ability to reason logically.
For years, the academic and tech communities have delved deep into this conundrum. However, despite their efforts in augmenting AI with more layers, parameters, and attention mechanisms, a solution remains missing. They have also been exploring multimodality, but nothing especially advanced or dependable has come out of that yet either.
Earlier this year, a collaborative team at Virginia Tech and Microsoft released an approach titled ‘Algorithm of Thoughts’ (AoT) to refine AI’s algorithmic reasoning — a weakness made plain by how bad ChatGPT was at maths when it was released. The team also suggested that, with this training method, large language models could become capable of integrating their intuition into searches optimised for better outcomes.
Moreover, a little over a month ago, Microsoft also put the moral reasoning of these models under a microscope, proposing a new framework designed to assess their ethical decision-making skills. In its evaluation, the 70-billion-parameter LlamaChat model outperformed its larger counterparts.
The result challenged the long-held belief that bigger always equates to better, and the community’s over-reliance on sheer parameter count.
As the big tech companies continue to face the consequences of their irrational language models, Microsoft’s strategy appears to be one of careful progress. Rather than racing to add complexity to their models, they are picking their battles one by one.
Microsoft has not disclosed plans for implementing the XOT method in its products. Meanwhile, Google DeepMind, led by CEO Demis Hassabis, is considering integrating AlphaGo-inspired concepts into its Gemini project, as he mentioned in an interview.
Meta’s CICERO, named with a nod to the famed Roman orator, also entered the fray a year ago, raising eyebrows across the AI community for its skill at the complex board game Diplomacy. The game, demanding not just strategic awareness but also the art of negotiation, has long been considered a challenge for AI.
Yet, CICERO navigated these waters, showing an ability to engage in nuanced, human-like conversations. This discovery did not go unnoticed, especially in light of the benchmarks set by DeepMind. For years, the UK-based research lab has championed the use of games for developing and refining neural networks.
Their feats with AlphaGo set a high bar, one that Meta met by borrowing elements from DeepMind’s playbook: combining a strategic reasoning algorithm, in the vein of AlphaGo, with a natural language processing model, in the vein of GPT-3.
Meta’s model stood out because, to play Diplomacy, an AI agent must not only understand the rules of the game but also accurately gauge the likelihood of betrayal by the other human players. The agent’s capability to converse with humans in natural-sounding language makes it a prime candidate for integration as Meta continues to build Llama-3.
The integration of CICERO’s capabilities with Meta’s broader AI initiatives could mark the beginning of true conversational AI.