MetaGPT is trending on GitHub, with 20,000 stars. It’s a multi-agent framework that connects several specialised agents and gets them to work together with less hallucination. The agents work on different parts of a problem separately, like experts in different areas – this way, they can double-check each other’s work and make fewer mistakes overall.
Until now, agents like Baby AGI and AgentGPT would spin up a bunch of agents to complete a task like ‘write me code for this API’. MetaGPT steps up the game: it takes a one-line requirement as input and outputs user stories, competitive analysis, requirements, data structures, APIs, and documents. But is MetaGPT really any better?
Better than individual agents
The developers took the different roles of a software company – product manager, project manager, software architect, and software engineer – and used GPT-4 to build an agent for each persona, running them together. They tested MetaGPT on software engineering tasks, and it produced better solutions than earlier agent frameworks.
It not only writes code but also performs the various analyses a software house would have to do. It runs many different agents – not just developers, but also QA testers, project managers, and architectural designers – and implements a manager-like structure to oversee them.
The manager agent acts as a decision-maker, passing tasks to different agents based on their roles.
After installing MetaGPT, users can build things like a version of Flappy Bird without writing any code. The agents start working on it: the product manager defines goals, user stories, and a competitive analysis; the architect breaks the design down into tasks; and the developers then write the actual code.
MetaGPT generates a folder called ‘workspace’ containing the generated files. It even creates charts and diagrams that would typically take a software house days to produce. The output may need some adjustments and debugging – GPT-4’s training data cuts off in 2021 – but it’s still a powerful tool for rapidly generating code and documentation.
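The pipeline described above – roles running in sequence and dropping their artifacts into a ‘workspace’ folder – can be sketched as a simple role chain. To be clear, this is a hypothetical illustration, not MetaGPT’s actual API: every class and function name here is invented, and real roles would call GPT-4 instead of emitting placeholder text.

```python
import os
from dataclasses import dataclass


@dataclass
class Artifact:
    """A single generated file: its name and its contents."""
    name: str
    content: str


class Role:
    """Base class for a persona agent. In MetaGPT each role would prompt
    an LLM; in this sketch each role just emits a placeholder artifact."""
    def run(self, requirement):
        raise NotImplementedError


class ProductManager(Role):
    def run(self, requirement):
        # Would ask GPT-4 for goals, user stories, competitive analysis.
        return [Artifact("prd.md", f"# PRD for: {requirement}\n- goals\n- user stories")]


class Architect(Role):
    def run(self, requirement):
        # Would break the schema down into concrete implementation tasks.
        return [Artifact("design.md", "# API schema and task breakdown")]


class Engineer(Role):
    def run(self, requirement):
        # Would write the actual game code from the PRD and design docs.
        return [Artifact("main.py", "# generated game code goes here")]


def run_pipeline(requirement, workspace="workspace"):
    """Run each role in order and write its artifacts to the workspace folder."""
    os.makedirs(workspace, exist_ok=True)
    written = []
    for role in (ProductManager(), Architect(), Engineer()):
        for artifact in role.run(requirement):
            path = os.path.join(workspace, artifact.name)
            with open(path, "w") as f:
                f.write(artifact.content)
            written.append(path)
    return written


files = run_pipeline("Make a Flappy Bird clone")
```

The ordering matters: each downstream role in the real system reads the documents produced upstream, which is what lets the agents cross-check one another instead of hallucinating independently.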
To coordinate all this, MetaGPT uses a set of instructions called SOPs (standard operating procedures). The SOPs are like plans that guide the agents on how to work together efficiently. First, each agent is given a role description, so the system knows what job it is best suited for and can start it with the right instructions. The agents can then talk to each other and share tools, information, and work products in a shared space, much like people on a team. Rather than waiting for messages, agents can actively pull the information they need, which is faster. The shared space is effectively a digital version of a workplace where people collaborate.
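That shared space can be sketched as a publish/subscribe message pool: agents publish tagged messages, and each agent actively pulls the topics relevant to its role instead of waiting to be addressed directly. Again, this is a minimal hypothetical sketch under assumed names, not MetaGPT’s real implementation.

```python
class MessagePool:
    """A shared space: agents publish tagged messages, and any agent can
    actively pull the messages matching the topics it cares about."""

    def __init__(self):
        self.messages = []

    def publish(self, sender, topic, content):
        self.messages.append({"sender": sender, "topic": topic, "content": content})

    def pull(self, topics):
        # Active retrieval: the agent filters the pool itself instead of
        # waiting for a direct message from another agent.
        return [m for m in self.messages if m["topic"] in topics]


pool = MessagePool()
pool.publish("product_manager", "requirements", "User can flap to avoid pipes")
pool.publish("architect", "design", "One Bird class, one Pipe class")
pool.publish("qa", "bug_report", "Bird clips through the floor")

# The engineer subscribes to requirements and design, ignoring QA chatter.
engineer_view = pool.pull({"requirements", "design"})
```

Filtering by topic is what keeps each agent’s context small: the engineer sees two relevant messages here, not the whole conversation.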
How MetaGPT compares with others
MetaGPT is the ‘on steroids’ version of other agents like AutoGPT, LangChain, and AgentVerse.
When it comes to working together on projects, both MetaGPT and AgentVerse allow agents to collaborate on tasks. They assign roles to different agents, which helps them work better as a team. However, MetaGPT goes a step further by not only breaking down tasks but also managing them.
In terms of creating code, all these tools produce usable output, but according to the paper, MetaGPT is the most complete because it covers a wider range of tasks in project development and offers a full set of tools for managing and executing projects.
Even though MetaGPT can create working code for games, it’s not perfect: its strict SOP rules leave limited room for adjusting things manually. On the other hand, tools like AutoGPT, LangChain, and AgentVerse work better than MetaGPT on larger, more open-ended tasks.
According to the paper, MetaGPT does really well in code-generation tests, achieving Pass@1 scores of 81.7% and 82.3% – that is, getting things right on the first try. Compared with other code-generation approaches like AutoGPT, LangChain, and AgentVerse, MetaGPT can handle much more complex software and stands out for its breadth of features. The paper also notes that MetaGPT successfully finished every task in its experiments, showing how robust and effective it is.
These various AI systems and frameworks attract attention on GitHub, but they don’t seem to have practical applications beyond being entertaining demos. Similar to AutoGPT, they might struggle with anything even slightly complex. It’s possible that this is the direction we’re moving in. Could it really be as simple as putting together the right combination of models to create truly useful general-purpose AI agents? So far, these agents draw the same low-code/no-code crowd, offering maybe a 5–10% improvement but the same or worse technical debt.