Inferact, an AI startup founded by the creators of the open-source inference engine vLLM, has secured $150 million in seed funding, valuing the company at $800 million.
This funding round was spearheaded by venture capital firms Andreessen Horowitz (a16z) and Lightspeed, with support from Sequoia Capital, Altimeter Capital, Redpoint Ventures, and ZhenFund, the company announced on January 22.
According to the company, vLLM sits at the intersection of models and hardware, collaborating with vendors to provide day-one support for new architectures and silicon. The engine supports more than 500 model architectures and 200 accelerator types, and is backed by an ecosystem of more than 2,000 contributors.
The company aims to support the growth of vLLM by providing financial and developer resources to handle increasing model complexity, hardware diversity, and deployment scale.
“We see a future where serving AI becomes effortless. Today, deploying a frontier model at scale requires a dedicated infrastructure team. Tomorrow, it should be as simple as spinning up a serverless database. The complexity doesn’t disappear; it gets absorbed into the infrastructure we’re building,” Woosuk Kwon, co-founder of Inferact, posted on X.
The startup also plans to develop a next-generation commercial inference engine that works with existing providers to improve software performance and flexibility.
Inferact is led by the maintainers of the vLLM project, including Simon Mo, Kwon, Kaichao You, and Roger Wang. vLLM is the leading open-source inference engine and one of the largest open-source projects of any kind, used in production by companies like Meta, Google, Character AI, and many others.
The team plans to further improve vLLM’s performance, deepen support for emerging model architectures, and expand coverage across advanced hardware. They believe the AI industry requires inference infrastructure that is not locked behind proprietary limitations.
“For a16z infra, investing in the vLLM community is an explicit bet that the future will bring incredible diversity of AI apps, agents, and workloads running on a variety of hardware platforms,” a16z said on X.
Inferact is also hiring engineers and researchers to work at the frontier of inference, “where models meet hardware at scale,” Kwon said.