Gradium Raises €60M to Scale Voice AI

Three months after its inception in September 2025, Gradium has officially announced its launch, buoyed by a €60 million seed round. The figure is substantial for so young a company, and it underscores investors’ appetite for next-generation voice AI technologies.

A Technology Based on Audio Language Models

Gradium’s technical approach rests on audio language models, the native-audio equivalents of text-based large language models (LLMs). This architecture, originally invented by the company’s founders, processes speech natively, with no intermediate transcription to text, unlike traditional systems that chain speech recognition, text processing, and speech synthesis.

This native approach offers several technical advantages: lower latency, preserved vocal expressiveness, and the ability to handle any voice task in a unified manner. Since their invention by the founders, audio language models have become the industry standard.

The founding team brings together four renowned audio AI researchers: Neil Zeghidour (CEO, formerly at Meta and Google DeepMind), Olivier Teboul (CTO, formerly at Google Brain), Laurent Mazaré (Chief Coding Officer, formerly at Google DeepMind and Jane Street), and Alexandre Défossez (Chief Scientist Officer, formerly at Meta). Their expertise rests on more than a decade of fundamental research, conducted notably at Kyutai, a nonprofit AI research lab that counts Zeghidour and Mazaré among its founders.

The collaboration with Kyutai continues and constitutes a strategic asset: it gives Gradium privileged access to advances in fundamental research, which it can quickly translate into commercial applications. Gradium’s underlying technology will be identical to that of Moshi, the voice AI developed by Kyutai, Neil Zeghidour told Bloomberg.

A ‘Quality–Latency–Cost’ Positioning

Gradium says it resolves a major technical trade-off in the sector: current voice systems typically force a choice among interaction quality, low latency, and affordable cost. The startup aims to deliver realistic vocal expressiveness, accurate transcription, and ultra-low latency all at once, while keeping prices suitable for large-scale deployment.

This value proposition is primarily aimed at developers and enterprises via an API platform. The service already supports five languages at launch (English, French, German, Spanish, and Portuguese), with more in the works.

Gradium reports generating its first revenues just a few weeks after its creation. The company already counts customers across several sectors: gaming, AI agents, customer service, language learning, and healthcare.

The €60 million seed round was co-led by FirstMark Capital and Eurazeo, with participation from DST Global Partners, Eric Schmidt (former CEO and Chairman of Google), Xavier Niel (Iliad), Rodolphe Saadé (CMA CGM), Korelya Capital, and Amplify Partners.

This amount places Gradium among the largest seed rounds in the French and European ecosystems, reflecting market expectations for the potential of voice AI. According to Neil Zeghidour, the sector is still at the stage where chatbots were before the emergence of LLMs: existing systems remain fragile, costly, and limited in their ability to offer natural interactions.

Gradium’s stated ambition is to become the world’s reference technology backbone for voice and to make voice the primary interface between humans and machines.