Voxtral TTS Environmental Impact

Ultra-efficient · Estimated

Open-weight TTS — 9 languages, zero-shot voice cloning

Architecture: Neural TTS (open-weight)
Parameters: 4B
Provider: Mistral AI
Energy per query: 0.40 Wh
CO₂ per query: 0.02 g
Water per query: 0.30 mL
vs Google search: about the same

Energy per query

0.40 Wh

about the same as a Google search (0.3 Wh)

CO2 per query

0.02 g

France grid (50 gCO₂/kWh)

Water per query

0.30 mL

~3,333 queries to fill 1 litre

Processing location

Mistral AI (France) / self-hosted

Provider

Mistral AI

Category

audio

Grid carbon intensity

50 g CO2/kWh (90% renewable)

How does Voxtral TTS compare?

Ranked #26 of 152 models by energy per query

[Chart: energy per query (Wh, axis 0 to 1.6) for OpenAI TTS, Voxtral TTS, Whisper, and ElevenLabs, with a Google search (0.3 Wh) as the reference line]

Detailed Breakdown

Energy Consumption

Voxtral TTS is an open-weight text-to-speech model supporting 9 languages and zero-shot voice cloning from 2-3 seconds of audio. It uses an estimated ~0.4 Wh per 30 seconds of speech and matches ElevenLabs v3 on expressiveness while being fully open and self-hostable.
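To get a feel for longer jobs, the per-30-second estimate can be scaled up. The sketch below assumes energy grows linearly with the duration of generated speech, which this page does not confirm; the function name is illustrative only.

```python
def energy_wh_for_speech(seconds: float, wh_per_30s: float = 0.40) -> float:
    """Scale the per-30-second estimate linearly to an arbitrary duration.

    Linear scaling is an assumption made for illustration.
    """
    if seconds < 0:
        raise ValueError("duration must be non-negative")
    return wh_per_30s * (seconds / 30.0)


# One hour of generated speech under the linear-scaling assumption.
one_hour_wh = energy_wh_for_speech(3600)
print(f"{one_hour_wh:.0f} Wh")  # 48 Wh
```

Under that assumption, an hour of synthesized audio would come to roughly 48 Wh.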

Power Source & Carbon

The model is open-weight and self-hostable. When accessed through the Mistral API, it runs on France's clean nuclear grid (50 g CO2/kWh).

Water Usage

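Water use is estimated at ~0.3 mL per 30 seconds of speech, which is very low. The estimate is zero when the model is self-hosted on consumer hardware, where no data-center cooling water is involved.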
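The "queries to fill 1 litre" figure quoted for this model follows directly from the per-query estimate. A minimal sketch of that arithmetic, assuming the 0.30 mL figure applies to every query (the function name is illustrative):

```python
def queries_per_litre(water_ml_per_query: float) -> float:
    """Return how many queries it takes to consume one litre (1000 mL) of water."""
    if water_ml_per_query <= 0:
        raise ValueError("water use per query must be positive")
    return 1000.0 / water_ml_per_query


WATER_ML_PER_QUERY = 0.30  # per-query estimate from this page

print(round(queries_per_litre(WATER_ML_PER_QUERY)))  # 3333
```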

About Voxtral TTS

Voxtral TTS is an open-source audio model from Mistral AI, released on March 26, 2026, that runs well below the category average for energy consumption at 0.40 Wh per query. Because its weights are publicly available, it can be self-hosted on any infrastructure, meaning its carbon footprint depends entirely on where and how you choose to run it. At 4B parameters, it is an open-weight TTS model supporting 9 languages and zero-shot voice cloning.

Voxtral TTS benefits from running in France, one of the cleaner grid regions in our dataset at 50 gCO₂/kWh with 90% renewable energy. The same model running in a coal-heavy region would produce significantly more carbon per query.
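The dependence on grid location can be made concrete: carbon per query is the energy per query (converted to kWh) multiplied by the grid's carbon intensity. The sketch below uses the figures from this page for France; the coal-heavy intensity of 800 gCO₂/kWh is a hypothetical value chosen only to illustrate the effect, not a number from our dataset.

```python
def co2_grams_per_query(energy_wh: float, grid_g_per_kwh: float) -> float:
    """Carbon per query = energy (Wh converted to kWh) x grid intensity (g/kWh)."""
    return (energy_wh / 1000.0) * grid_g_per_kwh


ENERGY_WH = 0.40               # per-query estimate from this page
FRANCE_G_PER_KWH = 50.0        # grid intensity quoted on this page
COAL_HEAVY_G_PER_KWH = 800.0   # hypothetical value, for illustration only

france = co2_grams_per_query(ENERGY_WH, FRANCE_G_PER_KWH)
coal = co2_grams_per_query(ENERGY_WH, COAL_HEAVY_G_PER_KWH)
print(f"France: {france:.2f} g, hypothetical coal-heavy grid: {coal:.2f} g")
```

With these inputs the France figure reproduces the 0.02 g estimate, while the hypothetical coal-heavy grid yields 0.32 g for the same query.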

These figures are estimates derived from hardware specifications and API benchmarks — Mistral AI has not published official energy data for Voxtral TTS. Actual consumption may vary significantly depending on batching, quantisation, and infrastructure optimisations that we cannot observe from outside.

Key Insights

Uses less than a third of the average energy for audio models
Runs on a 90% renewable grid — among the cleanest AI inference locations
Open-source weights — can be self-hosted on infrastructure you control


Frequently Asked Questions

How much energy does Voxtral TTS use per query?

Each Voxtral TTS query consumes approximately 0.40 Wh of energy. This is about the same as a traditional Google search (~0.3 Wh), or roughly 1.3 times as much.

What is Voxtral TTS's carbon footprint?

Based on the grid carbon intensity at the processing location (Mistral AI in France, or wherever the model is self-hosted), each query produces approximately 0.02 g of CO2 on the French grid. The grid in this region has a carbon intensity of 50 g CO2/kWh with 90% renewable energy.

How much water does Voxtral TTS use?

Each query consumes approximately 0.30 mL of water, primarily used for cooling the data centers that process the request.

How does Voxtral TTS compare to a Google search?

A Voxtral TTS query uses about the same amount of energy as a Google search. A Google search uses approximately 0.3 Wh, while a Voxtral TTS query uses 0.40 Wh.
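For readers who want the comparison as a number, the ratio can be computed from the two figures above. This is a simple sketch; both values are the estimates quoted on this page, not measurements.

```python
VOXTRAL_WH = 0.40        # per-query estimate from this page
GOOGLE_SEARCH_WH = 0.30  # reference figure used on this page

# How many Google searches' worth of energy one Voxtral TTS query represents.
ratio = VOXTRAL_WH / GOOGLE_SEARCH_WH
print(f"{ratio:.2f}x a Google search")  # 1.33x a Google search
```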

Technical Details

Architecture

Neural TTS (open-weight)

Parameters

4B

Release date

2026-03-26

Open source

Yes