Video Generation

Stable Video Diffusion Environmental Impact

Extreme · Estimated

Open-source video generation model

Architecture
Latent video diffusion model
Parameters
1.5B
Provider
Stability AI
Energy per query
150.0 Wh
CO₂ per query
60.0 g
Water per query
280 mL
vs Google search
500x more energy

Energy per query

150.0 Wh

500x more than a Google search (0.3 Wh)

CO2 per query

60.0 g

Global average grid (475 g CO₂/kWh)

Water per query

280 mL

~4 queries to fill 1 litre

Processing location

Self-hosted (varies)

Provider

Stability AI

Category

Video Generation

Grid carbon intensity

475 g CO2/kWh (27% renewable)

How does Stable Video Diffusion compare?

Ranked #137 of 152 models by energy per query

[Chart: energy per query on a 0 to 600 Wh axis, comparing Stable Video Diffusion, Runway Gen-3 Alpha, Veo 2, Sora, and a Google search (0.3 Wh)]

Detailed Breakdown

Energy Consumption

Stable Video Diffusion generates short video clips (14-25 frames) from image inputs. At ~150 Wh per clip, it is among the more efficient video models due to its relatively small parameter count (1.5B). Being open-source, it can be self-hosted and optimised for specific hardware.

Power Source & Carbon

Stable Video Diffusion is an open-source model that can be deployed on any GPU infrastructure, so its carbon footprint depends on the grid that powers the hardware. A single H100 GPU can generate clips, which makes self-hosting accessible.

Water Usage

Estimated water use is ~280 mL per clip. When the model is self-hosted on consumer hardware, on-site cooling water consumption is zero. When it is cloud-hosted, consumption depends on the provider's cooling infrastructure.

About Stable Video Diffusion

Stable Video Diffusion sits at the high end of AI energy consumption, at 150.0 Wh to generate one short video clip, about 500x what a Google search uses. It is Stability AI's open-source video generation model, released on November 21, 2023. The extreme energy cost reflects the computational complexity of video generation: producing coherent frames across both space and time.

These figures are estimates derived from hardware specifications and API benchmarks — Stability AI has not published official energy data for Stable Video Diffusion. Actual consumption may vary significantly depending on batching, quantisation, and infrastructure optimisations that we cannot observe from outside.

Stable Video Diffusion in Context

13x
phone charges per video

The video generation premium

One Stable Video Diffusion generation uses 150.0 Wh — the equivalent of 13 full smartphone charges. Video generation requires running a diffusion model across both spatial and temporal dimensions, making it fundamentally more compute-intensive than text or image tasks. This is not an efficiency problem to be solved; it is an inherent characteristic of the task.
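
The phone-charge comparison can be checked with a short calculation. This is a minimal sketch: the page does not state the battery size it assumes, so the code only derives the per-charge energy implied by the two quoted figures (150.0 Wh and 13 charges). The function name and the 12 Wh example value are illustrative assumptions.

```python
# Figures quoted on this page (estimates, not measurements).
ENERGY_WH_PER_CLIP = 150.0
STATED_PHONE_CHARGES = 13

# Energy per full charge implied by the page's own numbers.
implied_wh_per_charge = ENERGY_WH_PER_CLIP / STATED_PHONE_CHARGES


def phone_charges(energy_wh: float, wh_per_charge: float) -> float:
    """Number of full phone charges equivalent to `energy_wh`.

    `wh_per_charge` is an assumption supplied by the caller.
    """
    return energy_wh / wh_per_charge


print(f"Implied energy per charge: {implied_wh_per_charge:.1f} Wh")
# Hypothetical 12 Wh charge, chosen only for illustration.
print(f"Charges at 12 Wh each: {phone_charges(ENERGY_WH_PER_CLIP, 12.0):.1f}")
```

The implied value is roughly 11.5 Wh per charge. A different assumed battery size changes the number of charges proportionally.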

Key Insights

Open-source weights — can be self-hosted on infrastructure you control


Frequently Asked Questions

How much energy does Stable Video Diffusion use per query?

Each Stable Video Diffusion query consumes approximately 150.0 Wh of energy. This is about 500x the energy of a traditional Google search (~0.3 Wh).

What is Stable Video Diffusion's carbon footprint?

The processing location is self-hosted and therefore varies, so the estimate assumes the global average grid, which has a carbon intensity of 475 g CO2/kWh with 27% renewable energy. On that basis, the page estimates approximately 60.0 g of CO2 per query.
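
The relationship between energy, grid intensity, and CO2 can be sketched as a simple multiplication. The code below uses only the figures quoted on this page; the function names are illustrative. Multiplying 150.0 Wh by 475 g CO2/kWh directly gives about 71 g rather than the stated 60.0 g, and the stated figure corresponds to roughly 400 g CO2/kWh, so the published estimate may rely on an input or rounding step that the page does not describe.

```python
# Figures quoted on this page (estimates, not measurements).
ENERGY_WH_PER_QUERY = 150.0
GRID_G_CO2_PER_KWH = 475.0
STATED_CO2_G_PER_QUERY = 60.0


def co2_grams(energy_wh: float, grid_g_per_kwh: float) -> float:
    """CO2 in grams: energy converted to kWh times grid carbon intensity."""
    return energy_wh * grid_g_per_kwh / 1000.0


def implied_intensity(co2_g: float, energy_wh: float) -> float:
    """Grid intensity (g CO2/kWh) that would reproduce a given CO2 figure."""
    return co2_g * 1000.0 / energy_wh


direct = co2_grams(ENERGY_WH_PER_QUERY, GRID_G_CO2_PER_KWH)
implied = implied_intensity(STATED_CO2_G_PER_QUERY, ENERGY_WH_PER_QUERY)

print(f"Direct multiplication: {direct:.2f} g CO2 per query")
print(f"Intensity implied by the stated 60.0 g: {implied:.0f} g CO2/kWh")
```

For a self-hosted deployment, substituting the carbon intensity of the local grid for the global average gives a more relevant figure.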

How much water does Stable Video Diffusion use?

Each query consumes approximately 280 mL of water, primarily used for cooling the data centers that process the request.
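
As a rough check on the water figure, the sketch below converts the quoted 280 mL per query into litres for a given number of queries, and into the number of queries needed to reach a given volume. The function names are illustrative, and the 280 mL value is the page's estimate rather than a measurement.

```python
import math

# Figure quoted on this page (an estimate, not a measurement).
WATER_ML_PER_QUERY = 280.0


def water_litres(queries: int) -> float:
    """Estimated water use in litres for a number of queries."""
    return queries * WATER_ML_PER_QUERY / 1000.0


def queries_to_reach(volume_ml: float) -> int:
    """Smallest whole number of queries whose estimated water use reaches `volume_ml`."""
    return math.ceil(volume_ml / WATER_ML_PER_QUERY)


print(f"Queries to reach 1 litre: {queries_to_reach(1000.0)}")
print(f"Water for 10 queries: {water_litres(10):.1f} L")
```

One litre divided by 280 mL is about 3.6, which rounds up to 4 whole queries.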

How does Stable Video Diffusion compare to a Google search?

A Stable Video Diffusion query uses about 500x the energy of a Google search: approximately 150.0 Wh versus 0.3 Wh.
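
The comparison is a straightforward ratio, and the same figures can be scaled to a usage level. The sketch below assumes the two per-query estimates quoted on this page; the function names and the example of 20 clips are hypothetical.

```python
# Figures quoted on this page (estimates, not measurements).
SVD_WH_PER_QUERY = 150.0
GOOGLE_SEARCH_WH = 0.3

# How many Google searches use the same estimated energy as one clip.
ratio = SVD_WH_PER_QUERY / GOOGLE_SEARCH_WH


def energy_kwh(queries: int) -> float:
    """Estimated energy in kWh for a number of Stable Video Diffusion queries."""
    return queries * SVD_WH_PER_QUERY / 1000.0


def search_equivalents(queries: int) -> float:
    """Google searches with the same estimated energy as `queries` generations."""
    return queries * ratio


print(f"Energy ratio: {ratio:.0f}x")
# Hypothetical usage of 20 clips, chosen only for illustration.
print(f"20 clips: {energy_kwh(20):.1f} kWh, "
      f"about {search_equivalents(20):.0f} searches")
```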

Technical Details

Architecture

Latent video diffusion model

Parameters

1.5B

Release date

2023-11-21

Open source

Yes