LLaMA 3.3 70B Environmental Impact

Current default 70B model — improved efficiency over 3.1

Architecture

Dense Transformer (decoder-only)

Parameters

70B

Context

128,000 tokens

Provider

Meta
Energy per query

1.0 Wh

3x more than a Google search (0.3 Wh)

CO₂ per query

0.40 g

Global average grid (475 g CO₂/kWh)

Water per query

~1.9 mL

~526 queries to fill 1 litre
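The queries-per-litre figure follows directly from the per-query water estimate. A minimal sketch of the arithmetic, using the ~1.9 mL per-query value cited elsewhere on this page:

```python
# Estimated water use per LLaMA 3.3 70B query, in millilitres (page estimate)
WATER_PER_QUERY_ML = 1.9

# Number of queries needed to consume one litre (1000 mL) of water
queries_per_litre = round(1000 / WATER_PER_QUERY_ML)
print(queries_per_litre)  # 526
```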

Processing location

Self-hosted (varies)

Provider

Meta

Category

Text / Chat

Grid carbon intensity

475 g CO₂/kWh (27% renewable)
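The per-query carbon figure is derived from per-query energy and grid carbon intensity. A minimal sketch of that conversion, assuming it is simply energy times intensity (note that 1.0 Wh at exactly 475 g CO₂/kWh works out to ~0.48 g, so the 0.40 g figure on this page reflects slightly different inputs or rounding):

```python
def co2_per_query_g(energy_wh: float, grid_g_per_kwh: float) -> float:
    """Convert per-query energy (Wh) and grid intensity (g CO2/kWh) to grams of CO2."""
    return energy_wh / 1000 * grid_g_per_kwh

# Global average grid, per this page's figures
print(co2_per_query_g(1.0, 475))  # ~0.48 g per query
```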

How does LLaMA 3.3 70B compare?

Ranked #52 of 152 models by energy per query

[Chart: energy per query, 0 to 1 Wh, for LLaMA 3.2 1B, Gemini 1.5 Pro, GPT-4.1 Nano, and LLaMA 3.3 70B, with Google search (0.3 Wh) marked as a baseline]

Detailed Breakdown

Energy Consumption

LLaMA 3.3 70B delivers performance comparable to the much larger 405B model at a fraction of the compute cost. At ~1.0 Wh per query, it has become the default open-source model for many production deployments, representing a sweet spot between capability and efficiency.

Power Source & Carbon

The model is widely deployed across all major cloud providers and self-hosted setups. Its single-GPU inference capability (with quantisation) makes it accessible for deployment on lower-carbon infrastructure.

Water Usage

At ~1.9 mL per query, LLaMA 3.3 70B has a modest water footprint comparable to GPT-4o.

About LLaMA 3.3 70B

LLaMA 3.3 70B is an open-source text and chat model from Meta, released on December 6, 2024, that runs well below the category average for energy consumption at 1.0 Wh per query. Because its weights are publicly available, it can be self-hosted on any infrastructure, meaning its carbon footprint depends entirely on where and how you choose to run it. At 70B parameters, it is Meta's current default 70B model, with improved efficiency over LLaMA 3.1.

These figures are estimates derived from hardware specifications and API benchmarks — Meta has not published official energy data for LLaMA 3.3 70B. Actual consumption may vary significantly depending on batching, quantisation, and infrastructure optimisations that we cannot observe from outside.

LLaMA 3.3 70B in Context

9.1 kWh
per year

Your yearly LLaMA 3.3 70B footprint

At 25 queries per day, your annual LLaMA 3.3 70B usage consumes 9.1 kWh — comparable to running an LED light bulb for a month. That produces 3.6 kg of CO₂.
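The yearly figures above can be reproduced with straightforward arithmetic, using the per-query energy and CO₂ estimates from this page:

```python
QUERIES_PER_DAY = 25
ENERGY_WH_PER_QUERY = 1.0  # page estimate
CO2_G_PER_QUERY = 0.40     # page estimate

queries_per_year = QUERIES_PER_DAY * 365

yearly_kwh = queries_per_year * ENERGY_WH_PER_QUERY / 1000  # Wh -> kWh
yearly_co2_kg = queries_per_year * CO2_G_PER_QUERY / 1000   # g -> kg

print(yearly_kwh)     # ~9.1 kWh per year
print(yearly_co2_kg)  # ~3.6 kg CO2 per year
```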

Key Insights

Open-source weights — can be self-hosted on infrastructure you control

What does your LLaMA 3.3 70B usage cost the planet?

Use our calculator to estimate your personal environmental footprint based on how often you use LLaMA 3.3 70B.

Calculate My Compute

Frequently Asked Questions

How much energy does LLaMA 3.3 70B use per query?

Each LLaMA 3.3 70B query consumes approximately 1.0 Wh of energy. This is 3x more than a traditional Google search (~0.3 Wh).

What is LLaMA 3.3 70B's carbon footprint?

Using a global average grid, each query produces approximately 0.40 g of CO₂. That grid mix has a carbon intensity of 475 g CO₂/kWh with 27% renewable energy. Because the model can be self-hosted, actual emissions depend on the grid powering your infrastructure.

How much water does LLaMA 3.3 70B use?

Each query consumes approximately 2 mL of water, primarily used for cooling the data centers that process the request.

How does LLaMA 3.3 70B compare to a Google search?

A LLaMA 3.3 70B query uses about 3x more energy than a Google search. A Google search uses approximately 0.3 Wh, while LLaMA 3.3 70B uses 1.0 Wh.
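The comparison above is just the ratio of the two energy estimates; a quick sketch of how it is computed:

```python
LLAMA_WH = 1.0          # LLaMA 3.3 70B, per this page's estimate
GOOGLE_SEARCH_WH = 0.3  # commonly cited figure for one Google search

ratio = LLAMA_WH / GOOGLE_SEARCH_WH
print(round(ratio, 1))  # 3.3, reported above as "3x"
```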

Technical Details

Architecture

Dense Transformer (decoder-only)

Parameters

70B

Context window

128,000 tokens

Release date

2024-12-06

Open source

Yes

Training data cutoff

2024-12