Practical, data-backed steps to cut your AI energy use — without giving up the tools you rely on.
~70%
Saved by model choice
~95%
Saved skipping reasoning
100×
More energy for a 100k-token vs 100-token prompt
Biggest Impact
01
Use the right model for the job
~70%
energy saved by choosing a smaller model
GPT-4o Mini handles most everyday tasks — summarisation, drafting, translation, Q&A — at roughly a third of the energy of GPT-4o. Gemini Flash uses even less. You don't need a flagship model to write an email (a code sketch follows this list).
GPT-4o Mini: ~0.4 Wh per query vs GPT-4o: ~1.3 Wh
Claude 3.5 Haiku: ~0.3 Wh vs Claude 3.5 Sonnet: ~1.0 Wh
Use flagship models only when quality visibly degrades with a smaller one
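If you reach models through an API rather than a chat window, the default can be baked in. Below is a minimal sketch using the OpenAI Python SDK; the `needs_flagship` flag is a made-up convenience, and the per-query figures in the comments are the estimates quoted above, not something the SDK reports.

```python
# Minimal sketch: default to the small model, opt into the flagship only when needed.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str, needs_flagship: bool = False) -> str:
    # gpt-4o-mini (~0.4 Wh/query) covers drafting, summaries, translation, Q&A;
    # reserve gpt-4o (~1.3 Wh/query) for tasks where the small model visibly fails.
    model = "gpt-4o" if needs_flagship else "gpt-4o-mini"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(ask("Draft a two-sentence status update about the Q3 launch."))
```

The same pattern works with any provider's SDK; the point is that the small model is the default, not the exception.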
02
Skip reasoning models for simple questions
Reasoning models like o3, o4-mini, and DeepSeek-R1 “think” through chains of internal reasoning before answering. This is powerful for maths, logic, and complex analysis — but wildly wasteful for simple questions.
o3: ~9 Wh per query, roughly 22× a typical ChatGPT query (~0.4 Wh)
DeepSeek-R1: ~4.5 Wh per query on typical infrastructure
Ask yourself: does this task need step-by-step reasoning, or just a good answer?
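One crude way to encode that question is a keyword heuristic that routes only multi-step problems to a reasoning model. The hint list and model names below are illustrative assumptions, not a tested classifier.

```python
# Sketch: send a prompt to a reasoning model only when it looks like a
# multi-step problem. The hint list is an illustrative assumption.
REASONING_HINTS = ("prove", "derive", "step by step", "debug", "trade-off", "optimise")

def pick_model(prompt: str) -> str:
    # Reasoning models (o3 at ~9 Wh/query) only earn their cost on maths,
    # logic, and complex analysis; everything else gets the small model.
    if any(hint in prompt.lower() for hint in REASONING_HINTS):
        return "o4-mini"       # a cheaper reasoning model
    return "gpt-4o-mini"       # ~0.4 Wh/query for ordinary questions

print(pick_model("Summarise this meeting transcript in five bullets"))  # gpt-4o-mini
print(pick_model("Prove the algorithm terminates, step by step"))       # o4-mini
```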
03
Keep prompts and threads lean
100×
more energy for a 100k-token vs 100-token prompt
Every token in your prompt gets processed by billions of parameters. A 100-token prompt uses ~0.4 Wh. A 100,000-token prompt uses ~40 Wh. Long conversation threads compound this — each message re-processes the entire history.
Start fresh conversations instead of appending to 50-message threads
Include all context upfront rather than drip-feeding across exchanges
Paste only relevant code snippets, not entire files
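To see how quickly a long thread compounds, here is a back-of-the-envelope sketch. The 200-tokens-per-message figure is an illustrative assumption; the point is that re-sending history makes the total tokens processed grow roughly quadratically with thread length.

```python
# Sketch: every new message re-sends the whole conversation, so the total
# tokens the model processes grow roughly quadratically with thread length.
# 200 tokens per message is an illustrative assumption.
TOKENS_PER_MESSAGE = 200

def total_tokens_processed(messages: int) -> int:
    # Turn k re-processes all k-1 earlier messages plus the new one.
    return sum(TOKENS_PER_MESSAGE * turn for turn in range(1, messages + 1))

print(total_tokens_processed(5))    # 3,000 tokens for a short, focused thread
print(total_tokens_processed(50))   # 255,000 tokens for a 50-message thread
```

Starting a fresh conversation with a one-paragraph summary of the relevant context resets that sum.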
04
Choose an efficient provider
Not all infrastructure is equal. Google runs Gemini on custom TPUs with 64% renewable energy matching — a median query uses just 0.24 Wh. Provider choice affects energy, carbon intensity, and water use.
Google (TPUs, 64% renewables): lowest per-query footprint
Azure Sweden (97% clean grid): best for carbon-conscious EU users
Local/on-device models: zero water cooling, your local grid mix
05
Batch your questions into one prompt
Each API call or chat exchange has fixed overhead: network, GPU spin-up, context loading. One detailed prompt that covers everything is far more efficient than five iterative back-and-forth messages.
Write your full request before hitting send — include examples, format, and constraints
Use system prompts or custom instructions to avoid repeating context
For coding tasks: describe the full feature, not one function at a time
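As a sketch of the habit, the snippet below folds several related requirements into a single, fully specified prompt instead of five separate exchanges. It assumes the OpenAI Python SDK; the requirements themselves are placeholders.

```python
# Sketch: one fully specified request instead of several round trips.
# Each extra call repeats the fixed overhead (network, spin-up, context),
# so related questions are cheaper when bundled into a single prompt.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

requirements = [
    "Parse the invoice date, vendor, and total from plain text.",
    "Reject malformed totals instead of guessing.",
    "Return the result as a typed dataclass.",
    "Include unit tests for the edge cases above.",
]

prompt = (
    "Write a Python invoice parser that meets all of these requirements:\n"
    + "\n".join(f"- {r}" for r in requirements)
    + "\nRespond with the code and tests in one message."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```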
Worth Knowing
06
Consider whether AI is the right tool
0.3 vs 0.4 Wh
Google search vs ChatGPT for a simple lookup
AI is transformative for creative work, analysis, and complex reasoning. But for simple factual lookups — “What's the capital of France?” — a search engine is faster, cheaper, and uses less energy.
Simple facts and definitions: use a search engine
Calculations: use a calculator or spreadsheet
Quick lookups in documentation: use Ctrl+F or site search
07
Time your heavy usage for cleaner grids
2–4×
carbon variation between peak and off-peak hours
Electricity grids are cleaner at certain times of day — typically when solar and wind generation peak. If you're running large batch jobs or fine-tuning, scheduling for low-carbon windows can meaningfully reduce your carbon footprint.
Midday in solar-heavy regions (California, Spain, Australia)
Overnight in wind-heavy regions (Northern Europe, Texas)
Tools like electricityMap.org show real-time grid carbon intensity
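For batch or fine-tuning jobs you control, that check can be automated. The sketch below gates a job on real-time carbon intensity from Electricity Maps; the endpoint, zone code, response field, and threshold are assumptions based on their public API and should be verified against the current docs before use.

```python
# Sketch: wait for a low-carbon window before starting a heavy batch job.
# The Electricity Maps endpoint, zone code, and "carbonIntensity" field are
# assumptions based on their public API docs -- verify before relying on them.
import os
import time

import requests

ZONE = "SE"                  # the grid zone your job actually runs in
THRESHOLD_G_PER_KWH = 150    # illustrative cutoff; tune for your region

def grid_carbon_intensity(zone: str) -> float:
    resp = requests.get(
        "https://api.electricitymap.org/v3/carbon-intensity/latest",
        params={"zone": zone},
        headers={"auth-token": os.environ["ELECTRICITYMAP_TOKEN"]},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["carbonIntensity"]  # gCO2eq per kWh

if __name__ == "__main__":
    while grid_carbon_intensity(ZONE) > THRESHOLD_G_PER_KWH:
        time.sleep(30 * 60)  # re-check every half hour
    print("Grid is clean enough: launch the fine-tuning or batch run here.")
```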
Quick Reference
A cheat sheet you can come back to.
Before you prompt
Do I need AI for this, or will a search do?
Am I using the smallest model that works?
Is reasoning mode actually necessary?
While you prompt
Is all my context in one message?
Am I including only what’s relevant?
Can I start fresh instead of continuing a long thread?
Over time
Have I checked if a more efficient model launched?
Am I using a provider with clean energy?
Can I batch heavy workloads to low-carbon hours?
See the numbers for yourself
Calculate the energy, carbon, and water cost of your AI usage across 40+ models.