Ethics & LLMs: Sustainability

Environmental concerns about artificial intelligence are legitimate and deserve serious consideration. If your organization cares about people—and if you're reading this, it probably does—then you care about them having a habitable environment to live in. But when it comes to making informed decisions about whether and when to use AI, it's crucial to put these impacts in perspective.

The Individual vs. Systemic Responsibility Problem

One of the biggest issues with current AI environmental discourse is that it places responsibility on individual users for what are fundamentally systemic problems.

As Andy Masley's detailed analysis shows, LLM use accounts for only 1-3% of total AI energy consumption, with the vast majority going to enterprise analytics, recommendation systems, and other behind-the-scenes applications.

This focus on individual behavior mirrors earlier climate conversations that emphasized personal recycling while major corporations continued massive emissions unchecked. The real opportunities for environmental improvement lie in what tech companies and governments can encourage.

What Are You Really Comparing It To?

The commonly cited statistic that "ChatGPT uses 10 times more energy than a Google search" was true of GPT-3 but is misleading today: the GPT-4-era models that paid accounts default to use about 0.0003 kWh (0.3 Wh) for a typical query, the same as estimates for a Google search. And even GPT-3 used only about 2.7 watt-hours more than a Google search, roughly equivalent to running a laptop for 3 minutes.
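If you want to check the unit conversions yourself, here is a minimal Python sketch. It uses only the figures quoted above; the variable names are mine, and the "implied laptop wattage" is simply what the 3-minute comparison works out to, not a measured value.

```python
# Figures quoted in the paragraph above (estimates, not measurements).
GOOGLE_SEARCH_WH = 0.3    # one Google search
GPT4_ERA_QUERY_WH = 0.3   # one query on a GPT-4-era model
GPT3_EXTRA_WH = 2.7       # extra energy of a GPT-3 query vs. a search


def wh_to_kwh(wh: float) -> float:
    """Convert watt-hours to kilowatt-hours."""
    return wh / 1000.0


gpt3_query_wh = GOOGLE_SEARCH_WH + GPT3_EXTRA_WH
ratio_gpt3 = gpt3_query_wh / GOOGLE_SEARCH_WH
ratio_gpt4_era = GPT4_ERA_QUERY_WH / GOOGLE_SEARCH_WH

# Power draw a laptop would need for 2.7 Wh to last 3 minutes.
implied_laptop_watts = GPT3_EXTRA_WH / (3 / 60)

print(f"GPT-4-era query: {wh_to_kwh(GPT4_ERA_QUERY_WH):.4f} kWh")
print(f"GPT-3 vs. search: about {ratio_gpt3:.0f}x")
print(f"GPT-4-era vs. search: about {ratio_gpt4_era:.0f}x")
print(f"Implied laptop draw: about {implied_laptop_watts:.0f} W")
```

The "10 times" figure falls out of the GPT-3 numbers, while the newer estimate puts a query on par with a search.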

But the more relevant question is the counterfactual: what is the LLM prompt replacing?

If you're asking ChatGPT something you could have googled in one search, then yes, Google queries are the right counterfactual. But many LLM use cases replace much more resource-intensive alternatives:

  • Quick look-up tasks: This is where Google is the right comparison. The energy use estimates are the same (assuming you find your answer on the first try with either search or an LLM), although, depending on your model, Google probably wins in terms of speed.

  • Research synthesis: Using an LLM instead of opening 20+ web pages and reading through them? Scanning 20 web pages (device + network) typically consumes ~6 Wh; one ChatGPT answer ≈ 0.3 Wh. Don’t even get me started on printing: each page printed on a laser printer uses 0.4 Wh, more than an LLM query run on modern models.

  • Problem-solving: Using an LLM instead of lengthy email chains or multiple video calls with colleagues is a great use of your time. Cutting a single 30-minute Zoom call (~0.08 kWh) saves the energy of roughly 260 ChatGPT prompts.
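To make the counterfactual comparison concrete, here is a rough Python sketch that expresses each alternative in "prompt equivalents". The energy figures are the estimates quoted in the list above; the function and dictionary names are just illustrative.

```python
# Energy estimates from the list above, in watt-hours (Wh).
LLM_PROMPT_WH = 0.3

ALTERNATIVES_WH = {
    "scanning 20 web pages": 6.0,
    "printing one laser page": 0.4,
    "30-minute Zoom call": 80.0,  # ~0.08 kWh
}


def prompt_equivalents(activity_wh: float, prompt_wh: float = LLM_PROMPT_WH) -> float:
    """How many LLM prompts use the same energy as one activity."""
    return activity_wh / prompt_wh


for activity, wh in ALTERNATIVES_WH.items():
    print(f"{activity}: about {prompt_equivalents(wh):.1f} prompts")
```

With these inputs the Zoom call comes out near 267 prompts, in line with the rounded figure in the text. Treat all of these as order-of-magnitude comparisons, since the underlying estimates vary by source.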

Another note: I hear all the time that ChatGPT queries use about a 500 mL (16.9 oz) bottle of water each. Best I can tell, that estimate comes from this study, which actually estimates that (in the US) 29.6 prompts use 500 mL of water on average. Also, this study used GPT-3 models, the state of the art at the time, rather than GPT-4. Finally, most of the water use calculated using this method comes from electricity generation, not direct data center cooling, and it includes water used in training, so the additive approaches people often use when citing this statistic (a water bottle, plus the electricity; or X mL per query and Y gallons for training) are double counting.
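Here is the per-prompt arithmetic as a small Python sketch, using only the two numbers from the study as described above. The variable names are my own.

```python
# Figures as described above: 29.6 prompts per 500 mL (US average, GPT-3 era).
BOTTLE_ML = 500.0
PROMPTS_PER_BOTTLE = 29.6

# Water attributed to a single prompt under the study's estimate.
ml_per_prompt = BOTTLE_ML / PROMPTS_PER_BOTTLE

# How far off the "one bottle per query" version of the claim is.
overstatement_factor = BOTTLE_ML / ml_per_prompt

print(f"Water per prompt: about {ml_per_prompt:.1f} mL")
print(f"'One bottle per query' overstates this by about {overstatement_factor:.1f}x")
```

That works out to roughly 17 mL per prompt, and because that figure already includes water used for electricity generation and training, adding those on top again would double count.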

Scale Matters: Putting AI Use in Context

Aside from activities that are a direct replacement for LLM use, consider other activities that your organization does or requires you to do regularly to identify opportunities for reducing your organization’s footprint. (See sources at the end of this post.) Sources and methods vary and are difficult to compare across activities, so I have included high and low estimates.

Assumptions: 60 ChatGPT 4o prompts; 10 pages printed with a laser printer; 1 hour of laptop use for browsing the web and editing documents; streaming video for one 44-minute Netflix episode; a video call that is a Zoom HD group call; 250 mL (~1 cup/8 ounces) of drip coffee.

(Note that the 250 mL coffee calculation does not include the 250 mL of water in the coffee you actually drink: that’s considered consumptive use and excluded.)

If you're concerned about environmental impact, skipping a single beef burger saves more water than avoiding ChatGPT for an entire lifetime. Other horrifying comparisons include almonds and almond milk, and a 15-mile commute in a hybrid car.

I calculated the latter to include on these charts, but it would have blown out the x axis and made these charts unreadable: a 15-mile commute in a Prius uses 450-1050 mL of water and 8-12 kWh (not 8 hundredths or 8 tenths. Eight! And that’s in a 50 MPG car). Perhaps the most impactful, accessible environmental choice you can make for your organization is adding a work from home day or two.
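As a sanity check on the commute number, here is a back-of-the-envelope Python sketch. The 15 miles and 50 MPG come from the paragraph above. The 33.7 kWh per US gallon of gasoline is my added assumption (it is the energy-equivalence figure the US EPA uses for MPGe), and the 0.3 Wh per prompt is the estimate used elsewhere in this post.

```python
# From the paragraph above.
COMMUTE_MILES = 15.0
MILES_PER_GALLON = 50.0

# Assumption, not from the original chart data: EPA gasoline-gallon equivalence.
KWH_PER_GALLON_GASOLINE = 33.7

# Per-prompt estimate used elsewhere in this post.
LLM_PROMPT_WH = 0.3

gallons_used = COMMUTE_MILES / MILES_PER_GALLON
commute_kwh = gallons_used * KWH_PER_GALLON_GASOLINE
prompt_equivalents = commute_kwh * 1000.0 / LLM_PROMPT_WH

print(f"Fuel used: {gallons_used:.2f} gallons")
print(f"Energy: about {commute_kwh:.1f} kWh")
print(f"Same energy as about {prompt_equivalents:,.0f} LLM prompts")
```

This lands at roughly 10 kWh, inside the 8-12 kWh range above, and on the order of tens of thousands of prompts. It only counts the energy content of the fuel, so treat it as a rough cross-check rather than a full life-cycle estimate.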

Practical Steps for Mission-Driven Organizations

The purpose of this article is not to say that you should ignore the environmental impacts of LLM use because other things are worse, but rather to help you put things in perspective and help you identify where to focus your sustainability efforts. Here are some ways you can reduce the environmental impact of your LLM use:

1. Choose Efficient Models. Counter, perhaps, to your instincts, this means that you may be able to reduce your environmental impact ~10x by upgrading to a paid account. Neither Anthropic nor OpenAI has made it clear whether they have implemented their efficiency improvements in the older models after they developed them, but only older models are available on free accounts.

Regardless of whether you pay, you can reduce waste by choosing the smallest model that meets your needs for each task: for example, Claude Sonnet instead of Opus, or ChatGPT models with "mini" in the name. Claude models seem to use slightly fewer resources per query overall than ChatGPT ones, although it’s tough to find info with comparable measures. If you do not need the “deep research” setting or the “thinking” features, don’t use models that use them.

To mimic the features of a “thinking” model with a smaller model, take the time to break out the tasks yourself in your prompt (e.g., “first, ask me a few questions to help you understand the problem better. Then, come up with a rubric to use to evaluate solutions. Then, list our options. Then, evaluate each using your rubric. Please return the rubric with your scores, your top three recommendations and why.”).

2. Audit Your Vendors: Ask AI vendors about their water and energy policies, especially before committing to them. Your free Mission-First AI Starter Kit includes questions for vendors about sustainability and other ethical issues, including red flag and gold star answers for each question.

3. Optimize Your Usage

  • Plan queries in advance when possible. Making sure your goals are clear before you start asking goes a long way toward reducing the number of queries you use for each task.

  • Learn the "jagged frontier" (with thanks to Ethan Mollick) of what AI does well in your domain to avoid wasted attempts

  • Use AI for high-value tasks that replace more resource-intensive alternatives

4. Use LLMs to help you reduce your environmental impact overall: Use LLMs to identify environmental improvements in your own operations.

Perhaps something like: “We are a [department domain] in [organization type] that works on [mission] using [key business processes] and we want to improve our impact on the environment.” Then you can attach any information about your carbon footprint and ask it to help you think of ways to reduce it. Or, if you don’t already have a view of your environmental impact, ask the LLM to ask you questions to quantify it, or to identify low-hanging fruit to reduce it using assumptions about organizations in your domain.

5. Make Strategic Trade-offs: Cancel one hour-long Zoom call per quarter—it saves more resources than avoiding AI for months. Choose video calls over flights when possible. Optimize your organization's largest environmental impacts first.
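To see how the Zoom trade-off depends on how much you actually use AI, here is a rough Python sketch. The 0.08 kWh per 30-minute call and 0.3 Wh per prompt are the estimates used earlier in this post. The prompts-per-day values are hypothetical, and I am assuming call energy scales linearly with call length.

```python
# Estimates used earlier in this post.
ZOOM_30_MIN_WH = 80.0   # ~0.08 kWh per 30-minute call
LLM_PROMPT_WH = 0.3

# Assumption: an hour-long call uses twice the energy of a 30-minute one.
hour_call_wh = ZOOM_30_MIN_WH * 2
prompts_offset = hour_call_wh / LLM_PROMPT_WH


def days_of_prompts_offset(prompts_per_day: float) -> float:
    """Days of LLM use matched by skipping one hour-long call."""
    return prompts_offset / prompts_per_day


print(f"One hour-long call: about {prompts_offset:.0f} prompts")
for per_day in (5, 20, 60):  # hypothetical usage levels
    days = days_of_prompts_offset(per_day)
    print(f"At {per_day} prompts/day: about {days:.0f} days of use")
```

For a light user (a handful of prompts a day), one cancelled hour-long call covers a few months of prompts; for a heavy user it covers closer to a week or two. Either way, the call is the bigger lever.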

Advocating for Systemic Change

The most impactful thing mission-driven organizations can do is advocate for systemic changes:

  • Support renewable energy for data centers

  • Advocate for water reuse solutions in tech infrastructure

  • Push for transparency in environmental reporting across sectors

  • Engage with policymakers on sustainable AI development standards

  • Commit to preferring vendors with sustainable practices

The Bottom Line

Environmental concerns about AI are valid, but the current discourse often misdirects attention from high-impact systemic changes to low-impact individual behaviors. Focusing on individual consumption choices can unintentionally impede progress on the structural changes needed to address climate change effectively.

Use AI thoughtfully, choose efficient options, and direct your primary environmental advocacy efforts toward the changes that will make the biggest difference. Make sure you are considering the right counterfactual when you are evaluating the environmental impact of your work, rather than wholesale avoiding a particular technology or practice.

The climate crisis requires all the tools we can get—including AI's potential to accelerate solutions.

LLM disclosure:
I started with a really thorough-looking analysis of LLM impact, but it was on a blog (so, not peer reviewed) and outside of my area of expertise. So I asked Claude Sonnet 4:
Can you evaluate this source for me? How would you rate the data, methods, sources, analysis, and robustness of the conclusions? (Not whether you think the conclusions match others conclusions, but whether they follow from the evidence)

I was pretty impressed with its response: it was more critical than I expected (perhaps because it wasn’t the user’s writing?) and clearly identified strengths and weaknesses, which I could confirm and prioritize myself.

Then, I asked the same model:
“I have some rambling notes about a chapter of my book. Can you draft a blog post using these notes, and refer to strong sources where necessary. Please link sources that you use. You can use the source above where it is strong.”

I edited the output, then put it in ChatGPT o3:
“Can you rigorously fact check this article? Please provide sources for the numeric claims, or update the claims and cite them.”

Then, I edited it again, and checked all the sources. I realized that all of the LLM estimates were about ChatGPT, so I asked: “thank you! Can you find information about how Claude queries compare?” (It pointed me to sources that suggested it was comparable and even a little lower, so I decided to be conservative and treat them as if they are the same).

Finally, I felt like the estimates for how much energy was used by different technologies were not great: neither Claude, nor ChatGPT, nor the long blog post I found cited all of their data sources, and the estimates varied. Plus, I wanted to make a visualization. So I tried to make my own.

I found that the sources vary widely and I couldn’t find any single source that used the same method to estimate across activities. So I returned to ChatGPT o3:

  • “In your first response, you gave me a great table comparing ChatGPT prompts, spotify stereming [sic], zoom calls, and netflix 4k video on both their energy and water use. Can you expand that table to include common office tasks and citations for each cell? Prefer peer reviewed sources, and if there are wildly different estimates, please note that.”

  • “thanks! can you add brewing a cup of coffee and commuting to the office in a hybrid vehicle?”

  • Your estimates for audio streaming and video seem really different from this source. Can you explain why they are so different?

  • Thank you for this analysis. I prefer the median approach, rather than the rhetorical one. Can you put the chart reflecting the median approach in a document (csv, ideally) but increase the Netflix estimate to the length of an episode (~44 minutes)?

Scale Matters analysis notes:
Unfortunately, it’s difficult to find single or easily comparable sources for each activity, and the specifics of each activity make some of them vary widely. I’ve tried to represent the ranges, but ultimately, the sources still aren’t very satisfying. I didn’t want to let this stand in the way of trying to get a general idea of how these activities stack up, but just know that I would not submit this for peer review.

Coffee-making plus this one
Data centers
Audio files for music streaming
Video calling
Video Streaming
Laptop use
Printing
Commute in a hybrid plus this one
