A chatbot reply arrives as text on a screen, with nothing about it to suggest weight, heat, or water. Yet the most quoted number in the debate over AI’s environmental cost is a measure of water. A 2023 study from the University of California, Riverside estimated that running GPT-3 through ten to fifty medium-length replies consumes about one 500ml bottle of freshwater. The figure is vivid, easy to repeat, and easy to get wrong.
It comes from one paper, “Making AI Less ‘Thirsty’” by Pengfei Li, Jianyi Yang, Mohammad A. Islam and Shaolei Ren, first posted in April 2023 and later published in Communications of the ACM. It is a calculated estimate built from public data rather than a reading off a meter in a data centre. The authors call the per-query figure conservative, and say real-world use could be several times higher. That caveat tends to disappear when the number travels.
The invisible infrastructure behind a text reply
The water never touches your query directly. It sits in the cooling systems that stop server halls from overheating, and in the power plants that supply the electricity those servers draw. When the authors say a model “drinks” water, they mean the freshwater that evaporates in cooling towers and at power stations while the computer does its work.
That framing is the study’s central move. As the authors put it, “GPT-3 needs to ‘drink’ (i.e., consume) a 500ml bottle of water for roughly 10 – 50 medium-length responses, depending on when and where it is deployed.” The part after the comma is what usually gets dropped. The bottle of water is not a fixed feature of the model but a function of where it runs and at what hour.
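To see how wide that range is, the quoted figure can be turned into a per-reply amount. The short Python sketch below is only an illustration of the arithmetic, not the study's method: it assumes the bottle is spread evenly across replies, and the function and variable names are hypothetical.

```python
# Figures as quoted from the study: one 500 ml bottle per 10 to 50 replies.
BOTTLE_ML = 500
RESPONSES_LOW, RESPONSES_HIGH = 10, 50


def water_per_response_ml(bottle_ml, responses):
    """Average water per reply, assuming an even split across replies."""
    return bottle_ml / responses


# Fewer replies per bottle means more water per reply, and vice versa.
most_ml = water_per_response_ml(BOTTLE_ML, RESPONSES_LOW)
least_ml = water_per_response_ml(BOTTLE_ML, RESPONSES_HIGH)

print(f"{least_ml:.0f} to {most_ml:.0f} ml per response")
# prints "10 to 50 ml per response"
```

On these assumptions the same reply costs anywhere from 10 ml to 50 ml, a fivefold spread before any of the authors' caveats about real-world use are applied.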
What the UC Riverside study actually measured
For training the model, the headline figure is large. Using Microsoft’s published efficiency data for different locations, the team estimated that training GPT-3 could directly evaporate 700,000 litres of clean freshwater on-site. That number covers only the water lost at the data centre itself.
The full picture is bigger. The study estimates that training GPT-3 could consume 5.4 million litres of water in total, adding on-site losses to the water used in generating the electricity. GPT-3’s training location was never made public, so the authors modelled it across several possible Microsoft sites. The 5.4 million figure is an estimate built on top of another estimate.
The split between those two numbers matters more than either alone. In a later guest piece, Ren and Amy Luers, who leads sustainability science and innovation at Microsoft, wrote that data centres use water both for cooling and through electricity generation, and that “that indirect use often makes up 80 percent or more of the overall water use.” Most of the water footprint, in many cases, never passes through the data centre at all — it sits upstream, at the power plant.
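The two training estimates can be set against each other to see the split. The Python sketch below is a rough illustration only: it assumes, as the text describes, that the 5.4 million litre total is simply the on-site figure plus the water used in electricity generation, and the variable names are hypothetical.

```python
# Training estimates as quoted from the study, in litres.
ONSITE_LITRES = 700_000     # evaporated at the data centre itself
TOTAL_LITRES = 5_400_000    # on-site plus electricity generation

# Assumption: the total is the sum of the on-site and off-site parts.
offsite_litres = TOTAL_LITRES - ONSITE_LITRES
offsite_share = offsite_litres / TOTAL_LITRES
onsite_share = ONSITE_LITRES / TOTAL_LITRES

print(f"Off-site: {offsite_litres:,} litres ({offsite_share:.0%})")
print(f"On-site:  {ONSITE_LITRES:,} litres ({onsite_share:.0%})")
# prints "Off-site: 4,700,000 litres (87%)"
# prints "On-site:  700,000 litres (13%)"
```

On these figures roughly 87 percent of the estimated training footprint sits off-site, which is in line with the “80 percent or more” that Ren and Luers describe for data centres generally.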
Why location and timing swing the numbers so widely
The same task can carry very different water costs depending on where it runs. The UC Riverside team found that training GPT-3 in Microsoft’s less efficient Asian data centres would have roughly tripled the water use compared with the U.S. figure.
Timing is the other lever. Ren has suggested that running heavy computing in cooler hours, and shifting power sources, could reduce the toll. “We can’t shift cooler weather to noon, but we can store solar energy, use it later, and still be ‘green’,” he argued. That is a proposal rather than a proven solution: storing energy to dodge midday heat trades one constraint for another.
What the figure does and does not tell us
The 500ml bottle is a useful way to explain the problem and a poor universal rule. It works because it turns something abstract into something you can picture, but it misleads when people treat it as a fixed cost per query rather than a range that shifts with place, season, hour, and model. The authors gave a range of ten to fifty replies for a reason, and even called that conservative.
The paper projects global AI water withdrawal reaching 4.2 to 6.6 billion cubic metres in 2027, more than several countries withdraw in a year. The single-chat figure can be uncertain and still point at a trend worth watching across a whole industry.
The estimate is narrower than the headlines suggest. Rather than passing verdict on any single conversation, it argues for disclosure — the companies that hold the meters should report where and when their models run. Without that, every public figure stays a careful guess, and only the operators of the data centres can supply the accounting the study calls for.