The Role of Energy in AI at Scale
My last post examined how electric power and energy will increasingly underpin national strategic power and growth as we lean into leveraging AI—and how we should reconsider electric power as a strategic national priority.
Now we're diving into one piece of that: The cost of using AI at scale.
We know that AI needs energy. We see the extensive datacenter build-out and we've heard of the massive amount of resources required just to train recent frontier models, from data and hardware to water, energy, and compute time. But there is a bigger problem. We haven't talked nearly enough about what it will take to power our use of AI over time and at scale. As we increasingly use AI, our need to power that enhanced productivity will present a new challenge to our nation.
Energy will be the new limiting factor for using AI tools. We will soon see the cost of using AI track closely with the cost of electricity. At this point, electricity itself will become a force multiplier for individuals, companies, and entire nations. Electric power will be central to national power. So how much will it cost to power our use of AI, and what are the factors that will influence that?
Why the cost to use AI will converge with the cost of electricity
Let's begin with the cost of using AI. Today that cost sits well above the underlying electricity cost, but it will soon track with the cost of electricity.
Much of the recent discussion covers companies training the newest, largest, most performant large language models. Once a model is trained, using it is known as "inference": sending input data (a question, perhaps) and receiving an output answer. After a model with billions of parameters has been trained on vast amounts of data, the computer essentially uses the relationships it has learned to pattern-match words and ideas and build a response. In doing so, it is completing a task.
Right now the big model companies, such as OpenAI, Anthropic, and Google, have different pricing structures for using their models (inference). Some have monthly subscriptions while others have fees for API (application programming interface) calls.
As we move into the future, however, the cost of having AI complete a task will converge to the cost of electricity to run the compute hardware for that task. This is because the cost to train the model and the cost to build the data center are fixed, one-time costs that will be spread across an increasingly large amount of usage. But each time you use AI, you will still have to pay for the associated electricity.
Especially once companies and individuals start using their own custom models and building multi-model workflows, the limiting factors for AI outputs will come down to two simple questions:
1) Do we have the power available, and
2) How much does that electricity cost?
Let's dissect what influences the cost of inference and what those costs look like as nationwide use increases.
What goes into the cost of inference?
Once a model is trained, there are several factors that influence the cost of using the model for inference. The costs to use AI can be broken down as follows:
(Amortized cost of the model weights) + (Amortized cost of the compute hardware) + (Cost of the energy required for the calculations) = Marginal cost of one "unit" of inference (such as answering one question)
The model weights (the trained parameters) are loaded onto GPU (graphics processing unit) hardware, which runs many parallel calculations to determine an output from the inputs. The proportional cost of each piece of this equation will scale differently as usage increases over time.
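To make the breakdown concrete, here is a minimal sketch in code, using my own notation and placeholder numbers rather than anything from the post: the fixed costs are amortized over the total number of queries served during the life of the model and hardware, while electricity is paid on every query.

```python
# A minimal sketch of the cost breakdown above (my notation and placeholder
# numbers, not measured figures). Fixed costs are amortized over the total
# queries served; electricity is paid on every query.

def marginal_cost_per_query(training_cost: float,
                            hardware_cost: float,
                            total_queries: float,
                            kwh_per_query: float,
                            price_per_kwh: float) -> float:
    amortized_fixed = (training_cost + hardware_cost) / total_queries
    electricity = kwh_per_query * price_per_kwh
    return amortized_fixed + electricity

# As total_queries grows, the amortized term shrinks toward zero and the
# electricity term is all that remains.
print(marginal_cost_per_query(100e6, 500e6, 1e12, 0.0004, 0.15))
```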
With increased usage, fixed costs spread out to nearly zero
In 2025, we are at the very beginning of our use of AI. But as time goes on, we expect widespread usage to increase. As fixed costs (in this case, the cost of training the model and the cost of compute hardware) are spread over increased usage, the only major cost left per output will be electricity.
Given the competition in proprietary models and the quality of open-source models, we can expect the cost of acquiring a high-quality model to go to nearly zero. (More on this below, in the section "Rampant growth in AI use comes sooner than we think.")
The model weights and compute hardware, for their part, are fixed costs. As such, their per-unit marginal cost will go down as we use AI more. As general inference increasingly becomes a commodity and the largest models all perform at about the same level on basic tasks, the price should converge over time to the marginal cost to deliver the product. In this case, that delivery or production cost is the cost of electricity.
Electricity is the major cost remaining
Once initial training and hardware costs spread over time to nearly zero, the cost of power to run the equipment is what remains. The cost to run GPUs is a function of both the price of electricity ($ per kilowatt-hour, $/kWh) and the efficiency of the hardware.
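As a back-of-the-envelope illustration with assumed numbers (not measurements), the electricity cost of a single request is roughly the GPU's power draw multiplied by the GPU-time it consumes, priced at the local rate:

```python
# Back-of-the-envelope sketch with assumed numbers (not measured data).
gpu_power_kw = 0.7        # assumed draw of one datacenter GPU under load
seconds_per_query = 2.0   # assumed GPU-seconds to answer one question
price_per_kwh = 0.15      # assumed electricity price in $/kWh

energy_kwh = gpu_power_kw * seconds_per_query / 3600  # kWh used by one query
cost_per_query = energy_kwh * price_per_kwh
print(f"{energy_kwh * 1000:.2f} Wh -> ${cost_per_query:.6f} per query")
# Roughly 0.39 Wh and about $0.00006 under these assumptions. Hardware
# efficiency shrinks the Wh term; electricity prices move the $/kWh term.
```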
If history is any guide, hardware efficiency will keep improving, which reduces the amount of energy required for each calculation.
At the same time, while electricity prices have historically been fairly steady, it is not clear what happens next. I suspect they will rise faster than their historical trend in the near term, though by how much is uncertain. This overall cost of electricity to run AI inference hardware is what will drive the cost of using AI going forward.
The cost of using AI will track with energy
I put together a visualization to show how the primary cost of using AI in the future will be energy. The graph below shows the possible cost of inference over the next decade, with the cost components stacked on top of each other and summed. The top line is the total cost per use of AI. Each major component (model training, hardware, and energy) starts at about one-third of the overall cost in the first year. The total cost in 2023 is represented as a unitless reference of "100."
This first graph illustrates the situation if electricity prices stay flat—in this case, the total cost of inference falls over time. The fixed costs (model training, inference hardware) are spread across usage. The total cost of electricity scales with the total amount of AI inference done. If we assume increased usage volume over time, the fixed costs are reduced per unit and the marginal cost of inference converges with the cost of the electricity to power it.
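Here is a minimal numerical sketch of that dynamic under illustrative assumptions of my own: usage doubles every year, the electricity cost per unit of inference stays flat, and each component starts at one-third of a normalized 2023 total of 100.

```python
# Illustrative projection, normalized so the 2023 total is 100 and each
# component starts at about one-third of that. Assumes usage doubles yearly.
TRAINING_SHARE = 33.3    # per-query share of amortized training cost in 2023
HARDWARE_SHARE = 33.3    # per-query share of amortized hardware cost in 2023
ENERGY_PER_QUERY = 33.4  # per-query electricity cost, held flat

for i, year in enumerate(range(2023, 2034)):
    usage = 2.0 ** i  # relative usage volume versus 2023
    fixed = (TRAINING_SHARE + HARDWARE_SHARE) / usage
    total = fixed + ENERGY_PER_QUERY
    print(f"{year}: fixed {fixed:6.1f} + energy {ENERGY_PER_QUERY} = {total:6.1f}")
```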
Given these assumptions, we see that the cost of AI will be effectively the cost of electricity.
Our most limiting factor for AI use will be the amount of electricity we have to run it.
These trends mean that more electricity available equates to more productivity from AI. This could be on a personal, corporate, or national level. When the conversion of electricity to productivity is so direct, electricity—and the ability to get more—becomes extremely valuable.
As I mentioned, I suspect that rapid growth in AI inference demand has the potential to strain the electric grid in the United States, especially because our ability to add new generation, transmission, and distribution hardware is unlikely to scale as fast as demand for AI. I expect we will see a modest increase in generation capacity, but then it will slow. Soon after, demand for electricity will outpace that constrained supply, and prices for electricity will rise.
As a forecasting example, on GE Vernova's third-quarter earnings call on October 22, CEO Scott Strazik remarked that the company's total non-refundable backlog and slot reservations for gas turbines had grown to 62 GW, a figure from a single company that represents about five percent of the total installed capacity on the United States grid.1 A few weeks later, he confirmed that the company's gas turbine production capacity was "largely sold out" through the end of 2028.2
That is one signal our demand is already surging beyond the capacity of supply.
Our demand for AI—and thus, electricity—could scale exponentially, while our ability to grow grid capacity is more linear due to supply chain constraints, permitting and construction timelines, and the complex nature of the system. The result is an increasingly large supply-demand mismatch and corresponding high prices for electricity and AI.
The graph below shows the marginal cost of AI inference if the cost of electricity rises by 8% year-over-year in just 2026, 2027, and 2028. For comparison, the consumer price index (CPI) for electricity for urban consumers rose 6.19% in the twelve months preceding August 2025 and averaged a rise of 6.77% per year over the last five years. This may not be perfectly representative, as electricity prices vary by area and type of use (residential, commercial, etc.).
With the 8% assumption, we see that prices go up in the near term and, if we somehow build out our ability to rapidly add generation capacity, electricity prices could level out in the long term. The amount of change is less important than illustrating the correlation between electricity prices and the cost of AI inference.
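Extending the same sketch, the 8% scenario simply lets the per-query electricity cost escalate in those three years while everything else stays the same (again, purely illustrative numbers):

```python
# Same illustrative model as above, but the electricity cost per query rises
# 8% per year in 2026-2028 only, then holds flat.
energy = 33.4
for i, year in enumerate(range(2023, 2034)):
    if year in (2026, 2027, 2028):
        energy *= 1.08
    fixed = (33.3 + 33.3) / 2.0 ** i   # same assumed doubling of usage
    print(f"{year}: total per query = {fixed + energy:6.1f}")
```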
Rampant growth in AI use comes sooner than we think
The data below illustrates the latest AI model situation, with performance getting very good, competition rising, and costs falling. I would argue even the free models are "good enough" for many productivity-enhancing tasks, and all they will need is energy to run. The gap between AI tools and useful work is no longer research and development in the model labs; it is the energy to run the existing tools at scale.
Quick thank you to Epoch AI3, Our World in Data4, and Ember5 for helping put numbers to these observations.
Models are getting better
We all know the frontier models are getting quite good, and this data helps illustrate it. Model quality is already high enough for many tasks, especially when a model is fine-tuned for a specific use. And they are improving every day; today's performance is as low as it will ever be.
The area to the upper right beyond this graph is roughly where artificial general intelligence (AGI) and/or artificial superintelligence (ASI) would reside. Though surely impressive and likely very useful for further research, it is not obvious that such quality is necessary for many tasks once existing models are customized. With specific training, existing models are already capable of replacing human effort on certain tasks and increasing individual, corporate, and national productivity. We don't need them to get much better to see high usage.
The graph below shows the best ("frontier") performance of models on each benchmark over the last few years.6 On many benchmarks, the models have surpassed expected human performance.
The Epoch Capabilities Index (ECI) combines scores from many different AI benchmarks into a single "general capability" scale, allowing comparisons between models even over timespans long enough for single benchmarks to reach saturation.7 We see the same trend upward.
Free, open-weight AI models are very good
High performance is not reserved for the paid models. The performance of freely available, open-weight models tends to lag the performance of proprietary models by only about 6 to 12 months. These models are very good and you can download them today for free. You only need a good gaming computer and electricity to run many of them.
The graph below shows the frontier of free, open-weight models is often not far behind the frontier of proprietary models. The performance of some recent open-weight models has already surpassed that of very impressive proprietary models that came out only months ago.
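As a sketch of how low the barrier already is, the snippet below loads one example open-weight model with the Hugging Face transformers library; the specific model name is just one freely downloadable option, and any machine with enough memory (and electricity) can run something similar.

```python
# A minimal sketch of running an open-weight model locally (assumes the
# Hugging Face `transformers` library and a downloaded checkpoint; the model
# name below is one example of a freely available open-weight model).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example open-weight model
    device_map="auto",                           # use a local GPU if available
)

result = generator("Summarize why inference cost tracks electricity cost.",
                   max_new_tokens=100)
print(result[0]["generated_text"])
# Once the hardware is on your desk, the only recurring cost is electricity.
```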
How the era of small and custom AI models impacts electricity use
The next wave of increased usage will come from small and custom models. Most of the trend over the last few years has been to scale AI models larger to improve performance, but that is increasingly not the only option. It's true that training on more data usually supports more parameters and more refined pattern matching. But the resulting models are large to store and heavy on compute resources, such that only big datacenters can train them and host them for inference. Most AI usage today appears to come from people experimenting with these general-purpose tools offered by the tech companies.
But what if bigger isn't necessarily better?
Recently there has been a push toward smaller models, and this seems the logical way forward for many use cases. When you have a specific task to complete, such as taking customer calls for one company or product, automating billing or scheduling, or organizing and categorizing specific data, you can run a smaller model designed specifically for that single narrow use case. Your model won't also need expert knowledge in biology, chemistry, geography, and car repair to be useful.
The next wave of models should go down-market as people and companies create their own tools. They will likely base them on existing open-source options and fine-tune them with their own data, then customize prompts and parameters for certain tasks.
And the models wouldn't have to be large or need extensive equipment; the companies could run several smaller models in parallel to improve results.8
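For illustration, a multi-model workflow can be as simple as fanning one request out to a few small, task-specific models and voting on the result; run_model below is a stand-in for whatever local inference call you use, not a real API.

```python
# An illustrative sketch (my construction, not the author's) of a multi-model
# workflow: several small task-specific models answer the same request in
# parallel and a simple vote picks the final output.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def run_model(model_name: str, prompt: str) -> str:
    # Placeholder: call your locally hosted small model here.
    return f"answer from {model_name}"

def ensemble_answer(prompt: str, models: list[str]) -> str:
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = list(pool.map(lambda m: run_model(m, prompt), models))
    # Majority vote; each extra model adds compute time and therefore energy.
    return Counter(answers).most_common(1)[0][0]

print(ensemble_answer("Categorize this support ticket.",
                      ["billing-small", "scheduling-small", "triage-small"]))
```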
As this happens, "the most significant differentiator in the quality of results will come less from the size of the model and more from the amount of energy dedicated to each task." Again, as we move towards smaller models and multi-model workflows, the most significant limit to performance will be electricity.
These models may or may not be hosted in the large datacenters, but they will expand energy usage either way. Soon enough the question won't be around having enough electricity to train the next big model, but rather around having enough electricity to run inference with many smaller tools.
Self-hosting and inference at the edge
A trend toward AI usage on edge devices could also significantly grow overall AI use and electricity needs. In addition to demand on centralized datacenters, models are increasingly good enough to cause an explosion in self-hosted tools used at the edge of the system.
Such tools could allow for improved access and privacy control, custom model fine-tuning, and better control over availability and performance. These could also include mobile use cases where there is a need for faster response, operation in low network connectivity, or simplified control and access.
This evolution could grow power demand at an exponential rate as individuals and small businesses attempt to maintain competitiveness through custom models and/or edge computing, further straining the electric grid.
Again, the linear growth currently expected in the deployment of physical grid infrastructure would be unable to respond to this exponential growth of AI digital demand.
What could lead to AI using less energy?
There are some developments that could upend the trends and outlook above and cause us to need less energy to use AI.
First, computers have a track record of becoming more efficient. Moore's Law and the steady shrinking of chip features over the years have produced good results. That said, feature sizes in leading GPUs could be approaching physical constraints for which we do not yet have solutions. Tensor processing units (TPUs) and other application-specific integrated circuits (ASICs) have gained attention recently because they are reportedly much more efficient for inference than GPUs, which could help alleviate some of the forecasted power demand issues.
Next, traditional architectures of CPU plus GPU could be transforming with new iterations of hardware using unified memory and other integrated AI compute solutions. At the consumer level, there are several options9 that claim to offer advances. Depending on their performance, these could be game-changing for edge needs, but it is not obvious how these solutions could scale to the industrial data center level.
Seemingly further in the future, there are teams working on completely new compute tools in an effort to upend the traditional architectures. As an example, the startup Extropic is working on probabilistic computing hardware that they believe would be thousands of times more efficient than GPUs.10
Two points still have me concerned about our future electric power needs.
The first comes from international competition: even if a country develops a computational or hardware solution that is a step-change upgrade in efficiency, it is not obvious that it could maintain a reliable monopoly over that advance. Eventually, one would expect some aspects of it to spread.
The second concern is the speed of development. While advances and solutions will be available to us at some point in the future, I am less convinced we will be able to develop and build them quickly enough to avoid significant supply-demand issues.
AI and Electricity in the US and China
If the future is a direct conversion of electricity into productivity through AI, the following geopolitical trends have me concerned about the United States' ability to stay relevant.
The People's Republic of China (PRC) AI models are not significantly behind those from the US
The United States does not seem to have an insurmountable advantage or lead in the development of AI performance. Not only that, but most of China's major AI development is open source, which likely adds to the speed of further development. It would not be surprising to find China in the lead with any upcoming model.
Moving past the race toward AGI, the existing models are already quite capable of accomplishing many tasks and enabling productivity with the right processes and tuning.
The PRC has been building out their power infrastructure
China is rapidly becoming the leading electro-state. They are using more power, adding all types of generation capacity, and have been developing their electrical component manufacturing industry into a flywheel of production that further pushes costs down.
The graph below shows how much power they are using to run their country; it is significantly more than the United States uses.
They are also demanding increasingly more electricity per person—and they are meeting that demand.
The PRC is very serious about this growth, with annual added capacity spiking in recent years.
How are they accomplishing this rapid capacity increase? They are installing more of everything. They are non-discriminatory when it comes to adding electrical generation capacity.
And then there are robots
Most mentions of AI above refer to LLMs and digital tools. But the next wave of production after automating digital tasks is automating manual tasks with physical tools—robots.
China has already seen massive growth in industrial robots, such that there are reports of "dark factories."11 The United States is also seeing increasing competition in the space12 and these robots are sure to be hungry for electricity. We have a lot of work to do if we hope to keep up.
Related: Winning the Next War
Footnotes
1. https://www.gevernova.com/sites/default/files/gev_webcast_transcript_10222025.pdf
2. https://www.wsj.com/podcasts/wsj-the-future-of-everything/the-worlds-tech-giants-are-running-out-of-power-this-ceo-plans-to-deliver/37f59519-823c-4e49-a6fc-7f7f2b2ad36a
9. Apple M5 (Amazon), GMKTec EVO-X2 (Amazon), NVIDIA DGX Spark (Amazon)
11. https://www.wsj.com/tech/ai/ai-robots-china-manufacturing-89ae1b42
12. https://www.nytimes.com/2025/11/17/technology/bezos-project-prometheus.html