Energy Future: Powering Tomorrow’s Cleaner World

Power Grab: AI Data Centers and the Electric Grid - Part 2

Peter Kelly-Detwiler Episode 24

Picture a future where AI not only writes a quarter of Google's new code but also reshapes entire industries with unparalleled cost-efficiency. This is not a distant possibility but a current reality, as we unpack the transformative power of AI in our latest episode. We promise a comprehensive understanding of the staggering $200 billion investments by tech giants in AI data centers, a move redefining infrastructure landscapes. Discover how the might of GPUs is propelling AI models to new heights, amid challenges of chip shortages and the complex dance of data availability. Delve into the intricacies of tokenization and its critical role in processing data for these modern AI systems.

But what does this mean for the future of energy and data centers? We explore the looming energy challenges, pondering the fate of investments if tech efficiency outpaces demand by 2030. Could relentless advancements, like Nvidia's impressive GPU energy reductions, lead to stranded assets? As data centers grapple with these energy demands, we speculate on how future innovations might reshape their power needs. Join us as we examine the evolving strategies for energy acquisition and set the stage for further discussions on the interplay of AI, computing capabilities, and the energy landscape.


Speaker 1:

In our first session, we reviewed some of the staggering projections for data center load growth. Today we're going to talk about chips, compute and power draw. But before we get started, let's quickly address four interesting AI-related factoids that have come out in just the past week since our last video. First, Exelon's CEO says it has seen what it terms high-probability data center load jump from 6 to 11 gigawatts this year. Second, Google's CEO indicated that over 25% of new code at the company is being generated by AI and then reviewed by engineers. Many observers ask: where's the money in AI? Well, Google eating its own dog food in this coding case is a pretty solid use case. Programmers don't come cheap, and programming is probably not a good industry to be in these days. It won't be the only one where AI substitutes for human labor, and that's a huge part of the AI value proposition. Third, the Financial Times estimates that spending on AI data centers for the big four, Alphabet, Amazon, Meta and Microsoft, will exceed $200 billion this year and be similar next year. And fourth, the Federal Energy Regulatory Commission just rejected a request from Amazon Web Services to expand a contract involving co-location of a data center, and consumption of power for that data center, directly connected to Talen Energy's Susquehanna nuclear power plant in Pennsylvania. More on that in another video.

Speaker 1:

Okay, with that out of the way, let's talk compute. I noted in our last session that compute capabilities for ChatGPT have soared by four orders of magnitude, that's 10,000 times, in just five years. Why and how did that happen, and why are these guys burning through hundreds of billions of dollars in cash? Well, the key goal here for the competitors in the space is to improve the quality of their large language models so that they can deal with more complex logic and increase overall accuracy. The models do this by training on as much data as they can get, with machines as powerful as they can make them, so-called scaling. One of the best independent analysts, Epoch AI, indicates that in recent years, compute capability has been growing at a rate of around 4x annually, a growth rate that vastly outstrips other technological surges such as cell phone adoption. The natural question to ask here is whether this growth will continue at this torrid pace, and also what the implications are for our power grids.
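To put those growth rates in perspective, here's a minimal back-of-envelope sketch, my own illustration rather than anything from the episode, of how different annual compute multipliers compound over a five-year stretch:

```python
# Back-of-envelope: how training compute compounds at different annual growth rates.
# The ~4x-per-year figure is the trend cited in the episode (via Epoch AI); the
# other rates and the five-year horizon are illustrative assumptions.

def compounded_growth(annual_multiplier: float, years: int) -> float:
    """Total growth factor after compounding for the given number of years."""
    return annual_multiplier ** years

for rate in (2.0, 4.0, 6.0):
    total = compounded_growth(rate, years=5)
    print(f"{rate:.0f}x per year for 5 years -> roughly {total:,.0f}x total compute")
```

At those rates, even a single extra year of compounding changes the totals dramatically, which is why the durability of the trend matters so much for anyone planning grid infrastructure.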

Speaker 1:

Epoch AI looks at this issue by examining four underlying factors. First, power availability, our sweet spot, which we'll talk about a lot more later. Second, global chip manufacturing capacity and availability. Third, what Epoch calls, quote, the latency wall, unquote, the limit resulting from delays in AI training computations. And fourth, the availability of data to train on. Let's have a brief look at items two through four, since we'll deal with power in its own session.

Speaker 1:

First, let's discuss the chips. They are in high demand these days. By the way, these are graphics processing units, GPUs, rather than the typical central processing units, CPUs, that have been used in most data centers in the past. GPUs are used for games because they're powerful. Games require computers to render millions of pixels simultaneously and make thousands of calculations in parallel. GPU power and speed have also helped game makers avoid blurring of frames. GPU processing changed the AI game since, like game makers, AI data centers need machines that can process huge amounts of data in parallel, performing highly complex calculations at rapid speeds and far more efficiently than CPUs ever could.
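To make the CPU-versus-GPU point concrete, here's a minimal sketch that times the same large matrix multiply, the core operation in model training, on both kinds of processor. It assumes a machine with PyTorch installed and a CUDA-capable GPU, and it's an illustration of the parallelism argument, not a benchmark of any particular data center chip.

```python
# Time a large matrix multiply on CPU, then on GPU if one is available.
import time
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# CPU: the work is spread across a handful of general-purpose cores.
t0 = time.perf_counter()
_ = a @ b
print(f"CPU matmul: {time.perf_counter() - t0:.3f} s")

# GPU: the same work is spread across thousands of simpler cores in parallel.
if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()          # make sure the data transfer has finished
    t0 = time.perf_counter()
    _ = a_gpu @ b_gpu
    torch.cuda.synchronize()          # wait for the GPU kernel to complete before timing
    print(f"GPU matmul: {time.perf_counter() - t0:.3f} s")
```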

Speaker 1:

Those chips also keep getting better, but they're in high demand. They're also expensive, both to develop and to buy. Leading chip maker Nvidia's newest Blackwell chip cost the company about $10 billion to develop and create, and buyers are anteing up $30,000 to $40,000 per chip. Yep, that's right, $30,000 to $40,000. And that same Blackwell chip draws between 700 watts and 1.2 kilowatts, depending on the configuration and cooling strategy. Yep, that's right as well. Nvidia currently has about 80% of the GPU market share, followed by AMD, and right now the industry cannot keep up with demand. But Google, Amazon, Meta and Microsoft are all at work developing their own chips, so that supply strain may eventually ease.
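For a rough sense of what those per-chip figures add up to at data center scale, here's a back-of-envelope sketch. The wattage and price ranges are the numbers cited above; the cluster size and the facility overhead multiplier are hypothetical assumptions of mine.

```python
# Back-of-envelope power and capital cost for a hypothetical GPU cluster.
NUM_GPUS = 100_000                 # hypothetical cluster size, not a figure from the episode
WATTS_PER_GPU = (700, 1200)        # per-chip draw range cited for Blackwell
PRICE_PER_GPU = (30_000, 40_000)   # per-chip price range cited, in USD
PUE = 1.3                          # assumed facility overhead (cooling, networking, etc.)

low_mw = NUM_GPUS * WATTS_PER_GPU[0] * PUE / 1e6
high_mw = NUM_GPUS * WATTS_PER_GPU[1] * PUE / 1e6
low_cost_bn = NUM_GPUS * PRICE_PER_GPU[0] / 1e9
high_cost_bn = NUM_GPUS * PRICE_PER_GPU[1] / 1e9

print(f"Facility power draw: roughly {low_mw:.0f} to {high_mw:.0f} MW")
print(f"Chip spend alone:    roughly ${low_cost_bn:.1f} to ${high_cost_bn:.1f} billion")
```

Even at the low end of those assumptions, a single cluster draws on the order of a hundred megawatts or more, which is why utilities are paying such close attention.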

Speaker 1:

Next let's look at this thing called the latency wall. This area is way out of my strike zone, but I'll try my best. It takes a certain amount of time, latency, for an AI model to process each data point, and that latency increases as model size and complexity grow. Models train by separating data into batches that can be addressed in parallel, but the batches can only be so large. At a minimum, each AI training run takes as long as is needed to process a batch, and the more batches you have to process, the longer the run takes. Today's latencies aren't that big and batches can be processed quickly, but at some point in the future, as training runs get longer and models get bigger, this becomes an issue and efficiencies may fall off. So scaling could become an issue limiting future growth rates; there's a toy calculation after this paragraph that sketches the idea. Finally, let's look at the availability and potential limitations of the data itself. AI data centers train on data: everything we've ever posted to LinkedIn, Facebook or Insta, YouTube videos, scientific papers, movies, TV shows, stupid clips on TikTok, all of it.
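Here's that toy calculation, a minimal sketch of the latency-wall idea: a training run can't finish any faster than the number of sequential batches times the time it takes to chew through one batch. Every number below is a made-up assumption for illustration, not a real model configuration.

```python
# Toy latency-wall arithmetic: run time is bounded by batch count x per-batch latency.

def min_training_time_hours(total_tokens: float,
                            tokens_per_batch: float,
                            seconds_per_batch: float) -> float:
    """Lower bound on wall-clock training time, ignoring everything except batch latency."""
    num_batches = total_tokens / tokens_per_batch
    return num_batches * seconds_per_batch / 3600

# A bigger model trains on more tokens and takes longer per batch, while the batch
# size can only be pushed so high, so the floor on run time keeps climbing.
print(f"{min_training_time_hours(1e12, tokens_per_batch=4e6, seconds_per_batch=1.0):.0f} hours")
print(f"{min_training_time_hours(1e13, tokens_per_batch=8e6, seconds_per_batch=2.5):.0f} hours")
```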

Speaker 1:

To understand what's going on here, we have to understand the concept of a token. That's the smallest element into which text data can be broken down in order for AI models to process it. So, for example, Juliet's plaintive query, wherefore art thou, Romeo, would be represented by seven tokens, or at least that's what Perplexity AI told me when I asked it. It broke wherefore into two tokens, but did allow for the possibility that the phrase might be tokenized into six elements instead: four or five for the words wherefore art thou Romeo, as well as the comma and the question mark.
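If you want to see tokenization for yourself, here's a minimal sketch using the open-source tiktoken package, which is my own choice of tool; the episode's count came from asking Perplexity AI, and different tokenizers split the same text differently.

```python
# Count and display the tokens behind a short phrase.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the tokenizer used by several OpenAI models

text = "Wherefore art thou, Romeo?"
token_ids = enc.encode(text)

print(f"{len(token_ids)} tokens")
for tid in token_ids:
    # Decoding one id at a time shows the text fragment each token represents.
    print(tid, repr(enc.decode([tid])))
```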

Speaker 1:

For images, audio clips or videos, computers typically break these into smaller patches for tokenization. One picture or one second of video might represent 30 tokens. Get it? Now, these tokens, which essentially serve as the link between our languages and images and formats accessible to computers, can be processed at lightning speed, but computers can only handle so many at a time, so models need to be continuously optimized. It's estimated that the web holds about 500 trillion words of unique text, which may grow by 50% between now and 2030. Add in images, audio and video and you might triple that. So as much as 20 quadrillion tokens might exist for computer training by the end of the decade. That's a lot of raw material, but projections are that, with ever faster computers and more efficient algorithms, we might actually run out of data to train on as soon as 2026.

Speaker 1:

What then? Well, the current thinking is that the machines will then learn to generate their own synthetic data. How? Well, for example, some machines have learned to play games or solve mathematical problems by training on data they themselves have created. Or the machines could find other ways to learn.

Speaker 1:

This uncertainty leads to a critical question for utilities. What if we build all this infrastructure and then, by 2030, there's less to do with it? The phrase stranded assets should certainly come to mind. Or what if chips become increasingly more efficient, so that they require less electricity, both for processing and for addressing the waste heat they generate? Nvidia claims that its GPUs used in training have seen a 2,000-fold reduction in energy use over the past decade. To date, those gains have simply allowed data centers to do more, and their appetite appears endless. But what if future gains continue? How does that affect future data center power requirements? Nobody really knows. What we do know today is that the power grab continues unabated for now, and data centers are looking at all kinds of supply strategies to get the juice wherever they can, and that's the topic we'll focus on in the next session.
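One way to frame that open question is with a toy calculation: if compute demand keeps multiplying while the energy needed per unit of compute keeps falling, the net effect on power draw depends on which rate wins. Both annual rates below are illustrative assumptions rather than forecasts; the 2.1x-per-year efficiency figure is simply what a 2,000-fold gain over a decade works out to on an annualized basis.

```python
# Toy sketch of the stranded-asset question: does demand growth outrun efficiency gains?
compute_growth_per_year = 4.0   # assumed annual multiplier for compute demand
efficiency_gain_per_year = 2.1  # roughly what a 2,000x gain over ten years implies per year
years = 5

net_power_multiplier = (compute_growth_per_year / efficiency_gain_per_year) ** years
print(f"Net power multiplier over {years} years: roughly {net_power_multiplier:.0f}x")

# If efficiency ever improves faster than demand grows, this multiplier drops below 1,
# and that is where the stranded-asset worry comes in.
```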