The ongoing artificial intelligence (AI) boom is leading to a surge in data centres, which is driving an immense demand for power to run and cool the servers inside them.
While there are over 8,000 data centers worldwide, with the largest share located in the US, this number is expected to increase substantially in the coming years.
As per the Boston Consulting Group’s estimate, demand for data center power will rise 15% to 20% every year through 2030, by which point the firm expects data centers to account for 16% of total U.S. power consumption, up from just 2.5% before OpenAI released ChatGPT in 2022.
Meanwhile, the International Energy Agency’s (IEA) special Energy and AI report, released this year, expects global electricity demand from data centers to at least double by the end of this decade to about 945 TWh, roughly what Japan consumes today.
The Paris-based autonomous intergovernmental organization identifies AI as the biggest driver of this spike, with electricity demand from AI-optimized data centers projected to more than quadruple by 2030.
In the US specifically, data centers are on track to account for about half of the growth in electricity demand between now and 2030. Driven by AI usage, the report projects, the US economy will by then consume more electricity for processing data than for manufacturing all energy-intensive goods combined.
This insatiable hunger for energy poses a major obstacle to AI advancement and adoption. The silver lining, however, is the growing number of researchers and companies working to cut AI’s power usage and make the technology more energy efficient.

What’s interesting about many of these efforts is that they use AI itself to address AI’s energy challenges.
Just this month, a team of researchers demonstrated a new chip that uses AI to halve the energy needed to move large language model (LLM) data through data centers, a notable step toward making LLMs more cost-effective and sustainable to run.
New Chip Leverages AI to Reduce LLMs’ Energy Consumption
Researchers from the Oregon State University (OSU) College of Engineering developed the new energy-efficient AI chip to tackle the massive electricity consumption of LLM applications like OpenAI’s GPT-4 and Google’s Gemini.
A large language model is a type of machine learning (ML) model pre-trained on vast amounts of data to perform natural language processing (NLP) tasks like text generation, summarization, simplification, reasoning, language translation, and more.
The most popular and widely used models today include OpenAI’s GPT-4o, o3, and o1, Gemini and Gemma from Google, Llama from Meta, R1 and V3 from DeepSeek, Claude from Anthropic, Nova from Amazon, Phi from Microsoft, and Grok from xAI.
Over the last few years, LLMs have completely transformed the field of AI by enabling machines to understand and generate human-like text with greater accuracy. However, this evolution of LLMs has resulted in an exponential increase in their size.
An LLM’s size, measured by its number of parameters, is the main driver of its energy consumption: the larger the model, the more computational power it needs for training and inference.

For instance, GPT-1 had just under 120 million parameters, which surged to 175 billion with GPT-3 and then to a reported 1.8 trillion with GPT-4.
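For a rough sense of why parameter count dominates compute cost, a common back-of-the-envelope approximation is that a dense, decoder-style model performs on the order of 2 floating-point operations per parameter per generated token. The short Python sketch below applies only that approximation to the figures above; it deliberately ignores hardware, batching, and memory traffic, which also shape real-world energy use.

```python
# Back-of-the-envelope only: a dense decoder-style transformer needs roughly
# ~2 FLOPs per parameter per generated token. Real energy use also depends on
# hardware, batching, memory traffic, and serving efficiency.

FLOPS_PER_PARAM_PER_TOKEN = 2

models = {
    "GPT-1 (~117M parameters)": 117e6,
    "GPT-3 (175B parameters)": 175e9,
    "GPT-4 (reported ~1.8T parameters)": 1.8e12,
}

tokens_generated = 1_000  # roughly one medium-length response

for name, params in models.items():
    flops = FLOPS_PER_PARAM_PER_TOKEN * params * tokens_generated
    print(f"{name}: ~{flops:.2e} FLOPs for {tokens_generated} tokens")
```

Run as written, the estimate for a GPT-4-scale model comes out around four orders of magnitude higher than for GPT-1, which is the whole story behind the rising power bills.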
This immense surge in size and capability means LLMs’ energy consumption is also rising at an unprecedented scale. Besides model size, factors such as the training hardware, the duration of training, the data center infrastructure, data processing, model optimization, and algorithmic efficiency all influence how much energy an LLM consumes.
Hence the new chip from the OSU researchers. According to Tejasvi Anand, an associate professor of electrical engineering at OSU who also directs the university’s Mixed Signal Circuits and Systems Lab:
“The problem is that the energy required to transmit a single bit is not being reduced at the same rate as the data rate demand is increasing. That’s what is causing data centers to use so much power.”
To overcome this problem, the team designed and developed a new chip that consumes only half the energy compared to conventional designs.
Anand and doctoral student Ramin Javadi presented the new technology at the IEEE Custom Integrated Circuits Conference (CICC), held in Boston last month. The conference, which hosts forums, panels, exhibits, and oral presentations, is devoted to the development of integrated circuits (ICs), the building blocks of modern electronic systems that provide functionality and processing power in a compact and efficient package.

The technology was built with the support of the Center for Ubiquitous Connectivity (CUbiC), the Semiconductor Research Corporation (SRC), and the Defense Advanced Research Projects Agency (DARPA). It also earned Javadi the Best Student Paper Award at the conference.
For the new chip, the researchers leveraged AI principles that, Javadi noted, reduce the electricity used for signal processing.
As he explained, LLMs send and receive a lot of data over wireline connections, which are copper-based communication links in data centers. This entire process requires significant energy, so one potential “solution is to develop more efficient wireline communication chips.”
Javadi further noted that data sent at high speeds gets corrupted by the time it reaches the receiver and needs to be cleaned up. For this, most existing wireline communication systems rely on an equalizer, which consumes a lot of power.
“We are using those AI principles on-chip to recover the data in a smarter and more efficient way by training the on-chip classifier to recognize and correct the errors.”
– Javadi
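To make the idea concrete without claiming to reproduce the team’s circuit, the toy Python sketch below simulates a PAM-4 wireline link whose symbols are distorted by inter-symbol interference and noise, then recovers them with a small classifier trained on a labelled preamble instead of a conventional equalizer. The channel values and the scikit-learn model are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Toy PAM-4 wireline link; every channel value here is made up for illustration.
levels = np.array([-3.0, -1.0, 1.0, 3.0])      # ideal PAM-4 amplitudes
symbols = rng.integers(0, 4, size=20_000)       # transmitted symbol indices
tx = levels[symbols]

# Channel: inter-symbol interference from the previous symbol, plus noise.
rx = tx + 0.45 * np.roll(tx, 1) + rng.normal(0.0, 0.3, size=tx.shape)

# Receiver features: the current sample and the previous one (a tiny context window).
X = np.stack([rx, np.roll(rx, 1)], axis=1)

# Baseline with no equalization: slice each sample against the ideal levels.
naive = np.abs(rx[:, None] - levels[None, :]).argmin(axis=1)

# A tiny neural-network classifier trained on a short labelled preamble, standing
# in (very loosely) for an on-chip classifier that learns to correct the distortion.
train = 2_000
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X[:train], symbols[:train])
decisions = clf.predict(X[train:])

truth = symbols[train:]
print("naive slicer symbol error rate:      ", np.mean(naive[train:] != truth))
print("learned classifier symbol error rate:", np.mean(decisions != truth))
```

In this toy setup the learned classifier cuts the symbol error rate well below the naive slicer’s; whether such a scheme actually saves power depends entirely on the circuit implementation, which is exactly what the OSU work addresses.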
While a big development, this is just the initial version of the chip. Its next iteration is currently in the works to further enhance its energy efficiency.
Overall, this research could have far-reaching implications for the future of AI infrastructure and data center operations, though that would require implementing the technology successfully at scale, which is never an easy task.
Click here to learn how AI is upending microchip engineering.
Taming AI’s Energy Appetite with Breakthroughs Across Layers
This latest chip development is just one of many research projects tackling AI’s energy consumption problem. So, let’s take a brief look at the innovative ways researchers have addressed it.
Using Light for AI Energy Efficiency
Earlier this year, scientists at the University of Shanghai for Science and Technology (USST) developed1 a microscopic AI chip, smaller than a speck of dust or a grain of salt, that uses light to process data from fiber-optic cables, promising faster computations with less energy consumption.
The chip manipulates light to perform calculations instantly rather than interpreting light signals the way traditional computers do. For this, it uses an “all-optical diffractive deep neural network,” built from patterned, 3D-printed layers stacked together. While groundbreaking, challenges such as task-specific design, sensitivity to imperfections, and the difficulty of large-scale production must be overcome before it can deliver “unprecedented functionalities” in endoscopic imaging, quantum computing, and data centers.
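As a very loose software analogy (the physical device computes with light propagation, not matrices), each printed layer can be modelled as a phase pattern applied to the optical field, followed by diffraction toward the next layer, with a detector reading out intensity at the end. The Python sketch below shows only that structure; the propagation matrix and phase patterns are random stand-ins, not a trained design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Purely illustrative model of a diffractive optical network's forward pass.
n_pixels = 64     # size of the discretized optical field
n_layers = 3      # number of stacked, 3D-printed diffractive layers

# A fixed random complex matrix stands in for free-space diffraction between layers.
propagation = rng.normal(size=(n_pixels, n_pixels)) + 1j * rng.normal(size=(n_pixels, n_pixels))
# Each layer imprints a phase pattern on the field (random here, i.e., untrained).
phase_masks = [rng.uniform(0, 2 * np.pi, size=n_pixels) for _ in range(n_layers)]

field = rng.normal(size=n_pixels) + 0j   # incoming light encoding the input data

for phases in phase_masks:
    field = propagation @ (np.exp(1j * phases) * field)   # one passive optical "layer"

intensity = np.abs(field) ** 2           # a detector only ever sees intensity
print("brightest output region (would-be class label):", intensity.argmax())
```

The appeal is that, in the physical device, every one of these multiply-accumulate steps happens passively as light crosses the printed layers, which is where the speed and energy savings come from.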
A few months before that, MIT scientists also used light to perform the key operations of a neural network on a chip, enabling ultrafast AI computations (in half a nanosecond) with 92% accuracy and massive energy efficiency.
“This work demonstrates that computing — at its essence, the mapping of inputs to outputs — can be compiled onto new architectures of linear and nonlinear physics that enable a fundamentally different scaling law of computation versus effort needed.”
– Senior author Dirk Englund
The photonic chip2 the scientists developed is made of interconnected modules that form an optical neural network. Notably, because it was fabricated using commercial foundry processes, it can be scaled up and integrated with electronics. The team also overcame the challenge of nonlinearity in optics by designing nonlinear optical function units (NOFUs) that combine electronics and optics.
Click here to learn about the new brain-inspired AI that learns in real time using ultra-low power.
A Software Tool for AI Training & a Cooling System for Data Centers
Researchers at the University of Michigan, meanwhile, targeted the energy wasted during AI training, specifically when the workload is divided unequally between the GPUs needed to process huge datasets.
So, they developed a software tool called Perseus that identifies the subtasks that will take the longest to complete and then slows the processors not on this ‘critical path’ so that all of them finish their work at roughly the same time, eliminating unnecessary power use.

This open-source tool is available as part of Zeus, a toolkit for measuring and optimizing AI energy consumption.
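Conceptually, the approach resembles the hypothetical sketch below: find the slowest (critical-path) subtask in a training step, then slow every other GPU just enough that it finishes at the same time, since racing ahead and then idling still burns power. The durations and the simplified quadratic energy model are invented for illustration; this is not Perseus’s actual algorithm or data.

```python
# Hypothetical illustration of critical-path-aware speed scaling, in the spirit
# of tools like Perseus (not its actual algorithm, scheduler, or numbers).

# Per-GPU subtask durations (seconds) at full speed for one training step.
durations = {"gpu0": 1.00, "gpu1": 0.70, "gpu2": 0.55, "gpu3": 0.90}

critical = max(durations.values())   # the step cannot finish before the slowest subtask

plan = {}
for gpu, t in durations.items():
    # Run each non-critical subtask just fast enough to finish with the critical one.
    plan[gpu] = round(t / critical, 2)   # fraction of full speed (1.0 = no slowdown)

print("critical-path time:", critical, "s")
print("speed plan:", plan)

# Toy energy model: assume dynamic energy scales roughly with speed^2 for the time
# spent running (a simplification; real GPU power curves are more complicated).
for gpu, t in durations.items():
    full_speed = 1.0**2 * t                  # run fast, then idle until the step ends
    scaled = plan[gpu]**2 * critical         # run slower for the whole step
    print(f"{gpu}: full-speed {full_speed:.2f} -> scaled {scaled:.2f} (toy energy units)")
```

The step time is unchanged, the non-critical GPUs still finish on schedule, and the toy energy figures drop for every slowed device, which is the intuition behind Perseus’s reported savings.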
Meanwhile, researchers from the University of Missouri are devising a next-generation cooling system to help data centers become more energy efficient. They are also fabricating a cooling system designed for easy connection and disconnection within server racks.
“Cooling and chip manufacturing go hand-in-hand. Without proper cooling, components overheat and fail. Energy-efficient data centers will be key to the future of AI computing.”
– Chanwoo Park, a professor of mechanical and aerospace engineering in the Mizzou College of Engineering
With the support of $1.5 million in funding from the US Department of Energy’s COOLERCHIPS initiative, the team developed a two-phase cooling system that dissipates heat from server chips through phase change. When less cooling is needed, it can run passively without using any energy, and even in active mode it consumes very little.
CRAM Hardware Could Cut AI Energy Use by 1000x
Last summer, engineers at the University of Minnesota Twin Cities developed an advanced hardware device3 that could reduce AI’s energy consumption by about 1,000 times.
The new architecture is called computational random-access memory (CRAM). Here, data never leaves the memory; it is processed entirely within the memory array, eliminating the need for energy-intensive and slow data transfers.

Two decades in the making, the study builds on senior author Jian-Ping Wang’s patented research into magnetic tunnel junction (MTJ) devices, nanostructured devices used to improve sensors, hard drives, and other microelectronic systems such as magnetic random-access memory (MRAM).
“As an extremely energy-efficient digital based in-memory computing substrate, CRAM is very flexible in that computation can be performed in any location in the memory array,” noted co-author Ulya Karpuzcu, Associate Professor in the Department of Electrical and Computer Engineering. Also, it can be reconfigured to best suit the performance needs of different algorithms.
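The general in-memory-computing idea (setting aside the MTJ device physics) can be illustrated with the toy sketch below, which simply counts how many times operands cross the memory-processor boundary in a conventional read-compute-write path versus a compute-in-place path. The arrays and transfer counts are conceptual assumptions for illustration, not a model of CRAM’s circuits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy memory array: each row holds a 64-bit operand "stored in memory".
memory = rng.integers(0, 2, size=(4, 64), dtype=np.uint8)
transfers = {"conventional": 0, "in_memory": 0}

# Conventional path: move operands to a separate processor, compute, write back.
a = memory[0].copy(); transfers["conventional"] += 1   # read row 0 across the bus
b = memory[1].copy(); transfers["conventional"] += 1   # read row 1 across the bus
result = a & b                                          # compute in the processor
memory[2] = result; transfers["conventional"] += 1      # write the result back

# CRAM-style path (conceptually): the logic is performed inside the array itself,
# so no operand ever crosses the memory/processor boundary.
memory[3] = memory[0] & memory[1]   # stands in for an in-array logic operation

print(transfers)   # {'conventional': 3, 'in_memory': 0}
```

In real hardware, moving data typically costs far more energy than the logic operation itself, which is why eliminating those crossings can yield such large headline savings.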
Brain-Inspired AI: Reducing Power Use by Mimicking Human Efficiency
So, as we saw, researchers are looking at different aspects of AI to tackle its energy issues. Interestingly, they are also turning to the human brain for inspiration. This makes sense; after all, AI is the simulation of human intelligence processes by machines, even though it is nowhere near human thinking and reasoning, with its ability to generalize across variations remaining “significantly weaker than human cognition.”
The brain-inspired energy-reduction research includes the work of Associate Professor Chang Xu of the University of Sydney’s Sydney AI Centre, who noted that having LLMs use their resources at full capacity, even for simple tasks, is not the right way to do things. He explained:
“When you think about a healthy human brain – it doesn’t fire all neurons or use all of its brain power at once. It operates with incredible energy efficiency, just 20 Watts of power despite having around 100 billion neurons, which it selectively uses from different hemispheres of the brain to perform different tasks or thinking.”
As such, his team is developing algorithms that skip redundant computations and don’t automatically shift the model into high gear.
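One generic way to express “don’t shift into high gear automatically” in software is conditional computation, for example early-exit inference, where an input leaves the network at a shallow layer as soon as an exit head is confident enough. The sketch below, with random untrained weights and a made-up confidence threshold, shows only that control flow; it is not the Sydney team’s method.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# A generic early-exit stack: each layer is followed by a small "exit head".
# Weights are random, so the predictions are meaningless; the point is that
# confident (easy) inputs stop early and skip the remaining layers' compute.
n_layers, width, n_classes = 4, 32, 10
layers = [rng.normal(scale=0.5, size=(width, width)) for _ in range(n_layers)]
exit_heads = [rng.normal(scale=0.5, size=(n_classes, width)) for _ in range(n_layers)]

def early_exit_forward(x, confidence_threshold=0.7):
    """Return (prediction, layers_used), stopping once an exit head is confident."""
    h = x
    for i, (layer, head) in enumerate(zip(layers, exit_heads), start=1):
        h = np.tanh(layer @ h)
        probs = softmax(head @ h)
        if probs.max() >= confidence_threshold:   # easy input: exit early, save compute
            return probs.argmax(), i
    return probs.argmax(), n_layers               # hard input: use the full network

for x in rng.normal(size=(5, width)):
    pred, used = early_exit_forward(x)
    print(f"predicted class {pred} using {used}/{n_layers} layers")
```

The savings come from the layers that never run; in a trained model the exit heads would be calibrated so that accuracy is preserved while most inputs take the short path.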
Other research has drawn on the brain’s neuromodulation to create a ‘stashing system’ algorithm that reduces energy use by 37% without any accuracy degradation, mimicked the self-repairing function of brain cells called astrocytes in hardware devices, and applied a neuromorphic (brain-inspired) form of computing based on memristors to run several sub-groups of neural networks together.
Investing in Artificial Intelligence
Advanced Micro Devices (AMD +4.68%)

A global semiconductor company, AMD is known for its high-performance computing, graphics, and visualization technologies. While in direct competition with AI darling NVIDIA (NVDA +4.16%), it is rapidly gaining ground in the data center and AI accelerator markets. Its Instinct MI300 Series specifically targets generative AI workloads and high-performance computing (HPC) applications.

AMD’s leading presence in the data center CPU space, strong R&D focus, revenue growth, clientele, and acquisitions make it a strong player in the sector.
Back in 2022, AMD completed a record chip-industry deal valued at about $50 billion with its acquisition of Xilinx, positioning itself as a leader in high-performance and adaptive computing. Most recently, it completed the acquisition of ZT Systems to pursue what it sees as a $500 billion data center AI accelerator opportunity in 2028.
AMD’s market performance is also recovering this year after being hit by tariff turbulence. As of writing, AMD shares are trading at $120, down 6.9% YTD and still about 47% below their March 2024 peak. That gives the company a market cap of $182.34 billion, with an EPS (TTM) of $1.36 and a P/E (TTM) of 82.44.
As for the company’s financials, AMD reported a 36% YoY increase in revenue to $7.4 billion for Q1 of 2025, which CEO Dr. Lisa Su called “an outstanding start” to the year, “despite the dynamic macro and regulatory environment.” This growth was driven by “expanding data center and AI momentum,” she added.
During this period, AMD’s operating income came in at $806 million, net income was $709 million, and diluted earnings per share was $0.44. For Q2 of 2025, it is projecting approximately $7.4 billion in revenue.
Some key developments made by the company include the expansion of strategic partnerships with Meta Platforms, Inc. (META +0.51%) (Llama), Alphabet Inc. (GOOGL +3.66%) (Gemma), Oracle Corporation (ORCL +0.42%), Core42, Dell Technologies (DELL +2.94%), and others. AMD, along with Nokia, Cisco Systems, Inc. (CSCO -0.79%), and Jio, also announced a new Open Telecom AI Platform to offer AI-driven solutions to enhance efficiency, security, and capabilities.
This week, AMD and Nvidia partnered with Humain, an AI-focused subsidiary of Saudi Arabia’s Public Investment Fund, to supply semiconductors for a large-scale data center project expected to reach a capacity of 500 MW.
Click here for a list of top non-silicon computing companies.
Conclusion
Over the last few years, AI mania has seen explosive growth, and for good reasons. This technology, after all, has great potential to transform a wide range of industries from healthcare, manufacturing, and materials science to finance, entertainment, education, retail, and cybersecurity.
However, technological advancement, growing adoption, and the resulting expansion of LLMs have created a substantial demand for energy, which contributes to greenhouse gas (GHG) emissions and climate change, raises economic costs, and threatens the technology’s sustainability.

This presents a major challenge for AI. If we want to fully realize its potential in terms of reduced costs, increased productivity, and improved decision-making at scale, the models must become cost-effective and sustainable.
The good news, however, is that researchers around the world are already hard at work making AI energy efficient, as evidenced by Oregon State’s AI-powered chip, which suggests a strong possibility of aligning innovation with sustainability.

Of course, these technologies still have to overcome the biggest obstacle to real-world impact: scalability. Still, one thing is clear: a greener AI future is feasible, and it’s coming!
Click here to learn all about investing in artificial intelligence.
Studies Referenced:
1. Yu, H., Huang, Z., Lamon, S., Wu, J., Zhao, Z., Lin, D., Zhao, H., Zhang, J., Lu, C., Liu, H., Zhang, X., & Zhang, C. (2025). All-optical image transportation through a multimode fibre using a miniaturized diffractive neural network on the distal facet. Nature Photonics, 19, 486–493. https://doi.org/10.1038/s41566-025-01621-4
2. Bandyopadhyay, S., Sludds, A., Krastanov, S., Youssry, A., Zhang, L., Lian, C., Yu, S., Desiatov, B., Burek, M. J., Lukin, M. D., & Lončar, M. (2024). Single-chip photonic deep neural network with forward-only training. Nature Photonics, 18, 1335–1343. https://doi.org/10.1038/s41566-024-01567-z
3. Lv, Y., Zink, B. R., Bloom, R. P., Roy, A., Vaddi, K., Shang, L., & Manipatruni, S. (2024). Experimental demonstration of magnetic tunnel junction-based computational random-access memory. npj Unconventional Computing, 1, 3. https://doi.org/10.1038/s44335-024-00003-3