Technology
A Greener Solution for Artificial Intelligence
A new large language model developed at UCSC removes math from the equation, giving artificial intelligence a more sustainable, greener future.
The energy required to operate the numerous, readily available AI platforms is substantial. Credit: NurPhoto via Getty Images
Type “2 plus 2” into either Google’s search bar or ChatGPT’s black box and you will receive what you expect: four.
ChatGPT, however, used around 10 times more energy than the Google search to give you the same answer. Unlike Google, ChatGPT is not a search engine — it’s a type of artificial intelligence called a large language model, or LLM, that operates using a complex neural network to formulate what it thinks is the most suitable response to a query.
The computing power required to operate LLMs is enormous, and continues to grow as companies like Meta, Google, OpenAI, and Microsoft race to produce more powerful versions that consume increasingly unprecedented amounts of energy. “It’s ridiculously unsustainable,” said Jason Eshraghian, assistant professor of electrical and computer engineering at the University of California, Santa Cruz.
Despite the unreasonable energy requirements, the popularity of LLMs is only increasing. As a result, researchers have been working to develop more sustainable methods of computing — called green computing.
At UCSC, Eshraghian and his colleagues have designed an LLM, called a MatMul-free LM, or matrix-multiplication-free language model, that performs just as well as others on the market with only a fraction of the energy requirement. The key, they found, was to mimic something the human brain does all too well — forget. The researchers published their model on arXiv in June 2024.
The power struggle
Search engines like Google operate by matching search terms to text found across the internet to show us relevant websites. In contrast, LLMs, such as ChatGPT, use a type of artificial intelligence called deep learning, which operates by processing large amounts of data to recognize relationships and patterns, then learns from them.
Instead of just matching terms as a search engine does, LLMs analyze each part of the search request to discern what the user is really asking them to do, then compute the most logical response based on the data they have. “Artificial intelligence is essentially glorified math,” Eshraghian said.
All of the complex computing required for a single ChatGPT output is estimated to take an average of 3 watt-hours of energy, which is about enough to power a digital alarm clock for three hours. That’s 10 times the energy required for a single Google search.
By 2030, power demand by the data centers where LLM queries are processed is expected to rise by 160 percent, according to a Goldman Sachs Research report. At that point, those data centers will be responsible for 3-4 percent of total power consumption worldwide.
Public enthusiasm for LLMs has cemented their place in society — a place that energy consumption concerns are unlikely to shake. This means making traditional computing use less energy is a really big deal, said David Lederman, professor of physics and Director of the Materials Science and Engineering Program at UCSC.
Learning — and forgetting — are key
The technology behind deep learning, called a neural network, is designed to operate similarly to the neural networks in our brains — it receives, stores, and builds upon information that it can refer to at any time for improved decision making.
Traditional LLMs transform words into a matrix by asking: “How tightly are those earlier words related to the present word?” Eshraghian explained. This process iterates over and over, through each word in a sentence, gobbling up energy as it processes loads of equations.
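That word-to-word comparison is, at heart, a matrix multiplication. The toy sketch below, with made-up numbers and tiny 3-number word vectors (real models use thousands of dimensions), shows how a single matrix product scores every word against every other word — and why the work balloons as sentences grow:

```python
import numpy as np

# A toy "sentence" of 4 words, each represented as a 3-number vector.
# These values are invented purely for illustration.
words = np.array([
    [0.2, 0.1, 0.9],
    [0.8, 0.3, 0.1],
    [0.1, 0.7, 0.4],
    [0.5, 0.5, 0.2],
])

# One matrix multiplication compares every word with every other word,
# producing a 4x4 table of "how related is word i to word j" scores.
relatedness = words @ words.T

print(relatedness.shape)  # (4, 4): 16 pairwise scores for just 4 words
```

For a 4-word sentence that is 16 multiply-heavy comparisons; for a 1,000-word prompt it is a million, which is where the energy goes.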
The human brain, however, can add two and two, as well as instruct our bodies to live, breathe, talk, walk, and think, all on roughly 0.8 watt-hours of energy, about the same amount needed to stream Netflix for an hour.
The superior energy efficiency of our brains inspired Eshraghian and his colleagues to build a more environmentally sustainable LLM that more closely mimics the way the brain works. The solution, he said, is that the human brain temporarily forgets information that is not useful.
With this in mind, the researchers designed their LLM without the traditional matrix multiplication element, and instead programmed the model to weigh new and historical information by its importance then purge the unimportant data. Additionally, the model doesn’t reprocess earlier words, which cuts down on computations and the amount of data that the LLM needs to retain, Eshraghian said. “As the sequence goes forward and the model processes more words, it’s moving forward in time instead of reprocessing old data,” he said.
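The idea of a running memory that keeps what matters and purges the rest can be sketched in a few lines. This is an illustrative toy, not the published model’s actual gating math: the `importance` values and inputs are invented, and the real model learns its own weighting.

```python
import numpy as np

def forgetful_update(state, new_info, importance):
    """Blend the running memory with new input: an importance near 1
    keeps the new information; near 0, the model 'forgets' it.
    (Illustrative only — the MatMul-free LM's actual gating differs.)"""
    return importance * new_info + (1.0 - importance) * state

state = np.zeros(3)
inputs = [np.array([1.0, 0.0, 0.0]),   # an important word
          np.array([0.0, 5.0, 0.0]),   # noise the model mostly forgets
          np.array([0.0, 0.0, 1.0])]   # another important word
importances = [0.9, 0.1, 0.9]

# Each step updates the running state once and moves on. Earlier words
# are never reprocessed — unlike attention, which revisits the whole
# history at every step.
for x, g in zip(inputs, importances):
    state = forgetful_update(state, x, g)

print(state)
```

Note how the “noisy” middle input barely survives in the final state, while the flagged-important words dominate: the model carries forward a compact summary instead of the full history.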
Not only do human brains forget, but they can also make sense of noisy data, Eshraghian said. “For example, if you look at an image, you can change one pixel but you will still be able to recognize that it’s the same image.”
The team mimicked the brain’s ability to fill in the gaps by radically simplifying the numbers stored in the model. Whereas most computers store each number as a series of 64 or 32 zeros and ones, called 64-bit or 32-bit values, this model stores each value as only -1, 0, or 1, called a ternary value. This forgoes precision but simplifies chip design and speeds up processing.
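Ternary values are what let the model drop multiplication altogether: “multiplying” by -1, 0, or 1 just means negating, dropping, or keeping a number. A small sketch with invented numbers (not weights from the actual model) shows the equivalence:

```python
import numpy as np

# A hypothetical row of ternary weights: every entry is -1, 0, or 1,
# so "multiplying" an input by a weight means keeping it, dropping it,
# or flipping its sign. These values are made up for illustration.
weights = np.array([1, 0, -1, 1])
inputs = np.array([2.0, 7.0, 3.0, 5.0])

# The standard dot product, which needs one multiplication per entry:
via_multiply = float(np.dot(inputs, weights))

# The same result using only additions and subtractions:
via_add_sub = inputs[weights == 1].sum() - inputs[weights == -1].sum()

print(via_multiply, via_add_sub)  # both 4.0
```

Additions and sign flips are far cheaper for hardware than multiplications, which is why this swap translates directly into energy savings on a chip.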
By removing the energy-intensive multiplication step, and allowing the model to “forget” past actions — just like our brains can do — the sustainable model can run on only 13 watts. That’s about as much power as a single LED light bulb needs. It also performs just as well as Meta’s LLM, called Llama — a “kind of mind blowing” result, Eshraghian said.
Hardware to match
With the new model’s software design complete, Eshraghian needed equally novel and efficient hardware that could run the program using its stripped-down ternary arithmetic.
Eshraghian reached out to his UCSC colleague Dustin Richmond, an assistant professor of computer science and engineering, who researches hardware design. The team quickly got to work, meeting every 48 hours to tweak the software, then hardware, going back and forth to find the most compatible design. After just three weeks of effort, the team built a proof-of-concept circuit made of logic gates that flip open or closed to represent zeros or ones.
Instead of instructing the hardware to open and close gates in a certain order, the goal was to create a circuit that could reach a desired outcome on its own, Richmond explained. For example, imagine a maze printed on a piece of paper. You could dictate the exact movements the pencil should make — left here, straight, now right — to get from start to finish. Or, you could simply tell the person holding the pencil to trace a line between the opening and exit of the maze.
The researchers coded broad instructions into the circuit, telling it: “This is the behavior that we want, now go and find an efficient way to implement it,” Richmond said.
The team published the model and its hardware design online as a call-to-action for “better resourced companies to invest in a fundamentally new way of doing deep learning for far cheaper and far more sustainably,” Eshraghian said.
Doing more with less
Since publishing the model and chip design last summer, several tech companies have reached out to the UCSC team asking to collaborate on adapting the model to proprietary hardware. Eshraghian, who is from Perth — the most isolated city in Australia — said he could never have imagined that his research would make him so popular with U.S.-based tech companies. “It’s been cool,” he said.
In addition to more sustainable computing, Richmond foresees another use for a lower energy model: more computations. “I wish companies wanted to make things more efficient simply because they want to save the world,” Richmond said, but the true selling point is the ability to do more with less.
Until researchers can demonstrate a competitive business advantage of sustainable computing, industry will remain reluctant to replace existing models. Eshraghian and his colleagues remain undeterred.
“The one thing I’ve been sure to anchor my lab on is doing research that is useful,” Eshraghian said. “That’s why we want to emulate the brain. It helps us, and maybe it can help us achieve things more sustainably and efficiently in the form of AI.”