Illustration by Sandbox Studio, Chicago with Abigail Malate

Machine learning and theory

Theoretical physicists use machine-learning algorithms to speed up difficult calculations and eliminate untenable theories—but could they transform what it means to make discoveries?

Theoretical physicists employ their imaginations and their deep understanding of mathematics to decipher the underlying laws of the universe that govern particles, forces and everything in between. More and more often, theorists are doing that work with the help of machine learning.

As might be expected, the group of theorists using machine learning includes people classified as “computational” theorists. But it also includes “formal” theorists, the people interested in the self-consistency of theoretical frameworks, like string theory or quantum gravity. And it includes “phenomenologists,” the theorists who sit next to experimentalists, hypothesizing about new particles or interactions that could be tested by experiments; analyzing the data the experiments collect; and using results to construct new models and dream up how to test them experimentally.

In all areas of theory, machine-learning algorithms are speeding up processes, performing previously impossible calculations, and even causing theorists to rethink the way theoretical physics research is done.

“We’re near the very beginning of something that, to me, is an obvious revolution: the use of computers in scientific discovery,” says Jim Halverson, a professor of physics at Northeastern University. “It’s like being within 50 years of Galileo pointing his telescope at the sky for the first time. Much of the current progress utilizes machine learning.”

David Shih, a professor in the Department of Physics & Astronomy at Rutgers University, thinks of machine learning similarly. “Instead of looking into the sky, you’re looking into the data,” he says. “So it’s allowing us to see much further into the data. It’s opening up new frontiers.”

Machine learning for theoretical physics

“In the types of theory calculations I perform, there is no dataset, and machine learning is used to accelerate a first-principles theory calculation in a mathematically exact way,” says Phiala Shanahan, an associate professor of physics at MIT who does research in theoretical nuclear and particle physics.

Shanahan uses lattice field theory to calculate the structure of protons, neutrons and nuclei from our underlying understanding of particle physics. She says she uses machine learning to do calculations “much faster than you can do it any other way—or perhaps that you couldn’t have done any other way—and with guaranteed exactness.”

Lattice field theory calculations are computationally intense and take a very long time on traditional computers, leading to what Shanahan calls the “massive computational program” in her field. Machine-learning algorithms speed up the calculations and make them feasible—though theorists still have to use supercomputers to run them.

Shanahan and collaborators recently demonstrated that machine learning could generate samples from an underlying probability distribution relevant for lattice field theory—without using “training data” at all.
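To make the idea of "exact" sampling from an approximate model concrete, here is a minimal sketch. It is not taken from Shanahan's work. It assumes one common recipe, an independence Metropolis step, in which a model that is easy to sample proposes configurations and an accept/reject test corrects for the model's imperfections. The double-well target, the Gaussian stand-in for a trained model, and all names are hypothetical.

```python
import math
import random

SIGMA = 1.5  # width of the stand-in "model" distribution (an assumption)


def log_target(x):
    """Unnormalized log-density of a toy double-well target (hypothetical)."""
    return -(x * x - 1.0) ** 2


def log_model(x):
    """Log-density of the stand-in model: a Gaussian that is easy to sample."""
    return -0.5 * (x / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def sample_with_correction(n_steps, seed=0):
    """Independence Metropolis: propose from the model, accept or reject.

    Returns the chain of samples and the fraction of proposals accepted.
    """
    rng = random.Random(seed)
    x = rng.gauss(0.0, SIGMA)
    samples, accepted = [], 0
    for _ in range(n_steps):
        proposal = rng.gauss(0.0, SIGMA)
        # Compare target-to-model weights of the proposal and the current state.
        log_ratio = (log_target(proposal) - log_model(proposal)) - (
            log_target(x) - log_model(x)
        )
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


samples, rate = sample_with_correction(20000, seed=1)
print("acceptance rate:", round(rate, 2))
```

In this sketch the quality of the model affects only how often proposals are accepted, not the distribution the chain converges to. No training data appears anywhere, only the target's own formula.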

The hope is that machine-learning algorithms will enable physicists like Shanahan to directly calculate the properties of nuclei too large to be studied with conventional approaches, such as argon or xenon. This work will be helpful for future experiments like the Deep Underground Neutrino Experiment and various dark matter searches, which will use such nuclei as targets in their experimental apparatuses.

As Shanahan shows, physicists don’t necessarily need to have data to use machine-learning algorithms. But if they do have data, especially a lot of it, machine learning is an extremely powerful tool for processing it.

Halverson incorporates machine learning into his string theory work. The equations that undergird string theory have many possible solutions that theorists must sort, “but the number is so astronomical that brute force scans are simply impossible.”

For example, Halverson has worked on a theoretical dataset that has more than 10^755 elements—well beyond the number of particles in the universe. “In this enormous dataset, you might imagine performing some sort of search problem, with well-defined rules of the game,” he says. “You’re looking for something specific, but you also have to satisfy certain constraints.”

In this way, doing theoretical research in string theory can be similar to playing games like chess or Go. So, Halverson and his colleagues use a machine-learning approach called reinforcement learning, which is often used in gameplay settings, to create an algorithm that can explore an astronomically large system and pinpoint data of interest.

Theorists also use machine learning for so-called discovery applications: searching for hidden correlations in structures and for hidden relationships in raw information. For example, theorists working with data from particle colliders deal with complicated mathematical expressions that must be simplified before physicists can calculate how the particles scatter. Machine learning helps accelerate this process by proposing possible solutions. It is much easier for theorists to take the suggestions and test them than it is to come up with the suggestions in the first place.

Shih also uses machine learning to sort through data in his phenomenological research. He works with the European Space Agency’s Gaia telescope, which is cataloging the positions and velocities of all the stars in the Milky Way.

Recently, Shih and his colleagues combined theory with Gaia data to generate a three-dimensional map of the density of dark matter in our galaxy. “That all is enabled by relatively new machine-learning techniques that weren’t available five years ago,” he says. “You couldn’t imagine doing this kind of data analysis until recently.”

Shih used a different machine-learning technique called simulation-based inference in a collaboration with NANOGrav scientists working on pulsar timing arrays. NANOGrav must make calculations by inverting enormous matrices, a process that has traditionally taken about a week. Machine learning can make these calculations using samples of simulated data, a process that takes closer to 24 hours and creates a database that astrophysicists can sample from in a matter of seconds.
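The simplest relative of simulation-based inference can be sketched in a few lines. The example below is not the NANOGrav analysis. It assumes a toy problem (estimating the mean of a Gaussian) and uses plain rejection sampling, whereas modern work typically trains neural networks on the simulations. All names and numbers are hypothetical.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Toy simulator: n noisy measurements scattered around the parameter mu."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def rejection_posterior(observed, n_draws=10000, tolerance=0.1, seed=0):
    """Keep the parameter guesses whose simulated data resemble the observation.

    No likelihood formula is evaluated; the simulator does all the work.
    """
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)
    kept = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a guess from a flat prior
        sim_summary = statistics.fmean(simulate(mu, len(observed), rng))
        if abs(sim_summary - obs_summary) < tolerance:
            kept.append(mu)
    return kept


# Pretend "observation", generated here with a true value of 2.0.
observed = simulate(2.0, 25, random.Random(42))
posterior = rejection_posterior(observed)
print(len(posterior), "accepted draws, mean", round(statistics.fmean(posterior), 2))
```

The accepted guesses approximate the range of parameter values consistent with the observation. The expensive simulation step is done once, after which the stored draws can be queried quickly.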

Some phenomenologists are even using machine learning to redefine the way physicists search for new physics.

Traditionally a theorist will come up with a hypothesis, define what it would look like for experimentalists to find evidence that the hypothesis is correct, then ask experimentalists to look for that evidence. But machine learning allows a theorist to come up with a hypothesis, define what it would look like to deviate from that hypothesis, then use an algorithm to look for evidence that the hypothesis is not correct.

“It gets very controversial because normally what we do in science is hypothesis testing: A versus B,” says Jesse Thaler, a professor of physics at MIT and the inaugural director of the National Science Foundation’s Institute for Artificial Intelligence and Fundamental Interactions, IAIFI. “The idea that now you might say, ‘Let’s look for anomalous features’ without actually specifying what you’re looking for specifically—that’s a different way of doing science.”
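A stripped-down illustration of looking for deviations without naming a signal: the sketch below scans binned counts against a background-only expectation and flags any bin that disagrees strongly. It is a plain statistical stand-in, not a machine-learning method and not any experiment's procedure. The counts are invented toy numbers, and a real search would also have to account for the chance of a fluctuation appearing somewhere when many bins are checked.

```python
import math


def deviation_scan(observed, expected, threshold=3.0):
    """Return (bin_index, z_score) for bins far from the expectation."""
    flagged = []
    for i, (obs, exp) in enumerate(zip(observed, expected)):
        # Gaussian approximation to Poisson counting noise.
        z = (obs - exp) / math.sqrt(exp)
        if abs(z) >= threshold:
            flagged.append((i, z))
    return flagged


# Hypothetical toy numbers: a smoothly falling background and observed counts.
expected = [100.0, 80.0, 64.0, 51.0, 41.0, 33.0]
observed = [104, 75, 66, 83, 39, 35]

for index, z in deviation_scan(observed, expected):
    print(f"bin {index} deviates by {z:.1f} standard deviations")
```

The scan never says what the excess should look like, only what "no new physics" predicts, which is the shift in approach Thaler describes.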

Benefits and challenges

For many theorists, machine learning has already proven to be a promising tool to further their research. “More classical or conventional approaches typically have to bend the data, reduce them down to fewer dimensions or fit them to a very simple model with just a few parameters,” says Shih. “That of course builds in a lot of biases and assumptions or loses information along the way.

“Using these modern machine-learning techniques, you don’t have to do any of that. You could use all the data with the minimum amount of assumptions.”

But, as Thaler mentions, physicists continue to express concerns about using machine learning for theoretical physics. One issue is that some algorithms give predictions without uncertainties. And physicists have worried that machine learning is too much of a “black box”—that it arrives at decisions without showing its work.

That’s why Halverson and others are working to show that machine-learning algorithms can produce understandable results. “Both in string theory and in broader contexts … we are establishing legitimately rigorous results that would pass a mathematician's sniff test,” Halverson says.

This effort is helping physicists establish new standards for machine learning, and not just in physics. “We in particle physics have such a high standard for what it means to discover something or what it means to have a rigorous analysis that we are, in some ways, leading the charge in transforming machine learning,” Thaler says.

“We’re going from off-the-shelf tools that might not incorporate all of our physics best practices to tools that not only incorporate physics best practices but that we can export to other areas.”

The future of machine learning and theory

Machine learning used in both experiment and theory has led to a blurring of the lines between the two traditionally disparate camps. In fact, some posit that a new type of physicist is emerging: the data physicist. Shih coined the term at a Snowmass US high-energy physics community planning meeting in 2022 to describe scientists at the confluence of experiment, theory and data science. While the title of data physicist is not yet commonly used, the demand is growing for physicists who know how to analyze large amounts of data. And machine learning is already deeply ingrained in this type of work.

Shih advocates for recruiting and retaining more young people who know how to work with machine learning in physics. “We lose a lot of people to industry,” he says. “Creating robust pipelines that keep talented people in the field—that requires jobs.

“I think we do okay at the postdoc level, and certainly at the graduate student level, but we need to create more faculty jobs that are in this interdisciplinary machine learning–data science space in physics and astronomy.”

Theorists say they believe these jobs are here to stay. And they are not afraid of machine learning taking their place.

Thaler acknowledges that machine learning may eventually be able to do what theoretical physicists do, but he says that will only happen when physicists understand their own science so deeply as to be able to explain all of it to a computer.

“To actually phrase some of the aspects of the scientific process in rigorous, algorithmic terms such that a computer could do it—that itself is a rich scientific endeavor, and one that has a chance of really accelerating the way that we do scientific discovery,” he says.

Ultimately, theorists see machine learning as a tool, “like a hammer,” says Shih. “You have a general tool, and you can apply it to many different places.”

“It’s just a class of algorithms,” says Shanahan, who serves as research coordinator for theoretical physics for IAIFI. “Just like any algorithm, hopefully the benefit is that machine learning enables you to do something that you wouldn’t have been able to do any other way.”

If used well, machine learning could make physicists’ lives a little easier. It might even give them back time currently spent running calculations and analyzing data.

“We have monstrously huge datasets that could be hiding fascinating phenomena—whether it’s new phenomena that’s beyond the Standard Model or even just phenomena within the Standard Model that we haven’t seen yet,” says Thaler. “If you have to have an entire PhD thesis devoted to studying each little possibility, we just won’t be able to explore the vast space of possibilities fast enough, given the deluge of data that’s coming in.

“Just from the fact that we have limited time, each of us, on this planet, and the limited number of people whose eyeballs are on the data, you want to maximize our ability to find new phenomena,” he says. “Collaborating with computers seems to be one way of doing that.”