A joint Fermilab/SLAC publication
An illustration of a man riding a mouse pointer through a galaxy of data
Illustration by Sandbox Studio, Chicago with Corinne Mucha

Studying the stars with machine learning

10/16/18

To keep up with an impending astronomical increase in data about our universe, astrophysicists turn to machine learning.

Kevin Schawinski had a problem.

In 2007 he was an astrophysicist at Oxford University and hard at work reviewing seven years’ worth of photographs from the Sloan Digital Sky Survey—images of more than 900,000 galaxies. He spent his days looking at image after image, noting whether a galaxy looked spiral or elliptical, or logging which way it seemed to be spinning. 

Technological advancements had sped up scientists’ ability to collect information, but they were still processing it at the same rate as before. After working on the task full time and barely making a dent, Schawinski and colleague Chris Lintott decided there had to be a better way.

There was: a citizen science project called Galaxy Zoo. Schawinski and Lintott recruited volunteers from the public to help out by classifying images online. Showing the same images to multiple volunteers allowed them to check one another’s work. More than 100,000 people chipped in and condensed a task that would have taken years into just under six months.  
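The idea of volunteers checking one another's work can be illustrated with a simple majority vote. The sketch below is only an assumption about how such cross-checking might work, not a description of Galaxy Zoo's actual method; the function name `consensus`, the 60 percent agreement threshold and the sample votes are all hypothetical.

```python
from collections import Counter


def consensus(votes, min_agreement=0.6):
    """Combine several volunteers' labels for one image by majority vote.

    Returns (label, agreement). The label is None when the most popular
    answer falls below the (hypothetical) agreement threshold, which would
    flag the image for more volunteers or an expert look.
    """
    if not votes:
        raise ValueError("need at least one vote")
    label, top_count = Counter(votes).most_common(1)[0]
    agreement = top_count / len(votes)
    return (label if agreement >= min_agreement else None), agreement


# Made-up classifications for two images
clear_case = consensus(["spiral", "spiral", "elliptical", "spiral", "spiral"])
split_case = consensus(["spiral", "elliptical", "elliptical", "spiral"])
print(clear_case)
print(split_case)
```

In this toy version, an image on which most volunteers agree receives a label, while an evenly split image is left unlabeled rather than guessed at.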

Citizen scientists continue to contribute to image-classification tasks. But technology also continues to advance. 

The Dark Energy Spectroscopic Instrument, scheduled to begin in 2019, will measure the velocities of about 30 million galaxies and quasars over five years. The Large Synoptic Survey Telescope, scheduled to begin in the early 2020s, will collect more than 30 terabytes of data each night—for a decade. 

“The volume of datasets [from those surveys] will be at least an order of magnitude larger,” says Camille Avestruz, a postdoctoral researcher at the University of Chicago. 

To keep up, astrophysicists like Schawinski and Avestruz have recruited a new class of non-scientist scientists: machines. 

Researchers are using artificial intelligence to help with a variety of tasks in astronomy and cosmology, from image analysis to telescope scheduling.

Superhuman scheduling, computerized calibration

Artificial intelligence is an umbrella term for ways in which computers can seem to reason, make decisions, learn, and perform other tasks that we associate with human intelligence. Machine learning is a subfield of artificial intelligence that uses statistical techniques and pattern recognition to train computers to make decisions, rather than programming more direct algorithms.

In 2017, a research group from Stanford University used machine learning to study images of strong gravitational lensing, a phenomenon in which an accumulation of matter in space is dense enough that it bends light waves as they travel around it. 

Because many gravitational lenses can’t be accounted for by luminous matter alone, a better understanding of gravitational lenses can help astronomers gain insight into dark matter.

In the past, scientists have conducted this research by comparing actual images of gravitational lenses with large numbers of computer simulations of mathematical lensing models, a process that can take weeks or even months for a single image. The Stanford team showed that machine learning algorithms can speed up this process by a factor of millions.

Schawinski, who is now an astrophysicist at ETH Zurich, uses machine learning in his current work. His group has used tools called generative adversarial networks, or GANs, to recover clean versions of images that have been degraded by random noise. They recently published a paper about using AI to generate and test new hypotheses in astrophysics and other areas of research.

Another application of machine learning in astrophysics involves solving logistical challenges such as scheduling. There are only so many hours in a night that a given high-powered telescope can be used, and it can only point in one direction at a time. “It costs millions of dollars to use a telescope for on the order of weeks,” says Brian Nord, a physicist at the University of Chicago and part of Fermilab’s Machine Intelligence Group, which is tasked with helping researchers in all areas of high-energy physics deploy AI in their work.

Machine learning can help observatories schedule telescopes so they can collect data as efficiently as possible. Both Schawinski’s lab and Fermilab are using a technique called reinforcement learning to train algorithms to solve problems like this one. In reinforcement learning, an algorithm isn’t trained on “right” and “wrong” answers but through differing rewards that depend on its outputs. The algorithms must strike a balance between the safe, predictable payoffs of understood options and the potential for a big win with an unexpected solution.
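The balance between safe payoffs and risky new options can be seen in a toy problem known as a multi-armed bandit. The sketch below is not the scheduling software used by either group. It uses a textbook strategy called epsilon-greedy, and the reward values, the 10 percent exploration rate and the function name are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a toy multi-armed bandit.

    true_means: hypothetical average reward of each option; the agent
    never sees these values, only the noisy rewards it collects.
    Returns (estimated reward per option, times each option was tried).
    """
    rng = random.Random(seed)
    n = len(true_means)
    estimates = [0.0] * n
    counts = [0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random option, which may turn out better
            arm = rng.randrange(n)
        else:
            # Exploit: take the option that looks best so far
            arm = max(range(n), key=lambda i: estimates[i])
        reward = rng.gauss(true_means[arm], 1.0)  # noisy payoff
        counts[arm] += 1
        # Update the running average reward for the chosen option
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


estimates, counts = epsilon_greedy([1.0, 1.5, 3.0])
best = max(range(len(counts)), key=lambda i: counts[i])
print("most-chosen option:", best, "tries per option:", counts)
```

The agent is never told which option is correct. It learns only from the rewards it receives, and the occasional random choice is what lets it discover an option that pays more than the one it started with.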


A growing field

When computer science graduate student Shubhendu Trivedi of the Toyota Technological Institute at University of Chicago started teaching a graduate course on deep learning with one of his mentors, Risi Kondor, he was pleased with how many researchers from the physical sciences signed up for it. They didn’t know much about how to use AI in their research, and Trivedi realized there was an unmet need for machine learning experts to help scientists in different fields find ways of exploiting these new techniques.

The conversations he had with researchers in his class evolved into collaborations, including participation in the Deep Skies Lab, an astronomy and artificial intelligence research group co-founded by Avestruz, Nord and astronomer Joshua Peek of the Space Telescope Science Institute. Earlier this month, they submitted their first paper for peer review, demonstrating the efficiency of an AI-based method to measure gravitational lensing in the Cosmic Microwave Background.

Similar groups are popping up across the world, from Schawinski’s group in Switzerland to the Centre for Astrophysics and Supercomputing in Australia. And adoption of machine learning techniques in astronomy is increasing rapidly. In an arXiv search of astronomy papers, the terms “deep learning” and “machine learning” appear more in the titles of papers from the first seven months of 2018 than from all of 2017, which in turn had more than 2016. 

“Five years ago, [machine learning algorithms in astronomy] were esoteric tools that performed worse than humans in most circumstances,” Nord says. Today, more and more algorithms are consistently outperforming humans. “You’d be surprised at how much low-hanging fruit there is.”

But there are obstacles to introducing machine learning into astrophysics research. One of the biggest is the fact that machine learning is a black box. “We don’t have a fundamental theory of how neural networks work and make sense of things,” Schawinski says. Scientists are understandably nervous about using tools without fully understanding how they work.

Another, related stumbling block is uncertainty. Machine learning often depends on inputs that all carry some amount of noise or error, and the models themselves make assumptions that introduce uncertainty. Researchers using machine learning techniques in their work need to understand these uncertainties and communicate them accurately to each other and the broader public.

The state of the art in machine learning is changing so rapidly that researchers are reluctant to make predictions about what will be coming even in the next five years. “I would be really excited if as soon as data comes off the telescopes, a machine could look at it and find unexpected patterns,” Nord says. 

No matter what form future advances take, the data keeps coming faster and faster, and researchers are increasingly convinced that artificial intelligence is going to be necessary to help them keep up.