How Computers Are Learning to Predict the Future
And do we want them to?
From the time we are born, we are taking in scenes, recording input in our brains and learning how to predict the future and act accordingly. For example, we know if a car is speeding toward us in the wrong lane, we should get out of the way or the following moments could be a disaster. Back in 2016, researchers from the Massachusetts Institute of Technology enabled computers to predict the future in a similar way.
Algorithms partially modeled on the human brain reviewed 2 million online videos and observed how different types of scenes usually unfold: people walking in a park or playing golf, waves crashing on the shore, a train arriving at the station and so on. The MIT computer could then look at a single image and generate a short 1.5-second video clip showing its vision of the immediate future.
The system is learning what is plausible, recreating the scene and adding future motions you might see. According to Carl Vondrick, one of the lead scientists on the project, the system’s ability to predict normal behavior could help spot unusual happenings in security footage or improve the reliability of self-driving cars. For example, an onboard computer running this program in an autonomous vehicle may recognize the unusual motion of an animal running into the road and return control of the vehicle to the driver.
The MIT team relied on deep learning, a technique central to modern artificial intelligence research. Deep learning is part of a broader family of machine learning methods based on artificial neural networks: systems that “learn” to perform tasks by considering examples, generally without being programmed with task-specific rules.
MIT is not the only research facility developing a future-predicting computer system. In 2018, scientists at Germany’s University of Bonn taught a computer to predict events five minutes in advance. The ability to create a plausible future prediction increased threefold in just one year, and scientists think we are just scratching the surface.
Experts say computers examining large data sets and applying deep learning could help make medical diagnoses, detect bank fraud and improve self-driving cars. Already, it is the approach that helps digital assistants like Apple’s Siri and Amazon’s Alexa seem to get to know us and understand what we want. Amazon is applying the technology to predict what customers want to order before they even order it.
“Deep neural networks are performing better than humans on all kinds of significant problems, like image recognition, for example,” says Chris Nicholson, CEO of San Francisco startup Skymind, which develops deep learning software and offers consulting. “Without them, I think self-driving cars would be a danger on the roads, but with them, self-driving cars are safer than human drivers.”
Neural networks take low-level inputs, like the pixels of an image or snippets of audio, and run them through a series of virtual layers, which assign relative weights to each individual piece of data in interpreting the input. The “deep” in deep learning refers to using tall stacks of these layers to collectively uncover more complex patterns in the data, expanding its understanding from pixels to basic shapes to features like stop signs and brake lights. To train the networks, programmers repeatedly test them on large sets of data, automatically tweaking the weights so the network makes progressively fewer mistakes over time.
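The layered processing described above can be sketched in a few lines of Python. This is only an illustration: the function names (`dense`, `relu`, `forward`), the four-pixel input and the hand-picked weights in `LAYERS` are all assumptions made for the example, not anything taken from the MIT system or another real network.

```python
def dense(inputs, weights, biases):
    """One layer: every output is a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Keep positive signals and zero out negative ones."""
    return [max(0.0, v) for v in values]


def forward(pixels, layers):
    """Pass the input through each layer in turn; the last layer is left linear."""
    activations = pixels
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical weights, chosen by hand purely for illustration.
# Layer 1 turns four "pixels" into two features that respond to a
# bright-then-dark pair; layer 2 combines those features into one score.
LAYERS = [
    ([[1.0, -1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0, -1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]

score = forward([0.9, 0.1, 0.8, 0.2], LAYERS)
print(score)
```

In a real network the weights are not written by hand. They start out random and are adjusted automatically during training, and there are typically millions of them spread across many more layers.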
While research into neural networks, loosely based on the human brain, dates back decades, progress has been particularly remarkable in roughly the past ten years, Nicholson says. A 2006 set of papers by renowned computer scientist Geoffrey Hinton, who now divides his time between Google and the University of Toronto, helped pave the way for deep learning’s rapid development.
In 2012, a team including Hinton was the first to use deep learning to win a prestigious computer science competition called the ImageNet Large Scale Visual Recognition Challenge. The team’s program beat rivals by a wide margin at classifying objects in photographs into categories, performing with a 15.3 percent error rate compared to a 26.2 percent rate for the second-place entry.
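To put that margin in perspective, the two error rates quoted above can be compared directly. The short calculation below uses only those two figures; the variable names are made up for the example.

```python
winner_error = 15.3      # percent, the Hinton team's entry
runner_up_error = 26.2   # percent, the second-place entry

# Absolute gap in percentage points.
gap_points = round(runner_up_error - winner_error, 1)

# Relative reduction: what share of the runner-up's mistakes the winner avoided.
relative_reduction = round(
    100 * (runner_up_error - winner_error) / runner_up_error, 1
)

print(gap_points, relative_reduction)
```

In other words, the winning program made roughly two-fifths fewer mistakes than its closest rival, a gap of about 11 percentage points.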
In 2016, a Google-designed computer trained by deep learning defeated one of the world’s top Go players, a feat many experts of the ancient Asian board game had previously thought could be decades away. The system, called AlphaGo, learned in part by playing millions of simulated games against itself. While human chess players have long been bested by digital rivals, many experts had thought Go, which has significantly more sequences of valid moves, could be harder for computers to grasp.
In early November 2016, a group from the University of Oxford unveiled a deep learning-based lipreading system that can outperform human experts. That same month, a team including researchers from Google published a paper in the Journal of the American Medical Association showing that deep learning could spot diabetic retinopathy roughly as well as trained ophthalmologists. That eye condition can cause blindness in people with diabetes, especially if they don’t have access to testing and treatment.
“A lot of people don’t have access to a specialist who can access these [diagnostic] films, especially in underserved populations where the incidence of diabetes is going up and the number of eyecare professionals is flat,” says Dr. Lily Peng, a product manager at Google and lead author on the paper.
Like many of deep learning’s successes, the retinopathy research relied on a large set of training data, including roughly 128,000 images already classified by ophthalmologists. Deep learning is fundamentally a technique for the internet age, requiring datasets that only a few years ago would have been too big to even fit on a hard drive.
“It’s not as useful in cases where there’s not much data available,” Vondrick says. “If it’s very difficult to acquire data, then deep learning may not get you as far.”
Computers need a lot more examples than humans do to learn the same skills. Recent editions of the ImageNet challenge, which has added harder object recognition and scene analysis tasks as algorithms have grown more sophisticated, included hundreds of gigabytes of training data, orders of magnitude more than fits on a CD or DVD. Developers at Google train new algorithms from the company’s sweeping archive of search results and clicks, and companies racing to build self-driving vehicles collect vast amounts of sensor readings from heavily instrumented, human-driven cars.
“Getting the right type of data is actually the most critical bit,” says Sameep Tandon, CEO of Bay Area autonomous car startup Drive.ai. “One hundred hours of just driving straight down Highway 5 in California is not going to help when you’re driving down El Camino in Mountain View, for example.”
Once all that data is collected, the neural networks still need to be trained. Experts say, with a bit of awe, that the math operations involved aren’t beyond an advanced high school student—some clever matrix multiplications to weight the data points and a bit of calculus to refine the weights in the most efficient way—but all those computations still add up.
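Those two ingredients, a matrix multiplication to weight the data and a bit of calculus to refine the weights, fit in a short Python sketch. It trains a single linear layer with plain gradient descent on made-up data, which is a deliberately simplified stand-in for a deep network and not the method of any team mentioned here. The function names, the toy data and the learning rate are all assumptions chosen for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix by a vector: one weighted sum per row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def train(examples, targets, steps=500, learning_rate=0.1):
    """Fit one weight per input column by gradient descent on squared error."""
    weights = [0.0] * len(examples[0])
    count = len(examples)
    losses = []
    for _ in range(steps):
        predictions = matvec(examples, weights)
        errors = [p - t for p, t in zip(predictions, targets)]
        losses.append(sum(e * e for e in errors) / count)
        # Calculus step: the derivative of the mean squared error
        # with respect to each weight.
        gradient = [
            2.0 / count * sum(e * row[j] for e, row in zip(errors, examples))
            for j in range(len(weights))
        ]
        # Nudge every weight a small step in the direction that lowers the error.
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights, losses


# Toy data following target = 2 * x + 1. The constant second column
# lets the second weight act as the "+ 1" offset.
EXAMPLES = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
TARGETS = [1.0, 3.0, 5.0, 7.0]

weights, losses = train(EXAMPLES, TARGETS)
print(weights, losses[0], losses[-1])
```

Here the loop runs 500 times over four examples and two weights. Real systems repeat the same kind of update over millions of examples and weights, which is where the computing cost comes from.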
“If you have this massive dataset, but only a very weak computer, you’re going to be waiting a long time to train that model,” says Evan Shelhamer, a graduate student at the University of California at Berkeley and lead developer on Caffe, a widely-used open source toolkit for deep learning.
Only modern computers, along with an internet-enabled research community sharing tools and data, have made deep learning practical. But researchers say it’s still not a perfect fit for every situation. One limitation is that it can be difficult to understand how neural networks are actually interpreting the data, something that could give regulators pause if the algorithms are used for sensitive tasks like driving cars, evaluating medical images, or computing credit scores.
“Right now, deep learning does not have enough explanatory power,” Nicholson says. “It cannot always tell you why it reached a decision, even if it’s reaching that decision with better accuracy than any other [technique].”
The systems could also have blind spots not caught by initial training and test data, potentially leading to unexpected errors in unusual situations. And perhaps luckily for humans, current deep learning systems aren’t intelligent enough to learn new skills on their own, even skills closely related to what they can already do, without a good deal of separate training.
“A network for identifying coral knows nothing about identifying, even, grass from a sidewalk,” Shelhamer says. “The Go network isn’t just going to become a master at checkers on its own.”