Last month, the artificial intelligence company DeepMind introduced new software that can take a single image of a few objects in a virtual room and, without human guidance, infer what the three-dimensional scene looks like from entirely new vantage points. Given just a handful of such pictures, the system, dubbed the Generative Query Network, or GQN, can successfully model the layout of a simple, video game-style maze.
There are obvious technological applications for GQN, but it has also caught the eye of neuroscientists, who are particularly interested in the training algorithm it uses to learn how to perform its tasks. From the presented image, GQN generates predictions about what a scene should look like — where objects should be located, how shadows should fall against surfaces, which areas should be visible or hidden from a given perspective — and uses the differences between those predictions and its actual observations to improve the accuracy of the predictions it will make in the future. “It was the difference between reality and the prediction that enabled the updating of the model,” said Ali Eslami, one of the project’s leaders.
According to Danilo Rezende, Eslami’s co-author and DeepMind colleague, “the algorithm changes the parameters of its [predictive] model in such a way that next time, when it encounters the same situation, it will be less surprised.”
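To make the mechanism concrete, here is a minimal, hypothetical sketch in Python of the error-driven updating Eslami and Rezende describe: a model makes a prediction, compares it with a noisy observation, and nudges its parameters by the difference. The toy "scene," the parameters and the learning rate are all invented for illustration and bear no relation to GQN's actual neural-network architecture.

```python
import numpy as np

# Toy model of error-driven learning: predict an observation from
# learnable parameters, then shift the parameters toward whatever
# the prediction got wrong. A sketch of the idea, not GQN itself.

rng = np.random.default_rng(0)

true_scene = rng.normal(size=4)   # the "reality" the model never sees directly
params = np.zeros(4)              # the model's current beliefs about the scene
learning_rate = 0.1

for step in range(100):
    prediction = params                                          # what the model expects to see
    observation = true_scene + rng.normal(scale=0.05, size=4)    # noisy sensory input
    error = observation - prediction                             # "the difference between reality and the prediction"
    params += learning_rate * error                              # update so the next prediction is less surprising

print("mean residual error:", np.abs(true_scene - params).mean())
```

After enough rounds, the prediction error shrinks toward the noise floor: when the model next encounters the same situation, in Rezende's phrase, it is less surprised.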
Neuroscientists have long suspected that a similar mechanism drives how the brain works. (Indeed, those speculations are part of what inspired the GQN team to pursue this approach.) According to this “predictive coding” theory, at each level of a cognitive process, the brain generates models, or beliefs, about what information it should be receiving from the level below it. These beliefs get translated into predictions about what should be experienced in a given situation, providing the best explanation of what’s out there so that the experience will make sense. The predictions then get sent down as feedback to lower-level sensory regions of the brain. The brain compares its predictions with the actual sensory input it receives, “explaining away” whatever differences, or prediction errors, it can by using its internal models to determine likely causes for the discrepancies. (For instance, we might have an internal model of a table as a flat surface supported by four legs, but we can still identify an object as a table even if something else blocks half of it from view.)
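The loop that predictive coding describes can also be sketched in toy code. The following Python illustrates one rung of such a hierarchy in the spirit of classic predictive-coding models: a higher level holds beliefs, sends down a prediction of the sensory input, and adjusts its beliefs to explain away the prediction error. The linear generative model, the weights, and the occlusion mask (standing in for the half-hidden table) are all invented for the example, not drawn from any specific brain model.

```python
import numpy as np

# One level of a predictive-coding hierarchy: beliefs r generate a
# top-down prediction W @ r of the sensory input x; the bottom-up
# prediction error updates r until the error is "explained away."

rng = np.random.default_rng(1)

n_causes, n_pixels = 3, 10
W = rng.normal(size=(n_pixels, n_causes))  # generative model: hidden causes -> sensory input

true_causes = np.array([1.0, -0.5, 2.0])
x = W @ true_causes                        # the sensory input actually received

visible = np.ones(n_pixels, dtype=bool)
visible[:4] = False                        # occlude part of the input, like the half-hidden table

r = np.zeros(n_causes)                     # higher-level beliefs, initially uncommitted
for _ in range(200):
    prediction = W @ r                                   # top-down prediction sent to the sensory level
    error = np.where(visible, x - prediction, 0.0)       # prediction error only where input arrives
    r += 0.05 * W.T @ error                              # revise beliefs to reduce future surprise

print("inferred causes:", np.round(r, 2))  # close to true_causes despite the occlusion
```

Even with part of the input hidden, the beliefs converge on the true underlying causes, which is the flavor of inference that lets us recognize a table when something blocks half of it from view.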