Neural networks are remarkably good at solving complex problems, and according to physicists, the reason is buried in the laws of physics.

Deep learning techniques are used across artificial intelligence, in areas such as face recognition and object recognition, and have even mastered the ancient game of Go. Neural nets have achieved great success, yet nobody is sure how they have achieved it. There is no known mathematical reason why networks arranged in layers should be so good at these challenges.

Today that changes thanks to the work of Henry Lin at Harvard University and Max Tegmark at MIT. They say the reason mathematicians have been so embarrassed is that the answer depends on the nature of the universe. In other words, the answer lies in the realm of physics rather than mathematics.

Consider the example of classifying a megapixel grayscale image to determine whether it shows a cat or a dog. Such an image consists of a million pixels, each of which can take one of 256 grayscale values. So in theory there are 256^{1000000} possible images, and for each one it is necessary to compute whether it shows a cat or a dog. And yet neural networks, with merely thousands or millions of parameters, somehow manage this classification task with ease.
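The size of that number is easy to underestimate. As a rough sketch (the variable names are illustrative and not taken from Lin and Tegmark), the following Python snippet counts how many decimal digits 256^{1000000} has and how many bits are needed just to name one image:

```python
import math

PIXELS = 1_000_000   # pixels in the image, as in the example above
LEVELS = 256         # grayscale values per pixel

# Decimal digits of LEVELS**PIXELS, computed with logarithms
# so the number itself never has to be printed.
decimal_digits = math.floor(PIXELS * math.log10(LEVELS)) + 1

# LEVELS**PIXELS equals 2**(8 * PIXELS): naming one image takes 8 bits per pixel.
bits_per_image = PIXELS * int(math.log2(LEVELS))

print(decimal_digits)   # roughly 2.4 million digits
print(bits_per_image)   # 8000000
```

A lookup table with one cat-or-dog entry per possible image is therefore out of the question, which is why a network with only millions of parameters doing the job is so striking.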

Neural nets work by approximating complex mathematical functions with simpler ones.

The problem is that there are orders of magnitude more mathematical functions than possible networks to approximate them. And yet deep neural networks somehow get the right answer.

The scientists have worked out why. The answer is that the universe is governed by a tiny subset of all possible functions. In other words, when the laws of physics are written down mathematically, they can all be described by functions that have a remarkable set of simple properties.

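So deep neural networks don’t have to approximate every possible mathematical function, only a tiny subset of them. To put this in perspective, consider the order of a polynomial function, which is the size of its highest exponent. A quadratic equation like y=x^{2} has order 2, the equation y=x^{24} has order 24, and so on. The number of possible orders is infinite, yet according to Lin and Tegmark the polynomials that appear in the laws of physics are typically of low order, roughly 2 to 4.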
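To see why the order of a polynomial matters so much, it helps to count coefficients. A polynomial of order at most d in n variables has C(n+d, d) coefficients, which is a standard combinatorial fact. The sketch below assumes a hypothetical input of just 10 variables, chosen only for illustration, and compares a few orders:

```python
from math import comb

def num_coefficients(n_vars: int, order: int) -> int:
    """Number of monomials of total degree <= order in n_vars variables."""
    return comb(n_vars + order, order)

N_VARS = 10  # hypothetical small input size, for illustration only

for order in (2, 4, 24):
    # Each monomial needs one coefficient, so this is the number of
    # parameters required to pin the polynomial down.
    print(order, num_coefficients(N_VARS, order))
```

With 10 variables, order 2 needs 66 coefficients and order 4 needs 1001, while order 24 needs more than a hundred million. Restricting attention to low orders shrinks the space of candidate functions dramatically.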

The laws of physics have other important properties. For example, they are usually symmetrical when it comes to rotation and translation. Rotate a cat or dog through 360 degrees and it looks the same; translate it by 10 meters or 100 meters or a kilometer and it will look the same. That also simplifies the task of approximating the process of cat or dog recognition. These properties mean that neural networks do not need to approximate an infinitude of possible mathematical functions but only a tiny subset of the simplest ones.
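One common way for a network to take advantage of translation symmetry is to reuse the same small set of weights at every position, as a convolution does. This is an illustration of the general idea and not something the article attributes to Lin and Tegmark. The toy sketch below, with made-up signal and kernel values, checks that shifting the input simply shifts the output:

```python
def circular_correlate(signal, kernel):
    """Slide the same kernel over every position of a circular 1-D signal."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def shift(signal, k):
    """Translate a circular signal k positions to the right."""
    k %= len(signal)
    return signal[-k:] + signal[:-k]

signal = [0, 0, 3, 7, 1, 0, 0, 0]   # hypothetical 1-D "image"
kernel = [1, -2, 1]                 # hypothetical shared weights

shifted_then_filtered = circular_correlate(shift(signal, 3), kernel)
filtered_then_shifted = shift(circular_correlate(signal, kernel), 3)

# Translating the input only translates the response.
print(shifted_then_filtered == filtered_then_shifted)
```

Because the three kernel weights are shared across all positions, the filter does not have to learn a separate detector for every place a feature might appear.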

There is another property of the universe that neural networks exploit. This is the hierarchy of its structure. “Elementary particles form atoms which in turn form molecules, cells, organisms, planets, solar systems, galaxies, etc.,” say Lin and Tegmark. And complex structures are often formed through a sequence of simpler steps.

This is why the structure of neural networks is important too: the layers in these networks can approximate each step in the causal sequence.
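A toy sketch can make the idea of layers as steps concrete. It is not Lin and Tegmark's construction, and the function names are made up for illustration: three simple order-2 steps, applied one after another, reproduce an order-8 polynomial that would look far more complicated if written in one go.

```python
from functools import reduce

def square(x):
    """One simple step: an order-2 polynomial."""
    return x * x

def compose(*steps):
    """Chain steps so each one feeds the next, like successive layers."""
    return lambda x: reduce(lambda acc, step: step(acc), steps, x)

# Three order-2 "layers" stacked in sequence give x**8.
deep = compose(square, square, square)

print(deep(3))   # 6561
print(3 ** 8)    # 6561
```

Each layer only has to represent a simple function. The complexity comes from stacking them, which mirrors the way complex structures in nature are built from a sequence of simpler steps.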

(Via: MitTechReview)