Disgust and Envy

Image processing researchers have mixed feelings of disgust and envy towards this recent trend of deep learning that keeps pushing itself into our court. Some of us have chosen to remain bystanders for now, while others play along and divert their research accordingly. post

A series of papers during the early 2000s suggested the successful application of this architecture, leading to state-of-the-art results in practically any assigned task.

Key aspects in these contributions included the following: the use of many network layers, which explains the term “deep learning;” a huge amount of data on which to train; massive computations typically run on computer clusters or graphic processing units; and wise optimization algorithms that employ effective initializations and gradual stochastic gradient learning.

Neckarfront in Tubingen, Germany in style transfered from The Starry Night by Vincent van Gogh.

Unfortunately, all of these great empirical achievements were obtained with hardly any theoretical understanding of the underlying paradigm. Moreover, the optimization employed in the learning process is highly non-convex and intractable from a theoretical viewpoint.

To further complicate this story, certain deep learning-based contributions bear some elegance that cannot be dismissed. Such is the case with the style-transfer problem, which yielded amazingly beautiful results, and with inversion ideas of learned networks used to synthesize images out of thin air, as Google’s Deep Dream project does.

A few years ago we did not have the slightest idea how to formulate such complicated tasks; now they are solved formidably as a byproduct of a deep neural network trained for the completely extraneous task of visual classification.


Image Style Transfer Using Convolutional Neural Networks. Centre for Integrative Neuroscience, University of Tubingen, Germany. pdf

See Marbled Night for manual recreation of van Gogh.

DeepDream is a computer vision program created by Google which uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like hallucinogenic appearance in the deliberately over-processed images. wikipedia

See Norvig on Chomsky for early controversy

SyntaxNet applies neural networks to the ambiguity problem. At each point in processing many decisions may be possible and a neural network gives scores for competing decisions based on their plausibility. post

See Statistical Modeling Cultures for academic commentary defending each of two cultures.

Deep neural network outputs are not continuous and can be very sensitive to tiny perturbation on the input vectors. Several methods have been proposed for crafting effective perturbation against these networks. pdf


Engineers will train networks to solve problems that will never attract sufficient interest for reasoned solutions. This will demand clever parameter adaption as evidenced by the 'tweening of motion captures in Phase-Functioned Character animation.

Researchers have shown how unsupervised image-to-image translation has learned to translate an image from one domain to another without any corresponding images in two domains in the training dataset. They applied the proposed framework to several unsupervised street scene image translation tasks including sunny to rainy, day to night, summery to snowy, and vice versa. nvidia pdf

Within three years deep learning will change front-end development. It will increase prototyping speed and lower the barrier for building software. post

Brain Score: Brain scientists dive into deep neural networks. science