Concept Whitening – Understanding Deep Neural Networks

Neurons, synapses, and nodes are terms most of us have come across at some point. A 'Deep Neural Network', however, is only loosely inspired by its biological namesakes. A deep neural network does not simply follow a fixed algorithm: it can predict a solution for a task and draw conclusions from its previous experience without being explicitly programmed for each case.

Deep Neural Networks

With their numerous layers of nodes, deep neural networks are complex and extremely large. Whatever their advantages, these stacked layers make interpretation difficult, sometimes nearly impossible, and a model that cannot be interpreted is hard to trust. Instead of trying to analyze a neural network post hoc, researchers have therefore proposed a mechanism called 'concept whitening' that helps us understand the computation leading up to a particular layer.

Because of the immense number of layers, the inner mechanism of a neural network is often compared to a cobweb: the interdependence of the layers is hard to trace, yet the whole structure holds together and functions successfully. Interpretability in deep learning is therefore extremely important, but the computations inside a neural network remain challenging to understand. Concept whitening is a new approach introduced to expose the inner workings of a deep neural network.

The role of Latent Space

Latent space plays a pivotal role in deep learning. It is an abstract, multi-dimensional space of feature values that can only be interpreted indirectly; it encodes an internal representation of externally observed events. Learning a good latent space helps a deep learning model make better sense of the data it observes.

In an AI model, the latent space corresponds to the individual layers of the deep learning model: each layer encodes the features it learns during training as a set of numerical values stored in its parameters. The lower layers of a multilayered convolutional neural network learn basic features such as corners and edges, while higher layers learn to detect more complex features like faces, objects, and full scenes.
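
These intermediate activations, which make up the latent space, can be inspected directly. The sketch below uses PyTorch forward hooks on a torchvision ResNet-18 purely as an illustration; the specific layer names ("layer1", "layer4") and the random dummy input are assumptions for the example, not part of the concept whitening work itself.

```python
# Minimal sketch: peeking at the latent space of a CNN with forward hooks.
import torch
import torchvision.models as models

model = models.resnet18(weights=None)   # untrained weights keep the example offline
model.eval()

activations = {}

def save_activation(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

# Lower layer: tends to capture edges/corners; higher layer: more abstract features.
model.layer1.register_forward_hook(save_activation("layer1"))
model.layer4.register_forward_hook(save_activation("layer4"))

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # dummy image batch

for name, feats in activations.items():
    print(name, tuple(feats.shape))     # e.g. layer1 (1, 64, 56, 56), layer4 (1, 512, 7, 7)
```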

Concept Whitening – a disentangling process

Concept whitening constrains the latent space to represent target concepts and also provides a direct means to extract them. Its key feature is that it does not force the concepts to be learned as an intermediate step; rather, it aligns the axes of the latent space with the concepts. Concept whitening therefore increases the interpretability of a neural network model and gives a clear picture of how the network gradually learns concepts over its layers. The concept whitening module is inserted into a neural network in place of a batch normalization module.
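
To make the idea concrete, the snippet below sketches only the decorrelation step that whitening performs on a batch of activations. It is a minimal illustration under simplified assumptions, not the published Concept Whitening module, which additionally learns an orthogonal rotation that aligns individual axes with labelled concepts.

```python
# Simplified ZCA whitening of a batch of activations (illustrative only).
import torch

def zca_whiten(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """x: (batch, features). Returns decorrelated, unit-variance features."""
    x_centered = x - x.mean(dim=0, keepdim=True)
    cov = x_centered.T @ x_centered / (x.shape[0] - 1)
    eigvals, eigvecs = torch.linalg.eigh(cov)
    # ZCA whitening matrix: U * diag(1/sqrt(lambda)) * U^T
    w = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.T
    return x_centered @ w

x = torch.randn(256, 16) @ torch.randn(16, 16)   # batch of correlated features
x_white = zca_whiten(x)
# Covariance of the whitened activations is approximately the identity matrix,
# i.e. the features now look like uncorrelated "white noise".
print(torch.cov(x_white.T))
```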

Concept whitening in deep neural network models

Researchers from the Prediction Analysis Lab at Duke University have published a paper in Nature Machine Intelligence on using concept whitening for deep neural network interpretability. The module first performs a whitening transformation on the activations, analogous to turning a signal into white noise: the features are decorrelated and normalized. Through many trials, the researchers found that concept whitening can be applied to any layer of a deep neural network to gain interpretability without negatively affecting performance. A further direction of the research is organizing concepts into hierarchies and disentangling clusters of concepts rather than individual concepts.
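
Because the module takes the slot normally occupied by a batch normalization layer, a rough sketch of how such a layer could be dropped into a small convolutional block is shown below. The WhitenLayer class is a hypothetical stand-in written for this illustration; the published Concept Whitening module also rotates the whitened axes to align them with labelled concepts.

```python
# Hypothetical whitening layer occupying the position of a BatchNorm2d layer.
import torch
import torch.nn as nn

class WhitenLayer(nn.Module):
    def forward(self, x, eps: float = 1e-5):             # x: (batch, channels, h, w)
        b, c, h, w = x.shape
        flat = x.permute(0, 2, 3, 1).reshape(-1, c)       # each pixel = one sample
        flat = flat - flat.mean(dim=0, keepdim=True)
        cov = flat.T @ flat / (flat.shape[0] - 1)
        vals, vecs = torch.linalg.eigh(cov)
        w_mat = vecs @ torch.diag((vals + eps).rsqrt()) @ vecs.T   # ZCA whitening
        white = flat @ w_mat
        return white.reshape(b, h, w, c).permute(0, 3, 1, 2)

block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    WhitenLayer(),                                        # stands in for nn.BatchNorm2d(16)
    nn.ReLU(),
)
print(block(torch.randn(4, 3, 32, 32)).shape)             # torch.Size([4, 16, 32, 32])
```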
