Today we present a systematic study comparing wide and deep #NeuralNetworks through the lens of their hidden representations and outputs. Learn how similarities between model layers can inform researchers about model performance and behavior at https://t.co/2iM9HBDQjC pic.twitter.com/2CNeYNTNyR— Google AI (@GoogleAI) May 4, 2021
From Google's AI Blog: Do Wide and Deep Networks Learn the Same Things?
A common practice to improve a neural network’s performance and tailor it to available computational resources is to adjust the architecture depth and width. Indeed, popular families of neural networks, including EfficientNet, ResNet and Transformers, consist of a set of architectures of flexible depths and widths. However, beyond the effect on accuracy, there is limited understanding of how these fundamental choices of architecture design affect the model, such as the impact on its internal representations.
In “Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth”, we perform a systematic study of the similarity between wide and deep networks from the same architectural family through the lens of their hidden representations and final outputs. In very wide or very deep models, we find a characteristic block structure in their internal representations, and establish a connection between this phenomenon and model overparameterization. Comparisons across models demonstrate that those without the block structure show significant similarity between representations in corresponding layers, but those containing the block structure exhibit highly dissimilar representations. These properties of the internal representations in turn translate to systematically different errors at the class and example levels for wide and deep models when they are evaluated on the same test set.
More on the blog.