Towards de-mystification of deep learning: function space analysis of the representation layers
PLEASE NOTE DIFFERENT DAY (TUESDAY).
We propose a function space approach to Representation Learning [1] and to the analysis of the representation layers in deep learning architectures. We show how to compute a 'weak-type' Besov smoothness index that quantifies the geometry of the clustering in the feature space. This approach has already been applied successfully to improve the performance of machine learning algorithms such as Random Forest [2] and tree-based Gradient Boosting [3]. Our experiments demonstrate that, in well-known and well-performing trained networks, the Besov smoothness of the training set, measured in the feature map representation of the corresponding hidden layer, increases from layer to layer, which relates to the 'unfolding' of the clustering in the feature space. We also contribute to the understanding of generalization [4] by showing how the Besov smoothness of the representations decreases as we add more mislabeling to the training data. We hope this approach will contribute to the de-mystification of some aspects of deep learning.