17
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Information Plane Analysis of Deep Neural Networks via Matrix-Based Renyi's Entropy and Tensor Kernels

      Preprint

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Analyzing deep neural networks (DNNs) via information plane (IP) theory has gained tremendous attention recently as a tool to gain insight into, among others, their generalization ability. However, it is by no means obvious how to estimate mutual information (MI) between each hidden layer and the input/desired output, to construct the IP. For instance, hidden layers with many neurons require MI estimators with robustness towards the high dimensionality associated with such layers. MI estimators should also be able to naturally handle convolutional layers, while at the same time being computationally tractable to scale to large networks. None of the existing IP methods to date have been able to study truly deep Convolutional Neural Networks (CNNs), such as the e.g.\ VGG-16. In this paper, we propose an IP analysis using the new matrix--based R\'enyi's entropy coupled with tensor kernels over convolutional layers, leveraging the power of kernel methods to represent properties of the probability distribution independently of the dimensionality of the data. The obtained results shed new light on the previous literature concerning small-scale DNNs, however using a completely new approach. Importantly, the new framework enables us to provide the first comprehensive IP analysis of contemporary large-scale DNNs and CNNs, investigating the different training phases and providing new insights into the training dynamics of large-scale neural networks.

          Related collections

          Most cited references14

          • Record: found
          • Abstract: not found
          • Conference Proceedings: not found

          Deep learning and the information bottleneck principle

            Bookmark
            • Record: found
            • Abstract: not found
            • Article: not found

            Input feature selection by mutual information based on Parzen window

              Bookmark
              • Record: found
              • Abstract: not found
              • Article: not found

              Kernel Mean Embedding of Distributions: A Review and Beyond

                Bookmark

                Author and article information

                Journal
                25 September 2019
                Article
                1909.11396
                b045267a-3d81-4b0a-a1d7-aafc04806eb9

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                15 pages, 8 figures
                stat.ML cs.LG

                Machine learning,Artificial intelligence
                Machine learning, Artificial intelligence

                Comments

                Comment on this article