
      Deep Variational Information Bottleneck

      Preprint


          Abstract

          We present a variational approximation to the information bottleneck of Tishby et al. (1999). This variational approach allows us to parameterize the information bottleneck model using a neural network and leverage the reparameterization trick for efficient training. We call this method "Deep Variational Information Bottleneck", or Deep VIB. We show that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.
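The recipe in the abstract — a stochastic encoder trained with the reparameterization trick against a task term plus a beta-weighted compression term — can be sketched in a few lines. This is a minimal NumPy illustration of the standard VIB setup, not the authors' implementation; the function names, the diagonal-Gaussian encoder, and the standard-normal prior are assumptions consistent with common practice.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).

    Sampling z this way keeps the objective differentiable in mu and log_var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def vib_loss(mu, log_var, logits, y, beta=1e-3):
    """VIB-style objective: cross-entropy + beta * KL(q(z|x) || N(0, I)).

    mu, log_var : (batch, K) encoder outputs defining the stochastic code z
    logits      : (batch, C) decoder outputs from a sampled z
    y           : (batch,) integer class labels
    """
    # Task term: cross-entropy, a variational bound related to I(Z; Y).
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(y)), y].mean()
    # Compression term: analytic KL between the diagonal Gaussian q(z|x)
    # and a standard-normal prior, bounding I(Z; X) from above.
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var).sum(axis=1).mean()
    return ce + beta * kl
```

The beta hyperparameter trades off compression against task performance; beta = 0 recovers an ordinary (stochastic-embedding) classifier.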

          Related collections

Most cited references (7)


          Acceleration of Stochastic Approximation by Averaging


            Deep Learning and the Information Bottleneck Principle

Deep Neural Networks (DNNs) are analyzed via the theoretical framework of the information bottleneck (IB) principle. We first show that any DNN can be quantified by the mutual information between the layers and the input and output variables. Using this representation we can calculate the optimal information theoretic limits of the DNN and obtain finite sample generalization bounds. The advantage of getting closer to the theoretical limit is quantifiable both by the generalization bound and by the network's simplicity. We argue that the optimal architecture — the number of layers and the features/connections at each layer — is related to the bifurcation points of the information bottleneck tradeoff, namely, relevant compression of the input layer with respect to the output layer. The hierarchical representations of the layered network naturally correspond to the structural phase transitions along the information curve. We believe that this new insight can lead to new optimality bounds and deep learning algorithms.
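Quantifying a layer "by the mutual information between the layers and the input and output variables" is commonly operationalized by discretizing activations and computing a plug-in estimate of I(X; T) from empirical counts. The sketch below is one such toy estimator for already-binned (discrete) samples; the function name is illustrative and the binning step itself is left to the caller.

```python
import numpy as np

def discrete_mutual_information(x, t):
    """Plug-in estimate of I(X; T), in nats, from paired discrete samples.

    x, t : equal-length sequences of hashable (e.g. binned) values.
    """
    n = len(x)
    # Empirical joint counts over (x, t) pairs.
    joint = {}
    for xi, ti in zip(x, t):
        joint[(xi, ti)] = joint.get((xi, ti), 0) + 1
    # Marginal counts derived from the joint.
    px, pt = {}, {}
    for (xi, ti), c in joint.items():
        px[xi] = px.get(xi, 0) + c
        pt[ti] = pt.get(ti, 0) + c
    # I(X; T) = sum_{x,t} p(x,t) * log( p(x,t) / (p(x) p(t)) ).
    mi = 0.0
    for (xi, ti), c in joint.items():
        mi += (c / n) * np.log(c * n / (px[xi] * pt[ti]))
    return mi
```

For identical variables this returns the entropy of the variable; for independent ones it returns (approximately) zero, which is what tracing I(X; T) across layers relies on.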

              Information based clustering

              In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here we reformulate the clustering problem from an information theoretic perspective which avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "prototype", does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures non-linear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures.
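As a rough illustration of clustering by a collective notion of similarity rather than prototypes, the toy sketch below greedily reassigns each point to the cluster with the highest mean similarity to its current members. This is a hypothetical stand-in, not the authors' algorithm: it assumes a precomputed symmetric similarity matrix, a fixed cluster count, and a simple deterministic initialization.

```python
import numpy as np

def info_cluster(sim, n_clusters, n_iter=50):
    """Greedy hard-assignment optimizer for a collective-similarity
    objective: raise the average pairwise similarity within clusters.

    sim : (n, n) symmetric similarity matrix (larger = more similar)
    """
    n = sim.shape[0]
    labels = np.arange(n) % n_clusters  # simple deterministic init
    idx = np.arange(n)
    for _ in range(n_iter):
        changed = False
        for i in range(n):
            # Mean similarity of point i to each cluster, excluding itself.
            scores = np.full(n_clusters, -np.inf)
            for c in range(n_clusters):
                members = np.flatnonzero((labels == c) & (idx != i))
                if members.size:
                    scores[c] = sim[i, members].mean()
            best = int(np.argmax(scores))
            if np.isfinite(scores[best]) and best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:  # converged: a full sweep made no reassignment
            break
    return labels
```

Because the objective depends only on the similarity matrix, no cluster prototype or explicit distance metric over the raw data is needed — the point the abstract emphasizes.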

Author and article information

Journal
Date: 2016-12-01
arXiv ID: 1612.00410
Record ID: 5ea15a4c-377b-4904-a78a-97e369a4c019
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

History
Comments: 13 pages, 4 figures, ICLR 2017 submission
Subject classes: cs.LG, cs.IT, math.IT
Keywords: Numerical methods, Information systems & theory, Artificial intelligence
