
Learning curves of generic features maps for realistic datasets with a teacher-student model


          Abstract

Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. The assumptions of Gaussian i.i.d. input data underlying the canonical teacher-student model may, however, be perceived as too restrictive to capture the behaviour of realistic data sets. In this paper, we introduce a Gaussian covariate generalisation of the model where the teacher and student can act on different spaces, generated with fixed, but generic feature maps. While still solvable in closed form, this generalisation is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework. Our contribution is two-fold: first, we prove a rigorous formula for the asymptotic training loss and generalisation error. Second, we present a number of situations where the learning curve of the model captures that of a realistic data set learned with kernel regression and classification, with out-of-the-box feature maps such as random projections or scattering transforms, or with pre-learned ones, such as the features learned by training multi-layer neural networks. We discuss both the power and the limitations of the framework.
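As a rough illustration of the setup described in the abstract, the sketch below applies two different fixed feature maps (a "teacher" map generating the labels and a "student" map used for learning) to Gaussian i.i.d. inputs and fits the student by ridge regression, then reports training loss and generalisation error. The choice of tanh random-projection maps, the dimensions, the noise level and the regularisation are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Hypothetical sketch of a teacher-student setup with generic feature maps.
# All maps, dimensions and noise levels below are illustrative assumptions.

rng = np.random.default_rng(0)
d, k, p, n, n_test = 200, 150, 300, 400, 2000  # input dim, teacher dim, student dim, train/test sizes

F_teacher = rng.standard_normal((k, d)) / np.sqrt(d)  # fixed teacher feature map
F_student = rng.standard_normal((p, d)) / np.sqrt(d)  # fixed student feature map
theta = rng.standard_normal(k)                        # teacher weights

def sample(n_samples, noise=0.1):
    X = rng.standard_normal((n_samples, d))           # Gaussian i.i.d. input data
    V = np.tanh(X @ F_teacher.T)                      # teacher features
    U = np.tanh(X @ F_student.T)                      # student features
    y = V @ theta / np.sqrt(k) + noise * rng.standard_normal(n_samples)
    return U, y

U_train, y_train = sample(n)
U_test, y_test = sample(n_test)

lam = 1e-2                                            # ridge regularisation strength
w = np.linalg.solve(U_train.T @ U_train + lam * np.eye(p), U_train.T @ y_train)

print("training loss:", np.mean((U_train @ w - y_train) ** 2))
print("generalisation error:", np.mean((U_test @ w - y_test) ** 2))
```

Repeating such a simulation while varying the number of samples n traces an empirical learning curve, which is the quantity the paper's closed-form asymptotic formulas describe.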

Most cited references (32)


Distribution of eigenvalues for some sets of random matrices


            Cox's Regression Model for Counting Processes: A Large Sample Study


              Reconciling modern machine-learning practice and the classical bias–variance trade-off

              Breakthroughs in machine learning are rapidly changing science and society, yet our fundamental understanding of this technology has lagged far behind. Indeed, one of the central tenets of the field, the bias–variance trade-off, appears to be at odds with the observed behavior of methods used in modern machine-learning practice. The bias–variance trade-off implies that a model should balance underfitting and overfitting: Rich enough to express underlying structure in data and simple enough to avoid fitting spurious patterns. However, in modern practice, very rich models such as neural networks are trained to exactly fit (i.e., interpolate) the data. Classically, such models would be considered overfitted, and yet they often obtain high accuracy on test data. This apparent contradiction has raised questions about the mathematical foundations of machine learning and their relevance to practitioners. In this paper, we reconcile the classical understanding and the modern practice within a unified performance curve. This “double-descent” curve subsumes the textbook U-shaped bias–variance trade-off curve by showing how increasing model capacity beyond the point of interpolation results in improved performance. We provide evidence for the existence and ubiquity of double descent for a wide spectrum of models and datasets, and we posit a mechanism for its emergence. This connection between the performance and the structure of machine-learning models delineates the limits of classical analyses and has implications for both the theory and the practice of machine learning.
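The double-descent behaviour described in that abstract can be reproduced numerically with a very simple model. The sketch below sweeps the number of random features of a least-squares fit past the interpolation threshold; it is a hedged illustration under assumed dimensions and noise, not the experimental setup of the cited paper.

```python
import numpy as np

# Hypothetical sketch of double descent with random-features least squares
# (minimum-norm fit via the pseudoinverse). Dimensions and noise are illustrative.

rng = np.random.default_rng(1)
d, n, n_test = 50, 100, 2000
beta = rng.standard_normal(d) / np.sqrt(d)            # ground-truth linear rule

X_train = rng.standard_normal((n, d))
y_train = X_train @ beta + 0.1 * rng.standard_normal(n)
X_test = rng.standard_normal((n_test, d))
y_test = X_test @ beta + 0.1 * rng.standard_normal(n_test)

for p in [20, 50, 80, 100, 120, 200, 400, 800]:       # sweep model capacity past p = n
    W = rng.standard_normal((d, p)) / np.sqrt(d)      # fixed random feature map
    Phi_train, Phi_test = np.tanh(X_train @ W), np.tanh(X_test @ W)
    w = np.linalg.pinv(Phi_train) @ y_train           # minimum-norm interpolator once p >= n
    print(f"p = {p:4d}   test MSE = {np.mean((Phi_test @ w - y_test) ** 2):.4f}")
```

Under these assumptions the test error typically rises towards the interpolation threshold p ≈ n and decreases again for larger p, tracing the double-descent shape of the cited reference.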

                Author and article information

Journal: Journal of Statistical Mechanics: Theory and Experiment (J. Stat. Mech.)
Publisher: IOP Publishing
ISSN: 1742-5468
Published online: November 24 2022
Issue date: November 01 2022
Volume: 2022, issue 11, article 114001
DOI: 10.1088/1742-5468/ac9825
Record ID: 25a4a970-6150-42fb-8b77-b024c35e69d0
Copyright: © 2022
License: https://creativecommons.org/licenses/by/4.0/

