
      A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

      ACM Computing Surveys
      Association for Computing Machinery (ACM)

          Abstract

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to boost the performance of various downstream tasks. These pretraining methods are frequently extended with recurrence, adversarial, or linguistic property masking. Recently, contrastive self-supervised training objectives have enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar. In NLP, however, a single token augmentation can invert the meaning of a sentence during input-input contrastive learning, which has led to input-output contrastive approaches that avoid this issue by instead contrasting over input-label pairs. In this primer, we summarize recent self-supervised and supervised contrastive NLP pretraining methods and describe where they are used to improve language modeling, zero- to few-shot learning, pretraining data efficiency, and specific NLP tasks. We give an overview of key contrastive learning concepts and lessons learned from prior research, and structure the surveyed works by application. Finally, we point to open challenges and future directions for contrastive NLP, to encourage bringing contrastive NLP pretraining closer to recent successes in image representation pretraining.
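
To make the input-output (input-label) contrastive idea mentioned in the abstract concrete, here is a minimal sketch in plain NumPy: an InfoNCE-style softmax scores one sentence embedding against a small set of label embeddings, treating the correct label as the positive and the others as negatives. The `encode` placeholder, the example sentence, and the label descriptions are invented for illustration and are not taken from the primer.

```python
# Hypothetical sketch of input-output (input-label) contrastive scoring.
# The hash-based "encoder" is a stand-in so the example runs without a model.
import hashlib

import numpy as np


def encode(text: str, dim: int = 16) -> np.ndarray:
    """Placeholder encoder: deterministic pseudo-embedding for a string."""
    seed = int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "little")
    vec = np.random.default_rng(seed).normal(size=dim)
    return vec / np.linalg.norm(vec)


def info_nce(anchor: np.ndarray, candidates: np.ndarray, positive_idx: int,
             temperature: float = 0.1) -> float:
    """InfoNCE-style loss: pull the anchor toward its positive candidate and
    push it away from the remaining (negative) candidates."""
    logits = candidates @ anchor / temperature      # similarities of unit vectors
    logits -= logits.max()                          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-log_probs[positive_idx])


# Input-label contrast: one sentence scored against label descriptions.
sentence = "the movie was a delight from start to finish"
labels = ["positive sentiment", "negative sentiment", "neutral sentiment"]
label_vecs = np.stack([encode(label) for label in labels])
print(f"input-label contrastive loss: {info_nce(encode(sentence), label_vecs, 0):.3f}")
```

In practice the encoder would be a pretrained language model and the temperature a tuned hyperparameter; the 0.1 used here is only a placeholder value.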

Most cited references (44)

          Momentum Contrast for Unsupervised Visual Representation Learning

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

A Survey on Contrastive Self-Supervised Learning (open access)

Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. It adopts self-defined pseudo-labels as supervision and uses the learned representations for several downstream tasks. Specifically, contrastive learning has recently become a dominant component in self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing away embeddings of different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods for multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
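
As a rough illustration of the input-input ("augmented views") setup this survey describes, the sketch below computes a simplified NT-Xent/SimCLR-style loss in which each sample's second view is the positive and the other samples in the batch act as negatives. The random base embeddings and the additive-noise "augmentation" are stand-ins for a real encoder and augmentation pipeline; the full NT-Xent formulation additionally draws negatives from within each view.

```python
# Simplified NT-Xent-style loss over two augmented views of a batch.
# Random vectors plus noise stand in for a real encoder and augmentation.
import numpy as np

rng = np.random.default_rng(0)


def normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def view_contrastive_loss(view_a: np.ndarray, view_b: np.ndarray,
                          temperature: float = 0.5) -> float:
    """Average InfoNCE loss where view_b[i] is the positive for view_a[i]
    and every other row of view_b serves as a negative."""
    logits = view_a @ view_b.T / temperature             # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Two "augmented views" of the same batch: embeddings perturbed by small noise.
base = normalize(rng.normal(size=(8, 32)))
view_1 = normalize(base + 0.05 * rng.normal(size=base.shape))
view_2 = normalize(base + 0.05 * rng.normal(size=base.shape))
print(f"matched views loss:  {view_contrastive_loss(view_1, view_2):.3f}")
print(f"shuffled views loss: {view_contrastive_loss(view_1, rng.permutation(view_2)):.3f}")
```

Using in-batch samples as negatives is the usual design choice here because it avoids maintaining an explicit negative queue or memory bank, at the cost of tying the number of negatives to the batch size.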

                Author and article information

Journal: ACM Computing Surveys (ACM Comput. Surv.)
Publisher: Association for Computing Machinery (ACM)
ISSN: 0360-0300 (print); 1557-7341 (electronic)
Publication dates: October 31 2023; February 02 2023
Volume: 55
Issue: 10
Pages: 1-17
                Affiliations
[1] German Research Center for AI, Berlin, Germany, and University of Copenhagen, Copenhagen, Denmark
[2] University of Copenhagen, Copenhagen, Denmark
DOI: 10.1145/3561970
Copyright: © 2023
