      Is Open Access

      The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

      PLoS ONE
      Public Library of Science

          Abstract

          In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the development of the corpus and the specific characteristics of the website and database that give users access to it. We also perform distributional analyses of the corpus and compare it with other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface and free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. The final corpus contains more than 30 million word tokens, 215 thousand word types, and 25 categories of information about each word. The corpus is available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine searches for a list of words, while the complex engine accepts criteria on any of the corpus categories. The output presents all corpus entries that match the criteria specified in the search and can be downloaded as a .csv file. A module in the results delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. The Brazilian Portuguese Lexicon is therefore a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, and it is also a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior.
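
The two search modes and the per-search statistics described in the abstract can be illustrated offline on a downloaded .csv export. The following is a minimal sketch, not the site's actual implementation: the column names (`word`, `pos`, `n_letters`, `freq_per_million`), the helper functions, and the four toy rows are all hypothetical and are not taken from the real corpus.

```python
import csv
import io
import statistics

# Toy stand-in for a downloaded export. Column names and values are
# hypothetical and are NOT real corpus figures.
TOY_EXPORT = """word,pos,n_letters,freq_per_million
casa,noun,4,100.0
gato,noun,4,20.0
correr,verb,6,30.0
falar,verb,5,50.0
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["n_letters"] = int(row["n_letters"])
        row["freq_per_million"] = float(row["freq_per_million"])
        rows.append(row)
    return rows


def simple_search(rows, words):
    """Simple search: keep entries whose word is in the requested list."""
    wanted = set(words)
    return [r for r in rows if r["word"] in wanted]


def complex_search(rows, pos=None, min_letters=None, max_letters=None):
    """Complex search: keep entries meeting every criterion that is set."""
    matches = []
    for r in rows:
        if pos is not None and r["pos"] != pos:
            continue
        if min_letters is not None and r["n_letters"] < min_letters:
            continue
        if max_letters is not None and r["n_letters"] > max_letters:
            continue
        matches.append(r)
    return matches


def basic_stats(rows, field):
    """Summarise one numeric field of a result set."""
    values = [r[field] for r in rows]
    if not values:
        return {"n": 0}
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }


rows = load_rows(TOY_EXPORT)
nouns = complex_search(rows, pos="noun", max_letters=4)
print([r["word"] for r in nouns])
print(basic_stats(nouns, "freq_per_million"))
```

Criteria left as `None` are ignored, so the same function covers single-criterion and multi-criterion queries; a real export would expose more of the 25 information categories, under whatever column names the site uses.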

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 2 December 2015
                Volume: 10
                Issue: 12
                Article number: e0144016
                Affiliations
                [1] CNRS UMR5304, Laboratoire sur le Langage, le Cerveau et la Cognition, Institut de Sciences Cognitives, Bron, France
                [2] Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
                University of Groningen, Netherlands
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: GLE FM. Performed the experiments: GLE FM. Analyzed the data: GLE FM. Contributed reagents/materials/analysis tools: GLE FM. Wrote the paper: GLE FM. Contributed equally to the whole work and manuscript: GLE FM.

                Article
                Manuscript number: PONE-D-15-22325
                DOI: 10.1371/journal.pone.0144016
                PMCID: PMC4668042
                PMID: 26630138
                508c0eda-9fca-4c0c-ad4d-6a3de3e26cf0
                Copyright © 2015

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 22 May 2015
                Accepted: 12 November 2015
                Page count
                Figures: 6, Tables: 7, Pages: 24
                Funding
                This research was supported by a PhD grant from the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) <http://www.cnpq.br/>, Brazil, number 238186/2012-1, to the first author (GLE), and by the Centre National de la Recherche Scientifique (CNRS) <http://www.cnrs.fr/>, France, number L2C2/UMR5304, to the second author (FM). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Custom metadata
                All relevant data are within the paper and its Supporting Information files.