      Is Open Access

      Automatic Speech Recognition from Neural Signals: A Focused Review

      review-article
      Frontiers in Neuroscience
      Frontiers Media S.A.
      ASR, automatic speech recognition, ECoG, fNIRS, EEG, speech, BCI, brain-computer interface


          Abstract

          Speech interfaces have become widely accepted and are now integrated into various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which may be impossible in loud environments, when speaking would disturb bystanders, or when a person is unable to produce speech (e.g., patients suffering from locked-in syndrome). For these reasons, it would be highly desirable not to speak but simply to imagine saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition (ASR) technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for ASR from neural signals because of their low temporal resolution, but are very useful for investigating the underlying neural mechanisms involved in speech processes. In contrast, electrophysiological activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of ASR techniques applied to neural signals, we discuss the Brain-to-text system.


          Most cited references (40)


          BCI2000: a general-purpose brain-computer interface (BCI) system.

          Many laboratories have begun to develop brain-computer interface (BCI) systems that provide communication and control capabilities to people with severe motor disabilities. Further progress and realization of practical applications depend on systematic evaluations and comparisons of different brain signals, recording methods, processing algorithms, output formats, and operating protocols. However, the typical BCI system is designed specifically for one particular BCI method and is, therefore, not suited to the systematic studies that are essential for continued progress. In response to this problem, we have developed a documented general-purpose BCI research and development platform called BCI2000. BCI2000 can incorporate, alone or in combination, any brain signals, signal processing methods, output devices, and operating protocols. This report is intended to describe to investigators, biomedical engineers, and computer scientists the concepts that the BCI2000 system is based upon and to give examples of successful BCI implementations using this system. To date, we have used BCI2000 to create BCI systems for a variety of brain signals, processing methods, and applications. The data show that these systems function well in online operation and that BCI2000 satisfies the stringent real-time requirements of BCI systems. By substantially reducing labor and cost, BCI2000 facilitates the implementation of different BCI systems and other psychophysiological experiments. It is available with full documentation and free of charge for research or educational purposes and is currently being used in a variety of studies by many research groups.

            Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials

            This paper describes the development and testing of a system whereby one can communicate through a computer by using the P300 component of the event-related brain potential (ERP). Such a system may be used as a communication aid by individuals who cannot use any motor system for communication (e.g., 'locked-in' patients). The 26 letters of the alphabet, together with several other symbols and commands, are displayed on a computer screen which serves as the keyboard or prosthetic device. The subject focuses attention successively on the characters he wishes to communicate. The computer detects the chosen character on-line and in real time. This detection is achieved by repeatedly flashing rows and columns of the matrix. When the elements containing the chosen character are flashed, a P300 is elicited, and it is this P300 that is detected by the computer. We report an analysis of the operating characteristics of the system when used with normal volunteers, who took part in 2 experimental sessions. In the first session (the pilot study/training session) subjects attempted to spell a word and convey it to a voice synthesizer for production. In the second session (the analysis of the operating characteristics of the system) subjects were required simply to attend to individual letters of a word for a specific number of trials while data were recorded for off-line analysis. The analyses suggest that this communication channel can be operated accurately at the rate of 0.20 bits/sec. In other words, under the conditions we used, subjects can communicate 12.0 bits, or 2.3 characters, per min.
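The reported rates can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: the symbol count of 36 is hypothetical (the abstract says only "26 letters of the alphabet, together with several other symbols and commands"), and it assumes every symbol is equally likely and selected without error, so each selection carries log2(N) bits. The function names are invented for this example.

```python
import math


def bits_per_minute(bits_per_second: float) -> float:
    """Convert an information rate from bits/sec to bits/min."""
    return bits_per_second * 60.0


def chars_per_minute(bits_per_second: float, n_symbols: int) -> float:
    """Characters per minute, assuming each error-free selection among
    n_symbols equiprobable symbols carries log2(n_symbols) bits."""
    bits_per_char = math.log2(n_symbols)
    return bits_per_minute(bits_per_second) / bits_per_char


if __name__ == "__main__":
    rate = 0.20      # bits/sec, as stated in the abstract
    n_symbols = 36   # ASSUMPTION: not stated in the abstract
    print(round(bits_per_minute(rate), 1))              # 12.0
    print(round(chars_per_minute(rate, n_symbols), 1))  # 2.3
```

Under these assumptions, 0.20 bits/sec corresponds to 12.0 bits/min, and at roughly 5.2 bits per character this gives about 2.3 characters per minute, consistent with the figures quoted above.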
              Is Open Access

              Can parametric statistical methods be trusted for fMRI based group studies?

              The most widely used task fMRI analyses use parametric methods that depend on a variety of assumptions. While individual aspects of these fMRI models have been evaluated, they have not been evaluated in a comprehensive manner with empirical data. In this work, a total of 2 million random task fMRI group analyses have been performed using resting state fMRI data, to compute empirical familywise error rates for the software packages SPM, FSL and AFNI, as well as a standard non-parametric permutation method. While there is some variation, for a nominal familywise error rate of 5% the parametric statistical methods are shown to be conservative for voxel-wise inference and invalid for cluster-wise inference; in particular, cluster size inference with a cluster defining threshold of p = 0.01 generates familywise error rates up to 60%. We conduct a number of follow-up analyses and investigations that suggest the cause of the invalid cluster inferences is spatial autocorrelation functions that do not follow the assumed Gaussian shape. By comparison, the non-parametric permutation test, which is based on a small number of assumptions, is found to produce valid results for voxel-wise as well as cluster-wise inference. Using real task data, we compare the results between one parametric method and the permutation test, and find stark differences in the conclusions drawn between the two using cluster inference. These findings speak to the need for validating the statistical methods being used in the neuroimaging field.
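To make the idea of a non-parametric permutation test concrete, here is a minimal sketch of one common variant: an exact sign-flipping test for a one-sample group mean. It is an assumption that this variant resembles the procedure evaluated in the study; the abstract does not specify it. The function name and the example values are hypothetical, and the sketch handles a single value per subject rather than whole-brain voxel or cluster statistics.

```python
from itertools import product
from statistics import mean


def sign_flip_p_value(values):
    """Exact one-sided p-value for 'group mean > 0'.

    Under the null hypothesis each subject's value is assumed to be
    symmetric around zero, so every pattern of sign flips is equally
    likely. The p-value is the fraction of sign patterns whose mean is
    at least as large as the observed mean.
    """
    observed = mean(values)
    at_least_as_large = 0
    total = 0
    # Enumerate all 2**n sign patterns (feasible only for small n).
    for signs in product((1, -1), repeat=len(values)):
        total += 1
        flipped_mean = mean([s * v for s, v in zip(signs, values)])
        if flipped_mean >= observed - 1e-12:  # tolerance for float error
            at_least_as_large += 1
    return at_least_as_large / total


if __name__ == "__main__":
    # Hypothetical per-subject effect estimates at one voxel.
    print(sign_flip_p_value([1.0, 2.0, 3.0, 4.0, 5.0]))  # 0.03125
```

With five all-positive values, only the unflipped pattern reaches the observed mean, so the p-value is 1/32. The test makes no Gaussian assumption about the data, which is the property the abstract contrasts with the parametric methods.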

                Author and article information

                Journal
                Title: Frontiers in Neuroscience (Front. Neurosci.)
                Publisher: Frontiers Media S.A.
                ISSN: 1662-4548, 1662-453X
                Published: 27 September 2016
                Volume: 10
                Article number: 429
                Affiliations
                Cognitive Systems Lab, Department for Mathematics and Computer Science, University of Bremen, Bremen, Germany
                Author notes

                Edited by: Giovanni Mirabella, Sapienza University of Rome, Italy

                Reviewed by: Andrea Brovelli, Centre national de la recherche scientifique (CNRS), France; Elizaveta Okorokova, National Research University Higher School of Economics, Russia

                Article
                DOI: 10.3389/fnins.2016.00429
                PMCID: PMC5037201
                PMID: 27729844
                d1400829-b188-4024-956e-f226aae9725b
                Copyright © 2016 Herff and Schultz.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 15 April 2016
                Accepted: 05 September 2016
                Page count
                Figures: 2, Tables: 0, Equations: 0, References: 54, Pages: 7, Words: 5831
                Categories
                Neuroscience
                Focused Review

                Neurosciences
