# Acoustic classification of focus: On the web and in the lab


Ubiquity Press


### Abstract

We present a new methodological approach which combines both naturally-occurring speech harvested on the web and speech data elicited in the laboratory. This proof-of-concept study examines the phenomenon of focus sensitivity in English, in which the interpretation of particular grammatical constructions (e.g., the comparative) is sensitive to the location of prosodic prominence. Machine learning algorithms (support vector machines and linear discriminant analysis) and human perception experiments are used to cross-validate the web-harvested and lab-elicited speech. Results confirm the theoretical predictions for the location of prominence in comparative clauses and the advantages of using both web-harvested and lab-elicited speech. The most robust acoustic classifiers include paradigmatic (i.e., un-normalized), non-intonational acoustic measures (duration and relative formant frequencies from single segments). These acoustic cues are also significant predictors of human listeners’ classification, offering new evidence in the debate over whether prominence is mainly encoded by pitch or by other cues, and on the role that utterance normalization plays when considering non-pitch cues such as duration.
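To illustrate the kind of classification the abstract describes, here is a minimal sketch (not the authors' code or data) that trains the two model families they name, linear discriminant analysis and a support vector machine, to separate two prominence classes from synthetic acoustic measures; the feature names (segment duration, formant frequencies) follow the abstract, but all values are simulated stand-ins:

```python
# Hedged illustration: binary classification of focus/prominence from
# simulated acoustic cues, using the two classifiers named in the abstract.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
# y = 1 stands in for "prominent/focused"; y = 0 for "non-prominent".
y = rng.integers(0, 2, n)

# Simulated cues: focused tokens get longer durations and a higher F1
# on average (purely illustrative effect sizes, not the paper's data).
duration = rng.normal(0.10 + 0.03 * y, 0.02, n)   # seconds
f1 = rng.normal(500 + 40 * y, 60, n)              # Hz
f2 = rng.normal(1500, 150, n)                     # Hz (uninformative here)
X = np.column_stack([duration, f1, f2])

# Cross-validated accuracy for each classifier family.
results = {}
for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("linear SVM", SVC(kernel="linear"))]:
    results[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {results[name]:.2f}")
```

Because the simulated duration difference is large relative to its spread, both classifiers recover the class well above chance, mirroring the paper's finding that duration and formant measures are robust cues.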


### Author and article information

###### Journal
Laboratory Phonology: Journal of the Association for Laboratory Phonology
Ubiquity Press
ISSN: 1868-6354
11 July 2017
Volume 8, Issue 1
###### Affiliations
Montclair State University, US
Cornell University, US
McGill University, CA
###### Article
DOI: 10.5334/labphon.8

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC-BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. See http://creativecommons.org/licenses/by/4.0/.

###### Journal page
https://www.journal-labphon.org/
###### Categories
Journal article

Applied linguistics, General linguistics, Linguistics & Semiotics