
A cross-linguistic acoustic study of voiceless fricatives

Abstract
Results of an acoustic study of voiceless fricatives in seven languages are presented. Three measurements were taken: duration, center of gravity, and overall spectral shape. In addition, formant transitions from adjacent vowels were measured for a subset of the fricatives in certain languages. Fricatives were well differentiated in terms of overall spectral shape and their co-articulation effects on formant transitions for adjacent vowels. The center of gravity measurement also proved useful in differentiating certain fricatives. Duration generally was less useful in differentiating the fricatives. In general, results were consistent across speakers and languages, with lateral fricatives displaying the greatest interlanguage variation in their acoustic properties and /s/ providing the greatest source of interspeaker variation.


Matthew Gordon, Paul Barthmaier, and Kathy Sands
University of California, Santa Barbara
1. Introduction
As early as the pioneering studies of fricatives carried out by Hughes and Halle (1956), Strevens
(1960), and Jassem (1962), it has been clear that fricatives are potentially differentiated along a
number of acoustic parameters, e.g. spectral shape, duration, overall intensity. While a relatively
large body of research has indicated that a number of properties may be fruitfully used to classify
fricatives, the set of languages forming the basis for generalizations about the acoustic structure of
fricatives remains very limited. Numerous studies have examined the fricatives of English, e.g.
Hughes and Halle (1956), Harris (1958), Forrest et al. (1988), Behrens and Blumstein (1988a, b),
Tomiak (1990), Jongman et al. (2000), and various fricatives produced by trained phoneticians,
e.g. Strevens (1960), Jassem (1962), Shadle (1985), Shadle et al. (1991), while relatively
few studies have targeted fricatives in languages other than English, e.g. Halle (1959) on Russian,
Westerdale (1969) on French, Lindblad (1980) on Swedish, Kudela (1968) and Jassem (1979) on
Polish, Lacerda (1982) on Portuguese, Norlin (1983) on Cairene Arabic, Svantesson (1986) on
Mandarin Chinese, and Tronnier and Dantsuji (1993) on Japanese and German. There is a particularly
acute dearth of studies employing the same set of measures to compare fricatives produced by
several speakers of different languages. Probably the largest cross-linguistic study of fricatives is
Nartey’s (1982) unpublished UCLA dissertation, which presents auditory spectra, expressed in
critical bands, of fricatives in fourteen languages.
The present study seeks to increase our typological knowledge of the acoustic structures of
fricatives by examining data on fricatives in a genetically diverse set of seven languages: Aleut,
Apache, Chickasaw, Gaelic, Hupa, Montana Salish, and Toda. All of these languages possess
relatively rich fricative inventories consisting of between four and nine fricatives, thereby allowing
for cross-linguistic comparison of fricatives. Several of the examined languages also include
fricatives which have not been the subject of previous quantitative study; the present study thus
broadens our understanding of the acoustic characteristics of a wide range of fricatives. Finally,
comparison of data from multiple speakers allows for examination of interspeaker variation in the
acoustics of fricatives.
2. Methodology
2.1. Languages
The corpus for the present study consists of data from seven languages collected as part of an NSF
grant to Peter Ladefoged and Ian Maddieson to study endangered languages. The seven languages
included Aleut (Western dialect), Apache (Western dialect), Chickasaw, Scottish Gaelic, Hupa,
Montana Salish, and Toda. The examined languages form a genetically diverse set with only two
languages bearing a remote genetic affiliation to each other; these two languages, Hupa and
Western Apache, belong to different branches within the Athabaskan family (Na Dene phylum),
Hupa to the Pacific coast branch and Western Apache to the southern branch.
The fricatives investigated in the present study were voiceless fricatives contained in words
elicited in isolation from native speakers by researchers conducting fieldwork designed to
document and record the basic phonetic properties of the examined languages. The languages
contained between four (Western Apache, Chickasaw) and nine (Toda) voiceless fricatives, differing
in the location of the primary constriction and the presence and degree of lip rounding. Table 1
lists the examined languages, the original study documenting their phonetic structures, their genetic
affiliations (according to Grimes 2001), their geographic location, and their inventory of voiceless
fricatives.
Table 1. Languages examined in present study
Language [Source]                          Genetic affiliation  Geographic location               Voiceless fricatives
Aleut (Western) [Taff et al. 2001]         Eskimo-Aleut         North America (Aleutian Islands)  s, ç, Ò, x, X
Apache (Western) [Gordon et al. 2001]      Na Dene              North America (Arizona)           s, S, Ò, x
Chickasaw [Gordon et al. 2000]             Muskogean            North America (Oklahoma)          f, s, S, Ò
Gaelic (Scottish) [Ladefoged et al. 1998]  Indo-European        Europe (Scotland)                 f, fʲ, s, S, ç, x
Hupa [Gordon 1996]                         Na Dene              North America (California)        s, S, Ò, x, xW, x7W
Montana Salish [Flemming et al. 1994]      Salishan             North America (Montana)           s, S, Ò, xW, X, XW
Toda [Shalev et al. 1994]                  Dravidian            Asia (India)                      f, T, s1, s, S, ß, Ò, Ò¢, x
2.2. Recordings
As part of the original data collection, speakers were recorded using a high-quality noise-canceling
head-mounted microphone. Data were captured on DAT for most of the languages; the Hupa,
Montana Salish, and Toda recordings were made using a high-quality analog audio cassette
recorder. The environment in which the examined fricatives occurred was
held constant within languages (with some substitutions where gaps in the original recorded data
precluded a perfect match across fricatives). In all languages, the fricatives (wherever the data
allowed) occurred adjacent to the vowel /a/ (and for languages in which fricatives were word-
medial, after the vowel /i/). A list of the words containing the examined fricatives in each language
appears in Appendix 1. Each word containing a targeted fricative was repeated twice by each
speaker, except in Toda, where each word was uttered once by each speaker. In isolated cases,
only a single token of a given fricative from a given speaker was suitable for analysis.
Data were digitized from the original recording at 22.05kHz using Scicon’s PcQuirer software
system in preparation for acoustic analysis, which was also completed using the same software.
Sampling at this rate allowed for measurement of a broad frequency range, while also respecting
the limits of the recording medium, a particularly important consideration in the case of data
collected using analog audio cassettes.
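The link between the sampling rate and the measurable frequency range can be made concrete with a short calculation. The sketch below is in Python, which is not the software used in the study, and the function name is purely illustrative. It computes the Nyquist limit implied by the 22.05kHz rate stated above.

```python
SAMPLE_RATE_HZ = 22_050  # digitization rate stated in the text


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at a given sampling rate."""
    return sample_rate_hz / 2


# Upper limit of the analyzable band at the study's sampling rate.
print(nyquist_hz(SAMPLE_RATE_HZ))  # prints 11025.0
```

At this rate the analyzable band extends to just over 11kHz, which contains the 0-10kHz range over which centers of gravity are computed in section 2.3.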

2.3. Measurements
A number of acoustic measurements designed to differentiate the examined fricatives were taken.
First, duration measures of each fricative were made from a waveform with the assistance of a
spectrogram in cases where segmentation was difficult using only the waveform. The onset and
cessation of noise were used as benchmarks for determining the beginning and end, respectively,
of each fricative. Second, FFT power spectra were computed for each fricative using a 1024 point
frame, which amounted to 46 milliseconds given the sampling rate of 22.05kHz. The window for
each spectrum was centered around the middle of each fricative to reduce co-articulation effects.
Spectra for the two repetitions of each fricative for each speaker were then averaged together,
yielding an average spectrum for each fricative for an individual speaker. Third, the center of
gravity (centroid or spectral mean) was calculated for the frequency range 0-10kHz for each
fricative (Forrest et al. 1988, Zsiga 1993, Jongman et al. 2000). The center of gravity for each
fricative was calculated by multiplying each frequency value in the numerical spectrum by its
corresponding intensity value and then dividing the sum of these products by the sum of all the
intensity values of the spectrum. As a final measure, formant transitions for vowels adjacent to
certain fricatives were computed using a 12 or 14 coefficient LPC display checked against a 512
point frame (23 milliseconds) FFT spectrum encompassing the portion of the vowel immediately
adjacent to the fricative. The fricatives targeted for measurement of their vowel transitions were
those which either previous research had indicated were profitably differentiated through their
transitions or which were otherwise relatively poorly separated through the other measurements
taken, i.e. /f/ and /T/, retroflex fricatives, velar and uvular fricatives, and fricatives distinguished
through rounding.
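The spectral measurements described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the PcQuirer procedure actually used: all function names are hypothetical, the center of gravity is computed with the usual intensity-weighted mean (sum of frequency times intensity, divided by summed intensity), and the scale of the intensity values (linear power or dB) is left to the caller because the text does not specify it.

```python
SAMPLE_RATE_HZ = 22_050  # sampling rate stated in the text


def frame_duration_ms(n_points, sample_rate_hz=SAMPLE_RATE_HZ):
    """Duration in milliseconds of an analysis frame of n_points samples."""
    return 1000.0 * n_points / sample_rate_hz


def centered_frame_bounds(onset, offset, n_points):
    """Start and end sample indices of a frame centered on the fricative midpoint."""
    midpoint = (onset + offset) // 2
    start = midpoint - n_points // 2
    return start, start + n_points


def average_spectra(spectra):
    """Bin-by-bin mean of several equal-length spectra (e.g. two repetitions)."""
    return [sum(values) / len(values) for values in zip(*spectra)]


def center_of_gravity(freqs_hz, intensities, max_hz=10_000):
    """Intensity-weighted mean frequency over the band 0..max_hz (assumed definition)."""
    pairs = [(f, a) for f, a in zip(freqs_hz, intensities) if 0 <= f <= max_hz]
    total = sum(a for _, a in pairs)
    if total == 0:
        raise ValueError("spectrum has no energy in the selected band")
    return sum(f * a for f, a in pairs) / total


# Toy spectrum: the 12kHz bin lies outside 0-10kHz and is ignored.
print(center_of_gravity([1000, 3000, 12000], [1, 3, 100]))  # prints 2500.0
```

With these helpers, `frame_duration_ms(1024)` gives about 46.4 and `frame_duration_ms(512)` about 23.2, matching the 46 and 23 millisecond frame lengths quoted in the text.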
3. Results
Sections 3.1-3.7 present results for individual languages. The words containing the measured
fricatives for each language appear in Appendix 1. Comparison of fricatives across the examined
languages is deferred until section 4.
3.1. Chickasaw
Chickasaw possesses four fricatives: a labiodental /f/, an alveolar /s/, a postalveolar /S/, and an
alveolar lateral /Ò/. Data from 12 Chickasaw speakers, 7 females and 5 males, were collected. The
targeted fricatives all appeared in either disyllabic or trisyllabic words, following an unstressed /i/
and preceding a stressed /a/ (see Appendix 1).
3.1.1. Duration
Duration measures were not found to differentiate the four fricatives of Chickasaw reliably. A two
factor ANOVA (fricative and gender) pooled over all speakers indicated no significant effect of the
fricative on duration measurements: F (3, 82) = 1.036, p=.3813. Gender had a significant effect
on duration with fricatives being longer for female speakers than for male speakers: F (1, 82) =
7.115, p=.0092. Averaged over all speakers, /s/ was slightly longer than other fricatives (see
Table 2), but pairwise comparisons by Fisher’s posthoc tests did not indicate any statistically
reliable length difference between any pairs of fricatives.

Table 2. Average duration in milliseconds for 12 speakers of Chickasaw
Speaker        f      s      S      Ò
F1         104.0  140.6  109.7   52.2
F2         113.5  108.2  100.8   90.9
F3         145.1  161.6  129.4  171.9
F4         128.9  137.8  114.5  120.5
F5         131.9  140.5  127.7  148.4
F6         122.0  121.9   98.4  102.3
F7         105.2  125.5  138.9  126.6
M1          87.5  110.0  115.2  -----
M2         113.9  104.0   79.1  112.0
M3         114.3  126.0  122.3  112.9
M4         110.3  119.6  119.6  115.3
M5          95.4   95.3   90.4  123.1
Average    115.5  123.6  112.8  116.0
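The direction of the gender effect on duration can be checked informally against the speaker means in Table 2. The Python sketch below is not the token-level two factor ANOVA reported above. It simply pools the tabulated per-speaker averages by gender, skipping the one missing cell, and the variable and function names are illustrative.

```python
# Per-speaker average durations (ms) from Table 2, in the order f, s, S, Ò.
# None marks the cell with no value for speaker M1.
DURATIONS_MS = {
    "F1": (104.0, 140.6, 109.7, 52.2),
    "F2": (113.5, 108.2, 100.8, 90.9),
    "F3": (145.1, 161.6, 129.4, 171.9),
    "F4": (128.9, 137.8, 114.5, 120.5),
    "F5": (131.9, 140.5, 127.7, 148.4),
    "F6": (122.0, 121.9, 98.4, 102.3),
    "F7": (105.2, 125.5, 138.9, 126.6),
    "M1": (87.5, 110.0, 115.2, None),
    "M2": (113.9, 104.0, 79.1, 112.0),
    "M3": (114.3, 126.0, 122.3, 112.9),
    "M4": (110.3, 119.6, 119.6, 115.3),
    "M5": (95.4, 95.3, 90.4, 123.1),
}


def group_mean(prefix):
    """Mean of all tabulated cells for speakers whose label starts with prefix."""
    values = [
        value
        for speaker, row in DURATIONS_MS.items()
        if speaker.startswith(prefix)
        for value in row
        if value is not None
    ]
    return sum(values) / len(values)


female_mean = group_mean("F")
male_mean = group_mean("M")
print(f"female: {female_mean:.1f} ms, male: {male_mean:.1f} ms")
```

Because these are means of speaker averages rather than of individual tokens, they indicate only the direction of the difference and say nothing about its statistical reliability.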
3.1.2. Gravity centers
A two factor ANOVA (fricative and gender) indicated a highly significant effect of fricative type on
gravity center: F (3, 82) = 11.660, p<.0001. Gravity centers for the alveolar /s/ were highest.
Pairwise Fisher’s posthoc comparisons revealed the difference between /s/ and all other fricatives to
be significant at the p<.01 level or better. Pairwise comparisons between the other fricatives did not
reach statistical significance. Gender also exerted a significant effect on gravity center values, with
higher gravity centers observed for the female speakers: F (1, 82) = 6.565, p=.0122. Gravity
centers for individual speakers appear in Table 3. There is considerable variation between speakers
in the rank ordering of gravity centers for the three fricatives other than /s/.
Table 3. Average gravity centers in Hz for 12 speakers of Chickasaw
Speaker       f     s     S     Ò
F1         4193  5423  4675  4462
F2         5150  5674  4709  5119
F3         4235  5142  4827  4685
F4         4848  4943  4558  4469
F5         4925  5854  5158  5102
F6         4720  5653  4954  4658
F7         4228  4480  4333  4523
M1         4369  5407  4775  4938
M2         4600  4686  4268  4858
M3         4584  5003  4827  4866
M4         4352  4737  4252  4546
M5         4326  4519  4510  4358
Average    4562  5163  4679  4715
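How consistently /s/ has the highest gravity center across individual speakers can be read off Table 3 with a few lines of Python. This is a descriptive tally of the tabulated speaker averages only, not a substitute for the statistical tests reported above, and the names used are illustrative.

```python
FRICATIVES = ("f", "s", "S", "Ò")  # column order of Table 3

# Per-speaker average gravity centers (Hz) from Table 3.
GRAVITY_CENTERS_HZ = {
    "F1": (4193, 5423, 4675, 4462),
    "F2": (5150, 5674, 4709, 5119),
    "F3": (4235, 5142, 4827, 4685),
    "F4": (4848, 4943, 4558, 4469),
    "F5": (4925, 5854, 5158, 5102),
    "F6": (4720, 5653, 4954, 4658),
    "F7": (4228, 4480, 4333, 4523),
    "M1": (4369, 5407, 4775, 4938),
    "M2": (4600, 4686, 4268, 4858),
    "M3": (4584, 5003, 4827, 4866),
    "M4": (4352, 4737, 4252, 4546),
    "M5": (4326, 4519, 4510, 4358),
}


def highest_fricative(row):
    """Label of the fricative with the highest gravity center in one table row."""
    return FRICATIVES[row.index(max(row))]


s_highest = [spk for spk, row in GRAVITY_CENTERS_HZ.items() if highest_fricative(row) == "s"]
exceptions = sorted(set(GRAVITY_CENTERS_HZ) - set(s_highest))
print(len(s_highest), exceptions)
```

In the tabulated averages, /s/ ranks highest for 10 of the 12 speakers; for the remaining two (F7 and M2) the lateral /Ò/ has the highest value.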
3.1.3. Spectra
Spectra averaged over the female Chickasaw speakers (calculated after inspection of individual
speakers indicated little interspeaker variation in spectral characteristics) are plotted in Figure 1.
Spectra for the male speakers appear in Figure 2. Due to interspeaker variation in the spectra for /s/
and /Ò/ among the male speakers, spectra for these two fricatives are separated according to
speaker.
The labiodental /f/ is characterized by the flattest spectrum for both male and female speakers,
with intensity dropping gradually as frequency increases. /f/ also displays the lowest overall intensity
of the fricatives: at virtually all frequencies, intensity is lowest for /f/. The postalveolar /S/
displays a relatively sharp spectral peak between 2.5kHz and 4kHz, approximately the same for
both male and female speakers. For the female speakers, the spectrum for the lateral /Ò/ is similar to
that of /S/. However, the spectral peak is not as sharp for /Ò/ and there is a second peak for the
lateral at approximately 7kHz. In addition, there is greater low frequency noise below 1kHz for
/Ò/. For the female speakers, the greatest noise for the alveolar /s/ is centered at the highest
frequencies of the four fricatives, between 5kHz and 8kHz, in keeping with the high gravity center
for /s/.
The male speakers differ considerably among themselves in their spectra for /s/, presumably
reflecting differences in the exact location and length of the constriction as well as tongue body
shape. Large differences in acoustic spectra for /s/ (and other coronal obstruents) between
speakers of the same language may be relatively common, as Dart’s (1991, 1998) data from
French and English suggest. Chickasaw speakers M1 and M4 have spectra similar to those of the
female speakers for /s/ with greatest noise between approximately 5kHz and 8kHz. The high
frequency noise band for speaker M4 is broader than for speaker M1, extending as low as 3kHz.
Speaker M2 differs from the other speakers in producing an /s/ with a narrow peak centered at
about 4kHz. Though similar in shape to the spectrum of /S/ for speaker M2, the /s/ nevertheless
differs in the frequency location of the spectral peak, which is higher for /s/ than for /S/. The /s/
spectrum for speaker M3 has a two peaked distribution with one peak at about 3kHz and one at
6kHz. Finally, speaker M5 has a flat high intensity spectrum for /s/ with noise only tailing off at
frequencies above 8kHz. Interspeaker variation among male speakers in their production of /Ò/ is
also considerable, though speakers M1, M3, and M4 have their most pronounced peaks centered
around 2.5kHz. Speaker M3 displays a second sharp peak at around 4.5kHz. A less acute peak at
about 4kHz characterizes the /Ò/ spectrum for speaker M5. The spectrum for speaker M2 displays
three relatively gentle peaks between 3kHz and 8kHz.
[Figure: intensity in dB (40-85) plotted against frequency in kHz (0-10), with one curve each for S, Ò, s, and f]
Figure 1. Averaged acoustic spectra (female speakers) for Chickasaw fricatives

References
Forrest et al. (1988). Statistical analysis of word-initial voiceless obstruents: preliminary data.
Hughes and Halle (1956). Spectral properties of fricative consonants.
Jongman et al. (2000). Acoustic characteristics of English fricatives.
Ladefoged and Maddieson (1996). The Sounds of the World's Languages.