SAME TALKER, DIFFERENT LANGUAGE: A REPLICATION

Verna Stockmal and Z. S. Bond
Department of Linguistics and Institute for Empirical Study of Language
Ohio University, Athens, OH
Stockmav@Ohio.edu Bond@Ohio.edu

ABSTRACT

When discriminating between spoken samples of unknown foreign languages, infants, young children and adult listeners are able to make same-language/different-language judgments at better than chance levels. Adults can even discriminate between languages when they are produced by the same bilingual talkers. That is, listeners are able to separate talker from language characteristics. One question raised by this investigation had to do with the familiarity of languages. The bilingual talkers who provided spoken language samples spoke a home language and a language they had learned as part of formal education such as French, German or Russian. It is possible that the listeners, American college students, had some familiarity with the school language and could distinguish it from the home language on this basis. In the current study, four bilingual talkers provided spoken samples of languages spoken in Africa which would be expected to be equally unfamiliar to American listeners. One of the languages spoken by all the talkers was Swahili; the other languages were Akan, Haya, Kikuyu, and Luhya. American listeners were asked to judge whether spoken paired samples were produced in the same language or in two different languages. Overall, listeners were able to identify languages as same or different at better than chance expectation.
 

1. INTRODUCTION

Infants [1, 2], young children [3] and adults [4, 5] are able to discriminate between spoken samples of foreign languages which they do not speak or understand. The research with infants in particular has suggested that they base discrimination judgments on prosodic patterns, such as the rhythmic structures [6] and pitch characteristics [7] of languages. Children and adults also employ prosodic information in discriminating between languages [8]. Listener judgments are also influenced by talker voice [9] and affect [10]. In fact, Esling and Wong [11] suggest that talkers show speech characteristics associated with geographic areas.

In some studies of language discrimination, different language samples have been provided by different talkers, confounding talker-specific characteristics with language characteristics. In some cases, however, listeners are able to separate talker voice characteristics from language characteristics.
In a previous study, we obtained better than chance discrimination of language pairs produced by bilingual talkers [12]. Listeners were presented paired language samples produced by four bilingual male talkers and four bilingual female talkers. Listeners could not discriminate all of the language pairs with equal facility. For male talkers, the listeners discriminated between Arabic and French, Hebrew and German, Akan and Swahili, and Latvian and Russian. For female talkers, the listeners discriminated only between Korean and Japanese and Mbawa and French. Their ability to discriminate between Russian and Latvian was marginal and they could not discriminate between Ilocano and Tagalog at all.

Some of the languages presented for discrimination are better known than others. Listeners may have been using previous knowledge of the ‘sound’ or ‘acoustic signature’ of some languages. French and German are commonly studied in high schools. Hebrew, Russian and Japanese are also somewhat familiar to American listeners.

The objective of the current study is to replicate listener discrimination of languages produced by bilingual talkers, but using spoken samples equally unfamiliar to listeners. In addition, the languages selected for discrimination employ similar rhythmic patterns and come from the same geographic area.

2. METHOD

2.1. Materials

Four bilingual talkers recorded short passages in Swahili and in their home language. The languages and countries of origin of the talkers are given in Table 1.

Swahili is a Bantu language spoken in East Africa. It was formerly used as a trade language between Africans and Arabs, and is now widely spoken as a second language and often used in primary education. Kikuyu (5.3 million speakers), Luhya (3.6 million speakers) and Haya (1.2 million speakers) are also Bantu languages spoken in East Africa. Akan (7 million speakers) is one of the major languages of West Africa. It is classified within the same Niger-Kordofanian language family but is only distantly related to the Bantu languages of East Africa.

2.1.1. Phonology

The prosodic properties of the test languages are very similar. All employ syllable rhythm and all but Swahili employ register tone. The languages differ in vowel inventories. The Bantu languages have either a 5- or a 7-vowel system, whereas Akan employs 8 oral vowels and 7 nasalized vowels. The consonant inventories of the languages are relatively similar [13, 14, 15].

Language pair     Country
Akan-Swahili      Ghana
Haya-Swahili      Tanzania
Kikuyu-Swahili    Kenya
Luhya-Swahili     Kenya

Table 1: Country of origin and languages produced by four bilingual talkers

2.1.2. Test Recording

Short sentences or phrases, 5 seconds in duration, were excerpted from read passages and assembled as a test recording. The recording contained 32 test items, each consisting of paired spoken language samples produced by the same talker. Half were same-language pairs, half were different-language pairs. In the different-language pairs, Swahili was always one of the languages. The test also contained three practice items employing language samples produced by a Hebrew-German bilingual talker. Each test item consisted of: Item number + Language sample 1 + tone + Language sample 2.
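The item structure described above can be illustrated with a short sketch. It is not the procedure actually used to assemble the recording: the split of items across talkers (4 different-language and 4 same-language pairs per talker, with the same-language pairs divided evenly between the home language and Swahili), the balanced order within different-language pairs, and the names `build_test_items` and `render` are all assumptions made for illustration. The three practice items are omitted.

```python
import random

# Home languages of the four talkers (Table 1); every talker also speaks Swahili.
HOME_LANGUAGES = ["Akan", "Haya", "Kikuyu", "Luhya"]


def build_test_items(seed=0):
    """Return 32 items as (talker, language_1, language_2, label) tuples.

    Assumed layout, not stated in the paper: per talker, 4 different-language
    pairs and 4 same-language pairs (2 home-home, 2 Swahili-Swahili).
    Talkers are identified by their home language.
    """
    rng = random.Random(seed)
    items = []
    for home in HOME_LANGUAGES:
        # Different-language pairs: Swahili is always one member of the pair.
        items += [(home, home, "Swahili", "different"),
                  (home, "Swahili", home, "different")] * 2
        # Same-language pairs, both samples from the same talker.
        items += [(home, home, home, "same"),
                  (home, "Swahili", "Swahili", "same")] * 2
    rng.shuffle(items)  # hypothetical randomization of presentation order
    return items


def render(item_number, item):
    """Lay out one item as: item number + sample 1 + tone + sample 2."""
    talker, lang1, lang2, _label = item
    return f"{item_number}: [{talker} talker: {lang1}] <tone> [{talker} talker: {lang2}]"


items = build_test_items()
```

Under these assumptions the list has 32 items, half labeled "same" and half "different", and every "different" item pairs Swahili with the talker's home language.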

2.2. Listeners

Twenty-five undergraduate students enrolled in an introductory linguistics class served as listeners. None had previous experience with African languages.

2.3. Procedure

After training with practice items, the listeners judged each test item as containing a same-language pair or a different-language pair.

3. RESULTS

Overall, listeners performed significantly better than chance, 71% correct (t(24) = 10.5, p < .001). Listeners were approximately equally accurate in making same-language and different-language judgments, 70% and 73% correct, respectively.
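For readers who want to see how a statistic of this kind is computed, the following sketch shows a one-sample t-test of per-listener proportion correct against the chance level of 0.5. It is an illustration only; the per-listener scores are not reported, and the function names are hypothetical. The second function simply rearranges the same formula to show what between-listener standard deviation the reported values (mean 0.71, t = 10.5, n = 25) imply, bearing in mind that the reported mean is rounded.

```python
import math


def one_sample_t(scores, chance=0.5):
    """t statistic for the mean of `scores` against `chance` (df = n - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    # Unbiased sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
    return (mean - chance) / math.sqrt(variance / n)


def implied_sd(mean, t, n, chance=0.5):
    """Standard deviation implied by a reported mean, t value and sample size.

    Rearranges t = (mean - chance) / (sd / sqrt(n)).
    """
    return (mean - chance) * math.sqrt(n) / t


# Values reported in the Results; the mean is rounded, so this is approximate.
sd = implied_sd(mean=0.71, t=10.5, n=25)
```

With the reported values, the implied standard deviation of listener scores is about 0.10, that is, roughly 10 percentage points.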

Figure 1: Listeners were able to discriminate 3 of the different-language pairs, but not Kikuyu-Swahili.
 

For the different-language pairs, listeners discriminated between the languages at nearly 80% correct, except for Kikuyu-Swahili. The percentages of correct discriminations are given in Fig. 1.

Listener responses to same-language pairs varied. Most were identified accurately as representing the same language, but the Kikuyu same-language pairs were difficult (56%) as were the Swahili same-language pairs as produced by one of the talkers (38%). Listener responses to same-language pairs are given in Fig. 2.
 
 

Figure 2: Listeners were able to identify most same-language pairs correctly. Kikuyu and Swahili as produced by one talker were difficult.
 

4. DISCUSSION

Even when presented with samples of foreign languages with which they were unfamiliar, listeners were able to make accurate same-language/different-language judgments, indicating that listener success in the task does not require familiarity with the test languages. Instead, listener responses were based on acoustic-phonetic information present in the speech samples.

All the languages employ syllable-based rhythm; it is therefore unlikely that listeners based their discrimination judgments on rhythmic patterns. If listeners had been making judgments solely on rhythmic properties, they would have shown a bias toward same-language judgments and produced many false alarms. Rather, different-language judgments were made correctly for 71% of the items.
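The argument about response bias can be made concrete with a standard signal-detection calculation. The sketch below is not an analysis reported in this study. It assumes a simple yes/no model in which "different" is the signal response, takes the group accuracies from the Results in the order given there (70% correct on same-language pairs, 73% correct on different-language pairs), and treats 100% minus the same-language accuracy as the false-alarm rate. The function name is hypothetical.

```python
from statistics import NormalDist


def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Return (d', c) under the equal-variance yes/no model.

    d' = z(H) - z(F) measures sensitivity;
    c = -(z(H) + z(F)) / 2 measures bias, with c = 0 meaning no bias.
    Rates of exactly 0 or 1 are not handled.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    return z_hit - z_fa, -(z_hit + z_fa) / 2


# Hit: "different" response to a different-language pair (73% correct).
# False alarm: "different" response to a same-language pair (100% - 70%).
d_prime, criterion = dprime_and_criterion(hit_rate=0.73, false_alarm_rate=1 - 0.70)
```

Under these assumptions d' comes out a little above 1 and the criterion is close to zero, which is consistent with the observation that listeners did not show a strong bias toward either response.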

Swahili is the only language among those tested which does not employ tone. It is not clear to what extent American listeners were basing judgments on tone differences. American listeners, whose language does not employ tone, seem to find tone patterns difficult to detect [16].

Listeners may have used information provided by the segment inventories of the languages. The Bantu languages have relatively similar consonant and vowel inventories. All employ either 5 or 7 vowels and prenasalized consonants.  Although Swahili differs from the other languages in that it employs implosive stops, American listeners are likely to assimilate these to the category of voiced stops [17, 18].  On the other hand, the nasalized vowels of Akan appear to be highly salient. It is possible that Akan is judged different from Swahili on the basis of vowel inventory.

Listeners can employ talker characteristics associated with specific geographic areas to make discrimination judgments and can often identify the geographic area in which the language is spoken [19].  However, all talkers in the current study were from the same geographic area, and employed a similar low-normal pitch range.

Bilingual talkers sometimes produce their different languages with what appears to be a change in affect.   All the talkers appeared to speak Swahili more quickly than their home language, except the Kikuyu talker.  He produced both languages at about the same rate.  It is interesting to note that both the Kikuyu-Swahili language pairs in the different judgment set and the Kikuyu-Kikuyu language pairs in the same judgment set produced very low correct scores, 44% and 56%, respectively. One of the most important components of clear speech is speaking rate [21].  It is possible that because of the perceived fast rate of speech of this talker, listeners had difficulty extracting the information they needed for discrimination.  Discrimination judgments for this talker were at chance levels for two of the three types of production.

While this might account for the results from this single talker, it does not account for the fact that listeners can and do make accurate judgments for the other three languages when paired with themselves and with Swahili. Talkers may produce noticeably different background elements or voice settings for the different languages they speak fluently [11]. Fine distinctions in prosody and segments as well as affective information may be included within the background elements of each language. Listeners may be able to extract these fine-grained elements from the signal when the rate of speech is such that they can be categorized. Voice setting may be included in both talker characteristics and language characteristics and may be salient to listeners even when they do not know the languages they are discriminating.
 In attempting to form a model of the process by which human listeners perform the discrimination of unknown foreign languages, further research needs to address the question: to what degree do talkers associate specific speaking styles with particular languages?

REFERENCES

[1] Mehler, J., Jusczyk, P. W., Lambertz, G., Halsted, N., Bertoncini, J. and Amiel-Tison, C., “A Precursor of Language Acquisition in Young Infants,” Cognition 29: 143-178, 1988.
[2] Bosch, L., Sebastian-Galles, N., “Native-language Recognition Abilities in 4-month-old Infants from Monolingual and Bilingual Environments,” Cognition 65: 33-69, 1997.
[3] Stockmal, V., Muljani, D., Bond, Z., “Can Children Identify Samples of Foreign Languages as Same or Different?” Language Sciences, 16(2): 237-252, 1994.
[4] Lorch, M. and Meara, P. “How people listen to languages they don’t know,” Language Sciences 11(4): 343-353, 1989.
[5] Stockmal, V., Muljani, D. and Bond, Z. S. “Perceptual Features of Unknown Foreign Languages as Revealed by Multi-dimensional scaling,” Proc. ICSLP, Philadelphia, 1748-1751, 1996.
[6] Nazzi, T., Bertoncini, J. and Mehler, J., “Language Discrimination by Newborns: Towards an Understanding of the Role of Rhythm,” J. Exp. Psychol: Human Perception and Performance 24: 756-766, 1998.
[7] Nazzi, T., Jusczyk, P., and Johnson, E. “Language Discrimination by English-Learning 5-month-olds: Effects of Rhythm and Familiarity.” J. of Memory and Language 43: 1-19, 2000.
[8] Stockmal, V. Discrimination of unknown foreign languages in spoken utterances: A developmental study. M. A. Thesis, Ohio University, 1995.
[9] Pisoni, D. B., “Effects of Talker Variability on Speech Perception: Implications for Current Research and Theory,” Proc. ICSLP, 1399-1407, 1992.
[10] Bond, Z. and Stockmal, V. “Selecting Samples of Spoken Korean from Rhythmic and Regional competitors,” Proc. Acous. Soc. of America and DEGA, Berlin 1999.
[11] Esling, J. and Wong, R., “Voice quality settings and the teaching of pronunciation,” TESOL Quarterly 17: 89-95, 1983.
[12] Stockmal, V., Moates, D. and Bond, Z. S. “Same Talker, Different Language,” Applied Psycholinguistics 21: 383-393, 2000.
[13] Angogo, R. M. Linguistic and Attitudinal Factors in the Maintenance of Luyia Group Identity. Ph.D. Dissertation, University of Texas, Austin, 1980.
[14] Maddieson, I. Patterns of Sound. Cambridge: Cambridge University Press, 1984.
[15] Mugane, J. M. A Paradigmatic Grammar of Gikuyu. Stanford monographs in African Languages. CSLI Publications, Stanford, CA: 1997.
[16] Burnham, D., Francis, E., Webster, D., Luksaneeyanawin, S., Attapaiboon, C., Lacerda, F. and Keller, P. “Perception of Lexical Tone Across Languages: Evidence for a Linguistic Mode of Processing,” Proc. ICSLP, 1996.
[17] Best, C., McRoberts, G., and Sithole, N. “Examination of the Perceptual Reorganization for Speech Contrasts: Zulu Click Discrimination by English-Speaking Adults and Infants,” J. of Experimental Psychology: Human Perception and Performance 14: 345-360, 1988.
[18] MacKay, I., Flege, J. E., Piske, T. and Schirru, C. “Category Restructuring During Second-Language Speech Acquisition,” J. Acoustical Society of America 110(1): 516-528, 2001.
[19] Lorch, M. and Meara, P. “Can people discriminate languages they don’t know?” Language Sciences 17(1): 65-71, 1995.
[20] Bond, Z. S. and Stockmal, V., “Distinguishing samples of spoken Korean from rhythmic and regional competitors,” Language Sciences 24: 175-185, 2000.
[21] Bond, Z. S. and Moore, T., “A Note on the Acoustic-phonetic Characteristics of Inadvertently Clear Speech,” Speech Communication 14: 325-337, 1994.

ACKNOWLEDGEMENTS

We want to express our gratitude to students and colleagues who made this project possible: Joe Amoako, Aliel Cunningham, Chip Flory, Laxford Kajuna, Leonard Muaka and John Mugane.