Abstract
Objectives: The purposes of this study were to investigate how children with phonological disorders (PD) process speech variation, defined here as accented speech, and to examine their perceptual adaptability to accented speech as a function of contextual cues.
Methods: A total of 55 children (4-8 years old) participated in the study across two days. Accented sentences, spoken by a foreigner, were presented in two context conditions: low and high. Children listened to each sentence and identified its last word by selecting the corresponding picture on the screen. Separate three-way mixed ANOVAs (group×day×context) were used to analyze the data. Additionally, stepwise multiple regression analysis was used to test which variables predicted the children’s adaptability.
Results: First, for accuracy, there was no main effect of group, but there was a main effect of day and significant group×day and context×day interactions. For response time (RT), there was no main effect of group but a main effect of day, and the group×day interaction did not reach the conventional significance level. Second, articulation accuracy predicted adaptability in accuracy scores for both groups. However, unlike for typically developing children, expressive vocabulary scores significantly predicted RT adaptability for children with PD.
Conclusion: Results indicated that children with PD, as well as typically developing children, have good speech adaptability. Speech variation and adaptability may be important factors to consider when clinically assessing children’s speech processing and language skills, as well as when designing and implementing therapy tasks.
When children learn their own language, they must establish the phonological system of a particular accent that is accepted by their social environment. Thus, a child acquiring a native language learns a particular perceptual representation of that language that fits into his or her speech environment. This type of learning requires experience with diverse speech sounds, which can be referred to as “speech variation.”
Variation is the hallmark of speech and language as it exists in the real world. This variation occurs at every level of linguistic structure, from the resonant frequencies of a vowel to the choice of particular words in conversation. Part of the language learning task is to learn the variations that function in different linguistic contexts. This knowledge serves two functions. Firstly, it allows children to interpret and convey an additional set of messages in the speech signal. Secondly, it enables the child to ‘normalize’ variable forms when learning and processing regular semantic meaning.
Among these many types of variability, accented speech is the main interest of the current study. Accent is defined by Chambers & Trudgill (1980) as “the way in which a speaker pronounces and... refers to a variety which is phonetically and/or phonologically different from other varieties” (p. 5). Thus, accented speech may be produced either by a speaker who shares the listener’s native language but has a different regional accent, or by a speaker who does not share the listener’s native language and is speaking it as a second language.
Studies have revealed that accent variation can disrupt access to lexical representations. Nathan, Wells, & Donlan (1998) tested children, aged 4 and 7 years, on their ability to repeat and define words spoken in their own accent (London) and in an unfamiliar accent (Glaswegian). Overall, when words were spoken in an unfamiliar accent, the performance was poor, with older children performing better than younger children. Additionally, older children showed different qualitative patterns from younger children. For the definition task, the error patterns in younger children were due to lexical access failure; yet, for older children the errors were due to phonetic confusion. For the repetition task, younger children repeated the unfamiliar accent rather than making correct phonological repetition in their own accent. This suggested that younger children were more influenced by phonetic forms of the variant input.
To address accented speech processing in children with phonological disorders (PD), Nathan and Wells (2001) used an auditory lexical decision task in which children had to decide whether the word they heard correctly corresponded to a picture. Performance in the children’s own accent (London) was compared with performance in an unfamiliar accent (Glaswegian). Results showed that children with PD performed poorly in the unfamiliar accent condition, which suggests that they have difficulty processing variable speech. A limitation of these previous studies is that they examined only word-level processing (such as repeating and defining a word appropriately), and the unfamiliar accent was produced by a speaker who shared the listeners’ mother tongue. Thus, the present study examines whether these findings extend to the sentence level and to accented speech produced by a non-native speaker who does not share the listeners’ mother tongue.
Speech from a non-native speaker has its own characteristics. First, speakers whose accents stem from a different native language are increasingly numerous and diverse. Thus, children learning language and speech sounds today are likely to encounter people with such accents. Second, the speech variation in this situation arises because the talker and the listener do not share a mother tongue. As mentioned above, speech learning involves tuning to the sound structure of the speech we are exposed to during infancy. Thus, how children process speech produced by a speaker who does not share their mother tongue is the research interest of the current study.
Overall, it is evident that when typically developing children are exposed to speech that is more variable due to the number of talkers, familiarity of voice, and the clearness of the speech sounds, their ability to process speech is negatively influenced. This effect can be detrimental for children who have fragile representation of sound systems such as children with PD (Munson, Baylis, Krause, & Yim, 2010). Researchers have suggested that this is the reason why children with PD have a difficult time generalizing sounds they have mastered during therapy sessions across diverse settings and with various talkers (Yim, 2010).
Yim (2010) compared accented and native speech at the sentence level in children with and without PD. The aim of that study was to determine whether children perform more poorly on accented speech than on native speech, and whether the groups differ significantly. The results showed that both groups performed more poorly on accented speech than on non-accented speech and that typically developing children were qualitatively better (faster) when processing accented speech. Based on these findings, we were interested in whether children can adapt to accented speech when they are exposed to such sentences.
Individuals possess a highly flexible perceptual learning ability. Studies (Dupoux & Green, 1997; McGarr, 1983; Pallier, Sebastian-Galles, Dupoux, Christophe, & Mehler, 1998) have shown that if listeners spend enough time listening to speech sounds that differ from their norm, they are able to tune into that sound system and reach relatively high perceptual accuracy. Recent studies have documented perceptual learning of native accented speech (Eisner & McQueen, 2005, 2006; Kraljic & Samuel, 2005, 2006; Maye, Aslin, & Tanenhaus, 2003; Norris, McQueen, & Cutler, 2003) and foreign-accented speech (Bradlow & Bent, 2008). These studies concluded that listeners need only a short period of exposure to adapt to non-native speech (Clarke & Garrett, 2004). This adaptation skill is a key factor in intervention for children with speech or language difficulties. While adults’ ability to adapt to accent variation has received much attention (Labov, 1989; Schmid & Yeni-Komshian, 1999), there has been little research on children’s ability to process unfamiliar accents or on how they process accent variation. Thus, it is important to test whether children with PD adapt as well as typically developing children when processing variable speech.
This study explores the hypothesis that children identified as having phonological processing problems may have particular difficulty in processing a different accent as found in previous studies (Nathan & Wells, 2001; Yim, 2010). Additionally, this study investigates how children process non-native speech and how well they adapt to the non-native speech. More specifically, this study investigated how children adapt to the accented speech and how well they take advantage of contextual cues at the sentence level. Lastly, this study explored which variables may influence adaptability in children.
The three research questions are as follows:
1) How do children with PD perform on accented speech that differs in contextual cue (high context vs. low context)?
2) How well do children with and without PD adapt to accented speech?
3) Which variables among speech and language skills influence the adaptability of accented speech processing?
To this end, this study compared children with PD and typically developing children over two sessions of accented speech processing. Additionally, unlike previous studies (Nathan & Wells, 2001), this study used sentence-level processing with a context condition included. In real-life situations, children may be exposed to sentences with high contextual cues. However, in many cases children are confronted with sentences that provide few contextual cues, and such sentences are the main way to evaluate children’s accented speech processing ability. Thus, in this study, sentences were divided into low- and high-context conditions.
METHODS
Participants
A total of 55 children, the same cohort that participated in Yim’s (2010) study, participated in the study. There were 20 children with PD and 35 children who were typically developing. All children were native English speakers between the ages of 4 and 8 years. Children with PD were recruited from public schools and private clinics in the Evanston, Chicago area. Typically developing children were recruited from local day-care centers and by word of mouth. No participant had a broader developmental delay, permanent hearing loss, craniofacial anomaly, or psychosocial impairment (e.g., autism), as gauged by parent report. None of the children with PD had any other diagnosed language impairment, nor were they receiving clinical services for any communication impairments other than their speech-production difficulties.
All participants passed the hearing screening (pure tone presented at 25 dB at 1, 2, and 4 kHz bilaterally) and showed a nonverbal intelligence test score within the normal range (Leiter International Performance Scale-Revised; Roid, Miller, & Billinger, 2002). Both groups were within the normal range on receptive vocabulary (Peabody Picture Vocabulary Test [PPVT]-III; Dunn & Dunn, 2007) and expressive vocabulary (Expressive Vocabulary Test, EVT; Williams, 1997). However, children with PD scored significantly lower than typically developing children on the EVT (p<.05). The Goldman-Fristoe 2 test of articulation (GFTA-2; Goldman & Fristoe, 2000) was used to evaluate speech-production accuracy. Children with PD performed significantly more poorly than typically developing children (p<.05). Demographic information for the participants is given in Table 1.
Stimuli
As a brief overview, the auditory stimuli consist of high- and low-context sentences, in which a target word occurs sentence-finally, produced by two speakers with noticeable foreign accents. The visual stimuli consist of images corresponding to the target word, a semantic foil, a phonological foil, and unrelated foils. The wordlist consists of 60 sets of words. Each set consists of a target word (T), a semantic foil (S), a phonological foil (P), and two unrelated foils (F1, F2). An example set is as follows: BUS (T) / SCHOOL (S) / BUG (P) / SKUNK (F1), CROCODILE (F2).
For each word, a black-and-white line-drawing was selected by searching the internet, based on the criterion that the word should be a likely name for the picture. For example, the image for “circle” was a circle, and the image for “kick” consisted of a boy kicking a ball, with motion lines to indicate the kicking action.
Images were then cropped to remove blank space. The experiment program automatically resized them to occupy equal screen space during display. Each image occupied approximately 3 cm × 3 cm of space on the display.
For each target word in the wordlist, two sentences were generated: a high- and a low-context sentence (Fallon, Trehub, & Schneider, 2002). The target word was the last word in both sentences. The high-context sentence was intended to provide enough information so as to eliminate some of the foils even if the target were not actually produced. The low-context sentence was intended to provide as little information about the target as possible. An example sentence is given below, for the recurring example “BUS”:
Low: My sister likes the BUS.
High: I don’t like to drive a car when I can ride a BUS.
Each sentence was read aloud and recorded by two different female Chinese speakers. Both were native speakers of Mandarin from Beijing and had very noticeable non-native accents in English. We used two different speakers to balance the talker variation effect. Initially, five different speakers recorded the sentences. Then, 10 native English speakers listened to these five speakers’ sentences and rated each on a five-point scale: very native-like, native-like, somewhere in the middle, non-native-like, and very non-native-like. The ratings were then averaged across the 10 raters. We eliminated the two speakers rated as native-like and the two rated as non-native-like, whose accents interfered with understanding of the sentences. Finally, we selected the middle one, which was intelligible and also accented. The recordings were excised and normalized in volume to 70 dB using the free speech software Praat (University of Amsterdam, Amsterdam, Netherlands).
Procedure
All children were tested twice to assess their adaptability. On two separate days (within 2 days), children performed a picture identification task, in which they were first shown four pictures and then heard a sentence whose final word (the target) matched one of the pictures. They were asked to press the number matching the picture named in the sentence. Children first completed a brief practice session. Then, they completed the first test block, were given a short rest, and completed the second test block.
The experiment was administered using E-Prime (Psychological Software Tools, Pittsburgh, PA, USA). Auditory stimuli were presented over headphones (HD228, Sennheiser, Wedemark, Germany). The display screen was 30 cm×20 cm (SyncMaster 2243-BWX; Samsung, Suwon, Korea).
During a trial, children first saw four pictures (3 cm×3 cm) in a horizontal row, with the numbers 1-4 printed below the pictures, with 1 corresponding to the leftmost image and 4 corresponding to the rightmost image. They were allowed to look at the pictures as long as they wanted. After they pressed a button, the auditory stimulus was played over the headphones. The child pressed the number on the keyboard that corresponded to the image named by the final word in the auditory stimulus.
For example, a trial might start with four images: 1) SCHOOL (S); 2) SKUNK (F); 3) BUS (T); and 4) BUG (P). After the child presses a button, they might hear, “I don’t like to drive a car when I can ride a BUS.” During or after the stimulus presentation, the child will press the number corresponding to the picture they believe the sentence names.
The response and response latency were recorded by the experimental software. Response latency was measured from the offset of the auditory stimulus. Thus, a negative response latency would indicate responding during or before presentation of the target word. At the end of each session, children were tested on the target words in an off-line task to confirm their knowledge of all target words. All children performed an E-Prime identification task, which resulted in 100% accuracy on every target word.
A block design was used to present the 60 sentences in order to eliminate a speaker familiarity effect. If children are exposed to one speaker across two days, it is difficult to distinguish whether adaptation occurred because of exposure to a particular talker or solely because of exposure to accented speech, the latter being the focus of the study. Thus, on the first day, children received 30 trials with high-context sentences and 30 trials with low-context sentences, all produced by the same speaker. On the second day, children again received 30 trials with high-context sentences and 30 trials with low-context sentences, produced by the other speaker. Thus, the biggest difference between the first and second day was the speaker. The actual targets were the same across days, but the context level was counterbalanced. In other words, if a child heard the target “BUS” in a low-context sentence on the first day, they heard it in a high-context sentence the next day. In order to prevent the visual stimuli from being identical across days, the unrelated foils differed across days.
Counterbalancing was achieved as follows. Targets were divided into two equal subsets, blocks 1 and 2. Two sentential versions of each block were assembled, the high-context and low-context version. Finally, each sentential version of a block yielded two spoken versions, one from each speaker. Children were also divided into two equal subsets, groups 1 and 2 (group assignment was random, subject to the constraints of keeping equal numbers and approximately equal gender balance across groups). The block design is indicated below (Table 2):
Thus, groups differed as to which speaker they heard on the first day. They also differed as to which block they heard the low/high context sentences from. This method allowed us to balance the speaker familiarity effect.
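The counterbalancing constraints described above can be sketched in a few lines of Python. This is only one arrangement that satisfies the stated constraints (a different speaker on each day, each block's context level switched across days, and the two listener groups mirroring each other); the actual assignment used in the study is the one given in Table 2. The labels `speaker_A`, `speaker_B`, `block_1`, and `block_2` and the function `session_plan` are hypothetical names introduced for illustration.

```python
SPEAKERS = ("speaker_A", "speaker_B")  # hypothetical labels for the two talkers
BLOCKS = ("block_1", "block_2")        # the two equal subsets of targets


def session_plan(group):
    """Return a per-day plan for listener group 1 or 2 (one possible arrangement)."""
    if group not in (1, 2):
        raise ValueError("group must be 1 or 2")
    plan = {}
    for day in (1, 2):
        # shift alternates between 0 and 1 across days and across groups
        shift = (group - 1 + day - 1) % 2
        plan[day] = {
            "speaker": SPEAKERS[shift],
            "context": {BLOCKS[shift]: "low", BLOCKS[1 - shift]: "high"},
        }
    return plan


for group in (1, 2):
    for day, info in session_plan(group).items():
        print(group, day, info["speaker"], info["context"])
```

Under this sketch, a target heard in a low-context sentence on day 1 is heard in a high-context sentence on day 2, and the two groups hear different speakers on day 1.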
The order of target presentation was pseudorandom. The experiment was divided into two test blocks. During the first test block, children received 15 targets from the low-context stimulus block and 15 targets from the high-context stimulus block, in random order. Then they were given a short break. During the second test block, children received the remaining targets from the low-context and high-context blocks, in random order. In other words, children always received a mixture of low- and high-context items, and each child got these items in a different order.
The visual stimuli were presented in a horizontal row across the screen, with corresponding numbers beneath, in pseudorandom order. For each target within each block, a random order was selected for the first day (e.g., semantic foil on the far left, target on the midleft, unrelated foil on the midright, and phonological foil on the far right). A different random order was selected for the second day, subject to the constraint that it could not be the same order as for the first day. In other words, the image order for a target was the same for all children within a group on each day, but differed across days.
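The constraint on image order (a random left-to-right order on the first day and a different random order on the second day) can be illustrated with the following sketch. The role names and the function `day_orders` are hypothetical, and the original experiment was run in E-Prime, so this is not the authors' implementation.

```python
import random
from typing import List, Tuple

# The four pictures shown on a trial, named by their role.
ROLES = ("target", "semantic", "phonological", "unrelated")


def day_orders(rng: random.Random) -> Tuple[List[str], List[str]]:
    """Draw a left-to-right order for day 1 and a different one for day 2."""
    day1 = list(ROLES)
    rng.shuffle(day1)
    day2 = day1[:]
    # Reshuffle until the day 2 order is not identical to the day 1 order.
    while day2 == day1:
        rng.shuffle(day2)
    return day1, day2


first, second = day_orders(random.Random(2010))
print(first)
print(second)
```

Drawing the pair once per target and reusing it for every child in a group reproduces the property that the order is fixed within a group on a given day but differs across days.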
Analyses
Separate three-way mixed ANOVAs (group×day×context) were used to analyze the two dependent variables, accuracy and response time (RT; ms). Accuracy was converted into a percent correct score. For the RT analysis, only accurate responses within ±2 SD of the mean task RT for an individual child were included. Stepwise multiple regression analysis was used to test which variables (speech and language test scores) predicted children’s second-day performance (adaptability). Effect sizes were also calculated for significant results.
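The RT inclusion rule can be sketched as follows. This is a minimal illustration that assumes the mean and SD are computed over each child's correct trials and that the sample standard deviation is used; the text does not specify either detail. The function name `trimmed_mean_rt` and the example values are hypothetical.

```python
from statistics import mean, stdev


def trimmed_mean_rt(trials, n_sd=2.0):
    """Mean RT (ms) for one child after the ±n_sd SD rule.

    trials: list of (rt_ms, is_correct) pairs for a single child.
    Only correct trials are used; correct trials further than n_sd sample
    standard deviations from the child's mean are dropped (assumption).
    """
    correct = [rt for rt, ok in trials if ok]
    if not correct:
        return None
    if len(correct) < 2:
        return mean(correct)
    m, sd = mean(correct), stdev(correct)
    kept = [rt for rt in correct if abs(rt - m) <= n_sd * sd]
    return mean(kept)


# Made-up example: nine typical responses, one very slow response, one error.
example = [(1000, True)] * 9 + [(5000, True), (9000, False)]
print(trimmed_mean_rt(example))  # 1000
```

In the example, the 5,000 ms response lies more than 2 SD above the child's mean of 1,400 ms and is excluded, and the incorrect trial never enters the calculation.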
RESULTS
Results for accuracy and RT are reported separately. The mean percent correct in each context by day for both groups is shown in Table 3.
There was no main effect of group. However, there was a main effect of day (F(1, 53)=12.8, p<.001, η²=.19), in which children’s performance was better on day 2 (mean=87.3, SE=2.2) than on day 1 (mean=80.8, SE=2.7). There were interaction effects of group×day (F(1, 53)=6.7, p<.05, η²=.11) and context×day (F(1, 53)=28.4, p<.001, η²=.34). For the chronological age (CA) group of typically developing children, the difference between day 1 (mean=84.1, SE=3.2) and day 2 (mean=85.9, SE=2.7) was smaller than for the PD group (day 1: mean=77.6, SE=4.3; day 2: mean=88.8, SE=3.5). Additionally, the difference between day 1 (mean=83.3, SE=2.8) and day 2 (mean=85.6, SE=2.5) for high-context sentences was smaller than for low-context sentences (day 1: mean=78.4, SE=2.8; day 2: mean=89.1, SE=2.0). Figures 1 and 2 illustrate how children with PD and typically developing children performed on the task.
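The reported effect sizes can be related to the F statistics with a short calculation. The text does not state which variant of η² was computed; the sketch below assumes partial eta squared, which can be recovered from F and its degrees of freedom as (F × df_effect) / (F × df_effect + df_error). The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic (assumed variant)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of day on accuracy, F(1, 53)=12.8
print(round(partial_eta_squared(12.8, 1, 53), 2))  # 0.19
# Group×day interaction on accuracy, F(1, 53)=6.7
print(round(partial_eta_squared(6.7, 1, 53), 2))   # 0.11
```

Under this assumption, the values agree with the η² reported for the day effect and the group×day interaction.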
As mentioned above, only accurate responses were entered into the RT analysis. To eliminate outliers (defined as responses beyond ±2 SD from the mean task RT), each individual’s mean RT was calculated. About 3% of the data were removed as outliers before the final analysis. Table 4 shows the mean and SD for RT in each context by day for both groups.
There was no main effect of group and no interaction effect. However, there was a main effect of day (F(1, 53)=7.4, p<.05, η²=.12). The overall RT on day 2 (mean=2,156, SE=169) was significantly faster than on day 1 (mean=2,433, SE=167). Figure 3 presents the overall RT for both groups on each day.
Finally, stepwise multiple regression was run to test which variables (speech and language scores) predicted second-day performance. PPVT, EVT, and GFTA scores were entered as independent variables for predicting day 2 performance on low- and high-context sentences. First, accuracy data were analyzed for the CA group and the PD group. For the CA group, GFTA significantly predicted both low-context (F(1, 34)=6.25, p<.05, R²=.15) and high-context (F(1, 34)=8.0, p<.05, R²=.19) performance on day 2. For the PD group, GFTA again significantly predicted both low-context (F(1, 19)=6.88, p<.05, R²=.27) and high-context (F(1, 19)=8.8, p<.05, R²=.32) performance on day 2.
The RT data showed similar findings for the CA group but not for the PD group. For the CA group, GFTA significantly predicted both low-context (F(1, 34)=5.6, p<.05, R²=.14) and high-context (F(1, 34)=4.7, p<.05, R²=.12) performance on day 2. However, for the PD group, EVT significantly predicted both low-context (F(1, 19)=5.4, p<.05, R²=.23) and high-context (F(1, 19)=5.1, p<.05, R²=.22) performance on day 2.
CONCLUSION
This study investigated how children with PD process variable speech, here accented speech produced by non-native speakers, and whether they adapt to such speech after only a second exposure. Additionally, the study examined whether their adaptability was predicted by receptive and expressive vocabulary size or phonological skills. Both accuracy and RT data were used to report the results.
Findings showed that the two groups did not differ in processing accented sentences, in either accuracy or RT. These results are in line with the previous study by Yim (2010), in which there was no group difference for either native or accented sentences. However, two important findings emerged. First, children with PD benefited more from exposure to the accented speech. As shown in Figure 1, there was a significant day×group interaction in which the improvement between day 1 and day 2 was significantly larger for children with PD than for typically developing children. There may have been little room for typically developing children to improve on day 2, since their performance was already fairly good on day 1. However, recall that there was no main effect of group, which means that even though children with PD scored slightly lower than typically developing children on day 1, the difference was not statistically significant. Additionally, these findings held for the qualitative data, represented by RT. Figure 3 shows that, overall, children were significantly faster at processing accented speech on day 2. The interaction effect did not reach conventional significance but was close (p=.09), a trend that can be observed in Figure 3. This lack of significance might be resolved with greater statistical power. These findings are critical in showing that children with PD also have good adaptability, which implies that these children have room to modify their speech sound representations. As found in previous studies (Dupoux & Green, 1997; McGarr, 1983; Pallier et al., 1998), children were able to tune into a new sound system even though they were not familiar with its sounds. Additionally, our results showed that simple exposure to accented speech, with no explicit training, can lead children to reach relatively high perceptual accuracy (Bradlow & Bent, 2008; Clarke & Garrett, 2004).
Secondly, there was a significant interaction between context and day for both groups, in which performance on low-context sentences benefited more from the second exposure to the speech variation. These findings suggest that children are highly flexible in adapting to new speech, whether or not they have speech sound difficulties. This adaptation skill is a key factor in intervention for children who have speech or language difficulties. Thus, it may be important for children to be exposed to variable speech, such as a male voice, a peer’s voice, or a regional accent, which may help children with PD build stronger sound representations.
Finally, stepwise multiple regression analyses showed that speech sound accuracy scores (represented by GFTA scores) significantly predicted children’s overall adaptability. Both groups showed this result for the accuracy data, which was very counterintuitive since the experimental tasks highlighted speech processing ability rather than language processing. However, for the RT results, this finding held only for typically developing children. For children with PD, expressive vocabulary scores (represented by EVT) significantly predicted adaptability in processing low- and high-context accented sentences. These findings are interesting because EVT scores were significantly lower in children with PD than in typically developing children. Two speculations can be made from these results. First, even though the words that made up the sentences were very easy and known to all children, children with PD who had high EVT scores performed better than those with lower EVT scores. This suggests that children with better language scores have greater adaptability, as shown by RT. Thus, if the tasks become harder, there might be a difference between children with PD who have higher language skills and those who have lower language skills. Second, the results differed from those of typically developing children, for whom GFTA still predicted RT on day 2. These findings are critical because when we examined the accuracy data only, there was no group difference. Thus, it is important to look at both quantitative and qualitative data. This different trend between groups suggests that children with PD might be processing inefficiently in order to perform as well as typically developing children. There was no group difference in day 2 RT, but children with PD had significantly lower EVT scores.
This study investigated fundamental aspects of speech processing, such as the way children process variable speech input and how their adaptability varies with context. Exactly how variation might be encoded or represented within the phonological system, and how listeners process this variation, remains unanswered. The results suggest that we should look beyond a surface description of speech disorder to examine the more subtle speech processing skills of this population, including different aspects of speech input processing, such as the ability to process different kinds of speech variation. Thus, adaptability to speech variation is an important variable to consider when clinically assessing children’s speech processing and language skills, as well as when designing and implementing therapy tasks.
Table 1.
Nonverbal IQ= Leiter International Performance Scale-Revised (Roid, Miller, & Billinger, 2002); PPVT= Peabody Picture Vocabulary Test-III (Dunn & Dunn, 2007); EVT= Expressive Vocabulary Test (Williams, 1997); GFTA= GFTA-2 (Goldman & Fristoe, 2000).
Table 2.
REFERENCES
Clarke, CM, & Garrett, MF (2004). Rapid adaptation to foreign-accented English. Journal of the Acoustical Society of America, 116, 3647–3658.
Dunn, DM, & Dunn, LM (2007). Peabody Picture Vocabulary Test manual (4th ed.). Minneapolis, MN: Pearson.
Dupoux, E, & Green, KP (1997). Perceptual adjustment to highly compressed speech: effects of talker and rate changes. Journal of Experimental Psychology: Human Perception and Performance, 23, 914–927.
Eisner, F, & McQueen, JM (2005). The specificity of perceptual learning in speech processing. Perception and Psychophysics, 67, 224–238.
Eisner, F, & McQueen, JM (2006). Perceptual learning in speech: stability over time. Journal of the Acoustical Society of America, 119, 1950–1953.
Fallon, M, Trehub, SE, & Schneider, BA (2002). Children’s use of semantic cues in degraded listening environments. Journal of the Acoustical Society of America, 111, 2242–2249.
Flege, JE (1995). Two procedures for training a novel second language phonetic contrast. Applied Psycholinguistics, 16, 425–442.
Goldman, R, & Fristoe, M (2000). The Goldman-Fristoe 2 test of articulation. Circle Pines, MN: American Guidance Service.
Kraljic, T, & Samuel, AG (2005). Perceptual learning for speech: is there a return to normal? Cognitive Psychology, 51, 141–178.
Kraljic, T, & Samuel, AG (2006). Generalization in perceptual learning for speech. Psychonomic Bulletin & Review, 13, 262–268.
Labov, W (1989). The limitations of context: evidence from misunderstandings in Chicago. In C. Wiltshire, R. Graczyk & B. Music (Eds.), CLS 25: papers from the 25th Annual Regional Meeting of the Chicago Linguistic Society (pp. 171–200). Chicago, IL: Chicago Linguistic Society.
Maye, J, Aslin, R, & Tanenhaus, M (2003). In search of the weckud wetch: online adaptation to speaker accent. Proceedings of the 16th Annual CUNY Conference on Human Sentence Processing, Cambridge, MA.
McGarr, NS (1983). The intelligibility of deaf speech to experienced and inexperienced listeners. Journal of Speech and Hearing Research, 26, 451–458.
Munson, B, Baylis, AL, Krause, MO, & Yim, D (2010). Representation and access in phonological impairment. In C. Fougeron (Ed.), Laboratory phonology 10: variation, detail, and representation (pp. 381–404). Berlin: Mouton de Gruyter.
Nathan, L, & Wells, B (2001). Can children with speech difficulties process an unfamiliar accent? Applied Psycholinguistics, 22, 343–361.
Nathan, L, Wells, B, & Donlan, C (1998). Children’s comprehension of unfamiliar regional accents: a preliminary investigation. Journal of Child Language, 25, 343–365.
Norris, D, McQueen, JM, & Cutler, A (2003). Perceptual learning in speech. Cognitive Psychology, 47, 204–238.
Pallier, C, Sebastian-Galles, N, Dupoux, E, Christophe, A, & Mehler, J (1998). Perceptual adjustment to time-compressed speech: a cross-linguistic study. Memory & Cognition, 26, 844–851.
Roid, GH, Miller, LJ, & Billinger, S (2002). Leiter-R: Leiter International Performance Scale-revised. Stockholm: Psykologiforlaget.