Starkey Research & Clinical Blog

Effective communication behavior during hearing aid appointments

Munoz, K., Ong, C., Borrie, S., Nelson, L., & Twohig, M. (2017). Audiologists’ communication behavior during hearing device management appointments. International Journal of Audiology, Early Online, 1-9.

This editorial discusses the clinical implications of an independent research study and does not represent the opinions of the original authors.

The skill of the audiologist in communicating with a patient can significantly impact rehabilitative outcomes. Nowhere is this more evident than when an audiologist is engaged in managing a hearing device fitting. Studies have suggested a lack of patient-centered behavior by audiologists in audiologist-patient interactions, including domination of speaking time, a tendency to overemphasize the technical aspects of device care, interruptions of the patient, an inability to deal with emotion-laden aspects of rehabilitation, difficulty expressing empathy, and a failure to listen actively (e.g., Ekberg et al., 2014; Grenness et al., 2014; Grenness et al., 2015; Knudsen et al., 2010; Laplante-Levesque et al., 2014; Munoz et al., 2014; Munoz et al., 2015). The counseling tendencies noted above can reduce patients’ understanding of, and adherence to, the recommendations and information provided by the audiologist (Robinson et al., 2008).

Audiologists in training are likely as not to internalize or imitate how their mentors or supervisors interact with patients. Unless their instructors have themselves achieved satisfactory interpersonal communication skills, audiologists may enter the workforce lacking practical counseling and communication skills, a gap that may diminish their effectiveness in the clinical setting.

The authors designed this exploratory, longitudinal study to measure audiologist communication behaviors at three time intervals: first, prior to participating in a one-day pre-training workshop; second, at a two-month interval; and third, at a six-month interval. The pre-training workshop focused on the psychosocial aspects of counseling, including the use of open-ended questions, validation of emotions, reframing and clarifying patient problems and complaints, methods for increasing motivation, and double-checking patient assumptions. In addition, five one-hour support sessions were offered to the audiologists over a three-month period following the initial workshop. Topics discussed in these sessions included addressing client barriers, addressing emotions, being present and non-judgmental, and developing reflection/summarizing skills, among others. Attendance ranged from 30% to 90% of participants; one audiologist attended none of the support sessions, but most attended 3-4 sessions.

Ten audiologists actively providing clinical services were evaluated on two rating scales: 1) the Behavior Competencies Rating Scale (a 10-item self-rating measure developed by the authors), designed to evaluate the audiologist’s own perception of his/her communication skills, and 2) a modified version of the Counseling Competencies Scale (Swank et al., 2012), intended to measure counseling skills and behaviors and graded independently by both the instructor and a psychology graduate student. Fifty-three patients consented to participate and each audiologist-patient interaction was recorded. A set of coding guidelines was developed to recognize and categorize by type the counseling behaviors (interactions) exhibited by the audiologist, as well as the frequency of each of the counseling behaviors. The coding categories for counseling skills included encouragers, questions, listening and reflecting feelings, confrontations, goal setting, focus of counseling, and expressions of appropriate empathy, care, respect and unconditional positive regard.

The article gave examples of expressions and statements during counseling that would fall into specific coding categories. For example, an open-ended question such as “What do you think is the most challenging part of wearing (or taking care) of your hearing aids?” would be categorized as assessing and addressing barriers and motivation. An audiologist might comment to a patient who mentions they are in the process of moving, “So you have a lot going on,” which would be interpreted as an instance of listening and reflection. Or the audiologist might suggest, “For homework, I’d like you to work on using a couple of the strategies we discussed,” a statement that would fall into the category of planning for behavior change.

The average length of each recorded counseling session was 46 minutes, from which a selected ten-minute sample was extracted, coded and subjected to analysis. The rate of change of audiologist behaviors, expressed as the percentage frequency of occurrence per session, was measured at the three time intervals mentioned above: baseline, one-month post-training, and a six-month follow-up.
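As a rough illustration of how a percentage frequency of occurrence per session might be tallied from a coded sample, here is a minimal Python sketch. The category labels, the `percent_by_category` helper and the example codes are hypothetical; they are not taken from the study's coding guidelines.

```python
from collections import Counter


def percent_by_category(coded_interactions):
    """Return each category's share of all coded interactions, in percent."""
    counts = Counter(coded_interactions)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was coded in this sample
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical codes assigned to the interactions in one ten-minute sample
sample = [
    "fitting discussion",
    "fitting discussion",
    "technical instruction",
    "listening and reflection",
    "fitting discussion",
]
shares = percent_by_category(sample)
```

Comparing such shares for the same audiologist across the three measurement points would give the kind of change over time described above.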

The authors found that audiologists devoted the greatest share of clinical interactions throughout the six-month period to general fitting discussions, followed by educational and technical instruction. The frequencies of occurrence (interactions) devoted to these two variables increased slightly post workshop, but thereafter decreased. The fewest interactions per session over the six-month period were devoted to listening and reflection, clarifying treatment goals, assessing and addressing motivation and barriers, and discussing behavior changes. Although small changes were noted in the frequencies of occurrence of these behaviors over the study period, the authors concluded that the observed changes were so minimal as not to be practically meaningful. Of interest, they also found that the time per session devoted to irrelevant conversation and small talk increased linearly from a relatively low point to a higher level over the course of the study.

A striking outcome was the significant reduction in personal speaking time of audiologists following the pre-training workshop. When the speaking times of patients and audiologists were compared, audiologists dominated before training, but the two were approximately equal after the workshop. Although speaking time was not explicitly stressed in the workshop, these findings indicate a reduction in audiologist verbal dominance after training, suggesting that the training positively impacted this counseling behavior.

Finally, the audiologists, in rating their personal communication behaviors, perceived a marked improvement in their own communication skills on the self-rating scale. This improvement was not entirely supported by the data, as the observer-rated data showed little clinically important change in psychologically relevant interactions over the study period.

The authors suggest that one of the reasons for the lack of meaningful change in clinician communication behavior might have been the complexity of the counseling skills taught within a relatively short time frame. They conclude that a short workshop on communication skills is insufficient and that the importance of teaching patient-centered communication skills to audiologists-in-training as early as possible cannot be overstated.

Although there was evidence of improvement in audiologists’ counseling skills following the pre-training workshop and with supplementary instruction, it was limited. Hesitation to address patients’ psychosocial concerns, to express empathy when appropriate, and to address clients’ emotions indicates a possible gap in training and education. The authors recommend that clinical supervisors be aware of the critical role patient-centered counseling plays in providing positive clinical outcomes. Further, these supervisors should recognize within themselves the need for improving personal counseling skills by furthering their own continuing education.

References

Ekberg, K., Grenness, C. & Hickson, L. (2014). Addressing patients’ psychosocial concerns regarding hearing aids within audiology appointments for older adults. American Journal of Audiology, 23, 337-350.

Grenness, C., Hickson, L., Laplante-Levesque, A., Meyer, C., & Davidson, B. (2014). Communication patterns in audiologic rehabilitation history-taking: audiologists, patients, and their companions. Ear and Hearing, 36, 191-204.

Grenness, C., Hickson, L., Laplante-Levesque, A., Meyer, C., & Davidson, B. (2015). The nature of communication throughout diagnosis and management planning in initial audiologic rehabilitation consultations. Journal of American Academy of Audiology, 50, 36-50.

Knudsen, L.V., Oberg, M., Nielsen, C., Naylor, G., & Kramer, S.E. (2010). Factors influencing help seeking, hearing aid uptake, hearing aid use and satisfaction with hearing aids: a review of the literature. Trends in Hearing, 14, 127-154.

Laplante-Levesque, A., Hickson, L., & Grenness, C. (2014). An Australian survey of audiologists’ preference for patient-centeredness. International Journal of Audiology, 53, S76-S82.

Munoz, K., Nelson, L., Blaiser, K., Price, T., & Twohig, M. (2015). Improving support for parents of children with hearing loss: provider training on use of targeted communications.

Munoz, K., Preston, E., & Hickens, S. (2014). Pediatric hearing aid use: how can audiologists support parents to increase consistency. Journal of the American Academy of Audiology, 25, 380-387.

Robinson, J.H., Callister, L.C., Berry, J.A., & Dearing, K.A. (2008). Patient-centered care and adherence: definitions and applications to improve outcomes. Journal of the American Academy of Nurse Practitioners, 20, 600-607

Swank, J.M., Lambie, G.W., & Witta, E. L. (2012). An exploratory investigation of the Counseling Competencies Scale: a measure of counseling skills, dispositions, and behaviors. Counselor Education and Supervision, 51, 189-206.

On the Topic of Hearing Loss and Fatigue

Hornsby, B. & Kipp, A. (2016). Subjective ratings of fatigue and vigor in adults with hearing loss are driven by perceived hearing difficulties not degree of hearing loss. Ear and Hearing 37 (1), 1-10.

This editorial includes clinical implications of an independent research study and does not represent the opinions of the original authors.

In 2013, we reviewed an article from Dr. Ben Hornsby in which he reported on an initial foray into the fatiguing effects of listening to speech while managing a cognitively challenging secondary task. The outcomes of his investigation suggested that use of hearing aids may reduce the fatiguing effects of completing that secondary task. In more recent work, reviewed here, Drs. Hornsby and Kipp assessed the utility of standardized measures of fatigue among a large group of subjects with hearing loss.

Fatigue can be caused by a combination of physical, mental and emotional factors. Usually fatigue is temporary, resulting from periods of sustained physical or mental labor, and resolves during breaks, in between work days or on weekends. Intermittent fatigue has minimal effects on everyday life and health, but sustained fatigue, caused by unremitting work, stress or illness, has a variety of negative effects. Sustained and severe fatigue makes people less productive and more prone to accidents in the workplace (Ricci et al, 2007), reduces the ability to maintain concentration and attention, reduces processing speed, impairs decision-making abilities and may increase stress and burnout (vanderLinden et al, 2003; Bryant et al, 2004; DeLuca, 2005).

Though fatigue resulting from communication difficulty is commonly described in anecdotal reports, there has been little systematic examination of the relationship. As mentioned above, Hornsby (2013) found that hearing-impaired individuals experienced increased listening effort and mental fatigue that was mitigated somewhat by the use of hearing aids. Other studies have suggested that the increased cognitive effort required for hearing-impaired individuals to understand speech may lead to subjective reports of mental fatigue (Hetu et al., 1988; Ringdahl & Grimby, 2000; Kramer et al., 2006; Copithorne, 2006). The purpose of Hornsby and Kipp’s study was to compare standardized, validated measures of fatigue to audiometric measures of hearing loss and subjective reports of hearing handicap.

The authors recruited subjects from a population of adults who sought help for their hearing loss from an audiology clinic. There were 149 subjects, with a mean age of 66.1 years (range 22 to 94 years) and a mean pure tone average of 36.7 dB HL.

Subjective fatigue was measured with two standardized scales: the Profile of Mood States (POMS; McNair et al., 1971) and the short form of the Multidimensional Fatigue Symptom Inventory (MFSI-SF; Stein et al., 2004). Two POMS subscales assessed general fatigue and vigor, the latter described by words like “energetic” and “alert”.

The MFSI-SF assessed vigor and four dimensions of fatigue – general, physical, emotional and mental. On both measures, subjects were asked to rate, on a 5-point scale, how well each item described their feelings during the past week.

Audiometric data included pure tone thresholds in each ear at 500, 1000, 2000 and 4000 Hz. Perceived or subjective hearing handicap was measured with the Hearing Handicap Inventory for the Elderly (HHIE; Ventry & Weinstein, 1982) and the Hearing Handicap Inventory for Adults (HHIA; Newman et al., 1990).

Individuals 65 years or older completed the HHIE and those under 65 years completed the HHIA.
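The audiometric and questionnaire rules described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and it assumes the pure tone average is the simple mean of the four listed frequencies in one ear, which the summary above does not state explicitly.

```python
# Frequencies (Hz) at which thresholds were measured in each ear
PTA_FREQUENCIES_HZ = (500, 1000, 2000, 4000)


def pure_tone_average(thresholds_db_hl):
    """Mean threshold (dB HL) across the four frequencies for one ear.

    Assumption: a simple four-frequency mean.
    """
    missing = [f for f in PTA_FREQUENCIES_HZ if f not in thresholds_db_hl]
    if missing:
        raise ValueError(f"missing thresholds at {missing} Hz")
    total = sum(thresholds_db_hl[f] for f in PTA_FREQUENCIES_HZ)
    return total / len(PTA_FREQUENCIES_HZ)


def handicap_inventory_for(age_years):
    """Pick the questionnaire by age: HHIE at 65 or older, otherwise HHIA."""
    return "HHIE" if age_years >= 65 else "HHIA"


# Hypothetical right-ear audiogram for a 66-year-old subject
right_ear = {500: 20, 1000: 30, 2000: 40, 4000: 50}
pta = pure_tone_average(right_ear)        # 35.0 dB HL
inventory = handicap_inventory_for(66)    # "HHIE"
```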

The first set of analyses examined how the hearing-impaired subjects in the current study compared to normative data for the POMS and MFSI-SF.   Scores on vigor subscales were reverse coded and identified as “vigor deficit”, because unlike measures of fatigue or hearing handicap, high scores for vigor indicate less difficulty or less negative impact on the individual.  The authors found that the subjects in their study demonstrated significantly less vigor and slightly more fatigue than the subjects in the normative data. Furthermore, severe fatigue was reported more than twice as often and severe lack of vigor was reported more than four times as often compared to normative data. When subtypes of fatigue were examined, differences in vigor deficit were significantly greater than any of the other subscales, followed by general fatigue and mental fatigue which were both significantly greater than emotional or physical fatigue.
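Reverse coding of the vigor subscale can be illustrated with a short Python sketch. The `vigor_deficit` helper is hypothetical, and the 0 to 4 anchors are an assumption: the summary above says only that items were rated on a 5-point scale.

```python
def vigor_deficit(vigor_ratings, scale_min=0, scale_max=4):
    """Reverse-code vigor item ratings so that higher totals mean less vigor.

    Assumption: items are rated on a 5-point scale anchored at
    scale_min..scale_max (0 to 4 here).
    """
    for rating in vigor_ratings:
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is outside the scale")
    # Flip each rating around the scale midpoint, then sum the items
    return sum(scale_min + scale_max - rating for rating in vigor_ratings)


# A subject who feels quite vigorous gets a low deficit score
deficit = vigor_deficit([4, 3, 4, 2])  # 0 + 1 + 0 + 2 = 3
```

After this transformation, vigor deficit points in the same direction as the fatigue and hearing handicap scores: higher numbers indicate more difficulty.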

Hearing handicap was significantly related to both subjective fatigue and vigor ratings. There were significant relationships among all HHIE/A scores (social, emotional, and total) and all subscales of the MFSI-SF. Total score on the HHIE/A had a simple linear relationship with MFSI-SF ratings in the physical and emotional domains. Total HHIE/A score had a nonlinear relationship with general fatigue, mental fatigue, and vigor deficit scores. In other words, low HHIE/A scores (little or no handicap) were not significantly associated with MFSI-SF ratings, but as HHIE/A scores increased, there were stronger relationships. This nonlinear relationship indicates that as hearing handicap increased, there was a stronger likelihood of general fatigue, mental fatigue and lack of vigor.

Hornsby and Kipp drew three main conclusions from the study outcomes. First, the hearing-impaired adults in their study, who had contacted a hearing clinic for help, were more likely to report low vigor and increased fatigue than adults of comparable age in the general population. They acknowledge that hearing loss was not specifically measured in the normative data and it is likely that there were some hearing-impaired individuals in that population. However, including hearing-impaired individuals in the normative data would tend to reduce the differences noted here. Even so, severe fatigue was more than twice as common in this study and severely low vigor was more than four times as common as in the normative population.

The second notable conclusion was that there was no relationship between degree of hearing loss and subjective ratings of fatigue or vigor. The authors had hypothesized that a higher degree of hearing loss would be associated with increased fatigue and vigor deficit, but this was not the outcome. This observation presents a future avenue in which speech recognition ability could be analyzed as a predictor of individuals’ reported fatigue.

Hearing aid use was not specifically examined in this study, yet it is likely to affect subjective ratings of fatigue and vigor. Several reports indicate that hearing aids, especially those with advanced signal processing, may reduce listening effort, fatigue and distractibility and may improve ease of listening (Hallgren, 2005; Picou, et al., 2013; Noble & Gatehouse, 2006; Bentler, 2008). If study participants base their subjective ratings of fatigue and vigor on how they function in everyday environments with their hearing aids, then the non-significant contribution of degree of hearing loss, as measured audiometrically, could be misleading. Hearing aid experience and usage patterns should be evaluated in future work to ensure that hearing aid benefits do not confound the measured effects of the hearing loss itself.

The significant relationship between hearing handicap and subjective fatigue ratings underscores the importance of incorporating subjective measures into diagnostic and hearing aid fitting protocols.   Hearing care clinicians who counsel patients primarily based on audiometric results may underestimate the challenges faced by individuals who have milder hearing loss but significant perceived hearing handicap.  The HHIE/A and other hearing handicap scales, along with inquiries into work environment and work-related activities, can help us more effectively identify individual needs of our patients and formulate appropriately responsive treatment plans. Similar inquiries should be repeated as follow-up measures to evaluate how well these needs have been addressed and to indicate problem areas that remain.

References

Bentler, R.A., Wu, Y. & Kettel, J. (2008). Digital noise reduction: outcomes from laboratory and field studies. International Journal of Audiology 47, 447-460.

Bryant, D., Chiaravalloti, N. & DeLuca, J. (2004). Objective measurement of cognitive fatigue in multiple sclerosis. Rehabilitation Psychology 49, 114-122.

Copithorne, D. (2006). The fatigue factor: How I learned to love power naps, meditation and other tricks to cope with hearing-loss exhaustion. [Healthy Hearing Website, August 21, 2006].

DeLuca, J. (2005).  Fatigue, cognition and mental effort. In J. DeLuca (Ed.), Fatigue as a Window to the Brain (pp. 37-58). Cambridge, MA: MIT Press.

Eddy, L. & Cruz, M. (2007).  The relationship between fatigue and quality of life in children with chronic health problems: A systematic review. Journal for Specialists in Pediatric Nursing 12, 105-114.

Hallgren, M., Larsby, B. & Lyxell, B. (2005). Speech understanding in quiet and noise, with and without hearing aids. International Journal of Audiology 44, 574-583.

Hetu, R., Riverin, L. & Lalande, N. (1988). Qualitative analysis of the handicap associated with occupational hearing loss. British Journal of Audiology 22, 251-264.

Hornsby, B. (2013). The effects of hearing aid use on listening effort and mental fatigue associated with sustained speech processing demands. Ear and Hearing 34 (5), 523-534.

Hornsby, B. & Kipp, A. (2016). Subjective ratings of fatigue and vigor in adults with hearing loss are driven by perceived hearing difficulties not degree of hearing loss. Ear and Hearing 37 (1), 1-10.

Johnson, S. (2005). Depression and fatigue. In J. DeLuca (Ed.), Fatigue as a Window to the Brain (pp. 37-58). Cambridge, MA: MIT Press.

Kramer, S., Kapteyn, T. & Houtgast, T. (2006). Occupational performance: Comparing normally-hearing and hearing-impaired employees using the Amsterdam Checklist for Hearing and Work. International Journal of Audiology 45, 503-512.

McNair, D., Lorr, M. & Droppleman, L. (1971). Profile of Mood States. San Diego, CA: Educational and Industrial Testing Service. Retrieved from http://www.mhs.com/product.aspx?gr=cl&id=overview&prod=poms.

Noble, W. & Gatehouse, S. (2006). Effects of bilateral versus unilateral hearing aid fitting on abilities measured by the SSQ. International Journal of Audiology 45, 172-181.

Picou, E.M., Ricketts, T.A. & Hornsby, B.W. (2013). The effect of individual variability on listening effort in unaided and aided conditions. Ear and Hearing (in press).

Pronk, M., Deeg, D. & Kramer, S. (2013). Hearing status in older persons: A significant determinant of depression and loneliness? Results from the Longitudinal Aging Study Amsterdam. American Journal of Audiology 22, 316-320.

Ricci, J., Chee, E. & Lorandeau, A. (2007). Fatigue in the U.S. workforce: Prevalence and implications for lost productive work time. Journal of Occupational Environmental Medicine  49, 1-10.

Ringdahl, A. & Grimby, A. (2000). Severe-profound hearing impairment and health related quality of life among post-lingual deafened Swedish adults. Scandinavian Audiology 29, 266-275.

Stein, K., Jacobsen, P. & Blanchard, C. (2004). Further validation of the multidimensional fatigue symptom inventory – short form. Journal of Pain and Symptom Management 27, 14-23.

vanderLinden, D., Frese, M. & Meijman, T. (2003). Mental fatigue and the control of cognitive processes: effects on perseveration and planning. Acta Psychologica (Amst) 113, 45-65.

Ventry, I. & Weinstein, B. (1982). The Hearing Handicap Inventory for the Elderly: a new tool. Ear and Hearing 3, 128-134.

Weinstein, B., Sirow, L. & Moser, S. (2016).  Relating hearing aid use to social and emotional loneliness in older adults. American Journal of Audiology 25, 54-61.

Listening gets more effortful in your forties

DeGeest, S., Keppler, H. & Corthals, P. (2015) The effect of age on listening effort. Journal of Speech, Language and Hearing Research 58(5), 1592-1600.

This editorial discusses the clinical implications of an independent research study and does not represent the opinions of the original authors.

The ability to understand conversational speech in everyday situations is affected by many obstacles. A large proportion of our work involves determining the best treatment plan to help hearing-impaired patients overcome these obstacles. Though understanding speech in noise poses difficulty for hearing-impaired individuals of all ages, several studies have indicated that even in the absence of hearing loss, older adults face increased challenges in noisy environments (Pichora-Fuller & Singh, 2006; Duquesnoy, 1983; Dubno et al., 1984; Helfer & Freyman, 2008). Some reports suggest that middle-aged adults have significantly poorer speech recognition in noise compared to young adults (Helfer & Vargo, 2009).

Competing environmental noise reduces the audibility of acoustic speech information, increasing reliance upon visual, situational and contextual cues, which in turn requires a greater delegation of cognitive resources (Schneider et al., 2002), making listening more effortful. Increases in listening effort in noise could be related to declines in hearing sensitivity or available cognitive resources, as both are known to decline with advancing age. But the fact that normal-hearing individuals also experience more difficulty hearing in noise suggests that factors other than hearing loss may be involved, including working memory, processing speed and selective attention (Akeroyd, 2008; Pichora-Fuller et al., 1995).

The work of DeGeest and colleagues examined listening effort and speech recognition in adult subjects from 20 to 77 years of age. All of the subjects were determined to have normal “age corrected” hearing thresholds from 250 Hz through 8000 Hz, though older subjects had average high-frequency pure tone thresholds in the mild to moderate range of hearing loss. Subjects over age 60 were screened with the Montreal Cognitive Assessment (MoCA; Nasreddine et al., 2005), but no specific cognitive performance measures were included in data analysis. Listening effort was evaluated using a dual-task paradigm in which subjects performed a speech recognition task while simultaneously performing a visual memory task. Speech recognition ability was measured with 10-item sets of two-syllable digits, presented at two SNR levels: +2 dB SNR and -10 dB SNR. Performance on the dual-task presentation was examined in comparison to baseline measures of each test in isolation. Listening effort was defined as the change in performance on the visual memory task when the dual-task condition was compared to baseline. Speech recognition ability was not expected to change from baseline when measured in the dual-task condition.
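The definition of listening effort in a dual-task paradigm can be made concrete with a short Python sketch. The `listening_effort` helper and the example scores are hypothetical, and expressing the change as a percentage of baseline is an assumption made here for illustration; the summary above specifies only that effort is the change in visual memory performance between the baseline and dual-task conditions.

```python
def listening_effort(baseline_score, dual_task_score):
    """Drop in visual memory performance under dual-task load.

    Returned as a percentage of the baseline score (an assumed
    normalization). Larger values mean that listening drew more
    resources away from the memory task.
    """
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline_score - dual_task_score) / baseline_score


# Hypothetical subject: 20 items correct alone, 15 while also listening
effort = listening_effort(20, 15)  # 25.0
```

A value near zero would mean the memory task was unaffected by concurrent listening, while larger values would indicate more effortful listening at that SNR.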

The investigators found that listening effort increased in parallel with advancing age. Though subjects were initially determined to have “age corrected” normal hearing, which meant some participants had high-frequency hearing loss, the correlation between listening effort and age was maintained even when the factors of pure tone threshold and baseline word recognition performance were controlled. Of note was the observation that listening effort started to increase notably at ages of 40.5 years and 44.1 years for the +2 dB and -10 dB SNR conditions, respectively. Their determination that listening effort begins to increase in the early to mid-40s is in agreement with other research that reported cognitive declines beginning around age 45 years (Singh-Manoux et al., 2012). The authors suggest that further investigations of listening effort and word recognition in middle-aged and older adults should examine cognitive ability in more detail, with specific tests of working memory, processing speed and selective attention included in the data analyses.

Although middle-aged adults are less likely to demonstrate outward effects of cognitive decline than older adults, they should not be regarded as immune to changes in cognitive ability and resulting listening effort. Middle-aged individuals are more likely than their older counterparts to be working full time and may have more active lifestyles. Hearing-impaired individuals of middle age who work in reverberant or noisy environments may face additional challenges to job performance if they are also experiencing changes in processing speed or memory or if they struggle with even mild attentional deficits. These are tangible considerations that might impact the entirety of treatment plan development, from the selection of hearing aids and assistive technologies to the communication and counseling strategies that are selected for the patient and their family members.

References

Akeroyd, M. (2008). Are individual differences in speech reception related to individual differences in cognitive ability? A survey of twenty experimental studies with normal and hearing-impaired adults. International Journal of Audiology 47 (Suppl 2), S53-S71.

DeGeest, S., Keppler, H. & Corthals, P. (2015) The effect of age on listening effort. Journal of Speech, Language and Hearing Research 58(5), 1592-1600.

Desjardins, J. & Doherty, K. (2014). The effect of hearing aid noise reduction on listening effort in hearing-impaired adults. Ear and Hearing 35 (6), 600-610.

Dubno, J., Dirks, D. & Morgan, D. (1984). Effects of age and mild hearing loss on speech recognition in noise. Journal of the Acoustical Society of America 76, 87-96.

Duquesnoy, J. (1983). The intelligibility of sentences in quiet and noise in aged listeners. Journal of the Acoustical Society of America 74, 1136-1144.

Helfer, K. & Freyman, R. (2008).  Aging and speech on speech masking. Ear and Hearing 29, 87-98.

Keppler, H., Dhooge, I., Corthals, P., Maes, L., D’haenens, W., Bockstael, A. & Vinck, B. (2010). The effects of aging on evoked otoacoustic emissions and efferent suppression of transient evoked otoacoustic emissions. Clinical Neurophysiology 121, 359-365.

Nasreddine, Z., Phillips, M., Bedirian, V., Charbonneau, S., Whitehead, V., Collin, I. & Chertkow, H. (2005).  The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society 53, 695-699.

Pichora-Fuller, M., Schneider, B. & Daneman, M. (1995).  How young and old adults listen to and remember speech in noise. The Journal of the Acoustical Society of America 97, 593-608.

Pichora-Fuller, M. & Singh, G. (2006). Effects of age on auditory and cognitive processing: implications for hearing aid fitting and audiologic rehabilitation. Trends in Amplification 10, 29-59.

Sarampalis, A., Kalluri, S. & Edwards, B. (2009). Objective measures of listening effort: Effects of background noise and noise reduction. Journal of Speech, Language and Hearing Research 52, 1230-1240.

Schneider, B., Daneman, M. & Pichora-Fuller, M. (2002). Listening in aging adults: from discourse comprehension to psychoacoustics. Canadian Journal of Experimental Psychology 56, 139-152.

The Christmas Party Problem: Guest Post from Dr. Simon Carlile

A version of this blog first appeared as an article in the Australian Audiology Today Christmas edition.

One problem with Christmas parties is that there are so many of them and picking which ones to go to can be difficult. Something to influence your decision (other than the quality of the wine on offer) might be where the party is being held. The downtown club with disco music pounding away might be great if you want to dance the night away but that type of venue is not going to help you develop your network with witty conversation and one-liners. Of course, the real Christmas party challenge, even in less busy environments, is hearing and understanding what others are saying at such gatherings; a problem that is virtually insurmountable for those with even a moderate hearing loss.

The Original “Cocktail Party”

Colin Cherry was the first to coin the phrase “the cocktail party problem,” and it seems appropriate to paraphrase that term in regard to this Christmas issue. While most people reading this article have probably come across this term, not many will have had the opportunity to read Cherry’s original paper – and what an interesting read it is! His brief but very influential paper, “Some experiments on the recognition of speech with one and with two ears,” first appeared in the Journal of the Acoustical Society of America in 1953 and is remarkable for a number of reasons.

First, in coining the term the “cocktail party problem,” the question for Cherry was “How do we recognize what one person is saying when others are speaking at the same time?” Two important ideas can be drawn from this, both of which relate to the fact that the conversational environment of the cocktail party involves multiple talkers rather than just one talker and background noise. The first idea is that some talkers will be conveying information that is of interest and others will not, i.e. conversation is a multisource listening challenge where focus must quickly switch between sources. The second idea is that many of the talkers’ voices will themselves constitute the noise. This is important because the nature of the background sounds determines the type of masking that must be overcome to focus on the sound of interest, and the sorts of processing available to the auditory system to ameliorate that masking (see “A primer on masking” below).

Second, Cherry’s paper is mostly about selective attention in speech understanding, the role of the “statistics of language,” voice characteristics and the costs and time course of switching attention. In the Introduction he makes a very clear distinction between the kinds of perceptions that are studied using simple stimuli, such as clicks or pure tones, and the “acts of recognition and discrimination” that underlie understanding speech in the “cocktail party” environment. Cherry’s paper has been cited nearly 1,200 times, but interestingly enough, the greater proportion of the citing studies focused on detecting sounds against a background of other sounds and used simple stimuli such as tones against broadband noise or other tones. These are hardly the rich and complex stimuli that Cherry was talking about. Of course, this was very much the bottom-up, reductionist approach of the physicists and engineers in Bell Labs and elsewhere who have had an immense influence on the development of our thinking about auditory perception, energetic masking in particular (See Box – “A primer on masking” and the discussion of the development of the Articulation Index).

An excellent and almost definitive review of this literature is provided by Adelbert Bronkhorst in 2000: “The Cocktail Party Phenomenon: A Review of Research on Speech Intelligibility in Multiple-Talker Conditions.” The research over that period focused on energetic unmasking. Examples include the head shadow producing a “better ear advantage” by reducing the masker level in the ear furthest from the masker, the effects of binaural processing, and the effects of the modulation characteristics of speech and other maskers. So, on the one hand, the high citation rate for Cherry’s paper is very surprising because there is very little in the original paper that relates to energetic masking. On the other hand, the appropriation of the term “the cocktail party problem” and the reconfiguring of the research question demonstrate the powerful influence of the bottom-up, physics-engineering approach to thinking about auditory perception. This had become the lens through which much thinking and research was viewed. To be fair though, Bronkhorst does point out in his review that there were some data in the literature involving speech-on-speech masking that were not well explained by energetic masking but that this had not been a particular focus of the research.

 

Informational Masking

The turn of the century was propitious for hearing science as it marked another turning point in our thinking about this “cocktail party” problem. In 1998, Richard Freyman and colleagues reported that differences in the perceived locations of a target and maskers (as opposed to actual physical differences in location) produced a significant unmasking for speech maskers but not for noise. Such a result was not amenable to a simple bottom-up explanation of energetic masking. Thus, Freyman appropriated the term “informational masking,” which had previously been used in experiments involving relatively simple stimuli. This was the first time it had been applied to something as complex and rich as speech. As we shall see in more detail later, the unmasking produced in this experiment depended on the active, top-down focus of attention. As previously mentioned, Bronkhorst had pointed out that others had noted that the interference of speech with speech understanding seemed to amount to more than the algebraic sum of the spectral energy. Indeed, as early as 1969, Carhart and colleagues had referred to this as “perceptual masking” or “cognitive interference.” Along those lines, informational masking in the context of the perceptual unmasking in Freyman’s and later similar experiments came to stand for everything that wasn’t energetic masking.

Over the ensuing 15 years, many studies have been carried out examining the nature of informational masking. A number of general observations can be made and some of these are drawn out in the “Primer” below. One very important shift, however, was that the “cocktail party problem” became increasingly seen as a particular case of the general problem of auditory scene analysis (ASA). This is the problem of “acoustic superposition,” where the energy from multiple concurrent sounds converges on a single encoder, in this case the cochlea of the inner ear. The first task of the auditory system, then, is to work out which spectral components belong to which sound sources and to group them together in some way. The second task is to work out how these now segregated components are joined up in time to provide a stream of information associated with a specific sound.

 

Auditory Scene Analysis

Albert Bregman did much to promote thinking in this area with the publication of Auditory Scene Analysis in 1990, marking a significant return of Gestalt thinking to the study of auditory perception. Although this part of the story is still being worked out, it is clear that the grouping and streaming processes underlying ASA are largely automatic, that is, bottom-up, and that they capitalize on the physical acoustics of sounding bodies. This is probably not surprising given that the auditory system evolved in a world of physically sounding bodies and “the cocktail party problem” is a common evolutionary challenge for nearly all terrestrial animals. The perceptual outcome of this process is the emergence of auditory objects that usually correspond to the individual physical sources. Indeed, many of the experimental approaches to understanding ASA have involved stimuli that create perceptual objects that are in some way ambiguous, and have examined the illusions and/or confusions that such manipulation creates.

In the case of “the cocktail party problem,” the speech from each talker forms a specific stream and the problem becomes more about how we are able to select between the streams. In practical terms, the greater the differences between the talkers on some dimension (pitch, timbre, accent, rhythm, location, etc.), the less likely we are to confuse the streams. That is, the more the streams differ, the more release from informational masking we can expect.

This brings us to the key role of attention in understanding listening in a “cocktail party” scenario. Attention has been thought of as a type of filter that can be focused on a feature of interest, allowing for an up-regulation of the processing of information within that filter with a potential down-regulation of information outside the filter. A physical difference in some aspect of the auditory stream provides the hook onto which the listener can focus their attention. Recognizing the critical role that attention plays in understanding what is happening in a cocktail party scenario moves the discussion from “hearing” to “listening” and closer to Cherry’s goal of understanding the “acts of recognition and discrimination” that underlie the understanding of speech.

 

Auditory Attention

The neuroscience of auditory attention is in its infancy compared with what we know about visual attention, although some tentative generalizations can be made:

Attention is a process of biased competition. The moment-to-moment focus of attention is dependent on competition between (1) top-down, voluntary or endogenous attentional control and (2) bottom-up, saliency-driven or exogenous attention. The cognitive capacity to focus attention plays a key role in the sustained attention necessary to process the stream of information from a particular talker. There is evidence that we listen to only one auditory object at a time and selective attention is critical in enabling this. The exogenous competition introduced by concurrent sounds, particularly other talkers (the distractors), means more cognitive effort is required to sustain attention on a particular target of interest. The implication for an ageing population is that any reduction in cognitive capacity to sustain attention will increase the difficulty of understanding the stream of information from a single talker in the presence of other talkers.

Selective attention works at the level of perceptual objects as opposed to a particular physical dimension such as loudness or pitch. That is, attention focuses on the voice or the location of a particular talker (or both simultaneously – see below). While the attentional hook might be a difference on a particular perceptual dimension, the sum total of characteristics that make up the perceptual object is what becomes enhanced. Models of attention suggest that the competition for attention is played out in working memory and the players are the sensory objects contained in working memory at any particular point in time. Indeed, our conscious perception of the world relies on this process.

What this means is that when auditory objects are not well defined, the application of selective attention can be degraded. There are a number of circumstances where this can happen. For instance, the stimuli themselves may be ambiguous and lack the relevant acoustical elements to support good grouping and streaming. Alternatively, the stimuli may possess the necessary physical characteristics, but poor encoding at the sensory epithelia and/or degraded neural transmission of the perceptual signal can result in reduced fidelity or absence of the encoded features necessary for grouping or streaming. The implication for hearing impairment is that degradation of sensory encoding, such as that produced by broader auditory filters (critical bands) or poor temporal resolution, will weaken object formation and make the task of selective attention that much harder.

Attention acts as both a gain control and a gate. There is a growing body of evidence that indicates attention modulates the activity of neurones in the auditory system, not only at a cortical level but even earlier in the signal chain, possibly even at the level of the hair cells of the cochlea. In a number of recent and ground-breaking experiments, this process of up-regulation of the attended talker and down-regulation of the maskers has been convincingly demonstrated in the auditory cortex of people dynamically switching their attention between competing talkers (Mesgarani & Chang, 2012; Ding & Simon, 2012). Importantly, the strength of the selective cortical representation of the “attended-to” talker correlated with the perceptual performance of the listener in understanding the targeted talker over the competing talker.

The auditory system engages two different attentional systems, one focused on the spatial location of a source and one focused on non-spatial characteristics of the source, which have two different cortical control systems. In a 2013 study, Adrian “KC” Lee and colleagues (Lee et al, 2013) imaged the brains of listeners as they changed their attentional focus. They found that the left frontal eye fields (FEF) became active before the onset of a stimulus when subjects were asked to attend to the location of a to-be-heard sound. This is part of the so-called dorsal attention pathway thought to generally support goal-directed attention. On the other hand, when subjects were asked to attend to a non-spatial attribute of the stimulus such as the pitch, a different pattern of pre-stimulus activation was observed in the left posterior superior temporal sulcus, an area also associated with auditory pitch categorization. This suggests that for the hearing impaired, a loss of the ability to localize the source of a sound disables or degrades a significant component of the auditory attention system, resulting in an increased reliance on the non-spatial attention system.

Returning to Colin Cherry’s paper, it appears that we have, to paraphrase T.S. Eliot, “arrived where we started and know the place for the first time.”

So much of what Cherry discussed in his seminal paper is where we now find our neuroscientific focus including: the statistics of language in terms of its phonetic and semantic characteristics; the focus of attention and how that is mediated by spatial location and/or vocal or other characteristics; the transitional probabilities of what is being said and so on. The difference now is that we have both the technical and analytical tools to get a handle on how these processes are represented in the brain. With an increasing understanding of the functional plasticity of the brain, we are at a point now where we are making advances in the understanding of human perception and cognition that will have significant ramifications for how we intervene, support and rehabilitate many of the disorders that manifest as hearing impairment.

Further Reading

Cherry, E. C. (1953). “Some experiments on the recognition of speech, with one and with two ears.” Journal of the Acoustical Society of America 25: 975-979.

Bronkhorst, A. (2000). “The cocktail party phenomenon: A review of research on speech intelligibility in multiple-talker conditions.” Acustica 86: 117-128.

Lee, A. K. C., et al. (2013). “Auditory selective attention reveals preparatory activity in different cortical regions for selection based on source location and source pitch.” Frontiers in Neuroscience 6: 190.

Mesgarani, N. and Chang, E. F. (2012). “Selective cortical representation of attended speaker in multi-talker speech perception.” Nature 485: 233-236.

Ding, N. and Simon, J. Z. (2012). “Emergence of neural encoding of auditory objects while listening to competing speakers.” Proceedings of the National Academy of Sciences of the United States of America 109: 11854-9.