Starkey Research & Clinical Blog

A preferred speech stimulus for testing hearing aids

Development and Analysis of an International Speech Test Signal (ISTS)

Holube, I., Fredelake, S., Vlaming, M. & Kollmeier, B. (2010). Development and analysis of an international speech test signal (ISTS). International Journal of Audiology, 49, 891-903.

This editorial discusses the clinical implications of an independent research study. The original work was not associated with Starkey Laboratories and does not reflect the opinions of the authors.

Current hearing aid functional verification measures are described in the standards IEC 60118 and ANSI S3.22 and use stationary signals, including sine wave frequency sweeps and unmodulated noise signals. Test stimuli are presented to the hearing instrument, and frequency-specific gain and output are measured in a coupler or ear simulator. Current standardized measurement methods require the instrument to be set at maximum or a reference test setting, with adaptive parameters such as noise reduction and feedback management turned off.

These procedures provide helpful information for quality assurance and determining fitting ranges for specific hearing aid models. However, because they were designed for linear, time-invariant hearing instruments, they have limitations for today’s nonlinear, adaptive instruments and cannot provide meaningful information about real-life performance in the presence of dynamically changing acoustic environments.

Speech is the most important stimulus encountered by hearing aid users and nonlinear hearing aids with adaptive characteristics process speech differently than they do stationary signals like sine waves and unmodulated noise. Therefore, it seems preferable for standardized test procedures to use stimuli that are as close as possible to natural speech.  Indeed, there are some hearing aid test protocols that use samples of natural speech or live speech. But natural speech stimuli will have different spectra, fundamental frequencies, and temporal characteristics depending on the speaker, the source material and the language. For hearing aid verification measures to be comparable to each other it is necessary to have standardized stimuli that can be used internationally.

Alternative test stimuli have been proposed based on the long-term average speech spectrum (Byrne et al., 1994) or temporal envelope fluctuations (Fastl, 1987). The International Collegium for Rehabilitative Audiology (ICRA) developed a set of stimuli (Dreschler et al., 2001) that reflect the long-term average speech spectrum and have speech-like modulations that differ across frequency bands. ICRA stimuli have advantages over modulated noise and sine wave stimuli in that they share some characteristics with speech, but they lack speech-like comodulation characteristics (e.g., fundamental frequency). Furthermore, ICRA stimuli are often classified by signal processing algorithms as “noise” rather than “speech”, so they are less than optimal for measuring how hearing aids process speech.

The European Hearing Instrument Manufacturers Association (EHIMA) is developing a new measurement procedure for nonlinear, adaptive hearing instruments, and an important part of this initiative is the development of a standardized test signal, the International Speech Test Signal (ISTS). The development and analysis of the ISTS were described in a paper by Holube et al. (2010).

There were fifteen articulated requirements for the ISTS, based on available test signals and knowledge of natural speech, the most clinically salient of which are:

  • The ISTS should resemble normal speech but should be non-intelligible.
  • The ISTS should be based on six major languages, representing a wide range of phonological structures and fundamental frequency variations.
  • The ISTS should be based on female speech and should deviate from the international long-term average speech spectrum (ILTASS) for females by no more than 1dB.
  • The ISTS should have a bandwidth of 100 to 16,000Hz and an overall RMS level of 65dB.
  • The dynamic range should be speech-like and comparable to published values for speech (Cox et al., 1988; Byrne et al., 1994).
  • The ISTS should contain voiced and voiceless components. Voiced components should have a fundamental frequency characteristic of female speech.
  • The ISTS should have short-term spectral variations similar to speech (e.g., formant transitions).
  • The ISTS should have modulation characteristics similar to speech (Plomp, 1984).
  • The ISTS should contain short pauses similar to natural running speech.
  • The ISTS stimulus should have a 60 second duration, from which other durations can be derived.
  • The stimulus should allow for accurate and reproducible measurements regardless of signal duration.

Twenty-one female speakers of six different languages (American English, Arabic, Mandarin, French, German and Spanish) were recorded while reading a story, the text and translations of which came from the Handbook of the International Phonetic Association (IPA). One recording from each language was selected based on a number of criteria including voice quality, naturalness and median fundamental frequency. The recordings were filtered to meet the ILTASS characteristics described by Byrne et al. (1994) and were then split into 500ms segments that roughly corresponded to individual syllables. These syllable-length segments were concatenated in pseudo-random order to generate sections of 10 or 15 seconds. The resulting sections could be combined to generate different durations of the ISTS stimulus, and no single language was used more than once in any 6-segment section. Speech interval and pause durations were analyzed to ensure that ISTS characteristics would closely resemble natural speech patterns.
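The ordering constraint described above can be illustrated with a minimal sketch. This is not the authors' actual procedure: it assumes the constraint applies to consecutive blocks of six segments, and the function name `build_section`, the segment labels, and the fixed random seed are all hypothetical.

```python
import random

LANGUAGES = ["English", "Arabic", "Mandarin", "French", "German", "Spanish"]


def build_section(segments_by_language, n_blocks, seed=0):
    """Order segments so that each block of six uses every language exactly once.

    Assumption: the "6-segment section" rule is applied to consecutive,
    non-overlapping blocks. The published procedure may differ.
    """
    rng = random.Random(seed)
    # Shuffled copy of each language's pool, so segments are drawn in random order.
    pools = {}
    for lang in LANGUAGES:
        segs = segments_by_language[lang]
        if len(segs) < n_blocks:
            raise ValueError(f"not enough segments for {lang}")
        pools[lang] = rng.sample(segs, len(segs))

    sequence = []
    for _ in range(n_blocks):
        # A fresh random language order for every block of six.
        for lang in rng.sample(LANGUAGES, len(LANGUAGES)):
            sequence.append((lang, pools[lang].pop()))
    return sequence


# Hypothetical segment labels standing in for the recorded audio segments.
segments = {
    lang: [f"{lang[:2].lower()}{i:02d}" for i in range(10)] for lang in LANGUAGES
}
section = build_section(segments, n_blocks=4)
```

Here `section` holds 24 (language, segment) pairs. In a real implementation each label would be replaced by the corresponding audio samples, and the segments would be joined in this order.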

For analysis purposes, a 60-second ISTS stimulus was created by concatenation of 10- and 15-second sections.  This ISTS stimulus was measured and compared to natural speech and ICRA-5 stimuli based on several criteria:

  • Long-term average speech spectrum (LTASS)
  • Short term spectrum
  • Fundamental frequency
  • Proportion of voiceless segments
  • Band-specific modulation spectra
  • Comodulation characteristics
  • Pause and speech duration
  • Dynamic range (spectral power level distribution)

On all of the analysis criteria, the ISTS stimulus resembled natural speech stimuli as well as or better than ICRA-5 stimuli. Notable improvements for the ISTS over the ICRA-5 stimulus were its comodulation characteristics and dynamic range of 20-30dB, as well as pauses and combinations of voiced and voiceless segments that more closely resembled the distributions in natural speech. Overall, the ISTS was deemed an appropriate speech-like stimulus proposal for the new standard measurement protocol.

Following the detailed analysis, the ISTS stimulus was used to measure four different hearing instruments, which were programmed to fit a flat, sensorineural hearing loss of 60dBHL. Each instrument was nonlinear with adaptive noise reduction, compression and feedback management characteristics. The first-fit algorithms from each manufacturer were used, with all microphones fixed to an omnidirectional mode. Instead of yielding gain and output measurements across frequency for one input level, the results showed percentile-dependent gain (99th, 65th and 30th) across frequency as referenced to the long-term average speech spectrum. The percentile-dependent gain values provided information about nonlinearity: the softer components of speech were represented by the 30th percentile, while moderate and loud speech components were represented by the 65th and 99th percentiles, respectively. Relations between these three percentiles represented the differences in gain for soft, moderate and loud sounds.
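To make the percentile idea concrete, the sketch below computes gain at the 30th, 65th and 99th percentiles for a single frequency band. It is an illustration only, under stated assumptions: the short-term input levels are made-up values, `toy_compressor` is a hypothetical stand-in for a nonlinear hearing aid, and the windowing and band analysis a real measurement would need are not shown.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def percentile_gains(input_levels_db, output_levels_db, percentiles=(30, 65, 99)):
    """Gain per percentile: output percentile level minus input percentile level."""
    return {
        p: percentile(output_levels_db, p) - percentile(input_levels_db, p)
        for p in percentiles
    }


def toy_compressor(level_db, gain_db=30.0, threshold_db=45.0, ratio=2.0):
    """Hypothetical aid: linear gain below the threshold, 2:1 compression above."""
    if level_db <= threshold_db:
        return level_db + gain_db
    return threshold_db + gain_db + (level_db - threshold_db) / ratio


# Made-up short-term input levels in one band: 30 to 70 dB in 0.5 dB steps.
input_levels = [30.0 + 0.5 * i for i in range(81)]
output_levels = [toy_compressor(level) for level in input_levels]

gains = percentile_gains(input_levels, output_levels)
for p in (30, 65, 99):
    print(f"{p}th percentile gain: {gains[p]:.1f} dB")
```

With this toy compressor, gain is highest at the 30th percentile and lowest at the 99th. That spread between the percentile curves is what makes compression visible in this kind of measurement. A linear instrument would show the same gain at all three percentiles.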

The measurement technique described by Holube and colleagues, using the ISTS stimulus, offers significant advantages over current measurement protocols with standard sine wave or noise stimuli. First and perhaps most importantly, it allows hearing instruments to be programmed to real-life settings with adaptive signal processing features active. It measures how hearing aids process a stimulus that very closely resembles natural speech, so clinical verification measures may provide more meaningful information about everyday performance. By showing changes in percentile gain values across frequency, it also allows compression effects to be directly visible and may be used to evaluate noise reduction algorithms as well. The authors also note that the acoustic resemblance of ISTS to speech with its lack of linguistic information may have additional applications for diagnostic testing, telecommunications or communication acoustics.

The ISTS is currently available in some probe microphone equipment and will likely be introduced in most commercially available equipment over the next few years. Its introduction brings a standardized speech stimulus for hearing aid testing to the clinic. An important component of clinical best practice is the measurement of a hearing aid’s response characteristics. This is most easily accomplished through in situ probe microphone measurement in combination with a speech test stimulus such as the ISTS.

References

American National Standards Institute (ANSI). (2003). ANSI S3.22-2003: Specification of hearing aid characteristics. New York: Acoustical Society of America.

Byrne, D., Dillon, H., Tran, K., Arlinger, S. & Wilbraham, K. (1994). An international comparison of long-term average speech spectra. Journal of the Acoustical Society of America, 96(4), 2108-2120.

Cox, R.M., Matesich, J.S. & Moore, J.N. (1988). Distribution of short-term rms levels in conversational speech. Journal of the Acoustical Society of America, 84(3), 1100-1104.

Dreschler, W.A., Verschuure, H., Ludvigsen, C. & Westermann, S. (2001). ICRA noises: Artificial noise signals with speech-like spectral and temporal properties for hearing aid assessment. Audiology, 40, 148-157.

Fastl, H. (1987). Ein Störgeräusch für die Sprachaudiometrie. Audiologische Akustik, 26, 2-13.

Holube, I., Fredelake, S., Vlaming, M. & Kollmeier, B. (2010). Development and analysis of an international speech test signal (ISTS). International Journal of Audiology, 49, 891-903.

International Electrotechnical Commission (IEC). (1994). IEC 60118-0: Hearing aids: Measurement of electroacoustical characteristics. Geneva, Switzerland: Bureau of the International Electrotechnical Commission.

International Phonetic Association (IPA). (1999). Handbook of the International Phonetic Association. Cambridge University Press.

Plomp, R. (1984). Perception of speech as a modulated signal. In M.P.R. van den Broecke & A. Cohen (Eds.), Proceedings of the 10th International Congress of Phonetic Sciences, Utrecht. Dordrecht: Foris Publications, 29-40.