Comparison of Voice Onset Time in People with Spastic Dysarthria and Healthy Group


Fatemeh Khorsha Kisomi 1, Majid Soltani 1,*, Maryam Dastoorpoor 2, Nastaran Madjdinasab 3, Negin Moradi 1

1 Musculoskeletal Rehabilitation Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
2 Department of Biostatistics and Epidemiology, Menopause Andropause Research Center, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
3 Department of Neurology, Golestan Hospital, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran

how to cite: Khorsha Kisomi F, Soltani M, Dastoorpoor M, Madjdinasab N, Moradi N. Comparison of Voice Onset Time in People with Spastic Dysarthria and Healthy Group. Shiraz E-Med J. 2020;21(5):e94573.



Given the role of voice onset time (VOT) in speech production and its value in identifying speech disorders, the present study aimed to compare VOT in people with multiple sclerosis (MS) and a healthy group and to investigate the factors affecting VOT.


In this cross-sectional analytical study, 36 patients with MS and spastic dysarthria and 36 healthy subjects were investigated. After the subjects were placed in an acoustically controlled environment, the acoustic signal of words beginning with the voiceless and voiced stops /p/, /t/, /k/, /b/, /d/, /g/ followed by the vowel /a/, in a CVC context, was recorded using a Shure Beta 54 microphone. The spectrogram of each word was examined with Praat version 6.0.36. Data were analyzed with the Shapiro-Wilk test, the independent t-test, and two-way analysis of variance.


Patients with MS had a longer VOT than healthy people, although the difference was not statistically significant (P > 0.05). Place of articulation had a significant effect on VOT in the healthy controls but not in the patient group. Voicing (voiced versus voiceless words) had a significant effect on VOT (P < 0.05). The interaction between place of articulation and voicing was not significant (P > 0.05); that is, the two variables do not affect VOT jointly, although each may affect it independently.


Patients with MS differ from the healthy group in motor coordination between the articulatory structures and the speech structure. Deficient timing of speech production causes problems in speech motor control and ultimately changes the speech of the affected people.

1. Background

Voice onset time (VOT) is one of the acoustic indicators used to study speech motor control. It is an important parameter in disordered timing of speech sound production and is defined as the interval between the release of air in the production of a stop consonant and the onset of regular vocal fold vibration (1). The parameter is influenced by physiological and aerodynamic factors such as lung capacity, movement of the tongue and lips, movement duration, and changes in speech rate. This means that VOT varies as the tongue moves up and down or back and forth. Since speech timing is determined directly by the central nervous system, VOT reflects a complex interplay between mechanical and neurological factors that play a crucial role in speech motor coordination, including coordination between laryngeal and supralaryngeal mechanisms such as articulation, stress, phonation, and respiration (2-5).
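The definition above can be written as a simple difference of two time points. The following is a minimal sketch, not part of the study's procedure: the function names and the example time stamps are hypothetical, and the two-way labeling by sign is only an illustration of how positive and negative VOT values arise.

```python
def voice_onset_time(burst_release_s: float, voicing_onset_s: float) -> float:
    """Return VOT in seconds: time of voicing onset minus time of stop release.

    A positive value means voicing starts after the release (voicing lag);
    a negative value means voicing starts before the release (voicing lead).
    """
    return voicing_onset_s - burst_release_s


def voicing_category(vot_s: float) -> str:
    """Label a VOT value by its sign (illustrative two-way split only)."""
    return "voicing lead" if vot_s < 0 else "voicing lag"


# Hypothetical time stamps read from a waveform, in seconds.
vot_voiceless = voice_onset_time(burst_release_s=0.100, voicing_onset_s=0.190)
vot_voiced = voice_onset_time(burst_release_s=0.100, voicing_onset_s=0.061)

print(round(vot_voiceless, 3), voicing_category(vot_voiceless))
print(round(vot_voiced, 3), voicing_category(vot_voiced))
```

With this sign convention, voiceless stops yield positive values and prevoiced stops yield negative values, which is how the VOT means are reported later in the Results.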

Voice onset time in neurological diseases has been examined in several studies. For example, Hardcastle et al. (1985), Morris (1989), and Stipinovich and Van der Merwe (2007) examined speech timing in patients with dysarthria. The results of these studies indicate that patients have problems coordinating laryngeal and supralaryngeal mechanisms, which stem from impaired speech motor control (6-8).

In MS patients with spastic dysarthria, VOT is affected by irregular breathing, pitch variation, and changes in voice quality. Disordered VOT and overlapping VOT patterns for voiced and voiceless sounds in MS patients show that the production errors are mainly phonetic in nature and are related to deficits in the timing of the production process (9). The clinical aim of analyzing VOT in motor speech disorders is to establish a correlation between voice abnormalities and phonetic disorder. Since the correct production of stop consonants requires coordination between the muscles of the tongue, lips, jaw, and larynx, VOT can be a good indicator of this coordinated timing (3, 9). As the acoustic properties of speech are highly sensitive to neurophysiologic function, a precise study of this parameter can help detect the onset of neuromuscular disorders, track their progression, and identify subtle speech disorders (10).

In many studies, VOT has been investigated as a speech timing feature of neurological disorders, including Parkinson's disease and ALS. However, few studies have been conducted on patients with MS and spastic dysarthria. The studies carried out so far have examined specific timing measures, such as syllable duration, whereas VOT, which is also a temporal variable, remains to be investigated.

2. Objectives

Given the role of VOT in speech production and its value in identifying speech disorders, the present study aimed to compare VOT in people with spastic dysarthria and a control group and to study the factors affecting VOT changes in these individuals.

3. Methods

This is a cross-sectional study with two groups, a patient group and a healthy group. The patient group included multiple sclerosis patients with spastic dysarthria who were referred to the speech therapy clinics and the MS Association of Ahvaz Jundishapur University of Medical Sciences. The healthy group consisted of staff members of Ahvaz Jundishapur University, matched with the patient group for age and sex. The diagnosis of MS was confirmed by a neurologist. The definitive diagnosis of dysarthria was made by two speech pathologists.

Because no similar studies were available for determining the sample size, a pilot study was first conducted on 10 healthy subjects and 10 patients. In MS patients, the mean VOT kur was 0.104 with a standard deviation of 0.0188; in healthy people, the mean VOT kur was 0.089 with a standard deviation of 0.019. Then, considering a 95% confidence level and 80% test power, the final sample size was determined as 36 patients and 36 healthy subjects using a sample size formula.
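The formula used by the authors does not appear in this text. Purely as an illustration, the sketch below applies the common textbook formula for comparing two independent means, n = (z_{1-α/2} + z_{1-β})² (σ₁² + σ₂²) / (μ₁ − μ₂)², to the pilot values quoted above. The choice of this formula, the two-sided test, and the function name are assumptions, not the authors' stated method.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(mean1, sd1, mean2, sd2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent means.

    Assumed textbook formula (two-sided test, normal approximation):
        n = (z_{1-alpha/2} + z_{power})^2 * (sd1^2 + sd2^2) / (mean1 - mean2)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.80
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / (mean1 - mean2) ** 2
    return ceil(n)


# Pilot values from the text: patients 0.104 (SD 0.0188), healthy 0.089 (SD 0.019).
n_per_group = sample_size_two_means(0.104, 0.0188, 0.089, 0.019)
print(n_per_group)
```

Under these assumptions the result is smaller than the 36 per group reported in the study, so the authors presumably used a different formula or allowed an additional margin; the sketch only shows how pilot means and standard deviations enter such a calculation.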


The inclusion criteria for the patient group were confirmed dysarthria, being monolingual (Persian), and absence of cold symptoms and respiratory diseases. The exclusion criteria were lack of consent to participate in the study and recurrence of MS attacks during sampling. The inclusion criteria for the healthy group were matching with the MS group in age and sex, absence of cold symptoms, and consent to participate in the study.

3.1. Dysarthria Evaluation

In the first stage, the presence of dysarthria in MS patients was diagnosed by two speech pathologists. For this purpose, the components of speech were evaluated based on the Duffy protocol. The protocol includes sustained production of the vowels /a/ and /i/, alternating and sequential motion rates with /pa-ta-ka/ (diadochokinesis), counting from 1 to 10 in one breath, continuous reading of texts, and a speech sample (introducing oneself, describing one's job, if employed, and answering interview questions).

In the sustained vowel task, each client was asked to take a deep breath and sustain the vowels /a/ and /i/ separately for as long as possible. The subjects performed the task after brief training by a speech pathologist who was present at the session. In examining the alternating motion rate, the subjects were asked to repeat /pa/ for as long as they could, quickly and accurately. This test was performed after two to three seconds of training. After repeating the syllable /pa/, the subjects repeated the syllables /ta/ and /ka/ separately in the same way. The sequential motion rate was assessed similarly; the subjects repeated the syllables /pa, ta, ka/ in succession (11).

After the interviews and preliminary evaluations, the voice recordings were given to two speech-language pathologists, each with at least 10 years of clinical experience in motor speech disorders, who listened to the recorded samples separately in order to identify dysarthria. They were asked to judge the samples according to the speech factors of respiration, phonation, articulation, stress, and prosodic features. In case of disagreement about the diagnosis of dysarthria, the subject's voice was re-examined at a joint meeting (12).

After the definitive diagnosis of dysarthria, the subjects were tested for VOT in a second stage. At this stage, the examiner described the task to the subject and demonstrated it in practice (11).

The task consisted of six monosyllabic words, which were presented in the form of phrases. The target words begin with voiced and voiceless stop consonants at the palatal, alveolar, and bilabial places of articulation, followed by the vowel /a/ (8, 13).

These phrases were written legibly by hand on 8 × 4 cm cards. Each phrase was presented to the subjects within five seconds. The subjects were then asked to read the phrase on the card three times, clearly and naturally (13, 14). The phrases are listed in Table 1.

Table 1. The Target Phrases Used in Data Collection

| Voiced Stops | Voiceless Stops |
|---|---|
| /gam Ɂæst/ | /Ɂin kar Ɂæst/ |
| /dar Ɂæst/ | /Ɂin tar Ɂæst/ |
| /bar Ɂæst/ | /Ɂin par Ɂæst/ |

Recordings were made with a Shure Beta 54 headset microphone (made in USA) worn on the subject's head, with the microphone placed three cm to the right of the mouth (15).

Speech samples were recorded in a sound-treated environment with ambient noise below 50 dB (11, 18). After the data were recorded, Praat software version 6.0.36 was used to examine each word's spectrogram (8, 13).

Positive VOT values were measured starting from the zero crossing before the first negative peak of the burst (Figure 1). Negative VOT values, which occur in voiced stops, were measured starting from the lowest point of the first negative peak (14) (Figure 2).

3.2. Analysis

Mean, standard deviation, and frequency were used to describe the data. The normality of the data was assessed with the Shapiro-Wilk test, which showed that the main variables were normally distributed. The independent t-test and two-way analysis of variance were used to compare group means. The significance level was set at 0.05. The analysis was performed using SPSS software version 22.

3.3. Ethical Considerations

In this study, oral and written informed consent was obtained from the participants. They were also assured that their personal information would remain confidential. The study proposal was approved by the Ethics Committee of Jundishapur University of Medical Sciences in Ahvaz with the code of ethics IR.AJUMS.REC.1396.1112.

4. Results

The average age of the subjects in the patient group was 39.08 ± 8.41 years, and the average age of the healthy group was 37.22 ± 7.88 years. The male to female ratio was 6 to 30. The patients' mean EDSS score was 1.65 ± 1.17 and their mean dysarthria severity was 1.65 ± 0.65. The average duration of the disease was 6 years and 4 months.

In this study, the correlation coefficient for measurements repeated at a two-week interval was 0.91 (16).

Table 2 shows the mean VOT in the healthy and patient groups for each consonant in combination with the vowel /a/. In Table 2, the mean VOT in both voiceless and voiced stop words is longer in the patient group than in the healthy controls (for voiced stops, the values are more negative). However, this difference is not statistically significant (P > 0.05).

Table 2. Mean and SD of Voice Onset Time (in Seconds) in Voiced and Voiceless Stop Words

| Words, Group | Mean ± SD | P Value |
|---|---|---|
| /kar/ (palatal-voiceless) | | 0.796 |
| Healthy | 0.090 ± 0.024 | |
| Patient | 0.092 ± 0.019 | |
| /tar/ (alveolar-voiceless) | | 0.813 |
| Healthy | 0.080 ± 0.025 | |
| Patient | 0.081 ± 0.019 | |
| /par/ (bilabial-voiceless) | | 0.969 |
| Healthy | 0.082 ± 0.021 | |
| Patient | 0.082 ± 0.021 | |
| /bar/ (bilabial-voiced) | | 0.094 |
| Healthy | -0.039 ± 0.037 | |
| Patient | -0.049 ± 0.032 | |
| /dar/ (alveolar-voiced) | | 0.585 |
| Healthy | -0.041 ± 0.042 | |
| Patient | -0.045 ± 0.041 | |
| /gam/ (palatal-voiced) | | 0.184 |
| Healthy | -0.025 ± 0.029 | |
| Patient | -0.037 ± 0.038 | |
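As an illustration of the group comparison behind each row of Table 2, the sketch below computes an independent-samples t statistic from summary statistics, using the /kar/ row as input. It assumes 36 subjects per group and the equal-variance (pooled) form of the test; the source does not state which form was used or how the three repetitions were combined, and the function name is hypothetical. The P value is not computed here, and because the tabulated means and SDs are rounded, this calculation cannot be expected to reproduce the reported P values exactly.

```python
from math import sqrt


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance independent t statistic and degrees of freedom."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_value = (mean1 - mean2) / standard_error
    degrees_of_freedom = n1 + n2 - 2
    return t_value, degrees_of_freedom


# /kar/ row of Table 2: patient 0.092 ± 0.019, healthy 0.090 ± 0.024 (seconds).
# n = 36 per group is an assumption based on the reported sample size.
t_kar, df_kar = independent_t(0.092, 0.019, 36, 0.090, 0.024, 36)
print(round(t_kar, 3), df_kar)
```

A t statistic this small relative to 70 degrees of freedom is in line with the non-significant difference reported for this word.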

Table 3 shows the VOT values by place of articulation and voicing (voiced versus voiceless) in the healthy controls. The results of two-way ANOVA showed that mean VOT differed significantly both by place of articulation and by voicing (P < 0.05).

Table 3. Mean Distribution of VOT in Terms of Place of Articulation and Voiced-Voiceless in the Healthy Group

| Variable, Class | Mean ± SD | F | P Value |
|---|---|---|---|
| Place of articulation | | 3.692 | 0.027 |
| Palatal | 0.033 ± 0.004 | | |
| Alveolar | 0.019 ± 0.004 | | |
| Bilabial | 0.022 ± 0.004 | | |
| Voiced-voiceless | | 800.63 | < 0.001 |
| Voiced | -0.035 ± 0.003 | | |
| Voiceless | 0.085 ± 0.003 | | |
| Voiced-voiceless × place of articulation | | 0.153 | 0.858 |

However, the interaction between the two variables is not significant. That is, place of articulation and voicing each affect VOT independently, but they have no significant joint effect (Figure 3).

Figure 3. Mean distribution of VOT in terms of place of articulation and voiced-voiceless in the healthy group
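The class means in Table 3 can be related to the per-word means for the healthy group in Table 2. The sketch below assumes a balanced design (the same number of observations for every word), in which case each marginal mean is simply the average of the corresponding cell means. The data structure and function name are illustrative, not taken from the study.

```python
# Cell means for the healthy group, in seconds, taken from Table 2.
# Keys are (place of articulation, voicing).
cell_means = {
    ("palatal", "voiceless"): 0.090,
    ("alveolar", "voiceless"): 0.080,
    ("bilabial", "voiceless"): 0.082,
    ("bilabial", "voiced"): -0.039,
    ("alveolar", "voiced"): -0.041,
    ("palatal", "voiced"): -0.025,
}


def marginal_mean(cells, axis, level):
    """Average the cell means whose key matches `level` on the given axis.

    axis 0 selects by place of articulation, axis 1 selects by voicing.
    Valid as a marginal mean only under the balanced-design assumption.
    """
    values = [mean for key, mean in cells.items() if key[axis] == level]
    return sum(values) / len(values)


for place in ("palatal", "alveolar", "bilabial"):
    print(place, round(marginal_mean(cell_means, 0, place), 4))
for voicing in ("voiced", "voiceless"):
    print(voicing, round(marginal_mean(cell_means, 1, voicing), 4))
```

The values obtained this way agree with the class means in Table 3 to within about 0.001 s, the difference being attributable to rounding of the tabulated means.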

In Table 4, as in the previous table, the mean distribution of VOT by place of articulation and voicing in the patient group was evaluated by two-way ANOVA. There was no significant difference by place of articulation (P > 0.05), but there was a significant difference between voiced and voiceless words (P < 0.05). The interaction between the two variables was not significant (Figure 4).

Table 4. Mean Distribution of VOT in Terms of Place of Articulation and Voiced-Voiceless in the Patient Group

| Variable, Class | Mean ± SD | F | P Value |
|---|---|---|---|
| Place of articulation | | 2.705 | 0.069 |
| Palatal | 0.027 ± 0.004 | | |
| Alveolar | 0.018 ± 0.004 | | |
| Bilabial | 0.017 ± 0.004 | | |
| Voiced-voiceless | | 1008.03 | < 0.001 |
| Voiced | -0.044 ± 0.003 | | |
| Voiceless | 0.086 ± 0.003 | | |
| Voiced-voiceless × place of articulation | | 0.179 | 0.837 |

Figure 4. Mean distribution of VOT in terms of place of articulation and voiced-voiceless in the patient group

Tukey's post hoc test showed that the mean VOT of patients was higher in palatal words than in alveolar and bilabial words. However, the differences were not significant between palatal and alveolar (P = 0.187), between palatal and bilabial (P = 0.103), or between alveolar and bilabial (P > 0.999).

The findings of this study suggest that, because of nervous system involvement in MS patients, coordination between the phonation system (the larynx) and the articulation system (tongue movements) is reduced, so the effect of place of articulation on VOT is not significant. In the healthy group, by contrast, this coordination is intact, and the effect of place of articulation on VOT was significant (13).

In general, these findings can be used to evaluate fine motor movements and laryngeal coordination in speech motor control disorders.

5. Discussion

The aim of the present study was to compare VOT in people with spastic dysarthria and a healthy group, and to examine the factors affecting VOT and the interaction between these factors.

The comparison of VOT in the patient and healthy groups showed that the mean VOT in voiced and voiceless stop words was longer in the patient group than in the healthy group. However, this difference was not statistically significant (P > 0.05). Since speech rate decreases in MS patients with spastic dysarthria, and since VOT is a rate-dependent acoustic parameter, a longer VOT in patients than in the healthy group indicates a greater timing deficit and poorer coordination between the laryngeal and supralaryngeal structures (3, 17). The results of this study agree with those of Bunton and Weismer (2002), who found no statistically significant difference in VOT between the dysarthria and healthy groups. The absence of a statistically significant difference may be because neither study treated the absence of prior treatment as an inclusion criterion or took disease duration into account in the clients' demographic information. Given that patients show deficits in the respiratory system, that VOT is influenced by respiratory volume, and that therapeutic methods focus on speech rate and respiratory function, disease duration can be considered an intervening factor in speech control (13, 18). However, the results of this study contradict those of Flint et al. (1992), who reported a delayed VOT in 30 patients with Parkinson's disease compared with the control group. They attributed the decrease in duration to the softness of the laryngeal muscles in people with dysarthria and the reduced opening of the vocal cords (19).

Like MS, Parkinson's disease is a degenerative disease. The difference between that study and the current one may be due to the site of damage, which differs between the two diseases: Parkinson's disease involves the basal ganglia, and the patients have hypokinetic dysarthria with an increased speech rate. In this study, the MS patients have spastic dysarthria and a slow speech rate. At high speech rates, VOT may reveal some deficits in motor control (20). The results of this study also differ from those of Hardcastle et al., who found a significant difference in VOT between dysarthria patients and healthy controls. The reason for the difference between Hardcastle's study and the present work may be the difference in dysarthria severity and in the frequency and type of stimulus. In Hardcastle's study, patients had mild dysarthria and the stimulus was presented at least four times as a single word. In the present study, patients had mild dysarthria and the stimulus was presented with three repetitions in the form of phrases, which may account for the difference (6).

Regarding the effect of place of articulation on VOT, the findings for the healthy group agree with the study of Fischer and Goberman (2010), in which place of articulation affected VOT in healthy controls. The difference in VOT across places of articulation may be attributed to physiological changes caused by the difference in pressure at the various tongue positions used in phoneme production. For the patient group, however, the two studies disagree. In Fischer and Goberman's study, place of articulation had an effect on VOT, whereas the current study found no significant difference. This may be because the movement speed of the articulators differs with the type of dysarthria: in Fischer and Goberman's study, patients had hypokinetic dysarthria, and in the present study, patients had spastic dysarthria (13). As in Klatt's study (1975), VOT in the patient group was longer in palatal stop words than in alveolar and bilabial words, which may be explained as follows. To move back, the tongue needs closer coordination with the laryngeal and supralaryngeal muscles (tongue, palate, and lips), as a neural control mechanism, than it does for alveolar and bilabial consonants (21-23).

Regarding the effect of voicing on VOT, the results of this study showed that the voiced-voiceless variable affects VOT. The reason may be that air outflow changes in voiceless stop consonants, whereas vocal cord vibration changes in voiced stop consonants. As changes in lung capacity and in the vocal cords affect VOT, it can be concluded that the voiced-voiceless feature affects VOT (13, 24, 25).

The results of the present study showed that the interaction between place of articulation and voicing had no significant effect on VOT. According to Bohlooli et al., there was no significant difference between voiced and voiceless pairs of stop consonants with respect to lingual-palatal contact. Therefore, the voiced-voiceless feature does not make a significant difference in place of articulation. Thus, we may assume that these two variables do not affect VOT jointly, although each might affect it independently (26).

5.1. Strengths of Research

VOT can serve as a precise quantitative criterion for examining and detecting subtle movements of the tongue and lips, and by assessing this acoustic index, speech-language pathologists can better monitor acoustic behavior.

5.2. Limitations and Weaknesses of the Study

The main limitations of this study were the difficulty of finding people with MS and dysarthria who met the inclusion criteria and of matching participants person-to-person for age and gender.

It is suggested that VOT in MS patients with dysarthria be compared before and after speech therapy in order to assess the subsystems of speech motor control. It would also be useful to include other speech timing parameters to evaluate the speech control process more precisely.

5.3. Conclusions

In this study, MS patients with dysarthria had longer VOT than healthy subjects, but the difference was not statistically significant. Nonetheless, MS patients with dysarthria differ from the healthy group in timing and motor coordination between articulation, phonation, respiration, and speech production. Although the difference is small, this timing deficit indicates poor coordination between the laryngeal and supralaryngeal muscles, including the tongue and lips, which correlates with the degree of neurological damage. Therefore, the acoustic parameter of VOT is important for identifying motor problems of the speech organs. In addition, the factors influencing VOT, namely place of articulation, voicing, and their interaction, were studied. Each variable on its own affects VOT. Examining these issues allows therapists to assess this group of patients more accurately.


References

1. Lisker L, Abramson AS. A cross-language study of voicing in initial stops: Acoustical measurements. Word. 2015;20(3):384-422. doi: 10.1080/00437956.1964.11659830.
2. Kent RD. Research on speech motor control and its disorders: A review and prospective. J Commun Disord. 2000;33(5):391-427. quiz 428. doi: 10.1016/s0021-9924(00)00023-x. [PubMed: 11081787].
3. Auzou P, Ozsancak C, Morris RJ, Jan M, Eustache F, Hannequin D. Voice onset time in aphasia, apraxia of speech and dysarthria: A review. Clin Linguist Phon. 2009;14(2):131-50. doi: 10.1080/026992000298878.
4. Feijo AV, Parente MA, Behlau M, Haussen S, de Veccino MC, Martignago BC. Acoustic analysis of voice in multiple sclerosis patients. J Voice. 2004;18(3):341-7. doi: 10.1016/j.jvoice.2003.05.004. [PubMed: 15331106].
5. Öğüt F, Kiliç MA, Engin EZ, Midilli R. Voice onset times for Turkish stop consonants. Speech Comm. 2006;48(9):1094-9. doi: 10.1016/j.specom.2006.02.003.
6. Hardcastle WJ, Barry RA, Clark CJ. Articulatory and voicing characteristics of adult dysarthric and verbal dyspraxic speakers: An instrumental study. Br J Disord Commun. 1985;20(3):249-70. doi: 10.3109/13682828509012266. [PubMed: 4084436].
7. Morris RJ. VOT and dysarthria: A descriptive study. J Commun Disord. 1989;22(1):23-33. doi: 10.1016/0021-9924(89)90004-x. [PubMed: 2715378].
8. Stipinovich A, Van der Merwe A. Acquired dysarthria within the context of the four-level framework of speech sensorimotor control. S Afr J Commun Disord. 2007;54:67-76. [PubMed: 18240662].
9. Hartelius L, Runmarker B, Andersen O, Nord L. Temporal speech characteristics of individuals with multiple sclerosis and ataxic dysarthria: 'Scanning speech' revisited. Folia Phoniatr Logop. 2000;52(5):228-38. doi: 10.1159/000021538. [PubMed: 10965176].
10. Jahan A. [Voice onset time in Azerbaijani consonants]. Arch Rehabil. 2009;10(3). Persian.
11. Duffy JR. Motor speech disorders: Substrates, differential diagnosis, and management. Elsevier Health Sciences; 2013.
12. Connor NP, Ludlow CL, Schulz GM. Stop consonant production in isolated and repeated syllables in Parkinson's disease. Neuropsychologia. 1989;27(6):829-38. doi: 10.1016/0028-3932(89)90006-7. [PubMed: 2755591].
13. Fischer E, Goberman AM. Voice onset time in Parkinson disease. J Commun Disord. 2010;43(1):21-34. doi: 10.1016/j.jcomdis.2009.07.004. [PubMed: 19717164].
14. Bijankhan M, Nourbakhsh M. Voice onset time in Persian initial and intervocalic stop production. J Int Phonetic Assoc. 2009;39(3):335-64. doi: 10.1017/s0025100309990168.
15. Baghban K, Torabinezhad F, Moradi N, Mardani N, Asadollahpour F. [An investigation of the effect of nasalization on /a/ vowel frequency formants before and after /m/ nasal consonant in cleft palate children]. J Paramed Sci Rehabil. 2014;3(2):62-8. Persian.
16. Peolsson A, Hedlund R, Oberg B. Intra- and inter-tester reliability and reference values for hand strength. J Rehabil Med. 2001;33(1):36-41. doi: 10.1080/165019701300006524. [PubMed: 11480468].
17. Adam H. An acoustical study of the fricative /s/ in the speech of Palestinian-speaking Broca's Aphasics: preliminary findings. Ling Online. 2012;53(3/12):4.
18. Bunton K, Weismer G. Segmental level analysis of laryngeal function in persons with motor speech disorders. Folia Phoniatr Logop. 2002;54(5):223-39. doi: 10.1159/000065199. [PubMed: 12378034].
19. Flint AJ, Black SE, Campbell-Taylor I, Gailey GF, Levinton C. Acoustic analysis in the differentiation of Parkinson's disease and major depression. J Psycholinguist Res. 1992;21(5):383-9. doi: 10.1007/bf01067922. [PubMed: 1447729].
20. Wildgruber D, Ackermann H, Grodd W. Differential contributions of motor cortex, basal ganglia, and cerebellum to speech motor control: effects of syllable repetition rate evaluated by fMRI. Neuroimage. 2001;13(1):101-9. doi: 10.1006/nimg.2000.0672. [PubMed: 11133313].
21. LaDuke KM. Speech characteristics in multiple sclerosis [dissertation]. University of Wyoming; 2001.
22. Klatt DH. Voice onset time, frication, and aspiration in word-initial consonant clusters. J Speech Hear Res. 1975;18(4):686-706. doi: 10.1044/jshr.1804.686. [PubMed: 1207100].
23. Cho T, Ladefoged P. Variations and universals in VOT: Evidence from 17 endangered languages. UCLA Working Papers in Phonetics. 1997:18-40.
24. Samareh Y, Nili Pour R. [Book of phonetics Persian: Phoneme and phonetic structural of syllable]. Markaz Nashr Daneshgaahi; 2004. Persian.
25. Lazard G. Une neutralisation en phonologie persane. Klincksieck; 1972. French.
26. Bohlooli A, Agharasouli Z, Torabinezhad F, Keyhani MR. [Effect of voicing on spacial indices of consonants]. Reading. 2009;1(1):4. Persian.

Copyright © 2020, Author(s). This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License, which permits copying and redistributing the material for noncommercial use only, provided the original work is properly cited.