Reliability and Validity of the Embouchure Dystonia Severity Rating Scale
Abstract
Objective
Embouchure dystonia (ED) is a task-specific movement disorder that leads to loss of fine motor control of the embouchure and tongue muscles in wind musicians. In contrast to musicians’ hand dystonia, no validated severity rating for ED exists, posing a major obstacle for structured assessment in scientific and clinical settings. The aim of this study is to validate an ED severity rating scale (EDSRS) allowing for a standardized estimation of symptom severity in ED.
Methods
The EDSRS was set up as a composite score of six items evaluating audio-visual disease symptoms during the performance of three standardized musical tasks (sustained notes, scales, and fourths) separately for each body side. For validation, 17 musicians with ED underwent standardized audiovisual recordings during performance. Anonymized and randomized recordings were assessed by two experts in ED (raters). Statistical analysis included metrics of consistency, reliability, and construct validity with the fluctuation of the fundamental frequency of the acoustic signal (F0) (extracted in an audio analysis of the sustained notes).
Results
The EDSRS showed high internal consistency (Cronbach’s α = 0.975–0.983, corrected item-total correlations r = 0.90–0.96), interrater reliability (intraclass correlation coefficient [ICC] for agreement/consistency = 0.94/0.96), intrarater reliability over time (ICC per rater = 0.93/0.87) and good precision (standard error of measurement = 2.19/2.65), and correlated significantly with F0 variability (r = 0.55–0.60, p = 0.011–0.023).
Conclusion
The developed EDSRS is a valid and reliable tool for the assessment of ED severity in the hands of trained expert raters. Its easy applicability makes it suitable not only for routine clinical practice but also for scientific studies.
Musician’s dystonia (MD) is a focal, task-specific dystonia. The resulting loss of fine motor control for highly trained movements considerably impairs playing ability, often resulting in termination of professional careers [1]. Two major subentities can be distinguished: 1) MD of the extremities, mostly affecting the fingers playing the instrument (lower extremities are only rarely affected), and 2) embouchure dystonia (ED) affecting the orofacial muscles and the tongue, most frequently in brass players [1,2]. Diagnosis still relies mainly on patient history and clinical examination at the instrument by a movement disorders specialist. In musician’s hand dystonia, the dystonic posture of the affected fingers is usually visible on the instrument during performance. In contrast, in ED, such purely visual assessment is inherently limited since not only the perioral muscles but also the jaw or tongue muscles may be affected [2]. Thus, assessment of ED severity relies heavily on the evaluation of sound quality (e.g., onset of a note or ability to sustain a note). A validated clinical rating scale with sufficient sensitivity to ED-specific features is therefore highly desirable; however, to date, such scales exist only for musician’s hand dystonia (see Peterson et al. [3] for a review). While customized clinical ratings of ED severity have been applied in previous neuroimaging research [4-6], no validated version of such a scoring is available. Therefore, the aim of the present work was to assess the validity of a clinical ED symptom severity scale (EDSRS) derived from those previously used customized scoring approaches, including its relationship with an established objective measure, the fluctuation of the fundamental frequency (F0), which has reproducibly been shown to differ between diseased and nondiseased professional brass musicians [7,8].
MATERIALS & METHODS
Participants and audiovisual data acquisition
Between 2016 and 2017, 17 professional brass players with ED from the Institute of Music Physiology and Musicians Medicine in Hanover (an expert center for diagnosis and management of MD) were included in parallel to a neuroimaging study published elsewhere [4]. Data acquisition was approved by the Technical University of Munich ethics committee (5173/11S), and written informed consent according to the Declaration of Helsinki was obtained from the participants. Standardized performances were audio-visually recorded from each patient with a Sony® HDR-CX305 video camera (Sony, Tokyo, Japan) using a stereo microphone with zoom. Each patient played three short musical tasks: 1) ascending and descending scales, 2) ascending and descending fourths, and 3) sustained tones of at least 5 seconds duration (Supplementary Figure 1 in the online-only Data Supplement). All three tasks were performed by each patient in low, medium and high pitch registers typical for the respective instrument (French horn, trombone, or trumpet) to ensure sufficient sensitivity given a frequent differential degree of involvement of pitch registers in ED. Furthermore, each piece was performed twice to account for potential asymmetry in symptom manifestation: once while being video recorded from a right and once from a left lateral viewing angle of 30°–45° from the midline (Figure 1). This ensured an optimal visualization of the lower face and neck region on each side. For validation purposes, the field of view was confined to this area to ensure anonymization of all participants to the raters.
Scale composition and rating procedure
Scale composition
Based on scorings applied in past neuroimaging studies [4-6], a video-based embouchure dystonia severity rating scale (EDSRS) was set up for evaluation. In brief, the EDSRS is calculated as a composite scale from the audio-visual ratings of six items on a 5-point (0 to 4) Likert scale based on the criteria outlined in Supplementary Figure 2 in the online-only Data Supplement. The six items of the EDSRS correspond to the performance of the three task categories (scales, fourths, and sustained tones) across all registers, separately for each (left and right) body side (i.e., three pieces × two sides). The resulting EDSRS accordingly ranged from 0–24 points, aiming to proportionally describe the severity of impairment due to ED.
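As a minimal sketch of the composition described above, the composite score can be computed as the plain sum of the six item ratings. The function name and item labels are illustrative assumptions and are not part of the published scale; the rating criteria themselves are those of Supplementary Figure 2.

```python
# Hypothetical item labels: three tasks x two viewing sides (names are assumptions).
TASKS = ("scales", "fourths", "sustained_tones")
SIDES = ("left", "right")
ITEMS = tuple((task, side) for task in TASKS for side in SIDES)


def edsrs_total(ratings):
    """Sum six item ratings (each 0 to 4) into the 0 to 24 composite score."""
    missing = [item for item in ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing item ratings: {missing}")
    total = 0
    for item in ITEMS:
        value = ratings[item]
        if value not in (0, 1, 2, 3, 4):
            raise ValueError(f"rating for {item} must be one of 0, 1, 2, 3, 4")
        total += value
    return total


# Made-up ratings for one patient, for illustration only.
example = {
    ("scales", "left"): 2, ("scales", "right"): 3,
    ("fourths", "left"): 1, ("fourths", "right"): 2,
    ("sustained_tones", "left"): 4, ("sustained_tones", "right"): 3,
}
print(edsrs_total(example))  # 15
```

Rejecting incomplete ratings, rather than imputing them, keeps the total on the full 0 to 24 range; how missing items should be handled in practice is not specified in the text.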
Rating procedure
For EDSRS validation, rating was performed by two experts specialized in MD (A.L., E.A.). Prior to rating, appropriate rater training was ensured using selected similar video recordings from a past neuroimaging study [5]. For evaluation of interrater reliability, anonymized patient videos were provided to each rater in a randomized order. To assess intrarater consistency over time, each rater was again provided with the anonymized and newly randomized patient videos. To best avoid recall effects, this second rating was performed > 30 months after the first rating (A.L./E.A. completion at 35/32 months).
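The two-rater design described above can be summarized with intraclass correlation coefficients from a two-way analysis of variance. The following sketch computes average-measures ICCs for absolute agreement and for consistency (ICC(A,k) and ICC(C,k) in McGraw and Wong's notation). Whether these are the exact ICC forms used in this study is an assumption, and the ratings shown are invented for illustration.

```python
def icc_average_measures(ratings):
    """Two-way ICCs for the mean of k raters.

    ratings: one row per patient, each row holding one total score per rater.
    Returns (agreement, consistency).
    """
    n = len(ratings)       # number of patients
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for patients (rows), raters (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_error) / ms_rows
    agreement = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return agreement, consistency


# Invented data: rater 2 always scores exactly one point higher than rater 1.
toy = [[1, 2], [3, 4], [5, 6]]
agreement, consistency = icc_average_measures(toy)
print(round(agreement, 3), round(consistency, 3))  # 0.941 1.0
```

The toy data illustrate why both coefficients are worth reporting: a constant offset between raters leaves consistency at 1 but lowers absolute agreement.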
Audio analysis
F0 analysis was based on a previous approach [7]. In brief, after a signal cleanup that was required due to the acoustic limitations of the video camera microphone recording, the time-varying F0 of the acoustic signal of the sustained tones in the representative middle pitch register (for the respective instrument) was extracted from both the left- and right-side recordings using Harvest [9]. A sustained F0 signal of 1 s without obvious estimation error was extracted, and its standard deviation (SD) was computed. This value was defined as the variable representing the fluctuation of the time-varying F0 signal. The average of the values from the left- and right-side recordings was used as the variable for statistical analysis (Table 1).
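The fluctuation measure can be sketched as follows, starting from an F0 contour that has already been estimated (for example, by Harvest). Several details are assumptions made for illustration: the 5 ms frame period, the use of the population SD, the function names, and the convention that unvoiced or failed frames appear as values of zero or below. Selecting the 1 s segment "without obvious estimation error" is left to the caller via start_s.

```python
from statistics import mean, pstdev


def f0_fluctuation(f0_hz, frame_period_s=0.005, start_s=0.0, duration_s=1.0):
    """SD (in Hz) of a fixed-length segment of an F0 contour."""
    first = round(start_s / frame_period_s)
    count = round(duration_s / frame_period_s)
    segment = f0_hz[first:first + count]
    if len(segment) < count:
        raise ValueError("F0 contour is too short for the requested segment")
    if any(value <= 0 for value in segment):
        raise ValueError("segment contains unvoiced or invalid frames")
    return pstdev(segment)


def patient_f0_variability(f0_left, f0_right, **kwargs):
    """Average the fluctuation values from the left- and right-side recordings."""
    return mean([f0_fluctuation(f0_left, **kwargs),
                 f0_fluctuation(f0_right, **kwargs)])


# Synthetic contours (not patient data): a steady tone and one wobbling by 1 Hz.
steady = [440.0] * 200
wobbly = [439.0, 441.0] * 100
print(patient_f0_variability(steady, wobbly))  # 0.5
```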
Statistical analysis
The following attributes were evaluated, with benchmarks given in brackets: 1) Acceptability: floor/ceiling effects (< 20%) [10] and data distribution characteristics (skewness; from -1 to 1) [11], 2) Internal consistency (Cronbach’s α; > 0.70) [12] and corrected item-total correlation (Pearson’s r > 0.40) [10,11], 3) Interrater reliability expressed by intraclass correlation coefficients (ICC) for the EDSRS (considered satisfactory if > 0.70) [12], and secondarily for the scale’s six items by means of Krippendorff’s α (considered satisfactory if > 0.60) [11,13], 4) Intrarater reliability expressed by ICC for the EDSRS (considered satisfactory if > 0.70) [12], and 5) Precision as estimated through the standard error of measurement (SEM = SD × √(1 − reliability coefficient)).
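Several of these attributes can be computed directly from a patients × items rating matrix. The sketch below covers floor/ceiling proportions, Cronbach's α, and corrected item-total correlations (each item correlated with the sum of the remaining items). Function names and the example matrix are illustrative assumptions, and the study's own software and settings are not specified here.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha from rows of item ratings (one row per patient)."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows):
    """Correlate each item with the total of the other items."""
    k = len(rows[0])
    return [
        pearson_r([row[j] for row in rows], [sum(row) - row[j] for row in rows])
        for j in range(k)
    ]


def floor_ceiling(totals, minimum=0, maximum=24):
    """Proportion of total scores at the scale minimum and maximum."""
    n = len(totals)
    return (sum(t == minimum for t in totals) / n,
            sum(t == maximum for t in totals) / n)


# Invented ratings for five patients on six items (0 to 4 each).
rows = [
    [0, 0, 1, 0, 0, 1],
    [1, 2, 1, 1, 2, 2],
    [2, 2, 3, 2, 3, 3],
    [3, 4, 3, 3, 4, 4],
    [4, 4, 4, 4, 4, 4],
]
print(round(cronbach_alpha(rows), 3))
print([round(r, 2) for r in corrected_item_total(rows)])
print(floor_ceiling([sum(row) for row in rows]))
```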
RESULTS
Data from medical records characterizing the cohort and rating results are given in Table 1 (single item results are additionally provided in Supplementary Table 1 in the online-only Data Supplement). With regard to EDSRS ratings (2 raters × 2 time points), 1) data distribution characteristics by skewness (−0.31 to 0.92), as well as floor (across ratings, 4.4%; range 5.9%–11.8%) or ceiling effects (across ratings, 8.8%; range 0%–17.6%), were within benchmarks across EDSRS ratings. 2) Internal consistency was satisfactory across EDSRS ratings (Cronbach’s α = 0.975–0.983). Each of the six items of the EDSRS reached the 0.40 threshold value for corrected item-total correlations (r = 0.90–0.96). 3) Interrater reliability for the EDSRS was satisfactory (ICC [3,2]: agreement 0.94, 95% confidence interval (CI) [0.64–0.98]; consistency 0.96, 95% CI [0.90–0.99]). For the six items of the scale, reliability was also within benchmark with Krippendorff’s α between 0.64 and 0.90 across ratings, with the highest values for sustained tones (fourths α = 0.64–0.72, scales α = 0.69–0.75, sustained tones α = 0.71–0.90). 4) The intrarater reliability of the EDSRS between the first and second application was also satisfactory (ICC [1,1] 0.93/0.87, 95% CI [0.82–0.97]/[0.68–0.95]). 5) The SEM was below one-third of the SD (SD at baseline 8.15/7.31; SEM 2.19/2.65). 6) Convergent (construct) validity against the measure of fundamental frequency (F0) variability was also adequate (r = 0.55–0.60, p = 0.011–0.023; Supplementary Figure 3 in the online-only Data Supplement).
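As a rough plausibility check of the precision estimates, the standard definition SEM = SD × √(1 − reliability) can be applied to the reported baseline SDs. Using the intrarater ICCs as the reliability coefficient is an assumption made here for illustration. Under that assumption the sketch gives values close to, but not identical with, the reported 2.19 and 2.65; the small differences may reflect rounding of the published inputs or a different choice of reliability coefficient.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), with reliability between 0 and 1."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


# (baseline SD, intrarater ICC, reported SEM) per rater, taken from the text above.
for sd, icc, reported in ((8.15, 0.93, 2.19), (7.31, 0.87, 2.65)):
    sem = standard_error_of_measurement(sd, icc)
    print(f"SD={sd}, ICC={icc}: SEM={sem:.2f} (reported {reported})")
# SD=8.15, ICC=0.93: SEM=2.16 (reported 2.19)
# SD=7.31, ICC=0.87: SEM=2.64 (reported 2.65)
```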
DISCUSSION
In the present study, we validated a composite score (EDSRS) that quantitatively measures ED-related impairment based on an audio-visual rating of patients. Through this score, we aim to overcome the lack of a valid and reliable clinical score for the estimation of symptom severity in ED, which to date is a major obstacle for the structured clinical assessment of this type of MD. By assessing three modes of playing across registers that require different techniques of the embouchure (i.e., sustained notes, scales, and fourths) from both the left and right sides on a five-point Likert scale, the scale considers that playing impairment is not specific to one certain way of playing [2]. The EDSRS showed high internal consistency and interrater and intrarater reproducibility. Together with a low SEM, the EDSRS thus may prove to be a reliable tool for the quantification of ED severity in daily clinical practice. Furthermore, the significant association of the EDSRS with the fluctuation of the fundamental frequency as an objective correlate of ED symptoms [7,16] indicated construct validity. While a purely technical rating of disease severity in ED based on such correlates has also been proposed [8], this is technically challenging in both acquisition and processing and hence to date not applicable in clinical routine; neither an automated application for such sound analysis nor specialized technical equipment for such approaches is broadly available.
For the first time, we present a clinical rating score for ED that fulfills three of four criteria for scores assessing MD proposed by Spector and Brandfonbrener [17]: The EDSRS is 1) reliable and valid, 2) specifically designed for MD since it assesses symptom severity at the instrument with tasks that induce dystonia, and 3) practical in a clinical setting. Indeed, Comella et al. [18] showed that rating scales that are too complex are not considered useful for clinical applications but rather for clinical studies. However, as a limiting aspect, we could not make a statement regarding the fourth proposed criterion, sensitivity to change. One reason is that treatment options for MD are limited and highly individualized. Thus, no standardized intervention exists against which an improvement could be validated. However, future research should aim to address this criterion.
One strength of the EDSRS is that its application takes 5–6 min and that it can be applied during a clinical consultation, which we consider feasible in daily practice. Although not necessary for rating in clinical settings, additional audiovisual recording of the six items of the EDSRS does not require much extra effort or resources, yet allows for the additional assessment of F0 fluctuations and makes the score easily usable for clinical trials. Naturally, setting up the technical equipment for such optional recordings may require some additional time investment beyond solely the EDSRS application time. Furthermore, if recordings are used for blinded or external ratings (e.g., in clinical trials), the scale would have to be applied to these recordings offline after acquisition (as was done in this study).
Our aim was to show that the EDSRS is a valid and reliable tool for assessing ED when applied by experts in musicians’ medicine to whom most of the musicians with ED are referred. One limitation is therefore that we cannot make a statement regarding the generalizability of the EDSRS if it is applied by non-experts in musicians’ medicine. Future studies should aim to broaden the applicability of the scale. Another limitation of the study is that not all phenotypes of ED were present. Future prospective studies may therefore aim to 1) assess the sensitivity of the EDSRS to change, 2) apply it to a larger sample of patients for external validation, ideally including all phenotypes, and 3) assess whether a short version can be derived.
A key challenge in ED is that 12 muscles of the embouchure, laryngeal muscles or the tongue may be involved [2,19], which makes the detection of overt abnormal movement patterns more difficult than in MD of the upper extremity. In the latter condition, the affected fingers can usually be determined by carefully observing the abnormal movements [1], and therefore, this has been a key measure in validated scores for MD of the upper limb as well as in other focal dystonias [3,18]. We addressed this challenge by developing an audiovisual rating scale of performance, which is specifically designed for ED and quantitatively assesses impairment of performance. We showed that this scale is valid and reliable, and we consider it to be suitable for application in everyday clinical routines as well as in clinical studies at clinics specialized in musicians’ medicine.
Supplementary Material
The online-only Data Supplement is available with this article at https://doi.org/10.14802/jmd.22213.
Notes
Conflicts of Interest
The authors have no financial conflicts of interest.
Funding Statement
None
Author Contributions
Conceptualization: Tobias Mantel, André Lee. Data curation: Tobias Mantel. Formal analysis: Tobias Mantel, André Lee, Shinichi Furuya, Masanori Morise. Investigation: Tobias Mantel, André Lee, Shinichi Furuya, Masanori Morise. Methodology: Tobias Mantel, André Lee, Masanori Morise. Software: Masanori Morise, Tobias Mantel. Supervision: Tobias Mantel, Masanori Morise, Eckart Altenmüller, Bernhard Haslinger. Visualization: Tobias Mantel, André Lee. Writing—original draft: Tobias Mantel, André Lee. Writing—review & editing: all authors.
Acknowledgements
We thank all musicians for taking part in this study.