1. Background
The muscle volume is a reliable indicator of the physical capacity of a muscle in force and power generation (1). It considerably changes with aging, pathologies, mechanical loading and exercise, and immobilization (2, 3). This parameter can be used to evaluate the effectiveness of interventions focusing on muscle strengthening and function. Generally, various methods are used to measure the muscle volume (4). Nonetheless, before the application of any measurement method for research or clinical applications, their reliability needs to be established (5). Reliability is defined as the extent to which measurements can be replicated. In other words, it reflects not only the extent of correlation, but also the level of agreement between measurements. Without reliability measurements, we can neither rely on our measurements, nor draw any rational conclusions (6, 7).
The weakness of the intrinsic foot (IF) muscles is an important issue that has been investigated in different deformities. The weakness of these muscles may alter the foot alignment (8); the collapse of the medial longitudinal arch of the foot is an example of foot misalignment (9-11). In this regard, Chang et al. compared the volume of IF muscles between patients with plantar fasciitis and healthy individuals by separating the muscle tissues from non-muscular tissues in magnetic resonance (MR) images. In this study, the muscle borders were digitally marked, the intensity of muscle signals was examined, and the volume of muscles was calculated. The volume of intrinsic muscles in the forefoot of the plantar fasciitis group was 5.2% smaller than that of healthy feet (12).
Hallux valgus (HV) is another deformity associated with muscle weakness. In this deformity, the head of the first metatarsal bone deviates medially, the hallux deviates laterally, and the abductor hallucis (AbdH) muscle is displaced relative to the metatarsophalangeal joint (13). From a clinical perspective, the role of the AbdH muscle is yet to be established. However, previous studies suggest that the AbdH muscle, its distal attachment, and muscle imbalance play a role in the etiology and treatment of HV deformities (14, 15). Measuring the AbdH muscle in HV deformities and investigating the effects of interventions on its volume can provide useful information for the management of this prevalent deformity. Nevertheless, measuring the characteristics of the IF muscles is challenging because of their small size and deep location.
Magnetic resonance imaging (MRI) is considered the reference standard technique for measuring muscle volume (16), as it yields three-dimensional images of the muscle and facilitates the assessment of muscle mass (17, 18). It also provides high-contrast, high-resolution images of soft tissues across multiple planes and enables examination of the anatomical and functional characteristics of foot muscles (19). Many manual, semi-automatic, and automatic techniques are available to examine and segment muscles in MR images (20). Four manual techniques have been used in previous studies to measure the muscle volume. Slice-by-slice segmentation of the muscle cross-sectional area (CSA) is one of these manual techniques and serves as a standard reference method in studies on large muscles (21).
2. Objectives
The reliability assessment of measurements is especially important in examining the effects of clinical treatment, allowing researchers to evaluate between- and within-group changes over time. The manual technique has been mostly used in large muscles to calculate the muscle volume, while the reliability of this technique has not been investigated in the AbdH muscle, especially in HV deformities that greatly affect this muscle. Therefore, the present study aimed to assess the intra- and inter-rater reliability of the manual measurement of the AbdH muscle volume in HV deformities for clinical and research purposes to evaluate the effects of treatment.
3. Patients and Methods
3.1. Study Sample
The MRI images of the right foot of 15 women with HV deformities were acquired in the frontal view. The sample size was estimated based on the hypothesized value of intraclass correlation coefficient (ICC) (0.6) (22), α value of 0.05, and test power of 80% (β = 0.2) for two replicated measurements. All participants signed a written informed consent form. This study was approved by the institutional review board of Iran University of Medical Sciences, Tehran, Iran (IR.IUMS.REC.1399.1037). The participants were screened for medical and orthopedic conditions that would preclude MRI procedures. The inclusion criteria were age of 18 - 44 years and lack of any underlying diseases, diabetes, gout, leprosy, or neurological conditions. Besides, they had no history of foot injuries (e.g., fractures and dislocations). A summary of the participants’ demographic information is presented in Table 1.
Variables | Values (mean ± SD) |
---|---|
Number | 15 |
Age (y) | 30.40 ± 5.56 |
Height (cm) | 162.27 ± 6.57 |
Weight (kg) | 60.80 ± 9.23 |
BMI (kg/m2) | 23.00 ± 3.04 |
Hallux valgus angle (degree) | 18.80 ± 3.46 |
Table 1. The Demographic Information of the Participants
3.2. Image Acquisition
The MRI images of the right foot were acquired using an MRI system (MAGNETOM Symphony 1.5 Tesla, Siemens, Germany) with a one-channel knee coil. The participants lay supine with the foot in a neutral (rest) position, perpendicular to the bed, inside the coil (23). To prevent extra movement during imaging, the foot and ankle were fixed with side pillows. The knee coil was placed on the target foot to achieve the highest resolution without compromising the signal-to-noise ratio (23). The foot was positioned so that the natural shape of the soft tissue was not altered; keeping the foot straight kept the muscle origin and insertion in line. The images were recorded in three planes. The examination period was 26 minutes for each foot (Figure 1).
The MRI images were recorded from January 2021 to October 2021. The images were acquired with the following parameters: repetition time, 540 ms; echo time, 12 ms; number of averages, 3; slice thickness, 3 mm; inter-slice gap, 0 mm; field of view, 240 × 120 mm; flip angle, 90°; and matrix size, 320 × 200. The field of view covered one foot, depending on the foot length from the back part of the heel to the end of the longest toe.
3.3. Muscle CSA Measurements
The cross-sections of the AbdH muscle in the frontal plane were manually outlined once by one of the raters (FD) and twice by the second rater (NM) in the target slices. The raters were trained by a professional to decide on the origin and insertion slices, separate the muscle borders, use the software utilities, measure the CSA on each slice, and calculate the total volume of the muscle. Before independent measurements by the raters, they practiced the method several times together to ensure the uniformity of their measurement technique. The CSA of the AbdH muscle was marked in each cut and measured using the Marco Packs software (Tahavolat Novin Yademan Co., Tehran, Iran), connected to a Siemens device. The entire length of the foot was examined in the frontal view (11). The number of cuts in which the muscle was defined varied from one person to another due to differences in the length of the feet (42 cuts on average).
The measurements were performed from the origin of the muscle on the calcaneus to the insertion of the muscle tissue in the forefoot. The CSA of the muscle in each cut was recorded in mm2 in Microsoft Excel software. A dark band surrounded the AbdH muscle; this is a chemical shift artifact, which occurs where fat surrounds the muscle. Different signal intensities allowed for the separation of muscle tissue from the chemical shift artifact around the muscle compartments (24). Meanwhile, different views of each cut were evaluated to ensure that the outlines were carefully selected (Figure 2). Next, the CSA measurements for all cuts were summed. The total muscle volume was calculated by multiplying the sum of the CSAs by the slice thickness (3 mm) (muscle volume = ∑CSA × 3) (24-27).
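The volume calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (sum of slice CSAs multiplied by slice thickness), which is appropriate here because the inter-slice gap was 0 mm. The function name and the example CSA values are hypothetical and are not study data.

```python
def muscle_volume_mm3(csa_mm2, slice_thickness_mm=3.0):
    """Approximate muscle volume as (sum of slice CSAs) x (slice thickness).

    Assumes contiguous slices (no inter-slice gap), as in the protocol above.
    """
    if slice_thickness_mm <= 0:
        raise ValueError("slice thickness must be positive")
    if any(area < 0 for area in csa_mm2):
        raise ValueError("CSA values must be non-negative")
    return sum(csa_mm2) * slice_thickness_mm


# Hypothetical CSA values (mm^2) for five consecutive slices; not study data.
example_csa = [40.0, 85.5, 120.0, 90.5, 35.0]
example_volume = muscle_volume_mm3(example_csa)
print(f"Volume: {example_volume:.1f} mm^3")
```

With a non-zero inter-slice gap, the multiplier would need to be the slice spacing (thickness plus gap) rather than the thickness alone.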
3.4. Reliability of Measurements
The manual slice-by-slice CSA segmentation of muscles is a standard method used in previous studies; however, it is a very time-consuming procedure due to the examination of all slices. Compared to other techniques, this technique can provide more accurate and detailed information (17). In this study, two trained raters outlined the AbdH muscle CSAs in the target cuts (55 cuts in the frontal view); this process was performed by both raters for each image separately (23); the raters were blind to the findings of one another. Next, the reliability of this method was analyzed for the two raters. Regarding the intra-rater reliability, the second rater repeated the measurements five days after the initial measurements (on average) to eliminate the memory effect (28).
3.5. Statistical Analysis
The intra-rater reliability and inter-rater reliability were assessed in SPSS version 21.0 (IBM Corp. Released 2012. IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp.) by calculating the ICC. The reliability coefficients range from zero to one, with values closer to one representing higher reliability (29). The mean values and standard deviations were calculated for all variables. The ICC was calculated using the two-way random-effects, absolute-agreement, single-measurement model [ICC (2, 1)], with 95% confidence intervals, to evaluate the extent of agreement between the raters. In this model, the raters are regarded as representative of a larger population of raters, and reliability is based on a single measurement rather than on the mean of several measurements (28, 29). According to Portney and Watkins (2009), ICC values below 0.5 indicate poor reliability, values of 0.5 - 0.75 suggest moderate reliability, values of 0.75 - 0.90 suggest good reliability, and values > 0.90 represent excellent reliability (7). The standard error of measurement (SEM) was calculated by the following formula (Equation 1) to estimate the expected error value in measurements:
$$\mathrm{SEM} = \mathrm{SD} \times \sqrt{1 - \mathrm{ICC}} \tag{1}$$
SD, standard deviation.
The minimal detectable change (MDC) was also measured according to the following formula (Equation 2) (30):
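$$\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times \mathrm{SEM} \tag{2}$$
where 1.96 is the z-score corresponding to a 95% CI. CI, confidence interval.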
The MDC estimates the smallest change that exceeds the measurement error at the 95% confidence level; an observed change between two tests that is larger than the MDC can be interpreted as a real difference rather than a measurement error (30).
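As a rough illustration of the ICC (2, 1) model described above, the sketch below computes it from the mean squares of a two-way ANOVA without replication (subjects × raters), following the standard Shrout and Fleiss formulation. It is not the SPSS routine used in the study. The function names and the example ratings are hypothetical, and the handling of the boundary values 0.75 and 0.90 in the interpretation helper is an assumption.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one per subject; each row holds one value
    per rater (or per session, for intra-rater reliability).
    """
    n = len(ratings)                      # number of subjects
    k = len(ratings[0]) if n else 0       # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete n x k table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def interpret_icc(icc):
    """Label an ICC using the thresholds cited in the text (boundaries assumed)."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical volumes (mm^3) for four subjects measured by two raters.
example = [[9800.0, 9650.0], [12100.0, 12300.0], [8400.0, 8500.0], [10900.0, 10700.0]]
value = icc_2_1(example)
print(round(value, 3), interpret_icc(value))
```

Because the absolute-agreement form keeps the between-rater mean square in the denominator, a systematic offset between raters lowers the coefficient even when their measurements are perfectly correlated.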
4. Results
4.1. Inter-rater Reliability
Table 2 presents the results of reliability analysis between the two raters for 15 samples of the right foot with HV deformity. The descriptive data (mean and standard deviation) are also reported for the measurements. The ICC for the inter-rater reliability of the AbdH muscle volume measurement was excellent (0.92). Also, the SEM was small (621.13 mm3), indicating good precision of the measurements.
Measure | Volume measurement by rater 1 (mm3) (mean ± SD) | Volume measurement by rater 2 (mm3) (mean ± SD) | ICC (95% CI) | SEM (mm3) | MDC (mm3) | P-value |
---|---|---|---|---|---|---|
AbdH muscle volume | 10262.07 ± 2196.03 | 10093.73 ± 2154.01 | 0.92 (0.79 - 0.97) | 621.13 | 1721.68 | 0.00 |
Table 2. The Results of Inter-rater Reliability Analysis
4.2. Intra-rater Reliability
The results of intra-rater reliability analysis for 15 samples of the right foot with HV deformity are presented in Table 3. The intra-rater reliability was also found to be excellent (0.99). The SEM value for the intra-rater reliability was lower than that of the inter-rater reliability (215.40 mm3).
Measure | Volume measurement by rater 2 (mm3) (mean ± SD) | Volume measurement by rater 2, repeat (mm3) (mean ± SD) | ICC (95% CI) | SEM (mm3) | MDC (mm3) | P-value |
---|---|---|---|---|---|---|
AbdH muscle volume | 10093.73 ± 2154.01 | 9954.69 ± 2123.89 | 0.99 (0.97 - 0.99) | 215.40 | 597.05 | 0.00 |
Table 3. The Results of Intra-rater Reliability Analysis
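The SEM and MDC values in Tables 2 and 3 can be checked with a few lines of arithmetic. The sketch below assumes the commonly used formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. The function names are illustrative, and the choice of SD (the first listed measurement in each table) is inferred from the reported numbers rather than stated in the text.

```python
import math


def sem(sd, icc):
    """Standard error of measurement: SD x sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at the 95% level: 1.96 x sqrt(2) x SEM."""
    return 1.96 * math.sqrt(2.0) * sem_value


# Inputs taken from Table 2 (inter-rater) and Table 3 (intra-rater).
inter_sem = sem(2196.03, 0.92)
intra_sem = sem(2154.01, 0.99)

print(f"inter-rater: SEM = {inter_sem:.2f} mm3, MDC95 = {mdc95(inter_sem):.2f} mm3")
print(f"intra-rater: SEM = {intra_sem:.2f} mm3, MDC95 = {mdc95(intra_sem):.2f} mm3")
```

Under these assumptions, the computed values agree with the tabulated SEM and MDC to within rounding.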
5. Discussion
This study aimed to evaluate the inter- and intra-rater reliability of a manual method used for measuring the AbdH muscle volume based on the MRI images of feet with HV deformity for research and clinical purposes. Before interpreting the results, it is necessary to evaluate the reliability of methods used for measuring the characteristics of muscles responsible for the formation of HV deformities. The ICCs for inter-rater and intra-rater reliability indicated excellent reliability. The SEM% for inter- and intra-rater agreement was estimated at 6.2% and 2.1%, respectively, which is comparable to the results of previous studies. In this regard, in a study by Franettovich Smith et al., the SEM% of inter- and intra-rater agreement was 4% and 6%, respectively, for the CSA measurement of the AbdH muscle by ultrasound (31).
Moreover, based on the findings reported by Jung et al., the SEM% was estimated at 3.8% (32); nevertheless, it should be noted that both of these studies used ultrasound imaging. Generally, the SEM value represents the measurement error (33). An error may occur while detecting the exact location and borders of the muscle among other intrinsic muscles (33). The manual tracing of borders can also influence the measurements. Besides, the resolution of MRI images is an important factor that may affect the precision with which muscle borders are identified. The two raters were trained in several sessions, during which reference images, such as anatomical atlases of foot muscles, were used to determine the exact path and borders in different cuts; the prior experience of the raters in such measurements may explain the high ICC and low SEM values (28).
Additionally, the inter- and intra-rater MDC95 values were estimated at 17.2% and 5.9%, respectively, in the present study; the MDC represents the smallest change that exceeds the measurement error in research or clinical applications. Therefore, when a single rater applies this measurement technique, changes in the muscle volume smaller than 5.9% cannot be distinguished from measurement error. This finding is in line with the results of a study by Jung et al., which showed significant changes in the AbdH muscle CSA on ultrasound images after two types of interventions (32). Moreover, Hing et al. evaluated the reliability of two ultrasound machines and found that a change greater than 21.25% is needed to be 95% confident that a real change has occurred in the AbdH muscle CSA (34).
Similarly, Lund et al. examined inter- and intra-rater differences in a manual method for measuring the volume of the dorsal ankle muscles (tibialis anterior, extensor digitorum longus, and extensor hallucis longus) in MRI images. Their study aimed to determine the number of slices needed for calculations and reported excellent inter- and intra-rater reliability (0.98 - 1.0) (16). These muscles are larger than the deep foot muscles, which may make it easier to identify and follow their path. In a validity and reliability study of a semi-automatic method for discriminating adipose tissue, subcutaneous fat, and intrinsic muscles of the foot, the ICC was mostly above 0.95, which indicated a high level of agreement among therapists (23).
Also, Pons et al. examined the validity and reliability of automatic, semi-automatic, and manual techniques for measuring the muscle volume based on MRI images in healthy populations. For cases of muscle pathology, more data on the metrological quality of these techniques are required. In addition, techniques that simplified the segmentation introduced errors in volume and shape estimation (20). Previous research has investigated the reliability of slice-by-slice measurements. The intra-rater reliability was good to excellent in four studies (0.7 - 1.0) (21, 33, 35, 36), and the inter-rater reliability was moderate to good in eight studies (0.5 - 0.89) (10, 21, 33, 35-39). Seven studies used manual methods to calculate the total volume of muscles by summing up the measured CSAs in all slices, similar to the method used in the current study (33, 35, 36, 38, 40-42). However, to the best of our knowledge, no study has yet evaluated this manual method for measuring the intrinsic foot muscles, especially the AbdH muscle.
After muscle segmentation, seven methods were used to calculate the muscle volume. The volume calculations themselves did not introduce measurement error; the error was related to muscle segmentation (20).
In previous studies, the IF muscles, which are located deep within several layers, were commonly classified in groups due to their small and irregular size (11, 23, 24). Separating a particular muscle from the adjacent intrinsic muscles is somewhat difficult. Identifying the first and last slices of a muscle requires particular care, as measurement error is most likely there. However, this is not an issue in the middle slices, where the muscle borders are easily separable. Measuring the volume of the IF muscles is challenging because they are arranged in a four-layer complex, which makes it very difficult to differentiate one muscle from the others (43).
In individuals with HV deformities, the path of the AbdH muscle may be displaced below the head of the first metatarsal bone, depending on the severity of deformity (14, 15). Following changes in the muscle anatomy and biomechanics in individuals with HV deformities, muscle imbalance will develop between the abductor and adductor muscles of the hallux (15). Based on the results of a study by Stewart et al., significant changes were observed in the mediolateral width, dorsoplantar thickness, and CSA of the AbdH muscle between feet with and without HV based on ultrasound data. However, no significant changes were observed in different degrees of deformity (44). The reliability analysis of the AbdH muscle volume measurement in HV patients provides an important opportunity to gain further insight into the effects of interventions and strategies that focus on improving the strengths and functions of this small muscle by monitoring any related changes.
This study has some limitations. Because image segmentation is time-consuming, each measurement was performed only once by each rater; therefore, absolute agreement based on single measurements was investigated, and average-measures reliability was not reported. The lack of a comparison between the manual technique and automatic techniques is another limitation.
In conclusion, the inter- and intra-rater reliability of the AbdH muscle volume measurement based on slice-by-slice examination in MRI images was found to be excellent. Therefore, it can be used as a reproducible method to measure changes in the AbdH muscle volume in various treatments or research applications. Given the excellent intra-rater reliability and the lower SEM% of intra-rater measurements, it is preferable for a single rater to perform the measurements in comparative studies. Further research with a larger sample size is recommended.