Persian Handwriting Assessment Tool: Reliability in Students with Specific Learning Disorders

Background: Handwriting difficulty is one of the most common reasons for referral to occupational therapy among children with specific learning disorders (SLDs). The Persian handwriting assessment tool (PHAT) is a valid assessment instrument. Clarifying its reliability is important for the accuracy of results and for clinical use in Iranian children with SLDs.
Objectives: The present study aimed to investigate the internal consistency, test-retest reliability, and inter-rater reliability of the PHAT in children with SLDs aged 10 to 12 years in the Iranian context.
Methods: Thirty children (mean ± SD age: 132.33 ± 53.8 months) with SLDs, studying in grades 4 to 6, were recruited from special education schools and rehabilitation clinics from January to May 2022. Cronbach's alpha was calculated to determine internal consistency, and the intraclass correlation coefficient (ICC) was calculated to determine test-retest and inter-rater reliability. The standard error of measurement (SEM) and minimal detectable change (MDC) were computed to establish absolute reliability.
Results: Internal consistency was excellent (α = 0.98 to 0.99), as was inter-rater reliability (ICC = 0.95 to 1.00). Test-retest reliability was good to excellent (ICC = 0.86 to 1.00). The SEM and MDC values for test-retest reliability were 0 to 0.47 and 0 to 1.29, respectively. Finally, the SEM (0 to 0.21) and MDC (0 to 0.57) values were acceptable for inter-rater reliability.
Conclusions: The PHAT is a reliable assessment tool for Iranian children with SLDs aged 10 to 12 years. Clinicians can use this tool to identify handwriting difficulties in this group, which supports more targeted interventions.


Background
Handwriting is a tangible manifestation of human language expression and a complex psychomotor activity (1). It is a means of communication used in many settings, including education (2,3). Occupational therapists, given their specialization in motor skills, accord particular significance to handwriting (4). Notably, 80% of fine motor activities in students involve paper-and-pencil tasks (3). Proficiency in handwriting matters beyond the classroom: it affects a child's self-esteem and behavior, and persistent difficulties can give rise to problems such as obstinacy and poor communication (5). When handwriting problems persist and impede academic progress, students are frequently referred to occupational therapists for specialized assessment and intervention (6). In this context, standardized assessment tools are indispensable for analyzing handwriting problems in detail and for tailoring sensory, motor, cognitive, behavioral, and adaptive interventions (7). A standardized assessment tool is therefore necessary for accurate handwriting evaluation.
Specific learning disorders (SLDs) play a significant role in handwriting difficulties. These are neurodevelopmental disorders that affect a child's ability to learn and use academic skills, such as reading, writing, and math, despite adequate intelligence, motivation, and educational opportunities. Specific learning disorders often go unrecognized and can manifest as a disability in handwriting, spelling, and/or composition skills during child development (8).
Children with SLDs may experience difficulties in various aspects of handwriting, including letter formation, spacing, size, and alignment, as well as speed and legibility. These difficulties can affect their academic performance, self-esteem, and social interactions (9).
Handwriting assessment tools have been developed for several languages, including English, Spanish, Korean, and Hebrew (10)(11)(12)(13). The Persian handwriting assessment tool (PHAT), developed by Havaei et al. (14), underwent initial validation in typically developing (TD) children aged 8 to 10 years and in those with SLDs (14)(15)(16)(17). Havaei et al. reported good to excellent reliability for the PHAT in elementary school children in grades 2 and 3 (14). Furthermore, Meimandi et al. documented good to excellent reliability for the PHAT in children aged 8 to 10 years with SLDs (17). While these studies offered valuable insights, they examined the psychometric properties of the PHAT within a limited age range. To establish its applicability across age groups and learning disorders, its reliability must be investigated in other contexts (18).
Among the population requiring handwriting intervention, children with SLDs, such as dysgraphia, are a predominant group. Dysgraphia is a specifier encompassing difficulties in writing, including spelling, grammar, punctuation, and handwriting (19). For the majority of children with SLDs, handwriting challenges tend to persist across developmental stages and grades. Previous research on the reliability of the PHAT, however, has predominantly involved children aged 8 to 10 years.

Objectives
The present study addresses this research gap by examining the internal consistency, test-retest reliability, and inter-rater reliability of the PHAT in 10- to 12-year-old students with SLDs. It thereby underscores the importance of assessing the PHAT's reliability in different groups, with particular emphasis on older students with handwriting difficulties.

Participants
In this cross-sectional, methodological study, 30 students (boys, N = 15) aged 10 to 12 years with SLDs in grades 4 to 6 were recruited from 5 learning disorder centers in Tehran, Iran, from January to May 2022. These 5 centers were selected for their reputation and expertise in diagnosing and supporting students with SLDs, with the aim of obtaining a sample representative of the target population.
The sample size of 30 was determined following established guidelines in the literature for assessing reliability (20). The inclusion criteria were a diagnosis of SLD by a child and adolescent psychiatrist, normal intelligence quotient (IQ ≥ 70), being monolingual, absence of other developmental or neurologic comorbidities, absence of uncorrectable visual and hearing problems, and no history of repeating educational grades. These criteria were verified through a thorough review of academic records, rehabilitation history, and clinical evaluations.
Participants were excluded if they had concentration difficulties due to high levels of stress or medication use, as identified through clinical evaluation and observation by qualified professionals. According to the literature, a minimum of 30 participants is required for examining reliability (21). Ethical approval was obtained from the Ethics Committee of Iran University of Medical Sciences (IR.IUMS.REC.1400.876).
All the participants provided informed consent before being included in the study.

Measure
The PHAT was administered for data collection. It evaluates handwriting legibility and speed in the copying domain and legibility and orthographic error in the dictation domain. Students were asked to transcribe 12 written words and to write another 12 words dictated by the examiner. The estimated scoring time was 15 minutes per student. The time taken to copy the 12 words (speed) was recorded and used to calculate the number of letters written per minute, $x$, from the proportion $\frac{\text{number of letters}}{\text{number of seconds}} = \frac{x}{60}$. Orthographic errors were recorded in the dictation domain as the number of incorrectly written words. Legibility has 4 components: word formation, spacing, alignment, and text slant. Each word was scored on a 5-point Likert scale (from 1: very poor to 5: very good). Size was scored differently, from 1: very small to 5: very large, with a score of 3 considered the best score. Finally, the average score of the 12 words in each of the copying and dictation domains was taken as the participant's total score for each component of legibility (14).
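The speed and component-score calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample numbers are hypothetical, not part of the PHAT manual, and the size component (where 3 is the best score) is not treated specially here.

```python
from statistics import mean


def letters_per_minute(num_letters: int, seconds: float) -> float:
    """Solve letters/seconds = x/60 for x, the letters written per minute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_letters * 60 / seconds


def component_score(word_ratings: list[int]) -> float:
    """Average the per-word Likert ratings (1 to 5) for one legibility component."""
    if not word_ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in word_ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(word_ratings)


# Hypothetical student: 48 letters copied in 90 seconds.
speed = letters_per_minute(48, 90)

# Hypothetical formation ratings for the 12 copied words.
formation = component_score([3, 4, 5, 4] * 3)
print(speed, formation)
```

Keeping each legibility component as its own average mirrors the tool's design, in which no overall total score is computed.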

Procedure
First, study procedures were explained, and demographic data were collected. Second, students were seated at a desk, with the desk and chair matched to the participant's height to control ergonomic factors. Students were required to maintain proper posture and sufficient upper extremity stability. The equipment used to administer the test included a hard black (HB) pencil, an eraser, a pencil sharpener, a clipboard, a nonslip cover, a stopwatch, and preprinted A4 lined paper. The clipboard was slanted to give the participants a better pencil grasp and to keep the forearm of the writing hand parallel to the table. There was no practice trial. The examiner stood opposite the students and asked them to read the words aloud and transcribe them on the bottom lines of the paper without using hyphens between words. In the dictation domain, the examiner dictated 12 words and asked the students to write them on lined paper without time limitations. All the participants completed the PHAT between 10 a.m. and 12 p.m. in a well-lit room with suitable ventilation. An experienced occupational therapist administered the PHAT twice with a 2-week interval. According to similar articles and the instrument manual for handwriting assessment, an interval of more than 2 weeks between test and retest is not recommended because of developmental changes (15). Two other occupational therapists, each with at least 5 years of experience working with students with SLDs, performed the scoring to determine inter-rater reliability. Both raters received scoring training and scored under the same conditions of location and time.

Statistical Analysis
According to the Shapiro-Wilk test, the data had a normal distribution. Cronbach's alpha coefficient was used to determine internal consistency, with a value of > 0.7 considered the minimum acceptable value (22).
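For readers who want to see how an internal-consistency coefficient of this kind is obtained, the following is a minimal sketch of the standard Cronbach's alpha formula, $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_j s_j^2}{s_T^2}\right)$, where $k$ is the number of items, $s_j^2$ the variance of item $j$, and $s_T^2$ the variance of the summed scores. The study itself used SPSS; which scores were entered as "items" is not specified in the text, so the data layout and the sample ratings below are assumptions for illustration only.

```python
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha for scores[i][j] = score of participant i on item j."""
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 participants and 2 items")
    # Sample variance of each item across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each participant's summed score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("summed scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 5 participants, 3 items.
ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2), alpha > 0.7)  # compare against the 0.7 threshold
```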
Test-retest reliability and inter-rater reliability were estimated with the intraclass correlation coefficient (ICC), using a two-way mixed model with absolute agreement. ICC values higher than 0.8 were taken to represent acceptable reliability (21).
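As an illustration of what an absolute-agreement ICC measures, the sketch below computes the single-measures, two-way, absolute-agreement coefficient from the mean squares of a two-way layout: $\mathrm{ICC} = \frac{MS_R - MS_E}{MS_R + (k-1)MS_E + \frac{k}{n}(MS_C - MS_E)}$. The text does not state whether single or average measures were reported, so the single-measures form is an assumption, and the function name and sample data are hypothetical. The analyses in the study were run in SPSS.

```python
def icc_absolute_agreement(ratings: list[list[float]]) -> float:
    """Single-measures, two-way, absolute-agreement ICC.

    ratings[i][j] = score of subject i from rater (or session) j.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical scores: rater 2 is always exactly 1 point above rater 1.
offset = [[1, 2], [2, 3], [3, 4], [4, 5]]
print(icc_absolute_agreement(offset))
```

In this example the two raters rank the subjects identically, yet the coefficient falls below 1 because absolute agreement penalizes the constant offset between raters. That is the reason for preferring it over a consistency-type ICC when the scores themselves, not just their ordering, are used clinically.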
The absolute reliability of the PHAT was estimated through the standard error of measurement (SEM) and the minimal detectable change (MDC). The MDC represents the smallest change in a subject's score that is beyond measurement error. A SEM value of less than half the standard deviation (SD) was used as the criterion for acceptability (23).
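The text does not give the formulas used for absolute reliability. A minimal sketch, assuming the commonly used definitions $\mathrm{SEM} = SD\sqrt{1 - \mathrm{ICC}}$ and $\mathrm{MDC}_{95} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$, is shown below. The function names and sample inputs are hypothetical.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    if not 0 <= icc <= 1:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1 - icc)


def mdc95(sem_value: float) -> float:
    """Minimal detectable change at 95% confidence, assuming 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem_value


def sem_is_acceptable(sem_value: float, sd: float) -> bool:
    """Apply the criterion stated in the text: SEM below half the SD."""
    return sem_value < sd / 2


# Hypothetical component with SD = 1.0 and ICC = 0.96.
example_sem = sem(1.0, 0.96)
print(round(example_sem, 2), round(mdc95(example_sem), 2))
print(sem_is_acceptable(example_sem, 1.0))
```

An ICC of 1 gives a SEM and MDC of 0 under these formulas, which would match the lower bounds of 0 reported for components with perfect agreement.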
All the analyses were conducted in SPSS v. 27 (IBM Corp., Armonk, NY, USA).

Results
Thirty students in grades 4 to 6 participated, with an equal distribution of boys (N = 15) and girls (N = 15). Their mean age was 132.33 months, with a SD of 53.8 months. The grade distribution was as follows: 16 participants were in the fourth grade (53.3%), 7 in the fifth grade (23.3%), and 7 in the sixth grade (23.3%). In terms of handedness, 80% of the participants were right-handed (N = 24), and 20% were left-handed (N = 6). Only 13.3% of the participants wore glasses (N = 4), while the majority (86.7%) did not (N = 26). None of the participants used hearing aids (0%), as all had normal hearing (N = 30). A detailed summary of the demographic characteristics of the students with SLDs is provided in Table 1 (at the end of the manuscript).
Cronbach's alpha was excellent in the copying (α = 0.98) and dictation (α = 0.99) domains (Table 2 at the end of the manuscript).
Test-retest reliability was 0.86 to 1 in the copying domain and 0.95 to 1 in the dictation domain, indicating good to excellent reliability. The SEM values ranged from 0 to 0.28 for legibility components and speed in the copying domain, with MDC values ranging from 0 to 0.77 and 0 to 1.29, respectively. In the dictation domain, SEM values ranged from 0 to 0.16 for legibility components and 0.38 for orthographic error, with MDC ranging from 0 to 0.44 and 0 to 1.24, respectively (Table 3 at the end of the manuscript).
Inter-rater reliability, as measured by the ICC, ranged from 0.96 to 1 in the copying domain and 0.95 to 1 in the dictation domain. The SEM values were satisfactory, ranging from 0 to 0.21 for legibility components in copying and 0 to 0.2 in dictation. The MDC values ranged from 0 to 0.57 in copying and 0 to 0.55 in dictation (Table 4 at the end of the manuscript).
These findings demonstrate that the PHAT exhibits robust reliability in assessing handwriting legibility, speed, and orthographic error in both copying and dictation domains. The small SEM and MDC values indicate low measurement error and show that small changes in handwriting performance can be detected, further substantiating the tool's reliability and precision (Tables 3 and 4). These results align with established criteria for reliability assessment, emphasizing the PHAT's utility as a valid and consistent tool for handwriting evaluation.

Discussion
Reliability is an important feature of assessment tools and should be examined before administration. For this purpose, the reliability of the PHAT for students with SLDs aged 10 to 12 years was investigated. The results of the current study indicated good to excellent internal consistency, test-retest reliability, and inter-rater reliability.

Internal Consistency
The legibility components (i.e., formation, space, alignment, and size) of the PHAT in both copying and dictation domains showed excellent internal consistency. Havaei et al. reported good to excellent (α = 0.84-0.99) internal consistency for the PHAT in TD children aged 8 to 10 years (15). Moreover, Meimandi et al. reported good to excellent (α = 0.8-0.98) internal consistency in children with SLDs aged 8 to 10 years (16). The literature shows good to excellent internal consistency for handwriting instruments that measure legibility and speed with a Likert scoring scale. No total score is calculated for the PHAT; each component is scored separately. Despite the complex nature of handwriting, this method of evaluating handwriting legibility and speed appears to have resulted in excellent internal consistency. Rosenblum reported good (α = 0.9) internal consistency for the Handwriting Proficiency Screening Questionnaire (HPSQ) in TD children. The HPSQ evaluates handwriting components such as legibility, performance time, and physical and emotional well-being (25). Additionally, the Handwriting Legibility Scale (HLS) designed by Barnett et al. measures global legibility, the effort required to read the script, layout on the page, letter formation, and alterations to the writing, and has excellent (α = 0.92) internal consistency (12). Further, Hong et al. reported good (α = 0.74) internal consistency for a handwriting test for preschool children in TD children aged 5 to 6 years. This test evaluates speed, accuracy, and construction in dictation and spontaneous writing (26). Li-Tsang et al. reported moderate internal consistency for the Chinese handwriting assessment tool (CHAT) and suggested that the complex nature of handwriting and the different criteria in the CHAT, such as pencil grip, the amount of pressure on the pencil, and legibility, may have led to this result (27). The main reason for the excellent internal consistency of the PHAT appears to be its use of homogeneous components to evaluate legibility and speed in handwriting. Moreover, this tool does not evaluate the sensory, perceptual, and motor prerequisites of handwriting.

Test-Retest Reliability
The test-retest reliability of the PHAT was investigated with a two-week interval. The results revealed good to excellent (ICC = 0.86-1) reliability in all components of the copying and dictation domains. Small SEM values in the copying and dictation domains indicated that the PHAT is a practical tool for identifying real changes in handwriting. Havaei et al. (15) reported good to excellent (ICC = 0.87-1) test-retest reliability for the PHAT in TD children. Furthermore, Meimandi et al. (17) found good to excellent (ICC = 0.75-0.98) test-retest reliability for the PHAT in children with SLDs. Both studies examined test-retest reliability with a two-week interval. Similarly, Li-Tsang et al. reported good to excellent test-retest reliability for handwriting speed, accuracy, and pen pressure for the CHAT (27). They attributed this finding to the rating of the tool by an experienced examiner. In the present study, too, the PHAT was rated by an experienced occupational therapist with 5 years of research and clinical experience with children with SLDs. This may be one of the reasons for the acceptable reliability of subjective tools with Likert scaling, such as the PHAT. The therapist's knowledge and experience are of paramount importance in scoring handwriting.
Test-retest reliability for text slant in both copying and dictation domains was 1. A possible explanation is that overall text slant does not change within a two-week time frame, since this handwriting component relies on spatial perception skills. Duff and Goyen reported the test-retest reliability of the evaluation tool of children's handwriting-cursive (ETCH-C) to be below the expected criterion. They attributed this result to the long time frame (i.e., 4 weeks). Furthermore, the participants were 5 and 6 years old, and given the developmental process, this interval can be educationally decisive for these children. Practice during this period may improve handwriting and, hence, lead to inconsistent results (28). Lee et al. explored the test-retest reliability of the Korean handwriting assessment for children using digital image processing in 4 parts (i.e., consonant-vowel, word, sentence, and total score). Even with nonsubjective scoring, reliability was good to excellent in a two-week time frame (11). Correspondingly, Rosenblum and Gafni-Lachter (25), Salameh-Matar et al. (29), Barnett et al. (12), and Hong et al. (26) reported good to excellent test-retest reliability for their respective tools. The time interval in all the aforementioned studies was 2 weeks, and scoring was done by an experienced occupational therapist.
Consequently, determining the appropriate time interval between test and retest appears crucial for measuring the reliability of handwriting instruments in elementary school children. It appears that the 2-week interval contributed to the good to excellent test-retest reliability of the PHAT. The two-week interval was chosen because handwriting prerequisite skills, such as visual perception and fine motor skills, may change over longer periods through development and practice (30). The brain is developing at this age, and these skills develop quickly. Hence, these skills, and handwriting with them, may change over an interval of more than 2 weeks.

Inter-Rater Reliability
Inter-rater reliability was excellent in the present study. In addition, small SEM values in both copying and dictation domains indicated that the PHAT is a feasible and practical tool for identifying real changes in handwriting. According to Havaei et al.'s findings, the PHAT has good to excellent (ICC = 0.7-1) inter-rater reliability in TD children (15). The inter-rater reliability was higher in the present study, which may be due to the subjective nature of the scoring procedure and the raters' experience. Meimandi et al. reported excellent (ICC = 0.86-0.95) inter-rater reliability between teachers and good to excellent (ICC = 0.60-0.95) inter-rater reliability between teachers and occupational therapists (16). These findings indicate lower agreement between the two specialties. Daniel and Froude found a lack of inter-rater reliability between pediatric occupational therapists and teachers (31). They explained that this may be due to not using a standard tool and relying only on a Likert scale to rate the quality of handwriting. Moreover, the lack of agreement may result from different levels of expertise and specialty. These findings underline the importance of using standard assessment tools.
In the PHAT, each subject completes the assignments once, and the examiners rate the handwriting. Therefore, the environmental and emotional factors of the examinee do not affect inter-rater reliability. Additionally, the most important factor behind the excellent inter-rater reliability in the present study appears to be the raters' similar expertise and specialty. Because both raters were occupational therapists, their judgment criteria were close to each other, and both had received similar training in the analysis of handwriting. For example, an occupational therapist analyzes handwriting in terms of visual perception and fine motor aspects, whereas a teacher may not take this point of view.
The main advantage of using the PHAT in research and clinical practice is that this tool was developed specifically for the Persian language, and its high reliability for a wider age group helps expand its use. Another advantage is that the PHAT is a comprehensive assessment tool that analyzes Persian handwriting in terms of different aspects of legibility, speed, and orthographic errors. Furthermore, the PHAT can be administered in educational settings. Teachers and other educators can incorporate this tool into their assessments and interpret handwriting problems in detail, so that educational plans are more targeted for students with SLDs.

Suggestions for Future Studies
We suggest that future studies explore the discriminant validity of the PHAT in TD children and children with SLDs or other developmental disabilities and investigate its potential as a screening tool for handwriting difficulties. Furthermore, future studies could investigate the construct validity of the PHAT against fine motor and visual perception assessment tools, as these are the areas most closely related to handwriting. We also propose comparing PHAT results with other established measures of handwriting and exploring its sensitivity in detecting changes in handwriting skills over time with intervention.

Comparison with Other Handwriting Assessment Tools
To comprehensively evaluate the reliability and utility of the PHAT, it is valuable to compare its performance with that of other established handwriting assessment tools validated in similar age groups and populations. This comparative analysis sheds light on the unique strengths and contributions of the PHAT in assessing handwriting difficulties, particularly within the context of SLDs.

The Evaluation Tool of Children's Handwriting-Cursive
The ETCH-C is a widely recognized tool for assessing handwriting, particularly cursive writing. Duff and Goyen reported the test-retest reliability of the ETCH-C to be below the expected criterion, with a time frame of 4 weeks. They explained that this result could be due to the extended interval and the age of the participants (5- and 6-year-olds). In contrast, the PHAT demonstrated good to excellent test-retest reliability (ICC = 0.86-1) within a 2-week interval in children aged 10 to 12 years with SLDs. This suggests that the PHAT may provide more reliable results within a shorter time frame and for older children (28).

The Handwriting Legibility Scale
The HLS, designed by Barnett et al., measures global legibility, layout on the page, letter formation, and alterations to the writing. It has excellent internal consistency (α = 0.92) (12). While the HLS focuses on various aspects of legibility, the PHAT evaluates specific components, such as formation, spacing, alignment, and size, separately. This separation of components allows the PHAT to offer a more detailed assessment of handwriting, potentially making it a valuable tool for pinpointing specific areas of difficulty (12).

The Handwriting Proficiency Screening Questionnaire
The HPSQ assesses handwriting components such as legibility, performance time, and physical and emotional well-being (4). It had good internal consistency (α = 0.9) in TD children. In comparison, the PHAT not only assesses legibility and speed but also separates these components into distinct categories. This detailed approach to assessment can provide a more precise understanding of a child's handwriting difficulties (10).

The Developmental Test of Visual-Motor Integration
The developmental test of visual-motor integration (VMI) is often used to assess visual-motor integration skills, which are closely related to handwriting (5). While the VMI serves a different purpose than the PHAT, a comparative analysis could reveal the unique contributions of each tool in assessing the various aspects of handwriting and its related skills.
In summary, the PHAT stands out for its ability to provide detailed assessments of handwriting legibility and speed in children aged 10 to 12 years with SLDs. Its excellent internal consistency, test-retest reliability, and inter-rater reliability, especially within a 2-week time frame, make it a practical tool for identifying real changes in handwriting. By separating components and focusing on specific criteria, the PHAT offers a nuanced evaluation of handwriting difficulties, potentially assisting clinicians and educators in tailoring interventions to individual needs. Future research may examine these comparisons more closely to clarify the unique strengths of the PHAT in handwriting assessment.

Clinical Implications
Given the high reliability of the PHAT, researchers and clinicians can use this instrument in children with SLDs aged 10 to 12 years, although the test was developed for a population aged 8 to 10 years. The high reliability of the PHAT contributes to more accurate assessments and targeted interventions for children with SLDs who experience handwriting difficulties in late childhood.

Limitations
The main limitation of the present study is that participants were recruited during the coronavirus disease 2019 (COVID-19) pandemic, which led to online education and nonattendance of students in schools. Some participants did not return for the retest and were excluded. Furthermore, convenience sampling was used, and many participants were excluded. As a result, the sampling process was extended. However, as the participants studied in different schools, the sample may be representative of the population, and the results may generalize to children with SLDs.
While the PHAT demonstrates excellent inter-rater reliability, the use of a Likert scale introduces a degree of subjectivity into the scoring process. Despite efforts to standardize the criteria and provide clear guidelines to raters, variations in individual judgment may still occur. Future research and refinements of the PHAT may explore ways to further minimize subjectivity in scoring, potentially through additional rater training or refinement of the scoring criteria.

Uncorrected Proof
Kheirollahzadeh M et al.

Conclusions
For occupational therapists working with children with SLDs, a reliable handwriting assessment tool such as the PHAT is clinically crucial. Given its high reliability, the PHAT can be used by Iranian occupational therapists to detect Persian handwriting problems in children with SLDs aged 10 to 12 years. The availability of the PHAT as a reliable and valid assessment tool can lead to earlier identification of, and targeted intervention for, handwriting difficulties and can potentially improve the academic performance and overall well-being of these students.

Table 1. Demographic Characteristics of the Students with Specific Learning Disorders (N = 30)

Table 2. Internal Consistency of the Persian Handwriting Assessment Tool in Copying and Dictation Domains (N = 30)

Table 3. Test-Retest Reliability of the Persian Handwriting Assessment Tool in Copying and Dictation Domains (N = 30)
Abbreviations: ICC, intraclass correlation coefficient; CI, confidence interval; SEM, standard error of measurement; MDC, minimal detectable change.

Table 4. Inter-Rater Reliability of the Persian Handwriting Assessment Tool in Copying and Dictation Domains (N = 30)
Abbreviations: ICC, intraclass correlation coefficient; CI, confidence interval; SEM, standard error of measurement; MDC, minimal detectable change.