The Objective Structured Clinical Examination (OSCE) is an important method of performance assessment. It can be defined as “an assessment tool that is characterized by being objective and standardized. During this exam, students move through a series of time-bound stations in the circuit for the purposes of assessment of performance in a safe environment. Students are assessed using standardized scoring rubrics by trained examiners” (1).
Known for their high validity and reliability, OSCEs have been incorporated increasingly into the assessment strategies of medical schools, alongside other methods such as long case and short case examinations. Organizing and developing an OSCE is not an easy task; it requires extensive preparation from all those involved (2).
At our institution, in an effort to improve the standardization and objectivity of assessment in the undergraduate years, several standardized assessment methods have been introduced (e.g., long case exams have been completely replaced by OSCEs). Several factors affect the reproducibility of OSCE results, including students’ performance across stations, inter-rater reliability, and examination and station length. Recently, more attention has been given to standard-setting procedures (3).
Multiple standard-setting methods exist, and they fall into three groups: norm-referenced methods, criterion-referenced methods, and combination methods (3).
In norm-referenced methods, the pass/fail score is determined by the relative scores of the students (e.g., the Cohen method). Such methods are usually considered unacceptable for licensing tests. In criterion-referenced methods, by contrast, a group of experts examines each test item to determine its difficulty and relevance (e.g., the borderline group and contrasting groups methods) (2).
The combination (compromise) methods were designed to strike a balance between norm-referenced and criterion-referenced judgment. The main idea is that this compromise helps avoid unreasonably high or low passing scores (e.g., the Hofstee method) (4).
Studies have shown that different standard-setting methods may lead to different results. The credibility of the passing score obtained from any method is high when the method produces a standard that is consistent with the purpose of the test and is based on the judgment of experts who fit the “criteria of judge selection” (2).
In 2001, Wilkinson et al. conducted a study to examine the validity and reliability of using global ratings of borderline performance to set the pass mark. They concluded that this method yielded a valid and reliable cut-off score (5).
Boulet et al., on the other hand, concluded that the choice of standard-setting method for OSCE exams should depend on the purpose of the assessment and the availability of resources (6).
In our study, we compared four methods of standard setting in order to determine the most effective method for establishing an appropriate passing score for a low-stakes OSCE exam: the modified Cohen method, the borderline regression method, the arbitrary fixed 60% score method, and the Hofstee method.
1.1. Cohen Method of Standard Setting
The Cohen method is a norm-referenced standard-setting method that can be used to set standards in ‘lower stakes’ exams. It uses the mark of the best-performing students as a reference point to define the difficulty of the exam (7).
According to the Cohen method, the students’ scores are arranged from lowest to highest, and the score of the student at the 95th percentile is taken as the reference point. The standard/passing score is then set at 60% of this reference score, after correcting for the mark obtainable by random guessing. This can be expressed by the formula:
Pass Mark = R + 0.6 (X – R),
where R is the mark that could be obtained by random guessing, and X is the mark of the 95th percentile student (7).
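As a rough illustration of the formula above, the following Python sketch computes a Cohen pass mark. The nearest-rank percentile rule, the function names, the guessing mark of 20, and the sample scores are all assumptions made for this example; the source does not specify a percentile convention or provide data.

```python
def percentile_nearest_rank(scores, pct):
    """Score at the pct-th percentile (integer pct), nearest-rank rule (an assumption)."""
    ordered = sorted(scores)
    # Integer ceiling of pct * n / 100, kept at 1 or above.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def cohen_pass_mark(scores, guess_mark=0.0, fraction=0.6, pct=95):
    """Pass Mark = R + fraction * (X - R), with X the score at the pct-th percentile."""
    x = percentile_nearest_rank(scores, pct)
    return guess_mark + fraction * (x - guess_mark)


# Hypothetical cohort of 20 percentage scores (illustrative only).
scores = [42, 48, 51, 55, 58, 60, 62, 63, 65, 67,
          68, 70, 72, 74, 75, 78, 80, 83, 86, 90]

# R = 20 is a made-up guessing mark, chosen only to show the correction term.
pass_mark = cohen_pass_mark(scores, guess_mark=20.0)
print(f"Cohen pass mark: {pass_mark:.1f}")
```

With these made-up numbers the 95th percentile score is 86, so the pass mark is 20 + 0.6 × (86 − 20), or about 59.6. A different percentile convention (for example, linear interpolation) would give a slightly different reference score.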
Taylor (2011) modified this formula by using the score of the 90th percentile student as the reference point and setting the passing score at 65% of that score, with no adjustment for random guessing. The modified formula is as follows:
Pass Mark = 0.65Y,
where Y is the mark of the 90th percentile student.
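The modified formula can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the nearest-rank percentile rule, the function name, and the sample scores are hypothetical and are not taken from the source.

```python
def modified_cohen_pass_mark(scores, fraction=0.65, pct=90):
    """Pass Mark = fraction * Y, with Y the score at the pct-th percentile.

    Uses the nearest-rank percentile rule with an integer pct (an assumption).
    """
    ordered = sorted(scores)
    # Integer ceiling of pct * n / 100, kept at 1 or above.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    y = ordered[rank - 1]
    return fraction * y


# Hypothetical cohort of 20 percentage scores (illustrative only).
scores = [42, 48, 51, 55, 58, 60, 62, 63, 65, 67,
          68, 70, 72, 74, 75, 78, 80, 83, 86, 90]

pass_mark = modified_cohen_pass_mark(scores)
print(f"Modified Cohen pass mark: {pass_mark:.2f}")
```

For this made-up cohort the 90th percentile score is 83, giving a pass mark of 0.65 × 83, or about 53.95.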
1.2. Borderline Regression Method
Borderline regression is often considered the best method for OSCE exams. Examiners are asked to award a global score at each OSCE station based on their overall judgment of the student’s performance. The global score is selected from three to five grade descriptors, such as good pass, pass, borderline, or fail. The borderline grade describes students who, in the examiner’s judgment, have performed neither well enough to clearly pass nor badly enough to clearly fail the test or part of it. The global scores are collected, and the station’s checklist scores are statistically regressed on them. The passing score is then calculated from the resulting linear equation as the checklist score predicted at the borderline point of the global rating scale (8).
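The calculation for a single station can be sketched as follows. This is a minimal example, assuming a four-point global scale coded 0 = fail, 1 = borderline, 2 = pass, 3 = good pass and an ordinary least-squares fit of checklist score on global rating; the coding, the function name, and the data are hypothetical.

```python
def borderline_regression_pass_mark(global_ratings, checklist_scores, borderline_value):
    """Fit checklist = intercept + slope * global rating by least squares,
    then return the checklist score predicted at the borderline rating."""
    n = len(global_ratings)
    mean_x = sum(global_ratings) / n
    mean_y = sum(checklist_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in global_ratings)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(global_ratings, checklist_scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * borderline_value


# Hypothetical station data for eight students (illustrative only).
# Assumed coding: 0 = fail, 1 = borderline, 2 = pass, 3 = good pass.
global_ratings = [0, 1, 1, 2, 2, 2, 3, 3]
checklist_scores = [8, 11, 13, 15, 16, 17, 19, 21]

pass_mark = borderline_regression_pass_mark(
    global_ratings, checklist_scores, borderline_value=1
)
print(f"Station pass mark: {pass_mark:.1f}")
```

For these made-up data the fitted line is checklist = 8 + 4 × global rating, so the station pass mark at the borderline rating is 12. One common way to obtain an exam-level pass mark is to combine the station pass marks, for example by summing or averaging them; the source does not state which approach was used.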
1.3. Hofstee Method
The Hofstee method is a compromise method of standard setting that shares characteristics with both norm-referenced and criterion-referenced methods. It considers the students’ scores as well as the agreement of the expert judges on the maximum passing mark, the minimum passing mark, the maximum acceptable failure rate, and the minimum tolerated failure rate (4).
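One common way to turn these four judgments into a pass mark is to find where the observed cumulative failure-rate curve meets the straight line running from (minimum passing mark, maximum failure rate) to (maximum passing mark, minimum failure rate). The Python sketch below follows that approach under stated assumptions: the grid search with a 0.1 step, the fallback to the maximum passing mark when no crossing is found, the function name, and all numbers are hypothetical.

```python
def hofstee_pass_mark(scores, min_cut, max_cut, min_fail, max_fail, step=0.1):
    """Scan cut scores from min_cut to max_cut and return the first one at which
    the observed failure rate reaches the judges' line.

    The line falls from max_fail at min_cut to min_fail at max_cut.
    Falling back to max_cut when no crossing is found is an assumption.
    """
    n = len(scores)
    steps = int(round((max_cut - min_cut) / step))
    for i in range(steps + 1):
        cut = min_cut + i * step
        # Proportion of students scoring strictly below the candidate cut.
        observed_fail = sum(s < cut for s in scores) / n
        line_fail = max_fail - (max_fail - min_fail) * (cut - min_cut) / (max_cut - min_cut)
        if observed_fail >= line_fail:
            return cut
    return max_cut


# Hypothetical cohort of 20 percentage scores (illustrative only).
scores = [42, 48, 51, 55, 58, 60, 62, 63, 65, 67,
          68, 70, 72, 74, 75, 78, 80, 83, 86, 90]

# Hypothetical judges' limits: pass mark between 50 and 60,
# failure rate between 5% and 30%.
pass_mark = hofstee_pass_mark(scores, min_cut=50, max_cut=60,
                              min_fail=0.05, max_fail=0.30)
print(f"Hofstee pass mark: {pass_mark:.1f}")
```

With these made-up values the curves cross just above a score of 55, so the sketch returns roughly 55.1 at the chosen step size. A finer step or an exact intersection calculation would refine the result.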
1.4. Fixed Arbitrary 60% Method
At our institution, as in many other medical schools, the passing score, as well as the failure and passing rates, is predetermined according to the institution’s bylaws (60%). Accordingly, the standard of any exam is fixed.