Sound assessment is one of the most critical factors in improving the quality of any education system. Multiple-choice questions (MCQs) are the most common type of question used in clinical tests. Content validity and appropriate question structure are persistent concerns for test developers, because weak and strong students cannot be distinguished unless the questions follow the structural rules and target an appropriate taxonomy level. Low-quality tests also reduce learners’ motivation and waste the efforts of teachers and the educational system (
1). In addition, the type and quality of a test affect the teaching method and the teacher’s credibility. Therefore, questions must be prepared and tests administered carefully so that they have the desired characteristics of standard tests, such as validity, reliability, and practicality (
2). In this context, educational systems should make appropriate interventions to assess the adequacy of their tests. The quality of four-choice questions differs among universities in terms of structure and learning levels. Various studies have examined this issue, including an evaluation of the quality of multiple-choice tests in one semester of medical school at Mazandaran University of Medical Sciences. Of 1471 questions from 25 tests, 64% had one or more structural defects, and most were at the first level of Bloom’s taxonomy (
3). Baghaei et al. (
4) concluded that most questions (84.6%) were at the first taxonomy level. According to the difficulty index, most questions (332 items) were difficult. Among the studied subjects, medical-surgical 3 was the most difficult (61.42%), and obstetric nursing (
2) was the least difficult (10%). Regarding the discrimination index, most questions had an average discrimination coefficient (29.36%), and mental illness nursing (
1) had the best coefficient among the subjects. Most questions (1.84%) had appropriate structure (
4). Shakurnia et al. found that the average difficulty index of the MCQs was 0.59 ± 0.25, and 46.2% had a practical difficulty. The average discrimination index of the MCQs was 0.25 ± 0.24, and 57.3% of the MCQs had a discrimination index. Combining the difficulty and discrimination indices showed that only 248 MCQs (30.7%) were ideal. A total of 1525 distractor options (62.9%) were functional distractors (FDs), and 889 (37%) were non-functional distractors (NFDs). These results showed that the MCQs needed improvement (
5). Meanwhile, an analysis of the questions from the specialized midwifery courses at the same university found their quality to be desirable (
3). Shakoornia et al. showed that more than half of the questions designed by faculty members of Jundishapur University of Medical Sciences had a correct structure (
6). In addition, improving the quality of multiple-choice questions led to an increase in students’ level of knowledge (
4). Meayari and Biglarkhani indicated that 65.2% of the questions were free of overall structural problems before the intervention. After the intervention, this rate reached 82.8%, a significant difference. In 2009, 38% of the questions were designed at a high taxonomy level, compared with 53.1% in 2010, which was also a significant difference. Therefore, interventions such as feedback and adherence to technical principles can effectively improve the design quality of multiple-choice questions in medical education, even among experienced question designers (
7). The importance of end-of-term student evaluation and of designing appropriate questions is undeniable, as is the lack of knowledge about the quality of multiple-choice questions designed by faculty members. Therefore, the need for appropriate quantitative and qualitative interventions, such as empowerment, continuing education, and feedback, is felt more than ever at Kermanshah University of Medical Sciences.
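
For readers unfamiliar with the item-analysis indices reported in the studies above, the following is a minimal sketch of how a difficulty index, a discrimination index, and non-functional distractors are conventionally computed. It is not taken from any of the cited studies: the function names, the 27% upper and lower groups, and the 5% selection threshold for a functional distractor are assumptions based on common textbook conventions, and the sample data are invented for illustration only.

```python
from typing import List, Sequence


def difficulty_index(item_correct: Sequence[int]) -> float:
    """Proportion of examinees who answered the item correctly (0 to 1)."""
    return sum(item_correct) / len(item_correct)


def discrimination_index(
    item_correct: Sequence[int],
    total_scores: Sequence[float],
    group_fraction: float = 0.27,  # assumed convention: top and bottom 27%
) -> float:
    """Difference in proportion correct between high and low scorers."""
    n = len(total_scores)
    k = max(1, round(n * group_fraction))
    # Rank examinees by total test score, highest first.
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower


def non_functional_distractors(
    choices: Sequence[str],
    options: Sequence[str],
    key: str,
    threshold: float = 0.05,  # assumed convention: chosen by fewer than 5%
) -> List[str]:
    """Incorrect options selected by fewer than `threshold` of examinees."""
    n = len(choices)
    return [
        opt for opt in options
        if opt != key and sum(1 for c in choices if c == opt) / n < threshold
    ]


# Hypothetical responses of ten examinees to one four-option item (key = "A").
choices = ["A", "A", "A", "A", "A", "A", "B", "B", "C", "A"]
total_scores = [9, 8, 8, 7, 6, 6, 4, 3, 2, 5]
item_correct = [1 if c == "A" else 0 for c in choices]

print(difficulty_index(item_correct))
print(discrimination_index(item_correct, total_scores))
print(non_functional_distractors(choices, ["A", "B", "C", "D"], "A"))
```

Under these assumptions, an item is usually judged on both indices together, which is the sense in which the studies above describe an item as ideal; the exact cut-off values vary between studies and are not specified here.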