The aim of this review is to highlight recent and potential future enhancements to the United States Medical Licensing Examination (USMLE) program. The USMLE program is co-owned by the National Board of Medical Examiners (NBME) and the Federation of State Medical Boards. The USMLE includes four examinations: Step 1, Step 2 Clinical Knowledge, Step 2 Clinical Skills, and Step 3; every graduate of Liaison Committee on Medical Education-accredited allopathic medical schools and all international medical graduates must pass this examination series to practice medicine in the United States. From 2006 to 2009, the program underwent an in-depth review resulting in five accepted recommendations. These recommendations have been the primary driver for many of the recent enhancements, such as an increased emphasis on foundational science and changes in the clinical skills examination, including more advanced communication skills assessment. These recommendations will continue to inform future changes such as access to references (e.g., a map of metabolic pathways) or decision-making tools for use during the examination. The NBME also provides assessment services globally to medical schools, students, residency programs, and residents. In 2015, >550,000 assessments were provided through the subject examination program, NBME self-assessment services, and customized assessment services.
- medical student
- National Board of Medical Examiners
- subject tests
- United States Medical Licensing Examination
The National Board of Medical Examiners (NBME) is an independent not-for-profit organization founded in 1915 whose mission is “to protect the health of the public through state-of-the-art assessment of health professionals.” Since its inception, the NBME has focused on the licensing of physicians to practice medicine in the United States (U.S.). The U.S. Medical Licensing Examination (USMLE), which is co-owned with the Federation of State Medical Boards, is the sole route to medical licensure for graduates of Liaison Committee on Medical Education (LCME)-accredited U.S. and Canadian medical schools and for all graduates of international medical schools.
During the past 30 years, the assessment portfolio of the NBME has expanded. Besides the USMLE program, the NBME oversees 35 other examination programs, including 120 additional examinations. NBME also provides testing, educational, consultative, and research services to health professional credentialing and educational organizations and ministries globally. The NBME currently works with ∼30 organizations to support >75 assessment programs. The assessments administered vary in use from high-stakes examinations, such as the USMLE, to lower-stakes examinations, such as NBME self-assessments. This portfolio of examinations also includes post-licensure assessments, specialty board certification, specialty board recertification, certification of special competencies, residency selection, medical school subject tests, and in-training examinations.
In the early 1990s, the NBME flagship licensure program, the USMLE, replaced both the existing Federation Licensing Examination program and the NBME certification examinations (also known as the “Part” exams). The USMLE has undergone a gradual evolution in design and format since its inception. In 1999, major programmatic changes included computerized examination delivery and the use of computerized case simulations. The USMLE Step 2 Clinical Skills (CS) exam, using standardized patients (SPs), was introduced in 2004.
The Review of the USMLE and Other Drivers for Change
In 2004, the USMLE undertook an in-depth review of the licensure program’s purpose, design, and format (28a). This review resulted in five recommendations endorsed by the governing USMLE Composite Committee in 2009: 1) focus on assessments that support state licensing authorities’ decisions about a physician’s readiness to provide patient care at entry into supervised practice and entry into unsupervised practice; 2) adopt a general competencies schema consistent with national standards for the overall design, development, and scoring of the USMLE; 3) emphasize the scientific foundations of medicine in all components of the USMLE; 4) continue and enhance the assessment of clinical skills important to medical practice; and 5) introduce assessment(s) of an examinee’s ability to obtain, interpret, and apply scientific and clinical information.
The question asked by medical students, medical educators, and the leadership of medical schools is “Why change the USMLE?” The practice of medicine and medical education are ever changing. Advances in how medicine is taught, as well as how we approach the assessment of medical students and graduate trainees, necessitated this review. Clinical cases are used with increasing frequency to facilitate problem-based learning throughout medical education (24); curricula have become more integrated, often using an organ-based approach with clinical correlations (2, 29); and the basic sciences are being taught throughout the medical school curriculum (26). Additionally, SPs are widely used to teach and assess trainees (1), clinical experiences are being introduced early in medical school (3, 9, 13, 21), and technological advances have enabled the use of high-fidelity simulations for both teaching and assessment (5, 18).
From an assessment and licensure perspective, the USMLE must continually evolve to remain relevant, and practice analysis should periodically be performed to this end. The USMLE program undertook a series of practice analysis activities to inform the aforementioned changes to the USMLE. Five national databases compiled by the Centers for Disease Control and Prevention were analyzed. Additionally, a survey of first-year residency trainees was conducted at the commencement of internship. Significant findings of the intern survey were that interns had a low percentage of ambulatory experiences; performed a variety of procedures, often with general attending supervision; were often required to engage in complex communication tasks; and were often required to retrieve, evaluate, and integrate information obtained electronically (23).
Additionally, a survey was conducted of newly licensed physicians (“newly licensed” was defined as those who received an unrestricted medical license within 4 yr of the survey). This survey revealed that 71% of the newly licensed physicians answering the survey were still in training when they received licensure, and about one-half noted they obtained their license to facilitate moonlighting during residency (22). In addition, ordering, interpreting, and performing a variety of procedures are prevalent tasks among newly licensed physicians (22).
Focus of the USMLE: Entry Into Supervised Practice and Entry Into Independent Practice
The committee’s first recommendation, “focus on assessments that support state licensing authorities’ decisions about a physician’s readiness to provide patient care at entry into supervised practice and entry into unsupervised practice,” has shifted the focus away from each individual Step examination to two “decision points”: entry into supervised practice and entry into independent practice. The organization of question-writing and question-review committees, as well as question-writing assignments themselves, has evolved to focus discussions on questions appropriate for one decision point versus the other.
Use of Competency Framework
In response to the second recommendation, “adopt a general competencies schema consistent with national standards for the overall design, development, and scoring of the USMLE,” the USMLE adopted the Accreditation Council for Graduate Medical Education (ACGME) general medical competency-based schema (17). The ACGME competency framework has been used to guide test design, content development, and score reporting. In 2014, when the Step 3 examination split into two unique examination components, Foundations of Independent Practice (FIP) and Advanced Clinical Medicine, the content was organized around the six general competencies (medical knowledge, patient care, communication and interpersonal skills, practice-based learning and improvement, professionalism, and systems-based practice). Initially, examinees had the option of taking these examinations on back-to-back days, as previously offered, or taking the examinations separated by up to 2 wk. For an overview of the USMLE program, see Table 1.
Currently, there is one score and one pass/fail decision. In the future, there may be two separate scores and two separate pass/fail decisions if the separate scores provide the reliability to support such a high-stakes decision as readiness for entry into unsupervised practice. Moreover, the six competencies provide a framework to develop associated subcompetencies in areas currently rich in content, such as diagnosis and management. It is clear that an examination at one point in time is not adequate to fully assess some competencies or subcompetencies. For instance, we can assess whether examinees know the ethical principles used to make decisions, but whether they act in a professional and ethical manner in the care of their patients remains the responsibility of the medical school and the postgraduate training programs.
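The reliability concern raised above can be illustrated with the Spearman-Brown prophecy formula from classical test theory, which predicts how reliability changes when a test is lengthened or shortened. The following is a minimal sketch only: the function name and the starting reliability of 0.90 are hypothetical, and the source does not state which reliability model the USMLE program uses.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predict reliability after multiplying test length by length_factor.

    Spearman-Brown prophecy formula:
        r_new = k * r / (1 + (k - 1) * r)
    where r is the current reliability and k is the length factor.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Hypothetical illustration: an examination with overall reliability 0.90
# is reported as two separately scored halves (length factor 0.5 each).
half_reliability = spearman_brown(0.90, 0.5)
print(round(half_reliability, 3))  # 0.818
```

Under these assumed numbers, each half-length score would be noticeably less reliable than the combined score, which is why separate pass/fail decisions would require evidence that each component score is sufficiently reliable on its own.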
Assessment of Foundational Science Across the USMLE
Some changes are, or have been, prompted by unintended consequences of the USMLE program. For example, many medical students prepared for Step 1 with a “binge and purge” mentality. They may have felt foundational science was not pertinent to the practice of medicine, so they memorized the information with short-term retention in mind and failed to recognize the value of foundational sciences in medical practice. Evidence regarding the short-term retention of foundational science is strong. There have been six studies that embedded Step 1 foundational science questions on Step 2 Clinical Knowledge (CK) (7, 8, 14, 16, 27, 28) and one study that embedded such questions on Step 3 (10). Three studies have shown that examinees’ total scores were 3-6 points higher on Step 1 versus Step 2 CK and 11 points higher on Step 1 versus Step 3. Behavioral science was the only discipline in which the more advanced examinee consistently scored higher. Other researchers, both in the U.S. and elsewhere, have found similar results (6).
The third recommendation, “the assessment of foundational science throughout the USMLE,” specifically addresses the binge-purge approach to learning. Foundational science knowledge is now extensively assessed on the Step 3 FIP examination; there is over 1 h of such content. Additionally, over the past 20 years, there has been increased emphasis on writing all Step 1 questions using clinical or experimental vignettes as the stimulus to each question. This has helped writers focus questions on the foundational science principles pertinent to the practice of medicine.
Emphasis on and Enhancement of CS Assessment
In the past, examinees’ performance on the SP cases in the Step 2 CS examination was scored via a checklist. This resulted in the unintended consequence of examinees asking as many questions as possible so as to increase the number of “yes” or affirmative marks. This “shotgun” approach to history taking is far removed from the actual behavior medical educators and communication experts teach, and this behavior does not necessarily reflect a competently performed patient-physician encounter. This was addressed in response to the fourth recommendation: “Continue and enhance the assessment of clinical skills important to medical practice.”
The approach to scoring the cases and assessing communication and interpersonal skills has been thoroughly and thoughtfully revised (12). The checklist is no longer used to score the history taking; in its place, an evidence-based scoring scale assesses the presence or absence of desired observable behaviors. Scoring of the history taking and physical examination is based on the note written by the examinee. Additionally, examinees are required to include an ordered differential diagnosis and a care plan, and they must provide evidence to support their differential diagnosis. In addition, more advanced communication skills cases, such as “breaking bad news,” have been developed, thus addressing a major finding in the new intern survey.
An additional challenge has been examinee reliance on studying “buzzwords” or word associations with specific diseases when preparing for the USMLE. For example, if the word “umbilicated” is used in a vignette, an examinee could be clued into the diagnosis of molluscum contagiosum because “umbilicated” is often used to describe the typical dermatological appearance. During the past 5 yr, there has been a concerted effort to use images of skin findings rather than descriptions to avoid such cueing. Furthermore, there have been a number of additional multimedia enhancements to the assessment of clinical skills on the Step 1, Step 2 CK, and Step 3 examinations.
In 2007, the USMLE introduced multiple-choice questions (MCQs) using recorded heart sounds. With the adult cardiac avatar, the examinee can listen over the four cardiac auscultation sites and the left and right carotid arteries with the diaphragm of the stethoscope and over the fifth intercostal space in the midclavicular line with the bell of the stethoscope. Initially, the questions with recorded heart sounds were 23.9% and 12.7% more difficult than text versions for examinees of LCME-accredited medical schools on Step 1 and Step 2 CK, respectively (11). This could have been in part due to cueing in the text version of the questions (e.g., crescendo murmur = aortic stenosis). More recently, there has been no significant difference in difficulty (25). Future enhancements will include breath sounds. The avatars used for heart and breath sounds will include variable heart and respiratory rates and physical appearances to better represent all patients.
Additional areas for future exploration include the use of videos to depict physical examination findings. For example, a “shuffling gait” is a classic description of a finding or “buzzword” associated with Parkinson disease, but it is more important for an examinee to recognize a shuffling gait when observing an actual patient. This would better approximate medical practice in the real world. In addition, videos of physicians interacting with patients, patients’ families, or other members of health care teams are being explored as prompts for MCQs in the domains of communication and interpersonal skills, professionalism, and systems-based practice.
Evidence-Based Medicine and Use of References and Other Tools
In response to the fifth recommendation, “assess an examinee’s ability to obtain, interpret, and apply scientific and clinical information,” two new formats have been developed: faux pharmaceutical advertisements and scientific abstracts. These new formats facilitate assessment of the application of knowledge of biostatistics in real-life settings.
The new Step 3 FIP has approximately 1 h of items that assess knowledge related to biostatistics and epidemiology. Future endeavors may include employing a static informational database that examinees could access during the entire examination or a portion of it. This requires extensive research to define how best to measure the construct of interest. Besides a searchable database such as PubMed, there are other tools that could be made available to examinees while taking the USMLE, for example, a cardiac risk calculator, a body mass index calculator, the Centers for Disease Control and Prevention pediatric and adult immunization schedules, and pharmacotherapeutic references.
Additionally, figures depicting biological pathways (e.g., the coagulation cascade) or various metabolic pathways (e.g., the tricarboxylic acid or citric acid cycle) could be provided. The Association of Biochemistry Educators has developed an extensive metabolic map (http://metabolicpathways.stanford.edu/) that a number of course directors use to teach biochemistry and that is provided to students as a reference for their examinations. The Association of Biochemistry Educators and NBME are exploring the possible inclusion of the metabolic map as a reference on the USMLE. The map will facilitate the assessment of higher-order and clinically relevant problem-solving skills and support a shift away from rote memorization.
The plans for the redesign and structural changes to the Step 1 and Step 2 CK examinations are ongoing and are being informed by recent changes to the Step 3 examinations. Changes to Step 1 and Step 2 CK will aim to better measure both currently and previously assessed competencies (see Table 1). The changes to the USMLE will continue to ensure physicians are competent and will help fulfill the NBME mission to better protect the health of the public through the art and science of assessment.
NBME Medical School Services
In addition to producing the USMLE, the NBME also provides assessment services globally to medical schools, students, residency programs, and residents. In 2015, more than 550,000 assessments were provided through the subject examination program, NBME self-assessment services, and customized assessment services. The NBME developed the subject examination program during the early 1960s to meet the need for high-quality, standardized examinations that could measure achievement in the traditional basic and clinical science disciplines and provide a basis for comparison with a national reference group of medical students. The subject exams are constructed to be appropriate for assessing a broad range of curricular experiences in the basic and clinical sciences. Faculty and educational institutions often use these examinations instead of, or in conjunction with, locally developed tests because of the high quality of the questions. The examination forms are reviewed by content experts to ensure relevance; these individuals are faculty who are, or have been, directing medical school courses or clerkships. Additionally, a majority of subject test items have been previously used on the USMLE. Other advantages include the availability of multiple forms that have been statistically equated to produce equivalent scores and provision of norms that permit comparison of local student performance with a national reference group.
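To make the idea of statistically equated forms concrete, the sketch below applies linear equating, one of the simplest textbook methods: a score on form X is mapped to the form Y scale by matching the means and standard deviations of the two score distributions. It assumes both forms were taken by comparable groups of examinees. The function name and score lists are hypothetical, and the source does not say which equating method the NBME actually uses.

```python
from statistics import mean, pstdev


def linear_equate(score_x: float, form_x_scores: list[float], form_y_scores: list[float]) -> float:
    """Map a form X score onto the form Y scale by linear equating.

    y = mu_Y + (sd_Y / sd_X) * (x - mu_X)
    Assumes the two score samples come from comparable examinee groups.
    """
    mu_x, mu_y = mean(form_x_scores), mean(form_y_scores)
    sd_x, sd_y = pstdev(form_x_scores), pstdev(form_y_scores)
    if sd_x == 0:
        raise ValueError("form X scores have zero variance; cannot equate")
    return mu_y + (sd_y / sd_x) * (score_x - mu_x)


# Hypothetical data: form Y turned out 5 points easier than form X.
form_x = [60.0, 70.0, 80.0]
form_y = [65.0, 75.0, 85.0]

# A 70 on the harder form X corresponds to a 75 on the form Y scale.
equated = linear_equate(70.0, form_x, form_y)
print(round(equated, 1))  # 75.0
```

In this toy case the forms differ only in mean difficulty, so equating amounts to a constant shift; when the spreads also differ, the ratio of standard deviations rescales the distance from the mean as well.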
In 2007, NBME introduced the Customized Assessment Services (CAS) program in response to the increasing number of schools that have dynamic and integrated preclinical curricula. The CAS program allows faculty to build high-quality, standardized assessments targeted to local curricula using secure NBME item banks. The CAS program has experienced steady growth in the number and mix of schools using the service. Figure 1 shows the growth in use of the basic science assessment program, including the CAS and basic science subject test program. As shown, medical schools use the CAS more than the basic science assessments, and NBME staff will collaborate with faculty members to determine how to enhance the basic science assessments to better fit the needs of medical schools.
NBME clinical science subject tests are appropriate for administration at the end of clerkships. As with the basic science assessment program, utilization of the NBME clinical science subject tests has increased since 2009. A number of medical schools have developed integrated clerkships and have requested the ability to customize the clinical science examinations. NBME staff will be working with medical school faculty members to determine how best to proceed with enhancing the existing clinical subject exam program to meet the ever-evolving needs of schools of medicine (15).
The NBME also provides web-based self-assessments to medical students and graduates. The Clinical Science Mastery Series includes self-assessments that permit students to assess their knowledge of the clinical sciences covered during a clerkship. They are built to the same content specifications as the NBME clinical science subject examinations. The Comprehensive Self-Assessment Series helps students evaluate their readiness to take the USMLE Step 1, Step 2 CK, and Step 3 examinations. Research has demonstrated that under certain circumstances there is a moderate relationship between performance on the self-assessments and USMLE, with some variation in predictive accuracy across test administration conditions (19, 20).
The NBME provides a wide array of assessments and, together with the Federation of State Medical Boards, is dedicated to the continued evolution of the USMLE to ensure the licensing examination is of the highest quality. Besides the USMLE, the basic and clinical science subject test programs and CAS provide valid assessments used by medical schools. Finally, the NBME offers medical school students self-assessments that provide validated predictions of their future performance on USMLE and the clinical science subject tests.
All authors are actively involved in the programs described. The three authors are employees of the NBME, which is a co-owner of the USMLE with the Federation of State Medical Boards; otherwise, the authors do not have a conflict of interest.
S.A.H. conceived and designed research; S.A.H. analyzed data; S.A.H., A.B., and M.A.P. drafted manuscript; S.A.H. and M.A.P. edited and revised manuscript; S.A.H. and M.A.P. approved final version of manuscript; A.B. prepared figures.
- Copyright © 2017 the American Physiological Society