Institution
National Board of Medical Examiners
Nonprofit • Philadelphia, Pennsylvania, United States
About: The National Board of Medical Examiners is a nonprofit organization based in Philadelphia, Pennsylvania, United States. It is known for research contributions on the topics: United States Medical Licensing Examination & Item response theory. The organization has 224 authors who have published 551 publications receiving 13474 citations. The organization is also known as: NBME.
Topics: United States Medical Licensing Examination, Item response theory, Test (assessment), Certification, Equating
Papers
TL;DR: This article provides a comprehensive review of large‐scale studies of the psychometric characteristics of SP‐based tests and indicates that the major source of measurement error is variation in examinee performance from station to station.
Abstract: A little more than 10 years ago, the objective structured clinical examination (OSCE) was introduced. It includes several “stations,” at which examinees perform a variety of clinical tasks. Although an OSCE may involve a range of testing methods, standardized patients (SPs), who are nonphysicians trained to play the role of a patient, are commonly used to assess clinical skills. This article provides a comprehensive review of large‐scale studies of the psychometric characteristics of SP‐based tests. Across studies, reliability analyses consistently indicate that the major source of measurement error is variation in examinee performance from station to station (termed content specificity in the medical‐problem‐solving literature). As a consequence, tests must include large numbers of stations to obtain a stable, reproducible assessment of examinee skills. Disagreements among raters observing examinee performance and differences between SPs playing the same patient role appear to have less effect on the pr...
598 citations
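The abstract's point that low station-to-station consistency forces long exams can be illustrated with the Spearman-Brown prophecy formula, which projects the reliability of a test lengthened to k stations from a single-station reliability. A minimal sketch, assuming a hypothetical single-station reliability of 0.15 (an illustrative value, not a figure from the reviewed studies):

```python
def spearman_brown(single_station_rel, n_stations):
    """Project test reliability when the exam is extended to n_stations
    parallel stations, given the reliability of one station."""
    r = single_station_rel
    return n_stations * r / (1 + (n_stations - 1) * r)

# With a low per-station reliability, many stations are needed:
# one station at r = 0.15 projects to roughly 0.78 with 20 stations.
projected = spearman_brown(0.15, 20)
```

This mirrors the review's conclusion: because the dominant error source is case-to-case variation, reproducible scores require sampling many stations rather than adding raters within a station.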
TL;DR: Criteria for good assessment are outlined that include: (1) validity or coherence, (2) reproducibility or consistency, (3) equivalence, (4) feasibility, (5) educational effect, (6) catalytic effect, and (7) acceptability.
Abstract: In this article, we outline criteria for good assessment that include: (1) validity or coherence, (2) reproducibility or consistency, (3) equivalence, (4) feasibility, (5) educational effect, (6) catalytic effect, and (7) acceptability. Many of the criteria have been described before and we continue to support their importance here. However, we place particular emphasis on the catalytic effect of the assessment, which is whether the assessment provides results and feedback in a fashion that creates, enhances, and supports education. These criteria do not apply equally well to all situations. Consequently, we discuss how the purpose of the test (summative versus formative) and the perspectives of stakeholders (examinees, patients, teachers-educational institutions, healthcare system, and regulators) influence the importance of the criteria. Finally, we offer a series of practice points as well as next steps that should be taken with the criteria. Specifically, we recommend that the criteria be expanded or modified to take account of: (1) the perspectives of patients and the public, (2) the intimate relationship between assessment, feedback, and continued learning, (3) systems of assessment, and (4) accreditation systems.
431 citations
TL;DR: A review of faculty development initiatives designed to improve teaching effectiveness synthesized findings related to intervention types, study characteristics, individual and organizational outcomes, key features, and community building, with implications for practice and research.
Abstract: Background: This review, which focused on faculty development initiatives designed to improve teaching effectiveness, synthesized findings related to intervention types, study characteristics, individual and organizational outcomes, key features, and community building.Methods: This review included 111 studies (between 2002 and 2012) that met the review criteria.Findings: Overall satisfaction with faculty development programs was high. Participants reported increased confidence, enthusiasm, and awareness of effective educational practices. Gains in knowledge and skills, and self-reported changes in teaching behaviors, were frequently noted. Observed behavior changes included enhanced teaching practices, new educational initiatives, new leadership positions, and increased academic output. Organizational changes were infrequently explored. Key features included evidence-informed educational design, relevant content, experiential learning, feedback and reflection, educational projects, intentional co...
429 citations
TL;DR: The Mantel-Haenszel statistic, logistic regression, SIBTEST, the Standardization procedure, and various IRT-based approaches are presented; the relative strengths and weaknesses of each method are highlighted, and guidance is provided for interpretation of the resulting statistical indices.
Abstract: This module is intended to prepare the reader to use statistical procedures to detect differentially functioning test items. To provide background, differential item functioning (DIF) is distinguished from item and test bias, and the importance of DIF screening within the overall test development process is discussed. The Mantel-Haenszel statistic, logistic regression, SIBTEST, the Standardization procedure, and various IRT-based approaches are presented. For each of these procedures, the theoretical framework is presented, the relative strengths and weaknesses of the method are highlighted, and guidance is provided for interpretation of the resulting statistical indices. Numerous technical decisions are required in order for the practitioner to appropriately implement these procedures. These decisions are discussed in some detail, as are the policy decisions necessary to implement an operational DIF detection program. The module also includes an annotated bibliography and a self-test.
409 citations
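Of the DIF procedures the module covers, the Mantel-Haenszel statistic is the simplest to sketch: examinees are stratified by matched total score, a 2×2 group-by-correctness table is formed at each stratum, and a common odds ratio is pooled across strata. A minimal illustration with made-up counts (the data, variable names, and the three-stratum setup are hypothetical, not drawn from any operational NBME program):

```python
import math

def mh_odds_ratio(strata):
    """Pooled Mantel-Haenszel odds ratio.

    strata: list of (A, B, C, D) counts per matched score level, where
    A/B = reference group correct/incorrect and
    C/D = focal group correct/incorrect.
    """
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def mh_d_dif(alpha):
    # ETS delta-scale transform; negative values flag items that
    # disadvantage the focal group.
    return -2.35 * math.log(alpha)

# Three score strata with identical odds in both groups, so the item
# shows no DIF: alpha is exactly 1 and D-DIF is 0.
strata = [(40, 10, 20, 5), (30, 20, 15, 10), (10, 30, 5, 15)]
alpha = mh_odds_ratio(strata)
```

The score-stratified matching is what separates DIF from a raw group difference: groups are compared only among examinees of comparable overall proficiency.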
TL;DR: Current views of the relationship between competence and performance are described and some of the implications of the distinctions between the two areas are delineated for the purpose of assessing doctors in practice.
Abstract: Objective This paper aims to describe current views of the relationship between competence and performance and to delineate some of the implications of the distinctions between the two areas for the purpose of assessing doctors in practice.
Methods During a 2-day closed session, the authors, using their wide experiences in this domain, defined the problem and the context, discussed the content and set up a new model. This was developed further by e-mail correspondence over a 6-month period.
Results Competency-based assessments were defined as measures of what doctors do in testing situations, while performance-based assessments were defined as measures of what doctors do in practice. The distinction between competency-based and performance-based methods leads to a three-stage model for assessing doctors in practice. The first component of the model proposed is a screening test that would identify doctors at risk. Practitioners who ‘pass’ the screen would move on to a continuous quality improvement process aimed at raising the general level of performance. Practitioners deemed to be at risk would undergo a more detailed assessment process focused on rigorous testing, with poor performers targeted for remediation or removal from practice.
Conclusion We propose a new model, designated the Cambridge Model, which extends and refines Miller's pyramid. It inverts his pyramid, focuses exclusively on the top two tiers, and identifies performance as a product of competence, the influences of the individual (e.g. health, relationships), and the influences of the system (e.g. facilities, practice time). The model provides a basis for understanding and designing assessments of practice performance.
390 citations
Authors
Showing all 226 results
| Name | H-index | Papers | Citations |
| --- | --- | --- | --- |
| Howard Wainer | 59 | 395 | 14182 |
| Kathleen M. Mazor | 46 | 240 | 7423 |
| David B. Swanson | 40 | 123 | 5384 |
| Steven M. Downing | 38 | 63 | 6649 |
| Matthias von Davier | 35 | 152 | 4127 |
| Hua Hua Chang | 33 | 119 | 3140 |
| Brian E. Clauser | 31 | 117 | 3101 |
| Ruslan Mitkov | 30 | 164 | 3651 |
| Eta S. Berner | 28 | 125 | 3762 |
| Ian Spence | 27 | 61 | 3678 |
| Thomas E. Ford | 24 | 54 | 2553 |
| Paul F. Velleman | 20 | 51 | 3675 |
| Richard M. Luecht | 20 | 49 | 1520 |
| S M Case | 19 | 38 | 1177 |
| Steven A. Haist | 18 | 71 | 1462 |