PERSONNEL PSYCHOLOGY
1975, 28, 563-575.

A QUANTITATIVE APPROACH TO CONTENT VALIDITY¹

C. H. LAWSHE
Purdue University
CIVIL rights legislation, the attendant actions of compliance agencies, and a few landmark court cases have provided the impetus for the extension of the application of content validity from academic achievement testing to personnel testing in business and industry. Pressed by the legal requirement to demonstrate validity, and constrained by the limited applicability of traditional criterion-related methodologies, practitioners are more and more turning to content validity in search of solutions. Over time, criterion-related validity principles and strategies have evolved so that the term "commonly accepted professional practice" has meaning. Such is not the case with content validity. The relative newness of the field, the proprietary nature of work done by professionals practicing in industry, to say nothing of the ever present legal overtones, have predictably militated against publication in the journals and formal discussion at professional meetings. There is a paucity of literature on content validity in employment testing, and much of what exists has emanated from civil service commissions. The selection of civil servants, with its eligibility lists and "pass-fail" concepts, has always been something of a special case with limited transferability to industry. Given the current lack of consensus in professional practice, practitioners will more and more face each other in adversary roles as expert witnesses for plaintiff and defendant. Until professionals reach some degree of concurrence regarding what constitutes acceptable evidence of content validity, there is a serious risk that the courts and the enforcement agencies will play the major determining role. Hopefully, this paper will modestly contribute to the improvement of this state of affairs (1) by helping sharpen the content validity concept and (2) by presenting one approach to the quantification of content validity.

¹ A paper presented at Content Validity II, a conference held at Bowling Green State University, July 18, 1975.
A Conceptual Framework

Jobs vs. curricula. Some of our difficulties emanate from the fact that the parallel between curriculum content validity and job content validity is not a perfect one. Generally speaking, an academic achievement test is considered content valid if and when (a) the curriculum universe has been defined (called the "content domain") and (b) the test adequately samples that universe. In contrast, the job performance universe and its parameters are often ill defined, even with careful job analysis. What we can do is isolate specific segments of the job performance universe. In this paper, a job performance domain² is defined as: an identifiable segment or aspect of the job performance universe (a) which has been operationally defined and (b) about which inferences are to be made. Hence, a particular job may have a single job performance domain; more often jobs have several. For example, the job of Typist might have a single job performance domain, i.e., "typing straight copy from rough draft." On the other hand, the job of Secretary might have a job performance universe from which can be extracted several job performance domains, only one of which is "typing straight copy from rough draft." The distinction made here is that in the academic achievement field we seek to define and sample the entire universe; in the job performance field we sample a job performance domain which may or may not approximate the job performance universe. More often it is not the total universe but rather is a segment of it which has been identified and operationally defined.

² The author is aware that certain of his colleagues prefer the term "job content domain." The term "job performance domain" is used here (a) to distinguish it from the content domain concept in achievement testing and (b) to be consistent with the Standards, pp. 28-29.
The nature of job requirements. If a job truly requires a specific skill or certain job knowledge, and a candidate cannot demonstrate the possession of that skill or knowledge, defensible grounds for rejection certainly exist. For example, a retail clerk may spend less than five percent of the working day adding the prices on sales slips; however, a candidate who cannot demonstrate the ability to add whole numbers may defensibly be rejected. Whether or not we attempt to sample other required skills or knowledges is irrelevant to this issue. Similarly, it is irrelevant that other aspects of the job (other job performance domains) may not involve the ability to add whole numbers.

The judgment of experts. Adkins³ has this to say about judgments in discussing the content validity approach:

In academic achievement testing, the judgment has to do with how closely test content and mental processes called into play are related to instructional objectives. In employment testing, the content validation approach requires judgment as to the correspondence of abilities tapped by the test with abilities requisite for job success.

³ Dorothy C. Adkins, as quoted in Mussio and Smith, p. 8.
The crucial question, of course, is, "Whose judgment?" In achieve-
ment testing we normally use subject matter experts to define the
curriculum universe which we then designate as the "content domain."
We may take still another step and have those experts assign weights
to the various portions of a test item budget. From that point on,
content validity is established by demonstrating that the items in the
test appropriately sample the content domain. If the subject matter
experts are generally perceived as true experts, then it is unlikely that
there is a higher authority to challenge the purported content validity
of the test.
When a personnel test is being validated, who are the experts? Are
they job incumbents or supervisors who "know the job?" Or, are they
psychologists or other professionals who are expected to have a
greater understanding of the organization of human personality and/
or greater insight into "what the test measures?" To answer these
questions requires a critical examination of job performance domains
and their characteristics.
The nature of job performance domains. The behaviors constituting job performance domains range all the way from behavior which is directly observable, through that which is reportable, to behavior that is highly abstract. The continuum extends from the exercise of simple proficiencies (i.e., arithmetic and typing) to the use of higher mental processes such as inductive and deductive reasoning. Comparison of the behavior elicited by a test to behavior required on a job involves little or no inference at the "observation" end; however, the higher the level of abstraction, the greater is the "inferential leap" required to demonstrate validity by other than a criterion-related approach. For example, it is one thing to say, "This job performance domain involves the addition of whole numbers; Test A measures the ability to add whole numbers; therefore, Test A is content valid for identifying candidates who have this proficiency." It is quite another thing to say, "This job performance domain involves the use of deductive reasoning; Test B purports to measure deductive reasoning; therefore, Test B is valid for identifying those who are capable of functioning in this job performance domain." At the "observation" end of the continuum, where the "inferential leap" is small or virtually nonexistent, sound judgments can normally be made by incumbents, supervisors, or others who can be shown to "know the job." The more closely the behavior elicited by the test approximates a true "work sample" of the job performance domain, the more competent are people who know the job to assess the content validity of the test. When a job knowledge test is under consideration, they are similarly competent to judge whether or not knowledge of a given bit of job information is relevant to the job performance domain.
Construct validity. On the other hand, when a high level of abstraction is involved and when the magnitude of the "inferential leap" becomes significant, job incumbents and supervisors normally do not have the insights to make the required judgments. When these conditions obtain, we transition from content validity to a construct validity approach. Deductive reasoning, for example, is a psychological "construct." Professionals who make judgments as to whether or not deductive reasoning (a) is measured by this test and (b) is relevant to this job performance domain must rely upon a broad familiarity with the psychological literature. To quote the "Standards":⁴

Evidence of construct validity is not found in a single study; judgments of construct validity are based upon an accumulation of research results.

⁴ Standards for Educational and Psychological Tests, p. 30.
An operational definition. Content validity is the extent to which communality or overlap exists between (a) performance on the test under investigation and (b) ability to function in the defined job performance domain. In summary, content validity analysis procedures are appropriate only when the behavior under scrutiny in the job performance domain falls at or near the "observation" end of the continuum; here, those who "know the job" are normally competent to make the required judgments. However, when the job behavior approaches the abstract end of the continuum, a construct validity approach is indicated; job incumbents and supervisors are normally not qualified to judge. Operationally defined, content validity is: the extent to which members of a Content Evaluation Panel perceive overlap between the test and the job performance domain. Such analyses are essentially restricted to (1) simple proficiency tests, (2) job knowledge tests, and (3) work sample tests.
Measuring the Extent of Overlap
Content evaluation panel. How, then, do we determine the extent of overlap (or communality) between a job performance domain and a specific test? The approach outlined here uses a Content Evaluation Panel composed of persons knowledgeable about the job. Best results have been obtained when the panel is composed of an equal number of incumbents and supervisors. Each member of the Panel is supplied a number of items, either prepared for the purpose or constituting a "shelf" test. Independent of the other panelists, he is asked to respond to the following question for each of the items:
Is the skill (or knowledge) measured by this item
—Essential
—Useful but not essential, or
—Not necessary
to the performance of the job?
Responses from all panelists are pooled and the number indicating
"essential" for each item is determined.
Validity of judgments. Whenever panelists or other experts make judgments, the question properly arises as to the validity of their judgments. If the panelists do not agree regarding the essentiality of the knowledge or skill measured to the performance of the job, then serious questions can be raised. If, on the other hand, they do agree, we must conclude that they are either "all wrong" or "all right." Because they are performing the job, or are engaged in the direct supervision of those performing the job, there is no basis upon which to refute a strong consensus.
Quantifying consensus. When all panelists say that the tested knowledge or skill is "essential," or when none say that it is "essential," we can have confidence that the knowledge or skill is or is not truly essential, as the case might be. It is when the strength of the consensus moves away from unity and approaches fifty-fifty that problems arise.
Two assumptions are made, each of which is consistent with established psychophysical principles:
—Any item, performance on which is perceived to be "essential" by more than half
of the panelists, has some degree of content validity.
—The more panelists (beyond 50%) who perceive the item as "essential," the greater
the extent or degree of its content validity.
With these assumptions in mind, the following formula for the content validity ratio (CVR) was devised:

CVR = (n_e - N/2) / (N/2)

in which n_e is the number of panelists indicating "essential" and N is the total number of panelists. While the CVR is a direct linear transformation from the percentage saying "essential," its utility derives from its characteristics:
—When fewer than half say "essential," the CVR is negative
—When half say "essential" and half do not, the CVR is zero
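
As a worked illustration (the numbers are hypothetical, not from the paper): with N = 10 panelists of whom n_e = 8 rate an item "essential," CVR = (8 - 5)/5 = .60. The Python sketch below computes the ratio directly from n_e and N and reproduces the characteristics listed above.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2), where
    n_e is the number of panelists rating the item "essential" and
    N is the total number of panelists."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10:
print(content_validity_ratio(8, 10))   # 0.6  (more than half say "essential")
print(content_validity_ratio(5, 10))   # 0.0  (exactly half say "essential")
print(content_validity_ratio(3, 10))   # -0.4 (fewer than half: negative CVR)
```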
