PERSONNEL PSYCHOLOGY
1975, 28, 563-575.

A QUANTITATIVE APPROACH TO CONTENT VALIDITY¹

C. H. LAWSHE
Purdue University
CIVIL rights legislation, the attendant actions of compliance agencies, and a few landmark court cases have provided the impetus for the extension of the application of content validity from academic achievement testing to personnel testing in business and industry. Pressed by the legal requirement to demonstrate validity, and constrained by the limited applicability of traditional criterion-related methodologies, practitioners are more and more turning to content validity in search of solutions. Over time, criterion-related validity principles and strategies have evolved so that the term "commonly accepted professional practice" has meaning. Such is not the case with content validity. The relative newness of the field, the proprietary nature of work done by professionals practicing in industry, to say nothing of the ever present legal overtones, have predictably militated against publication in the journals and formal discussion at professional meetings. There is a paucity of literature on content validity in employment testing, and much of what exists has emanated from civil service commissions. The selection of civil servants, with its eligibility lists and "pass-fail" concepts, has always been something of a special case with limited transferability to industry. Given the current lack of consensus in professional practice, practitioners will more and more face each other in adversary roles as expert witnesses for plaintiff and defendant. Until professionals reach some degree of concurrence regarding what constitutes acceptable evidence of content validity, there is a serious risk that the courts and the enforcement agencies will play the major determining role. Hopefully, this paper will modestly contribute to the improvement of this state of affairs (1) by helping sharpen the content validity concept and (2) by presenting one approach to the quantification of content validity.

¹ A paper presented at Content Validity II, a conference held at Bowling Green State University, July 18, 1975.
A Conceptual Framework

Jobs vs. curricula. Some of our difficulties emanate from the fact that the parallel between curriculum content validity and job content validity is not a perfect one. Generally speaking, an academic achievement test is considered content valid if and when (a) the curriculum universe has been defined (called the "content domain") and (b) the test adequately samples that universe. In contrast, the job performance universe and its parameters are often ill defined, even with careful job analysis. What we can do is isolate specific segments of the job performance universe. In this paper, a job performance domain² is defined as: an identifiable segment or aspect of the job performance universe (a) which has been operationally defined and (b) about which inferences are to be made. Hence, a particular job may have a single job performance domain; more often jobs have several. For example, the job of Typist might have a single job performance domain, i.e., "typing straight copy from rough draft." On the other hand, the job of Secretary might have a job performance universe from which can be extracted several job performance domains, only one of which is "typing straight copy from rough draft." The distinction made here is that in the academic achievement field we seek to define and sample the entire universe; in the job performance field we sample a job performance domain which may or may not approximate the job performance universe. More often it is not the total universe but rather is a segment of it which has been identified and operationally defined.

² The author is aware that certain of his colleagues prefer the term "job content domain." The term "job performance domain" is used here (a) to distinguish it from the content domain concept in achievement testing and (b) to be consistent with the Standards, pp. 28-29.
The nature of job requirements. If a job truly requires a specific skill or certain job knowledge, and a candidate cannot demonstrate the possession of that skill or knowledge, defensible grounds for rejection certainly exist. For example, a retail clerk may spend less than five percent of the working day adding the prices on sales slips; however, a candidate who cannot demonstrate the ability to add whole numbers may defensibly be rejected. Whether or not we attempt to sample other required skills or knowledges is irrelevant to this issue. Similarly, it is irrelevant that other aspects of the job (other job performance domains) may not involve the ability to add whole numbers.

The judgment of experts. Adkins³ has this to say about judgments in discussing the content validity approach:

In academic achievement testing, the judgment has to do with how closely test content and mental processes called into play are related to instructional objectives. In employment testing, the content validation approach requires judgment as to the correspondence of abilities tapped by the test with abilities requisite for job success.

³ Dorothy C. Adkins, as quoted in Mussio and Smith, p. 8.
The crucial question, of course, is, "Whose judgment?" In achieve-
ment testing we normally use subject matter experts to define the
curriculum universe which we then designate as the "content domain."
We may take still another step and have those experts assign weights
to the various portions of a test item budget. From that point on,
content validity is established by demonstrating that the items in the
test appropriately sample the content domain. If the subject matter
experts are generally perceived as true experts, then it is unlikely that
there is a higher authority to challenge the purported content validity
of the test.
When a personnel test is being validated, who are the experts? Are
they job incumbents or supervisors who "know the job?" Or, are they
psychologists or other professionals who are expected to have a
greater understanding of the organization of human personality and/
or greater insight into "what the test measures?" To answer these
questions requires a critical examination of job performance domains
and their characteristics.
The nature of job performance domains. The behaviors constituting job performance domains range all the way from behavior which is directly observable, through that which is reportable, to behavior that is highly abstract. The continuum extends from the exercise of simple proficiencies (i.e., arithmetic and typing) to the use of higher mental processes such as inductive and deductive reasoning. Comparison of the behavior elicited by a test to behavior required on a job involves little or no inference at the "observation" end; however, the higher the level of abstraction, the greater is the "inferential leap" required to demonstrate validity by other than a criterion-related approach. For example, it is one thing to say, "This job performance domain involves the addition of whole numbers; Test A measures the ability to add whole numbers; therefore, Test A is content valid for identifying candidates who have this proficiency." It is quite another thing to say, "This job performance domain involves the use of deductive reasoning; Test B purports to measure deductive reasoning; therefore, Test B is valid for identifying those who are capable of functioning in this job performance domain." At the "observation" end of the continuum, where the "inferential leap" is small or virtually nonexistent, sound judgments can normally be made by incumbents, supervisors, or others who can be shown to "know the job." The more closely the behavior elicited by the test approximates a true "work sample" of the job performance domain, the more competent are people who know the job to assess the content validity of the test. When a job knowledge test is under consideration, they are similarly competent to judge whether or not knowledge of a given bit of job information is relevant to the job performance domain.
Construct validity. On the other hand, when a high level of abstraction is involved and when the magnitude of the "inferential leap" becomes significant, job incumbents and supervisors normally do not have the insights to make the required judgments. When these conditions obtain, we transition from content validity to a construct validity approach. Deductive reasoning, for example, is a psychological "construct." Professionals who make judgments as to whether or not deductive reasoning (a) is measured by this test and (b) is relevant to this job performance domain must rely upon a broad familiarity with the psychological literature. To quote the "Standards":⁴

Evidence of construct validity is not found in a single study; judgments of construct validity are based upon an accumulation of research results.

⁴ Standards for Educational and Psychological Tests, p. 30.
An operational definition. Content validity is the extent to which communality or overlap exists between (a) performance on the test under investigation and (b) ability to function in the defined job performance domain. In summary, content validity analysis procedures are appropriate only when the behavior under scrutiny in the job performance domain falls at or near the "observation" end of the continuum; here, those who "know the job" are normally competent to make the required judgments. However, when the job behavior approaches the abstract end of the continuum, a construct validity approach is indicated; job incumbents and supervisors are normally not qualified to judge. Operationally defined, content validity is: the extent to which members of a Content Evaluation Panel perceive overlap between the test and the job performance domain. Such analyses are essentially restricted to (1) simple proficiency tests, (2) job knowledge tests, and (3) work sample tests.
Measuring the Extent of Overlap
Content evaluation panel. How, then, do we determine the extent of overlap (or communality) between a job performance domain and a specific test? The approach outlined here uses a Content Evaluation Panel composed of persons knowledgeable about the job. Best results have been obtained when the panel is composed of an equal number of incumbents and supervisors. Each member of the Panel is supplied a number of items, either prepared for the purpose or constituting a "shelf" test. Independent of the other panelists, he is asked to respond to the following question for each of the items:
Is the skill (or knowledge) measured by this item
—Essential
—Useful but not essential, or
—Not necessary
to the performance of the job?
Responses from all panelists are pooled and the number indicating
"essential" for each item is determined.
Validity of judgments. Whenever panelists or other experts make judgments, the question properly arises as to the validity of their judgments. If the panelists do not agree regarding the essentiality of the knowledge or skill measured to the performance of the job, then serious questions can be raised. If, on the other hand, they do agree, we must conclude that they are either "all wrong" or "all right." Because they are performing the job, or are engaged in the direct supervision of those performing the job, there is no basis upon which to refute a strong consensus.
Quantifying consensus. When all panelists say that the tested knowledge or skill is "essential," or when none say that it is "essential," we can have confidence that the knowledge or skill is or is not truly essential, as the case might be. It is when the strength of the consensus moves away from unity and approaches fifty-fifty that problems arise.
Two assumptions are made, each of which is consistent with established psychophysical principles:
—Any item, performance on which is perceived to be "essential" by more than half
of the panelists, has some degree of content validity.
—The more panelists (beyond 50%) who perceive the item as "essential," the greater
the extent or degree of its content validity.
With these assumptions in mind, the following formula for the content validity ratio (CVR) was devised:

CVR = (n_e - N/2) / (N/2)

in which n_e is the number of panelists indicating "essential" and N is the total number of panelists. While the CVR is a direct linear transformation from the percentage saying "essential," its utility derives from its characteristics:
—When fewer than half say "essential," the CVR is negative
—When half say "essential" and half do not, the CVR is zero
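
As a worked illustration (the numbers are hypothetical, not from the paper): with N = 10 panelists of whom n_e = 8 rate an item "essential," CVR = (8 - 5)/5 = .60. The Python sketch below computes the ratio directly from n_e and N and reproduces the characteristics listed above.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's content validity ratio: CVR = (n_e - N/2) / (N/2), where
    n_e is the number of panelists rating the item "essential" and
    N is the total number of panelists."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel of 10:
print(content_validity_ratio(8, 10))   # 0.6  (more than half say "essential")
print(content_validity_ratio(5, 10))   # 0.0  (exactly half say "essential")
print(content_validity_ratio(3, 10))   # -0.4 (fewer than half: negative CVR)
```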
