What are the contributions in "Detection of multiple change–points in multivariate time series" ?

The authors consider the multiple change–point problem for multivariate time series, including strongly dependent processes, with an unknown number of change–points. The authors consider applications to multivariate series of daily stock indices returns and series generated by an artificial financial market.

(Open Access) Detection of multiple change-points in multivariate time series (2006) | Marc Lavielle

Detection of Multiple Change–Points in Multivariate Time Series

Marc Lavielle

Université René Descartes and Université Paris–Sud, Laboratoire de Mathématiques

Marc.Lavielle@math.u-psud.fr

Gilles Teyssière

Statistique Appliquée et MOdélisation Stochastique

CES, Université Paris 1 Panthéon–Sorbonne.

stats@gillesteyssiere.net

July 2006

To appear in the Lithuanian Mathematical Journal, vol. 46, 2006

Abstract

We consider the multiple change–point problem for multivariate time series, including strongly dependent processes, with an unknown number of change–points. We assume that the covariance structure of the series changes abruptly at some unknown common change–point times. The proposed adaptive method is able to detect changes in multivariate i.i.d., weakly and strongly dependent series. This adaptive method outperforms the Schwarz criterion, mainly for the case of weakly dependent data. We consider applications to multivariate series of daily stock indices returns and series generated by an artificial financial market.

1 Introduction

Detecting changes in multivariate time series is of interest if we believe that these series are correlated, and/or that the components of the multivariate vector processes are generated by the same process. This assumption is relevant for financial markets where correlated assets are traded. Empirical evidence, reported e.g. in Teyssière [36, 37], shows that several time series, i.e., Foreign Exchange (FX) rates returns, display the same degree of persistence in their volatilities and co–volatilities, a property that might be caused by a common non–stationarity of these series. The presence of strong dependence in asset price volatilities is still a matter of debate, although numerous works, see e.g., Mikosch and Stărică [33], Kokoszka and Teyssière [26], Lavielle and Teyssière [30], have shown that the strong persistence in volatility is likely to be a statistical artefact, i.e., mainly an effect of the concatenation of processes with different unconditional variances; see also Giraitis et al. [14] for a survey on volatility models.

From the point of view of the practitioner, change–point detection procedures are of interest, as we do not know which process actually generates the data under investigation. Furthermore, economic data are usually not stationary, so it may be of interest to approximate an unknown and possibly nonstationary process with locally stationary processes; see e.g., Dahlhaus [12].

The literature on change–point detection is vast: reference monographs include Basseville and Nikiforov [1], Brodsky and Darkhovsky [7], Csörgő and Horváth [11], Chen and Gupta [8]. The journal articles by Giraitis and Leipus [16, 15], Hawkins [19, 20], Chen and Gupta [9], Mia and Zhao [32], Sen and Srivastava [35], among others, are also of interest.

The statistical theory for weakly dependent volatility processes with a change–point was developed more recently; see, e.g., Chu [10], Kokoszka and Leipus [24, 25], Horváth, Kokoszka and Teyssière [21], Kokoszka and Teyssière [26], Berkes et al. [2]. The processes considered in these works are no longer i.i.d., but weakly dependent. For the case of strongly dependent time series, the reader is referred to the papers by Giraitis, Leipus and Surgailis [17] and Lavielle [27], and to the chapter by Kokoszka and Leipus [23] in the book on long-range dependence edited by Doukhan et al. (2003), which reviews the recent works on the issue of change–point detection for univariate dependent time series.

The occurrence of a single change–point in real data is rather rare, as data in economics, finance, hydrology, biology, electrical engineering, etc., display multiple changes; see e.g., Schechtman and Wolfe [34], Braun et al. [6], Lavielle and Moulines [29], Lavielle and Teyssière [30]. Thus, a statistical procedure able to reliably detect multiple changes is of practical interest. It has often been claimed that the testing procedure for a single change–point can be extended to the multiple change–point case by using Vostrikova's [39] binary segmentation procedure, which consists in applying the single change–point detection procedure to the whole sample, splitting the sample at the detected change–point, and then iteratively applying the change–point detection procedure to the two resulting segments until no further change–point is found.
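The binary segmentation recursion described above can be sketched in a few lines of Python. This is only an illustration of the local procedure, not the method proposed in this paper: the segment cost (`sse`, a sum of squared deviations suited to changes in the mean of a scalar series), the stopping rule (`threshold` on the cost reduction) and all function names are assumptions introduced here.

```python
def sse(x):
    """Sum of squared deviations from the mean of x (0 for empty input)."""
    if not x:
        return 0.0
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x)


def best_single_split(x, min_len=2):
    """Return (gain, split) for the best single split of x, or (0.0, None)."""
    best_gain, best_split = 0.0, None
    total = sse(x)
    for s in range(min_len, len(x) - min_len + 1):
        gain = total - sse(x[:s]) - sse(x[s:])
        if gain > best_gain:
            best_gain, best_split = gain, s
    return best_gain, best_split


def binary_segmentation(x, threshold, offset=0, min_len=2):
    """Split x recursively while the best split lowers the cost by more than threshold.

    Hypothetical stopping rule: a real procedure would use a single
    change-point test instead of a fixed threshold on the cost reduction.
    """
    gain, split = best_single_split(x, min_len)
    if split is None or gain <= threshold:
        return []
    left = binary_segmentation(x[:split], threshold, offset, min_len)
    right = binary_segmentation(x[split:], threshold, offset + split, min_len)
    return left + [offset + split] + right


if __name__ == "__main__":
    series = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
    print(binary_segmentation(series, threshold=1.0))
```

Each split is chosen using only the data of the current segment, which is what makes the procedure local: an early split is never revisited once the recursion has moved on.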

In Lavielle and Teyssière [30], we addressed the issue of global procedure vs local procedure, and found that extending single change–point procedures to the multiple change–point case using Vostrikova's [39] binary segmentation procedure is misleading and yields an overestimation of the number of change–points.

A global approach means that all the change–points are simultaneously detected. These change–points are estimated by minimizing a penalized contrast J(τ, y) + β pen(τ) (see [3, 27, 40]). Here, J(τ, y) measures how well the model obtained with the change–point sequence τ fits the observed series y. Its role is to locate the change–points as accurately as possible. For detecting changes in the mean and/or the covariance matrix of a multivariate series, we propose to define the contrast J(τ, y) from the logarithm of a Gaussian likelihood, even if the observed series is not Gaussian. The penalty term pen(τ) only depends on the dimension K(τ) of the model τ and increases with K(τ). The penalization parameter β adjusts the trade-off between the minimization of J(τ, y) (obtained with a high dimension of τ), and the minimization of pen(τ) (obtained with a small dimension of τ).

Asymptotic results have been obtained in theoretical general contexts in [27], extending the previous results of Yao [40]. We shall see that this approach is also very useful for practical applications, for detecting changes in the mean and/or variance of multivariate time series, with the restriction that the series have a common segmentation τ. An adaptive method is proposed for estimating the number of change–points. Numerical experiments show that the proposed method outperforms the Schwarz criterion and yields very good results.

For a multivariate time series, the algorithm of the detection procedure will be of order $O(mn^2)$, where m is the dimension of the vector process, instead of the $O(n^2)$ order as in the univariate case.

2 A penalized contrast estimate for the multivariate change–point problem

2.1 The contrast function

We assume that the m–dimensional process $\{Y_t = (Y_{1,t}, \ldots, Y_{m,t})'\}$ is abruptly changing and is characterized by a parameter $\theta \in \Theta$ that remains constant between two changes. We will rely heavily on this assumption to define our contrast function $J(\tau, Y)$.

Let K be some integer and let $\tau = \{\tau_1, \tau_2, \ldots, \tau_{K-1}\}$ be an ordered sequence of integers satisfying $0 < \tau_1 < \tau_2 < \ldots < \tau_{K-1} < n$. For any $1 \le k \le K$, let $U(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k}; \theta)$ be a contrast function useful for estimating the unknown true value of the parameter in the segment k. In other words, the minimum contrast estimate $\hat\theta(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k})$, computed on the k-th segment of τ, is defined as a solution to the following minimization problem:

$$U\big(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k};\ \hat\theta(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k})\big) \le U(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k}; \theta), \quad \forall \theta \in \Theta. \tag{1}$$

For any $1 \le k \le K$, let G be defined as

$$G(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k}) = U\big(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k};\ \hat\theta(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k})\big). \tag{2}$$

Then, define the contrast function $J(\tau, Y)$ as

$$J(\tau, Y) = \frac{1}{n} \sum_{k=1}^{K} G(Y_{\tau_{k-1}+1}, \ldots, Y_{\tau_k}), \tag{3}$$

where $\tau_0 = 0$ and $\tau_K = n$.
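Because the contrast is a sum of per-segment terms, it can be minimized exactly over all segmentations with a fixed number of segments by dynamic programming. The sketch below assumes that approach; it is not taken from the text. The cost function `segment_cost` (a sum of squares for a scalar series) and the names `best_segmentation`, `best` and `arg` are hypothetical, and the 1/n normalization is omitted since it does not change the minimizer.

```python
def segment_cost(x, a, b):
    """Illustrative cost G on x[a:b]: squared deviations from the segment mean."""
    seg = x[a:b]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg)


def best_segmentation(x, K, cost=segment_cost):
    """Minimize sum_k cost(x, tau_{k-1}, tau_k) over 0 = tau_0 < ... < tau_K = n.

    Returns (minimal cost, [tau_1, ..., tau_{K-1}]).
    """
    n = len(x)
    if not 1 <= K <= n:
        raise ValueError("K must satisfy 1 <= K <= len(x)")
    inf = float("inf")
    # best[k][t]: minimal cost of cutting x[0:t] into k segments
    best = [[inf] * (n + 1) for _ in range(K + 1)]
    # arg[k][t]: start index of the last segment in that optimal cut
    arg = [[0] * (n + 1) for _ in range(K + 1)]
    best[0][0] = 0.0
    for k in range(1, K + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                if best[k - 1][s] == inf:
                    continue
                c = best[k - 1][s] + cost(x, s, t)
                if c < best[k][t]:
                    best[k][t], arg[k][t] = c, s
    # Backtrack from t = n to recover tau_{K-1}, ..., tau_1, tau_0 = 0
    tau, t = [], n
    for k in range(K, 0, -1):
        t = arg[k][t]
        tau.append(t)
    tau.reverse()
    return best[K][n], tau[1:]


if __name__ == "__main__":
    series = [0.0] * 5 + [3.0] * 5 + [1.0] * 5
    print(best_segmentation(series, K=3))
```

This naive version recomputes every segment cost from scratch; precomputing the costs of all segments (or using cumulative sums) is what brings the search down to a number of operations quadratic in n.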

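We consider in this paper changes in the covariance matrix of the sequence $\{Y_t\}$. More precisely, we assume that there exists an integer $K^\star$, a sequence $\tau^\star = \{\tau^\star_1, \tau^\star_2, \ldots, \tau^\star_{K^\star}\}$ with $\tau^\star_0 = 0 < \tau^\star_1 < \ldots < \tau^\star_{K^\star-1} < \tau^\star_{K^\star} = n$ and $K^\star$ (m × m) covariance matrices $\Sigma_1, \Sigma_2, \ldots, \Sigma_{K^\star}$ such that $\mathrm{Cov}(Y_t) = E(Y_t - E(Y_t))(Y_t - E(Y_t))' = \Sigma_k$ for $\tau^\star_{k-1} + 1 \le t \le \tau^\star_k$.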

Model M1: There exists an m-vector $\mu$ such that $E(Y_t) = \mu$ for $t = 1, 2, \ldots, n$. Furthermore, $\Sigma_k \ne \Sigma_{k+1}$ for $1 \le k \le K^\star - 1$.

For this simple case of changes in the covariance matrix without changes in the mean, which is of interest for multivariate volatility processes, the following contrast function, based on a Gaussian log–likelihood function, can be used:

$$J(\tau, Y) = \frac{1}{n} \sum_{k=1}^{K} n_k \log |\hat\Sigma_{\tau_k}|, \tag{4}$$

where $n_k = \tau_k - \tau_{k-1}$ is the length of the segment k, and $\hat\Sigma_{\tau_k}$ is the (m × m) empirical covariance matrix computed on that segment k:

$$\hat\Sigma_{\tau_k} = \frac{1}{n_k} \sum_{t=\tau_{k-1}+1}^{\tau_k} (Y_t - \bar{Y})(Y_t - \bar{Y})'. \tag{5}$$

Here $\bar{Y} = n^{-1} \sum_{t=1}^{n} Y_t$ is the empirical mean of the m–dimensional series $Y_t$ computed on the complete series.
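As an illustration, the model M1 contrast can be evaluated for a given segmentation with a few lines of pure Python. This is a sketch under stated assumptions: it takes the contrast to be $(1/n) \sum_k n_k \log|\hat\Sigma_{\tau_k}|$ with every segment centred on the global mean, and the function names `log_det_spd` and `contrast_m1` are hypothetical. The log-determinant is computed through a Cholesky factorization, which requires each segment covariance to be positive definite (in particular, each segment must be longer than m).

```python
import math


def log_det_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    m = len(a)
    low = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return 2.0 * sum(math.log(low[i][i]) for i in range(m))


def contrast_m1(y, tau):
    """Contrast for model M1 (assumed form, see lead-in).

    y   : list of n observations, each a list of m floats
    tau : interior change-points [tau_1, ..., tau_{K-1}], 0 < tau_k < n
    """
    n, m = len(y), len(y[0])
    ybar = [sum(row[i] for row in y) / n for i in range(m)]  # global mean
    bounds = [0] + list(tau) + [n]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        n_k = hi - lo
        # Empirical covariance of the segment, centred on the global mean
        cov = [[sum((y[t][i] - ybar[i]) * (y[t][j] - ybar[j])
                    for t in range(lo, hi)) / n_k
                for j in range(m)] for i in range(m)]
        total += n_k * log_det_spd(cov)
    return total / n


if __name__ == "__main__":
    unit = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
    data = unit + [[2.0 * a, 2.0 * b] for a, b in unit]
    print(contrast_m1(data, [4]), contrast_m1(data, []))
```

In this toy series the variance doubles in scale after the fourth observation, and the contrast is smaller when the segmentation places a change–point there than when no change–point is used.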

Model M2: There exist $K^\star$ m-vectors $\mu_1, \ldots, \mu_{K^\star}$ such that $E(Y_t) = \mu_k$ for $\tau^\star_{k-1} + 1 \le t \le \tau^\star_k$. Furthermore, $(\mu_k, \Sigma_k) \ne (\mu_{k+1}, \Sigma_{k+1})$ for $1 \le k \le K^\star - 1$.

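For the detection of changes in the mean vector and/or the covariance matrix of a multivariate sequence of random variables, this contrast also reduces to

$$J(\tau, Y) = \frac{1}{n} \sum_{k=1}^{K} n_k \log |\hat\Sigma_{\tau_k}|, \tag{6}$$

but the (m × m) empirical covariance matrix $\hat\Sigma_{\tau_k}$ is computed on segment k as

$$\hat\Sigma_{\tau_k} = \frac{1}{n_k} \sum_{t=\tau_{k-1}+1}^{\tau_k} (Y_t - \bar{Y}_{\tau_k})(Y_t - \bar{Y}_{\tau_k})', \tag{7}$$

where $\bar{Y}_{\tau_k} = n_k^{-1} \sum_{t=\tau_{k-1}+1}^{\tau_k} Y_t$ is the empirical mean of the m–dimensional series $Y_t$ computed on that segment.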
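The only difference between the two contrasts is the centring of each segment: the global mean under M1, the segment mean under M2. The sketch below makes this visible in the scalar case m = 1, where the determinant of the covariance matrix reduces to a variance. It assumes the contrast has the form $(1/n) \sum_k n_k \log(\text{variance}_k)$; the function name `contrast_univariate` and its `segment_mean` flag are hypothetical.

```python
import math


def contrast_univariate(x, tau, segment_mean):
    """(1/n) * sum_k n_k * log(variance_k) for a scalar series x.

    segment_mean=False centres every segment on the global mean (as in M1);
    segment_mean=True centres each segment on its own mean (as in M2).
    Every segment must have a strictly positive empirical variance.
    """
    n = len(x)
    global_mean = sum(x) / n
    bounds = [0] + list(tau) + [n]
    total = 0.0
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        seg = x[lo:hi]
        centre = sum(seg) / len(seg) if segment_mean else global_mean
        var = sum((v - centre) ** 2 for v in seg) / len(seg)
        total += len(seg) * math.log(var)
    return total / n


if __name__ == "__main__":
    # The mean jumps from 0 to 10 after the fourth observation;
    # the spread around each segment mean is unchanged.
    series = [-1.0, 1.0, -1.0, 1.0, 9.0, 11.0, 9.0, 11.0]
    print(contrast_univariate(series, [4], segment_mean=True))
    print(contrast_univariate(series, [4], segment_mean=False))
```

With a jump in the mean, centring on the global mean inflates the segment variances, so the M1-style contrast is much larger than the M2-style contrast for the same segmentation.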

Asymptotic results for the minimum contrast estimate of $\tau^\star$ can be obtained within the following asymptotic framework:

A1 For any $1 \le i \le m$ and any $1 \le t \le n$, define $\eta_{t,i} = Y_{t,i} - E(Y_{t,i})$. There exist $C > 0$ and $1 \le h < 2$ such that for any $u \ge 0$ and any $s \ge 1$,

$$E\Big(\sum_{t=u+1}^{u+s} \eta_{t,i}\Big)^2 \le C(\theta)\, s^h. \tag{8}$$

(A1 holds with h = 1 for weakly dependent sequences and 1 < h < 2 for strongly dependent sequences.)

A2 There exists a sequence $0 < a_1 < a_2 < \ldots < a_{K^\star-1} < a_{K^\star} = 1$ such that for any $n > 1$ and for any $1 \le k \le K^\star - 1$, $\tau^\star_k = [n a_k]$.

When the true number $K^\star$ of segments is known, we have the following result concerning the rate of convergence of the minimum contrast estimator of $\tau^\star$.

Theorem 2.1 Assume that conditions A1-A2 are satisfied. Under model M1 (resp. model M2), let $\hat\tau_n$ be the time instants that minimize the empirical contrast $J(\tau, Y)$ defined in (4) (resp. (6)). Then, the sequence $\{\|\hat\tau_n - \tau^\star\|_\infty\}$ is uniformly tight in probability:

$$\lim_{n \to \infty} \lim_{\delta \to \infty} P\Big(\max_{1 \le k \le K^\star - 1} |\hat\tau_{n,k} - \tau^\star_k| > \delta\Big) = 0. \tag{9}$$

(Here, $J(\tau, Y)$ is minimized over all possible sequences τ of length $K^\star$.)

Proof: The proof is a direct application of Theorem 2.4 by Lavielle [27]. We can easily check that hypotheses H1-H2 of [27] are satisfied under models M1 and M2 and under hypotheses A1-A2. □

This result means that the rate of convergence of $\hat\tau_n$ does not depend on the covariance structure of the sequence $\{Y_t\}$. For strongly mixing sequences, as well as for strongly dependent sequences, the optimal rate is obtained since $\|\hat\tau_n - \tau^\star\|_\infty = O_P(1)$.

2.2 Penalty functions for the change–point problem

When the number of change–points is unknown, we estimate it by minimizing a penalized version of the function $J(\tau, Y)$. For any sequence of change–point instants τ, let pen(τ) be a function of τ that increases with the number K(τ) of segments of τ. Then, let $\{\hat\tau_n\}$ be the sequence of change–point instants that minimizes

$$U(\tau) = J(\tau, Y) + \beta\, \mathrm{pen}(\tau). \tag{10}$$

The procedure is intuitively simple: the goodness-of-fit criterion must be offset by a penalty so that over-segmentation is penalized. However, this penalty must not be too large, since too large a penalty function yields an underestimation of the number of segments.

If β is a function of n that goes to 0 at an appropriate rate as n goes to infinity, the following theorem states that the estimated number of segments converges in probability to $K^\star$ and that (9) still holds.

Theorem 2.2 Let $\{\beta_n\}$ be a positive sequence of real numbers such that

$$\beta_n \underset{n \to \infty}{\longrightarrow} 0 \quad \text{and} \quad n^{2-h} \beta_n \underset{n \to \infty}{\longrightarrow} \infty, \qquad 1 \le h < 2. \tag{11}$$

Then, under A1-A2, the estimated number of segments $K(\hat\tau_n)$, where $\hat\tau_n$ is the minimum penalized contrast estimate of $\tau^\star$ obtained by minimizing $J(\tau, Y) + \beta_n\, \mathrm{pen}(\tau)$, converges in probability to $K^\star$. (Here, $J(\tau, Y)$ is minimized over all possible sequences τ and over all possible $1 \le K \le K_{\max}$, where $K_{\max}$ is some known upper bound of $K^\star$.)

Proof: The proof is a direct application of Theorem 3.1 by Lavielle [27]. □

In practice, asymptotic results are not very useful for selecting the penalty term β pen(τ). Indeed, given a real observed signal with a fixed and finite length n, the parameter β must be fixed to some arbitrary value. When the parameter β is chosen to be very large, only the most significant abrupt changes are detected. Conversely, a small value of β produces a high number of estimated changes. Therefore, a trade-off must be made, i.e., we have to select a value of β which yields a reasonable level of resolution in the segmentation.
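The effect of β on the estimated number of segments can be sketched as follows. The example assumes the simple penalty pen(τ) = K(τ) and takes as input the minimal contrast value already computed for each number of segments; the function name `select_num_segments` and the contrast values are hypothetical and only serve as an illustration.

```python
def select_num_segments(contrasts, beta):
    """Return the K minimizing J_K + beta * K.

    contrasts[K - 1] is the minimal contrast J_K obtained with K segments,
    for K = 1, ..., K_max. Assumes the penalty pen(tau) = K(tau).
    Ties are resolved in favour of the smaller K.
    """
    best_k, best_val = None, float("inf")
    for k, j in enumerate(contrasts, start=1):
        val = j + beta * k
        if val < best_val:
            best_k, best_val = k, val
    return best_k


if __name__ == "__main__":
    # Made-up contrast values: large gains up to K = 3, marginal gains after.
    j_values = [10.0, 4.0, 1.0, 0.9, 0.85]
    for beta in (0.01, 0.5, 5.0, 100.0):
        print(beta, select_num_segments(j_values, beta))
```

With these values, a very small β keeps all five segments, a moderate β stops at the point where the contrast flattens out, and a very large β returns a single segment, which mirrors the trade-off described above.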

Various authors suggest different penalty functions according to the model they consider. For example, the Schwarz criterion is used by Braun et al. [6] for detecting changes in a DNA sequence.

Consider first the penalty function pen(τ). By definition, pen(τ) should increase with the number of segments K(τ). Following the most popular information criteria, such as the AIC and the Schwarz criterion, the simplest penalty function pen(τ) = K(τ) can be used. Furthermore, Yao [40] has proved the consistency of the Schwarz criterion for some models.

Remark 2.3 For the multivariate i.i.d. case, the penalization parameter for the Schwarz criterion is

$$\beta = \frac{m(m+1)}{2}\, \frac{\log n}{n}. \tag{12}$$

In order to reduce the computational cost of the algorithm and according to the required precision in the

estimation, the change-points can be detected on a sub-grid d, 2d, 3d, . . . of 1, 2, . . . , n (we used d = 10 in
