Sparse Causal Discovery in Multivariate Time Series
Abstract
Our goal is to estimate causal interactions in multivariate time series. Using vector autoregressive (VAR) models, these can be defined based on non-vanishing coefficients belonging to respective time-lagged instances. As in most cases a parsimonious causality structure is assumed, a promising approach to causal discovery consists in fitting VAR models with an additional sparsity-promoting regularization. Along this line, we here propose that sparsity should be enforced for the subgroups of coefficients that belong to each pair of time series, as the absence of a causal relation requires the coefficients for all time-lags to become jointly zero. Such behavior can be achieved by means of $\ell_{1,2}$-norm regularized regression, for which an efficient active set solver has been proposed recently. Our method is shown to outperform standard methods in recovering simulated causality graphs. The results are on par with a second novel approach which uses multiple statistical testing.
Keywords: Vector Autoregressive Model, Granger Causality, Group Lasso, Multiple Testing
1. Introduction
Causality is commonly defined based on the widely accepted assumption that an effect is always preceded by its cause. Granger (1969) postulates a measure of causal influence between two time series (Granger Causality). In a nutshell, a time series $z_i$ Granger-causes a time series $z_j$ if knowledge of past values of $z_i$ improves the prediction of $z_j$ (compared to only using past values of $z_j$). The improvement is assessed by means of the Granger score, which is defined as the logarithm of the ratio of the residual variances of the two models, (1) including only $z_j$ and (2) including both $z_i$ and $z_j$.
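To make this concrete, the following minimal sketch (with hypothetical helper names and plain least-squares AR fits) computes the bivariate Granger score of a candidate driver $z_i$ for a target $z_j$:

```python
import numpy as np

def residual_variance(y, X):
    """Least-squares fit of y on X; returns the variance of the residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coef)

def granger_score(zi, zj, P):
    """Log-ratio of residual variances of the restricted model
    (past of zj only) and the full model (past of zj and zi)."""
    T = len(zj)
    lagged = lambda z: np.column_stack([z[P - p : T - p] for p in range(1, P + 1)])
    y = zj[P:]  # targets z_j(P+1), ..., z_j(T)
    var_restricted = residual_variance(y, lagged(zj))
    var_full = residual_variance(y, np.hstack([lagged(zj), lagged(zi)]))
    return np.log(var_restricted / var_full)
```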
In the case of a set $F = \{z_1, \ldots, z_M\}$ of time series, the pairwise analysis may lead to spurious detection of a causal relation. For this reason it is advisable to additionally include the set $F \setminus \{z_i, z_j\}$ of all other observable time series in both models. This approach, to which we refer
as complete (or conditional) Granger Causality, resolves the problem of spurious causality due to common hidden factors $z_*$ if $z_* \in F$. If the $z_*$ are not observable, Granger causality fails, and we refer to Nolte et al. (2008) for a detailed discussion and a remedy.
To illustrate the problem, consider a hidden driving factor that is equally pronounced in two variables $z_{i'}$ and $z_{i''}$. If both variables contain roughly the same amount of noise, all of the sets $F$, $F \setminus \{z_{i'}\}$ and $F \setminus \{z_{i''}\}$ provide equal information about $z_j$, for which reason complete Granger causality will identify neither $z_{i'}$ nor $z_{i''}$ as a driver. This type of mistake can only be avoided if each set $F \setminus \{z_{i'}\}$ is tested against all sets not including $z_{i'}$, which leads to exponential complexity.
An elegant alternative to the pairwise comparisons of (complete) Granger causality is to
handle all potential causal relations between all time series at once. Assuming linear dynamics of the system under study, this leads us to the vector autoregressive (VAR) model. Interestingly,
the parameters of the VAR model induce a natural alternative definition of causal influence,
which is compliant with Granger’s considerations.
In many applications the true causality graph is assumed to be sparse, i.e. only a few causal
interactions between time series are expected. Ordinary Least Squares (OLS) and Ridge Regression, which are usually used for fitting VAR models, are however known to produce dense coefficients. Only recently have Valdes-Sosa et al. (2005) proposed to enforce estimation of sparse AR coefficients using $\ell_1$-norm regularized models such as the Lasso (Tibshirani, 1996).
In this paper we propose a novel sparse approach which – unlike the Lasso – accounts for the fact that the absence of a causal relation between $z_i$ and $z_j$ requires all AR coefficients belonging to that pair of time series to be jointly zero. Furthermore, we consider Ridge Regression
in combination with the multiple statistical testing procedure provided by Hothorn et al. (2008).
More details on the methodology are given in Section 3. These methods are evaluated and
compared to standard approaches in extensive simulations.
2. Background
In this section, we briefly summarize related approaches to estimate sparse vector autoregressive
models in the context of causal discovery. We roughly distinguish between sparse estimation
methods and testing strategies.
Given a multivariate time series $z(t) \in \mathbb{R}^M$, a linear vector autoregressive process of order $P$ is defined as
$$z(t) = \sum_{p=1}^{P} A^{(p)}\, z(t-p) + \varepsilon(t)\,, \qquad (1)$$
where $A^{(p)} \in \mathbb{R}^{M \times M}$, $\varepsilon \sim \mathcal{N}(0, \sigma^2 I)$ and $t \in \mathbb{Z}$ indicates time. Hence, the signal at time $t$ is
modeled as a linear combination of its $P$ past values and Gaussian measurement noise. Inspired by the initial assumption that the cause should always precede the effect, we suggest the following definition of causality. We say that time series $z_i$ has a causal influence on time series $z_j$ if for at least one $p \in \{1, \ldots, P\}$ the coefficient $A^{(p)}_{ji}$, corresponding to the interaction between $z_j$ and $z_i$ at the $p$th time-lag, is nonzero.
Thus, causal inference may be conducted by estimating the matrices $A^{(p)}$ from a sample $Z = (z(1), \ldots, z(T))$. Let us introduce the following shortcuts. We denote by $A = \left(A^{(1)}, \ldots, A^{(P)}\right)^\top$ the matrix of all VAR coefficients and set $X = (Z_1, \ldots, Z_P)$, $Y = Z_0$, $Z_p = (z(P+1-p), \ldots, z(T-p))^\top$. Here $\mathrm{vec}(\cdot)$ denotes the vectorization operation.
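For illustration, a sketch of these shortcuts, assuming the sample is stored row-wise in an array Z of shape (T, M):

```python
import numpy as np

def var_design(Z, P):
    """Lagged design matrix X and target Y for a sample Z of shape (T, M),
    following the shortcuts in the text: Y = Z_0 and X = (Z_1, ..., Z_P)
    with Z_p = (z(P+1-p), ..., z(T-p))^T."""
    T, _ = Z.shape
    Y = Z[P:]                                                   # (T-P, M)
    X = np.hstack([Z[P - p : T - p] for p in range(1, P + 1)])  # (T-P, M*P)
    return X, Y
```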
2.1 Sparsity
Probably the most straightforward way to estimate a sparse VAR model is to use $\ell_1$-regularization on the set of coefficients,
$$\hat{A}_{\text{lasso}} = \arg\min_{A} \|\mathrm{vec}(XA - Y)\|_2^2 + \lambda\, \|\mathrm{vec}(A)\|_1\,, \quad \lambda \geq 0\,.$$
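Such an estimate can be obtained with any off-the-shelf Lasso solver; below is a sketch using scikit-learn, reusing the hypothetical var_design helper from above (note that sklearn scales the squared loss by 1/(2n), so its alpha corresponds to a rescaled $\lambda$):

```python
import numpy as np
from sklearn.linear_model import Lasso

def fit_var_lasso(Z, P, lam):
    """Sparse VAR fit; each column of A is an independent Lasso problem,
    handled jointly by sklearn's multi-target Lasso."""
    X, Y = var_design(Z, P)
    model = Lasso(alpha=lam, fit_intercept=False, max_iter=10_000)
    model.fit(X, Y)
    # coef_.T has shape (M*P, M); vertical blocks stack A^(1).T, ..., A^(P).T
    return model.coef_.T
```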
2.2 Testing
Just as in the case of sparse methods, it is often suggested to transform the regression task into the estimation of the matrix of partial correlation coefficients between time-lagged copies of the time series. While Drton and Perlman (2008) estimate the correlation matrix in an unregularized way, Opgen-Rhein and Strimmer (2007) propose a shrinkage estimator, which is superior in the case of high-dimensional data (Schäfer and Strimmer, 2005). Afterwards, significant partial correlations are detected by controlling false discovery rates. While the latter approach has only been tested for P = 1, it is straightforward to extend it to higher-order VARs.
3. Our Approach
In the following, we provide the details of the alternative testing strategy (Subsection 3.1) and of the groupwise sparse estimation (Subsection 3.2).
3.1 Testing
As a baseline, the VAR coefficients can be estimated by Ridge Regression,
$$\hat{A}_{\text{ridge}} = \arg\min_{A} \|\mathrm{vec}(XA - Y)\|_2^2 + \lambda\, \|\mathrm{vec}(A)\|_2^2\,, \quad \lambda \geq 0\,. \qquad (2)$$
Thanks to the Ridge penalty, Eq. 2 delivers solutions with small coefficients, which, however, are in general never exactly zero. In the strict sense of Granger, this corresponds to a fully-connected dependency graph, rendering Ridge Regression an improper candidate for sparse causal recovery. On the other hand, many of the estimated coefficients are expected to be non-significant. Hence, we propose a sparsification by means of statistical testing, where our approach, in contrast to e.g. bootstrapping, is to derive p-values explicitly.
From Eq. 2 it is apparent that the estimation can be carried out independently for each column of $A$, and so can the testing. Let therefore $\alpha_k$ denote the $k$th column of $A$ and let $y_k = (z_k(P+1), \ldots, z_k(T))^\top$. Neglecting the dependency between $X$ and $Y$, the Ridge coefficients depend linearly on $Y$, and we can conclude that under the null hypothesis $H_0: \alpha_k = 0$ we have $\hat{\alpha}_k \sim \mathcal{N}(0, \sigma_k^2 \Sigma)$ with $\Sigma = \left(X^\top X + \lambda I\right)^{-1} X^\top X \left(X^\top X + \lambda I\right)^{-1}$. Furthermore, setting $H = X \left(X^\top X + \lambda I\right)^{-1} X^\top$, an estimate of the model variance $\sigma_k^2$ is given by
$$\hat{\sigma}_k^2 = \frac{\|y_k - H y_k\|^2}{\mathrm{trace}\left((I - H)(I - H^\top)\right)}\,. \qquad (3)$$
Using Eq. 3 we can now construct normalized test statistics $\tilde{\alpha}_{ik} = \hat{\alpha}_{ik} / \sqrt{\hat{\sigma}_k^2 \Sigma_{ii}}$, which are jointly normally distributed with $\tilde{\alpha} \sim \mathcal{N}(0, R)$ and $R_{ij} := \Sigma_{ij} / \sqrt{\Sigma_{ii} \Sigma_{jj}}$. Suppose we want to test all individual hypotheses $H_{0,i}: \alpha_{ik} = 0$ simultaneously. Then, according to Hothorn et al. (2008), the adjusted p-values are $p_i = 1 - g(R, |\tilde{\alpha}_{ik}|)$. We reject a hypothesis if the p-value is below the predefined significance level $\gamma$. Here,
$$g(R, t) = P\left(\max_i |\tilde{\alpha}_{ik}| \leq t\right) = \int_{-t}^{t} \cdots \int_{-t}^{t} \varphi(\alpha_1, \ldots, \alpha_{MP})\, \mathrm{d}\alpha_1 \cdots \mathrm{d}\alpha_{MP} \qquad (4)$$
and $\varphi(\alpha)$ is the density function of the multivariate normal distribution $\mathcal{N}(0, R)$.
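A sketch of this testing procedure for a single column, assuming X and y_k as defined above, and substituting plain Monte Carlo sampling for the quasi-Monte Carlo integration of Genz (1992):

```python
import numpy as np

def ridge_test_pvalues(X, y_k, lam, n_mc=100_000, seed=0):
    """Normalized Ridge statistics (Eq. 3) and adjusted p-values for
    H_0i: alpha_ik = 0, with g(R, t) estimated by Monte Carlo sampling."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    G = np.linalg.inv(X.T @ X + lam * np.eye(d))
    alpha_hat = G @ X.T @ y_k                 # Ridge coefficients for column k
    Sigma = G @ X.T @ X @ G                   # Cov(alpha_hat), up to sigma_k^2
    H = X @ G @ X.T                           # hat matrix
    I = np.eye(n)
    sigma2 = np.sum((y_k - H @ y_k) ** 2) / np.trace((I - H) @ (I - H).T)
    t = np.abs(alpha_hat) / np.sqrt(sigma2 * np.diag(Sigma))
    s = np.sqrt(np.diag(Sigma))
    R = Sigma / np.outer(s, s)                # correlation matrix R_ij
    # p_i = 1 - g(R, t_i) = P(max_j |a_j| > t_i) for a ~ N(0, R)
    a = rng.multivariate_normal(np.zeros(d), R, size=n_mc)
    max_abs = np.abs(a).max(axis=1)
    p_adj = np.array([(max_abs > ti).mean() for ti in t])
    return t, p_adj
```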
3.2 Groupwise sparsity
To ensure that all coefficients belonging to a pair of time series vanish jointly, we penalize the $\ell_2$-norms of the corresponding coefficient groups. That is, we estimate
$$\hat{A}_{\text{glasso}} = \arg\min_{A} \|\mathrm{vec}(XA - Y)\|_2^2 + \kappa\, \Omega(A)\,, \quad \kappa \geq 0\,, \qquad (5)$$
with the $\ell_{1,2}$-norm penalty
$$\Omega(A) = \left\| \left(A^{(1)}_{11}, \ldots, A^{(P)}_{MM}\right)^\top \right\|_2 + \sum_{i \neq j} \left\| \left(A^{(1)}_{ij}, \ldots, A^{(P)}_{ij}\right)^\top \right\|_2\,. \qquad (6)$$
This penalty leads to a groupwise variable selection, i.e. whole blocks of coefficients become jointly zero. Note that the first term in Eq. 6 penalizes the MP coefficients describing univariate relations. In this way, those coefficients are shrunk and hence overfitting is avoided. Furthermore, we remark that it is also conceivable to split the whole estimation of $A$ into $M$ subproblems (as suggested in Subsection 3.1), which is desirable in large-scale scenarios.
Eqs. 5 and 6 define a non-differentiable but convex optimization problem which can be
solved in polynomial time by means of Second-Order Cone Programming (SOCP). For problems with an expected sparse structure, however, the optimization can be carried out much more
efficiently using the results of Roth and Fischer (2008). By keeping a set of active coefficient
groups, their algorithm needs to call the SOCP solver only for problem sizes far smaller than
the original problem – leading to a considerable reduction of memory usage and computation
time. In the experiments, we employ the active-set algorithm of Roth and Fischer (2008) in
combination with a freely available SOCP solver (Sturm, 1999).
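For illustration, the following sketch solves the same $\ell_{1,2}$-regularized objective for a single column of A by proximal gradient descent; this is not the active-set algorithm of Roth and Fischer (2008), merely a simple reference implementation under the stated grouping:

```python
import numpy as np

def group_lasso_prox_grad(X, y, groups, kappa, n_iter=500):
    """Proximal gradient descent for min ||X b - y||_2^2 + kappa * sum_g ||b_g||_2.
    `groups` is a list of index arrays, one per coefficient group."""
    L = 2 * np.linalg.eigvalsh(X.T @ X).max()   # Lipschitz constant of the gradient
    eta = 1.0 / L
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        b = b - eta * 2 * X.T @ (X @ b - y)     # gradient step on the squared loss
        for g in groups:                        # groupwise soft-thresholding (prox)
            norm = np.linalg.norm(b[g])
            b[g] = 0.0 if norm <= eta * kappa else b[g] * (1 - eta * kappa / norm)
    return b
```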
4. Simulations
We conduct a series of experiments in which the causal structure of simulated data has to be
recovered. The comparison includes the proposed groupwise sparse approach, standard Lasso, Ridge Regression with multiple testing, and complete Granger Causality based on AR models. All four approaches are applied both with and without knowledge of the true model order.
In the latter case P = 10 is chosen for the reconstruction. For all methods considered, it is also
possible to estimate the model order P, e.g., via cross-validation.
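A sketch of such a model-order selection by cross-validated squared prediction error (hypothetical helper; it reuses var_design from Section 2 and, as a simplification, ignores the temporal dependence between folds):

```python
import numpy as np

def select_order(Z, P_max=10, n_folds=10, ridge_lam=1.0):
    """Pick the VAR order with the smallest cross-validated squared
    one-step-ahead prediction error, using Ridge fits for simplicity."""
    errors = []
    for P in range(1, P_max + 1):
        X, Y = var_design(Z, P)
        folds = np.array_split(np.arange(len(Y)), n_folds)
        err = 0.0
        for test in folds:
            train = np.setdiff1d(np.arange(len(Y)), test)
            G = np.linalg.inv(X[train].T @ X[train] + ridge_lam * np.eye(X.shape[1]))
            A = G @ X[train].T @ Y[train]       # Ridge fit on the training fold
            err += np.sum((Y[test] - X[test] @ A) ** 2)
        errors.append(err)
    return int(np.argmin(errors)) + 1
```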
4.1 Setup
Each simulated data set consists of a multivariate time series with parameters M = 7 and T = 1000 that is generated by a random VAR process of order P = 5 according to Eq. 1. The distribution of the noise component $\varepsilon(t)$ is chosen to be the standard normal distribution. The VAR coefficients for all but 10 randomly chosen pairs of time series are set to zero, yielding exactly 10 causal interactions. The nonzero coefficients are drawn randomly from $\mathcal{N}(0, 0.04\, I)$. Each set of VAR coefficients is tested for the stability of its induced dynamical system by looking at the eigenvalues of the corresponding transition matrix. Only coefficients leading to stable systems (i.e., those whose transition matrices have eigenvalues strictly smaller than one in modulus) are accepted. We consider the following three types of problems, for each of which we created 10 instances: 1) no noise is added to the data generated by the VAR model; 2) the data are superimposed by Gaussian noise of approximately the same strength, which is uncorrelated (white) both across time and sensors; 3) the data are superimposed by mixed noise of approximately the same strength, which is generated as a random instantaneous mixture of M univariate AR processes of order 20. Note that in none of these cases does the noise itself possess a causal structure that could superimpose the true structure.
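A sketch of this rejection-sampling procedure (hypothetical function name; whether self-connections count as pairs is an assumption — here they are left at zero):

```python
import numpy as np

def random_stable_var(M=7, P=5, n_pairs=10, coef_std=0.2, seed=None, max_tries=1000):
    """Draw sparse VAR(P) coefficients with n_pairs nonzero off-diagonal
    interactions, rejecting draws whose companion (transition) matrix
    has spectral radius >= 1."""
    rng = np.random.default_rng(seed)
    off_diag = [(i, j) for i in range(M) for j in range(M) if i != j]
    for _ in range(max_tries):
        A = np.zeros((P, M, M))
        for idx in rng.choice(len(off_diag), size=n_pairs, replace=False):
            i, j = off_diag[idx]
            A[:, i, j] = rng.normal(0.0, coef_std, size=P)  # N(0, 0.04) entries
        C = np.zeros((M * P, M * P))                        # companion matrix
        C[:M, :] = np.concatenate(A, axis=1)                # top block [A^(1) ... A^(P)]
        C[M:, :-M] = np.eye(M * (P - 1))                    # shift identities
        if np.abs(np.linalg.eigvals(C)).max() < 1.0:        # strictly stable
            return A
    raise RuntimeError("no stable coefficient set found")
```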
For measuring performance we consider Receiver Operating Characteristics (ROC) curves,
which allow objective assessment of the performance in different regimes (e.g. very few false
positives). As an additional measure of absolute performance we also calculate the Area Under
Curve (AUC). ROC curves and AUC values are averaged across the 10 problem instances and
standard errors are computed for AUC.
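For completeness, a minimal sketch that computes the ROC curve and AUC from a matrix of interaction scores and the ground-truth binary influence matrix:

```python
import numpy as np

def roc_auc(scores, truth):
    """ROC curve and AUC from interaction scores and the binary
    ground-truth causality matrix (off-diagonal entries only)."""
    mask = ~np.eye(truth.shape[0], dtype=bool)
    s, t = scores[mask], truth[mask].astype(bool)
    order = np.argsort(-s)                      # decreasing threshold
    tpr = np.cumsum(t[order]) / max(t.sum(), 1)
    fpr = np.cumsum(~t[order]) / max((~t).sum(), 1)
    fpr = np.concatenate([[0.0], fpr])          # anchor the curve at the origin
    tpr = np.concatenate([[0.0], tpr])
    return fpr, tpr, np.trapz(tpr, fpr)
```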
Complete Granger Causality is calculated using the Levinson-Wiggins-Robinson algorithm for fitting AR models (Marple, 1987), which is available in the open-source Biosig toolbox (Schlögl, 2003). For each pair of variables, the Granger score is calculated and standardized by dividing it by its standard deviation as estimated by the jackknife. To obtain an ROC curve, the standardized scores are thresholded at different values, ranging from completely sparse to completely dense solutions.
The regularization parameter $\lambda$ of Ridge Regression is chosen via 10-fold cross-validation (with respect to time-series prediction accuracy). For this value of $\lambda$, we derive the test statistics defined in Subsection 3.1. The multidimensional integrals in Eq. 4 are computed using Monte Carlo sampling according to Genz (1992). ROC curves are constructed by varying the significance level $\gamma$.
For Lasso and Group Lasso, solutions ranging from completely sparse to completely dense are obtained through variation of the regularization constants $\lambda$ and $\kappa$, respectively.
4.2 Results
Figure 1 shows an exemplary reconstruction result. The top left panel depicts the $\ell_2$-norms of the generating AR coefficients belonging to each pair of variables. Following Granger, this defines the binary causal influence matrix in the bottom row, where black boxes indicate causal interactions.
The reconstructions for the different methods are here based on a point estimate of the VAR coefficients, rather than the whole ROC curve. For Granger causality, this estimate is obtained by thresholding the standardized Granger score: a causal influence is deemed significant if the standardized score exceeds a threshold of 0.5. The regularizing constant of
Ridge Regression, Lasso and Group Lasso is fixed using 10-fold cross-validation. Note that for
the Lasso variants, this already determines the sparse causality structure. For Ridge Regression,
we perform subsequent sparsification using a significance level of γ = 0.05.
We display the estimated binary influence matrices in the bottom row of Figure 1. For the sake of comprehensibility, the top row shows the quantities from which these matrices are derived by means of thresholding. In the cases of Lasso and Group Lasso these quantities are simply the estimated AR coefficients, and the threshold is zero (up to machine precision). For Ridge Regression we depict the negative logarithmic p-values derived from the AR coefficients, while for complete Granger causality the standardized Granger score is shown.
[Figure 1 graphic: two rows of five 7 × 7 matrices with both axes indexed 1–7; columns labeled TRUE, GRANGER, RIDGE, LASSO, GLASSO.]
Figure 1: Simulated causal influence matrix and estimates according to Granger Causality,
Ridge Regression, Lasso and Group Lasso. In the top row the generating AR co-
efficients and their Lasso/Group Lasso estimates are shown, as well as the p-values
derived from Ridge Regression and the (complete) Granger-score. The bottom row
depicts the binarized causal influence matrices.
Table 1 summarizes the AUC scores obtained in the experiments described above. The
complementing ROC curves are shown in Figure 2. In short it can be stated that Group Lasso
and Ridge Regression outperform their competitors in all scenarios, although not always sig-
nificantly. While Ridge Regression performs slightly better than Group Lasso in the noiseless
condition, Group Lasso has a clearly visible yet insignificant advantage over all methods in the
white noise setting. Under the influence of mixed noise Ridge Regression and Group Lasso are
on par. Note furthermore that the ROC curve for Lasso lies below that of Group Lasso, which indicates that Lasso solutions tend to be too dense. Interestingly, knowledge of the true model order
hardly provided any significant advantage in our simulations.
5. Conclusion
We presented a novel approach for causal discovery in multivariate time series which is based
on the Group Lasso. As an alternative we also discussed Ridge Regression with subsequent
multiple testing according to Hothorn et al. (2008) which is also novel in the context of VAR
[Figure 2 graphic: 2 × 3 grid of ROC panels (Sensitivity vs. Specificity), rows P = 5 and P = 10, columns NO NOISE, WHITE NOISE, MIXED NOISE.]
Figure 2: Average ROC curves of Granger Causality (red), Ridge Regression (green), Lasso
(blue) and Group Lasso (black) in three different noise conditions and for two differ-
ent model orders.
modeling. Both approaches were shown to outperform standard methods in simulated scenarios.
Future research will aim at applying our techniques to real-world problems. Given that the
sparsity assumption is correct, our Group Lasso approach should be able to handle much larger
problems than the ones that were considered here by 1) splitting the problem into M independent
subproblems and 2) using the active set solver of Roth and Fischer (2008) in combination with
strong regularization that ensures staying in the sparse regime. We expect that this will allow
large-scale applications such as the estimation of cerebral information flow from functional Magnetic Resonance Imaging (fMRI) recordings to benefit from the improved accuracy of
our approach.
Acknowledgments
This work was supported in part by the German BMBF (FKZ 01GQ0850, 01-IS07007A and
16SV2234) and the FP7-ICT Programme of the European Community under the PASCAL2
Network of Excellence, ICT-216886. We thank Thorsten Dickhaus for discussions.
References
A. Arnold, Y. Liu, and N. Abe. Temporal Causal Modeling with Graphical Granger Methods.
In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, pages 66–75, 2007.
F.R. Bach, G.R.G. Lanckriet, and M.I. Jordan. Multiple kernel learning, conic duality and the
SMO algorithm. In Proceedings of the Twenty-first International Conference on Machine
Learning, 2004.
M. Drton and M.D. Perlman. A SINful approach to Gaussian graphical model selection. Journal
of Statistical Planning and Inference, 138(4):1179–1200, 2008.
Alan Genz. Numerical computation of multivariate normal probabilities. Journal of Computa-
tional and Graphical Statistics, 1:141–150, 1992.
C.W.J. Granger. Investigating causal relations by econometric models and cross-spectral meth-
ods. Econometrica, 37:424–438, 1969.
S. Haufe, V.V. Nikulin, A. Ziehe, K.-R. Müller, and G. Nolte. Combining sparsity and rotational invariance in EEG/MEG source reconstruction. NeuroImage, 42(2):726–738, 2008.
T. Hothorn, F. Bretz, and P. Westfall. Simultaneous inference in general parametric models. Biometrical Journal, 50(3):346–363, 2008.
S.L. Marple. Digital Spectral Analysis with Applications. Prentice-Hall, 1987.
G. Nolte, A. Ziehe, V.V. Nikulin, A. Schlögl, N. Krämer, T. Brismar, and K.R. Müller. Robustly
Estimating the Flow Direction of Information in Complex Physical Systems. Physical Review
Letters, 100(23):234101, 2008.
R. Opgen-Rhein and K. Strimmer. Learning causal networks from systems biology time course
data: an effective model selection procedure for the vector autoregressive process. BMC
Bioinformatics, 9, 2007.
V. Roth and B. Fischer. The Group Lasso for Generalized Linear Models: Uniqueness of Solutions and Efficient Algorithms. In Proceedings of the 25th International Conference on Machine Learning, pages 848–855, 2008.
J. Schäfer and K. Strimmer. A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1), Article 32, 2005.
A. Schlögl. BioSig – an open source software library for biomedical signal processing, 2003. URL http://biosig.sourceforge.net.
S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf. Large Scale Multiple Kernel Learning.
The Journal of Machine Learning Research, 7:1531–1565, 2006.
J.F. Sturm. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones.
Optimization Methods and Software, 11–12:625–653, 1999.
R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical
Society Series B, 58:267–288, 1996.
P.A. Valdes-Sosa, J.M. Sanchez-Bornot, A. Lage-Castellanos, M. Vega-Hernandez, J. Bosch-
Bayard, L. Melie-Garcia, and E. Canales-Rodriguez. Estimating brain functional connectivity
with sparse multivariate autoregression. Philosophical Transactions of the Royal Society B,
360:969–981, 2005.
M. Yuan and Y. Lin. Model selection and estimation in regression with grouped variables.
Journal of the Royal Statistical Society Series B, 68(1):49–67, 2006.