Influence and Outliers


Influence Analysis with Panel Data using Stata

Annalivia Polselli

Institute for Analytics and Data Science (IADS)


University of Essex

Oceania Stata Conference 2023

February 9, 2023
Context

▶ Small panel data sets, where N is small but still larger than T


▶ e.g., 50 US States, 38 OECD countries, 20 Italian regions, etc.

▶ Observational data may contain “anomalous” observations


(Rousseeuw and Van Zomeren, 1990; Silva, 2001)
▶ Exerting a disproportionate influence on the Least Squares (LS)
estimates
▶ Leading to biases in regression coefficients or standard errors
(Donald and Maddala, 1993; Bramati and Croux, 2007; Verardi and
Croux, 2009)

Annalivia Polselli Influence Analysis using Stata 1 / 20


In this Presentation

▶ I present my method to
▶ Visually detect and identify the type of anomalous unit
▶ Understand how these affect the LS estimates

▶ I develop a unit-wise approach for the detection of anomalous units


▶ As opposed to a case-wise (observational) approach

▶ The method can be conducted before or after the regression analysis



The Commands

▶ I propose two commands for a visual detection of anomalous units


▶ xtlvr2plot – Leverage versus residual plot for panel data
▶ xtinfluence – Influence analysis with panel data

▶ These commands can detect units that exhibit large values


▶ in the outcome variable – vertical outliers VO
▶ in the covariate space – good leverage points GL

▶ in both directions – bad leverage points BL


▶ These commands are designed to be used with short panel data


▶ e.g., cross-country macro panels, experimental panel data, health data
with repeated units, etc.



Contribution
Diagnostic plots
▶ Leverage vs squared residual plots → lvr2plot and lvr2plot2
▶ Only for cross-sectional data
▶ Less handy for panel data (time-demeaned variables, case-wise
visualization etc.)

Measures of overall influence


▶ Cook-like distances to detect anomalies
▶ in cross-sectional data → predict c, cooksd
▶ in panel data → jackknife2, cooksd(newvar) bpd(newvar):command

▶ These metrics may fail to flag multiple atypical cases


(Atkinson and Mulira, 1993; Chatterjee and Hadi, 1988; Rousseeuw and
Van Zomeren, 1990)
▶ A local approach can overcome this limit (Lawrance, 1995)



Econometric Framework
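▶ A static linear panel regression model

$$y_{it} = x_{it}'\beta + \alpha_i + u_{it}$$

▶ After the within-group (WG) transformation

$$\tilde{y}_{it} = \tilde{x}_{it}'\beta + \tilde{u}_{it}$$

where $\tilde{y}_{it} = y_{it} - T^{-1}\sum_{t} y_{it}$, etc.

▶ WG Estimator: $\hat{\beta} = \left(\sum_{i=1}^{N}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{x}_{it}'\right)^{-1}\sum_{i=1}^{N}\sum_{t=1}^{T}\tilde{x}_{it}\tilde{y}_{it}$

▶ LS Residuals: $\hat{u}_{it} = \tilde{y}_{it} - \tilde{x}_{it}'\hat{\beta}$

▶ Average normalised residual squared

$$\hat{u}_i^{*} = \frac{1}{T}\sum_{t=1}^{T}\left(\frac{\hat{u}_{it}}{\sqrt{\sum_{i}\hat{u}_{it}^{2}}}\right)^{2}$$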
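The quantities on this slide can be reproduced by hand in a few lines of Stata. The sketch below is only an illustration on a simulated toy panel: the data, the seed and the variable names (uhat, ssr_t, nres2, normres2_manual) are assumptions, and the denominator is read as the sum over units within period t, as the formula is printed. It is not the internal code of xtlvr2plot.

```stata
* Simulated toy panel (hypothetical data, for illustration only)
clear
set seed 12345
set obs 30
generate id = _n
expand 5
bysort id: generate t = _n
generate x = rnormal()
generate y = 1 + x + rnormal()
xtset id t

* WG (fixed-effects) estimation; option e returns the within residuals
quietly xtreg y x, fe
predict double uhat, e

* Normalise each squared residual by the sum of squared residuals
* over units in the same period t (assumed reading of the denominator)
generate double uhat2 = uhat^2
bysort t: egen double ssr_t = total(uhat2)
generate double nres2 = uhat2 / ssr_t

* Average over time within unit: one value per unit
bysort id: egen double normres2_manual = mean(nres2)
```

Under this normalisation the values of nres2 sum to one within each period, which gives a quick way to check the computation.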



Leverage
The leverage of a unit is a measure of the distance of the x-values of a unit
from other units.

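In panel data models, the individual leverage matrix of unit i is the T × T matrix

$$H_{ii} = \tilde{X}_i(\tilde{X}'\tilde{X})^{-1}\tilde{X}_i' =
\begin{pmatrix}
h_{ii,11} & h_{ii,12} & \dots & h_{ii,1T} \\
h_{ii,21} & h_{ii,22} & \dots & h_{ii,2T} \\
\vdots & \vdots & \ddots & \vdots \\
h_{ii,T1} & h_{ii,T2} & \dots & h_{ii,TT}
\end{pmatrix}$$

where $\tilde{X}_i$ is T × k and $\tilde{X}$ is NT × k, with diagonal element $h_{ii,tt} = \tilde{x}_{it}'(\tilde{X}'\tilde{X})^{-1}\tilde{x}_{it}$
and off-diagonal element $h_{ii,ts} = \tilde{x}_{it}'(\tilde{X}'\tilde{X})^{-1}\tilde{x}_{is}$ for t, s = 1, . . . , T.

The average individual leverage of unit i over the T time periods is

$$h_i = \frac{1}{T}\sum_{t=1}^{T} h_{ii,tt}$$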
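As a rough illustration of the average individual leverage, the sketch below demeans the data by hand, takes the diagonal elements h_ii,tt from the hat matrix of the demeaned regression, and averages them within unit. The toy data, the contaminated unit (id 7) and the names y_dm, x_dm, h_tt and lev_manual are all assumptions made for the example; the code is not taken from xtlvr2plot.

```stata
* Simulated toy panel (hypothetical data, for illustration only)
clear
set seed 12345
set obs 30
generate id = _n
expand 5
bysort id: generate t = _n
generate x = rnormal()
replace x = rnormal(15, 1) if id == 7 & t <= 2   // hypothetical high-leverage unit
generate y = 1 + x + rnormal()

* Within-group transformation done by hand
foreach v of varlist y x {
    bysort id: egen double `v'_bar = mean(`v')
    generate double `v'_dm = `v' - `v'_bar
}

* OLS on the demeaned data without a constant gives the WG slope;
* predict, hat returns the diagonal elements of the hat matrix
quietly regress y_dm x_dm, noconstant
predict double h_tt, hat

* Average the diagonal elements over time within each unit
bysort id: egen double lev_manual = mean(h_tt)

* Sanity check: the diagonal elements sum to k (here k = 1 regressor)
quietly summarize h_tt
display "sum of h_tt = " r(sum)
```

Note that the contamination only affects some periods of unit 7: a shift applied to every period of a unit would be removed by the time-demeaning and would not show up as leverage.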



xtlvr2plot: Syntax

xtlvr2plot – Leverage versus normalised residual squared plot for panel data.

xtlvr2plot depvar [indepvar] [if] [in] [, options ]

options

graph opts graph options available for twoway scatter

Generated variables
lev average individual leverage
normres2 average individual residual squared



xtlvr2plot: Example

** Use of the 'xtlvr2plot' command


xtset id t

xtlvr2plot y x, ///
mlabel(id) ///
xlabel(, format(%9.3fc)) ///
ylabel(, angle(h) format(%9.3fc)) ///
title("Unit-wise Evaluation", size(medsmall)) ///
saving("xtlvr2plot_example.gph", replace)



xtlvr2plot: Plot
Unit-wise Evaluation with xtlvr2plot

[Figure: leverage (vertical axis) against normalised residuals squared (horizontal axis), one point per unit labelled by id, with regions annotated as GL units, BL units and VO units.]



lvr2plot vs xtlvr2plot
[Figure: two panels. Left: Case-wise Evaluation with lvr2plot, leverage against normalized residual squared, one labelled point per observation. Right: Unit-wise Evaluation with xtlvr2plot, leverage against normalised residuals squared, one labelled point per unit.]



xtlvr2plot: Summary Table

** Summary table w/detected anomalous units


** generated by 'xtlvr2plot'



Influence Analysis: Measures

▶ Joint influence: $C_{ij}(\hat{\beta})$
  ▶ Influence exerted by a pair (i,j) on LS estimates jointly
  ▶ Comparison of LS estimates with and without the pair
  ▶ With i = j, $C_{ii}(\hat{\beta})$ measures the individual influence of i

▶ Conditional influence: $C_{i(j)}(\hat{\beta})$
  ▶ Influence exerted by i on LS estimates conditional on removing j from the sample
  ▶ How the absence of j affects the influence of i on LS estimates



Influence Analysis: Effects
▶ Joint Effect
  ▶ $K_{j|i} = C_{ij}(\hat{\beta}) / C_{ii}(\hat{\beta})$
  ▶ How influential the pair is with respect to i
  ▶ For large values of $K_{j|i}$
    ▶ j swamps i
    ▶ the most influential unit swamps the least
    ▶ j drives the LS estimates swamping the effect of i

▶ Conditional Effect
  ▶ $M_{i(j)} = C_{i(j)}(\hat{\beta}) / C_{ii}(\hat{\beta})$
  ▶ How the influence of i changes before and after the deletion of j
  ▶ If $M_{i(j)} \geq 1$
    ▶ j masks i
    ▶ influence of i increases without j in the sample
    ▶ j drives the LS estimates masking the effect of i
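To make the two ratios concrete, here is a small worked example. The numbers are made up purely for illustration and are not values from the simulated data used in these slides.

```latex
% Hypothetical values, chosen only for illustration
C_{ii}(\hat\beta) = 0.5, \qquad C_{ij}(\hat\beta) = 6, \qquad C_{i(j)}(\hat\beta) = 2
\\[4pt]
K_{j|i} = \frac{C_{ij}(\hat\beta)}{C_{ii}(\hat\beta)} = \frac{6}{0.5} = 12,
\qquad
M_{i(j)} = \frac{C_{i(j)}(\hat\beta)}{C_{ii}(\hat\beta)} = \frac{2}{0.5} = 4
```

With these numbers the pair (i,j) is twelve times as influential as i alone, which would be read as j swamping i. The influence of i is four times larger once j is removed, so $M_{i(j)} \geq 1$ and j would be read as masking i.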



xtinfluence: Syntax
xtinfluence – Influence analysis for panel data displaying the measures and
effects of unit j against unit i. The size of the symbols is proportional to the
magnitude of the calculated measures.

xtinfluence depvar [indepvar] [if] [in] [, options ]

options

figure(graphtype)    display diagnostic plots of type graphtype; choose between a scatter plot and a heat plot; default is scatter
graph opts           graph options available for scatter and heatplot
saving(filename)     save .dta and .pdf file with the specified name and location

Saved data sets

filename adj mtx.dta    Data sets with the adjacency list for the influence measures and effects



xtinfluence: Example

**Use of the 'xtinfluence' command


xtset id t

** Heat plot
xtinfluence y x, figure(heat) ///
keylabels(all) color(RdBu, reverse) ///
xlabel(5(10)100, angle(h) labsize(small)) ///
xmtick(##10) xmlabel(##2, angle(h)) ///
ylabel(5(10)100, angle(h)) ///
ymtick(##10) ymlabel(##2, angle(h)) ///
saving("xtinfluence_heat")

** Scatter plot
xtinfluence y x, figure(scatter) ///
xlabel(5(10)100, angle(h) labsize(small)) ///
xmtick(##10) xmlabel(##2, angle(h)) ///
ylabel(5(10)100, angle(h)) ///
ymtick(##10) ymlabel(##2, angle(h)) ///
saving("xtinfluence_scatter")



xtinfluence: heat plot
[Figure: four heat plots of unit j (vertical axis) against unit i (horizontal axis), one per panel: Joint influence (C, colour key labelled from .65079 to 13.663), Joint Effects (K, colour key labelled from 2.1e+09 to 4.5e+10), Conditional influence (cC, colour key labelled from .00459 to .09635) and Conditional Effects (M, colour key labelled from 103.82 to 2180.1).]



xtinfluence: scatter plot
[Figure: four scatter plots of unit j (vertical axis) against unit i (horizontal axis), one per panel: Joint influence, Joint Effects, Conditional influence and Conditional Effects.]



xtinfluence: Summary Table

** Output table generated by 'xtinfluence'



Summary of Method
1. Identify anomalous units and their type with xtlvr2plot
2. Conduct the influence analysis with xtinfluence
2.1 Joint Influence Plot
- Identify units with high individual influence (main diagonal)
- Identify pairs with high joint influence (off-diagonal)
- Highly influential units swamp all other units
2.2 Joint Effect Plot
- Identify pairs with largest effect
- j swamps the effect of i
- j must be detected in (1) and (2.1)
2.3 Conditional Influence Plot
- Identify influential i conditional on removing j
- Check if same units as (1) and (2.1)
2.4 Conditional Effect Plot
- Identify pairs with largest effect
- j masks the effect of i
- Compare identified pairs with (2.2)

3. Units detected in (1), (2.1) and (2.3) are anomalous; (2.2) and (2.4) explain
how they affect the influence of other units and, hence, LS estimates



How to treat anomalous units?

Once the type of anomaly in the sample has been identified, ask:

1. Is it an actual error in the entry of the data?


▶ Deal with measurement error

2. Is it a genuine extreme value in the data?


▶ Robust estimation techniques if VO and BL units
(Bramati and Croux, 2007; Verardi and Croux, 2009; Aquaro and Čížek,
2013, 2014; Jiao, 2022)

▶ Jackknife-type standard errors if GL units


(MacKinnon and White, 1985; Davidson et al., 1993; MacKinnon, 2013;
Belotti and Peracchi, 2020; Polselli, 2022)
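As a loose illustration of the second remedy, the sketch below compares conventional and jackknife standard errors for the WG estimator using Stata's built-in vce(jackknife) option of xtreg. This option is a stand-in chosen for the example: the jackknife-type estimators in the cited papers may be defined differently. The toy data, the contaminated unit (id 5) and the scalar names are assumptions.

```stata
* Simulated toy panel (hypothetical data, for illustration only)
clear
set seed 2023
set obs 25
generate id = _n
expand 8
bysort id: generate t = _n
generate x = rnormal()
replace x = rnormal(15, 1) if id == 5 & t <= 4   // hypothetical GL-type unit
generate y = 1 + x + rnormal()
xtset id t

* Conventional standard errors
quietly xtreg y x, fe
scalar se_conv = _se[x]

* Jackknife standard errors from the built-in option
* (used here as a stand-in for the estimators in the cited papers)
quietly xtreg y x, fe vce(jackknife)
scalar se_jack = _se[x]

display "conventional SE = " scalar(se_conv) "   jackknife SE = " scalar(se_jack)
```

The point estimate is the same in both runs; only the standard error changes.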



Thank you for your attention!

annalivia.polselli[at]essex.ac.uk
https://github.com/POLSEAN
References I
Aquaro, M. and Čížek, P. (2013). One-step robust estimation of fixed-effects panel data
models. Computational Statistics & Data Analysis, 57(1):536–548.
Aquaro, M. and Čížek, P. (2014). Robust estimation of dynamic fixed-effects panel data
models. Statistical Papers, 55(1):169–186.
Atkinson, A. and Mulira, H.-M. (1993). The stalactite plot for the detection of multivariate
outliers. Statistics and Computing, 3(1):27–35.
Banerjee, M. and Frees, E. W. (1997). Influence diagnostics for linear longitudinal
models. Journal of the American Statistical Association, 92(439):999–1005.
Belotti, F. and Peracchi, F. (2020). Fast leave-one-out methods for inference, model
selection, and diagnostic checking. The Stata Journal, 20(4):785–804.
Bramati, M. C. and Croux, C. (2007). Robust estimators for the fixed effects panel data
model. The Econometrics Journal, 10(3):521–540.
Chatterjee, S. and Hadi, A. S. (1988). Impact of simultaneous omission of a variable
and an observation on a linear regression equation. Computational Statistics & Data
Analysis, 6(2):129–144.
Davidson, R., MacKinnon, J. G., et al. (1993). Estimation and inference in econometrics.
OUP Catalogue.
Donald, S. G. and Maddala, G. (1993). Identifying outliers and influential observations
in econometric models. In Econometrics, volume 11 of Handbook of Statistics,
pages 663–701. Elsevier.
Jiao, X. (2022). A simple robust procedure in instrumental variables regression.
References II

Lawrance, A. (1995). Deletion influence and masking in regression. Journal of the Royal
Statistical Society: Series B (Methodological), 57(1):181–189.
MacKinnon, J. G. (2013). Thirty years of heteroskedasticity-robust inference. In Recent
advances and future directions in causality, prediction, and specification analysis,
pages 437–461. Springer.
MacKinnon, J. G. and White, H. (1985). Some heteroskedasticity-consistent covariance
matrix estimators with improved finite sample properties. Journal of Econometrics,
29(3):305–325.
Polselli, A. (2022). Essays on Econometric Methods. PhD thesis, University of Essex.
Rousseeuw, P. J. and Van Zomeren, B. C. (1990). Unmasking multivariate outliers and
leverage points. Journal of the American Statistical Association, 85(411):633–639.
Silva, J. S. (2001). Influence diagnostics and estimation algorithms for Powell’s SCLS.
Journal of Business & Economic Statistics, 19(1):55–62.
Verardi, V. and Croux, C. (2009). Robust regression in Stata. The Stata Journal,
9(3):439–453.
Vertical outliers

[Figure: scatter of the dep. variable against the indep. variable, with labelled points for unit 30. Legend: "Fit w/t anomalies" and "Fit with vertical outliers (VO)".]


Good leverage units

[Figure: scatter of the dep. variable against the indep. variable, with labelled points for unit 20. Legend: "Fit w/t anomalies" and "Fit with good leverage units (GL)".]


Bad leverage units

[Figure: scatter of the dep. variable against the indep. variable, with labelled points for unit 10. Legend: "Fit w/t anomalies" and "Fit with bad leverage units (BL)".]


Anomalous units after time-demeaning

[Figure: two panels, Original variables and Demeaned variables, each plotting the dep. variable against the indep. variable with labelled points for units 10, 20 and 30. Legend: "Fit w/t anomalies", "With BL", "With GL" and "With VO".]


Directed Weighted Adjacency List
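Joint Influence

If i ≠ j,

$$C_{ij}(\hat{\beta}) = \left(\hat{\beta} - \hat{\beta}_{(i,j)}\right)'\tilde{X}'\tilde{X}\left(\hat{\beta} - \hat{\beta}_{(i,j)}\right)(s^2K)^{-1}$$

where

$$\hat{\beta}_{(i,j)} = \hat{\beta}_{(i)} - (\tilde{X}'\tilde{X})^{-1}\left(\tilde{X}_i'M_i^{-1}H_{ij} + \tilde{X}_j'\right)\left(M_j - H_{ij}'M_i^{-1}H_{ij}\right)^{-1}\left(H_{ij}'M_i^{-1}\hat{u}_i + \hat{u}_j\right)$$

with $M_j = I_j - H_j$, $H_{ij} = \tilde{X}_i(\tilde{X}'\tilde{X})^{-1}\tilde{X}_j'$, and $H_j = \tilde{X}_j(\tilde{X}'\tilde{X})^{-1}\tilde{X}_j'$.
Note that $C_{ij}(\hat{\beta}) = C_{ji}(\hat{\beta})$.

If i = j,

$$C_{ii}(\hat{\beta}) = \left(\hat{\beta} - \hat{\beta}_{(i)}\right)'\tilde{X}'\tilde{X}\left(\hat{\beta} - \hat{\beta}_{(i)}\right)(s^2K)^{-1}$$

where $\hat{\beta}_{(i)} = \hat{\beta} - (\tilde{X}'\tilde{X})^{-1}\tilde{X}_i'M_i^{-1}\hat{u}_i$.

This is the Banerjee and Frees (1997) metric as defined by Belotti and Peracchi (2020) for
linear panel data models with fixed effects.
Both measures are distributed as $F(\nu_1, \nu_2)$; a distributional cutoff can be chosen.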
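The individual influence measure (the i = j case on this slide) can be approximated by brute force: re-estimate the WG model without unit i and plug the change in the coefficient into the quadratic form. The sketch below does this for a single regressor. It is only an illustration under stated assumptions: s^2 is taken to be the full-sample residual variance (e(sigma_e)^2), K is taken to be the number of regressors, and the toy data, the anomalous unit (id 1) and all names are hypothetical. It does not use the closed-form update and is not the code of xtinfluence.

```stata
* Simulated toy panel (hypothetical data, for illustration only)
clear
set seed 2023
set obs 20
generate id = _n
expand 6
bysort id: generate t = _n
generate x = rnormal()
generate y = 1 + x + rnormal()
replace x = x + 10 if id == 1 & t <= 3      // hypothetical anomalous unit
replace y = y + 30 if id == 1 & t <= 3
xtset id t

* With one regressor, the cross-product of the demeaned x is a scalar:
* the sum of squared within-unit deviations
bysort id: egen double x_bar = mean(x)
generate double x_dm2 = (x - x_bar)^2
quietly summarize x_dm2
scalar Sxx = r(sum)

* Full-sample WG estimate and residual variance (assumed to be s^2)
quietly xtreg y x, fe
scalar b_full = _b[x]
scalar s2 = e(sigma_e)^2
scalar K = 1                                // assumed: number of regressors

* Leave-one-unit-out estimates and the resulting C_ii
generate double Cii = .
quietly levelsof id, local(units)
foreach i of local units {
    quietly xtreg y x if id != `i', fe
    quietly replace Cii = (scalar(b_full) - _b[x])^2 * scalar(Sxx) / (scalar(s2) * scalar(K)) if id == `i'
}

* One row per unit, largest influence first
preserve
quietly bysort id: keep if _n == 1
gsort -Cii
list id Cii in 1/5
restore
```

Re-estimating the model N times is slow for large N; the closed-form expression for the leave-one-out estimate on this slide avoids the loop.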
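Conditional Influence

$$C_{i(j)}(\hat{\beta}) = \left(\hat{\beta}_{(i,j)} - \hat{\beta}_{(j)}\right)'\left(\sum_{\substack{i=1 \\ i \neq j}}^{N}\tilde{X}_{i(j)}'\tilde{X}_{i(j)}\right)\left(\hat{\beta}_{(i,j)} - \hat{\beta}_{(j)}\right)(s^2K)^{-1}$$

▶ $C_{i(j)}(\hat{\beta}) = 0$ for i = j

▶ $C_{i(j)}(\hat{\beta}) \neq C_{j(i)}(\hat{\beta})$

▶ $C_{i(j)}(\hat{\beta}) \approx F(\nu_1, \nu_2)$, from which a distributional cutoff can be chosen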
Data generating process

set seed 1408

* 100 units observed over 20 periods
set obs 100
gen id = _n
expand 20
bys id: generate t = _n

* Regressor, shifted in some periods for the leverage units
bys id: gen x = rnormal()
bys id: replace x = rnormal(10,1) if id==10 & t<=10 //BL unit
bys id: replace x = rnormal(10,1) if id==40 & t<=5 //BL unit
bys id: replace x = rnormal(15,1) if id==20 & t<=10 //GL unit
bys id: replace x = rnormal(15,1) if id==50 & t<=5 //GL unit

* Additive term a and outcome
bys id: gen a = runiform(0,20)
bys id: gen y = 1 + 1*x + a + runiform()

* Outcome shifted in some periods for BL units and vertical outliers
bys id: replace y = y + rnormal(50,1) if id==10 & t<=10 //BL unit
bys id: replace y = y + rnormal(50,1) if id==40 & t<=5 //BL unit
bys id: replace y = y + rnormal(50,1) if id==30 & t<=10 //VO
bys id: replace y = y + rnormal(50,1) if id==60 & t<=5 //VO
