LARS Based on S-Estimators
Claudio Agostinelli¹ and Matias Salibian-Barrera²
¹ Dipartimento di Statistica, Ca' Foscari University, Venice, Italy, claudio@unive.it
² Department of Statistics, The University of British Columbia, Vancouver, BC, Canada, matias@stat.ubc.ca
Abstract. We consider the problem of selecting a parsimonious subset of explanatory variables from a potentially large collection of covariates. We are concerned
with the case when data quality may be unreliable (e.g. there might be outliers
among the observations). When the number of available covariates is moderately
large, fitting all possible subsets is not a feasible option. Sequential methods like
forward or backward selection are generally greedy and may fail to include important predictors when these are correlated. To avoid this problem, Efron et al.
(2004) proposed the Least Angle Regression algorithm to produce an ordered list
of the available covariates (sequencing) according to their relevance. We introduce
outlier robust versions of the LARS algorithm based on S-estimators for regression
(Rousseeuw and Yohai (1984)). This algorithm is computationally efficient and suitable even when the number of variables exceeds the sample size. Simulation studies
show that it is also robust to the presence of outliers in the data and compares
favourably to previous proposals in the literature.
Introduction
model assumptions and we are interested in predicting the non-outlying observations. Therefore, we consider model selection methods for linear models
based on robust methods.
As is the case with point estimation and other inference procedures, likelihood-type model selection methods (e.g. AIC (Akaike (1970)), Mallows' Cp (Mallows (1973)) and BIC (Schwarz (1978))) may be severely affected
by a small proportion of atypical observations in the data. These outliers
may not necessarily consist of large values, but might not follow the model
that applies to the majority of the data. Model selection procedures that are
resistant to the presence of outliers in the sample have only recently started
to receive some attention in the literature. Seminal papers include Hampel
(1983), Ronchetti (1985, 1997) and Ronchetti and Staudte (1994). Other proposals include Sommer and Staudte (1995), Ronchetti, Field and Blanchart
Qian and Künsch (1998), Agostinelli (2002a, 2002b), Agostinelli and
Markatou (2005), Morgenthaler, Welsch and Zenide (2003). See also the recent book by Maronna, Martin and Yohai (2006). These proposals are based
on robustified versions of classical selection criteria (e.g. robust Cp , robust
final prediction error, etc.). More recently, Müller and Welsh (2005) proposed
a model selection criterion that combines a measure of goodness-of-fit, a
penalty term to avoid over-fitting, and the expected prediction error conditional on the data. Salibian-Barrera and Van Aelst (2008) use the fast and robust bootstrap of Salibian-Barrera and Zamar (2002) to obtain a faster bootstrap-based model selection method that is feasible to compute for larger numbers of covariates. Although less expensive from a computational point of view than the stratified bootstrap of Müller and Welsh (2005), this method, like the previous ones, needs to compute the estimator on the full model.
A different approach to variable selection that is attractive when the number of explanatory variables is large is based on ordering the covariates according to their estimated importance in the full model. Forward stepwise
and backward elimination procedures are examples of this approach, whereby
in each step of the procedure a variable may enter or leave the linear model
(see, e.g. Weisberg (1985) or Miller (2002)). With backward elimination one
starts with the full model and then finds the best possible submodel with
one less covariate in it. This procedure is repeated until we fit a model with
a single covariate or a criterion is reached. A similar procedure is forward
stepwise, where we first select the covariate (say x1 ) with the highest absolute correlation with the response variable y. We take the residuals of the
regression of y on x1 as our new response, project all covariates orthogonally to x1 and add the variable with the highest absolute correlation to the
model. At the same step, variables in the model may be deleted according to
a criterion. These steps are repeated until no variables are added or deleted.
Unfortunately, when p is large (p = 100, for example), these procedures become infeasible for highly robust estimators; furthermore, these algorithms
are known to be greedy and may relegate important covariates if they are
correlated with those selected earlier in the sequence.
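A minimal (non-robust) sketch of this classical forward stepwise idea, with function and variable names of our own choosing, is the following: at each step the covariate most correlated with the current residuals is added, and the remaining covariates are projected orthogonally to it.

```python
import numpy as np

def forward_stepwise(X, y, max_steps):
    """Greedy forward selection: repeatedly add the covariate most
    correlated (in absolute value) with the current residuals, then
    orthogonalize the remaining covariates with respect to it."""
    n, p = X.shape
    active, resid, Xw = [], y.astype(float), X.astype(float).copy()
    for _ in range(min(max_steps, p)):
        norms = np.linalg.norm(Xw, axis=0)
        norms[norms < 1e-12] = np.inf            # ignore (near) collinear leftovers
        cors = (Xw.T @ resid) / norms            # correlations up to a common factor
        cors[active] = 0.0                       # skip variables already selected
        j = int(np.argmax(np.abs(cors)))
        active.append(j)
        xj = Xw[:, j] / np.linalg.norm(Xw[:, j])
        resid = resid - (xj @ resid) * xj        # residuals after regressing on x_j
        Xw = Xw - np.outer(xj, xj @ Xw)          # project covariates orthogonally to x_j
    return active
```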
The Least Angle Regression (LARS) of Efron et al. (2004) is a generalization of stepwise methods, where the length of the step is selected so as
to strike a balance between fast-but-greedy and slow-but-conservative
alternatives, such as those in stagewise selection (see, e.g. Hastie, Tibshirani and
Friedman (2001)). It is easy to verify that this method is not robust to the
presence of even a small proportion of atypical observations. McCann and Welsch
(2007) proposed to add an indicator variable for each observation and then
run the usual LARS on the extended set of covariates. When high-leverage
outliers are possible, they suggest building models from randomly drawn
subsamples of the data, and then selecting the best of them based on their
(robustly estimated) prediction error. Khan, Van Aelst and Zamar (2007b)
showed that the LARS algorithm can be expressed in terms of the pairwise
sample correlations between covariates and the response variable, and proposed to apply this algorithm using robust correlation estimates. This is a
plug-in proposal in the sense that it takes a method derived using least
squares or L2 estimators and replaces the required point estimates by robust
counterparts.
In this paper we derive an algorithm based on LARS, but using an S-regression estimator (Rousseeuw and Yohai (1984)). Section 2 contains a brief
description of the LARS algorithm, while Section 3 describes our proposal.
Simulation results are discussed in Section 4 and concluding remarks can be
found in Section 5.
The LARS algorithm

Consider the linear regression model $y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{i,j} + \epsilon_i$, $i = 1, \dots, n$, and assume, without loss of generality, that the response and the covariates have been standardized:
$$\sum_{i=1}^{n} y_i = 0, \qquad \sum_{i=1}^{n} x_{i,j} = 0, \qquad \sum_{i=1}^{n} x_{i,j}^2 = 1, \qquad \text{for } 1 \le j \le p,$$
so that the linear model above does not contain the intercept term.
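A minimal sketch of this standardization step (the helper name is ours):

```python
import numpy as np

def standardize(X, y):
    """Centre y and centre/scale the columns of X so that
    sum_i y_i = 0, sum_i x_ij = 0 and sum_i x_ij**2 = 1."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    Xc = Xc / np.sqrt((Xc ** 2).sum(axis=0))
    return Xc, yc
```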
The Least Angle Regression algorithm (LARS) is a generalization of the
Forward Stagewise procedure. The latter is an iterative technique that starts
with the predictor vector $\hat{\mu} = 0 \in \mathbb{R}^n$, and at each step sets
$$\hat{\mu} \leftarrow \hat{\mu} + \epsilon \,\mathrm{sign}(c_j)\, x_{(j)},$$
where $\epsilon > 0$ is a small constant, $c_j$ denotes the current correlation between the $j$-th covariate and the residual vector $y - \hat{\mu}$, and $j$ is the index attaining the largest $|c_j|$. The LARS algorithm also starts from $\hat{\mu} = 0$. Let $\hat{\mu}_A$ be the current predictor and let
$$c = X'(y - \hat{\mu}_A),$$
where $X \in \mathbb{R}^{n \times p}$ denotes the design matrix. In other words, $c$ is the vector of current correlations $c_j$, $j = 1, \dots, p$. Let $A$ denote the active set, which corresponds to those covariates with largest absolute correlations: $C = \max_j \{|c_j|\}$ and $A = \{j : |c_j| = C\}$. Assume, without loss of generality, that $A = \{1, \dots, m\}$. Let $s_j = \mathrm{sign}(c_j)$ for $j \in A$, and let $X_A \in \mathbb{R}^{n \times m}$ be the matrix formed by the corresponding signed columns of the design matrix $X$, $s_j x_{(j)}$. Note that the vector $u_A = v_A / \|v_A\|$, where
$$v_A = X_A (X_A' X_A)^{-1} 1_A,$$
satisfies
$$X_A' u_A = A_A 1_A, \qquad (1)$$
where $1_A$ denotes an $m$-vector of ones and $A_A = 1/\|v_A\|$. The LARS update of the current predictor is
$$\hat{\mu}_A \leftarrow \hat{\mu}_A + \hat{\gamma}\, u_A,$$
where $\hat{\gamma}$ is taken to be the smallest positive value such that a new covariate joins the active set $A$ of explanatory variables with largest absolute correlation. More specifically, note that, if for each $\gamma$ we let $\mu(\gamma) = \hat{\mu}_A + \gamma\, u_A$, then for each $j = 1, \dots, p$ we have
$$c_j(\gamma) = \mathrm{cor}\bigl(y - \mu(\gamma), x_{(j)}\bigr) = x_{(j)}'\bigl(y - \mu(\gamma)\bigr) = c_j - \gamma\, a_j,$$
where $a_j = x_{(j)}' u_A$. For $j \in A$, equation (1) implies that
$$|c_j(\gamma)| = C - \gamma\, A_A,$$
so all maximal current correlations decrease at a constant rate along this direction. We then determine the smallest positive value of $\gamma$ that makes the correlation between the current active covariates and the residuals equal to that of another covariate $x_{(k)}$ not in the active set $A$. This variable enters the model, the active set becomes
$$A \leftarrow A \cup \{k\},$$
and the maximal correlation is updated to $C \leftarrow C - \hat{\gamma}\, A_A$. We refer the interested
reader to Efron et al. (2004) for more details.
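As an illustrative sketch only (our code, not the authors'), a single LARS step can be written as follows; it assumes a standardized design matrix, a non-empty active set and at least one inactive covariate, and uses the step-length rule of Efron et al. (2004).

```python
import numpy as np

def lars_step(X, y, mu, active):
    """One LARS step: compute the equiangular direction u_A, the step
    length gamma at which an inactive covariate ties the maximal
    correlation, and return the updated predictor and active set."""
    p = X.shape[1]
    c = X.T @ (y - mu)                              # current correlations
    C = np.max(np.abs(c))
    s = np.sign(c[active])
    XA = X[:, active] * s                           # signed active columns
    vA = XA @ np.linalg.solve(XA.T @ XA, np.ones(len(active)))
    AA = 1.0 / np.linalg.norm(vA)                   # A_A in equation (1)
    uA = vA * AA                                    # equiangular direction
    a = X.T @ uA                                    # a_j = x_j' u_A
    candidates = []                                 # smallest positive step
    for j in range(p):
        if j in active:
            continue
        for g in ((C - c[j]) / (AA - a[j]), (C + c[j]) / (AA + a[j])):
            if np.isfinite(g) and g > 1e-12:
                candidates.append((g, j))
    gamma, k = min(candidates)
    return mu + gamma * uA, active + [k]
```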
LARS based on S-estimators

S-regression estimators (Rousseeuw and Yohai (1984)) are defined as $\hat{\beta} = \arg\min_{\beta} \hat{\sigma}(\beta)$, where the residual scale $\hat{\sigma}(\beta)$ satisfies
$$\frac{1}{n} \sum_{i=1}^{n} \rho\!\left( \frac{r_i(\beta)}{\hat{\sigma}(\beta)} \right) = b,$$
$r_i(\beta) = y_i - x_i'\beta$, $i = 1, \dots, n$, are the regression residuals, $\rho : \mathbb{R} \to \mathbb{R}^{+}$ is a symmetric, bounded, non-decreasing and continuous function, and $b \in (0, 1)$ is a fixed constant. The choice $b = E_{F_0}(\rho)$ ensures that the resulting estimator is consistent when the errors have distribution function $F_0$.
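To make the definition concrete, the following sketch computes the scale $\hat{\sigma}$ solving the equation above for a fixed residual vector; Tukey's bisquare $\rho$ (rescaled to a maximum of one) and $b = 0.5$ are illustrative choices of ours, not prescribed by the text, and the equation is solved by simple bisection.

```python
import numpy as np

def rho_bisquare(t, c=1.547):
    """Tukey's bisquare rho, rescaled to a maximum of 1; with c = 1.547
    and b = 0.5 the scale has a 50% breakdown point under normal errors."""
    u = np.clip(np.abs(t) / c, 0.0, 1.0)
    return 1.0 - (1.0 - u ** 2) ** 3

def m_scale(r, b=0.5, tol=1e-8):
    """Solve (1/n) sum_i rho(r_i / sigma) = b for sigma by bisection."""
    lo, hi = 1e-12, np.max(np.abs(r)) * 1e3   # mean rho ~ 1 at lo, ~ 0 at hi
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if np.mean(rho_bisquare(r / mid)) > b:
            lo = mid          # average still too large: sigma must grow
        else:
            hi = mid
    return 0.5 * (lo + hi)
```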
For a given active set $A$ of $k$ covariates let $\hat{\beta}_A$, $\hat{\beta}_{0,A}$, $\hat{\sigma}_A$ be the S-estimators of regressing the current residuals on the $k$ active variables with indices in $A$. Consider the parameter vector $\theta = (\beta, \beta_0, \sigma)$ that satisfies
$$\frac{1}{n - k - 1} \sum_{i=1}^{n} \rho\!\left( \frac{r_i - x_{i,k}'(\beta - \hat{\beta}_A) - \beta_0}{\sigma} \right) = b.$$
For each candidate covariate define
$$\mathrm{cov}_j(\theta) = \sum_{i=1}^{n} \rho'\!\left( \frac{r_i - x_{i,k}'(\beta - \hat{\beta}_A) - \beta_0}{\sigma} \right) x_{ij},$$
and the corresponding correlation is
$$\mathrm{corr}_j(\theta) = \mathrm{cov}_j(\theta) \Big/ \left[ \sum_{i=1}^{n} \rho'\!\left( \frac{r_i - x_{i,k}'(\beta - \hat{\beta}_A) - \beta_0}{\sigma} \right)^{2} \right]^{1/2}.$$
The covariate that joins the active set is the one attaining $\max_{1 \le j \le p} |\mathrm{corr}_j(\theta_0)|$, where $\theta_0 = (\hat{\beta}_A, \hat{\beta}_{0,A}, \hat{\sigma}_A)$ denotes the current S-estimates.
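To convey the overall structure of the sequencing loop, here is a heavily simplified sketch: at each step an S-regression is fitted on the active covariates (through `s_regression`, a hypothetical placeholder for any S-estimator implementation) and the inactive covariate with the largest bounded-influence correlation with the standardized residuals joins the active set. The MAD scale, the weight function and the scoring rule below are our simplifications and do not reproduce the exact step-length and correlation formulas of the proposal.

```python
import numpy as np

def psi_bisquare(t, c=1.547):
    """Derivative of Tukey's bisquare rho (up to a constant factor)."""
    u = t / c
    return np.where(np.abs(u) < 1.0, t * (1.0 - u ** 2) ** 2, 0.0)

def robust_sequence(X, y, s_regression, n_steps):
    """Order covariates by repeatedly adding the one with the largest
    robust correlation with the current residuals.  `s_regression` is a
    hypothetical helper returning S-estimates (beta, beta0) of regressing
    its second argument on the columns of its first."""
    n, p = X.shape
    active, resid = [], y - np.median(y)
    for _ in range(n_steps):
        sigma = 1.4826 * np.median(np.abs(resid))   # simple robust scale
        w = psi_bisquare(resid / sigma)             # bounded residual weights
        scores = np.abs(X.T @ w)                    # robust correlation scores
        scores[active] = -np.inf                    # skip covariates already in
        active.append(int(np.argmax(scores)))
        beta, beta0 = s_regression(X[:, active], y) # refit on the active set
        resid = y - beta0 - X[:, active] @ beta
    return active
```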
Simulation results
The proposed algorithm was compared by simulation with previous robust sequencing proposals. The results, summarized in the figures below, show that it is robust to the presence of outliers in the data and compares favourably to existing methods.

[Figures: three panels reporting the simulation results against MODEL SIZE (horizontal axis, 10 to 50; vertical axis, 0 to 5).]
Conclusion
References
AGOSTINELLI, C. (2002a): Robust model selection in regression via weighted likelihood methodology. Statistics and Probability Letters 56, 289-300.
AGOSTINELLI, C. (2002b): Robust stepwise regression. Journal of Applied Statistics 29(6), 825-840.
AGOSTINELLI, C. and MARKATOU, M. (2005): Robust model selection by cross-validation via weighted likelihood. Unpublished manuscript.
AKAIKE, H. (1970): Statistical predictor identification. Annals of the Institute of Statistical Mathematics 22, 203-217.
EFRON, B., HASTIE, T., JOHNSTONE, I. and TIBSHIRANI, R. (2004): Least
angle regression. The Annals of Statistics 32(2), 407-499.
HAMPEL, F.R. (1983): Some aspects of model choice in robust statistics. In: Proceedings of the 44th Session of the ISI, volume 2, 767-771. Madrid.
HASTIE, T., TIBSHIRANI, R. and FRIEDMAN, J. (2001): The Elements of Statistical Learning. Springer-Verlag, New York.
KHAN, J.A., VAN AELST, S., and ZAMAR, R.H. (2007a): Building a robust
linear model with forward selection and stepwise procedures. Computational
Statistics and Data Analysis 52, 239-248.
KHAN, J.A., VAN AELST, S., and ZAMAR, R.H. (2007b): Robust Linear Model
Selection Based on Least Angle Regression. Journal of the American Statistical
Association 102, 1289-1299.
MALLOWS, C.L. (1973): Some comments on Cp . Technometrics 15, 661-675.
MARONNA, R.A., MARTIN, R.D. and YOHAI, V.J. (2006): Robust Statistics: Theory and Methods. Wiley, New York.
McCANN, L. and WELSCH, R.E. (2007): Robust variable selection using least
angle regression and elemental set sampling. Computational Statistics and Data Analysis 52, 249-257.
MILLER, A.J. (2002): Subset selection in regression. Chapman-Hall, New York.
MORGENTHALER, S., WELSCH, R.E. and ZENIDE, A. (2003): Algorithms for robust model selection in linear regression. In: M. Hubert, G. Pison, A. Struyf and S. Van Aelst (Eds.): Theory and Applications of Recent Robust Methods. Birkhäuser, Basel, 195-206.
MÜLLER, S. and WELSH, A.H. (2005): Outlier robust model selection in linear regression. Journal of the American Statistical Association 100, 1297-1310.