
Linear Models

CS771: Introduction to Machine Learning


Piyush Rai
Linear Models

- Consider learning to map an input x (a vector of D features) to the corresponding (say real-valued) output y

- Assume the output to be a linear weighted combination of the input features:

      y = w⊤x = Σ_{d=1}^{D} w_d x_d

  This defines a linear model with parameters given by a "weight vector" w. Each of these weights has a simple interpretation: w_d is the "weight" or importance of the d-th feature in making this prediction. The "optimal" weights are unknown and have to be learned by solving an optimization problem, using some training data.

- This simple model can be used for Linear Regression

- This simple model can also be used as a "building block" for more complex models (a small sketch of the basic model follows this list)
  - Even classification (binary/multiclass/multi-output/multi-label) and various other ML/deep learning models
  - Even unsupervised learning problems (e.g., dimensionality reduction models)
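A minimal sketch of this prediction rule in NumPy (the feature values and weights below are made-up numbers, not learned ones; learning w is the subject of the upcoming lectures):

```python
import numpy as np

# A D-dimensional input and a hypothetical, hand-picked weight vector
x = np.array([1.0, 2.0, 3.0])      # input features (D = 3)
w = np.array([0.5, -1.0, 0.25])    # weight vector: w[d] is the importance of feature d

# Linear model prediction: y = w'x = sum over d of w[d] * x[d]
y = w @ x
print(y)                           # 0.5*1.0 + (-1.0)*2.0 + 0.25*3.0 = -0.75
```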
Simple Linear Models as Building Blocks

- In some regression problems, each output itself is a real-valued vector
  - Example: given a full-body image of a person, predict height, weight, hand size, and leg size (a vector of M = 4 outputs)
  - Such problems are commonly known as multi-output regression

- We can assume a separate linear model for each of the M outputs:

      y_m = w_m⊤ x,   m = 1, ..., M

  where each w_m is a D-dimensional weight vector for predicting the m-th output. Stacking the outputs into a vector y, this can be written compactly as

      y = W x

  where W is an M×D weight matrix whose m-th row contains w_m⊤. Learning this model will require us to learn this weight matrix (or equivalently, the M weight vectors). A small sketch follows after this list.

- Note: Learning separate models may not be ideal if these multiple outputs are somewhat correlated with each other. But this model can be extended to handle such situations (the techniques are a bit too advanced to be discussed right now – but if curious, you may look up more about multitask learning techniques)
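A minimal sketch of the multi-output model y = Wx, with made-up dimensions (D = 3 input features, M = 4 outputs) and random, unlearned weights:

```python
import numpy as np

D, M = 3, 4                                      # input dimension and number of outputs (made-up)
rng = np.random.default_rng(0)

W = rng.normal(size=(M, D))                      # M x D weight matrix; row m is the weight vector w_m
x = rng.normal(size=D)                           # a single D-dimensional input

y = W @ x                                        # all M outputs at once: y[m] = w_m . x
y_one_at_a_time = np.array([W[m] @ x for m in range(M)])
assert np.allclose(y, y_one_at_a_time)           # same predictions, computed output by output
print(y)
```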
Simple Linear Models as Building Blocks

- A linear model can also be used in classification problems

- For binary classification, we can treat w⊤x as the "score" of the input x and threshold it to get the binary label (e.g., predict one class if the score is positive and the other class otherwise)

- Recall that the LwP model can also be seen as a linear model (although it wasn't formulated like this)
  - Wait – when discussing LwP, wasn't the linear model of the form w⊤x + b? Where did the "bias" term go?
  - Don't worry – we can easily fold in the bias term: append a constant feature "1" to each input and rewrite w⊤x + b as w⊤x, where now both w and x are (D+1)-dimensional. We will assume the same and omit the explicit bias for simplicity of notation (a small sketch of this trick follows below).
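A minimal sketch of binary classification by thresholding the score, with the bias folded in by appending a constant feature (all numbers below are made up):

```python
import numpy as np

x = np.array([2.0, -1.0])                # original D-dimensional input (D = 2, made-up values)
w = np.array([1.5, 0.5])                 # weight vector (made-up values)
b = -1.0                                 # bias term

score_with_bias = w @ x + b              # linear score with an explicit bias

# Fold the bias in: append a constant feature "1" to x and append b to w
x_aug = np.append(x, 1.0)                # now (D+1)-dimensional
w_aug = np.append(w, b)
score_folded = w_aug @ x_aug             # same score, no explicit bias term
assert np.isclose(score_with_bias, score_folded)

label = +1 if score_folded > 0 else -1   # threshold the score to get the binary label
print(score_folded, label)               # score 1.5 -> predicted label +1
```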
Simple Linear Models as Building Blocks

- Linear models are also used in multiclass classification problems

- Assuming K classes, we can assume the following model (a small sketch follows after this list):

      y = argmax_{k ∈ {1, 2, ..., K}} w_k⊤ x

- Can think of w_k⊤x as the score of the input x for the k-th class

- Once learned (using some optimization technique), these weight vectors (one for each class) can sometimes have nice interpretations, especially when the inputs are images
  - Example: the learned weight vectors of 4 classes (w_car, w_frog, w_horse, w_cat), visualized as images, kind of look like a "template" of what the images from that class should look like
  - These templates "sort of" look like class prototypes in LwP. That's why the dot product of each of these weight vectors with an image from the correct class will be expected to be the largest – and no wonder LwP (with Euclidean distances) acts like a linear model.
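A minimal sketch of the argmax rule, using a made-up K×D weight matrix whose k-th row is w_k (random, unlearned weights):

```python
import numpy as np

K, D = 4, 6                        # number of classes and input dimension (made-up)
rng = np.random.default_rng(1)

W = rng.normal(size=(K, D))        # row k is the weight vector w_k for class k
x = rng.normal(size=D)             # a single input

scores = W @ x                     # scores[k] = w_k . x, the score of x for class k
y = int(np.argmax(scores))         # predicted class: the one with the largest score
print(scores, y)
```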
Simple Linear Models as Building Blocks

- Linear models are building blocks for dimensionality reduction methods like PCA
  - This looks very similar to the multi-output model, except that the values of the latent features are not known and have to be learned (a small sketch follows below)
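A minimal sketch of this connection (not the course's derivation of PCA; the data, dimensions, and the number of latent features K below are made up). PCA via the SVD yields a K×D matrix W that plays the same role as a multi-output linear model's weights, while the latent features Z are learned from the data rather than given:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))          # N x D data matrix (made-up data, D = 5)
Xc = X - X.mean(axis=0)                # center the data

# PCA via the SVD: the rows of Vt are the principal directions
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
K = 2                                  # number of latent features to keep
W = Vt[:K]                             # K x D matrix, analogous to a multi-output linear model's weights

Z = Xc @ W.T                           # N x K latent features, learned from the data (not given)
X_hat = Z @ W + X.mean(axis=0)         # linear reconstruction of the inputs from the latents
print(np.mean((X - X_hat) ** 2))       # average reconstruction error
```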

- Linear models are building blocks for even deep learning models (each layer is like a multi-output linear model, followed by a nonlinearity)
  - In a deep learning model, each layer learns a latent feature representation of the inputs using something like a multi-output linear model, followed by a nonlinearity
  - The last (output) layer can have one or more outputs
  - More on this when we discuss deep learning later (a small sketch of a single layer follows below)
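A minimal sketch of one such layer: a multi-output linear model followed by a nonlinearity (here ReLU, with made-up sizes and random, unlearned weights):

```python
import numpy as np

D_in, D_out = 5, 3                     # layer input and output sizes (made-up)
rng = np.random.default_rng(2)

W = rng.normal(size=(D_out, D_in))     # the layer's weight matrix: a multi-output linear model
x = rng.normal(size=D_in)              # input to the layer

h = np.maximum(0.0, W @ x)             # linear map followed by a ReLU nonlinearity
print(h)                               # the layer's latent feature representation of x
```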

Learning Linear Models

Next Lecture
- Linear Regression

