Seminar
ON
BLIND SOURCE
SEPARATION
By
Srilakshmi.A
11321A0498
Blind source separation (BSS) is the separation of a set of source signals from a set of mixed signals, with little or no information about the sources or the mixing process.
The classic motivation is the cocktail party problem: being able to focus one's auditory attention on a particular stimulus while filtering out a range of other stimuli.
It may also describe a similar ability to detect words of importance from unattended stimuli, for instance hearing one's name in another conversation.
Is BSS a transform?
1. ICA MODEL
Suppose we have N statistically independent source signals, si(t), i = 1, ..., N.
Assume also that we observe these signals using N sensors; we then obtain a set of
N observation signals, xi(t), i = 1, ..., N, that are mixtures of the sources.
A fundamental aspect of the mixing process is that the sensors must be
spatially separated (e.g. microphones that are spatially distributed around a room)
so that each sensor records a different mixture of the sources.
Under this assumption we can model the mixing
process with matrix multiplication as follows:
x(t) = A s(t)    (1)
Figure: Blind source separation (BSS) block diagram. s(t) are the sources,
x(t) are the recordings, ŝ(t) are the estimated sources, A is the mixing matrix
and W is the un-mixing matrix.
The objective is to recover the original signals, si(t), from only the
observed signals xi(t).
We obtain estimates of the sources by first obtaining the unmixing
matrix W, where
W = A^-1.
This enables an estimate, ŝ(t), of the independent sources to be obtained:
ŝ(t) = W x(t)    (2)
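A minimal NumPy sketch of this mixing/unmixing model (illustrative only; the example signals and matrix values are assumptions, not taken from the seminar):

import numpy as np

t = np.linspace(0, 1, 1000)
# two statistically independent sources s(t)
s = np.vstack([np.sin(2 * np.pi * 5 * t),            # sinusoid
               np.sign(np.sin(2 * np.pi * 3 * t))])  # square wave

A = np.array([[1.0, 0.5],    # mixing matrix (unknown in practice, assumed here)
              [0.3, 1.0]])
x = A @ s                    # observed mixtures, equation (1)

# if A were known, W = A^-1 would recover the sources exactly (equation 2)
W = np.linalg.inv(A)
s_hat = W @ x
print(np.allclose(s_hat, s))  # True

ICA estimates W from x alone, using only the statistical independence of the sources.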
2. INDEPENDENCE
A key concept that constitutes the foundation of independent
component analysis is statistical independence.
To simplify the discussion, consider the case of two different
random variables s1 and s2, where s1 is independent of s2.
Here s1 and s2 could be random signals originating from two
different physical processes that are not related to each other.
Under independence we will be dealing with:
a) Independence definition
b) Uncorrelatedness and independence
c) Non-Gaussianity and independence
a) Independence definition
Mathematically, statistical independence is defined in terms of the probability densities of
the signals.
Let the joint probability density function (pdf) of s1 and s2 be p(s1, s2), and let the
marginal pdfs of s1 and s2 be denoted by p1(s1) and p2(s2) respectively. s1 and s2 are said
to be independent if and only if the joint pdf can be factorized as
p(s1, s2) = p1(s1) p2(s2)    (3)
An equivalent condition is that, for any functions g1 and g2,
E{g1(s1) g2(s2)} = E{g1(s1)} E{g2(s2)}    (4)
where E{.} is the expectation operator. In the following section we use the above
properties to explain the relationship between uncorrelatedness and independence.
b) Uncorrelatedness and Independence
Two random variables s1 and s2 are said to be uncorrelated if their covariance is zero:
E{(s1 - ms1)(s2 - ms2)} = E{s1 s2} - E{s1} E{s2} = 0    (5)
where ms1 is the mean of the signal s1. Equations 4 and 5 are identical for independent
variables when taking g1(s1) = s1 and g2(s2) = s2.
Hence independent variables are always uncorrelated.
However, the opposite is not always true: the above discussion shows that
independence is stronger than uncorrelatedness, and hence independence is used
as the basic principle in the ICA source estimation process.
However, uncorrelatedness is also important for computing the mixing matrix in
ICA.
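A small NumPy check of this distinction (an illustrative sketch, not from the seminar): with a symmetric, zero-mean s1, the variable s2 = s1^2 is uncorrelated with s1 yet completely dependent on it.

import numpy as np

rng = np.random.default_rng(0)
s1 = rng.uniform(-1, 1, 100_000)   # symmetric, zero-mean signal
s2 = s1 ** 2                       # fully determined by s1, so not independent

# covariance E{s1 s2} - E{s1}E{s2} is (approximately) zero: uncorrelated
print(np.mean(s1 * s2) - np.mean(s1) * np.mean(s2))        # ~0

# but equation (4) fails, e.g. for g1(s) = s^2 and g2(s) = s: dependence shows up
print(np.mean(s1**2 * s2), np.mean(s1**2) * np.mean(s2))   # clearly different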
c) Non-Gaussianity and Independence
According to the central limit theorem, the distribution of a sum of independent
random variables tends toward a Gaussian distribution under certain conditions.
A mixture of independent sources is therefore generally "more Gaussian" than the
sources themselves, so non-Gaussianity is an important and essential principle in
ICA estimation.
To use non-Gaussianity in ICA estimation, there needs to be a quantitative measure of the
non-Gaussianity of a signal. Before using any measures of non-Gaussianity, the signals should be
normalized. Some of the commonly used measures are kurtosis and entropy.
Kurtosis
Kurtosis is the classical method of measuring non-Gaussianity. When data is preprocessed to
have unit variance, kurtosis is determined by the fourth moment of the data.
The kurtosis of a signal s, denoted by kurt(s), is defined by
kurt(s) = E{s^4} - 3 (E{s^2})^2    (6)
This is the basic definition of kurtosis using the higher-order (fourth-order) cumulant; the
simplification is based on the assumption that the signal has zero mean.
To simplify things, we can further assume that s has been normalized so that its variance is equal
to one: E{s^2} = 1.
Hence equation 6 can be further simplified to
kurt(s) = E{s^4} - 3    (7)
Equation 7 illustrates that kurtosis is a normalized form of the fourth moment E{s^4}.
For a Gaussian signal, E{s^4} = 3 (E{s^2})^2, and hence its kurtosis is zero.
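A short NumPy sketch of equation 7 on normalized signals (illustrative; the choice of example distributions is an assumption):

import numpy as np

def kurt(s):
    # kurtosis of a zero-mean, unit-variance signal: E{s^4} - 3  (equation 7)
    s = (s - s.mean()) / s.std()
    return np.mean(s ** 4) - 3.0

rng = np.random.default_rng(0)
print(kurt(rng.normal(size=100_000)))    # ~0  for a Gaussian signal
print(kurt(rng.uniform(size=100_000)))   # < 0 (sub-Gaussian, e.g. uniform)
print(kurt(rng.laplace(size=100_000)))   # > 0 (super-Gaussian, e.g. Laplacian)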
Entropy
Entropy is a measure of the uniformity of the distribution of a bounded set of values,
such that complete uniformity corresponds to maximum entropy.
In information theory, entropy is regarded as a measure of the randomness of a
signal.
The entropy H of a discrete-valued signal S is defined as
H(S) = -Σi P(S = ai) log P(S = ai)    (10)
This definition of entropy can be generalised for a continuous-valued signal s, called the
differential entropy, and is defined as
H(s) = -∫ p(s) log p(s) ds    (11)
A Gaussian signal has the largest entropy among all signal distributions of equal (unit) variance.
Entropy will be small for signals whose distribution is concentrated on certain values or whose pdf
is very "spiky". Hence, entropy can be used as a measure of non-Gaussianity.
In ICA estimation, it is often desired to have a measure of non-Gaussianity which is zero for a
Gaussian signal and nonzero for a non-Gaussian signal, for computational simplicity. Negentropy,
the entropy of a Gaussian signal of the same variance minus the entropy of the signal, has exactly
this property and is the measure used by FastICA below.
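A rough sketch of these quantities using a histogram-based entropy estimate (an assumption for illustration; practical ICA code uses analytic approximations of negentropy rather than a direct estimate like this):

import numpy as np

def entropy_estimate(s, bins=200):
    # crude differential entropy H = -∫ p log p ds from a histogram (equation 11)
    p, edges = np.histogram(s, bins=bins, density=True)
    w = np.diff(edges)
    mask = p > 0
    return -np.sum(p[mask] * np.log(p[mask]) * w[mask])

rng = np.random.default_rng(0)
gauss = rng.normal(size=200_000)                         # unit variance
lap = rng.laplace(scale=1 / np.sqrt(2), size=200_000)    # unit variance as well

h_gauss, h_lap = entropy_estimate(gauss), entropy_estimate(lap)
print(h_gauss > h_lap)        # True: the Gaussian has the larger entropy
print(h_gauss - h_lap)        # positive: a crude negentropy of the Laplacian signal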
3. ICA Ambiguity
There are two inherent ambiguities in the ICA framework. These are (i)
magnitude and
scaling ambiguity and (ii) permutation ambiguity.
Magnitude and scaling ambiguity
The true variance of the independent components cannot be determined.
To explain, we can rewrite the mixing in equation 1 in the form
x = As = Σ_{j=1}^{N} aj sj    (13)
where aj denotes the jth column of the mixing matrix A.
Since both the coefficients aj of the mixing matrix and the independent
components sj are unknown, we can transform equation 13 into
x = Σ_{j=1}^{N} (1/αj) aj (αj sj)    (14)
for any nonzero scalars αj: any rescaling of a source can be absorbed by the
corresponding column of the mixing matrix.
The natural solution for this is to use the assumption that each source has
unit variance, E{sj^2} = 1.
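A tiny NumPy illustration of this scaling ambiguity (a sketch; the matrices and sources are assumptions):

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
s = rng.laplace(size=(2, 1000))

alpha = 5.0
A2 = A.copy()
A2[:, 0] /= alpha          # rescale the first column of A ...
s2 = s.copy()
s2[0, :] *= alpha          # ... and the first source by the inverse factor

print(np.allclose(A @ s, A2 @ s2))   # True: x is unchanged, so the scale cannot be recovered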
Permutation ambiguity
The order of the estimated independent components is unspecified.
Formally, we can introduce a permutation matrix P and its inverse into the mixing
process in equation 1:
x = A P^-1 P s
  = A' s'    (15)
Here the elements of s' = Ps are the original sources, except in a different
order, and
A' = A P^-1 is another unknown mixing matrix.
Equation 15 is indistinguishable from Equation 1 within the ICA framework,
demonstrating that the permutation ambiguity is inherent to Blind Source
Separation.
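A matching NumPy sketch for the permutation ambiguity (again purely illustrative):

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))
s = rng.laplace(size=(2, 1000))

P = np.array([[0.0, 1.0],      # permutation matrix that swaps the two sources
              [1.0, 0.0]])
A_perm = A @ np.linalg.inv(P)  # A' = A P^-1
s_perm = P @ s                 # s' = P s, the sources in a different order

print(np.allclose(A @ s, A_perm @ s_perm))   # True: the source order cannot be recovered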
4. Preprocessing
Before examining specific ICA algorithms, it is instructive to discuss
preprocessing steps
that are generally carried out before ICA.
Centering
A simple preprocessing step that is commonly performed is to center the
observation
vector x by subtracting its mean vector m = E{x}. We then obtain the centered
observation vector, xc, as follows:
xc = x - m
This step simplifies ICA algorithms by allowing us to assume a zero mean.
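In code, centering is one line (a NumPy sketch, assuming the observations are stored as an N x T array):

import numpy as np

def center(x):
    # subtract the mean of each observed channel: xc = x - m, with m = E{x}
    m = x.mean(axis=1, keepdims=True)
    return x - m, m

x = np.random.default_rng(0).normal(loc=3.0, size=(2, 1000))
xc, m = center(x)
print(xc.mean(axis=1))   # ~[0, 0]: each centered channel has zero mean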
Whitening
Another step which is very useful in practice is to pre-whiten the observation
vector x.
Whitening involves linearly transforming the observation vector such that its
components are
uncorrelated and have unit variance.
Let xw denote the whitened vector; it then satisfies the following equation:
E{xw xw^T} = I
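Whitening is commonly done via the eigenvalue decomposition of the covariance matrix; a minimal sketch (the EVD approach is standard, but the exact code here is an assumption):

import numpy as np

def whiten(xc):
    # xc: centered observations of shape (N, T)
    cov = np.cov(xc)                            # estimate of E{x x^T}
    d, E = np.linalg.eigh(cov)                  # eigenvalues d and eigenvectors E
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T     # whitening matrix V = E D^(-1/2) E^T
    return V @ xc, V

rng = np.random.default_rng(0)
xc = np.linalg.cholesky(np.array([[4.0, 1.5], [1.5, 1.0]])) @ rng.normal(size=(2, 5000))
xw, V = whiten(xc)
print(np.cov(xw).round(2))   # approximately the identity matrix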
5. ICA Algorithms
There are several ICA algorithms available in the literature. However, the
following three algorithms are widely used in numerous signal processing
applications: FastICA, JADE, and Infomax. Each algorithm uses a different
approach to solving the ICA estimation problem.
FastICA
FastICA is a fixed point ICA algorithm that employs higher order statistics
for the recovery
of independent sources.
FastICA uses simple estimates of Negentropy based on the maximum
entropy principle,
which requires the use of appropriate nonlinearities for the learning rule of
the neural network.
The fixed-point iteration can also be derived from minimization of mutual information.
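A minimal one-unit FastICA iteration on whitened data, using the tanh nonlinearity (a sketch of the fixed-point update only, not the full algorithm; in practice one would use an existing implementation such as scikit-learn's FastICA):

import numpy as np

def fastica_one_unit(xw, n_iter=200, seed=0):
    # xw: whitened observations of shape (N, T); returns one row of the un-mixing matrix
    rng = np.random.default_rng(seed)
    w = rng.normal(size=xw.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ xw                                           # projections w^T x
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (xw * g).mean(axis=1) - g_prime.mean() * w    # fixed-point update
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < 1e-8:                  # converged (up to sign)
            return w_new
        w = w_new
    return w

# usage sketch: s_hat = fastica_one_unit(xw) @ xw recovers one source (up to scale and sign)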
Infomax
The BSS algorithm proposed by Bell and Sejnowski is also a gradient-based
neural network algorithm, with a learning rule for information maximization.
Infomax uses higher order statistics for the information maximization.
In ideal cases, it provides the best estimate of the ICA components.
The strength of this algorithm comes from its direct relationship to
information theory.
The algorithm is derived through an information maximization principle,
applied here between the inputs and the nonlinear outputs.
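A rough sketch of the natural-gradient form of this learning rule with a logistic nonlinearity (hedged: this follows the commonly cited Bell-Sejnowski update; the learning rate, batching and stopping rule are assumptions):

import numpy as np

def infomax_step(W, x, lr=0.01):
    # one natural-gradient Infomax update on a batch x of shape (N, T)
    u = W @ x                           # current source estimates
    y = 1.0 / (1.0 + np.exp(-u))        # logistic (sigmoid) outputs
    T = x.shape[1]
    # dW ~ (I + (1 - 2y) u^T) W, averaged over the batch
    dW = (np.eye(W.shape[0]) + (1.0 - 2.0 * y) @ u.T / T) @ W
    return W + lr * dW

# usage sketch: start from W = identity on whitened data and repeat infomax_step
# over batches until W stops changing; the rows of W @ x are the estimated sources.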
2) Undercomplete ICA
The mixture of unknown sources is referred to as undercomplete
when the number of observations (mixtures) is greater than the
number of sources n.
Applications
THANK YOU