This document proposes a novel i-vector based convolutive non-negative matrix factorization (CNMF) approach for noise robust automatic speech recognition. It aims to learn robust speech representations that are invariant to speaker environment and recording conditions. The approach uses CNMF to represent spectrograms as a product of a basis speech dictionary and time activations. It then learns the total variability matrix to model noise and channel variations in the speech dictionary, leading to features invariant to noisy environments. The method is evaluated on the Aurora 4 noisy speech database, and results show the i-vector CNMF approach outperforms traditional filterbank features for noise robust ASR. Future work includes hyperparameter optimization and integrating the features into a GMM-DNN ASR system.
A NOVEL I-VECTOR BASED CNMF APPROACH TOWARDS NOISE ROBUST ASR
Kunal Dhawan1 , Colin Vaz2 , Ruchir Travadi2 and Shrikanth Narayanan2
1 Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
2 Signal Analysis and Interpretation Lab, University of Southern California, Los Angeles, CA 90089
➢ Problem: Commonly used acoustic features for Automatic Speech Recognition are not robust to noise, which degrades ASR performance in noisy conditions.
➢ Aim: Learn a robust representation of speech which is invariant to speaker environment and recording conditions.

Background

1) Non-Negative Matrix Factorization (NMF)

An algorithm to approximate a non-negative matrix V (∈ ℝ≥0^{M×N}) as a product of two non-negative matrices W (∈ ℝ≥0^{M×R}) and H (∈ ℝ≥0^{R×N}), where R ≤ M. To minimize the reconstruction error, the decomposition is done according to an adapted Kullback-Leibler divergence cost metric:

D(V ‖ WH) = Σ_{m,n} [ V_{mn} log( V_{mn} / (WH)_{mn} ) − V_{mn} + (WH)_{mn} ]

This idea can be used in ASR to represent a spectrogram as a product of a basis dictionary and corresponding time activations:
[Figure: Spectrogram (V) ≈ Speech Dictionary (W) × Time Activations (H)]

Spectrogram (V): spectro-temporal representation of a given speech sample.
Speech Dictionary (W): learns a set of basis elements having different time-frequency characteristics.
Time Activations (H): choose the appropriate basis elements for each time frame in a given utterance.
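The full system uses a convolutive extension (CNMF), but the factorization and cost above can be illustrated with plain NMF. Below is a minimal NumPy sketch using the standard multiplicative updates for the KL-divergence cost; the matrix sizes, iteration count, and function names are illustrative choices, not the poster's actual implementation.

```python
import numpy as np

def nmf_kl(V, R, n_iter=200, eps=1e-10, seed=0):
    """Approximate V (M x N, nonnegative) as W (M x R) @ H (R x N)
    by minimizing the generalized KL divergence D(V || WH) with
    the standard multiplicative updates."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((M, R)) + eps
    H = rng.random((R, N)) + eps
    for _ in range(n_iter):
        # H <- H * (W^T (V / WH)) / (W^T 1)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
        # W <- W * ((V / WH) H^T) / (1 H^T)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
    return W, H

def kl_divergence(V, W, H, eps=1e-10):
    """Generalized KL cost D(V || WH) from the Background section."""
    WH = W @ H + eps
    return np.sum(V * np.log((V + eps) / WH) - V + WH)

# Toy usage on a random nonnegative "spectrogram"
V = np.abs(np.random.default_rng(1).standard_normal((64, 100)))
W, H = nmf_kl(V, R=20)
print("KL cost after training:", kl_divergence(V, W, H))
```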
2) Total Variability approach
Model the variations in a dataset as a low-dimensional vector.

[Figure: complete data distribution (UBM) vs. utterance-specific data distributions, whose means are slightly shifted from the UBM]
Feature Extraction Step:
M = m + T·w

M : speaker- and channel-dependent dictionary supervector corresponding to a given utterance
m : UBM supervector (speaker- and channel-independent dictionary supervector)
T : Total Variability Matrix (a rectangular matrix of low rank which models the variation space)
w : i-vector (a random vector with a standard normal distribution), used as the feature for the given utterance
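As an illustration of this feature extraction step, the sketch below computes w as the posterior mean of a standard-normal latent variable given an utterance's sufficient statistics, using the closed-form estimator common in the i-vector literature. Whether the CNMF-based system uses exactly this estimator is an assumption here, and all dimensions and variable names are illustrative.

```python
import numpy as np

def extract_ivector(N_c, F_c, m, T, Sigma):
    """Posterior mean of the i-vector w in  M = m + T w.

    N_c   : (C,)     zeroth-order statistics per dictionary component
    F_c   : (C, D)   first-order statistics per component (D dims each)
    m     : (C*D,)   UBM supervector
    T     : (C*D, R) total variability matrix (R = i-vector dimension)
    Sigma : (C*D,)   diagonal UBM covariance supervector
    """
    C, D = F_c.shape
    R = T.shape[1]
    # Centered first-order stats, flattened into a supervector
    F_tilde = (F_c - N_c[:, None] * m.reshape(C, D)).reshape(-1)
    # Occupancies expanded to supervector dimension
    N_sv = np.repeat(N_c, D)
    # Precision-weighted projection T^T Sigma^-1 (diagonal covariance)
    TtSinv = T.T / Sigma
    L = np.eye(R) + TtSinv @ (N_sv[:, None] * T)   # posterior precision
    w = np.linalg.solve(L, TtSinv @ F_tilde)       # posterior mean = i-vector
    return w

# Toy usage with random statistics (sizes are illustrative only)
rng = np.random.default_rng(0)
C, D, R = 20, 64, 10
N_c = rng.random(C) * 50
F_c = rng.standard_normal((C, D))
m = rng.standard_normal(C * D)
T = rng.standard_normal((C * D, R)) * 0.1
Sigma = rng.random(C * D) + 0.5
print("i-vector:", extract_ivector(N_c, F_c, m, T, Sigma))
```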
Methodology

Step 1: Learn the UBM speech dictionary.
Step 2: Calculate the 0th and 1st order sufficient statistics and estimate the mean and covariance supervectors (an illustrative sketch of one way to compute such statistics is given at the end of this poster).
Step 3: Learn the Total Variability Matrix. The T matrix captures the variations in the training set about the UBM and hence defines the space in which the speech dictionaries will adapt.
Step 4: Calculate the adapted dictionaries for the training set and hence find the corresponding time activations. Learning the noise and channel variations in the speech dictionary itself leads to features which are invariant to the noisy environment.
Step 5: Extract features for the test set using the same method as in Step 4.

Database

Used the Aurora 4 database, which has the following characteristics:
- 7 noise types: airport, babble, car, clean, restaurant, street, train
- 2 microphone position conditions: near field, far field
- Training set of 7138 utterances, each approximately 8 seconds long (noise is not labeled)
- Test set of 330 utterances for each noise+channel type (∴ 7*2 = 14 different conditions)

Result and Discussion

The i-vector approach presented is very effective in modeling the noise type:

[Figure: visualization of the extracted i-vector features for clean speech, car noise, airport noise, and babble noise conditions]

Conclusion and Future Work

⭐ The current system outperforms the classical filterbank features.
⭐ Currently performing a grid search over the hyperparameters (sparsity of the dictionary, number of basis elements to choose) with the final aim of building a GMM-DNN ASR system.
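Illustrative sketch for Step 2 of the methodology: one plausible way to obtain 0th and 1st order statistics in a dictionary/activation setting is to treat column-normalized time activations as soft occupancies of the basis elements, analogous to GMM posteriors in the standard i-vector recipe. This is an assumed, simplified reading of the step rather than the system's exact computation, and the function and variable names are hypothetical.

```python
import numpy as np

def sufficient_statistics(V, H, eps=1e-10):
    """Zeroth- and first-order statistics of an utterance.

    V : (M, Nframes) magnitude spectrogram
    H : (R, Nframes) time activations for R dictionary elements

    Treats column-normalized activations as soft occupancies of each
    basis element, by analogy with GMM posteriors in the i-vector recipe.
    """
    gamma = H / (H.sum(axis=0, keepdims=True) + eps)  # (R, Nframes) occupancies
    N = gamma.sum(axis=1)                             # (R,)   zeroth-order stats
    F = gamma @ V.T                                    # (R, M) first-order stats
    return N, F

# Toy usage with random data (sizes are illustrative only)
rng = np.random.default_rng(2)
V = np.abs(rng.standard_normal((64, 100)))
H = np.abs(rng.standard_normal((20, 100)))
N, F = sufficient_statistics(V, H)
print(N.shape, F.shape)   # (20,) (20, 64)
```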