Music Recommendation
Bibliography
Music Recommendation and Discovery: The Long Tail, Long Fail, and Long Play. Òscar Celma. Springer, 2010. Ch. 1-3, 5.
Recommender Systems. Prem Melville and Vikas Sindhwani. Encyclopedia of Machine Learning, 2010.
Restaurants
Food
Books
And Music!
Digital Era - Portability
Up to ~20 tracks
Up to ~40,000 tracks
Digital Era - Online Services
Amazon
17 Million Songs
Before Digital Era
Digital Era
Music Characteristics
Different from other types of media:
- Tracking users' preferences can be implicit
- Items can be consumed several times (even repeatedly and continuously)
- Instant feedback
- Music consumption depends on context (morning, work, afternoon, etc.)
Formalization
Recommendation Problem
- Prediction problem: estimate how much a given user will like an item (its "likeliness")
- Recommendation problem: recommend a list of N items, assuming the system can predict the likeliness of yet unrated items
Prediction Problem
$U = \{u_1, u_2, \ldots, u_m\}$: the set of users
$I = \{i_1, i_2, \ldots, i_n\}$: the set of items that can be recommended
$I_{u_j} \subseteq I$: the list of items a user $u_j$ has expressed interest in

A function $f(u_a, i_k)$ gives the predicted likeness of item $i_k$ for the active user $u_a$, where $i_k \notin I_{u_a}$

Usually represented by a rating triple $\langle user, item, rating \rangle$
Recommendation Problem
Find a list $L$ of the $N$ items that the user will like the most:
- The ones with the highest predicted likeness $f(u_a, i_k)$ (see the sketch below)
- The resulting list should not contain items from the user's interests: $L \cap I_{u_a} = \emptyset$
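A minimal Python sketch of this top-N selection step; the function and variable names are illustrative, not from the source:

    import numpy as np

    def top_n(predicted, known_items, n=10):
        """Return the indices of the n items with the highest predicted
        likeness, excluding items the user has already interacted with."""
        scores = predicted.astype(float)
        scores[list(known_items)] = -np.inf   # enforce L and I_u being disjoint
        return np.argsort(scores)[::-1][:n]

    # 6 candidate items; the user already knows items 0 and 3
    predicted = np.array([4.5, 2.0, 3.7, 5.0, 1.2, 4.1])
    print(top_n(predicted, known_items={0, 3}, n=3))   # -> [5 2 1]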
Use Cases
Common usages of a recommender system:
1. Find good items
2. Find all good items
3. Recommend sequence (e.g., playlist generation)
4. Just browsing
5. Find credible recommender
6. Express self
7. Influence others
General Model
Users and Items
Two types of recommendations:
- Top-N predicted items
- Top-N predicted neighbors
Implicit Feedback
- Monitoring users' actions (e.g., tracking play, pause, skip and stop buttons in the media player), as in the sketch below
- Problem? Advantage over explicit feedback approaches?
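A toy sketch of how implicit feedback could be turned into preference scores; the event names and weights below are assumptions for illustration, not a standard scheme:

    from collections import Counter

    # Hypothetical event log: (user, track, event)
    events = [
        ("u1", "trackA", "play"),
        ("u1", "trackA", "play"),
        ("u1", "trackB", "skip"),
        ("u2", "trackB", "play"),
    ]

    # Illustrative weights: plays count as positive evidence, skips as negative
    WEIGHTS = {"play": 1.0, "pause": 0.0, "skip": -0.5, "stop": -0.2}

    scores = Counter()
    for user, track, event in events:
        scores[(user, track)] += WEIGHTS[event]

    print(scores[("u1", "trackA")])   # 2.0 (repeated listening, no rating needed)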
Recommendation Methods
Standard classification of recommender systems:
1. Demographic Filtering
2. Collaborative Filtering
3. Content-based Filtering
4. Context-based Filtering
5. Hybrid Approaches
Demographic Filtering
- Used to identify the kind of users that like a certain item
- Classifies user profiles into clusters based on:
  - Personal data (age, gender, marital status, etc.)
  - Geographic data (city, country)
  - Psychographic data (interests, lifestyle, etc.)
Advantages/Limitations
The simplest recommendation method, but:
- Recommendations are too general
- Requires effort from the user to build the profile
Collaborative Filtering
- Predict user preferences for items by learning past user-item relationships
- CF methods work by building a matrix M with n items and m users that contains the interactions (e.g., ratings, plays, etc.) of the users with the items
Collaborative Filtering
The value $M_{u,i}$ represents the rating of user $u$ for item $i$
Item-Based Neighborhood
Only users that rated both items $i$ and $j$ are taken into account in the process
Item-Based Neighborhood
1. Compute the similarity between two items, $i$ and $j$
2. Example: adjusted cosine similarity, where each rating is offset by the user's mean rating (a code sketch follows below):

$sim(i,j) = \dfrac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}$
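A minimal numpy sketch of adjusted cosine item similarity; the toy matrix and the function name are illustrative:

    import numpy as np

    def adjusted_cosine(R, i, j):
        """Adjusted cosine similarity between items i and j.
        R is a users x items matrix with np.nan for missing ratings."""
        rated_both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
        if not rated_both.any():
            return 0.0
        means = np.nanmean(R, axis=1)                 # each user's mean rating
        di = R[rated_both, i] - means[rated_both]
        dj = R[rated_both, j] - means[rated_both]
        denom = np.sqrt((di ** 2).sum()) * np.sqrt((dj ** 2).sum())
        return float(di @ dj / denom) if denom else 0.0

    # Toy matrix: 3 users x 3 items, np.nan = not rated
    R = np.array([[5.0, 3.0, np.nan],
                  [4.0, np.nan, 2.0],
                  [1.0, 5.0, 4.0]])
    print(adjusted_cosine(R, 0, 1))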
User-Based Neighborhood
Compute the predicted rating of item $i$ for the active user $u$, taking into account the users that are similar to $u$:

$\hat{r}_{u,i} = \bar{r}_u + \dfrac{\sum_{v \in N_k(u)} sim(u,v)\,(r_{v,i} - \bar{r}_v)}{\sum_{v \in N_k(u)} |sim(u,v)|}$

where:
- $\bar{r}_u$: average rating for user $u$
- $N_k(u)$: set of $k$ neighbors for user $u$ (the similar ones)
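A numpy sketch of this prediction step, assuming a precomputed user-user similarity matrix; all names are illustrative:

    import numpy as np

    def predict_rating(R, sims, u, i, k=2):
        """User-based prediction of user u's rating for item i.
        R: users x items matrix (np.nan = unrated); sims: users x users."""
        means = np.nanmean(R, axis=1)
        rated = np.where(~np.isnan(R[:, i]))[0]       # users who rated item i
        rated = rated[rated != u]
        if rated.size == 0:
            return means[u]
        neighbors = rated[np.argsort(sims[u, rated])[::-1][:k]]  # top-k similar
        w = sims[u, neighbors]
        dev = R[neighbors, i] - means[neighbors]
        denom = np.abs(w).sum()
        return means[u] if denom == 0 else means[u] + (w @ dev) / denom

    R = np.array([[5.0, np.nan, 4.0],
                  [4.0, 2.0, 4.0],
                  [1.0, 5.0, np.nan]])
    sims = np.array([[1.0, 0.9, -0.5],
                     [0.9, 1.0, -0.3],
                     [-0.5, -0.3, 1.0]])
    print(predict_rating(R, sims, u=0, i=1))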
Matrix Factorization
- Useful when the user-item matrix M is sparse
- Reduces the dimensionality of the original matrix, generating matrices that approximate the original one
- Example: SVD (Singular Value Decomposition). Computes matrices $U_k$, $S_k$ and $V_k$, for a given number $k$, such that $M_k = U_k \cdot S_k \cdot V_k^T$ approximates $M$, where $S_k$ is the diagonal matrix containing the $k$ largest singular values of $M$
Matrix Factorization
After the matrix reduction we can calculate the predicted rating of item $i$ for a user $u$ (a sketch follows below)
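A small sketch of rank-k SVD prediction with numpy; note that in practice missing ratings must first be filled (e.g., with item means) before a plain SVD can be applied. The matrix and names here are illustrative:

    import numpy as np

    def svd_predict(M, k):
        """Rank-k approximation of a (filled) user-item matrix M;
        entry [u, i] is the predicted rating of item i for user u."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

    # Toy matrix with missing ratings already filled by column means
    M = np.array([[5.0, 3.0, 4.0],
                  [4.0, 3.0, 2.0],
                  [1.0, 5.0, 4.0],
                  [2.0, 4.0, 3.0]])
    M_hat = svd_predict(M, k=2)
    print(round(M_hat[0, 2], 2))   # predicted rating of item 2 for user 0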
Limitations
- Data sparsity and high dimensionality
- Gray sheep problem
- Cold-start problem (early-rater problem)
- Does not take item descriptions into account
- Popularity bias
- Feedback loop
Content-based Filtering
- Uses information describing the items
- The process of characterizing the item data set can be:
  - Manual (annotations by domain experts)
  - Automatic (extracting features by analyzing the content)
Similarity Functions
1. Euclidean
2. Manhattan
3. Chebyshev
4. Mahalanobis
(see the sketch below)
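Minimal implementations of these distance functions over item feature vectors; the toy feature values are made up:

    import numpy as np

    def euclidean(x, y):
        return np.sqrt(((x - y) ** 2).sum())

    def manhattan(x, y):
        return np.abs(x - y).sum()

    def chebyshev(x, y):
        return np.abs(x - y).max()

    def mahalanobis(x, y, cov):
        """Mahalanobis distance, given the covariance matrix of the features."""
        d = x - y
        return np.sqrt(d @ np.linalg.inv(cov) @ d)

    # Toy audio-feature vectors for two songs (e.g., energy, tempo, danceability)
    a = np.array([0.8, 120.0, 0.3])
    b = np.array([0.6, 128.0, 0.4])
    print(euclidean(a, b), manhattan(a, b), chebyshev(a, b))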
Limitations
- Cold-start problem (only for new users, since new items can be analyzed directly)
- Gray-sheep problem
- Novelty (?)
- Limitations of automatically extracted features
- Subjective aspects (personal opinions) are not taken into account
Context-based Filtering
Uses context information to describe and characterize the items
- Context information: any information that can be used to characterize a situation or an entity
- Context != Content
Web Mining
Three different web mining categories:
- Web content mining: text, hypertext, markup, and multimedia mining
- Web structure mining
- Web usage mining
Social Tagging
- Aims at annotating web content using tags
- Tags are freely chosen keywords, not constrained to a predefined vocabulary
- Recommender systems can use social tagging data to derive item (or user) similarity
Social Tagging
When users tag items, we get tuples of:

$\langle user, item, tag \rangle$
Social Tagging
Two approaches to compute item (and user) similarity:
1. Unfold the 3-order tensor into three bidimensional matrices (user-tag, item-tag and user-item matrices), as sketched below
2. Directly use the 3-order tensor
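A numpy sketch of the first approach: unfolding a small user-item-tag tensor into the three bidimensional matrices and deriving item similarity from the item-tag one; the ids and counts are made up:

    import numpy as np

    # Hypothetical <user, item, tag> tuples, encoded as integer ids
    tuples = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 2, 2), (2, 1, 1)]
    n_users, n_items, n_tags = 3, 3, 3

    # Build the 3-order tensor T[user, item, tag]
    T = np.zeros((n_users, n_items, n_tags))
    for u, i, t in tuples:
        T[u, i, t] += 1

    # Unfold into the three bidimensional matrices
    user_tag = T.sum(axis=1)     # users x tags
    item_tag = T.sum(axis=0)     # items x tags
    user_item = T.sum(axis=2)    # users x items

    # Cosine similarity between items from their tag profiles
    norms = np.linalg.norm(item_tag, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    item_sim = (item_tag / norms) @ (item_tag / norms).T
    print(item_sim)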
Limitations
- Coverage
- Problems with tags:
  - Polysemy
  - Synonymy
  - Usefulness of personal tags
  - Sparsity
- Attacks / Vandalism
Hybrid Approaches
Goal: achieve better recommendations by combining some of the previous approaches

Methods:
- Weighted
- Switching
- Mixed
- Cascade
MUSIC RECOMMENDATION
Use Cases
Main task of a music recommendation system:
Propose interesting music, consisting of a mix of known and unknown artists and their available tracks, given a user profile
Use Cases
- Artist Recommendation
- Playlist Generation
  - Shuffle / random playlists
  - Personalized playlists
Neighbor Recommendation
Useful to:
- Improve music recommendation
- Share your preferences with others
Type of Listeners
Each type of listener needs a different type of recommendation
Three types of music metadata:
- Editorial metadata: provided by editors / domain experts (e.g., artist, album, genre)
- Cultural metadata: produced by the community or environment (e.g., tags, playcounts, reviews)
- Acoustic metadata: extracted by analyzing the audio signal
Collaborative Filtering
- CF makes use of the editorial and cultural information
  - Explicit feedback: ratings of songs / artists
  - Implicit feedback: tracking user listening habits
- Recommendations are usually performed at the artist level, but listening habits are recorded at the song level, so aggregation is needed (see the sketch below)
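A minimal sketch of this aggregation step, where song-level plays are rolled up into user-artist playcounts that then feed the CF matrix; the artist and track names are just examples:

    from collections import Counter

    # Hypothetical song-level listening events: (user, artist, track)
    plays = [
        ("u1", "Radiohead", "Karma Police"),
        ("u1", "Radiohead", "Creep"),
        ("u1", "Portishead", "Glory Box"),
        ("u2", "Radiohead", "Creep"),
    ]

    # Aggregate to artist-level playcounts per user
    artist_counts = Counter((user, artist) for user, artist, _ in plays)
    print(artist_counts[("u1", "Radiohead")])   # 2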
Content-Based Filtering
- Uses content extracted from the music itself to provide recommendations
- Computes similarity among songs in order to recommend music to the user
- But: manual annotations can be more accurate than automatically extracted features
Example: Pandora
- Analysts annotate ~400 parameters per song, using a ten-point scale per attribute
- ~15,000 songs analyzed per month
Hybrid Methods
- Allows a system to minimize the issues that a single method can have
- How the cascade approach works (a sketch follows below):
  - One technique is applied first, obtaining a ranked list of items
  - A second technique then refines or re-ranks the results obtained in the first step
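An illustrative sketch of a cascade hybrid, under the assumption that a cheap first technique shortlists candidates and a second one re-ranks them; all names are hypothetical:

    def cascade_recommend(items, primary_score, refine_score, n=10, shortlist=50):
        """Rank all items with the primary technique, keep a shortlist,
        then re-rank the shortlist with the second technique."""
        ranked = sorted(items, key=primary_score, reverse=True)[:shortlist]
        return sorted(ranked, key=refine_score, reverse=True)[:n]

    # Toy usage: CF scores first, content-based scores to re-rank
    items = ["a", "b", "c", "d"]
    cf = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.1}
    content = {"a": 0.2, "b": 0.7, "c": 0.9, "d": 0.3}
    print(cascade_recommend(items, cf.get, content.get, n=2, shortlist=3))  # ['c', 'b']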
Hybrid Method
Example:
Tiemann et al. investigate ensemble learning methods for hybrid music recommendation. Their approach combines social and content-based methods, each producing a weak learner; a combination rule then unifies the outputs of the weak learners.
EVALUATION
Evaluation
Three different strategies:
- System-centric
- Network-centric
- User-centric
System-centric Evaluation
Measures how accurately the system can predict the values that users have previously assigned
System-centric Evaluation
- Most approaches are based on the leave-n-out method
- Similar to classic n-fold cross-validation
System-centric Evaluation
Metrics:
- Predictive accuracy: Mean Absolute Error, Root Mean Squared Error
- Decision based: Precision, Recall, F-measure, Mean Average Precision, Accuracy, ROC
- Rank based: Spearman's ρ, Kendall's τ, Half-life Utility, Discounted Cumulative Gain
(a code sketch of some of these metrics follows below)
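Small implementations of a predictive-accuracy and a decision-based metric; the toy data and names are illustrative:

    import numpy as np

    def mae(actual, predicted):
        return np.abs(actual - predicted).mean()

    def rmse(actual, predicted):
        return np.sqrt(((actual - predicted) ** 2).mean())

    def precision_recall_at_n(recommended, relevant, n):
        """Decision-based metrics over a top-n recommendation list."""
        top = set(recommended[:n])
        hits = len(top & relevant)
        return hits / n, hits / len(relevant)

    actual = np.array([4.0, 3.0, 5.0, 2.0])
    predicted = np.array([3.5, 3.0, 4.0, 2.5])
    print(mae(actual, predicted), rmse(actual, predicted))            # 0.5, ~0.61
    print(precision_recall_at_n(["a", "b", "c"], {"a", "c", "d"}, n=3))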
Network-centric Evaluation
Evaluation aims at measuring the topology of the item (or user) similarity network
Network-centric Evaluation
The similarity network is the basis for providing the recommendations
It is important to analyze and understand its underlying topology

Measures:
- Coverage
- Diversity of recommendations
Network-centric Evaluation
In terms of (a sketch follows below):
- Navigation: Average Shortest Path, Giant Component
- Connectivity: Degree Distribution, Degree-Degree Correlation, Mixing Patterns
- Clustering: Local/Global Clustering Coefficient
- Centrality: Degree, Closeness, Betweenness
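A short sketch computing a few of these measures on a toy item-similarity network, assuming the networkx library is available:

    import networkx as nx

    # Toy item-similarity network: nodes are items, edges link similar items
    G = nx.Graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")])

    print(nx.average_shortest_path_length(G))              # navigation
    print(len(max(nx.connected_components(G), key=len)))   # giant component size
    print(nx.degree_assortativity_coefficient(G))          # degree-degree correlation
    print(nx.average_clustering(G))                        # clustering coefficient
    print(nx.betweenness_centrality(G)["c"])               # centrality of a hub item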
Network-centric Evaluation
Limitations:
- Accuracy of the recommendations cannot be measured
- Transparency (explainability) and trustworthiness (confidence) of the recommendations cannot be measured
- The perceived quality (i.e., usefulness and effectiveness) of the recommendations cannot be measured
User-centric Evaluation
Focuses on the users' perceived quality and usefulness of the recommendations
User-centric Evaluation
Copes with the limitations of both system- and network-centric approaches

Evaluates:
- Novelty
- Perceived quality
User-centric Evaluation
- Gathering feedback (explicit, implicit)
- Perceived quality
- Novelty
- A/B testing
Perceived Quality
- Easiest way to measure? Explicitly ask the users
- The user needs information about:
  - The item (e.g., metadata, preview, etc.)
  - The reasons why the item was recommended
- Users can then rate the quality of each recommended item (or the whole list)
Novelty
- Ask users whether they recognize the predicted items or not
- Combining novelty and perceived quality, we can infer whether a user:
  - Likes to receive and discover unknown items, or
  - Prefers more conservative and familiar recommendations
A/B Testing
- Present two different versions of an algorithm (or two different algorithms) and evaluate which one performs best
- Performance is measured by the impact the new algorithm has on visitors' behavior, compared to the baseline algorithm
User-centric Evaluation
Limitations:
Need of user intervention in the evaluation process
Gathering feedback from the user can be tedious for some users Time-consuming
Evaluation summary
Combining the three methods we can cover all the facets of a recommender algorithm
Evaluation summary
- System-centric: evaluates the prediction accuracy of the algorithm
- Network-centric: analyses the structure of the similarity network
- User-centric: measures users' satisfaction with the recommendations they receive
Statistics:
- ~108,000 artists with a MusicBrainz ID
- ~70,000 artists without a MusicBrainz ID

Statistics:
- ~190,000 artists with a MusicBrainz ID
- ~107,000 artists without a MusicBrainz ID
NEXTONE PLAYER
NEXTONE PLAYER: A Music Recommendation System Based on User Behavior
Yajie Hu and Mitsunori Ogihara, ISMIR 2011
The End
Thank you!