Deep Learning Based Context Aware Recommender System


MASTER ISG TUNIS/ 2018

UNIVERSITY OF TUNIS
INSTITUT SUPÉRIEUR DE GESTION

RESEARCH MASTER'S THESIS

SPECIALTY
DECISION SUPPORT COMPUTING SCIENCES AND TECHNIQUES
OPTION
COMPUTER SCIENCE AND KNOWLEDGE MANAGEMENT

Deep Learning Based Context-Aware
Recommender System

MARWEN BDIRI

DEFENDED IN OCTOBER 2018 BEFORE A JURY COMPOSED OF:

SAOUSSEN KRICHEN, PROFESSOR, UNIVERSITY OF TUNIS, CHAIR
KAOUTHER NOUIRA FERCHICHI, ASSISTANT PROFESSOR, UNIVERSITY OF TUNIS, THESIS SUPERVISOR
SALSABIL TRABELSI, ASSISTANT PROFESSOR, UNIVERSITY OF TUNIS, REVIEWER

LABORATORY: BESTMOD

Academic Year 2017-2018


Résumé (translated from French):

Thanks to their ability to anticipate the items that may interest users, recommender systems
have become a common technique for helping users find interesting items in a large collection
of items such as films, books and music. Advanced recommender system approaches rely on
machine learning and deep learning to obtain more accurate and personalized recommendations
for users. The aim of this master's thesis is first to examine recommendation techniques based
on deep learning, and then to design a new context-aware recommendation algorithm based on
this technology.

Keywords: Recommender Systems, Context-Aware Recommender Systems, Machine Learning,
Deep Learning.

Abstract:

Thanks to their ability to anticipate items that may be of interest to users, recommender
systems have become a common technique to help users find interesting items
within large sets of items such as movies, books, and music. Advanced
recommender system approaches rely on machine learning and deep learning
to produce more accurate and personalized recommendations
for users. The aim of this master's thesis is first to review existing recommendation
techniques that use Deep Learning and then to design a new contextual
recommendation algorithm based on this technology.

Keywords: Recommender systems, Context-Aware recommender systems, Machine
learning, Deep Learning.
Acknowledgments

I would like to express my deepest gratitude and thanks to my advisor Dr. Kaouther Nouira
Ferchichi who has always encouraged me throughout the progress of this master thesis. Thank
you for your patience and your continuous guidance during the preparation of this work. I have
learnt many things through your useful comments, your interesting remarks and the precious
discussions we have had during our collaboration.

I would especially like to thank all the members of BESTMOD laboratory of the Institut Supérieur
de Gestion Tunis, University of Tunis and all my professors.

Special thanks go to my family. Words cannot express how grateful I am to my parents and
my two sisters for all your patience and all the sacrifices that you have made on my behalf.

Contents

Introduction

1 Recommender Systems and Deep Learning
    Introduction
    1.1 Basic Concepts of Recommender Systems
        1.1.1 Content-Based Recommendation
        1.1.2 Collaborative Filtering Recommendation
        1.1.3 Hybrid Recommendation
        1.1.4 Further Recommendation Technique
    1.2 Fundamentals of Deep Learning
        1.2.1 Artificial Neural Networks
        1.2.2 Convolutional Neural Networks
        1.2.3 Recurrent Neural Networks
        1.2.4 Backpropagation
        1.2.5 Optimization Techniques
    Conclusion

2 Deep Learning based Recommender System Review
    Introduction
    2.1 Content-Based Recommendation
    2.2 Collaborative Filtering Recommendation
    2.3 Hybrid Approach Recommendation
    2.4 Advanced Approach Recommendation
    2.5 Discussion
    Conclusion

3 Deep Learning Approach for Context-Aware Recommendation
    Introduction
    3.1 Context-Aware Recommendation
    3.2 Deep Learning based method for Context-Aware Recommendation
        3.2.1 Input Layer
        3.2.2 Embedding Layer and Concatenation Layer
        3.2.3 Neural Network Layer
        3.2.4 Output Layer
    3.3 Implementation Details
    Conclusion

4 Experimental Study
    Introduction
    4.1 Datasets Description
    4.2 Evaluation Measures
    4.3 Empirical Results
        4.3.1 Experimental environment
        4.3.2 Obtained Results
    Conclusion

Conclusion

List of Figures

1.1 Collaborative Filtering Process
1.2 User-Based Collaborative Filtering
1.3 Item-Based Collaborative Filtering
1.4 The relationships between Artificial intelligence, Machine Learning and Deep Learning
1.5 Biological Neuron versus Artificial Neuron
1.6 Non-Linear Function used in Neural Networks
1.7 A Feed-Forward Neural Network
1.8 Convolutional Neural Networks
1.9 Recurrent Neural Networks
2.1 Statistics of publication counts in four Recommender Systems categories
2.2 Statistics of publication counts in four Recommender Systems categories by application domain
3.1 The architecture of the proposed Deep Context-Aware Recommendation method
3.2 Transformation of Context-Aware Recommendation data set
4.1 Evaluation of DCARS over different methods on Tripadvisor dataset
4.2 Evaluation of DCARS over different methods on Tripadvisor dataset
4.3 Evaluation of DCARS Movielens 1 million using different optimization algorithms
4.4 Evaluation of DCARS Movielens 1 million using iteration numbers
4.5 Evaluation of DCARS Movielens 1 million using different layers numbers

List of Tables

4.1 Statistics of the Recommendation Dataset
4.2 Optimization algorithms over training and testing using MovieLens 1M dataset

List of Algorithms

3.1 Deep Context-Aware Recommender system

Introduction

With the rise of big data and the sheer volume of information on the web, it is becoming
difficult for Internet users to navigate this sea of information effectively. For instance, a user
browsing an online store such as Amazon does not wish to go through tens, or perhaps hundreds,
of uninteresting items before finding a desired object, such as a book or movie, to buy. In other
words, this huge number of options makes it harder for users to find exactly the item they want
(Linden et al., 2003). This issue arises in many domains, including Multimedia, E-commerce and
Tourism, and in many companies such as Google, Spotify, Amazon and Tripadvisor. Thus, in order
to be competitive, such companies should provide their users with satisfying services.
Owing to their capability to overcome information overload, recommender systems have become
a common solution that can provide suitable recommendations to users amid large amounts of
information (Ricci et al., 2015). Recommender systems can be defined as software tools and
techniques that help users make the best choice within a certain domain (Linden et al., 2003).
Two common kinds of recommendation can be distinguished in the literature: Non-Personalized
recommendation and Personalized recommendation (Ricci et al., 2015). The former is considered
less personal since it provides the same recommendation to a group of people. For instance, this
approach can be used in magazines and newspapers, and typical examples include top ten
selections of news, trends, books, etc. Non-Personalized recommendations are much simpler to
generate, which is why they are not typically addressed by recommender systems researchers.
Personalized recommendation, on the other hand, is customer-friendly: it helps a user find the
most suitable products or services based on his or her preferences and constraints. Such systems
collect either explicit information from the user's activities, such as product ratings, or implicit
information, such as navigation history (Ricci et al., 2015; Linden et al., 2003). In this setting,
this thesis relies on personalized recommendation systems, as they are the most widely used and
most challenging approach in both research and industry.

Motivation

Personalized recommendation systems use two main kinds of information to provide
recommendations: the users and the items. Based on these data, recommendation can be achieved
using three common approaches (Linden et al., 2003): the Content-Based approach, the
Collaborative Filtering approach and the Hybrid approach. The Content-Based approach uses the
content or features of the items within the system to perform recommendation (Lops et al., 2011).
The Collaborative Filtering approach uses the interactions and similarities between users and
items to predict the items relevant to a user (Linden et al., 2003). Finally, Hybrid
recommendation combines the advantages of both techniques (Burke, 2007).

Several research works have been conducted to enhance the accuracy and the performance of
recommendation techniques which use only items and users (Bobadilla et al., 2013). However,
in many modern applications this information is not sufficient (Adomavicius & Tuzhilin, 2015).
Indeed, such traditional representations consider only the interaction or similarity between
users and items, without any other information such as the time at which the interaction took
place, the location, or the mood of the user. For instance, a travel agency makes a good
recommendation if it offers a ski package to someone who likes skiing, but offering him this
package in the summer would not be a good idea. In other words, a vacation recommendation
should not be the same in winter as in summer. Thus, it is important to incorporate contextual
information in the recommendation process.

In contrast to traditional techniques, the Context-Aware Recommender System is a more recent,
advanced recommendation technique that incorporates contextual information such as time,
location, weather, the user's mood or the type of device into the recommendation process in
order to enhance recommendation results (Adomavicius & Tuzhilin, 2015). Such a process adds
new features to the recommendation model beyond the user and item dimensions, and this can
provide more accurate and personalized recommendations to users. Context-Aware recommendation
is considered a challenging task due to the large number of features that can be involved in the
recommendation process, and classical recommendation techniques such as Collaborative Filtering
models are not able to deal accurately with such data due to the high-dimensional nature of the
problem (Adomavicius & Tuzhilin, 2015). Hence the need for new models that can easily deal
with such data.

Deep Learning is a sub-field of Machine Learning that has shown tremendous success in a wide
range of research areas such as Image Recognition (Wan et al., 2014), Natural Language
Processing (Sarikaya et al., 2014), Automatic Speech Recognition (Hinton et al., 2012) and
Biomedical Informatics (Holzinger & Jurisica, 2014). Recently, Deep Learning has been applied
to recommender system problems; in fact, two Deep Learning workshops have been held in
conjunction with the ACM Conference on Recommender Systems (RecSys) to promote the use of
this technology in the recommendation field (Singhal et al., 2017; Zhang et al., 2017). Despite
this success, especially on complex tasks, the application of Deep Learning to recommendation
is still limited, especially for Context-Aware recommendation. Thus, this thesis relies on this
technology to create a recommender system that is able to deal with a large number of features
in order to provide Context-Aware personalized recommendations.

Purpose of the Work

The purpose of this master's thesis is first to review existing recommendation techniques based
on Deep Learning technology and then to design a new recommendation algorithm based on Deep
Learning models that can deal with contextual features in order to improve recommendation
accuracy.

Thesis Outline

This thesis is structured as follows: the first chapter provides the fundamental concepts of
recommender systems and Deep Learning technology. The second chapter reviews some
applications of Deep Learning in recommender systems. The third chapter describes the proposed
method. Finally, the fourth chapter presents the experimental study and the obtained results.

Chapter 1
Recommender Systems and Deep
Learning

Introduction

This chapter provides an overview of recommender systems and Deep Learning technology.
Section 1.1 presents the basic concepts of recommender systems and Section 1.2 presents the
basic concepts of Deep Learning technology.

1.1 Basic Concepts of Recommender Systems

Since they were introduced in the early 1990s, recommender systems have become an important
field of research that can help users deal with information overload (Goldberg et al., 1992).
Recommender systems can be defined as software tools and techniques that suggest the items
most likely to interest a specific user, such as books and articles to read, movies to watch,
music to listen to, and restaurants to dine at (Ricci et al., 2015). An example of such a system
can be seen on the popular website Amazon.com, which employs a recommender system to
personalize the online store for each customer in order to assist them in selecting a book to
read (Linden et al., 2003). Based on how recommendations are made, recommender systems
can be classified into three main categories: Content-Based, Collaborative Filtering and Hybrid
recommender systems (Adomavicius & Tuzhilin, 2005).

1.1.1 Content-Based Recommendation

The Content-Based recommendation approach has its roots in two main fields: Information
Retrieval and Information Filtering (Ricci et al., 2015). This approach performs recommendation
by suggesting items similar to the ones the user liked in the past. To do so, it computes and
compares the similarity between the content or features associated with the items in a data set
(Adomavicius & Tuzhilin, 2005). For instance, if a user has listened to a song that belongs to the
Jazz genre, then a system following this approach can learn to recommend other songs from this
genre (i.e., Jazz). A good application of such systems could be a movie recommendation system.
In this case, the Content-Based technique can be used to find the commonalities between all the
movies that a user has seen in the past, such as specific genres, directors or actors. Then, it
recommends only the movies that have a high similarity with the movies that have been watched
in the past (Lops et al., 2011).
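
The Content-Based idea described above can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in this thesis: the item names, the three genre flags and the choice of cosine similarity over an averaged user profile are all assumptions made for the example.

```python
import math

# Hypothetical item features: flags for the genres (jazz, rock, classical).
ITEM_FEATURES = {
    "song_a": [1, 0, 0],
    "song_b": [1, 0, 0],
    "song_c": [0, 1, 0],
    "song_d": [1, 0, 1],
}


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_profile(liked_items):
    """Average the feature vectors of the items the user liked in the past."""
    vectors = [ITEM_FEATURES[item] for item in liked_items]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend_content_based(liked_items, top_n=2):
    """Rank the unseen items by similarity to the user's profile."""
    profile = build_profile(liked_items)
    candidates = [item for item in ITEM_FEATURES if item not in liked_items]
    scored = [(cosine(profile, ITEM_FEATURES[item]), item) for item in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in scored[:top_n]]
```

A user who liked only `song_a` (Jazz) gets the other Jazz songs first, and the Rock song is ranked last because it shares no feature with the profile.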

1.1.2 Collaborative Filtering Recommendation

Collaborative Filtering recommendation is considered the most popular and widely used
approach in the recommender systems field (Ricci et al., 2015). The basic idea of this approach
is to provide item recommendations or predictions based on the opinions of other users with
similar preferences within the system. These opinions can be obtained explicitly, such as a
rating on a scale (a value from 1 to 5), or implicitly, such as navigation history or interaction
with the system (Sarwar et al., 2001). Examples of popular websites that use this technique are
Amazon.com and Netflix (Koren, 2008).
Collaborative Filtering recommends new items to a user based on the items the user previously
liked and the opinion history of similar users who share the same preferences (Ricci et al.,
2015). As delineated in Figure 1.1, the input of the recommendation process is a two-dimensional
$m \times n$ matrix of rating values, where $m$ is the number of users and $n$ is the number of
items within the system. Each cell in the matrix represents an opinion, also called a rating value
($r_{ui}$), expressed by user $u$ for item $i$. An opinion can be explicit information given by the
user, such as a rating score, or implicit information, such as transaction records or page
navigation history. The list of $m$ users is referred to as $U = \{u_1, u_2, u_3, \ldots, u_m\}$ and
the list of $n$ items as $I = \{i_1, i_2, i_3, \ldots, i_n\}$. Collaborative filtering provides two
forms of results for the active user: a prediction, which expresses how much the active user is
likely to like an item $i_j$ (e.g., a rating value from 1 to 5), and a list of $N$ items (Top-N
recommendations) that the user might like the most.

Figure 1.1: Collaborative Filtering Process.

Collaborative filtering algorithms can further be divided into two main techniques:
Memory-Based techniques and Model-Based techniques. The following two sub-sections describe
each of them in turn.

Memory-Based Collaborative Filtering

This technique is also referred to as the Neighborhood-Based approach or Heuristic approach.
It has attracted a lot of interest from researchers and industry due to its simplicity, efficiency,
and ability to produce accurate and personalized recommendations (Ning et al., 2015). It is based
on the principle that similar users prefer similar items, and similar items are preferred by similar
users (Ning et al., 2015). Thus, this approach focuses on the relationships between items and
users within the ratings matrix in order to select the most similar ones as a recommendation.
Two main strategies can be used to implement Memory-Based Collaborative Filtering:
User-Based and Item-Based Collaborative Filtering.


User-Based Collaborative Filtering: estimates the rating of an active user for a new item
using the opinions given to this item by the users most similar to the active user, also called the
Nearest-Neighbor users (Ning et al., 2015). Figure 1.2 presents a toy example of a movie
recommendation system using User-Based Collaborative Filtering. The figure considers Eric as the
active user and Lucy as a user similar to Eric: Eric has watched and rated the movies Titanic and
Forrest Gump, and Lucy has also watched both movies and rated them positively.
Accordingly, one can conclude that the two users have similar tastes and that Eric will also enjoy
the movies that Lucy enjoys. Thus, it is natural to recommend to Eric the movies liked by
Lucy.

Figure 1.2: User-Based Collaborative Filtering
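
As a rough sketch of the User-Based strategy, the following Python code predicts a rating for the active user from the ratings of the most similar users. The rating values, the third user "Tom", the extra movies "Alien" and "Cast Away", and the use of Pearson correlation with a similarity-weighted average are assumptions chosen for illustration; the source only fixes the Eric and Lucy scenario.

```python
import math

# Hypothetical 1-5 ratings; "Alien", "Cast Away" and "Tom" are added for illustration.
RATINGS = {
    "Eric": {"Titanic": 5, "Forrest Gump": 4, "Alien": 1},
    "Lucy": {"Titanic": 5, "Forrest Gump": 5, "Alien": 2, "Cast Away": 5},
    "Tom": {"Titanic": 1, "Forrest Gump": 2, "Alien": 5, "Cast Away": 1},
}


def pearson(a, b):
    """Pearson correlation computed over the items both users have rated."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = math.sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * math.sqrt(
        sum((b[i] - mean_b) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict_user_based(ratings, user, item, k=2):
    """Similarity-weighted average of the ratings of the k nearest neighbors."""
    neighbors = [
        (pearson(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    # Keep only positively correlated users, most similar first.
    neighbors = sorted((n for n in neighbors if n[0] > 0), reverse=True)[:k]
    if not neighbors:
        return None
    return sum(s * r for s, r in neighbors) / sum(s for s, _ in neighbors)
```

With these toy ratings Lucy is strongly correlated with Eric while Tom is negatively correlated, so the prediction for Eric on "Cast Away" follows Lucy's rating only.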

Item-Based Collaborative Filtering: this method relies on item similarity to perform
recommendation. It looks at the ratings given to neighboring items in order to recommend
the potentially interesting items to an active user (Sarwar et al., 2001). Figure 1.3 illustrates
an example of this method on a movie recommender system. In this example, Eric has watched
the movie Punchline; accordingly, the system determines the similarity between this movie and
the potentially interesting movies. In this tiny example, the recommendation will be Forrest
Gump, which has the same actor (i.e., Tom Hanks) and genre (i.e., comedy/drama) as
Punchline.


Figure 1.3: Item-Based Collaborative Filtering
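
The Item-Based strategy can be sketched in the same spirit. The code below compares items through the ratings they received and predicts a rating as a similarity-weighted average over the items the active user already rated. All rating values and the users other than Eric are hypothetical, and plain cosine similarity is only one possible choice (Sarwar et al., 2001, also discuss correlation-based and adjusted cosine variants).

```python
import math

# Hypothetical 1-5 ratings; only the movie titles echo the example in the text.
RATINGS = {
    "Eric": {"Punchline": 5, "Alien": 1},
    "Lucy": {"Punchline": 4, "Forrest Gump": 5, "Alien": 2},
    "Tom": {"Punchline": 5, "Forrest Gump": 4, "Alien": 1},
    "Ann": {"Punchline": 1, "Forrest Gump": 2, "Alien": 5},
}


def item_vector(ratings, item):
    """Ratings received by one item, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine_items(ratings, item_i, item_j):
    """Cosine similarity between two items over the users who rated both."""
    vi, vj = item_vector(ratings, item_i), item_vector(ratings, item_j)
    common = [user for user in vi if user in vj]
    if not common:
        return 0.0
    num = sum(vi[u] * vj[u] for u in common)
    den = math.sqrt(sum(vi[u] ** 2 for u in common)) * math.sqrt(
        sum(vj[u] ** 2 for u in common)
    )
    return num / den if den else 0.0


def predict_item_based(ratings, user, item):
    """Weighted average of the user's own ratings on items similar to `item`."""
    num = den = 0.0
    for rated_item, rating in ratings[user].items():
        sim = cosine_items(ratings, item, rated_item)
        num += sim * rating
        den += abs(sim)
    return num / den if den else None
```

Here "Forrest Gump" is rated more like "Punchline" than like "Alien", so Eric's high rating for "Punchline" weighs more in the prediction than his low rating for "Alien".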

Choosing between the User-Based and Item-Based approaches has a big impact on the accuracy
and the efficiency of the recommendation task. In particular, User-Based methods usually
provide more original recommendations than Item-Based methods, and this may lead users
to a more satisfying experience. However, in large data sets where the number of users
exceeds the number of items (as is common in real applications), the User-Based approach
becomes computationally expensive. Item-Based approaches, on the other hand, are typically
preferred in practice since they provide accurate recommendations, are computationally
efficient, and require less frequent updates because the items within the system rarely change.

Model-Based Collaborative Filtering

Memory-Based methods use the whole ratings matrix each time to determine recommendations,
which becomes a problem when the system contains millions of ratings. In contrast, Model-Based
methods build a pre-trained model that can be used later to perform recommendation. Training
the model takes much more time up front, but the trained model produces fast recommendations
in production (Koren, 2008). Model-Based approaches usually use Machine Learning algorithms
to train the model. Several Machine Learning algorithms have thus been adapted to the
recommendation area, such as the Naive Bayes Classifier (Miyahara & Pazzani, 2000),
Association Rule-Based methods (Mobasher et al., 2001), Decision Trees (Bouza et al., 2008)
and Clustering (G.-R. Xue et al., 2005).
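
To make the "train once, predict fast" idea concrete, here is a small sketch of a Model-Based recommender. It uses matrix factorization trained by stochastic gradient descent, which is not one of the algorithms listed above and is chosen here only because it is compact; the rating triples, the hyperparameters and all function names are assumptions made for the example.

```python
import random

# Hypothetical (user index, item index, rating) triples on a 1-5 scale.
TRIPLES = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]


def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def train_mf(triples, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Learn k latent factors per user (P) and per item (Q) with SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in triples:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def rmse(triples, P, Q):
    """Root mean squared error of the model on a list of rating triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in triples)
    return (total / len(triples)) ** 0.5
```

The expensive part is `train_mf`, which is run offline; once `P` and `Q` are available, each call to `predict` is a single dot product, which is what makes Model-Based methods fast at recommendation time.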

1.1.3 Hybrid Recommendation

Thanks to its ability to combine multiple recommendation techniques to achieve better
recommendation results, the Hybrid recommendation approach is considered a very practical
approach in real applications (Burke, 2007). It combines characteristics of both Content-Based
and Collaborative Filtering methods in order to overcome the limitations of each. Several studies
have shown that the Hybrid recommendation approach can provide more accurate
recommendations and can be very effective in some hard recommendation cases compared to pure
methods (Koren, 2009). For instance, the BellKor solution to the Netflix Grand Prize combined
over 100 recommendation algorithms to win the competition (Koren, 2009). Hybrid
recommendation techniques can be divided into four main categories (Burke, 2007): the Weighted
strategy, the Switching strategy, the Mixed strategy and the Cascade strategy. The Weighted
strategy computes the scores or predictions of several recommendation techniques and then
combines them into a single score used as the final recommendation. The Switching strategy
selects, among several recommendation techniques, the one most appropriate to the situation at
hand and applies it. The Mixed strategy uses different recommendation techniques together at
the same time in order to overcome the drawbacks of each single technique. Finally, the Cascade
strategy uses a sequence of recommendation techniques in which the output of one technique
serves as the input of the next, so that each technique refines the recommendations given by
the previous one.
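
Three of these strategies can be sketched as small functions that combine the scores produced by two recommenders. The score dictionaries, the 0.5 default weight, the "minimum number of ratings" switching rule and the shortlist size are illustrative assumptions, not values taken from Burke (2007).

```python
# Hypothetical scores in [0, 1] produced by two separate recommenders.
CONTENT_SCORES = {"a": 0.9, "b": 0.2, "c": 0.5}
COLLAB_SCORES = {"a": 0.1, "b": 0.8, "c": 0.6}


def weighted_hybrid(scores_a, scores_b, weight_a=0.5):
    """Weighted strategy: blend the two scores of each item into one score."""
    items = set(scores_a) | set(scores_b)
    return {
        item: weight_a * scores_a.get(item, 0.0)
        + (1 - weight_a) * scores_b.get(item, 0.0)
        for item in items
    }


def switching_hybrid(scores_collab, scores_content, n_user_ratings, min_ratings=5):
    """Switching strategy: use collaborative scores only for users with enough ratings."""
    return scores_collab if n_user_ratings >= min_ratings else scores_content


def cascade_hybrid(scores_first, scores_second, shortlist_size=2):
    """Cascade strategy: the first technique shortlists, the second re-ranks."""
    shortlist = sorted(scores_first, key=lambda i: (-scores_first[i], i))
    shortlist = shortlist[:shortlist_size]
    return sorted(shortlist, key=lambda i: (-scores_second.get(i, 0.0), i))
```

For example, `cascade_hybrid(CONTENT_SCORES, COLLAB_SCORES)` first keeps the two items the content-based recommender scores highest and then orders them by the collaborative scores.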

1.1.4 Further Recommendation Technique

Recently, several new recommendation approaches have been proposed in the literature. These
approaches are not as well known as the traditional recommendation techniques, but they are
still used to overcome their limitations. The following presents some of them.


Demographic-Based Recommendation: in this type of approach, recommender systems rely
on the demographic profile of the user, such as age, language and country, to recommend
items. Many websites use this approach in order to personalize their content. For instance, a
website can switch the display language according to the user's region, or it may suggest items
based on the user's age or gender.

Knowledge-Based Recommendation: this approach relies on domain-specific knowledge about
both items and users. It suggests items based on knowledge of how certain item features meet
the user's needs and preferences, that is, how the item is useful to the user; this sometimes
includes explicit functional knowledge about how certain product features meet user needs.
For instance, when a user buys a laptop in an online store, the system knows that the user may
also be interested in a laptop bag.

Community-Based Recommendation: usually referred to as social recommendation, especially
with the rise of social networks. This kind of system is based on the proverb
"Tell me who your friends are, and I will tell you who you are", which reflects the fact that people
often rely more on recommendations from their close friends than on recommendations from
anonymous people. Thus, these systems rely on the preferences of the user's friends: the system
models and acquires information about the user's social relations and the preferences of the
user's friends, and then makes recommendations based on the ratings provided by those friends.

Context-Aware Recommendation: it is used to empower recommender systems in several
areas, including movies, restaurants, travel, music, news, shopping assistants, mobile
advertising, mobile apps, and many others. Traditional recommendation systems operate on the
two-dimensional user × item matrix, which can yield weakly personalized recommendations
since the system relies on only two dimensions. Context-Aware Recommendation, on the other
hand, operates on a multi-dimensional matrix that holds not only the traditional user × item
data but also the contextual information of the recommendation, such as time, location and
mood. This setup can be an advantage and can provide more personalized systems. For instance,
a travel agency website may want to use the season of the year to determine whether it should
sell beach resort packages or ski vacations to its users.
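
One simple way to exploit such multi-dimensional data, often called contextual pre-filtering, is to keep only the ratings observed in the target context and then apply an ordinary two-dimensional technique to what remains. The sketch below follows the travel agency example; the records, the single "season" context dimension and the use of a plain average rating as the two-dimensional technique are assumptions for illustration, and this is not the deep learning method proposed later in this thesis.

```python
from collections import defaultdict

# Hypothetical (user, item, season, rating) records for a travel website.
RECORDS = [
    ("u1", "ski_package", "winter", 5),
    ("u2", "ski_package", "winter", 4),
    ("u1", "ski_package", "summer", 1),
    ("u2", "beach_resort", "summer", 5),
    ("u3", "beach_resort", "summer", 4),
    ("u3", "beach_resort", "winter", 2),
]


def prefilter(records, season):
    """Keep only the ratings given in the target context, dropping the context."""
    return [(user, item, rating) for user, item, s, rating in records if s == season]


def top_items(records, season, top_n=1):
    """Rank items by their mean rating within the given season."""
    by_item = defaultdict(list)
    for _, item, rating in prefilter(records, season):
        by_item[item].append(rating)
    means = {item: sum(rs) / len(rs) for item, rs in by_item.items()}
    return sorted(means, key=lambda item: (-means[item], item))[:top_n]
```

The same item catalogue therefore yields different recommendations depending on the season passed to `top_items`.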

1.2 Fundamentals of Deep Learning

This section presents the fundamentals of Deep Learning: the theory and architecture of deep
neural networks. The theory and architecture of artificial neural networks are the foundation of
Deep Learning technology (Goodfellow et al., 2016). Figure 1.4 illustrates the relationship
between Artificial Intelligence, Machine Learning and Deep Learning. First, Artificial Intelligence
is the general field that encompasses Machine Learning and Deep Learning, but also
many other approaches which may not involve learning. For instance, a chess program can rely
solely on hard-coded rules written by programmers, which is not considered Machine
Learning. The second circle is Machine Learning, which encompasses Deep Learning. Machine
Learning involves models that learn from data how to do something
rather than being explicitly programmed to do it. The final circle represents Deep Learning,
which relies on artificial neural network models to achieve learning.

Figure 1.4: The relationships between Artificial intelligence, Machine Learning and Deep Learning
(Goodfellow et al., 2016)


1.2.1 Artificial Neural Networks

An artificial neural network is a Machine Learning technique inspired by the structure and
function of the brain (LeCun et al., 2015). Just as the brain is built from biological neurons,
the fundamental processing element of an artificial neural network is a neuron (Goodfellow
et al., 2016). Figure 1.5 presents a biological neuron and an artificial neuron. The right-hand
side of the figure shows a representation of an artificial neuron: the $x_i$ are the inputs to the
neuron, and each input is multiplied by a weight $w_i$. These products are then summed and fed
into a transfer function in order to generate the output. Usually, the transfer function relies on a
non-linear activation function and is modeled as shown in Equation 1.1.

Figure 1.5: Biological Neuron versus Artificial Neuron
(Chollet, 2017)

$$a = f\left(\sum_{i} w_i x_i + b\right) \tag{1.1}$$

where $a$ is the activation of the neuron, $f$ is the non-linear activation function, $w_i$ are
the connection weights, $x_i$ are the neuron inputs and $b$ is the neuron bias. Neural networks
are commonly modeled using three main activation functions.
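
The computation of a single artificial neuron, a weighted sum of the inputs plus a bias passed through an activation function, can be written directly in Python. The function name `neuron` and the example values are hypothetical.

```python
import math


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Two inputs, with the hyperbolic tangent as the non-linear activation.
output = neuron([1.0, 2.0], [0.5, 0.25], -0.5, math.tanh)
```

In this example the weighted sum is 0.5 * 1.0 + 0.25 * 2.0 - 0.5 = 0.5, so `output` equals `math.tanh(0.5)`.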

The Sigmoid function: is defined as follows (Funahashi, 1989)

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{1.2}$$

It takes a real value and squashes it into the range between 0 and 1. Therefore, it is especially
used for models that must predict a probability as output, since probability values range only
between 0 and 1.


Hyperbolic Tangent function (TanH): the following equation shows the TanH function
(Glorot & Bengio, 2010)

$$\mathrm{TanH}(x) = \frac{2}{1 + e^{-2x}} - 1 \tag{1.3}$$

It squashes a real-valued number into the range between -1 and 1. TanH has the same S shape
as the logistic Sigmoid, but its output is zero-centered, and it can provide better results.

Rectified Linear Unit function (ReLU): it is defined as follows (Dahl et al., 2013)

$$\mathrm{ReLU}(x) = \max(0, x) \tag{1.4}$$

The ReLU function is the most commonly used activation function in neural network models
(LeCun et al., 2015). It returns 0 for any negative input and returns the input unchanged for any
positive value x. It can greatly accelerate convergence compared to the other two functions, and
it is therefore widely used in Deep Learning (LeCun et al., 2015).
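
As an illustration, the three activation functions and a single artificial neuron can be sketched
in a few lines of Python. This is a minimal sketch using the standard definitions of the
functions; the names `sigmoid`, `tanh`, `relu` and `neuron` are chosen here for illustration and
do not come from the thesis implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: squashes a real value into (0, 1).

    Note: math.exp can overflow for very large negative x; this sketch
    targets moderate inputs only.
    """
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x: float) -> float:
    """Hyperbolic tangent written with exponentials; output lies in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def relu(x: float) -> float:
    """Rectified Linear Unit: 0 for negative inputs, identity otherwise."""
    return max(0.0, x)


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative values only: two inputs, two weights, no bias.
activation_value = neuron([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)
```

Passing the activation as a parameter makes it easy to compare how the same weighted sum is
transformed by each of the three functions.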

Figure 1.6 shows the shape of the three activation functions.

Figure 1.6: Non-Linear Functions used in Neural Networks


(Chollet, 2017)

The simplest neural network model is the Feed-Forward Neural Network, which consists of a set
of artificial neurons organized into three types of layers: the input layer, the hidden layers
and the output layer. Information flows in one forward direction, from the input layer, through
the hidden layers, to the output layer. In this type of network structure, the connections
between the units form a directed acyclic graph, which means that there are no cycles or loops in
the network (Goodfellow et al., 2016). Figure 1.7 illustrates an example of a Feed-Forward Neural
Network with two hidden layers.

Figure 1.7: A Feed-Forward Neural Network


(Chollet, 2017)
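
The forward flow of information through such a network can be sketched as follows. This is a
minimal illustration that assumes ReLU activations on the hidden layers, a linear output layer
and randomly initialized weights; the layer sizes and the helper names (`dense`, `init_layer`,
`forward`) are hypothetical.

```python
import random


def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the incoming weights of neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice for this sketch)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Propagate x layer by layer; ReLU on hidden layers, identity on the output."""
    h = x
    for index, (weights, biases) in enumerate(layers):
        h = dense(h, weights, biases)
        if index < len(layers) - 1:
            h = relu(h)
    return h


rng = random.Random(0)
sizes = [3, 4, 4, 2]  # input layer, two hidden layers, output layer
layers = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
output = forward([1.0, 0.5, -0.2], layers)
```

Because each layer only reads the output of the previous one, the computation never revisits a
unit, which is the acyclic property described above.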

1.2.2 Convolutional Neural Networks

Convolutional Neural Networks (CNN) are a particular type of artificial neural network that has
been applied successfully to data such as images and text (LeCun et al., 2015). Unlike a regular
Feed-Forward Neural Network, the layers of a CNN have neurons arranged in three dimensions:
width, height and depth. As shown in Figure 1.8, every layer of a CNN transforms a 3D input
volume into a 3D output volume of neuron activations. For instance, the red input layer holds the
input image: its width and height are the dimensions of the image, and its depth is 3 (the Red,
Green and Blue channels). This type of network has been very successful and is widely used in
practical applications such as computer vision and robotics (LeCun et al., 2015).

Figure 1.8: Convolutional Neural Networks


(Chollet, 2017)
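
The transformation of a 3D input volume into a 3D output volume can be sketched with a single
convolutional layer. This is a deliberately naive illustration, assuming stride 1, no padding and
a ReLU activation; as in most Deep Learning libraries, the operation is implemented as a
cross-correlation. The function name `conv_layer` and the toy filters are hypothetical.

```python
def conv_layer(volume, filters):
    """Valid convolution (stride 1, no padding) of an H x W x C volume.

    `filters` is a list of K kernels, each of shape k x k x C, so the result
    is an (H-k+1) x (W-k+1) x K volume of activations.
    """
    height, width, depth = len(volume), len(volume[0]), len(volume[0][0])
    k = len(filters[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            cell = []
            for kernel in filters:
                total = 0.0
                for di in range(k):
                    for dj in range(k):
                        for c in range(depth):
                            total += volume[i + di][j + dj][c] * kernel[di][dj][c]
                cell.append(max(0.0, total))  # ReLU activation
            row.append(cell)
        out.append(row)
    return out


# A 4 x 4 RGB "image" (depth 3) filled with ones, and two 3 x 3 x 3 filters.
image = [[[1.0, 1.0, 1.0] for _ in range(4)] for _ in range(4)]
mean_filter = [[[1.0 / 27] * 3 for _ in range(3)] for _ in range(3)]
negative_filter = [[[-1.0] * 3 for _ in range(3)] for _ in range(3)]
activations = conv_layer(image, [mean_filter, negative_filter])
```

The depth of the output volume equals the number of filters, while its width and height shrink
by the kernel size minus one when no padding is used.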

1.2.3 Recurrent Neural Networks

Recurrent Neural Networks (RNN) are another family of neural networks, designed to process
sequential data such as time series, speech and text. Compared to the other neural network
families, this kind of model produces an output at each time step, and the hidden units in the
network have recurrent connections (Graves et al., 2013). Moreover, whereas the inputs of the
other models are treated as independent of each other, in an RNN all the inputs are related to
each other, as shown in Figure 1.9. For instance, this helps to predict the next word of a
sentence from the previous words. Figure 1.9 illustrates a toy example of such a model. The left
side of the figure shows that the network contains a loop, which allows information to persist.
The right side shows the same network unrolled over time: the model first takes x(0) from the
input sequence and outputs h(0); then h(0) and x(1) are the inputs of the next step; similarly,
h(1) and x(2) are the inputs of the step after that, and so on. In this way, the model keeps
remembering the context while performing the training.

Figure 1.9: Recurrent Neural Networks


(Chollet, 2017)

Recurrent neural networks are very powerful dynamic models, but training them is considered
computationally expensive (Goodfellow et al., 2016).
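
The recurrence described above, in which each step receives the current input together with the
previous hidden state, can be sketched as follows. This is a minimal illustration that assumes a
plain (Elman-style) recurrent cell with a tanh activation; the weight values and the names
`rnn_step` and `run_rnn` are hypothetical.

```python
import math


def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """One recurrence step: h(t) = tanh(W_xh x(t) + W_hh h(t-1) + b_h)."""
    h_t = []
    for j in range(len(b_h)):
        z = b_h[j]
        z += sum(w * x for w, x in zip(w_xh[j], x_t))
        z += sum(w * h for w, h in zip(w_hh[j], h_prev))
        h_t.append(math.tanh(z))
    return h_t


def run_rnn(sequence, w_xh, w_hh, b_h):
    """Unroll the network over the sequence, reusing the same weights at every step."""
    h = [0.0] * len(b_h)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_xh, w_hh, b_h)
        states.append(h)
    return states


# Illustrative weights: 1 input feature, 2 hidden units.
w_xh = [[0.5], [-0.3]]
w_hh = [[0.1, 0.2], [0.0, 0.4]]
b_h = [0.0, 0.0]
states = run_rnn([[1.0], [0.0], [0.0]], w_xh, w_hh, b_h)
```

In this toy sequence only the first input is non-zero, yet the later hidden states remain
non-zero: the information from x(0) persists through the recurrent connections.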

1.2.4 Backpropagation

Backpropagation is a commonly used technique to train artificial neural networks. In forward
propagation, the input data flows from the input layer through the network to produce a final
output. In contrast, the backpropagation algorithm trains the model by propagating the loss
between the predicted and the true output values from the output layer back to the input layer,
computing along the way the gradient of the loss with respect to each weight. These gradients
are then used to update the weights of the network with an optimization method such as gradient
descent. In Deep Learning, backpropagating the output loss through a large number of layers can
hamper training, because the neurons in the earlier layers learn much more slowly than the
neurons in the later layers. This issue is known as the vanishing gradient problem, and
optimization techniques such as Momentum and Adagrad have been used to mitigate it
(Rezende et al., 2014).
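
The forward and backward passes can be illustrated on a very small network. The sketch below
assumes one hidden layer with sigmoid units, a linear output and a squared-error loss, trained on
a single hypothetical example; all names and numerical values are illustrative and are not taken
from the thesis implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, params):
    """Forward pass: inputs -> sigmoid hidden layer -> linear output."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    y_hat = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, y_hat


def loss(x, y, params):
    _, y_hat = forward(x, params)
    return 0.5 * (y_hat - y) ** 2


def backward(x, y, params):
    """Backward pass: apply the chain rule from the output back to the first layer."""
    w1, b1, w2, b2 = params
    hidden, y_hat = forward(x, params)
    d_out = y_hat - y  # derivative of the loss with respect to the output
    g_w2 = [d_out * h for h in hidden]
    g_b2 = d_out
    # Error reaching each hidden unit, scaled by the sigmoid derivative h * (1 - h).
    d_hidden = [d_out * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    g_w1 = [[d * xi for xi in x] for d in d_hidden]
    g_b1 = d_hidden
    return g_w1, g_b1, g_w2, g_b2


def sgd_step(params, grads, lr):
    """Move every parameter a small step against its gradient."""
    w1, b1, w2, b2 = params
    g_w1, g_b1, g_w2, g_b2 = grads
    return (
        [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(w1, g_w1)],
        [b - lr * g for b, g in zip(b1, g_b1)],
        [w - lr * g for w, g in zip(w2, g_w2)],
        b2 - lr * g_b2,
    )


params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0], [0.5, -0.5], 0.0)
x, y = [1.0, 2.0], 1.0
initial_loss = loss(x, y, params)
for _ in range(100):
    params = sgd_step(params, backward(x, y, params), lr=0.1)
final_loss = loss(x, y, params)
```

The factor h * (1 - h) in the backward pass is at most 0.25 for a sigmoid unit, so multiplying
many such factors across layers shrinks the gradient, which gives an intuition for why earlier
layers can learn slowly in deep networks.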

1.2.5 Optimization Techniques

Optimization techniques are methods that minimize or maximize an objective function. Deep
Learning relies on optimization algorithms, such as gradient descent or momentum, to minimize
the cost function (also known as the loss function or error function) by changing the learnable
parameters, which in our case are the weights and the biases, in order to improve accuracy. The
following paragraphs present some of the optimization techniques used in Deep Learning.

Gradient Descent: it is the most widely used optimization algorithm for training deep neural
networks (Ruder, 2016). It minimizes the target function by iteratively moving in the direction
of steepest descent, as defined by the negative of the gradient. In other words, the gradient
descent algorithm uses the derivatives of a function to follow it downhill to a minimum
(Ruder, 2016).

Momentum: it is also called Stochastic Gradient Descent with momentum. The update applied at
the previous iteration is remembered, and the next update is determined as a linear combination
of the gradient and the previous update. Momentum can guarantee a better convergence rate than
gradient descent in certain situations: for smooth convex functions with a deterministic
gradient, Nesterov's accelerated variant of momentum achieves a global convergence rate of
$O(1/T^2)$ after $T$ steps, in contrast to the $O(1/T)$ of gradient descent
(Sutskever et al., 2013).

Adagrad: also referred to as the Adaptive Gradient algorithm, it is a modified gradient descent
algorithm that uses a separate learning rate for each parameter. Adagrad uses larger learning
rates for sparser parameters and smaller learning rates for less sparse ones. This adaptive
strategy improves convergence over standard gradient descent when the data is sparse
(Duchi et al., 2011).
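
The update rules of the three techniques can be compared on a toy problem. The sketch below
minimizes a hypothetical two-parameter quadratic objective; the objective, the hyperparameter
values and the function names are assumptions made for illustration, and classical (not
Nesterov) momentum is shown.

```python
import math


def objective(w):
    """Illustrative objective f(w) = w0^2 + 10 * w1^2, minimized at the origin."""
    return w[0] ** 2 + 10.0 * w[1] ** 2


def grad(w):
    """Gradient of the illustrative objective."""
    return [2.0 * w[0], 20.0 * w[1]]


def gradient_descent(w, lr=0.05, steps=100):
    for _ in range(steps):
        g = grad(w)
        # Move against the gradient with a single global learning rate.
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def momentum(w, lr=0.05, beta=0.9, steps=100):
    velocity = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        # The new update mixes the previous update with the current gradient.
        velocity = [beta * vi - lr * gi for vi, gi in zip(velocity, g)]
        w = [wi + vi for wi, vi in zip(w, velocity)]
    return w


def adagrad(w, lr=0.5, eps=1e-8, steps=100):
    accum = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        # Each parameter has its own step size, which shrinks as its
        # squared gradients accumulate.
        accum = [ai + gi * gi for ai, gi in zip(accum, g)]
        w = [wi - lr * gi / (math.sqrt(ai) + eps)
             for wi, gi, ai in zip(w, g, accum)]
    return w


start = [1.0, 1.0]
results = {
    "gradient_descent": objective(gradient_descent(start)),
    "momentum": objective(momentum(start)),
    "adagrad": objective(adagrad(start)),
}
```

All three methods reduce the objective from its starting value of 11. Which one converges
fastest depends on the objective and on the chosen hyperparameters, so this toy run should not be
read as a ranking.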

Conclusion

In this chapter, we presented the fundamental concepts of recommender systems as well as some
recommendation approaches such as Content-Based, Collaborative Filtering and Hybrid approaches.
Then, we presented the fundamental concepts and theories of Deep Learning technology, such as
neural networks, Backpropagation and optimization techniques. In the next chapter, we present a
literature review of the application of Deep Learning in the recommendation field.

Chapter 2
Deep Learning based Recommender
System Review

Introduction

Deep Learning technology has achieved great success in many complex tasks, including Image
Recognition (Wan et al., 2014), Natural Language Processing (Sarikaya et al., 2014), Automatic
Speech Recognition (Hinton et al., 2012) and Biomedical Informatics (Holzinger & Jurisica, 2014).
In the last three years, the recommender systems community has become interested in this advanced
technology, and several new recommendation methods have relied on it to enhance recommendation
performance and provide better recommendations to users (Cheng et al., 2016). To show the state
of the art in applying Deep Learning to recommender systems, this chapter presents a literature
review of its application to the most common recommendation approaches: Content-Based,
Collaborative Filtering, Hybrid and Advanced approaches. Section 2.1 reviews the application of
Deep Learning to the Content-Based approach. Section 2.2 reviews its application to Collaborative
Filtering. Section 2.3 and Section 2.4 present, respectively, its application to Hybrid and
Advanced approaches. Finally, Section 2.5 discusses all the presented works.

2.1 Content-Based Recommendation

Van den Oord et al. (2013) proposed to train a deep Convolutional Neural Network (CNN) to
predict latent factors from music audio data, in order to extract the characteristics that affect
user preference directly from audio signals. An experimental study on the Million Song Dataset
and on an industrial-scale dataset with audio excerpts of over 380,000 songs shows better results
than the traditional approach, which uses a bag-of-words representation of the audio signals and
is very often used in the field. Their results also show the benefit of using deep CNNs in the
music information retrieval field.

X. Wang & Wang (2014) combined a model based on a Deep Belief Network with a Probabilistic
Graphical model to simultaneously learn features from music content and make personalized
recommendations. On the Echo Nest Taste Profile dataset, the proposed solution outperforms
existing recommendation models such as Probabilistic Matrix Factorization in both the warm-start
and cold-start settings.

Bansal et al. (2016) applied deep Recurrent Neural Networks (RNN) to encode the text sequence
into a latent vector, in order to enhance Collaborative Filtering accuracy and solve the cold
start problem. The model was evaluated on two real-world datasets in the domain of scientific
paper recommendation, and the experimental study shows that the proposed method outperforms
previous state-of-the-art methods, specifically a previous RNN model based on gated recurrent
units, in terms of recommendation accuracy.

L. Zheng et al. (2017) relied on review information to jointly build a deep model named Deep
Cooperative Neural Networks (DeepCoNN), in order to learn item properties and user behaviors.
The proposed approach uses two parallel neural networks: the first focuses on learning the
behaviors of the user, and the second learns item properties from the reviews written by users
for the item. The two parallel representations are coupled by a shared layer, which lets the
latent factors learned for users and items interact in a manner similar to factorization machine
techniques. An experimental study on three real-world datasets (Yelp reviews, Amazon reviews and
Beer reviews) shows that the proposed model outperforms state-of-the-art models such as Matrix
Factorization and Probabilistic Matrix Factorization.

X. Wang et al. (2017) relied on a Dynamic Attention Deep Model to develop a news recommendation
approach, in order to solve the problem of selecting news articles for users from a database of
articles, also called the selection pool. The proposed solution performs recommendation by first
letting the editors select a subset of articles from a dynamically changing pool of articles
filled from various news feed sources. Then, it applies convolutional neural networks and a wide
model to automatically learn the editors' underlying selection criteria. An experimental study
conducted on a commercial API platform linked to a Chinese finance article feed platform shows
that the proposed model performs well on both accuracy metrics, AUC and F1 score, compared to
other models such as a CNN-based model and the Wide & Deep model.

The next section presents the application of this technology to the Collaborative Filtering
approach, which is considered the most widely used recommendation technique in industry.

2.2 Collaborative Filtering Recommendation

H. Wang et al. (2015) proposed Collaborative Deep Learning (CDL), which jointly performs Deep
Learning for the content information and Collaborative Filtering for the ratings matrix, in order
to address the sparsity of the ratings and of the auxiliary information that affects
Collaborative Topic Regression based approaches. An empirical study on three real-world datasets,
two CiteULike datasets and one Netflix dataset, shows that the proposed approach can
significantly outperform state-of-the-art Collaborative Filtering models such as Collective
Matrix Factorization and Collaborative Topic Regression.

Li et al. (2015) relied on Deep Feature Learning to enhance the Collaborative Filtering approach,
more precisely a specific model-based approach called Matrix Factorization. The proposed model
learns effective latent factors from both user-item ratings and side information. A comparison
was performed on five real-world datasets, including MovieLens-100k, MovieLens-1M, Book-Crossing
and an Advertising dataset, and the results show that the enhanced method performs well compared
to state-of-the-art methods that use the Matrix Factorization model.

H. Liang & Baldwin (2015) enhanced Collaborative Filtering by developing a new method called
Probabilistic Rating Auto-Encoder, which performs unsupervised feature learning and generates
underlying user profiles from large-scale user rating data. An experimental study on the Yelp
dataset shows that the proposed model performs well compared to model-based approaches such as
Matrix Factorization.

Devooght & Bersini (2016) proposed a new Collaborative Filtering approach based on an RNN model.
In this setting, Collaborative Filtering recommendation is viewed as a sequence prediction
problem, and a Long Short-Term Memory (LSTM) model is trained to capture the evolution of the
user's taste over time. A comparison with standard Nearest Neighbors and Matrix Factorization
methods on the MovieLens and Netflix datasets shows that the proposed approach outperforms them
in terms of item coverage and short-term predictions.

Ko et al. (2016) proposed a new extension of the Collaborative Sequence model based on RNN, in
order to analyze the sequences of user activities observed in modern recommender systems. An
empirical study on two different tasks, music recommendation and mobility prediction, using two
real-world datasets from these fields, shows that the model outperforms non-collaborative
methods.

Y. Wu et al. (2016) introduced a new approach called Collaborative Denoising Auto-Encoder
(CDAE) in order to perform the top-N recommendation task. The model formulates user and item
feedback data using a Denoising Auto-Encoder structure. The authors showed that CDAE is a
generalization of several well-known Collaborative Filtering models, but with more flexible
components. An empirical study conducted on three datasets, MovieLens 10M, Netflix and Yelp,
shows that the proposed method outperforms several state-of-the-art top-N recommendation methods
such as Matrix Factorization and Item-Based Collaborative Filtering.

Y. Zheng et al. (2016) proposed a new Model-Based Collaborative Filtering approach named
CF-NADE, based on the Neural Autoregressive Distribution Estimator. CF-NADE is inspired by the
work on Restricted Boltzmann Machine based Collaborative Filtering (RBM-CF) (Salakhutdinov et
al., 2007). They also proposed a factored version of CF-NADE to deal efficiently with large-scale
datasets. The model was evaluated on three real-world datasets, MovieLens 1M, MovieLens 10M and
Netflix, and the results show that CF-NADE outperforms a wide range of state-of-the-art
Collaborative Filtering methods such as User-based RBM and Biased Matrix Factorization.

Kuchaiev & Ginsburg (2017) developed a new model for the rating prediction task in order to
improve prediction accuracy in recommender systems. The model is based on an Autoencoder with 6
layers, and they showed that such a model can perform much better than shallow neural networks.
Further, they introduced iterative output re-feeding, a technique that performs dense updates in
Collaborative Filtering, allows a higher learning rate and further improves the generalization
performance of the model. They demonstrated the versatility of their model by applying it to a
time-split Netflix dataset. The results show that the proposed solution outperforms other Deep
Learning based approaches even without using additional temporal signals.

H.-J. Xue et al. (2017) proposed a new extension of Matrix Factorization, which is a model-based
recommendation approach, with a neural network. The model uses both explicit ratings and implicit
feedback. They also proposed a novel loss function in which both explicit and implicit feedback
are considered during training. Experiments on several benchmark datasets, such as the MovieLens,
Amazon Music and Amazon Movies datasets, demonstrate the effectiveness of the proposed model and
show a 7.5 percent improvement in the Normalized Discounted Cumulative Gain (NDCG) metric.

Ebesu & Fang (2017) addressed the cold start problem and proposed a probabilistic modeling
approach called Neural Semantic Personalized Ranking (NSPR) to unify the strengths of deep neural
networks and pairwise learning. The model relies on Deep Learning to extract a semantic
representation of items and couples it with the latent factors learned from implicit feedback. An
empirical study on two datasets, Netflix and CiteULike, shows that the approach outperforms
Matrix Factorization and topic regression based Collaborative Filtering approaches.

Cao et al. (2017) introduced a new Collaborative Filtering extension based on the Stacked
Denoising Autoencoder, an unsupervised Deep Learning method used to extract useful
low-dimensional features from the original sparse user-item matrices. The experimental study
shows that the proposed method performs well compared to methods based on Matrix Factorization or
item-based Collaborative Filtering.

He et al. (2017) relied on neural network models to enhance Collaborative Filtering. They
proposed a general framework named Neural network-based Collaborative Filtering (NCF). Extensive
experiments on two publicly accessible datasets, MovieLens and Pinterest, show significant
improvements of the proposed framework over state-of-the-art Collaborative Filtering methods such
as Matrix Factorization approaches and Item-Based Collaborative Filtering.

The next section presents the application of Deep Learning to the Hybrid approach.

2.3 Hybrid Approach Recommendation

D. Liang et al. (2015) proposed a new hybrid method that combines different sources of
information using the Content-Based and Collaborative Filtering approaches in order to enhance
music recommendation. The proposed system has two main steps. First, the Content-Based model
trains a multi-layer neural network to produce semantic tags from vector-quantized acoustic
features, and the output of the last hidden layer is treated as a high-level representation of
the musical content. Then, the obtained representation is used as a prior for the song latent
representation in Collaborative Filtering. The proposed system was evaluated on the Million Song
Dataset, and the results are better than those of the Collaborative Filtering approaches. In
addition, the system achieves state-of-the-art performance in music recommendation given content
and implicit feedback data.

Strub et al. (2016) developed a new model that uses an Autoencoder to enhance the Collaborative
Filtering model. They introduced a training loss and training process suited to Autoencoders on
incomplete data, and used side information to alleviate the cold start problem. The proposed
framework integrates both ratings and side information into a single network, which yields a
scalable and robust approach that outperforms state-of-the-art Collaborative Filtering methods.

Xu et al. (2017) introduced a Hybrid Deep-Semantic Matrix Factorization (HDMF) model in order to
improve the performance of tag-aware personalized recommendation. The model combines the
techniques of Deep-Semantic modeling, Hybrid learning and Matrix Factorization. An experimental
study conducted on a social bookmarking dataset from the Delicious bookmarking system shows that
the proposed method performs well compared with clustering-based models, Matrix Factorization,
Encoder-based models and Deep Semantic Similarity based methods.

Kumar et al. (2017) used an RNN combined with Neural Attention to combine user-item Collaborative
Filtering with the content of the news articles read, in order to tackle the problem of the
changing and diverse reading interests of users as well as the cold start problem. An empirical
study on the real-world CLEF NewsREEL dataset shows the effectiveness of the model over baseline
methods such as Item-Based Collaborative Filtering and Matrix Factorization.

Sottocornola et al. (2017) provided a new architecture based on Deep Learning for Hybrid
recommender systems. The proposed architecture combines two information sources: natural language
text and user ratings. Natural language text is used to learn a user-specific content-based
classifier, while user ratings are used to develop user-adaptive Collaborative Filtering
recommendations. The results of both methods are combined using a feed-forward neural network. An
evaluation conducted on the MovieLens dataset shows that the proposed method outperforms
state-of-the-art methods such as Matrix Factorization.

Dong et al. (2017) introduced a hybrid Collaborative Filtering model based on Matrix
Factorization and an additional Stacked Denoising Autoencoder, which is used to model the side
information. The model outperforms several Collaborative Filtering techniques on the MovieLens
dataset and the Book-Crossing dataset.

Kim et al. (2017) proposed a novel document Context-Aware Hybrid method that integrates a CNN
into Probabilistic Matrix Factorization in order to capture contextual information in description
documents. An experimental study conducted on three real-world datasets, two MovieLens datasets
and the Amazon Instant Video dataset, shows that the technique performs well compared to non-Deep
Learning approaches such as Matrix Factorization and Collaborative Topic Regression, as well as
Deep Learning approaches such as Collaborative Deep Learning.

In the next section, we present a literature review of the application of Deep Learning to two
advanced recommendation approaches: Social-Based recommendation and Context-Aware
recommendation.

2.4 Advanced Approach Recommendation

Rawat & Kankanhalli (2016) introduced a Context-Aware model that predicts multiple tags for an
image, which can then be recommended to a user. The model uses a CNN to learn image features,
while the context representations are processed by two neural network layers. Then, the outputs
of these two neural networks are concatenated in order to predict the tags. The proposed solution
was evaluated on the YFCC100M dataset provided by the organizers of the Yahoo-Flickr Grand
Challenge, and the results showed a significant improvement in prediction accuracy after adding
context information to the process.

Kim et al. (2016) developed a new Context-Aware recommendation approach, ConvMF, that seamlessly
integrates a CNN into Probabilistic Matrix Factorization in order to capture contextual
information in description documents for rating prediction. The model considers contextual
information such as surrounding words and word order in description documents, which provides a
deeper understanding of these documents and can enhance rating prediction accuracy. An
experimental study on two real-world datasets, the MovieLens and Amazon review datasets, shows
that the proposed method significantly outperforms state-of-the-art recommendation models such as
Collaborative Deep Learning, Collaborative Topic Regression and Probabilistic Matrix
Factorization.

Smirnova & Vasile (2017) developed a Context-Aware Recommender system based on Conditional
Recurrent Neural Networks. The model incorporates contextual information into the recommendation
process in order to suggest the next items. The experimental results show that the proposed
solution achieves better results than sequential and non-sequential state-of-the-art baseline
methods.

Deng et al. (2017) relied on Deep Learning to develop a new Trust-Based approach in order to
perform recommendation in social networks. The proposed method uses this advanced technology to
initialize the user and item latent feature vectors for Trust-Aware Social recommendations and to
separate the community effect in the user's trusted friendships. Extensive experiments conducted
on two real-world social network datasets, Epinions and Flixster, show that the method performs
well compared with several variations of Matrix Factorization approaches.

Ding et al. (2017) proposed the BayDNN model, which combines Bayesian Personalized Ranking and
Deep Neural Networks in order to perform friendship recommendation in social networks. The model
first extracts latent deep structural feature representations from the input network data using a
one-dimensional Convolutional Neural Network, then applies Bayesian Personalized Ranking learning
to capture the user's preferences based on these extracted features. An experimental study on the
Epinions and Slashdot datasets shows that the proposed approach outperforms baseline approaches
such as Matrix Factorization algorithms, Katz similarity, Adamic/Adar similarity and a simple
pairwise input neural network.

The next section discusses the application of Deep Learning to the recommendation approaches
presented above.

2.5 Discussion

To summarize the literature review of the application of Deep Learning to recommender systems,
Figure 2.1 presents statistics on all the presented papers. As shown in the figure, most of the
publications have used Deep Learning technology to enhance Collaborative Filtering performance,
owing to the successful application of this approach in commercial settings. Furthermore, various
works in this category have shown significant improvements over state-of-the-art approaches such
as Matrix Factorization and Item-Based Collaborative Filtering. The Hybrid approach also accounts
for a large share of the applications of this technology, owing to the nature of Deep Learning,
which can work with complex and heterogeneous data simultaneously. On the other hand, the
application of Deep Learning is relatively limited in Content-Based recommendation, although it
has shown tremendous success on large amounts of content such as images, music, video and text.
Finally, Advanced approaches such as Context-Aware recommendation and Social-Based recommendation
have attracted considerable interest within the recommender systems community, owing to their
ability to overcome the drawbacks of conventional recommendation models. Therefore, applying Deep
Learning technology to such approaches is likely to attract even more interest within the
recommendation community.

Figure 2.1: Statistics of publication counts in four Recommender Systems categories.

Figure 2.2 shows a summary of all the presented papers by application domain. Based on the
application domains and datasets used in the experimental studies, the presented papers show that
Deep Learning based recommendation systems have been applied in several domains, including music,
video, news, E-commerce and social media. Moreover, the entertainment industry accounts for a
large share of the applications of this advanced technology, owing to the availability of rich
datasets.

Figure 2.2: Statistics of publication counts in four Recommender Systems categories by
application domain.

Advanced approaches such as Context-Aware recommendation have attracted considerable interest
within the recommender systems community. Indeed, such an approach is capable of parsing and
understanding as much information as possible, which can enhance recommendation performance.
Thus, in this thesis, we rely on Deep Learning technology to design a new Context-Aware
recommendation model that aims to outperform the classical recommendation approaches. The model,
called Deep Learning Based Context recommendation, is the subject of the next chapter.

Conclusion

This chapter presented a literature review of the application of Deep Learning technology in the
recommender systems field. As shown, this area of research is very young, and there is much work
left to do, especially on Hybrid and Advanced approaches. The next chapter presents the proposed
method, which applies Deep Learning to one of the Advanced recommendation approaches, namely
Context-Aware recommendation.

Chapter 3
Deep Learning Approach for
Context-Aware Recommendation

Introduction

This chapter is dedicated to the presentation of our proposed method, which combines Deep
Learning technology with the Context-Aware recommendation approach in order to improve
recommendation performance. The chapter is organized as follows: Section 3.1 gives a brief
introduction to Context-Aware recommender systems. Then, Section 3.2 presents the proposed
method, which uses Deep Learning technology for Context-Aware recommendation in order to enhance
recommendation performance. Finally, Section 3.3 presents the implementation details.

3.1 Context-Aware Recommendation

Standard recommendation techniques such as Collaborative Filtering, Content-Based and Hybrid
methods provide users with personalized recommendations without taking into account any
contextual information (Ning et al., 2015), such as time, location and mood. In fact, such
approaches usually deal with two types of information (users and items) and do not place them in
any contextual situation at the time of providing the recommendations (Adomavicius & Tuzhilin,
2015). For instance, Recommender System applications such as recommending a vacation package,
personalized content on a website or music can depend on contextual information such as the mood
of the user, the place, etc. Thus, a Context-Aware Recommender System aims to enhance
personalized recommendations by contextualizing the recommendation process (Adomavicius &
Tuzhilin, 2015).

In recent years, academic researchers have paid considerable attention to Context-Aware
Recommender Systems and have applied this concept in a variety of applications, including movie
recommenders (Campos et al., 2013), tourism recommenders (Y. Zheng et al., 2015b), news
recommenders (S. Wang et al., 2015), restaurant recommenders (Y. Zheng et al., 2014, 2015b),
music recommenders (Y. Zheng et al., 2014, 2015b) and mobile app recommenders (Shi et al., 2014).
Similarly, companies have started incorporating contextual information into their recommendation
systems. For instance, Netflix, a large movie streaming and rental company, knows the location of
its customers and uses the location as well as the time as contextual variables to provide
contextual movie recommendations (Adomavicius & Tuzhilin, 2015).

Three main approaches have been proposed to perform recommendation according to
contextual information (Campos et al., 2013; Haruna et al., 2017): Contextual Pre-filtering,
Contextual Post-filtering and Contextual Modeling. In the Contextual Pre-filtering approach,
contextual information is used to select the relevant set of data records (ratings). Then, ratings
can be predicted on the selected data using any conventional Recommender System such as
Collaborative Filtering (Codina et al., 2016). In Contextual Post-filtering, the recommendation
algorithm initially ignores the contextual information. The ratings are predicted using any
traditional Recommender System on the whole data set. Then, the resulting set of recommendations
is adjusted for each user using the contextual information (Ramirez-Garcia & García-Valdez,
2014). Finally, Contextual Modeling is considered a more sophisticated approach in which
contextual data is integrated directly into the prediction model. This gives rise to truly
multidimensional predictive models, which can be built using Machine Learning algorithms
such as regression, decision trees or probabilistic models (H. Wu et al., 2015). In this thesis,
the Contextual Modeling approach is the framework of our proposed approach.
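The difference between pre-filtering and post-filtering can be illustrated with a minimal Python sketch. Everything in it is a hypothetical illustration: the toy ratings, the `mean_predictor` function standing in for a conventional recommender, and the post-filtering rule are assumptions chosen for brevity, not the methods evaluated in this thesis.

```python
# Toy ratings as (user, item, context, rating). Values are illustrative only.
RATINGS = [
    ("alia", "titanic", "sad", 5.0),
    ("alia", "avengers", "happy", 4.0),
    ("adel", "titanic", "happy", 2.0),
    ("adel", "avengers", "happy", 5.0),
    ("ahmed", "titanic", "sad", 4.0),
]


def mean_predictor(rows):
    """Stand-in for any conventional two-dimensional recommender: per-item mean rating."""
    totals = {}
    for _user, item, _context, rating in rows:
        running_sum, count = totals.get(item, (0.0, 0))
        totals[item] = (running_sum + rating, count + 1)
    return {item: s / n for item, (s, n) in totals.items()}


def pre_filtering(rows, context):
    """Select the ratings given in the target context, then predict on that subset."""
    selected = [row for row in rows if row[2] == context]
    return mean_predictor(selected)


def post_filtering(rows, context):
    """Predict on all ratings, then adjust the result using the context.

    The adjustment used here (an assumption) drops items never rated in the context.
    """
    scores = mean_predictor(rows)
    seen_in_context = {row[1] for row in rows if row[2] == context}
    return {item: score for item, score in scores.items() if item in seen_in_context}


pre = pre_filtering(RATINGS, "sad")
post = post_filtering(RATINGS, "sad")
print(pre, post)
```

Contextual Modeling differs from both: the context is passed to the predictor as an additional input feature instead of being used to filter the data or the output.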

In the second chapter, we provided a literature review of the application of Deep Learning,
which is considered a Model-Based approach, in the field of Recommender Systems (Singhal
et al., 2017). We showed that this technology can achieve higher performance
than standard state-of-the-art recommendation approaches, including Memory-Based approaches
such as similarity-based methods and Model-Based approaches such as Matrix Factorization.
Thus, we rely on Deep Learning technology to improve recommendation accuracy using
contextual information, and we propose a new Context-Aware Recommender based on this
state-of-the-art technology. The next section presents the proposed method.

3.2 Deep Learning based method for Context-Aware Recommendation

In order to improve the recommendation process using contextual information, we design a
Deep Neural Network model that performs Context-Aware Recommendation. The proposed
method, referred to as Deep Context-Aware Recommender System (DCARS), can perform
recommendation on multidimensional recommendation data that contains user, item and
contextual information. Moreover, by taking only the two common recommendation features,
the user and the item information, the proposed method can be used as a Collaborative
Filtering approach.

Standard Recommender Systems which aim to predict ratings can be defined as a regression
problem over users $U = \{u_1, u_2, ...\}$ and items $I = \{i_1, i_2, ...\}$, where the target
function to be estimated is as follows (Y. Zheng et al., 2015a):

$$Y : U \times I \rightarrow \mathbb{R} \tag{3.1}$$

where Y is the rating value (e.g. 1 to 5 stars). On the other hand, in Context-Aware Recommender
Systems, the rating prediction is assumed to be influenced by additional contextual
information $C_j = \{c_{j,1}, c_{j,2}, ...\}$. Examples of contextual information are the time at
which a rating was given by the user, the mood of the user, etc. Hence, with N such contextual
dimensions, the rating prediction problem can be formalized as follows (Y. Zheng et al., 2015a):

$$Y : U \times I \times C_1 \times C_2 \times \dots \times C_N \rightarrow \mathbb{R} \tag{3.2}$$

In this setting, the task of rating prediction using contextual information is to estimate a function
$\hat{Y}_{u,i,c_j}$ that can predict the rating $Y_{u,i,c_j}$ for any combination of user u, item i and context $c_j$.

30
Chapter 3 : Deep Learning Approach for Context-Aware Recommendation

Algorithm 3.1 describes the pseudo-code of the proposed Deep Context-Aware Recommender
model, which is trained on Context-Aware Recommendation data. The inputs of the algorithm
are the data set $\{X_1, ..., X_n\}$, the labels of the data $\{Y_1, ..., Y_n\}$ and the tuning
parameters: $L_{dims}$, a list containing the input size and the size of each layer; $L_{rate}$,
the learning rate of the optimization algorithm such as Gradient Descent; and $I_{max}$, the
number of iterations of the optimization algorithm. The output is the final learned
parameters $P_{I_{max}}$, which can be used to predict ratings. The algorithm first initializes
the model's parameters by setting up the number of layers and the dimension of each layer
using $L_{dims}$. Then, it performs $I_{max}$ training iterations. Each iteration can be divided
into four main steps. The first step performs the forward propagation using the parameters $P_i$
to predict $\hat{Y}_{P_i}$. The second step computes the loss cost $L^{i}_{cost}(\hat{Y}_{P_i}, Y)$
between $\hat{Y}_{P_i}$ and the true labels Y. The third step computes the gradient
$grad_{\hat{Y}_{P_i}}$ of the loss by backward propagation through the network. Finally, the
algorithm updates the parameters, obtaining $P_{i+1}$ from $P_i$, $grad_{\hat{Y}_{P_i}}$ and
$L_{rate}$ according to an optimization rule such as Gradient Descent or Adam.

Algorithm 3.1 Deep Context-Aware Recommender System

1: Input: X: input recommendation data; Y: labels of the data; L_dims: list containing the sizes of
   the layers; L_rate: learning rate of the optimization algorithm; I_max: maximum number of
   iterations of the optimization algorithm
2: Output: P_{I_max}: parameters of the model
   /* Parameters initialization */
3: Initialize parameters P_0 using L_dims; initialize Ŷ_0, L^0_cost and grad_0
   /* Learning part */
4: for i = 0, ..., I_max - 1 do
5:     Calculate Ŷ_{P_i} using forward propagation through the L_dims layers
6:     Calculate the loss cost L^i_cost(Ŷ_{P_i}, Y)
7:     Calculate grad_{Ŷ_{P_i}} using backward propagation and Y
8:     Calculate parameters P_{i+1} using P_i, grad_{Ŷ_{P_i}} and L_rate
9: return P_{I_max}
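The four steps of the training loop can be sketched in plain Python. This is a minimal illustration under a strong simplifying assumption: the network is reduced to a single linear layer (no hidden layer, no embeddings) trained with plain gradient descent on the mean squared error. The function name `train` and the toy data are hypothetical and are not part of the thesis implementation.

```python
def train(X, Y, l_rate, i_max):
    """Gradient-descent loop following the four steps of Algorithm 3.1.

    Simplifying assumption: a single linear layer, i.e. L_dims = [len(X[0]), 1].
    """
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0  # parameter initialization (P_0)
    history = []
    for _ in range(i_max):
        # Step 1: forward propagation with the current parameters P_i.
        y_hat = [sum(w * x for w, x in zip(weights, row)) + bias for row in X]
        # Step 2: loss cost (mean squared error).
        errors = [p - t for p, t in zip(y_hat, Y)]
        history.append(sum(e * e for e in errors) / n)
        # Step 3: backward propagation, i.e. gradient of the loss w.r.t. P_i.
        grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X)) for j in range(d)]
        grad_b = 2.0 / n * sum(errors)
        # Step 4: parameter update P_{i+1} = P_i - L_rate * gradient.
        weights = [w - l_rate * g for w, g in zip(weights, grad_w)]
        bias -= l_rate * grad_b
    return weights, bias, history


# Toy data generated by y = 2*x1 + 1*x2 + 0.5 (illustrative only).
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [0.5, 2.5, 1.5, 3.5]
weights, bias, history = train(X, Y, l_rate=0.1, i_max=2000)
print(weights, bias, history[0], history[-1])
```

In a deep model the same loop applies, with steps 1 and 3 running through every layer; optimizers such as Adam only change the update rule of step 4.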

To further describe the proposed model, Figure 3.1 shows the multi-layer architecture
of the proposed Deep Context-Aware Recommender System. The output of one layer serves as
the input of the next one. The bottom input layer consists of three kinds of feature vectors: the
user feature vector, the item feature vector and the N context feature vectors. Next to the input
layer is the embedding layer, a fully connected layer used to transform the sparse data
representation into dense representations suitable for neural network layers. Then, the
embedded feature vectors are fed into a concatenation layer in order to merge all the feature
vectors into a single layer. The next stage is the multi-layer neural network, which aims to
map the concatenated input to the prediction score. Finally, the output layer predicts the
score $\hat{Y}_{u,i,c_j}$, where u, i and $c_j$ denote respectively the user, the item and the context,
and training is performed by minimizing the cost between $\hat{Y}_{u,i,c_j}$ and the target value $Y_{u,i,c_j}$.

Figure 3.1: The architecture of the proposed Deep Context-Aware Recommendation method

In the next part, we present each step described in Figure 3.1 and we give implementation
details of the proposed method using a Deep Learning framework.

3.2.1 Input Layer

Context-Aware Recommendation data sets are usually represented as categorical
feature triplets of user, item and context (Adomavicius & Tuzhilin, 2015). Figure 3.2
presents an example of Context-Aware data where:
U = {Alia, Ahmed, Adel}
I = {Titanic, Avengers, StarWars, StarTrek}
C1 = mood = {Sad, Normal, Happy}
Figure 3.2 shows how Context-Aware Recommendation data can be transformed into
feature values used by the prediction algorithm. The transformation consists of encoding the
categorical values (user, item and context) with indicator variables. Following the proposed
example, the first tuple (Alia, Titanic, Sad, 5) means that Alia rated Titanic with 5 stars when
she was sad. After the transformation, this tuple yields (1, 1, 1, 5), which can be used to train
the prediction algorithm.
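The transformation can be sketched in a few lines of Python. The helper names `build_vocabulary`, `encode` and `one_hot` are hypothetical, and the 1-based indexing is an assumption chosen so that the first tuple of the example maps to (1, 1, 1, 5).

```python
def build_vocabulary(values):
    """Map each distinct categorical value to a 1-based index, in order of first appearance."""
    vocabulary = {}
    for value in values:
        if value not in vocabulary:
            vocabulary[value] = len(vocabulary) + 1
    return vocabulary


users = build_vocabulary(["Alia", "Ahmed", "Adel"])
items = build_vocabulary(["Titanic", "Avengers", "StarWars", "StarTrek"])
moods = build_vocabulary(["Sad", "Normal", "Happy"])


def encode(record):
    """Replace the categorical fields of (user, item, mood, rating) by their indices."""
    user, item, mood, rating = record
    return (users[user], items[item], moods[mood], rating)


def one_hot(index, size):
    """Indicator-variable view of a 1-based index."""
    return [1 if position == index else 0 for position in range(1, size + 1)]


encoded = encode(("Alia", "Titanic", "Sad", 5))
print(encoded)                                 # (1, 1, 1, 5)
print(one_hot(users["Ahmed"], len(users)))     # [0, 1, 0]
```

The integer indices are the compact form usually fed to an embedding layer, while the one-hot vectors are the equivalent indicator-variable representation.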

Figure 3.2: Transformation of Context-Aware Recommendation data set.

3.2.2 Embedding Layer and Concatenation Layer

In addition to the user and the item data, the context features can be categorical. A categorical
feature can be binary, such as whether the user is male or female, or take several possible values,
such as job occupation (doctor, teacher, etc.), location (home, cinema, work, party, etc.), time
(morning, evening, weekend, weekday, etc.) or mood (sad, happy). We can also have numerical
features such as the age of the user or the temperature (Adomavicius & Tuzhilin, 2015).
The proposed model learns a dense embedding for each categorical feature in a fixed
vocabulary. Embedding techniques, inspired by continuous bag-of-words language
models (Mikolov et al., 2013), map categorical features to dense representations suitable
for neural networks, and they have been used in several Deep Learning recommendation
models (He et al., 2017; L. Zheng et al., 2017; Cheng et al., 2016; Kim et al., 2016). The embedding
layer in the proposed model produces low-dimensional, dense tables (Mikolov et al.,
2013) and also normalizes the other inputs such as the age and the gender. After
this process, the model flattens these tables into vectors and concatenates them
with the other input information into one layer that is fed into the hidden layers of the
neural network.
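The embedding lookup and the concatenation step can be sketched as follows. This is only an illustration: the embedding sizes, the age range used for normalization and the random initialization are assumptions, and in the real model the table entries are learned during training rather than left at their initial values.

```python
import random


def make_embedding_table(vocab_size, dim, rng):
    """One dense row of `dim` weights per category index (row 0 is unused padding)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(vocab_size + 1)]


def normalize(value, minimum, maximum):
    """Min-max scaling of a numerical input to the range [0, 1]."""
    return (value - minimum) / (maximum - minimum)


rng = random.Random(0)
user_table = make_embedding_table(vocab_size=3, dim=4, rng=rng)   # 3 users
item_table = make_embedding_table(vocab_size=4, dim=4, rng=rng)   # 4 items
mood_table = make_embedding_table(vocab_size=3, dim=2, rng=rng)   # 3 moods


def embed_and_concatenate(user_idx, item_idx, mood_idx, age):
    """Look up each embedding row and merge everything into one flat feature vector."""
    numeric = [normalize(age, minimum=0.0, maximum=100.0)]
    return user_table[user_idx] + item_table[item_idx] + mood_table[mood_idx] + numeric


features = embed_and_concatenate(1, 1, 1, age=25)
print(len(features))  # 4 + 4 + 2 + 1 = 11
```

The resulting vector plays the role of the concatenation layer output that is passed to the first hidden layer.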

3.2.3 Neural Network Layer

After data embedding and concatenation, the model applies the hidden neural network layers
as part of the forward propagation of Algorithm 3.1. Each layer performs the following computation:

$$a^{(l+1)} = f(W^{(l)} a^{(l)} + b^{(l)}) \tag{3.3}$$

where l is the layer number, f is the activation function, $a^{(l+1)}$ is the activation, and $W^{(l)}$
and $b^{(l)}$ are respectively the weight and the bias parameters of layer l. Several
activation functions are used in Deep Learning, such as the sigmoid, the Rectified Linear Unit
(ReLU) and the hyperbolic tangent (Tanh) (Schmidhuber, 2015). ReLU has often shown
good results with deep models compared to the other functions (Le et al., 2015); thus,
the proposed method applies ReLU in each hidden layer. The number of hidden
layers determines the model's capacity: with a low number of hidden
layers the model may give weak predictions, whereas with a higher number of hidden
layers the model tends to give more accurate results. The choice of the number of layers often
depends on the processing capabilities of the machine, because increasing the number of layers
increases the computation time (He et al., 2017). Hence, we keep the number of layers as a
parameter that can be set by the user.
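The per-layer computation of Equation 3.3 with the ReLU activation can be written directly in Python. The weights below are arbitrary illustrative values, and the helper names `relu`, `dense` and `forward` are hypothetical.

```python
def relu(values):
    """Rectified Linear Unit applied element-wise."""
    return [max(0.0, v) for v in values]


def dense(activations, weights, biases, activation=relu):
    """Compute f(W a + b) for one layer; `weights` holds one row per output unit."""
    pre_activation = [
        sum(w * a for w, a in zip(row, activations)) + b
        for row, b in zip(weights, biases)
    ]
    return activation(pre_activation)


def forward(inputs, layers):
    """Chain the layers: the output of one layer is the input of the next."""
    a = inputs
    for weights, biases in layers:
        a = dense(a, weights, biases)
    return a


layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # layer 1: 2 inputs -> 2 units
    ([[2.0, 1.0]], [0.5]),                      # layer 2: 2 inputs -> 1 unit
]
output = forward([3.0, 1.0], layers)
print(output)  # [5.5]
```

Adding a hidden layer amounts to appending one more `(weights, biases)` pair to `layers`, which is why the number of layers can be left as a user-supplied parameter.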

3.2.4 Output Layer

The output layer is used to predict the ratings $\hat{Y}_{u,i,c_j}$. The algorithm computes the loss
cost between the predicted ratings $\hat{Y}_{u,i,c_j}$ and the true ratings $Y_{u,i,c_j}$, and training
minimizes this cost. In the proposed model we use either the Mean Absolute Error (MAE) or the
Mean Squared Error (MSE) loss function, both of which are suitable for recommendation systems.
MAE measures the average of the absolute differences between the predicted and the actual
ratings as follows:

$$L(\hat{Y}, Y) = \frac{1}{n} \sum_{i=1}^{n} |\hat{Y}_i - Y_i| \tag{3.4}$$

while the Mean Squared Error is the average of the squared differences between
the predicted and the actual ratings:

$$L(\hat{Y}, Y) = \frac{1}{n} \sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2 \tag{3.5}$$
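Both loss functions, together with the gradients with respect to the predictions that backward propagation starts from, can be sketched as follows. The function names and sample values are illustrative assumptions; for MAE the code returns a subgradient, taken as 0 where the prediction equals the target.

```python
def mae_loss(y_hat, y):
    """Mean Absolute Error (Equation 3.4)."""
    return sum(abs(p - t) for p, t in zip(y_hat, y)) / len(y)


def mse_loss(y_hat, y):
    """Mean Squared Error (Equation 3.5)."""
    return sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y)


def mae_grad(y_hat, y):
    """Subgradient of MAE w.r.t. each prediction: sign(p - t) / n."""
    n = len(y)
    return [((p > t) - (p < t)) / n for p, t in zip(y_hat, y)]


def mse_grad(y_hat, y):
    """Gradient of MSE w.r.t. each prediction: 2 (p - t) / n."""
    n = len(y)
    return [2.0 * (p - t) / n for p, t in zip(y_hat, y)]


y_true = [5.0, 3.0, 4.0, 1.0]
y_pred = [4.0, 3.0, 2.0, 2.0]
print(mae_loss(y_pred, y_true), mse_loss(y_pred, y_true))  # 1.0 1.5
print(mae_grad(y_pred, y_true))  # [-0.25, 0.0, -0.25, 0.25]
print(mse_grad(y_pred, y_true))  # [-0.5, 0.0, -1.0, 0.5]
```

The MSE gradient grows with the size of the error, whereas the MAE subgradient has constant magnitude, which is one practical difference between the two choices.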

In the next section we present some Deep Learning tools that can be used to implement the
proposed method.

3.3 Implementation Details

This section presents the tools available for the practical implementation of Deep Learning models
and the tool that will be used to implement the proposed model.
Today, dozens of Deep Learning tools are available. In what follows, we present some of the
most widely used frameworks: Theano¹, PyTorch², Caffe³, TensorFlow⁴ and Keras⁵.

Theano: was the first widely used Deep Learning framework, created by Yoshua Bengio at the
University of Montreal in 2007. Theano is a Python-based library and a low-level Deep Learning
framework which is extremely fast and powerful. In 2017, its research team announced
that Theano would no longer be supported.

PyTorch: is also a Python-based library, released by Facebook in early 2017. PyTorch
is considered a simple framework that offers high speed and flexibility and builds
dynamic computational graphs, which can help in analyzing unstructured data. Furthermore, it
can use the Graphics Processing Unit (GPU) of the machine, which can yield very
efficient models. One drawback of this tool is that it is still in beta and does not yet have
enough community support.

¹ http://deeplearning.net/software/theano/
² https://pytorch.org/
³ http://caffe.berkeleyvision.org/
⁴ https://www.tensorflow.org/
⁵ https://keras.io/

Caffe: was developed by Berkeley Artificial Intelligence Research. It is a suitable tool for
Convolutional Neural Network models, and its three main characteristics are expression,
speed and modularity. Its successor, Caffe2, was introduced by Facebook in
2017 and provides users with pre-trained models that can be used to build demo applications
without any extra hassle.

TensorFlow: is an open-source, Python-based Deep Learning framework which performs
numerical computation using data flow graphs. It was developed by the Google Brain team to
support Machine Learning and Deep Learning research. Today, TensorFlow is considered the most
commonly used Deep Learning framework and it is supported by a large community. In the last
few years, it has been adopted by many large companies such as Twitter, Uber and eBay.

Keras: is a high-level Deep Learning framework that simplifies building deep models.
It is a Python-based library which runs on top of low-level frameworks such as
Theano and TensorFlow. In 2018, Google supports Keras, and it will be included in the
coming TensorFlow releases.

Out of all these available Deep Learning tools, we choose to implement the proposed method
using the Python programming language, with TensorFlow as the low-level framework because
of its maturity and its large community support. We also use Keras as the high-level
framework, which makes the model implementation easier.

Conclusion

In this chapter, we proposed a Context-Aware Recommender System based on
Deep Learning technology. We detailed the proposed model and selected the framework that
will be used to implement it. In the next chapter, we present an experimental study
to evaluate the ability of our method to perform personalized recommendation.

Chapter 4
Experimental Study

Introduction

In order to evaluate the ability of the proposed Deep Context-Aware Recommendation method to
perform prediction on recommendation data set, we present an experimental study on real data
sets. The effectiveness of the proposed method is evaluated in terms of prediction accuracy.

This chapter is organized as follows: Section 4.1 describes the datasets used to examine the
performance of our approach. Section 4.2 details the evaluation measures that will be used to
test the performance of the proposed DCARS method. Finally, Section 4.3 presents the obtained
results.

4.1 Datasets Description

The experiments were conducted on three real recommendation datasets: the Tripadvisor dataset
(Y. Zheng et al., 2012), the Movielens100k¹ dataset and the Movielens1M² dataset. Table 4.1
describes the statistics of these datasets.

¹ https://grouplens.org/datasets/movielens/100k/
² https://grouplens.org/datasets/movielens/1m/

38
Chapter 4 : Experimental Study

Tripadvisor dataset: this data was scraped from online reviews on the tripadvisor.com web site.
It contains the trip type (Family, Couples, Business, Solo travel, Friends) as well as geographical
location information about the users and the hotels. The user information includes the user
id, the user's state of residence and the user's time zone; the hotel information includes the hotel
id, hotel city, hotel state and time zone. Each record also carries the trip type and the user's rating.
Overall, this data set contains 4669 ratings, 1202 users and 1890 hotels.

Movielens100k dataset: it is considered one of the most widely used data sets in the
recommendation area and it is publicly available on the MovieLens³ website. This dataset was
collected through the MovieLens website⁴, where users regularly visit and rate the movies that
they have already watched. Ratings range from 1 to 5 stars: 1 star means that the user does not
like the movie, whereas 5 stars means that the user likes the movie very much. Overall,
the dataset contains 100,000 ratings, collected from 943 users on 1682 movies, and each
user has rated at least 20 movies. In addition to this information,
the MovieLens group provides metadata about the users, such as the age, the
occupation, the gender, etc. In this experimental study, we take all this information as
contextual data.

Movielens1M dataset: to further evaluate the scalability of the proposed method, we
used the 1 million dataset, which is also provided by the MovieLens group. Overall, the dataset
contains 1,000,209 ratings, collected from 6,040 MovieLens users on 3,900 movies, and each
user has rated at least 20 movies. The dataset also contains user
metadata, which we take as contextual information in our experimental study.

Table 4.1: Statistics of the Recommendation Datasets

Dataset          Number of users   Number of items   Number of contexts   Ratings
TripAdvisor      1202              1890              6                    4669
MovieLens 100k   943               1682              4                    100000
MovieLens 1M     6040              3900              4                    1000209

³ https://grouplens.org/datasets/movielens
⁴ movielens.umn.edu

4.2 Evaluation Measures

In the recommendation area, predictive accuracy metrics measure how close the predictions
are to the true numerical ratings expressed by the users (Gunawardana & Shani, 2009).
Several metrics have been proposed in the literature. In this section, we introduce two commonly
used ones: the Mean Absolute Error (Pennock et al., 2000) and the Root Mean Squared
Error (Bennett et al., 2007).

Mean Absolute Error (MAE): consists of taking the mean of the absolute differences between
each prediction and the corresponding actual rating over all the users in the system. MAE is
defined by Equation 4.1, where n is the total number of predictions, $\hat{Y}_i$ is the predicted
value and $Y_i$ is the actual rating value. With ratings from 1 to 5 stars, MAE ranges from 0 to 4.
The lower the MAE, the more accurately the recommendation approach predicts the users' ratings.

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |\hat{Y}_i - Y_i| \tag{4.1}$$

Root Mean Squared Error (RMSE): is another widely used metric to evaluate recommendation
approaches in rating prediction. RMSE is defined as the square root of the average of
the squared differences between the predicted and the actual ratings, as described by Equation 4.2,
where n is the total number of predictions, $\hat{Y}_i$ is the predicted value and $Y_i$ is the
actual rating value.

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2} \tag{4.2}$$
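A short Python sketch makes the practical difference between the two metrics visible. The two sets of predictions below are hypothetical: they are chosen to have the same MAE while their RMSE differs, since squaring gives more weight to a single large error.

```python
import math


def mae(y_hat, y):
    """Mean Absolute Error (Equation 4.1)."""
    return sum(abs(p - t) for p, t in zip(y_hat, y)) / len(y)


def rmse(y_hat, y):
    """Root Mean Squared Error (Equation 4.2)."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y))


actual = [1, 1, 1, 1]
uniform = [2, 2, 2, 2]   # every prediction is off by one star
spiky = [1, 1, 1, 5]     # one prediction is off by four stars

print(mae(uniform, actual), rmse(uniform, actual))  # 1.0 1.0
print(mae(spiky, actual), rmse(spiky, actual))      # 1.0 2.0
```

Reporting both metrics therefore shows whether a method makes many small errors or a few large ones.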

4.3 Empirical Results

In this section, we start by presenting the environment in which we performed the experiments.
After that, we present and discuss the experimental results.

4.3.1 Experimental environment

Experiments are performed on Google Colab⁵, Google's free cloud service for
Artificial Intelligence developers, which is suitable for Deep Learning model implementation.
The cloud provides a 2-core Xeon processor at 2.2 GHz, 13 GB of RAM and 33 GB of HDD storage.
Furthermore, we used a machine with 4 cores (Intel Core i5-6500 processor, up to 3.6 GHz) and
8 GB of RAM to run the state-of-the-art recommendation methods against which the proposed
method is compared.
Experiments were implemented using the following libraries and frameworks: numpy⁶, a Python
library used for numerical computation; pandas⁷, a Python library used to manipulate
data in different formats such as text files and CSV files; and matplotlib⁸, which is used to plot
results. As Deep Learning frameworks, we used Keras 2.1.6⁹ and TensorFlow 1.9.0¹⁰. In addition,
we used CARSKit¹¹, a Context-Aware Recommendation library written in the Java programming
language. CARSKit contains state-of-the-art Context-Aware Recommendation methods which are
useful for comparison with our proposed method.

4.3.2 Obtained Results

To investigate the ability of our method to deal with Context-Aware recommendation data,
we compare the performance of the proposed method with three state-of-the-art recommendation
methods. The first method, Matrix Factorization (Koren et al., 2009), performs
recommendation without taking any contextual information into account, while CAMF_CI
(Baltrunas et al., 2011) and Tensor Factorization (Karatzoglou et al., 2010) take the contextual
information into consideration. All these methods are implemented in CARSKit.

Figure 4.1 shows the evaluation of the proposed method against the three state-of-the-art
methods on the Tripadvisor dataset. As shown in Figure 4.1(a), the proposed method performs well
compared to the other recommendation methods in terms of MAE. For instance, the proposed
DCARS provides predictions with an MAE below 0.8, whereas the Tensor Factorization method
provides predictions with an MAE of 1. Figure 4.1(b) shows the performance of the
proposed DCARS in terms of RMSE. Here too, the proposed method outperforms the other methods.
For example, CAMF_CI predicts ratings with an RMSE above 1.4, while DCARS
predicts them with an RMSE below 0.8.

⁵ https://colab.research.google.com/
⁶ http://www.numpy.org/
⁷ https://pandas.pydata.org/
⁸ https://matplotlib.org/
⁹ https://keras.io/
¹⁰ https://www.tensorflow.org/
¹¹ https://github.com/irecsys/CARSKit

(a) Tripadvisor datasets using MAE metric

(b) Tripadvisor datasets using RMSE metric

Figure 4.1: Evaluation of DCARS over different methods on Tripadvisor dataset

To further test the proposed method, Figure 4.2 shows its evaluation
on the Movielens 100K dataset. As shown in Figure 4.2, the proposed method outperforms the
other recommendation methods on both the MAE and RMSE metrics. The figure also shows that
the proposed method scales well to larger datasets and keeps the same performance as on small
datasets. For instance, the proposed method predicts ratings with an RMSE of almost 0.8 on both
the Movielens 100K and Tripadvisor datasets.

(a) Movielens 100K datasets using MAE metric

(b) Movielens 100K datasets using RMSE metric

Figure 4.2: Evaluation of DCARS over different methods on Movielens 100K dataset

In the next experiments, we rely on the MovieLens 1M dataset, which is a large recommendation
dataset, to investigate which optimization algorithm is best suited to the proposed
method. We perform the experiments using four optimization algorithms: Stochastic Gradient
Descent (SGD), ADAM, RMSprop and Adagrad. Table 4.2 presents the results. The columns
compare training and testing results using both the MAE and RMSE metrics.
As shown, the training and testing performance of the proposed method are quite similar. For
instance, the training MAE and the testing MAE using SGD are respectively 0.8738 and 0.8582.

Table 4.2: Optimization algorithms over training and testing using MovieLens 1M dataset

Optimization algorithm   MAE training   MAE testing   RMSE training   RMSE testing
SGD                      0.8738         0.8582        0.9907          0.9957
ADAM                     0.6433         0.718         0.8373          0.9317
RMSprop                  0.6442         0.7218        0.8432          0.8978
Adagrad                  0.6741         0.7986        0.8919          0.9718

To further show the results, Figure 4.3 illustrates the results presented in Table 4.2. As shown,
ADAM and RMSprop provide quite similar results, and both perform better than the other
optimization methods in terms of training and testing results. For instance, the ADAM
optimization algorithm reaches a training MAE of 0.6433 compared to 0.8738 for SGD. In addition,
the figure shows that the training and testing results on the MAE and RMSE metrics are quite
similar, which indicates that the proposed method generalizes well.
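The gap between training and testing error can be computed directly from the values reported in Table 4.2. The sketch below only rearranges those reported numbers; the dictionary layout and the helper name `gaps` are illustrative choices.

```python
# (MAE training, MAE testing, RMSE training, RMSE testing) as reported in Table 4.2.
RESULTS = {
    "SGD": (0.8738, 0.8582, 0.9907, 0.9957),
    "ADAM": (0.6433, 0.718, 0.8373, 0.9317),
    "RMSprop": (0.6442, 0.7218, 0.8432, 0.8978),
    "Adagrad": (0.6741, 0.7986, 0.8919, 0.9718),
}


def gaps(results):
    """Testing-minus-training difference per optimizer, for MAE and RMSE."""
    return {
        name: (round(mae_te - mae_tr, 4), round(rmse_te - rmse_tr, 4))
        for name, (mae_tr, mae_te, rmse_tr, rmse_te) in results.items()
    }


generalization_gaps = gaps(RESULTS)
best_test_mae = min(RESULTS, key=lambda name: RESULTS[name][1])
best_test_rmse = min(RESULTS, key=lambda name: RESULTS[name][3])

for name, (mae_gap, rmse_gap) in generalization_gaps.items():
    print(f"{name:8s} MAE gap {mae_gap:+.4f}  RMSE gap {rmse_gap:+.4f}")
print(best_test_mae, best_test_rmse)
```

On these reported values the lowest testing MAE is obtained with ADAM and the lowest testing RMSE with RMSprop, while SGD has the smallest gaps but the highest errors.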

(a) Movielens 1 million datasets using MAE metric

(b) Movielens 1 million datasets using RMSE metric

Figure 4.3: Evaluation of DCARS on Movielens 1 million using different optimization algorithms

In another experiment, we evaluate the proposed DCARS over a range of iteration numbers.
We execute DCARS on the MovieLens 1M dataset using the ADAM optimization algorithm, since it
provides good performance compared to the other techniques, and we vary the iteration
number from 10 to 50. Figure 4.4 shows the results of this experiment. As shown, as we increase
the iteration number, the prediction quality of the model increases, especially on the RMSE
metric. On the MAE metric, however, the prediction quality decreases after 30 iterations,
which means that in this case we should use 30 iterations.

(a) Movielens 1 million dataset using MAE metric

(b) Movielens 1 million dataset using RMSE metric

Figure 4.4: Evaluation of DCARS on Movielens 1 million using different iteration numbers

To further test our method, we evaluate the proposed DCARS over a range of numbers of layers.
In this experiment too, we execute DCARS on the MovieLens 1M dataset using the ADAM
optimization algorithm, with 30 iterations. Figure 4.5 shows the results
of this experiment. As shown, as we increase the number of layers, the training and the testing
results approach each other, which indicates that the proposed method performs well as we
increase the number of hidden layers. For instance, with 6 layers (128, 64, 32, 16, 8, 4), on both
the MAE and RMSE metrics the gap between the training results and the testing results becomes
smaller, and this provides good prediction results.

(a) Movielens 1 million dataset using MAE metric

(b) Movielens 1 million dataset using RMSE metric

Figure 4.5: Evaluation of DCARS on Movielens 1 million using different numbers of layers

Conclusion

In this chapter, we evaluated our proposed method DCARS on real recommendation datasets. The
results show that our recommendation algorithm outperforms the other recommendation
methods in terms of the MAE and RMSE metrics. Likewise, we showed that our approach can
perform recommendation on large datasets.

Conclusion

Recommender systems aim to provide suitable recommendations to users among large amounts of
information in many fields such as Multimedia, E-commerce and Tourism. Two common techniques
of recommendation can be distinguished in the literature: Non-Personalized Recommendation
and Personalized Recommendation. Personalized Recommendation systems, or simply recommender
systems, are widely used due to their capability to find the most suitable products or services
based on the user's preferences and constraints. Personalized recommendation can be classified
into four main approaches: Content-Based, Collaborative Filtering, Hybrid and Advanced
approaches. In this thesis, we relied on an advanced recommendation approach called
Context-Aware recommendation, which takes contextual information into account at the time of
recommendation.

In recent years, the Recommender Systems community has shown great interest in the Machine
Learning field, especially the Deep Learning sub-field, and several new recommendation
methods have relied on such technologies to enhance recommendation performance and
provide better recommendations to users. However, most work has applied Deep Learning to common
recommendation approaches rather than to advanced ones such as Context-Aware recommendation.
Thus, in this master thesis, we relied on Deep Learning technology to improve recommendation
accuracy using contextual information, and we proposed a new Context-Aware Recommender
based on this state-of-the-art technology, which we called Deep Context-Aware Recommender
System (DCARS).

The experimental study was conducted on three datasets, and we used three state-of-the-art
approaches for comparison with the proposed model. The obtained results show that the proposed
Deep Learning based model can enhance prediction performance when providing recommendations
to users in a specific context. Likewise, the experiments show that combining Deep Learning and
Context-Aware recommendation can outperform the state of the art by a good margin on the MAE
and RMSE metrics.

In this thesis, we have used Deep Learning technology, more precisely embeddings and
feed-forward neural network architectures, for the task of recommendation. Further research could
extend this model to improve performance by including other features such as
text and images in the learning stage. Furthermore, it might be of interest to use other Deep
Learning models such as CNN and RNN with the Context-Aware recommendation approach.

References

Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems:
A survey of the state-of-the-art and possible extensions. IEEE transactions on knowledge and
data engineering, 17(6), 734–749.

Adomavicius, G., & Tuzhilin, A. (2015). Context-aware recommender systems. In Recommender
systems handbook (pp. 191–226). Springer.

Baltrunas, L., Ludwig, B., & Ricci, F. (2011). Matrix factorization techniques for context aware
recommendation. In Proceedings of the fifth acm conference on recommender systems (pp.
301–304).

Bansal, T., Belanger, D., & McCallum, A. (2016). Ask the gru: Multi-task learning for deep text
recommendations. In Proceedings of the 10th acm conference on recommender systems (pp.
107–114).

Bennett, J., Lanning, S., et al. (2007). The netflix prize. In Proceedings of kdd cup and workshop
(Vol. 2007, p. 35).

Bobadilla, J., Ortega, F., Hernando, A., & Gutiérrez, A. (2013). Recommender systems survey.
Knowledge-based systems, 46, 109–132.

51
REFERENCES

Bouza, A., Reif, G., Bernstein, A., & Gall, H. (2008). Semtree: Ontology-based decision tree
algorithm for recommender systems. In Proceedings of the 2007 international conference on
posters and demonstrations-volume 401 (pp. 106–107).

Burke, R. (2007). Hybrid web recommender systems. In The adaptive web (pp. 377–408).
Springer.

Campos, P. G., Fernández-Tobías, I., Cantador, I., & Díez, F. (2013). Context-aware movie
recommendations: an empirical comparison of pre-filtering, post-filtering and contextual
modeling approaches. In International conference on electronic commerce and web technologies
(pp. 137–149).

Cao, S., Yang, N., & Liu, Z. (2017). Online news recommender based on stacked auto-encoder.
In Computer and information science (icis), 2017 ieee/acis 16th international conference on
(pp. 721–726).

Cheng, H.-T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., . . . others (2016).
Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep
learning for recommender systems (pp. 7–10).

Chollet, F. (2017). Deep learning with python. Manning Publications Co.

Codina, V., Ricci, F., & Ceccaroni, L. (2016). Distributional semantic pre-filtering in context-
aware recommender systems. User Modeling and User-Adapted Interaction, 26(1), 1–32.

Dahl, G. E., Sainath, T. N., & Hinton, G. E. (2013). Improving deep neural networks for lvcsr
using rectified linear units and dropout. In Acoustics, speech and signal processing (icassp),
2013 ieee international conference on (pp. 8609–8613).

Deng, S., Huang, L., Xu, G., Wu, X., & Wu, Z. (2017). On deep learning for trust-aware
recommendations in social networks. IEEE transactions on neural networks and learning
systems, 28(5), 1164–1177.

Devooght, R., & Bersini, H. (2016). Collaborative filtering with recurrent neural networks. arXiv
preprint arXiv:1608.07400.

Ding, D., Zhang, M., Li, S.-Y., Tang, J., Chen, X., & Zhou, Z.-H. (2017). Baydnn: Friend
recommendation with bayesian personalized ranking deep neural network. In Proceedings of
the 2017 acm on conference on information and knowledge management (pp. 1479–1488).

Dong, X., Yu, L., Wu, Z., Sun, Y., Yuan, L., & Zhang, F. (2017). A hybrid collaborative filtering
model with deep structure for recommender systems. In Aaai (pp. 1309–1315).

Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and
stochastic optimization. Journal of Machine Learning Research, 12(Jul), 2121–2159.

Ebesu, T., & Fang, Y. (2017). Neural semantic personalized ranking for item cold-start recom-
mendation. Information Retrieval Journal, 20(2), 109–131.

Funahashi, K.-I. (1989). On the approximate realization of continuous mappings by neural
networks. Neural networks, 2(3), 183–192.

Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural
networks. In Proceedings of the thirteenth international conference on artificial intelligence
and statistics (pp. 249–256).

Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992). Using collaborative filtering to weave
an information tapestry. Communications of the ACM, 35(12), 61–70.

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
(http://www.deeplearningbook.org)

Graves, A., Mohamed, A.-r., & Hinton, G. (2013). Speech recognition with deep recurrent
neural networks. In Acoustics, speech and signal processing (icassp), 2013 ieee international
conference on (pp. 6645–6649).

Gunawardana, A., & Shani, G. (2009). A survey of accuracy evaluation metrics of recommen-
dation tasks. Journal of Machine Learning Research, 10(Dec), 2935–2962.

Haruna, K., Ismail, M. A., Damiasih, D., Chiroma, H., & Herawan, T. (2017). Comprehensive
survey on comparisons across contextual pre-filtering, contextual post-filtering and contextual
modelling approaches. Telkomnika, 15(4), 1865–1875.

He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T.-S. (2017). Neural collaborative filtering.
In Proceedings of the 26th international conference on world wide web (pp. 173–182).

Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., . . . others (2012). Deep
neural networks for acoustic modeling in speech recognition: The shared views of four re-
search groups. IEEE Signal Processing Magazine, 29(6), 82–97.

Holzinger, A., & Jurisica, I. (2014). Knowledge discovery and data mining in biomedical in-
formatics: The future is in integrative, interactive machine learning solutions. In Interactive
knowledge discovery and data mining in biomedical informatics (pp. 1–18). Springer.

Karatzoglou, A., Amatriain, X., Baltrunas, L., & Oliver, N. (2010). Multiverse recommendation:
n-dimensional tensor factorization for context-aware collaborative filtering. In Proceedings of
the fourth acm conference on recommender systems (pp. 79–86).

Kim, D., Park, C., Oh, J., Lee, S., & Yu, H. (2016). Convolutional matrix factorization for
document context-aware recommendation. In Proceedings of the 10th acm conference on
recommender systems (pp. 233–240).

Kim, D., Park, C., Oh, J., & Yu, H. (2017). Deep hybrid recommender systems via exploiting
document context and statistics of items. Information Sciences, 417, 72–87.

Ko, Y. J., Maystre, L., & Grossglauser, M. (2016). Collaborative recurrent neural networks
for dynamic recommender systems. In Journal of machine learning research: Workshop and
conference proceedings (Vol. 63).

Koren, Y. (2008). Factorization meets the neighborhood: a multifaceted collaborative filtering
model. In Proceedings of the 14th acm sigkdd international conference on knowledge
discovery and data mining (pp. 426–434).

Koren, Y. (2009). The bellkor solution to the netflix grand prize. Netflix prize documentation,
81, 1–10.

Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender
systems. Computer, 42(8).

Kuchaiev, O., & Ginsburg, B. (2017). Training deep autoencoders for collaborative filtering.
arXiv preprint arXiv:1708.01715.

Kumar, V., Khattar, D., Gupta, S., Gupta, M., & Varma, V. (2017). Deep neural architecture
for news recommendation. In Working notes of the 8th international conference of the clef
initiative, dublin, ireland. ceur workshop proceedings.

Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A simple way to initialize recurrent networks of
rectified linear units. arXiv preprint arXiv:1504.00941.

LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.

Li, S., Kawale, J., & Fu, Y. (2015). Deep collaborative filtering via marginalized denoising
auto-encoder. In Proceedings of the 24th acm international on conference on information and
knowledge management (pp. 811–820).

Liang, D., Zhan, M., & Ellis, D. P. (2015). Content-aware collaborative music recommendation
using pre-trained neural networks. In Ismir (pp. 295–301).

Liang, H., & Baldwin, T. (2015). A probabilistic rating auto-encoder for personalized recom-
mender systems. In Proceedings of the 24th acm international on conference on information
and knowledge management (pp. 1863–1866).

Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item
collaborative filtering. IEEE Internet computing, 7(1), 76–80.

Lops, P., De Gemmis, M., & Semeraro, G. (2011). Content-based recommender systems: State
of the art and trends. In Recommender systems handbook (pp. 73–105). Springer.

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed represen-
tations of words and phrases and their compositionality. In Advances in neural information
processing systems (pp. 3111–3119).

Miyahara, K., & Pazzani, M. J. (2000). Collaborative filtering with the simple bayesian classifier.
In Pacific rim international conference on artificial intelligence (pp. 679–689).

Mobasher, B., Dai, H., Luo, T., & Nakagawa, M. (2001). Effective personalization based on
association rule discovery from web usage data. In Proceedings of the 3rd international work-
shop on web information and data management (pp. 9–15).

Ning, X., Desrosiers, C., & Karypis, G. (2015). A comprehensive survey of neighborhood-based
recommendation methods. In Recommender systems handbook (pp. 37–76). Springer.

Pennock, D. M., Horvitz, E., Lawrence, S., & Giles, C. L. (2000). Collaborative filtering by
personality diagnosis: A hybrid memory-and model-based approach. In Proceedings of the
sixteenth conference on uncertainty in artificial intelligence (pp. 473–480).

Ramirez-Garcia, X., & García-Valdez, M. (2014). Post-filtering for a restaurant context-aware
recommender system. In Recent advances on hybrid approaches for designing intelligent
systems (pp. 695–707). Springer.

Rawat, Y. S., & Kankanhalli, M. S. (2016). Contagnet: Exploiting user context for image tag
recommendation. In Proceedings of the 2016 acm on multimedia conference (pp. 1102–1106).

Rezende, D. J., Mohamed, S., & Wierstra, D. (2014). Stochastic backpropagation and approxi-
mate inference in deep generative models. arXiv preprint arXiv:1401.4082.

Ricci, F., Rokach, L., & Shapira, B. (2015). Recommender systems: introduction and challenges.
In Recommender systems handbook (pp. 1–34). Springer.

Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint
arXiv:1609.04747.

Salakhutdinov, R., Mnih, A., & Hinton, G. (2007). Restricted boltzmann machines for collabo-
rative filtering. In Proceedings of the 24th international conference on machine learning (pp.
791–798).

Sarikaya, R., Hinton, G. E., & Deoras, A. (2014). Application of deep belief networks for
natural language understanding. IEEE/ACM Transactions on Audio, Speech, and Language
Processing, 22(4), 778–784.

Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering
recommendation algorithms. In Proceedings of the 10th international conference on world
wide web (pp. 285–295).

Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61,
85–117.

Shi, Y., Karatzoglou, A., Baltrunas, L., Larson, M., & Hanjalic, A. (2014). Cars2: Learning
context-aware representations for context-aware recommendations. In Proceedings of the 23rd
acm international conference on conference on information and knowledge management (pp.
291–300).

Singhal, A., Sinha, P., & Pant, R. (2017). Use of deep learning in modern recommendation
system: A summary of recent works. arXiv preprint arXiv:1712.07525.

Smirnova, E., & Vasile, F. (2017). Contextual sequence modeling for recommendation with
recurrent neural networks. arXiv preprint arXiv:1706.07684.

Sottocornola, G., Stella, F., Zanker, M., & Canonaco, F. (2017). Towards a deep learning
model for hybrid recommendation. In Proceedings of the international conference on web
intelligence (pp. 1260–1264).

Strub, F., Gaudel, R., & Mary, J. (2016). Hybrid recommender system based on autoencoders.
In Proceedings of the 1st workshop on deep learning for recommender systems (pp. 11–16).

Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). On the importance of initialization
and momentum in deep learning. In International conference on machine learning (pp. 1139–
1147).

Van den Oord, A., Dieleman, S., & Schrauwen, B. (2013). Deep content-based music recom-
mendation. In Advances in neural information processing systems (pp. 2643–2651).

Wan, J., Wang, D., Hoi, S. C. H., Wu, P., Zhu, J., Zhang, Y., & Li, J. (2014). Deep learning
for content-based image retrieval: A comprehensive study. In Proceedings of the 22nd acm
international conference on multimedia (pp. 157–166).

Wang, H., Wang, N., & Yeung, D.-Y. (2015). Collaborative deep learning for recommender sys-
tems. In Proceedings of the 21th acm sigkdd international conference on knowledge discovery
and data mining (pp. 1235–1244).

Wang, S., Zou, B., Li, C., Zhao, K., Liu, Q., & Chen, H. (2015). Crown: a context-aware rec-
ommender for web news. In Data engineering (icde), 2015 ieee 31st international conference
on (pp. 1420–1423).

Wang, X., & Wang, Y. (2014). Improving content-based and hybrid music recommendation
using deep learning. In Proceedings of the 22nd acm international conference on multimedia
(pp. 627–636).

Wang, X., Yu, L., Ren, K., Tao, G., Zhang, W., Yu, Y., & Wang, J. (2017). Dynamic attention
deep model for article recommendation by learning human editors’ demonstration. In Pro-
ceedings of the 23rd acm sigkdd international conference on knowledge discovery and data
mining (pp. 2051–2059).

Wu, H., Yue, K., Liu, X., Pei, Y., & Li, B. (2015). Context-aware recommendation via graph-
based contextual modeling and postfiltering. International Journal of Distributed Sensor Net-
works, 11(8), 613612.

Wu, Y., DuBois, C., Zheng, A. X., & Ester, M. (2016). Collaborative denoising auto-encoders
for top-n recommender systems. In Proceedings of the ninth acm international conference on
web search and data mining (pp. 153–162).

Xu, Z., Chen, C., Lukasiewicz, T., & Miao, Y. (2017). Hybrid deep-semantic matrix factorization
for tag-aware personalized recommendation. arXiv preprint arXiv:1708.03797.

Xue, G.-R., Lin, C., Yang, Q., Xi, W., Zeng, H.-J., Yu, Y., & Chen, Z. (2005). Scalable collabo-
rative filtering using cluster-based smoothing. In Proceedings of the 28th annual international
acm sigir conference on research and development in information retrieval (pp. 114–121).

Xue, H.-J., Dai, X.-Y., Zhang, J., Huang, S., & Chen, J. (2017). Deep matrix factorization
models for recommender systems. In Ijcai (pp. 3203–3209).

Zhang, S., Yao, L., & Sun, A. (2017). Deep learning based recommender system: A survey and
new perspectives. arXiv preprint arXiv:1707.07435.

Zheng, L., Noroozi, V., & Yu, P. S. (2017). Joint deep modeling of users and items using reviews
for recommendation. In Proceedings of the tenth acm international conference on web search
and data mining (pp. 425–434).

Zheng, Y., Burke, R., & Mobasher, B. (2012). Differential context relaxation for context-aware
travel recommendation. In International conference on electronic commerce and web tech-
nologies (pp. 88–99).

Zheng, Y., Mobasher, B., & Burke, R. (2014). Cslim: Contextual slim recommendation algo-
rithms. In Proceedings of the 8th acm conference on recommender systems (pp. 301–304).

Zheng, Y., Mobasher, B., & Burke, R. (2015a). Incorporating context correlation into context-
aware matrix factorization. In Proceedings of the 2015 international conference on constraints
and preferences for configuration and recommendation and intelligent techniques for web
personalization-volume 1440 (pp. 21–27).

Zheng, Y., Mobasher, B., & Burke, R. (2015b). Integrating context similarity with sparse lin-
ear recommendation model. In International conference on user modeling, adaptation, and
personalization (pp. 370–376).

Zheng, Y., Tang, B., Ding, W., & Zhou, H. (2016). A neural autoregressive approach to collabo-
rative filtering. arXiv preprint arXiv:1605.09477.
