Deep Learning Based Context Aware Recommender System
UNIVERSITÉ DE TUNIS
INSTITUT SUPÉRIEUR DE GESTION
SPÉCIALITÉ
SCIENCES ET TECHNIQUES DE L’INFORMATIQUE DE DÉCISION
OPTION
DEEP LEARNING BASED CONTEXT-AWARE RECOMMENDER SYSTEM
Recommender System
MARWEN BDIRI
LABORATOIRE: BESTMOD
Thanks to their ability to anticipate the items that may interest users, recommender
systems have become a common technique for helping users find interesting items
within a large collection of items such as films, books and music. Advanced
recommender system approaches rely on machine learning and deep learning
to obtain more accurate and personalized recommendations for users. The aim of
this master's thesis is first to examine recommendation techniques based on deep
learning, and then to design a new context-aware recommendation algorithm based
on this technology.
Abstract:
Thanks to their ability to anticipate items that may be of interest to users, recommender
systems have become a common technique for helping users find interesting items
within large collections of items such as movies, books and music. Advanced
recommender system approaches rely on machine learning and deep learning
to produce more accurate and personalized recommendations
for users. The aim of this master's thesis is first to review existing recommendation
techniques that use Deep Learning and then to design a new contextual
recommendation algorithm based on this technology.
I would like to express my deepest gratitude and thanks to my advisor Dr. Kaouther Nouira
Ferchichi who has always encouraged me throughout the progress of this master thesis. Thank
you for your patience and your continuous guidance during the preparation of this work. I have
learnt many things through your useful comments, your interesting remarks and the precious
discussions we have had during our collaboration.
I would especially like to thank all the members of BESTMOD laboratory of the Institut Supérieur
de Gestion Tunis, University of Tunis and all my professors.
Special thanks go to my family. Words cannot express how grateful I am to my parents and
my two sisters for all your patience and all the sacrifices that you have made on my behalf.
Contents
Introduction

Introduction
1.2.4 Backpropagation
Conclusion

Introduction
2.5 Discussion
Conclusion

Introduction
Conclusion

4 Experimental Study
Introduction
Conclusion

Conclusion

List of Figures
1.4 The relationships between Artificial Intelligence, Machine Learning and Deep Learning

List of Tables
4.2 Optimization algorithms over training and testing using MovieLens 1M dataset

List of Algorithms
Introduction
With the rise of big data and the growing volume of information, especially on the web, it is
becoming difficult for Internet users to navigate this sea of information effectively. For instance,
a user browsing an online store such as Amazon does not wish to go through tens, or perhaps
hundreds, of uninteresting items before finding a desired object such as a book or a movie to buy.
In other words, this huge number of options makes it harder for users to find exactly the item
they want (Linden et al., 2003). This issue arises in many domains, including multimedia,
e-commerce and tourism, and in many companies such as Google, Spotify, Amazon and Tripadvisor.
Thus, in order to remain competitive, such companies should provide their users with satisfying
services.
Owing to their capability to overcome information overload, recommender systems have become
a common solution for providing suitable recommendations to users from large amounts of
information (Ricci et al., 2015). Recommender systems can be defined as software tools and
techniques that help users make the best choice within a certain domain (Linden et al., 2003). Two
common recommendation techniques can be distinguished in the literature: Non-Personalized
recommendation and Personalized recommendation (Ricci et al., 2015). The former is less
personal, since it provides the same recommendation to a whole group of people. For instance,
this approach can be used in magazines and newspapers; typical examples include top-ten
selections of news, trends, books, etc. Non-Personalized recommendations are much simpler to
generate, which is why they are not typically addressed by recommender systems researchers.
Personalized recommendation, on the other hand, is customer-friendly: it helps a user find the
most suitable products or services based on his or her preferences and constraints. Such systems
collect either explicit information from the user's activities, such as product ratings, or implicit
information, such as navigation history (Ricci et al., 2015; Linden et al., 2003). In this setting,
this thesis relies on personalized recommender systems, as they are the most widely used and
most challenging approach in both research and industry.
Motivation
Several research works have been conducted to enhance the accuracy and performance of
recommendation techniques that use only items and users (Bobadilla et al., 2013). However,
in many modern applications this information is not sufficient (Adomavicius
& Tuzhilin, 2015). Indeed, such traditional representations consider only the interaction or
similarity between users and items, without any other information such as the time at which
the interaction took place, the location or the mood of the user. For instance, a travel
agency makes a good recommendation if it offers a ski package to someone who likes skiing,
but offering him this package in the summer would not be a good idea. In other
words, a vacation recommendation should not be the same in winter as in summer. Thus,
it is important to incorporate contextual information into the recommendation process.
techniques such as Collaborative Filtering models are not able to deal accurately with such data
due to the high-dimensional nature of the problem (Adomavicius & Tuzhilin, 2015). Hence
the need for new models that can easily deal with such data.
Deep Learning is a sub-field of Machine Learning that has shown tremendous success in a wide
range of research areas such as Image Recognition (Wan et al., 2014), Natural Language
Processing (Sarikaya et al., 2014), Automatic Speech Recognition (Hinton et al., 2012) and
Biomedical Informatics (Holzinger & Jurisica, 2014). Recently, Deep Learning has been applied
to recommender systems problems. In fact, two Deep Learning workshops have been held in
conjunction with the ACM Conference on Recommender Systems (RecSys) to promote the use of
this technology within the recommendation field (Singhal et al., 2017; Zhang et al., 2017).
Despite this success, especially on complex tasks, the application of Deep Learning to the
recommendation field is still limited, especially in Context-Aware recommendation. Thus, this
thesis relies on this technology to create a recommender system that is able to deal with
a large number of features in order to provide Context-Aware Personalized recommendations.
The purpose of this master's thesis is first to review existing recommendation techniques based on
Deep Learning technology and then to design a new recommendation algorithm based on Deep
Learning models that can deal with contextual features in order to improve recommendation
accuracy.
Thesis Outline
This thesis is structured as follows: the first chapter provides the fundamental concepts of
recommender systems and Deep Learning technology. The second chapter reviews some
applications of Deep Learning in recommender systems. The third chapter describes the proposed
method. Finally, the fourth chapter presents the experimental study and the obtained results.
Chapter 1
Recommender Systems and Deep
Learning
Introduction
This chapter provides an overview of recommender systems and Deep Learning technology.
Section 1.1 presents the basic concepts of recommender systems and Section 1.2 presents the basic
concepts of Deep Learning technology.
Since they were introduced in the early 1990s, recommender systems have become an important
field of research that helps users deal with information overload (Goldberg et al., 1992).
Recommender systems can be defined as software tools and techniques that suggest the items
most likely to interest a specific user, such as books and articles to read, movies to
watch, music to listen to and restaurants to dine at (Ricci et al., 2015). An example of such a
system can be seen on the popular website Amazon.com, which employs a recommender system to
personalize the online store for each customer in order to assist them in selecting a book to
read (Linden et al., 2003). Based on how recommendations are made, recommender systems
can be classified into three main categories: Content-Based, Collaborative Filtering and Hybrid
recommender systems (Adomavicius & Tuzhilin, 2005).
The Content-Based recommendation approach has its roots in two main fields: Information Retrieval
and Information Filtering (Ricci et al., 2015). This approach performs recommendation by
suggesting items similar to the ones the user liked in the past. To do so, it computes
and compares the similarity between the content or features associated with the items in a data
set (Adomavicius & Tuzhilin, 2005). For instance, if a user has listened to
a song that belongs to the Jazz genre, then a system following this approach can learn to
recommend other songs from this genre (i.e., Jazz). A good application of such a system is
a movie recommender. In this case, the Content-Based technique can be used to find the
commonalities between all the movies that a user has seen in the past, such as specific genres,
directors or actors. It then recommends only the movies
that have a high similarity with the movies that have been watched in the past (Lops et al., 2011).
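
The content-based idea described above can be illustrated with a minimal sketch. The item names, the binary genre features and the choice of cosine similarity are assumptions made for illustration only; they are not taken from the cited works.

```python
import math

# Hypothetical catalogue: each song is described by binary genre features
# in the order [jazz, rock, classical]. Names and features are made up.
items = {
    "song_a": [1, 0, 0],
    "song_b": [1, 0, 1],
    "song_c": [0, 1, 0],
}

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def recommend_similar(liked, catalogue, k=1):
    """Return the k items whose features are most similar to the liked item."""
    scores = {name: cosine(catalogue[liked], feats)
              for name, feats in catalogue.items() if name != liked}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend_similar("song_a", items))
```

A user who liked the jazz song `song_a` is offered `song_b`, the only other item sharing the jazz feature.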
Collaborative Filtering is considered the most popular and widely used approach in the
recommender systems field (Ricci et al., 2015). The basic idea of this approach is to
provide item recommendations or predictions based on the opinions of other users with
similar preferences within the system. These opinions can be obtained explicitly, for example on
a rating scale (a value between 1 and 5), or implicitly, for example from navigation history or
interaction with the system (Sarwar et al., 2001). Examples of popular websites that use this
technique are Amazon.com and Netflix (Koren, 2008).
The Collaborative Filtering technique recommends new items to a user based on the previously
liked items and the opinion history of other similar users who share the same preferences (Ricci
et al., 2015). As delineated in Figure 1.1, the input of the recommendation process is a
2-dimensional matrix of m × n rating values, where m is the number of users and n is the number
of items within the system. Each cell in the matrix represents an opinion, also called a rating
value (r_ui), expressed by user u for item i. An opinion can be explicit information given by the
user, such as a rating score, or implicit information, such as transaction records or page
navigation history. The list of m users is referred to as U = {u_1, u_2, u_3, ..., u_m} and the list
of n items is referred to as I = {i_1, i_2, i_3, ..., i_n}. Collaborative Filtering provides two forms of
results for the active user: a prediction, which expresses the degree of liking (e.g., a rating value
from 1 to 5) of an item i_j for the active user, and a list of N items (Top-N recommendations) that
he or she might like the most.
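
As an illustration of the input and the two output forms described above, the following sketch stores an m × n rating matrix and derives a prediction and a Top-N list from it. The rating values are made up, and the predictor is a deliberately simple item-mean baseline used only as a placeholder; it is an assumption and not a Collaborative Filtering algorithm.

```python
# Rows are users (m = 3), columns are items (n = 4); None marks a missing rating.
# All values are hypothetical.
R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, 5, None],
]

def item_mean(R, i):
    """Average of the known ratings of item i (0.0 if nobody rated it)."""
    vals = [row[i] for row in R if row[i] is not None]
    return sum(vals) / len(vals) if vals else 0.0

def baseline_predict(R, u, i):
    """Placeholder predictor: ignores the user and returns the item mean."""
    return item_mean(R, i)

def top_n(R, u, n, predict):
    """Rank the items user u has not rated by predicted rating, keep the best n."""
    candidates = [i for i, r in enumerate(R[u]) if r is None]
    ranked = sorted(candidates, key=lambda i: predict(R, u, i), reverse=True)
    return ranked[:n]

print(baseline_predict(R, 1, 2))        # prediction for user index 1, item index 2
print(top_n(R, 1, 1, baseline_predict))  # Top-1 list for user index 1
```

Any real predictor with the same `(R, u, i)` signature could be passed to `top_n` in place of the baseline.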
Collaborative Filtering algorithms can be further divided into two main families: Memory-Based
techniques and Model-Based techniques. The following two sub-sections detail each of them
in turn.
User-Based Collaborative Filtering: this method estimates the rating of an active user for a new
item using the opinions given to this item by the users most similar to the active user, also called
the Nearest-Neighbor users (Ning et al., 2015). Figure 1.2 presents a toy example of a movie
recommender system using User-Based Collaborative Filtering. The figure considers Eric as the
active user and Lucy as a user similar to Eric: Eric has watched and rated the movies Titanic and
Forrest Gump, and Lucy has also watched both movies and rated them positively.
Accordingly, one can conclude that the two users have similar tastes and that Eric will also enjoy
the movies that Lucy enjoys. Thus, it is natural to recommend to Eric all the movies liked
by Lucy.
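
A minimal sketch of user-based prediction, loosely following the Eric and Lucy example, is given below. The numeric ratings, the third user Bob, the use of cosine similarity over co-rated items and the default of a single nearest neighbor are all assumptions chosen for illustration.

```python
import math

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "Eric": {"Titanic": 5, "Forrest Gump": 4},
    "Lucy": {"Titanic": 5, "Forrest Gump": 5, "Punchline": 4},
    "Bob":  {"Titanic": 1, "Forrest Gump": 2, "Punchline": 1},
}

def similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, k=1):
    """Similarity-weighted average of the ratings given by the k nearest neighbors."""
    neighbours = [(similarity(ratings[user], r), r[item])
                  for other, r in ratings.items()
                  if other != user and item in r]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = neighbours[:k]
    total = sum(s for s, _ in top)
    return sum(s * r for s, r in top) / total if total else None

print(predict("Eric", "Punchline"))
```

With one neighbor the prediction for Eric simply copies Lucy's rating, because Lucy is his most similar user. Larger values of `k` blend in less similar users.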
Item-Based Collaborative Filtering: this method relies on item similarity to perform
recommendation. It looks at the ratings given to neighboring items in order to recommend
the items that are potentially interesting for an active user (Sarwar et al., 2001). Figure 1.3
illustrates this method on a movie recommender system. In this example, Eric has watched
the movie Punchline; accordingly, the system determines the similarity between this movie and
the potentially interesting movies. In our tiny example, the recommendation is Forrest Gump,
which has the same actor (i.e., Tom Hanks) and genre (i.e., comedy/drama) as
Punchline.
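
The following sketch illustrates item-based recommendation. As an assumption, item similarity is computed here from the ratings the items received (cosine similarity over users who rated both items) rather than from actor or genre metadata; the users Bob and Ann, the movie Alien and all rating values are hypothetical.

```python
import math

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "Eric": {"Punchline": 5},
    "Lucy": {"Punchline": 4, "Forrest Gump": 5, "Alien": 1},
    "Bob":  {"Punchline": 5, "Forrest Gump": 4, "Alien": 2},
    "Ann":  {"Punchline": 1, "Forrest Gump": 2, "Alien": 5},
}

def item_vector(item):
    """Ratings received by an item, keyed by user."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def item_similarity(i, j):
    """Cosine similarity between two items over the users who rated both."""
    a, b = item_vector(i), item_vector(j)
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def recommend(user, n=1):
    """Score each unseen item by its similarity to the items the user rated."""
    seen = ratings[user]
    all_items = {i for r in ratings.values() for i in r}
    scores = {cand: sum(item_similarity(cand, i) * r for i, r in seen.items())
              for cand in all_items - set(seen)}
    return sorted(scores, key=scores.get, reverse=True)[:n]

print(recommend("Eric"))
```

Because the item-item similarities depend only on the rating columns, they can be computed once and reused for every user, which is one reason item-based methods need less frequent updates.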
Choosing between User-Based and Item-Based approaches has a big impact on the accuracy
and efficiency of the recommendation task. In particular, User-Based methods usually
provide more original recommendations than Item-Based methods, which may lead
to a more satisfying user experience. However, on large data sets where the number of users
exceeds the number of items (as is common in real applications), this approach becomes
computationally expensive. Item-Based approaches, on the other hand, are typically preferred in
practice, since they provide accurate recommendations, are computationally efficient and require
less frequent updates, because the items within the system are rarely updated.
Memory-Based methods use the whole ratings matrix each time a recommendation is computed,
which is a problem when the system contains millions of ratings. Model-Based methods, in
contrast, build a pre-trained model that can be used later to perform recommendation.
Training the model takes much more time up front,
but it then produces fast recommendations in production (Koren, 2008). Model-Based
approaches usually use Machine Learning algorithms to train the model. Thus, several Machine
Learning algorithms have been adapted to the recommendation area, such as the Naive Bayes
Classifier (Miyahara & Pazzani, 2000), Association Rule-Based methods (Mobasher et al., 2001),
Decision Trees (Bouza et al., 2008) and Clustering (G.-R. Xue et al., 2005).
Thanks to its ability to combine multiple recommendation techniques to achieve better
results, the Hybrid recommendation approach is considered a very practical approach
in real applications (Burke, 2007). It combines characteristics of both Content-Based and
Collaborative Filtering methods in order to overcome the limitations of each. Several studies have
shown that the Hybrid approach can provide more accurate recommendations and can be
very effective in some hard recommendation cases compared to pure methods (Koren, 2009). For
instance, the BellKor solution to the Netflix Grand Prize combined over 100 recommendation
algorithms to win the competition (Koren, 2009). Hybrid recommendation techniques can be
divided into four main categories (Burke, 2007): Weighted, Switching, Mixed
and Cascade strategies. The Weighted strategy computes the scores predicted by several
recommendation techniques and combines them into a single score that is used
as the final recommendation. The Switching strategy selects, among several recommendation
techniques, the most appropriate one for the situation at hand and applies it.
The Mixed strategy uses different recommendation techniques together at the same time
in order to overcome the drawbacks of each single technique. Finally, the Cascade strategy uses
a sequence of recommendation techniques in which the output of one technique serves as the
input of the next, which refines the recommendations given
by the previous one.
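
Of the four strategies, the Weighted strategy is the simplest to sketch. In the example below, the two score dictionaries stand in for the outputs of a content-based and a collaborative recommender; the item names, score values and weights are hypothetical.

```python
def weighted_hybrid(score_lists, weights):
    """Combine per-item scores from several recommenders by a weighted sum
    and return the items ranked by the combined score."""
    combined = {}
    for scores, w in zip(score_lists, weights):
        for item, s in scores.items():
            combined[item] = combined.get(item, 0.0) + w * s
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical normalised scores produced by two separate recommenders.
content_scores = {"A": 0.9, "B": 0.2, "C": 0.5}
collaborative_scores = {"A": 0.1, "B": 0.8, "C": 0.6}

print(weighted_hybrid([content_scores, collaborative_scores], [0.3, 0.7]))
```

In practice the weights would have to be tuned, for example on held-out ratings; the fixed values here only show how the combination changes the final ranking.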
Recently, several new recommendation approaches have been proposed in the literature. These
approaches are not as well known as the traditional recommendation techniques, but they are
used to overcome their limitations. The following presents some of them.
users.
This section presents the fundamentals of Deep Learning: the theory and architecture of deep
neural networks. The theory and architecture of artificial neural networks are the foundation of
Deep Learning technology (Goodfellow et al., 2016). Figure 1.4 illustrates the relationship
between Artificial Intelligence, Machine Learning and Deep Learning. First, Artificial Intelligence
is the general field that encompasses Machine Learning and Deep Learning, but also
many other approaches that may not involve learning. For instance, chess programs can rely
only on hard-coded rules written by programmers, which are not considered Machine
Learning. The second circle is Machine Learning, which encompasses Deep Learning. Machine
Learning involves the use of a set of models that learn from data how to do something
rather than being explicitly programmed to do it. The final circle represents Deep Learning,
which is based on artificial neural network models to achieve learning.
Figure 1.4: The relationships between Artificial Intelligence, Machine Learning and Deep Learning
(Goodfellow et al., 2016)
An artificial neural network is a Machine Learning technique inspired by the structure
and function of the brain (LeCun et al., 2015). Just as the brain is built from biological
neurons, the fundamental processing element of an artificial neural network is a neuron
(Goodfellow et al., 2016). Figure 1.5 presents a biological neuron and an artificial neuron. The
right side of the figure shows a representation of an artificial neuron: the x_j are the inputs to
the neuron, and each of these inputs is multiplied by a weight w_j. These products are then
summed and fed into a transfer function in order to generate the output. Usually, the transfer
function relies on a non-linear activation function and is modeled as shown in Equation 1.1.
$$a_i = f\Big(\sum_{j} w_j x_j + b\Big) \qquad (1.1)$$
where a_i is the activation of neuron i, f is the non-linear activation function, w_j is
the link strength (weight) of input j, x_j is the j-th input of the neuron and b is the neuron bias.
Neural networks were originally modeled using three main activation functions.
$$\sigma(x) = \frac{1}{1 + e^{-x}} \qquad (1.2)$$
It takes a real value and squashes it between 0 and 1. Therefore, it is especially used for
models whose output is a predicted probability, since probability values can range
only between 0 and 1.
Hyperbolic Tangent function (TanH): the following equation shows the TanH function
(Glorot & Bengio, 2010)
$$\mathrm{TanH}(x) = \frac{2}{1 + e^{-2x}} - 1 \qquad (1.3)$$
It squashes a real-valued number between -1 and 1. The TanH function has the same shape as the
logistic Sigmoid, but its output is centered on zero, which often gives better results in practice.
Rectified Linear Unit function (ReLU): the following equation shows the ReLU function
(Dahl et al., 2013)
$$\mathrm{ReLU}(x) = \max(0, x) \qquad (1.4)$$
The ReLU function is the most commonly used activation function in neural network models
(LeCun et al., 2015). This function returns 0 if it receives any negative input, and for any positive
value x it returns that value back. It can greatly accelerate convergence compared
to the other functions, which is why it is widely used in Deep Learning (LeCun et al., 2015).
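
The artificial neuron and the three activation functions can be written out as a short sketch using their standard definitions. The function names and the example inputs, weights and bias are illustrative assumptions.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: squashes x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent written with exponentials: squashes x into (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def relu(x):
    """Rectified linear unit: 0 for negative inputs, x otherwise."""
    return max(0.0, x)

def neuron(inputs, weights, bias, f):
    """Single artificial neuron: activation of the weighted input sum plus bias."""
    return f(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Hypothetical inputs, weights and bias, evaluated with each activation.
for f in (sigmoid, tanh, relu):
    print(f.__name__, neuron([1.0, 2.0], [0.5, -0.25], 0.1, f))
```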
The simplest neural network model is the Feed-Forward Neural Network, which consists
of a set of artificial neurons organized into three types of layers: the
input layer, the hidden layers and the output layer. Information flows in one
forward direction, from the input layer through the hidden layers to the output layer. In
this type of network structure, the connections between the units form a directed acyclic
graph, which means that there are no cycles or loops in the network (Goodfellow
et al., 2016). Figure 1.7 illustrates an example of a Feed-Forward Neural Network with two hidden
layers.
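
A forward pass through a small feed-forward network with two hidden layers can be sketched as follows. The layer sizes, weights, biases and the choice of ReLU for the hidden layers are hypothetical and were picked so that the arithmetic is easy to follow by hand.

```python
def relu(x):
    return max(0.0, x)

def identity(x):
    return x

def dense(x, W, b, f):
    """One fully connected layer: each row of W holds the weights of one neuron."""
    return [f(sum(w * xi for w, xi in zip(row, x)) + bj) for row, bj in zip(W, b)]

def forward(x, layers):
    """Pass the input through the layers in order; there are no backward links."""
    for W, b, f in layers:
        x = dense(x, W, b, f)
    return x

# Hypothetical network: 2 inputs -> 2 hidden units -> 1 hidden unit -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),  # first hidden layer
    ([[1.0, 1.0]], [0.0], relu),                    # second hidden layer
    ([[2.0]], [1.0], identity),                     # output layer
]

print(forward([2.0, 1.0], layers))
```

For the input `[2.0, 1.0]` the first hidden layer yields `[1.0, 1.5]`, the second `[2.5]`, and the output layer `2 * 2.5 + 1`.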
Convolutional Neural Networks (CNN) are a particular model of artificial neural networks
that can be used to learn from data. They have been successfully applied to images, text, etc.
(LeCun et al., 2015). Unlike a regular Feed-Forward Neural Network, the layers of a CNN have
neurons arranged in 3 dimensions: width, height and depth. As shown in Figure 1.8, every layer
of a CNN transforms a 3D input volume into a 3D output volume of neuron activations. For
instance, the red input layer holds the input image. Consequently, its width and height are
the dimensions of the image, and its depth is 3 (Red, Green and Blue channels). This type of
network has been very successful and is widely used in practical applications such as computer
vision and robotics (LeCun et al., 2015).
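
To make the 3D arrangement concrete, the sketch below applies a single filter to a height × width × depth input. It is a simplified illustration under stated assumptions: no padding, stride 1, one filter, and the sliding-window product commonly called convolution in Deep Learning (strictly, a cross-correlation). The input and filter values are made up.

```python
def conv2d_valid(image, kernel):
    """Slide a k x k x C filter over an H x W x C input and return the
    resulting (H - k + 1) x (W - k + 1) activation map."""
    H, W, C = len(image), len(image[0]), len(image[0][0])
    k = len(kernel)
    out = []
    for r in range(H - k + 1):
        row = []
        for c in range(W - k + 1):
            s = 0.0
            for i in range(k):
                for j in range(k):
                    for ch in range(C):
                        s += image[r + i][c + j][ch] * kernel[i][j][ch]
            row.append(s)
        out.append(row)
    return out

# Hypothetical 3 x 3 RGB image (depth 3) and a 2 x 2 x 3 filter, all ones.
image = [[[1.0, 1.0, 1.0] for _ in range(3)] for _ in range(3)]
kernel = [[[1.0, 1.0, 1.0] for _ in range(2)] for _ in range(2)]

print(conv2d_valid(image, kernel))
```

A convolutional layer applies several such filters, and stacking their activation maps along the depth axis gives the 3D output volume described above.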
Recurrent Neural Networks (RNN) are another family of neural networks that can process
sequential data such as time series, speech and text. Unlike the other neural network
families, this kind of model produces an output at each time step. Furthermore,
the hidden units in the network have recurrent connections (Graves et al., 2013). In the
other models, all the inputs are independent of each other, whereas in an RNN
all the inputs are related to each other, as shown in Figure 1.9. For instance, this can help in the
prediction of the next word in a given sentence using the previous words. Figure 1.9 illustrates
a toy example of such a model. The left side of the figure shows that the network contains
loops, which allow information to persist. The right side of the figure shows the unrolled
model: it first takes x(0) from the input sequence and outputs h(0); then h(0) and x(1)
are the inputs for the next step. Similarly, h(1)
and x(2) are the inputs for the following step, and so on. In this way, the model keeps
remembering the context while performing the training.
Recurrent neural networks are very powerful dynamic models, but training them is considered
computationally expensive (Goodfellow et al., 2016).
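
The recurrence can be sketched with a scalar hidden state. The update rule h(t) = tanh(w_x · x(t) + w_h · h(t-1) + b) is the common simple-RNN form and is used here as an assumption; the weights and the input sequence are hypothetical.

```python
import math

def rnn_forward(xs, w_x, w_h, b, h0=0.0):
    """Run a scalar simple RNN over the sequence xs and return every hidden state.
    Each step combines the current input with the previous hidden state."""
    h, hs = h0, []
    for x in xs:
        h = math.tanh(w_x * x + w_h * h + b)
        hs.append(h)
    return hs

# The same input value 0.0 gives a different state depending on what came before it.
print(rnn_forward([0.0], w_x=1.0, w_h=1.0, b=0.0))
print(rnn_forward([1.0, 0.0], w_x=1.0, w_h=1.0, b=0.0))
```

The second call shows the memory effect: the hidden state after the final input is non-zero only because the earlier input was carried forward through the recurrent connection.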
1.2.4 Backpropagation
Backpropagation is a commonly used technique for training artificial neural networks. In contrast
to forward propagation, which passes the input data from the input layer through the network
to produce a final output, the backpropagation algorithm trains the model by propagating
feedback from the final output layer back to the input layers: the loss
between the predicted and the true output value is propagated backwards, and the resulting
gradients are used to update the weights of the network.
The weight update itself is performed with an optimization method such as gradient descent. In
Deep Learning, backpropagating the output loss through a large number of layers can hamper
training, because the neurons in the earlier layers learn much more slowly than the neurons
in the later layers. This issue
is known as the vanishing gradient problem, and it has been addressed using optimization
techniques such as Momentum and Adagrad (Rezende et al., 2014).
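
The chain-rule computation behind backpropagation can be sketched on a tiny network with one sigmoid hidden unit and a linear output. The architecture, the squared-error loss and the parameter values are assumptions chosen to keep the example readable.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Forward pass: hidden activation h, then linear output y_hat."""
    w1, b1, w2, b2 = params
    h = sigmoid(w1 * x + b1)
    return h, w2 * h + b2

def loss(params, x, y):
    """Squared-error loss between prediction and target."""
    _, y_hat = forward(params, x)
    return 0.5 * (y_hat - y) ** 2

def backward(params, x, y):
    """Backward pass: propagate the loss from the output towards the input
    with the chain rule. Returns gradients for [w1, b1, w2, b2]."""
    w1, b1, w2, b2 = params
    h, y_hat = forward(params, x)
    d_out = y_hat - y            # dL/dy_hat
    d_w2 = d_out * h             # output-layer weight
    d_b2 = d_out                 # output-layer bias
    d_h = d_out * w2             # error sent back to the hidden unit
    d_z = d_h * h * (1.0 - h)    # through the sigmoid derivative
    return [d_z * x, d_z, d_w2, d_b2]

# Hypothetical parameters and a single training example.
params = [0.5, -0.2, 1.5, 0.1]
x, y = 2.0, 1.0
print(backward(params, x, y))
```

The factor `h * (1 - h)` is at most 0.25, so each additional sigmoid layer can shrink the error signal further, which is the mechanism behind vanishing gradients.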
Optimization is the task of minimizing or maximizing an
objective function. Deep Learning relies on optimization algorithms, such as gradient descent or
momentum, to minimize the cost function (also known as the loss function or error function) by
changing the learnable parameters, which in our case are the weights and the biases, in order to
improve accuracy. The following paragraphs present some of the optimization techniques used
in Deep Learning.
Gradient Descent: it is the most widely used optimization algorithm to train deep neural net-
works (Ruder, 2016). It is used to minimize the target function by iteratively moving in the
direction of steepest descent as defined by the negative of the gradient. In other words, the gra-
dient descent algorithm uses the derivatives of a function to follow the function downhill to a
minimum (Ruder, 2016).
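
A minimal sketch of gradient descent on a one-dimensional function is shown below. The objective f(w) = (w - 3)^2, the learning rate and the number of steps are illustrative assumptions.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, i.e. in the direction of steepest descent."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Hypothetical objective f(w) = (w - 3)^2 with derivative 2 * (w - 3); minimum at w = 3.
grad_f = lambda w: 2.0 * (w - 3.0)

print(gradient_descent(grad_f, w0=0.0))
```

With this learning rate the distance to the minimum shrinks by a constant factor of 0.8 at every step.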
Momentum: it is also called Stochastic Gradient Descent with momentum. The update applied
at the previous iteration is remembered, and the next update is determined
as a linear combination of the gradient and the previous update. Momentum can guarantee a
better convergence rate than gradient descent in certain situations: for smooth convex
objectives, Nesterov's accelerated variant achieves a global
convergence rate of O(1/T^2) after T steps, in contrast to the O(1/T) of gradient descent
(Sutskever et al., 2013).
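
The momentum update can be sketched in its classical form, where a velocity term accumulates past updates. The coefficient values and the quadratic objective are assumptions; the gradient here is exact, whereas in stochastic training it would be estimated on a mini-batch.

```python
def sgd_momentum(grad, w0, lr=0.1, mu=0.9, steps=500):
    """Classical momentum: the new update mixes the previous update (velocity)
    with the current gradient step."""
    w, v = w0, 0.0
    for _ in range(steps):
        v = mu * v - lr * grad(w)   # remember a fraction mu of the last update
        w += v
    return w

# Hypothetical objective f(w) = (w - 3)^2 with derivative 2 * (w - 3).
grad_f = lambda w: 2.0 * (w - 3.0)

print(sgd_momentum(grad_f, w0=0.0))
```

Setting `mu=0.0` recovers plain gradient descent, which makes the role of the remembered update easy to isolate.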
Conclusion
In this chapter, we presented the fundamental concepts of recommender systems as well as some
recommendation approaches, namely the Content-Based, Collaborative Filtering and Hybrid
approaches. Then, we presented the fundamental concepts and theory of Deep Learning, such as
neural networks, backpropagation and optimization techniques. In the next chapter, we present a
literature review of the application of Deep Learning in the recommendation field.
Chapter 2
Deep Learning based Recommender
System Review
Introduction
Deep Learning technology has achieved great success in many complex tasks, including Image
Recognition (Wan et al., 2014), Natural Language Processing (Sarikaya et al., 2014), Automatic
Speech Recognition (Hinton et al., 2012) and Biomedical Informatics (Holzinger & Jurisica, 2014).
In the last three years, the recommender systems community has taken an interest in this advanced
technology, and several new recommendation methods have relied on it to enhance recommendation
performance and provide better recommendations to users (Cheng et al., 2016). Aiming to
show the state of the art in applying Deep Learning to the recommender systems field, this
chapter presents a literature review of Deep Learning applied to the most common
recommendation approaches: Content-Based, Collaborative Filtering, Hybrid and
Advanced approaches. Section 2.1 reviews the application of Deep Learning in the Content-Based
approach. Section 2.2 reviews the application of Deep Learning in Collaborative Filtering.
Section 2.3 and Section 2.4 present, respectively, the application of Deep Learning to Hybrid and
Advanced approaches. Finally, Section 2.5 discusses all the presented works.
Van den Oord et al. (2013) proposed training a deep Convolutional Neural Network (CNN) to
predict latent factors from music audio data, in order to extract the characteristics that affect user
preference directly from audio signals. An experimental study on the Million Song Dataset and
on an industrial-scale dataset with audio excerpts of over 380,000 songs shows better results than
the traditional approach, which uses a bag-of-words representation of the audio signals and is
very often used in the field. In other words, their results show the benefit of using deep CNNs in
the music information retrieval field.
X. Wang & Wang (2014) combined a Deep Belief Network with a Probabilistic
Graphical Model to simultaneously learn features from music content and make personalized
recommendations. On the Echo Nest Taste Profile dataset, the proposed solution outperforms
existing recommendation models such as Probabilistic Matrix Factorization in
both the warm-start and cold-start settings.
Bansal et al. (2016) applied deep Recurrent Neural Networks (RNN) to encode the text
sequence into a latent vector in order to enhance Collaborative Filtering accuracy and solve
the cold-start problem. The model was evaluated on two real-world datasets in the domain
of scientific paper recommendation, and the experimental study shows that the proposed method
outperforms previous state-of-the-art methods, specifically the previous RNN model based on
gated recurrent units, in terms of recommendation accuracy.
L. Zheng et al. (2017) relied on review information to build a deep model named Deep
Cooperative Neural Networks (DeepCoNN) that jointly learns item properties and user behaviors.
The proposed approach uses two parallel neural networks: the first focuses on learning the
behaviors of the user and the second learns item properties from the reviews written by users for
the item. The two parallel representations are coupled by a shared layer, which enables the latent
factors learned for users and items to interact in a manner similar to factorization machine
techniques. An experimental study on three real-world datasets (Yelp reviews, Amazon reviews
and Beer reviews) shows that the proposed model outperforms state-of-the-art models such as
Matrix Factorization and Probabilistic Matrix Factorization.
X. Wang et al. (2017) relied on a Dynamic Attention Deep Model to develop a news
recommendation approach that addresses the problem of selecting news articles for users from a
database of articles, also called the selection pool. The proposed solution performs recommendation
by first letting editors select a subset of articles from a dynamically changing pool of
articles filled from various news feed sources. Then, it applies convolutional neural networks
and a wide model to automatically learn the editors' underlying selection criteria. An experimental
study conducted on a commercial API platform linked to a Chinese finance article feed
platform shows that the proposed model performs well on both accuracy metrics, AUC
and F1 score, compared to other models such as a CNN-based model and the Wide & Deep model.
The next section presents the application of this technology to the Collaborative Filtering
approach, which is considered the most widely used recommendation technique in industry.
H. Wang et al. (2015) proposed Collaborative Deep Learning (CDL), which jointly performs
Deep Learning on the content information and Collaborative Filtering on the ratings matrix in
order to address the sparsity of both the ratings and the auxiliary information, which limits
Collaborative Topic Regression based approaches. An empirical study on three real-world datasets,
two CiteULike datasets and one Netflix dataset, shows that the proposed approach can significantly
outperform state-of-the-art Collaborative Filtering models such as Collective Matrix Factorization
and Collaborative Topic Regression.
Li et al. (2015) relied on Deep Feature Learning to enhance the Collaborative Filtering approach,
more precisely a specific model-based approach called Matrix Factorization. The proposed model
learns effective latent factors from both user-item ratings and side information. A comparison was
performed on five real-world datasets, including MovieLens-100k, MovieLens-1M, Book-Crossing
and an Advertising dataset, and the results show that the enhanced method compares favorably with
state-of-the-art methods that use the Matrix Factorization model.
H. Liang & Baldwin (2015) enhanced Collaborative Filtering by developing a new method
called Probabilistic Rating Auto-Encoder in order to perform unsupervised feature learning and
generate underlying user profiles from large-scale user rating data. An experimental study on the
Yelp dataset shows that the proposed model performs well compared to model-based approaches such
as Matrix Factorization.
Devooght & Bersini (2016) proposed a new Collaborative Filtering approach based on an RNN
model. In this setting, Collaborative Filtering recommendation is viewed as a sequence
prediction problem, and a Long Short-Term Memory (LSTM) model is trained in order
to capture the evolution of the user's taste over time. A comparison with standard Nearest
Neighbors and Matrix Factorization methods on the MovieLens and Netflix datasets shows that the
proposed approach outperforms them in terms of item coverage and short-term predictions.
Ko et al. (2016) proposed a new extension of the Collaborative Sequence model based on RNN
in order to analyze sequences of user activities, which are common in modern recommender
systems. An empirical study on two different tasks, music recommendation and mobility
prediction, using two real-world datasets from these fields shows that the model outperforms
Non-Collaborative methods.
Y. Zheng et al. (2016) proposed a new Model-Based Collaborative Filtering approach named
CF-NADE, based on the Neural Autoregressive Distribution Estimator. CF-NADE is inspired by
the work on Restricted Boltzmann Machine based Collaborative Filtering (RBM-CF) (Salakhutdinov
et al., 2007). The authors also proposed a factored version of CF-NADE to deal with large-scale
datasets efficiently. The model is evaluated on three real-world datasets, MovieLens 1M,
MovieLens 10M and Netflix, and the results show that CF-NADE outperforms a wide range of
state-of-the-art Collaborative Filtering methods such as user-based RBM and Biased Matrix
Factorization.
Kuchaiev & Ginsburg (2017) developed a new model for the rating prediction task in order to
improve prediction accuracy in recommender systems. The model is based on an Autoencoder
with 6 layers, and the authors show that such a model can perform much better than shallow neural
networks. Further, they introduced iterative output re-feeding, a technique that performs
dense updates in Collaborative Filtering, allows a higher learning rate and further improves the
generalization performance of the model. They demonstrate the versatility of their model by
applying it to a time-split Netflix dataset. Results show that the proposed solution outperforms
other Deep Learning based approaches even without using additional temporal signals.
H.-J. Xue et al. (2017) proposed a new extension of Matrix Factorization, which is a model-based
recommendation approach, using a neural network. The model uses both explicit ratings
and implicit feedback. They also proposed a novel loss function so that both explicit and
implicit feedback are considered during training. Experiments on several benchmark datasets
such as MovieLens, Amazon Music and Amazon Movies demonstrate the effectiveness of the
proposed model and show a 7.5 percent improvement in the Normalized Discounted Cumulative
Gain (NDCG) metric.
Ebesu & Fang (2017) addressed the cold start problem and proposed a probabilistic modeling
approach called Neural Semantic Personalized Ranking (NSPR) to unify the strengths of deep
neural networks and pairwise learning. The model relies on Deep Learning to extract semantic
representations of items and couples them with the latent factors learned from implicit feedback.
An empirical study on two datasets, Netflix and CiteULike, shows that the approach outperforms
Matrix Factorization and topic regression based Collaborative Filtering approaches.
Cao et al. (2017) introduced a new Collaborative Filtering extension based on the Stacked
Denoising Autoencoder, an unsupervised Deep Learning method used to extract useful
low-dimensional features from the original sparse user-item matrices. The experimental study
shows that the proposed method performs well compared to methods based on Matrix Factorization
or item-based Collaborative Filtering.
He et al. (2017) relied on neural network models to enhance Collaborative Filtering. They
proposed a general framework named Neural network-based Collaborative Filtering (NCF).
Extensive experiments on two publicly accessible datasets, MovieLens and Pinterest, show
significant improvements of the proposed framework over state-of-the-art Collaborative Filtering
methods such as Matrix Factorization approaches and Item-Based Collaborative Filtering.
The next section presents the application of Deep Learning to Hybrid approaches.
D. Liang et al. (2015) proposed a new hybrid method that exploits different sources of
information by combining the Content-Based and Collaborative Filtering approaches in order to
enhance music recommendation. The proposed system has two main steps. First, the Content-Based
model trains a multi-layer neural network to produce semantic tags from vector-quantized acoustic
features, and the output of the last hidden layer is treated as a high-level representation of the
musical content. Then, the obtained representation is used as a prior for the song latent
representation in Collaborative Filtering. The proposed system is evaluated on the Million Song
Dataset, and the results are better than those of Collaborative Filtering approaches. In addition,
the system achieves state-of-the-art performance in music recommendation given content and
implicit feedback data.
Strub et al. (2016) developed a new model that uses an Autoencoder to enhance the
Collaborative Filtering model. They introduced a training loss and process suited to Autoencoders
on incomplete data, and used side information to alleviate the cold start problem.
The proposed framework integrates both ratings and side information into a single network, which
yields a scalable and robust approach that outperforms state-of-the-art Collaborative Filtering
methods.
Kumar et al. (2017) used an RNN with Neural Attention to combine user-item Collaborative
Filtering with the content of the news articles read, in order to tackle the problem of
changing and diverse reading interests of users as well as the cold start problem. An empirical
study on the real-world CLEF NewsREEL dataset shows the effectiveness of the model over other
baseline methods such as Item-Based Collaborative Filtering and Matrix Factorization.
Sottocornola et al. (2017) provided a new architecture based on Deep Learning for Hybrid
recommender systems. The proposed architecture combines two information sources: natural
language text and user ratings. Natural language text is used to learn a user-specific
content-based classifier, while user ratings are used to develop user-adaptive Collaborative
Filtering recommendations. The results of both methods are combined using a feed-forward neural
network. An evaluation conducted on the MovieLens dataset shows that the proposed method
outperforms state-of-the-art methods such as Matrix Factorization.
Dong et al. (2017) introduced a hybrid Collaborative Filtering model based on Matrix
Factorization and an additional Stacked Denoising Autoencoder, which is used to model the side
information. The model outperforms several Collaborative Filtering techniques on the MovieLens
and Book-Crossing datasets.
Kim et al. (2017) proposed a novel document Context-Aware Hybrid method that integrates
CNN and Probabilistic Matrix Factorization in order to capture contextual information in
description documents. An experimental study conducted on three real-world datasets, two
MovieLens datasets and the Amazon Instant Video dataset, shows that the technique performs well
compared to non-Deep Learning approaches such as Matrix Factorization and Collaborative Topic
Regression, as well as Deep Learning approaches such as Collaborative Deep Learning.
In the next section, we present a literature review of the application of Deep Learning to two
advanced recommendation approaches: Social-Based recommendation and Context-Aware
recommendation.
Rawat & Kankanhalli (2016) introduced a Context-Aware model that predicts multiple tags
for an image, which can then be recommended to a user. This model uses a CNN to learn image
features, and the context representations are processed by two neural network layers. Then, the
outputs of these two neural networks are concatenated in order to predict the tag. The proposed
solution was evaluated on the YFCC100M dataset provided by the organizers of the Yahoo-Flickr
Grand Challenge, and the results showed a significant improvement in prediction accuracy after
adding context information to the process.
Kim et al. (2016) developed a new Context-Aware recommendation approach, ConvMF, that
seamlessly integrates CNN into Probabilistic Matrix Factorization in order to capture contextual
information in description documents for rating prediction. The model considers contextual
information such as surrounding words and word order in description documents, which provides
a deeper understanding of these documents and can enhance rating prediction accuracy. An
experimental study on two real datasets, MovieLens and Amazon reviews, shows that the proposed
method significantly outperforms state-of-the-art recommendation models such as Collaborative
Deep Learning, Collaborative Topic Regression and Probabilistic Matrix Factorization.
Smirnova & Vasile (2017) developed a Context-Aware Recommender System based on
Conditional Recurrent Neural Networks. The model injects contextual information into the
recommendation process in order to suggest the next items. Experimental results
show that the proposed solution achieves better results than sequential and non-sequential
state-of-the-art baseline methods.
Deng et al. (2017) relied on Deep Learning to develop a new Trust-Based approach for
recommendation in social networks. The proposed method uses Deep Learning to initialize user
and item latent feature vectors for Trust-Aware Social recommendations and to separate the
community effect in users' trusted friendships. Extensive experiments conducted on two real-world
social network datasets, Epinions and Flixster, show that the method performs well compared with
several variations of Matrix Factorization approaches.
Ding et al. (2017) proposed the BayDNN model, which combines Bayesian Personalized Ranking
and Deep Neural Networks in order to perform friendship recommendation in social networks.
The model first extracts latent deep structural feature representations from the input network
data using a one-dimensional Convolutional Neural Network, then applies Bayesian Personalized
Ranking learning to capture users' preferences based on these extracted features. An experimental
study on the Epinions and Slashdot datasets shows that the proposed approach outperforms baseline
approaches such as Matrix Factorization algorithms, Katz similarity, Adamic/Adar similarity
and a simple pairwise input neural network.
The next section discusses the application of Deep Learning to the recommendation
approaches presented above.
2.5 Discussion
work with complex and heterogeneous data simultaneously. On the other hand, the application of
Deep Learning is relatively limited in Content-Based recommendation, although it has shown
tremendous success on large amounts of content such as images, music, video and text. Finally,
advanced approaches such as Context-Aware recommendation and Social-Based recommendation
have attracted great interest within the recommender systems community owing to their ability
to overcome the drawbacks of conventional recommendation models. Therefore, applying Deep
Learning technology to such approaches can attract much more interest within the recommendation
community.
Figure 2.2 shows a summary of all presented papers by application domain. Based on the
application domains and datasets used in the experimental studies, the presented papers show that
Deep Learning based recommendation systems have been applied in several domains including
music, video, news, e-commerce and social media. Moreover, the entertainment industry
accounts for a large share of these applications, owing to the availability of rich datasets.
Figure 2.2: Statistics of publication counts in four Recommender Systems categories by
application domain.
Advanced approaches such as Context-Aware recommendation have attracted great interest within
the recommender systems field. Indeed, such an approach can exploit as much information as
possible, which can enhance recommendation performance. Thus, in this
thesis, we rely on Deep Learning technology to design a new Context-Aware recommendation
model in order to outperform the classical recommendation approaches. The model, called
Deep Learning Based Context recommendation, is the subject of the next chapter.
Conclusion
This chapter presented a literature review of the application of Deep Learning technology in the
recommender systems field. As shown, this area of research is very young; thus there is much work
to do, especially in Hybrid and advanced approaches. The next chapter presents the proposed
method, which applies Deep Learning to one of the advanced recommendation methods, namely
Context-Aware recommendation.
Chapter 3
Deep Learning Approach for
Context-Aware Recommendation
Introduction
This chapter is dedicated to the presentation of our proposed method, which combines Deep
Learning technology with the Context-Aware recommendation method in order to improve
recommendation performance. The chapter is organized as follows: Section 3.1 gives a brief
introduction to Context-Aware recommender systems. Then, Section 3.2 presents the
proposed method, which uses Deep Learning with Context-Aware recommendation in order to
enhance recommendation performance. Finally, Section 3.3 presents the implementation details.
textual situation at the time of providing the recommendations (Adomavicius & Tuzhilin, 2015).
For instance, Recommender System applications such as recommending a vacation package,
personalized content on a website or music can depend on contextual information such
as the mood of the user, the place, etc. Thus, a Context-Aware Recommender System aims to
enhance personalized recommendations by contextualizing the recommendation process
(Adomavicius & Tuzhilin, 2015).
Three main approaches have been proposed to perform recommendation according to
contextual information (Campos et al., 2013; Haruna et al., 2017): Contextual Pre-filtering,
Contextual Post-filtering and Contextual Modeling. In the Contextual Pre-filtering approach,
contextual information is used to select the relevant set of data records (ratings). Then, ratings
can be predicted on the selected data using any conventional Recommender System such as
Collaborative Filtering (Codina et al., 2016). In Contextual Post-filtering, the recommendation
algorithm initially ignores the contextual information. The ratings are predicted using any
traditional Recommender System on the whole data set. Then, the resulting set of recommendations
is adjusted for each user using the contextual information (Ramirez-Garcia & García-Valdez,
2014). Finally, Contextual Modeling is considered a more sophisticated approach in which
contextual data is integrated directly into the prediction model. This gives rise to truly
multidimensional recommendation, that is, predictive models that can be built using Machine
Learning algorithms such as regression, decision trees or probabilistic models (H. Wu et al.,
2015). In this thesis, Contextual Modeling is the framework of our proposed approach.
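The difference between Contextual Pre-filtering and Contextual Post-filtering can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy ratings, the use of a per-item mean as a stand-in for "any conventional Recommender System", and the particular post-filtering rule (keep only items already rated in the target context).

```python
from collections import defaultdict

# Hypothetical toy ratings: (user, item, mood, rating).
RATINGS = [
    ("Alia", "Titanic", "Sad", 5),
    ("Alia", "Avengers", "Happy", 4),
    ("Ahmed", "Titanic", "Happy", 2),
    ("Ahmed", "Star Wars", "Happy", 5),
    ("Adel", "Titanic", "Sad", 4),
    ("Adel", "Star Trek", "Normal", 3),
]


def item_means(rows):
    """Stand-in for a context-free recommender: predict each item's mean rating."""
    totals = defaultdict(lambda: [0.0, 0])
    for _user, item, _mood, rating in rows:
        totals[item][0] += rating
        totals[item][1] += 1
    return {item: s / n for item, (s, n) in totals.items()}


def prefilter_predict(rows, mood):
    """Pre-filtering: select the ratings given in the target context first,
    then run the context-free predictor on that subset only."""
    selected = [r for r in rows if r[2] == mood]
    return item_means(selected)


def postfilter_predict(rows, mood):
    """Post-filtering: predict on the whole data set, then adjust the result
    with the context (here: keep items rated at least once in that context)."""
    predictions = item_means(rows)
    seen_in_context = {r[1] for r in rows if r[2] == mood}
    return {item: p for item, p in predictions.items() if item in seen_in_context}


sad_pre = prefilter_predict(RATINGS, "Sad")
sad_post = postfilter_predict(RATINGS, "Sad")
```

With this toy data, both strategies return only Titanic for the context "Sad", but pre-filtering averages the two ratings given in that context while post-filtering averages all three Titanic ratings. Contextual Modeling, the approach followed in this thesis, would instead feed the mood to the predictor as an input feature.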
In the second chapter, we provided a literature review of the application of Deep Learning,
which is considered a Model-Based approach, in the field of Recommender Systems (Singhal
et al., 2017). We showed that this technology can achieve higher performance
than standard state-of-the-art recommendation approaches, including Memory-Based approaches
such as similarity-based methods and Model-Based approaches such as Matrix Factorization.
Thus, we rely on Deep Learning to improve recommendation accuracy using
contextual information, and we propose a new Context-Aware Recommender based on this
state-of-the-art technology. The next section presents the proposed method.
Standard Recommender Systems that aim to predict ratings can be defined as a regression
problem over users $U = \{u_1, u_2, \ldots\}$ and items $I = \{i_1, i_2, \ldots\}$, where the
target function to be estimated is as follows (Y. Zheng et al., 2015a):
\[ Y : U \times I \rightarrow \mathbb{R} \tag{3.1} \]
where $Y$ is the rating value (e.g., 1 to 5 stars). On the other hand, in Context-Aware Recommender
Systems, the rating prediction is assumed to be influenced by additional contextual
information $C_j = \{c_{j,1}, c_{j,2}, \ldots\}$. Examples of contextual information are the time at
which a rating was given by the user, the mood of the user, etc. Hence, with such additional
information, the rating prediction problem can be formalized as follows (Y. Zheng et al., 2015a):
\[ Y : U \times I \times C_1 \times C_2 \times \cdots \times C_N \rightarrow \mathbb{R} \tag{3.2} \]
In this setting, the task of rating prediction using contextual information is to estimate a function
$\hat{Y}_{u,i,c_j}$ that can predict the rating $Y_{u,i,c_j}$ for any combination of user $u$, item $i$ and context $c_j$.
the layers; $L_{rate}$: learning rate of the optimization algorithm; $I_{max}$: maximum number of
iterations of the optimization algorithm
Output: $P_{I_{max}}$: parameters of the model
/* Parameter initialization */
Initialize parameters $P_0$ using $L_{dims}$; initialize $\hat{Y}_0$, $L^{0}_{cost}$ and $grad_0$
/* Learning part */
for $i = 0, \ldots, I_{max} - 1$
    Calculate $\hat{Y}_{P_i}$ using forward propagation through the $L_{dims}$ layers
    Calculate the loss cost $L^{i}_{cost}(\hat{Y}_{P_i}, Y)$
    Calculate $grad_{\hat{Y}_{P_i}}$ using backward propagation and $Y$
    Calculate parameters $P_{i+1}$ using $P_i$, $grad_{\hat{Y}_{P_i}}$ and $L_{rate}$
return $P_{I_{max}}$
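The loop structure of this algorithm (forward propagation, loss, backward propagation, parameter update) can be sketched in plain Python. This is only a simplified illustration, not the implementation used in this work: a single linear layer stands in for the multi-layer network, the loss is the mean squared error, and the function name `train`, the toy data and the hyperparameter values are all hypothetical.

```python
import random


def train(X, Y, l_rate=0.1, i_max=1000, seed=0):
    """Gradient-descent loop: forward pass, loss, backward pass, update."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Parameter initialization (P_0): small random weights and a zero bias.
    w = [rng.uniform(-0.1, 0.1) for _ in range(d)]
    b = 0.0
    history = []
    for _ in range(i_max):
        # Forward propagation: predictions under the current parameters.
        y_hat = [sum(wj * xj for wj, xj in zip(w, x)) + b for x in X]
        # Loss cost (mean squared error between predictions and targets).
        errors = [p - y for p, y in zip(y_hat, Y)]
        history.append(sum(e * e for e in errors) / n)
        # Backward propagation: gradient of the loss w.r.t. each parameter.
        grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, X)) for j in range(d)]
        grad_b = 2.0 / n * sum(errors)
        # Parameter update (P_{i+1}) using the learning rate.
        w = [wj - l_rate * gj for wj, gj in zip(w, grad_w)]
        b -= l_rate * grad_b
    return w, b, history


# Hypothetical toy data generated by y = 2*x1 + 1*x2 + 0.5.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [0.5, 2.5, 1.5, 3.5]
w, b, history = train(X, Y)
```

On this toy problem the loss recorded in `history` decreases and the parameters approach the generating values. In a deep model the same four steps are applied, with backward propagation carrying the gradient through every hidden layer.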
To further describe the proposed model, Figure 3.1 shows the multi-layer architecture
of the proposed Deep Context-Aware Recommender System. The output of one layer serves as
the input of the next one. The bottom input layer consists of three groups of feature vectors: the
user feature vector, the item feature vector and the N context feature vectors. Next to the input
layer is the embedding layer, a fully connected layer that transforms the
sparse data representation into dense representations suitable for neural network layers. Then, the
embedded feature vectors are fed into a concatenation layer that merges all the feature vectors
into a single layer. The next stage is the multi-layer neural network architecture, which
maps the concatenated input layer to the prediction score layer. Finally, the output layer produces
the predicted score $\hat{Y}_{u,i,c_j}$, where $u$, $i$ and $c_j$ denote respectively the user, the item and the
context, and training is performed by minimizing the cost between $\hat{Y}_{u,i,c_j}$ and the target value $Y_{u,i,c_j}$.
Figure 3.1: The architecture of the proposed Deep Context-Aware Recommendation method
In the next part, we present each step described in Figure 3.1 and give implementation
details of the proposed method using a Deep Learning framework.
Context-Aware Recommendation data sets are usually represented as categorical feature
triplets of user, item and context (Adomavicius & Tuzhilin, 2015). Figure 3.2
presents an example of Context-Aware data where:
U = {Alia, Ahmed, Adel}
I = {Titanic, Avengers, StarWars, StarTrek}
C1 = mood = {Sad, Normal, Happy}
Figure 3.2 shows how Context-Aware Recommendation data can be transformed into feature
values used by the prediction algorithm. The transformation consists of encoding the categorical
variables (user, item and context) with indicator variables. Following the proposed
example, the first tuple (Alia, Titanic, Sad, 5) means that Alia rated Titanic with 5 stars when
she was sad. After the transformation, this tuple yields (1, 1, 1, 5), which can be used to train
the prediction algorithm.
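As a minimal sketch of this transformation, the following Python code maps each category to an index and to the corresponding indicator vector. It assumes 1-based indices assigned in order of first appearance, which reproduces the (1, 1, 1, 5) example; the helper names `build_vocabulary`, `one_hot` and `encode` are hypothetical.

```python
def build_vocabulary(values):
    """Map each distinct category to a 1-based index, in order of first appearance."""
    vocab = {}
    for v in values:
        if v not in vocab:
            vocab[v] = len(vocab) + 1
    return vocab


def one_hot(index, size):
    """Indicator vector with a single 1 at the 1-based position `index`."""
    return [1 if k == index else 0 for k in range(1, size + 1)]


USERS = ["Alia", "Ahmed", "Adel"]
ITEMS = ["Titanic", "Avengers", "StarWars", "StarTrek"]
MOODS = ["Sad", "Normal", "Happy"]

user_vocab = build_vocabulary(USERS)
item_vocab = build_vocabulary(ITEMS)
mood_vocab = build_vocabulary(MOODS)


def encode(record):
    """Turn a (user, item, mood, rating) tuple into index form."""
    user, item, mood, rating = record
    return (user_vocab[user], item_vocab[item], mood_vocab[mood], rating)


encoded = encode(("Alia", "Titanic", "Sad", 5))          # (1, 1, 1, 5)
ahmed_indicator = one_hot(user_vocab["Ahmed"], len(USERS))  # [0, 1, 0]
```

The index form is compact and is what an embedding layer typically consumes, while the indicator form makes explicit that exactly one variable per categorical feature is active.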
In addition to the user and the item data, the context features can be categorical. A categorical
feature can be binary, such as whether the user is male or female, or take several possible values,
such as occupation (doctor, teacher, etc.), location (home, cinema, work, party, etc.), time
(morning, evening, weekend, weekday, etc.) or mood (sad, happy). There can also be numerical
features such as the age of the user or the temperature (Adomavicius & Tuzhilin, 2015).
The proposed model learns a dense embedding for each categorical feature in a fixed
vocabulary. The embedding technique, inspired by continuous bag-of-words language
models (Mikolov et al., 2013), maps categorical features to dense representations suitable
for neural networks, and it has been used in several Deep Learning recommendation
models (He et al., 2017; L. Zheng et al., 2017; Cheng et al., 2016; Kim et al., 2016). The embedding
layer in the proposed model derives low-dimensional, dense tables (Mikolov et al.,
2013), and the model also normalizes the other inputs, such as the age and the gender. After
this process, the model flattens these tables into low-dimensional vectors and concatenates them
with the other input information into one layer that is fed into the hidden layers of the
neural network.
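The embedding lookup and concatenation step can be sketched as follows. This is an illustrative sketch only: the embedding size of 4, the random initialization range, the min-max normalization of the age to [0, 1] and all names are assumptions, and in the actual model the table entries are parameters learned during training rather than fixed random values.

```python
import random


def make_embedding_table(vocab_size, dim, rng):
    """One dense row of `dim` floats per category (0-based row index)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(vocab_size)]


def min_max_normalize(value, low, high):
    """Scale a numerical feature into [0, 1]."""
    return (value - low) / (high - low)


rng = random.Random(42)
EMBED_DIM = 4  # hypothetical embedding size
user_table = make_embedding_table(3, EMBED_DIM, rng)  # 3 users
item_table = make_embedding_table(4, EMBED_DIM, rng)  # 4 items
mood_table = make_embedding_table(3, EMBED_DIM, rng)  # 3 moods


def build_input(user_idx, item_idx, mood_idx, age):
    """Look up each embedding and concatenate it with the normalized age."""
    return (
        user_table[user_idx]
        + item_table[item_idx]
        + mood_table[mood_idx]
        + [min_max_normalize(age, 0, 100)]
    )


x = build_input(0, 0, 0, 25)
```

The resulting vector has 3 × 4 + 1 = 13 entries, regardless of how many users, items or moods exist, which is what makes the dense representation practical as input to the hidden layers.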
After data embedding and concatenation, the model applies the hidden neural network layers
using Algorithm 3.1. Each layer performs the following computation:
\[ a^{(l+1)} = f\left(W^{(l)} a^{(l)} + b^{(l)}\right) \tag{3.3} \]
where $l$ is the layer number, $f$ is the activation function, $a^{(l+1)}$ is the activation, and $W^{(l)}$
and $b^{(l)}$ are respectively the weight and the bias parameters of layer $l$. There exist several
activation functions used in Deep Learning, such as the sigmoid, ReLU (which
stands for Rectified Linear Unit) and Tanh activations (Schmidhuber, 2015). ReLU has often shown
good results with deep models compared to other functions (Le et al., 2015); thus,
in the proposed method we apply ReLU in each hidden layer. The number of hidden
layers determines the model's capacity: with few hidden layers
the model may give weak predictions, whereas with more hidden
layers it tends to give more accurate results. The choice of the number of layers often
depends on the processing capabilities of the machine, because increasing the number of layers
increases computation time (He et al., 2017). Hence, we keep the number of layers as a
parameter that is set by the user.
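A minimal sketch of the forward pass through the hidden layers is given below, assuming the usual fully connected form a(l+1) = f(W(l) a(l) + b(l)) with ReLU in the hidden layers and a linear output unit. The weights are hypothetical values chosen so that the result can be checked by hand; the function names are not taken from any library.

```python
def relu(z):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in z]


def dense(a, W, b, activation=None):
    """Compute f(W a + b) for one layer; W holds one row per output unit."""
    z = [sum(w_k * a_k for w_k, a_k in zip(row, a)) + b_j for row, b_j in zip(W, b)]
    return activation(z) if activation else z


def forward(x, layers):
    """Apply ReLU hidden layers, then a linear output layer with one unit."""
    a = x
    for W, b in layers[:-1]:
        a = dense(a, W, b, relu)
    W_out, b_out = layers[-1]
    return dense(a, W_out, b_out)[0]


# Hypothetical parameters: one hidden layer (2 inputs -> 2 units) and an output layer.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -3.0]),
    ([[2.0, 3.0]], [1.0]),
]
score = forward([3.0, 1.0], layers)
```

Here the hidden pre-activations are 2 and -1, ReLU turns them into 2 and 0, and the output is 2 × 2 + 3 × 0 + 1 = 5. Because `layers` is an ordinary list, the number of hidden layers stays a free parameter, in line with the design choice described above.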
The output layer is used to predict the ratings $\hat{Y}_{u,i,c_j}$. The algorithm computes the loss cost
between the predicted ratings $\hat{Y}_{u,i,c_j}$ and the true ratings $Y_{u,i,c_j}$, and training minimizes this
cost. In the proposed model we use either the Mean Absolute Error (MAE) or the
Mean Squared Error (MSE) loss function, both of which are suitable for recommendation systems.
MAE measures the average of the absolute differences between the predicted rating and the actual
rating as follows:
\[ L(\hat{Y}, Y) = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{Y}_i - Y_i \right| \tag{3.4} \]
while the Mean Squared Error is the average of the squared differences between the predicted
and the actual ratings:
\[ L(\hat{Y}, Y) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{Y}_i - Y_i \right)^2 \tag{3.5} \]
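The two loss functions, together with the gradients with respect to the predictions that backward propagation starts from, can be written as a short Python sketch. The function names and the example values are hypothetical; for MAE the code uses a subgradient, taking 0 where a prediction equals its target.

```python
def mae_loss(y_hat, y):
    """Mean of absolute differences between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(y_hat, y)) / len(y)


def mse_loss(y_hat, y):
    """Mean of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(y_hat, y)) / len(y)


def mae_gradient(y_hat, y):
    """Subgradient of MAE w.r.t. each prediction: sign(p - t) / n."""
    n = len(y)
    return [((p > t) - (p < t)) / n for p, t in zip(y_hat, y)]


def mse_gradient(y_hat, y):
    """Gradient of MSE w.r.t. each prediction: 2 (p - t) / n."""
    n = len(y)
    return [2.0 * (p - t) / n for p, t in zip(y_hat, y)]


predicted = [4.0, 3.0, 5.0, 1.0]  # hypothetical predicted ratings
actual = [5.0, 3.0, 4.0, 3.0]     # hypothetical true ratings
mae_value = mae_loss(predicted, actual)  # (1 + 0 + 1 + 2) / 4 = 1.0
mse_value = mse_loss(predicted, actual)  # (1 + 0 + 1 + 4) / 4 = 1.5
```

The gradients show the practical difference between the two choices: the MSE gradient grows with the size of the error, so large mistakes dominate the update, whereas the MAE gradient has constant magnitude and is therefore less sensitive to outlying ratings.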
In the next section we present some Deep Learning tools that can be used to implement the
proposed method.
This section presents the tools available for practical implementation of Deep Learning models
and presents the tool that will be used to implement the proposed model.
Dozens of Deep Learning tools are available today. In what follows, we present some of the
most widely used frameworks: Theano 1, Pytorch 2, Caffe 3, TensorFlow 4 and Keras 5.
Theano: was the first widely used Deep Learning framework, created by Yoshua Bengio at the
University of Montreal in 2007. Theano is a Python based library and a low level Deep Learning
framework that is extremely fast and powerful. In 2017, the research team announced
that Theano would no longer be supported.
Pytorch: is also a Python based library, released by Facebook in early 2017. Pytorch
is considered a simple framework that offers high speed and flexibility and supports
dynamic computational graphs, which can help in analyzing unstructured data. Furthermore, it
can use the Graphics Processing Unit (GPU) capability of the hardware, which can yield very
efficient models. One drawback of this tool is that it is still in beta and does not yet have
enough community support.
1 http://deeplearning.net/software/theano/
2 https://pytorch.org/
3 http://caffe.berkeleyvision.org/
4 https://www.tensorflow.org/
5 https://keras.io/
Caffe: was developed by Berkeley Artificial Intelligence Research. It is a suitable tool for
Convolutional Neural Network models and is designed with three priorities: expression,
speed and modularity. Its successor, Caffe2, was introduced by Facebook in
2017 and provides users with pre-trained models that can be used to build demo applications
without any extra hassle.
TensorFlow: is an open source, Python based Deep Learning framework that performs
numerical computation using data flow graphs. It was developed by the Google Brain Team to
conduct machine learning and Deep Learning research. Today, TensorFlow is considered the most
commonly used Deep Learning framework and it is supported by a large community. In the last
few years, it has been adopted by many large companies such as Twitter, Uber and eBay.
Keras: is a high level Deep Learning framework that simplifies building deep models.
It is a Python based library that runs on top of low level frameworks such as
Theano and TensorFlow. In 2018, Google backed Keras, which will be included in the
coming TensorFlow releases.
Among all these Deep Learning tools for numerical computation, we choose to implement
the proposed method in the Python programming language, with TensorFlow as the low level
framework because of its maturity and its large community support. We also
use Keras as the high level framework, which makes the model implementation easier.
Conclusion
we will use to implement our model. In the next chapter, we present an experimental study
to evaluate the ability of our method to perform personalized recommendation.
Chapter 4
Experimental Study
Introduction
In order to evaluate the ability of the proposed Deep Context-Aware Recommendation method to
perform prediction on recommendation data set, we present an experimental study on real data
sets. The effectiveness of the proposed method is evaluated in terms of prediction accuracy.
This chapter is organized as follows: Section 4.1 describes the datasets used to examine the
performance of our approach. Section 4.2 details the evaluation measures that will be used to
test the performance of the proposed DCARS method. Finally, Section 4.3 presents the obtained
results.
The experiments were conducted on real recommendation datasets: the Tripadvisor dataset (Y. Zheng
et al., 2012), the Movielens100k 1 dataset and the Movielens1M 2 dataset. Table 4.1 describes the
statistics of the data sets used.
1 https://grouplens.org/datasets/movielens/100k/
2 https://grouplens.org/datasets/movielens/1m/
Tripadvisor dataset: this data was scraped from online reviews on the tripadvisor.com web site.
It contains the trip type (Family, Couples, Business, Solo travel, Friends) as well as geographical
location information about the users and the hotels. The user information comprises the user
id, the user's state of residence and the user's time zone; the hotel information comprises the
hotel id, hotel city, hotel state and time zone. Each record also contains the trip type and the
user's rating. Overall, this data set contains 4669 ratings, 1202 users and 1890 hotels.
Movielens100k dataset: it is considered one of the most widely used data sets in the
recommendation area and it is publicly available on the MovieLens website. This dataset was
collected through the MovieLens website, where users regularly visit and rate the movies that they
have already watched. Ratings range from 1 to 5 stars: 1 star means that the user did not
like the movie, whereas 5 stars means that the user liked it very much. Overall,
the dataset contains 100,000 ratings, collected from 943 users on 1682 movies, and each
user has rated at least 20 movies. In addition to this information,
the MovieLens group provides metadata about the users, such as age,
occupation and gender. Thus, in this experimental study we take all this information as
contextual data.
Movielens1M dataset: to further evaluate the scalability of the proposed method, we
used the 1 million dataset, which is also provided by the MovieLens group. Overall, the dataset
contains 1,000,209 ratings, collected from 6,040 MovieLens users on 3,900 movies, and each
user has rated at least 20 movies. The dataset also contains user metadata,
which we take as contextual information in our experimental study.
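To put these figures in perspective, the density of each rating matrix (the share of user-item pairs that actually carry a rating) can be computed directly from the counts quoted above. The sketch below is only an illustration; the dictionary layout and the function name `density` are hypothetical.

```python
# (ratings, users, items) as quoted in the dataset descriptions above.
DATASETS = {
    "Tripadvisor": (4669, 1202, 1890),
    "Movielens100k": (100000, 943, 1682),
    "Movielens1M": (1000209, 6040, 3900),
}


def density(n_ratings, n_users, n_items):
    """Fraction of the user-item matrix that is filled with ratings."""
    return n_ratings / (n_users * n_items)


densities = {name: density(*stats) for name, stats in DATASETS.items()}
for name, value in densities.items():
    print(f"{name}: {100 * value:.2f}% of user-item pairs are rated")
```

With these counts, the Tripadvisor matrix is far sparser (about 0.2% filled) than the two MovieLens matrices (about 6.3% and 4.2%), which is worth keeping in mind when comparing results across data sets.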
In the recommendation area, predictive accuracy metrics measure how close the prediction
is to the true numerical rating expressed by the user (Gunawardana & Shani, 2009).
Several metrics have been proposed in the literature. In this section, we introduce the two most
commonly used ones, the Mean Absolute Error (Pennock et al., 2000) and the Root Mean Squared
Error (Bennett et al., 2007).
Mean Absolute Error (MAE): the mean of the absolute differences between each prediction
and the corresponding actual rating, over all users in the system. MAE is defined by Equation
4.1, where n is the total number of predictions, Ŷi is the predicted value and Yi is the actual
rating value. With ratings on a 1 to 5 scale, MAE ranges from 0 to 4. The lower the MAE, the
more accurately the recommendation approach predicts users' ratings.
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{Y}_i - Y_i \right| \tag{4.1} \]
Root Mean Squared Error (RMSE) is another widely used metric for evaluating recommendation
approaches on rating prediction. RMSE is defined as the square root of the average of
the squared differences between predictions and actual ratings, as described by Equation 4.2,
where n is the total number of predictions, Ŷi is the predicted value and Yi is the actual
rating value.
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{Y}_i - Y_i \right)^2} \tag{4.2} \]
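The two metrics can be computed in a few lines of Python. The snippet below is a minimal
sketch that follows the definitions of MAE and RMSE given in this section. The function
names and the four example ratings are illustrative assumptions and are not taken from the
experiments.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error: average of |prediction - rating|."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Squared Error: square root of the mean squared difference."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_squared = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    return math.sqrt(mean_squared)


# Made-up ratings on the 1 to 5 scale, for illustration only.
predicted = [4.0, 3.0, 5.0, 2.0]
actual = [5.0, 3.0, 4.0, 4.0]

print("MAE :", mae(predicted, actual))    # differences 1, 0, 1, 2 -> 4 / 4 = 1.0
print("RMSE:", rmse(predicted, actual))   # sqrt((1 + 0 + 1 + 4) / 4) = sqrt(1.5)
```

Because RMSE squares each difference before averaging, the single error of 2 stars weighs
more heavily in RMSE than in MAE, which is why the two metrics are usually reported together.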
In this section, we first present the environment in which the experiments were performed.
We then present and discuss the experimental results.
Experiments were performed on Google Colab⁵, Google's free cloud service for artificial
intelligence developers, which is suitable for implementing Deep Learning models. The service
provides a 2-core Xeon processor at 2.2 GHz, 13 GB of RAM and a 33 GB HDD.
In addition, we used a machine with 4 cores (Intel Core i5-6500 processor, up to 3.6 GHz) and
8 GB of RAM to run the state-of-the-art recommendation methods against which the proposed
method is compared.
Experiments were implemented using the following libraries and frameworks: numpy⁶, a Python
library for numerical computation; pandas⁷, a Python library for manipulating data in
different formats such as text and CSV files; and matplotlib⁸, which is used to plot results.
As deep learning frameworks, we used Keras 2.1.6⁹ and Tensorflow 1.9.0¹⁰. On the other hand,
we used CARSKit¹¹, a Context-Aware Recommendation library based on the Java programming
language. CARSKit contains state-of-the-art Context-Aware Recommendation methods, which
are useful for comparison with our proposed method.
To investigate the ability of our method to deal with Context-Aware recommendation data,
we compare its performance with three state-of-the-art recommendation methods. The first,
Matrix Factorization (Koren et al., 2009), performs recommendation without taking any
contextual information into account, whereas CAMF_CI (Baltrunas et al., 2011) and Tensor
Factorization (Karatzoglou et al., 2010) take contextual information into consideration.
All these methods are implemented in CARSKit.
⁵ https://colab.research.google.com/
⁶ http://www.numpy.org/
⁷ https://pandas.pydata.org/
⁸ https://matplotlib.org/
⁹ https://keras.io/
¹⁰ https://www.tensorflow.org/
¹¹ https://github.com/irecsys/CARSKit
Figure 4.1 shows the evaluation of the proposed method against the three state-of-the-art
methods on the Tripadvisor dataset. As shown in Figure 4.1(a), the proposed method performs
well compared to the other recommendation methods on the MAE metric: for instance, the
proposed DCARS predicts with an MAE below 0.8, whereas the Tensor Factorization method
predicts with an MAE of 1. Figure 4.1(b) shows the performance of the proposed DCARS on
the RMSE metric, where it also outperforms the other methods. For example, CAMF_CI predicts
with an RMSE above 1.4, while DCARS predicts with an RMSE below 0.8.
To further test the proposed method, Figure 4.2 shows its evaluation on the MovieLens 100K
dataset. As shown in Figure 4.2, the proposed method outperforms the other recommendation
methods on both the MAE and RMSE metrics. The figure also shows that the proposed method
scales well to larger datasets and maintains the performance obtained on small ones. For
instance, the proposed method predicts ratings with an RMSE of almost 0.8 on both the
MovieLens 100K and Tripadvisor datasets.
Table 4.2: Optimization algorithms over training and testing using MovieLens 1M dataset
Optimization algorithm MAE training MAE testing RMSE training RMSE testing
To further illustrate these results, Figure 4.3 plots the values presented in Table 4.2. As
shown, Adam and RMSprop provide quite similar results and compare favorably with the other
optimization methods in terms of training and testing errors. For instance, the Adam
optimization algorithm reaches a training MAE of 0.6433, compared to 0.8738 for SGD. On the
other hand, the figure shows that the training and testing results are quite similar for both
the MAE and RMSE metrics, which indicates that the proposed method generalizes well to
unseen ratings.
Figure 4.3: Evaluation of DCARS on MovieLens 1M using different optimization algorithms
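To make the difference between the optimization algorithms compared above more concrete,
the sketch below implements the textbook update rules of SGD, RMSprop and Adam on a toy
quadratic loss in plain Python. The loss, the learning rates, the number of steps and all
function names are illustrative assumptions. The experiments themselves rely on the Keras
optimizers, whose defaults and implementation details may differ.

```python
import math

TARGET = [1.0, -2.0]  # minimiser of the toy loss (arbitrary choice)


def loss(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))


def grad(w):
    return [2.0 * (wi - ti) for wi, ti in zip(w, TARGET)]


def run_sgd(steps=500, lr=0.1):
    w = [0.0, 0.0]
    for _ in range(steps):
        w = [wi - lr * gi for wi, gi in zip(w, grad(w))]
    return w


def run_rmsprop(steps=500, lr=0.01, rho=0.9, eps=1e-8):
    w, v = [0.0, 0.0], [0.0, 0.0]  # v: running average of squared gradients
    for _ in range(steps):
        g = grad(w)
        v = [rho * vi + (1 - rho) * gi * gi for vi, gi in zip(v, g)]
        w = [wi - lr * gi / (math.sqrt(vi) + eps) for wi, gi, vi in zip(w, g, v)]
    return w


def run_adam(steps=500, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    w, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        g = grad(w)
        m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]       # first moment
        v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]  # second moment
        m_hat = [mi / (1 - beta1 ** t) for mi in m]                       # bias correction
        v_hat = [vi / (1 - beta2 ** t) for vi in v]
        w = [wi - lr * mh / (math.sqrt(vh) + eps) for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w


initial_loss = loss([0.0, 0.0])
final_losses = {
    "sgd": loss(run_sgd()),
    "rmsprop": loss(run_rmsprop()),
    "adam": loss(run_adam()),
}
for name, value in final_losses.items():
    print(name, value)
```

SGD scales every step by the raw gradient, whereas RMSprop and Adam divide by a running
estimate of the gradient magnitude, so their step sizes adapt per parameter. Adam additionally
smooths the gradient itself through the first-moment estimate.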
In another experiment, we evaluate the proposed DCARS over a range of iteration numbers.
We run DCARS on the MovieLens 1M dataset using the Adam optimization algorithm, since it
provides good performance compared to the other techniques, and we vary the number of
iterations from 10 to 50. Figure 4.4 shows the results of this experiment. As shown, the
prediction quality of the model improves as we increase the number of iterations, especially
on the RMSE metric. On the MAE metric, however, the prediction quality decreases beyond
30 iterations, which means that in this case 30 iterations should be used.
To further test our method, we evaluate the proposed DCARS over a range of different numbers
of layers. In this experiment as well, we run DCARS on the MovieLens 1M dataset using the
Adam optimization algorithm, with 30 iterations. Figure 4.5 shows the results of this
experiment. As shown, as we increase the number of layers, the training and testing errors
move closer together, which indicates that the proposed method performs well as hidden
layers are added. For instance, with 6 layers (128, 64, 32, 16, 8, 4), the gap between
training and testing results becomes smaller on both the MAE and RMSE metrics, which yields
good prediction results.
Figure 4.5: Evaluation of DCARS on MovieLens 1M using different numbers of layers
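The 6-layer configuration mentioned above can be pictured with the following sketch of a
forward pass through an embedding-plus-feed-forward network, written in plain Python so that
it runs without any deep learning framework. Only the layer widths (128, 64, 32, 16, 8, 4)
come from the text. The embedding size, the vocabulary sizes, the ReLU activation, the
initialization and all names are assumptions made for illustration, and the weights are
random and untrained, so this is not the DCARS implementation.

```python
import random

random.seed(0)

HIDDEN_SIZES = [128, 64, 32, 16, 8, 4]  # layer widths quoted in the text
EMBED_DIM = 8                           # assumed embedding size


def make_embedding(num_ids, dim):
    """One small random vector per id (user, item or context value)."""
    return [[random.uniform(-0.05, 0.05) for _ in range(dim)] for _ in range(num_ids)]


def make_dense(n_in, n_out):
    """Random weights in a Glorot-style uniform range, zero biases."""
    scale = (6.0 / (n_in + n_out)) ** 0.5
    weights = [[random.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu=True):
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


# Hypothetical vocabulary sizes: 10 users, 20 items, 5 values of one context field.
tables = [make_embedding(n, EMBED_DIM) for n in (10, 20, 5)]

sizes = [EMBED_DIM * len(tables)] + HIDDEN_SIZES
hidden = [make_dense(a, b) for a, b in zip(sizes, sizes[1:])]
output = make_dense(HIDDEN_SIZES[-1], 1)


def predict(user, item, context):
    x = []
    for table, idx in zip(tables, (user, item, context)):
        x.extend(table[idx])                 # embedding lookup, then concatenation
    for layer in hidden:
        x = dense(x, layer)                  # ReLU hidden layers
    return dense(x, output, relu=False)[0]   # single linear output: the rating score


score = predict(user=3, item=7, context=2)
print(score)
```

Adding or removing entries in `HIDDEN_SIZES` changes the depth of the network, which is the
quantity varied in the experiment of Figure 4.5.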
Conclusion
In this chapter, we evaluated our proposed method, DCARS, on real recommendation datasets.
The results show that our recommendation algorithm outperforms the other recommendation
methods in terms of the MAE and RMSE metrics. Likewise, we show that our approach can perform
Conclusion
Recommender systems aim to provide suitable recommendations to users from large amounts of
information in many fields, such as multimedia, e-commerce and tourism. Two common
recommendation techniques can be distinguished in the literature: Non-Personalized
Recommendation and Personalized Recommendation. Personalized recommendation systems, or
simply recommender systems, are widely used due to their capability to find the most suitable
products or services based on the user's preferences and constraints. Personalized
recommendation can be classified into four main approaches: Content-Based, Collaborative
Filtering, Hybrid, and Advanced approaches. In this thesis, we relied on an advanced
recommendation approach called Context-Aware recommendation, which takes contextual
information into account at recommendation time.
In recent years, the Recommender Systems community has shown great interest in the Machine
Learning field, especially its Deep Learning sub-field, and several new recommendation
methods have relied on such technologies to enhance recommendation performance and provide
better recommendations to users. However, most work has applied Deep Learning to common
recommendation approaches rather than to advanced ones such as Context-Aware recommendation.
Thus, in this master thesis, we relied on Deep Learning technology to improve recommendation
accuracy using contextual information, and we proposed a new Context-Aware Recommender
based on this state-of-the-art technology, which we called Deep Context-Aware Recommender
System (DCARS).
The experimental study was conducted on three datasets, and we used three state-of-the-art
approaches as baselines for the proposed model. The obtained results show that the proposed
Deep Learning based model can enhance prediction performance when providing recommendations
for a user in a specific context. Likewise, the experiments show that combining deep learning
and Context-Aware recommendation can outperform the state of the art by a good margin on the
MAE and RMSE metrics.
In this thesis, we used deep learning technology, more precisely embeddings and feed-forward
neural network architectures, for the task of recommendation. Further research could extend
this model to improve performance by additionally including other features, such as text
and images, in the learning stage. Furthermore, it might be of interest to use other Deep
Learning models, such as CNNs and RNNs, with the Context-Aware recommendation approach.
References
Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems:
A survey of the state-of-the-art and possible extensions. IEEE transactions on knowledge and
data engineering, 17(6), 734–749.
Baltrunas, L., Ludwig, B., & Ricci, F. (2011). Matrix factorization techniques for context aware
recommendation. In Proceedings of the fifth acm conference on recommender systems (pp.
301–304).
Bansal, T., Belanger, D., & McCallum, A. (2016). Ask the gru: Multi-task learning for deep text
recommendations. In Proceedings of the 10th acm conference on recommender systems (pp.
107–114).
Bennett, J., Lanning, S., et al. (2007). The netflix prize. In Proceedings of kdd cup and workshop
(Vol. 2007, p. 35).
Bobadilla, J., Ortega, F., Hernando, A., & Gutiérrez, A. (2013). Recommender systems survey.
Knowledge-based systems, 46, 109–132.
Bouza, A., Reif, G., Bernstein, A., & Gall, H. (2008). Semtree: Ontology-based decision tree
algorithm for recommender systems. In Proceedings of the 2007 international conference on
posters and demonstrations-volume 401 (pp. 106–107).
Burke, R. (2007). Hybrid web recommender systems. In The adaptive web (pp. 377–408).
Springer.
Campos, P. G., Fernández-Tobı́as, I., Cantador, I., & Dı́ez, F. (2013). Context-aware movie rec-
ommendations: an empirical comparison of pre-filtering, post-filtering and contextual mod-
eling approaches. In International conference on electronic commerce and web technologies
(pp. 137–149).
Cao, S., Yang, N., & Liu, Z. (2017). Online news recommender based on stacked auto-encoder.
In Computer and information science (icis), 2017 ieee/acis 16th international conference on
(pp. 721–726).
Cheng, H.-T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., . . . others (2016).
Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep
learning for recommender systems (pp. 7–10).
Codina, V., Ricci, F., & Ceccaroni, L. (2016). Distributional semantic pre-filtering in context-
aware recommender systems. User Modeling and User-Adapted Interaction, 26(1), 1–32.
Dahl, G. E., Sainath, T. N., & Hinton, G. E. (2013). Improving deep neural networks for lvcsr
using rectified linear units and dropout. In Acoustics, speech and signal processing (icassp),
2013 ieee international conference on (pp. 8609–8613).
Deng, S., Huang, L., Xu, G., Wu, X., & Wu, Z. (2017). On deep learning for trust-aware
recommendations in social networks. IEEE transactions on neural networks and learning
systems, 28(5), 1164–1177.
Devooght, R., & Bersini, H. (2016). Collaborative filtering with recurrent neural networks. arXiv
preprint arXiv:1608.07400.
Ding, D., Zhang, M., Li, S.-Y., Tang, J., Chen, X., & Zhou, Z.-H. (2017). Baydnn: Friend
recommendation with bayesian personalized ranking deep neural network. In Proceedings of
the 2017 acm on conference on information and knowledge management (pp. 1479–1488).
Dong, X., Yu, L., Wu, Z., Sun, Y., Yuan, L., & Zhang, F. (2017). A hybrid collaborative filtering
model with deep structure for recommender systems. In AAAI (pp. 1309–1315).
Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and
stochastic optimization. Journal of Machine Learning Research, 12(Jul), 2121–2159.
Ebesu, T., & Fang, Y. (2017). Neural semantic personalized ranking for item cold-start recom-
mendation. Information Retrieval Journal, 20(2), 109–131.
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural
networks. In Proceedings of the thirteenth international conference on artificial intelligence
and statistics (pp. 249–256).
Goldberg, D., Nichols, D., Oki, B. M., & Terry, D. (1992). Using collaborative filtering to weave
an information tapestry. Communications of the ACM, 35(12), 61–70.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. (http://
www.deeplearningbook.org)
Graves, A., Mohamed, A.-r., & Hinton, G. (2013). Speech recognition with deep recurrent
neural networks. In Acoustics, speech and signal processing (icassp), 2013 ieee international
conference on (pp. 6645–6649).
Gunawardana, A., & Shani, G. (2009). A survey of accuracy evaluation metrics of recommen-
dation tasks. Journal of Machine Learning Research, 10(Dec), 2935–2962.
Haruna, K., Ismail, M. A., Damiasih, D., Chiroma, H., & Herawan, T. (2017). Comprehensive
survey on comparisons across contextual pre-filtering, contextual post-filtering and contextual
modelling approaches. Telkomnika, 15(4), 1865–1875.
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., & Chua, T.-S. (2017). Neural collaborative filtering.
In Proceedings of the 26th international conference on world wide web (pp. 173–182).
Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., . . . others (2012). Deep
neural networks for acoustic modeling in speech recognition: The shared views of four re-
search groups. IEEE Signal Processing Magazine, 29(6), 82–97.
Holzinger, A., & Jurisica, I. (2014). Knowledge discovery and data mining in biomedical in-
formatics: The future is in integrative, interactive machine learning solutions. In Interactive
knowledge discovery and data mining in biomedical informatics (pp. 1–18). Springer.
Karatzoglou, A., Amatriain, X., Baltrunas, L., & Oliver, N. (2010). Multiverse recommendation:
n-dimensional tensor factorization for context-aware collaborative filtering. In Proceedings of
the fourth acm conference on recommender systems (pp. 79–86).
Kim, D., Park, C., Oh, J., Lee, S., & Yu, H. (2016). Convolutional matrix factorization for
document context-aware recommendation. In Proceedings of the 10th acm conference on
recommender systems (pp. 233–240).
Kim, D., Park, C., Oh, J., & Yu, H. (2017). Deep hybrid recommender systems via exploiting
document context and statistics of items. Information Sciences, 417, 72–87.
Ko, Y. J., Maystre, L., & Grossglauser, M. (2016). Collaborative recurrent neural networks
for dynamic recommender systems. In Journal of machine learning research: Workshop and
conference proceedings (Vol. 63).
Koren, Y. (2009). The bellkor solution to the netflix grand prize. Netflix prize documentation,
81, 1–10.
Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender
systems. Computer, 42(8).
Kuchaiev, O., & Ginsburg, B. (2017). Training deep autoencoders for collaborative filtering.
arXiv preprint arXiv:1708.01715.
Kumar, V., Khattar, D., Gupta, S., Gupta, M., & Varma, V. (2017). Deep neural architecture
for news recommendation. In Working notes of the 8th international conference of the clef
initiative, dublin, ireland. ceur workshop proceedings.
Le, Q. V., Jaitly, N., & Hinton, G. E. (2015). A simple way to initialize recurrent networks of
rectified linear units. arXiv preprint arXiv:1504.00941.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Li, S., Kawale, J., & Fu, Y. (2015). Deep collaborative filtering via marginalized denoising
auto-encoder. In Proceedings of the 24th acm international on conference on information and
knowledge management (pp. 811–820).
Liang, D., Zhan, M., & Ellis, D. P. (2015). Content-aware collaborative music recommendation
using pre-trained neural networks. In ISMIR (pp. 295–301).
Liang, H., & Baldwin, T. (2015). A probabilistic rating auto-encoder for personalized recom-
mender systems. In Proceedings of the 24th acm international on conference on information
and knowledge management (pp. 1863–1866).
Linden, G., Smith, B., & York, J. (2003). Amazon.com recommendations: Item-to-item
collaborative filtering. IEEE Internet computing, 7(1), 76–80.
Lops, P., De Gemmis, M., & Semeraro, G. (2011). Content-based recommender systems: State
of the art and trends. In Recommender systems handbook (pp. 73–105). Springer.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed represen-
tations of words and phrases and their compositionality. In Advances in neural information
processing systems (pp. 3111–3119).
Miyahara, K., & Pazzani, M. J. (2000). Collaborative filtering with the simple bayesian classifier.
In Pacific rim international conference on artificial intelligence (pp. 679–689).
Mobasher, B., Dai, H., Luo, T., & Nakagawa, M. (2001). Effective personalization based on
association rule discovery from web usage data. In Proceedings of the 3rd international work-
shop on web information and data management (pp. 9–15).
Ning, X., Desrosiers, C., & Karypis, G. (2015). A comprehensive survey of neighborhood-based
recommendation methods. In Recommender systems handbook (pp. 37–76). Springer.
Pennock, D. M., Horvitz, E., Lawrence, S., & Giles, C. L. (2000). Collaborative filtering by
personality diagnosis: A hybrid memory-and model-based approach. In Proceedings of the
sixteenth conference on uncertainty in artificial intelligence (pp. 473–480).
Rawat, Y. S., & Kankanhalli, M. S. (2016). Contagnet: Exploiting user context for image tag
recommendation. In Proceedings of the 2016 acm on multimedia conference (pp. 1102–1106).
Rezende, D. J., Mohamed, S., & Wierstra, D. (2014). Stochastic backpropagation and approxi-
mate inference in deep generative models. arXiv preprint arXiv:1401.4082.
Ricci, F., Rokach, L., & Shapira, B. (2015). Recommender systems: introduction and challenges.
In Recommender systems handbook (pp. 1–34). Springer.
Salakhutdinov, R., Mnih, A., & Hinton, G. (2007). Restricted boltzmann machines for collabo-
rative filtering. In Proceedings of the 24th international conference on machine learning (pp.
791–798).
Sarikaya, R., Hinton, G. E., & Deoras, A. (2014). Application of deep belief networks for
natural language understanding. IEEE/ACM Transactions on Audio, Speech, and Language
Processing, 22(4), 778–784.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-based collaborative filtering
recommendation algorithms. In Proceedings of the 10th international conference on world
wide web (pp. 285–295).
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61,
85–117.
Shi, Y., Karatzoglou, A., Baltrunas, L., Larson, M., & Hanjalic, A. (2014). Cars2: Learning
context-aware representations for context-aware recommendations. In Proceedings of the 23rd
acm international conference on conference on information and knowledge management (pp.
291–300).
Singhal, A., Sinha, P., & Pant, R. (2017). Use of deep learning in modern recommendation
system: A summary of recent works. arXiv preprint arXiv:1712.07525.
Smirnova, E., & Vasile, F. (2017). Contextual sequence modeling for recommendation with
recurrent neural networks. arXiv preprint arXiv:1706.07684.
Sottocornola, G., Stella, F., Zanker, M., & Canonaco, F. (2017). Towards a deep learning
model for hybrid recommendation. In Proceedings of the international conference on web
intelligence (pp. 1260–1264).
Strub, F., Gaudel, R., & Mary, J. (2016). Hybrid recommender system based on autoencoders.
In Proceedings of the 1st workshop on deep learning for recommender systems (pp. 11–16).
Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). On the importance of initialization
and momentum in deep learning. In International conference on machine learning (pp. 1139–
1147).
Van den Oord, A., Dieleman, S., & Schrauwen, B. (2013). Deep content-based music recom-
mendation. In Advances in neural information processing systems (pp. 2643–2651).
Wan, J., Wang, D., Hoi, S. C. H., Wu, P., Zhu, J., Zhang, Y., & Li, J. (2014). Deep learning
for content-based image retrieval: A comprehensive study. In Proceedings of the 22nd acm
international conference on multimedia (pp. 157–166).
Wang, H., Wang, N., & Yeung, D.-Y. (2015). Collaborative deep learning for recommender sys-
tems. In Proceedings of the 21th acm sigkdd international conference on knowledge discovery
and data mining (pp. 1235–1244).
Wang, S., Zou, B., Li, C., Zhao, K., Liu, Q., & Chen, H. (2015). Crown: a context-aware rec-
ommender for web news. In Data engineering (icde), 2015 ieee 31st international conference
on (pp. 1420–1423).
Wang, X., & Wang, Y. (2014). Improving content-based and hybrid music recommendation
using deep learning. In Proceedings of the 22nd acm international conference on multimedia
(pp. 627–636).
Wang, X., Yu, L., Ren, K., Tao, G., Zhang, W., Yu, Y., & Wang, J. (2017). Dynamic attention
deep model for article recommendation by learning human editors’ demonstration. In Pro-
ceedings of the 23rd acm sigkdd international conference on knowledge discovery and data
mining (pp. 2051–2059).
Wu, H., Yue, K., Liu, X., Pei, Y., & Li, B. (2015). Context-aware recommendation via graph-
based contextual modeling and postfiltering. International Journal of Distributed Sensor Net-
works, 11(8), 613612.
Wu, Y., DuBois, C., Zheng, A. X., & Ester, M. (2016). Collaborative denoising auto-encoders
for top-n recommender systems. In Proceedings of the ninth acm international conference on
web search and data mining (pp. 153–162).
Xu, Z., Chen, C., Lukasiewicz, T., & Miao, Y. (2017). Hybrid deep-semantic matrix factorization
for tag-aware personalized recommendation. arXiv preprint arXiv:1708.03797.
Xue, G.-R., Lin, C., Yang, Q., Xi, W., Zeng, H.-J., Yu, Y., & Chen, Z. (2005). Scalable collabo-
rative filtering using cluster-based smoothing. In Proceedings of the 28th annual international
acm sigir conference on research and development in information retrieval (pp. 114–121).
Xue, H.-J., Dai, X.-Y., Zhang, J., Huang, S., & Chen, J. (2017). Deep matrix factorization
models for recommender systems. static.ijcai.org.
Zhang, S., Yao, L., & Sun, A. (2017). Deep learning based recommender system: A survey and
new perspectives. arXiv preprint arXiv:1707.07435.
Zheng, L., Noroozi, V., & Yu, P. S. (2017). Joint deep modeling of users and items using reviews
for recommendation. In Proceedings of the tenth acm international conference on web search
and data mining (pp. 425–434).
Zheng, Y., Burke, R., & Mobasher, B. (2012). Differential context relaxation for context-aware
travel recommendation. In International conference on electronic commerce and web tech-
nologies (pp. 88–99).
Zheng, Y., Mobasher, B., & Burke, R. (2014). Cslim: Contextual slim recommendation algo-
rithms. In Proceedings of the 8th acm conference on recommender systems (pp. 301–304).
Zheng, Y., Mobasher, B., & Burke, R. (2015a). Incorporating context correlation into context-
aware matrix factorization. In Proceedings of the 2015 international conference on constraints
and preferences for configuration and recommendation and intelligent techniques for web
personalization-volume 1440 (pp. 21–27).
Zheng, Y., Mobasher, B., & Burke, R. (2015b). Integrating context similarity with sparse lin-
ear recommendation model. In International conference on user modeling, adaptation, and
personalization (pp. 370–376).
Zheng, Y., Tang, B., Ding, W., & Zhou, H. (2016). A neural autoregressive approach to collabo-
rative filtering. arXiv preprint arXiv:1605.09477.