EAI Endorsed Transactions: Music Recommendation Based On Facial Emotion Recognition
Rajesh B, Keerthana V, Narayana Darapaneni, Anwesh Reddy P
1 PES University, Bangalore, Karnataka, 560050, India
2 Northwestern University, Evanston, IL 60208, United States
3 Great Learning, Hyderabad, Telangana, 500089, India
Abstract
INTRODUCTION: Music provides an incredible avenue for individuals to express their thoughts and emotions, while also
serving as a delightful mode of entertainment for enthusiasts and music lovers.
OBJECTIVES: This paper presents a comprehensive approach to enhancing the user experience through the integration of
emotion recognition, music recommendation, and explainable AI using GRAD-CAM.
METHODS: The proposed methodology utilizes a ResNet50 model trained on the Facial Expression Recognition (FER)
dataset, consisting of real images of individuals expressing various emotions.
RESULTS: The system achieves an accuracy of 82% in emotion classification. By leveraging GRAD-CAM, the model
provides explanations for its predictions, allowing users to understand the reasoning behind the system's recommendations.
The model is trained on both the FER dataset and a real-user dataset, which together include labelled facial expressions and real images of individuals expressing various emotions. The training process involves pre-processing the input images, extracting features through convolutional layers, reasoning with dense layers, and generating emotion predictions through the output layer.
CONCLUSION: The proposed methodology, leveraging the ResNet50 model with ROI-based analysis and explainable AI techniques, offers a robust and interpretable solution for facial emotion detection.
Keywords: Facial emotion detection, ResNet50, convolutional neural network, deep learning, region of interest, explainable AI.
Copyright © YYYY Author et al., licensed to EAI. This is an open access article distributed under the terms of the CC BY-NC-SA
4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the
original work is properly cited.
doi: 10.4108/_______________
emotional states and tailoring content accordingly. By accurately detecting and classifying facial emotions, the system can adapt its functionality and provide content that resonates with users' emotional needs. This capability opens up new possibilities for interactive applications, such as personalized music recommendation systems.

Music recommendation systems have gained immense popularity in recent years, leveraging advanced algorithms to suggest songs based on users' preferences. However, integrating emotion recognition into music recommendation adds a new dimension to the system. By considering the detected emotions, the system can recommend songs that align with users' emotional states, creating a more meaningful and engaging music listening experience.

Explainable AI is another crucial aspect of the proposed system. Conventional machine learning models frequently lack interpretability, posing a challenge for users to comprehend the rationale behind their recommendations. By incorporating the GRAD-CAM technique, the system provides visual explanations for its predictions, enabling users to grasp the underlying features that contribute to the recommended content. This transparency fosters trust, understanding, and user engagement.

The primary objective of this study is to create an affordable music application that utilizes real-time video and Convolutional Neural Network (CNN) technology to automatically select songs based on the user's current mood state. The system aims to minimize resource consumption while incorporating an emotion module that analyzes the user's real-time video to evaluate their emotional state. Subsequently, it matches the identified mood with songs from a categorized collection and offers recommendations for a diverse range of songs. By alleviating the burden of manual song selection, this system has the potential to address the existing challenge of finding suitable songs.

2. Literature Survey

The literature survey encompasses a comprehensive analysis of existing research and studies related to emotion recognition, music recommendation systems, and explainable AI. This section highlights key findings, methodologies, and insights from previous works, laying the foundation for the proposed system.

Several studies have explored different techniques and methodologies for music recommendation systems based on mood or emotion. In their study, Renu Taneja et al. [9] used audio analysis to retrieve features such as pace, beats, and RMSE, and then constructed clusters representing various moods based on these extracted properties. Kee Moe Han et al. [8], on the other hand, employed the average emotions reported by a group of 15 individuals to determine the emotion of a song. They trained a classifier using this data and categorized the music signal's emotion by considering audio parameters such as pitch and timbre. Another study, by V. R. Ghule et al. [10], centered on the development of a music system that employed facial recognition technology for analyzing emotions.

In 2005, Wieczorkowska et al. [11] conducted a study aiming to assist users in discovering music aligned with their moods. They employed the k-nearest neighbors (KNN) algorithm to classify a large dataset of 327,683 songs into six distinct emotions, resulting in an overall accuracy of 37%. Similarly, in 2008, another study [12] utilized a regression method for Music Emotion Recognition (MER) and achieved accuracy rates of 64% for arousal and 59% for valence. Yading Song et al. [13] explored various facets of music for MER and utilized a labeled dataset of 2,904 songs categorized as "happy," "sad," "angry," or "relaxed." Support Vector Machines (SVM) were employed, with spectral characteristics exhibiting superior performance compared to other acoustic parameters.

In 1978, Ekman and Friesen [14] introduced Action Units (AU), which incorporated both permanent and transient facial traits for emotion recognition. The increasing popularity of Convolutional Neural Networks (CNNs) in emotion recognition can be attributed to continuous advancements in methodologies. Lyrical analysis has also been used for music classification [15], [16]; however, relying exclusively on tokenized methods falls short of accurate song categorization, and language barriers restrict such classification to a single language, a distinct disadvantage in the overall process.

In 2020, T. Vijayakumar [17] presented a research paper focused on tackling inverse problems using Convolutional Neural Networks (CNNs). The study initially employed CNN and later transitioned to direct inversion using a combination of Filtered Back Projection (FBP) and CNN, known as FBP-C. The approach utilized individual learning and a U-net architecture. The synthetic dataset used in the study consisted of 475 training images and 25 validation images. The backpropagation technique employed in the study produced satisfactory results.

In a study carried out in 2021, Akey Sungheetha and Rajesh Sharma [18] directed their attention towards image classification using Convolutional Neural Networks (CNN) for the early detection of Diabetic Retinopathy. The conventional methods employed for detecting Hard Exudates (HE) in retinopathy images, which are crucial for assessing diabetes severity, were found to be ineffective. To address this challenge, the study proposed using CNN to extract relevant features from deep networks, offering a viable solution. Deep learning architectures, including CNN, have demonstrated their effectiveness as powerful tools for image recognition, analysis, classification, and identification within the domain of medical imaging.

In 2021, a survey was conducted by S. Smys, Joy Iong Zong Chen, and Subarna Shakya [19] to investigate the various architectures and design methodologies employed in neural networks. The study classified deep neural networks into three distinct types: hybrid architectures, generative architectures, and discriminative architectures. The hybrid architecture was presented by integrating Convolutional Neural Networks (CNN) with deep belief networks, while the discriminative architecture predominantly relied on CNN, featuring stacked pooling and convolution layers to construct a deep model. The survey provided insights into the diverse approaches and structures employed.
Emotion recognition has emerged as a highly investigated domain within the fields of computer vision and human-computer interaction. Researchers have explored various techniques, including facial expression analysis, physiological signals, and audio analysis. In the realm of music psychology, Swathi Swaminathan and E. Glenn Schellenberg [1] conducted research to shed light on the current state of emotion research, emphasizing the importance of comprehending emotions within music-related contexts. F. Abdat, C. Maaoui, and A. Pruski [2] directed their attention toward human-computer interaction and highlighted the significance of facial cues in emotion detection through facial expression recognition. These studies offer valuable insights into the theoretical foundations and practical applications of techniques utilized in emotion recognition.

Music recommendation systems have the objective of delivering personalized and pertinent music suggestions to users, taking into account their preferences, context, and emotional states. In the realm of mood classification from musical audio, Kyogu Lee and Minsu Cho [4] investigated the use of user group-dependent models, highlighting the importance of acknowledging user diversity and individual preferences in music recommendation. From a distinct perspective, Daniel Wolff, Tillman Weyde, and Andrew MacFarlane [5] concentrated on culture-aware music recommendation, recognizing the influence of cultural background on music preferences. Additionally, Mirim Lee and Jun-Dong Cho [15] developed a context-based social music recommendation service, underscoring the significance of contextual factors in augmenting music recommendation systems. These studies contribute to the comprehension of music recommendation techniques and the factors that impact user satisfaction and engagement.

Explainable AI has gained considerable attention as a means of enhancing transparency and interpretability in AI models. Peter Burkert, Felix Trier, Muhammad Zeshan Afzal, Andreas Dengel, and Marcus Liwicki [21] presented DeXpression, a deep convolutional neural network designed for expression recognition. This network integrates explainable AI techniques to provide visualizations of the specific image regions that contribute to the model's predictions, demonstrating the potential of explainable AI in enhancing the interpretability of deep learning models. The literature survey explores the existing research and developments in the field of explainable AI and its relevance to the proposed project.

By reviewing these studies and research papers, it is evident that emotion recognition, music recommendation, and explainable AI are active areas of research with various approaches and techniques. However, there is still a need to integrate these domains to enhance the user experience. The proposed project aims to bridge these gaps and provide a comprehensive system that combines emotion recognition from facial cues, music recommendation based on emotions, and explainable AI using GRAD-CAM visualization.

3. Methodology

As emphasized in the literature review, DeXpression [21] showed how explainable AI techniques can visualize the image regions that drive a model's predictions. Building upon the insights gained from the literature survey, this paper now outlines the methodology employed to address the research gaps identified in the existing studies. The methodology represents a fusion of established techniques from prior research and approaches tailored to the specific objectives of our project. By capitalizing on the strengths and limitations of previous methodologies, we have designed a refined framework that aims to advance facial emotion recognition and music recommendation. This methodology encompasses various stages, including dataset acquisition, preprocessing, model selection, training, and evaluation, while also introducing strategies to tackle the unique challenges associated with facial emotion recognition and music recommendation.

3.1. Dataset Description

The proposed system utilizes a dataset consisting of facial images with labeled emotions for training and evaluation. The dataset includes real images of different individuals encompassing a diverse range of emotions, and it may be augmented and pre-processed so that the model performs well and generalizes to real-world scenarios.

The dataset used for training the facial emotion recognition model consists of two components: the FER dataset and real images of different individuals. The facial expression recognition (FER) dataset consists of categorized facial expressions covering anger, disgust, fear, happiness, sadness, surprise, and neutrality. This dataset serves as the foundation for training the deep learning model and enables it to learn patterns associated with various emotions. To enhance the diversity and generalization capability of the model, real images of different individuals are also captured and included in the training dataset.

In addition to the FER dataset, a music dataset is used for generating personalized music recommendations. The music dataset contains a diverse collection of music tracks from various genres and styles, and it serves as the basis for mapping the detected emotions to appropriate music tracks. The music dataset can be acquired from diverse sources, such as online music platforms, curated databases, or personalized collections.
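As a concrete illustration of this data preparation, the following is a minimal Python/Keras sketch, not the authors' exact pipeline, of how a FER-style directory of labelled facial images covering the seven emotions could be loaded and lightly augmented; the folder layout, the 224x224 input size, and the augmentation settings are assumptions.

# Minimal sketch: loading a FER-style dataset of labelled face images with Keras.
# Folder layout, image size, and augmentation settings are illustrative assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,          # normalise pixel values
    rotation_range=10,          # light augmentation for better generalization
    horizontal_flip=True,
    validation_split=0.2,
)

train_data = train_gen.flow_from_directory(
    "data/fer/train",           # hypothetical path: one sub-folder per emotion
    target_size=(224, 224),     # assumed input size for the ResNet50 backbone
    classes=EMOTIONS,
    class_mode="categorical",   # one-hot labels for the seven-way softmax
    subset="training",
)
val_data = train_gen.flow_from_directory(
    "data/fer/train",
    target_size=(224, 224),
    classes=EMOTIONS,
    class_mode="categorical",
    subset="validation",
)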
By incorporating both the real-user image dataset and a comprehensive music dataset, the system can offer personalized music recommendations based on the user's real-time emotional state. The facial emotion recognition model trained on the FER dataset enables accurate emotion detection, while the music dataset provides a rich selection of tracks for mapping and playlist generation. Table 1 lists the mood categories and the number of corresponding songs.

Table 1. Music Dataset

Sr. No   Emotion     No. of Songs
1        Happy       20
2        Sad         30
3        Angry       20
4        Surprise    20
5        Neutral     20
6        Disgust     20
7        Fear        16
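To illustrate how a table such as Table 1 can drive recommendations, the short sketch below maps each detected emotion to a categorized playlist and samples a song; the file names and the recommend_song helper are hypothetical placeholders, not the actual collection used here.

# Minimal sketch: mapping a detected emotion to a song from a categorized collection.
# The playlist file names below are placeholders, not the dataset used in this paper.
import random

PLAYLISTS = {
    "happy":    ["happy_01.mp3", "happy_02.mp3"],      # 20 songs in the actual dataset
    "sad":      ["sad_01.mp3", "sad_02.mp3"],          # 30 songs
    "angry":    ["angry_01.mp3"],                      # 20 songs
    "surprise": ["surprise_01.mp3"],                   # 20 songs
    "neutral":  ["neutral_01.mp3"],                    # 20 songs
    "disgust":  ["disgust_01.mp3"],                    # 20 songs
    "fear":     ["fear_01.mp3"],                       # 16 songs
}

def recommend_song(detected_emotion: str) -> str:
    """Return a song for the emotion predicted by the facial-expression model."""
    return random.choice(PLAYLISTS[detected_emotion.lower()])

print(recommend_song("Happy"))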
3.2. Model Architecture

Facial emotion recognition utilizes the ResNet50 model, a deep CNN architecture renowned for its exceptional performance in image classification tasks. It consists of 50 layers, including convolutional layers, shortcut connections, and global average pooling. The input layer of the ResNet50 model takes in facial images. The convolutional layers extract meaningful features from the input images, capturing the key characteristics of the facial expressions. The dense layers process the extracted features and learn the relationship between facial expressions and emotions. Finally, the output layer provides the predicted emotion based on the learned features.

The output of the dense layers employs the softmax activation function, which produces a multinomial probability distribution. This distribution is well-suited for multiclass classification tasks that involve more than two labels. In this particular project, which involves classifying emotions into seven distinct labels, the output over the classes is represented as a probability distribution. The network architecture comprises nine convolutional layers, with a max-pooling layer following every three convolutional layers, and two dense layers.

Input Layer

The input layer receives the pre-processed facial images and passes them on to the convolutional layers for feature extraction.

Convolutional Layer

The convolutional layers in ResNet50 are responsible for feature extraction. They consist of filters that slide across the input images, convolving with the pixel values to produce feature maps. Each filter specializes in capturing specific patterns or features, such as edges, textures, and shapes. The depth of the network enables ResNet50 to learn increasingly complex and abstract features as the information passes through multiple convolutional layers.

Dense Layer

Following the convolutional layers, ResNet50 incorporates dense layers, also known as fully connected layers. These layers receive the extracted features from the previous layers and perform high-level reasoning and decision-making. The dense layers typically comprise multiple neurons, with each neuron representing a specific class or emotion in the case of facial emotion detection. Through a series of weighted connections and activation functions, the dense layers transform the extracted features into probability scores or confidence values for each class.

Output Layer

The output layer in ResNet50 represents the final stage, responsible for generating the predicted emotions for the input images in facial emotion detection. This layer comprises neurons that correspond to the various emotions: happiness, neutral, anger, surprise, fear, disgust, and sadness. The choice of activation function in the output layer depends on the problem's characteristics; here, the commonly employed softmax function is used, which ensures that the predicted emotion probabilities sum to 1. This property facilitates interpretation and comparison of the predictions.

ResNet50 excels in facial emotion detection due to its deep architecture and residual connections. The depth enables the network to learn rich and meaningful representations of facial features, capturing intricate details relevant to emotions. The residual connections help alleviate the vanishing gradient problem, allowing for better training and improved performance.

By leveraging ResNet50 as the core model for facial emotion detection, this research aims to accurately classify the emotional states conveyed by individuals' facial expressions. The trained model is capable of analyzing facial images and predicting the corresponding emotions, contributing to the creation of advanced systems capable of comprehending and appropriately reacting to human emotion.
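A minimal Keras sketch of a ResNet50-based classifier with a seven-way softmax head, in the spirit of the architecture described above, is shown below; the ImageNet initialization, the 128-unit dense layer, and the dropout rate are illustrative assumptions rather than the exact configuration used in this work.

# Minimal sketch: a ResNet50 backbone with a softmax head for 7 emotion classes.
# ImageNet initialization and the 128-unit dense layer are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras.applications import ResNet50

backbone = ResNet50(
    weights="imagenet",        # assumption: transfer learning from ImageNet
    include_top=False,         # drop the original 1000-class head
    input_shape=(224, 224, 3),
    pooling="avg",             # global average pooling after the last conv block
)

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(128, activation="relu"),   # high-level reasoning layer
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(7, activation="softmax"),  # probabilities for the 7 emotions
])

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)
# model.fit(train_data, validation_data=val_data, epochs=20)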
To summarize, ResNet50 serves as a robust deep-learning model that constitutes the foundation of the facial emotion detection system created in this study. It leverages its architecture, including convolutional layers, dense layers, and residual connections, to extract features and make accurate predictions. Through extensive training on the dataset, the model can capture the subtle variations in facial expressions and classify them into different emotions. This model serves as a valuable tool for understanding and analyzing human emotions, opening doors to numerous applications in fields like psychology, human-computer interaction, and affective computing.

3.3. Facial Emotion Detection with ROI (Eyes)

In the context of facial emotion detection, the eyes are considered a crucial region for accurate emotion recognition. The eyes exhibit significant changes across emotional states, and capturing these subtle variations can enhance the performance of emotion classification models. To leverage the distinctive features of the eyes, a specific approach was employed, involving the extraction of the region of interest (ROI) using a Haar cascade classifier.

The Haar cascade classifier is a popular technique in computer vision for object detection, known for its efficiency and accuracy. In this methodology, the Haar cascade classifier was trained to identify and localize the eyes in facial images. Once the eyes were successfully detected, they were cropped and extracted as separate images, creating a specialized dataset consisting specifically of eye regions.
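The eye-ROI extraction step can be sketched with OpenCV as follows; the bundled haarcascade_eye.xml file, the detection thresholds, and the 224x224 crop size are assumptions made for illustration.

# Minimal sketch: detecting eyes with a Haar cascade and cropping them as ROI images.
# The bundled haarcascade_eye.xml and the 224x224 crop size are illustrative choices.
import cv2

eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml"
)

def extract_eye_rois(image_bgr):
    """Return cropped eye regions detected in a face image."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    rois = []
    for (x, y, w, h) in eyes:
        roi = image_bgr[y:y + h, x:x + w]
        rois.append(cv2.resize(roi, (224, 224)))  # match the classifier's input size
    return rois

# Example: build the eye-centric dataset from one labelled face image.
face = cv2.imread("faces/happy/img_001.jpg")      # hypothetical path
if face is not None:
    for i, eye in enumerate(extract_eye_rois(face)):
        cv2.imwrite(f"eyes/happy/img_001_eye{i}.jpg", eye)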
This eye-centric dataset was then utilized to train a facial emotion classification model. The model architecture, based on the ResNet50 deep convolutional neural network (CNN), was employed to learn the intricate patterns and features present in the eye regions. The ResNet50 model has been widely recognized for its exceptional performance in computer vision tasks, making it a suitable choice for this research.

During the training process, the model was exposed to the eye images from the specialized dataset, with each eye region associated with a corresponding emotional label. The model learned to analyze the eye features and classify them into different emotional states, including happiness, sadness, anger, fear, disgust, surprise, and neutral.

By focusing solely on the eyes, the model gained a deeper understanding of the specific eye-related cues and expressions associated with each emotion. This approach allowed for a more fine-grained analysis of the eyes' role in emotion recognition, capturing the nuances and subtleties that contribute to accurate classification.

Once the model was trained on the eye-centric dataset, it was capable of predicting facial emotions from new, unseen eye regions. During inference, the Haar cascade classifier was used to detect and extract the eyes from facial images in real time. These extracted eye regions were then fed into the trained ResNet50 model, which generated predictions of the corresponding emotional states.

This methodology offers several advantages. Firstly, by narrowing the focus to the eyes, the model's attention is concentrated on the most expressive and informative facial region, potentially improving the accuracy of emotion detection. Secondly, working with a specialized dataset of eye regions allows for more targeted training, enabling the model to learn eye-specific features more effectively. Lastly, the use of the Haar cascade classifier for eye detection provides a robust and efficient means of isolating the eyes, ensuring accurate extraction even in real-time scenarios.

Overall, the integration of facial emotion detection with the eye ROI, using the Haar cascade classifier and the ResNet50 model, demonstrates a tailored approach to emotion recognition. By leveraging the distinctive features of the eyes and training on eye-specific datasets, this methodology enhances the precision and granularity of facial emotion detection, providing valuable insights into the role of the eyes in expressing and recognizing emotions.
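Putting these pieces together, a rough sketch of the real-time inference flow described in this subsection, from webcam frame to eye ROI to emotion prediction to song lookup, might look as follows; the model file name is hypothetical, and extract_eye_rois and recommend_song refer to the illustrative helpers sketched earlier.

# Minimal sketch of the real-time loop: capture a frame, crop the eye ROIs,
# predict the emotion with the trained ResNet50, then look up a song.
# "emotion_resnet50.h5", extract_eye_rois, and recommend_song are illustrative.
import cv2
import numpy as np
import tensorflow as tf

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
model = tf.keras.models.load_model("emotion_resnet50.h5")  # hypothetical file name

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
if ret:
    rois = extract_eye_rois(frame)          # from the Haar-cascade sketch above
    if rois:
        batch = np.stack(rois).astype("float32") / 255.0
        probs = model.predict(batch).mean(axis=0)   # average over detected eyes
        emotion = EMOTIONS[int(np.argmax(probs))]
        print(emotion, recommend_song(emotion))     # from the playlist sketch above
cap.release()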
3.4. Explainable AI

Explainable AI (XAI) is an essential aspect of building trustworthy and interpretable machine learning models. It aims to provide insights into the reasoning behind the predictions made by the model, offering transparency and enabling users to understand and trust the decision-making process. In this project, the GRAD-CAM (Gradient-weighted Class Activation Mapping) technique was employed to achieve explainability in the facial emotion detection model.

GRAD-CAM is a visualization technique that helps identify the regions of an image that are influential in a model's decision-making process. It generates heatmaps that highlight the areas of an input image that contribute most significantly to the predicted class. By applying GRAD-CAM to the facial emotion detection model, we can gain insights into the regions of the face that contribute to the classification of specific emotions.
To apply GRAD-CAM, the pre-trained ResNet50 model was used. After an input image was fed into the model for prediction, the gradients of the target class (the predicted emotion) were computed with respect to the final convolutional layer. These gradients were then used to weight the activations of that layer, creating a heatmap that visually represents the regions that influenced the prediction the most. By visualizing the heatmaps generated by GRAD-CAM, we were able to identify the facial regions, such as the eyes, nose, or mouth, that played a significant role in the model's decision-making process. This information is invaluable for understanding how the model interprets emotions and which facial features contribute most prominently to each emotion classification.
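A compact TensorFlow sketch of this Grad-CAM computation is given below; it assumes the classifier exposes ResNet50's last convolutional block (named conv5_block3_out in Keras' implementation), and the helper is illustrative rather than the authors' exact code.

# Minimal Grad-CAM sketch: gradients of the predicted class w.r.t. the final
# convolutional layer weight its activations to form a coarse heatmap.
import tensorflow as tf

def grad_cam(model, image, last_conv_layer="conv5_block3_out"):
    """image: preprocessed array of shape (1, H, W, 3)."""
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image)
        class_idx = int(tf.argmax(preds[0]))        # the predicted emotion
        class_score = preds[:, class_idx]
    grads = tape.gradient(class_score, conv_maps)   # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2)) # importance of each channel
    cam = tf.reduce_sum(weights * conv_maps[0], axis=-1)
    cam = tf.nn.relu(cam) / (tf.reduce_max(cam) + 1e-8)
    return cam.numpy()                              # heatmap to overlay on the face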
4. Results

4.1. Emotion Classification

The trained model achieved an accuracy of 82% in emotion classification and could effectively classify emotions such as anger, disgust, fear, happiness, sadness, surprise, and neutral expressions.

4.2. Region of Interest (ROI) Analysis

By focusing on the eyes as the region of interest, we observed that the model's performance improved in detecting subtle changes in emotions. The eye region, known to convey vital emotional cues, proved to be influential in accurately predicting emotions. The use of Haar cascades for eye detection and a separate dataset consisting solely of eye images contributed to the model's enhanced performance in capturing subtle variations in emotional states.
The eye-based ROI analysis thus demonstrated improved performance in capturing subtle emotional cues.

The integration of music recommendations based on the detected emotions enhanced the user experience, providing personalized playlists that resonated with the user's emotional state. The system's effectiveness was further enhanced by incorporating explainable AI techniques, particularly the GRAD-CAM method, which provided insights into the model's decision-making process and improved transparency.

The results of this paper have several implications for future research and development. Some potential areas for further exploration and improvement include:

Expansion of Emotion Categories: While the current system successfully classified emotions into seven categories, future work could involve expanding the range of emotions recognized. This could include more nuanced emotional states or culture-specific emotions, allowing for a more comprehensive understanding of users' emotional experiences.

Multi-modal Emotion Recognition: Incorporating additional modalities, such as voice or gesture recognition, alongside facial emotion detection can provide a more holistic understanding of users' emotional states. Multi-modal approaches have the potential to enhance the accuracy and robustness of emotion detection systems.

Real-Time System Deployment: While our system performed real-time emotion recognition, further optimization and deployment on low-latency platforms can ensure its practical usability in real-world scenarios, such as interactive applications or emotion-aware systems.

User Feedback and Personalization: Integrating user feedback mechanisms can enable the system to adapt and personalize its recommendations based on individual preferences and responses. User feedback loops can contribute to continuous improvement and user satisfaction.

Generalization to Diverse Populations: Future research should focus on expanding the diversity of the dataset used for training the model, encompassing individuals from various demographics, cultures, and age groups. This will ensure the generalizability and inclusiveness of the system across different populations.

In conclusion, our proposed system demonstrates the potential of combining facial emotion detection, ROI analysis, music recommendation, and explainable AI techniques to create a user-centric, personalized experience. The achieved results, along with the future scope outlined, contribute to the advancement of affective computing and emotion-aware systems, with implications in fields such as entertainment, healthcare, and human-computer interaction.

References

[1] Swathi Swaminathan and E. Glenn Schellenberg, "Current emotion research in music psychology," Emotion Review, vol. 7, no. 2, pp. 189-197, Apr. 2015.
[2] F. Abdat, C. Maaoui, and A. Pruski, "Human-computer interaction using emotion recognition from facial expression," in UKSim 5th European Symposium on Computer Modeling and Simulation, 2011.
[3] "How music changes your mood," Examined Existence. [Online]. Available: http://examinedexistence.com/how-music-changes-your-mood/. Accessed: Jan. 13, 2017.
[4] Kyogu Lee and Minsu Cho, "Mood Classification from Musical Audio Using User Group-dependent Models."
[5] Daniel Wolff, Tillman Weyde, and Andrew MacFarlane, "Culture-aware Music Recommendation."
[6] A. Lehtiniemi and J. Holm, "Using Animated Mood Pictures in Music Recommendation," in 16th International Conference on Information Visualization, 2012.
[7] A. S. Dhavalikar and R. K. Kulkarni, "Face Detection and Facial Expression Recognition System," in International Conference on Electronics and Communication Systems (ICECS), 2014.
[8] K. Han, T. Zin, and H. M. Tun, "Extraction of Audio Features for Emotion Recognition System Based on Music," International Journal of Scientific & Technology Research, June 2016.
[9] R. Taneja, A. Bhatia, J. Monga, and P. Marwaha, "Emotion detection of audio files," in 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), March 2016, pp. 2397-240.
[10] V. R. Ghule, A. B. Benke, S. S. Jadhav, and S. A. Joshi, "Emotion Based Music Player Using Facial Recognition," International Journal of Innovative Research in Computer and Communication Engineering, vol. 5, no. 2, February 2017.
[11] A. Wieczorkowska, P. Synak, R. Lewis, and Z. W. Raś, "Extracting emotions from music data," in International Symposium on Methodologies for Intelligent Systems, May 2005, pp. 456-465.
[12] Y. H. Yang, Y. C. Lin, Y. F. Su, and H. H. Chen, "A regression approach to music emotion recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 2, pp. 448-457, 2008.
[13] Y. Song, S. Dixon, and M. Pearce, "Evaluation of Musical Features for Emotion Classification," in ISMIR, October 2012, pp. 523-528.
[14] Ying-li Tian, T. Kanade, and J. Cohn, "Recognizing lower face action units for facial expression analysis," in Proceedings of the 4th IEEE International Conference on Automatic Face and Gesture Recognition (FG'00), Mar. 2000, pp. 484-490.
[15] Mirim Lee and Jun-Dong Cho, "Logmusic: context-based social music recommendation service on mobile device," in UbiComp'14 Adjunct, Seattle, WA, USA, Sep. 13-17, 2014.
[16] Gil Levi and Tal Hassner, "Emotion Recognition in the Wild via Convolutional Neural Networks and Mapped Binary Patterns."
[17] T. Vijayakumar, "Posed Inverse Problem Rectification Using Novel Deep Convolutional Neural Network," Journal of Innovative Image Processing (JIIP), vol. 2, no. 3, pp. 121-127, 2020.
[18] A. Sungheetha and Rajesh Sharma, "Design an Early Detection and Classification for Diabetic Retinopathy by Deep Feature Extraction based Convolution Neural Network," Journal of Trends in Computer Science and Smart Technology (TCSST), vol. 3, no. 2, pp. 81-94, 2021.
[19] S. Smys, Joy Iong Zong Chen, and Subarna Shakya, "Survey on Neural Network Architectures with Deep Learning," Journal of Soft Computing Paradigm (JSCP), vol. 2, no. 3, pp. 186-194, 2020.
[20] "Unsupervised feature learning and deep learning tutorial." [Online]. Available: http://ufldl.stanford.edu/tutorial/supervised/OptimizationStochasticGradientDescent/. Accessed: Jan. 13, 2017.
[21] Peter Burkert, Felix Trier, Muhammad Zeshan Afzal, Andreas Dengel, and Marcus Liwicki, "DeXpression: Deep Convolutional Neural Network for Expression Recognition."
[22] "Unsupervised feature learning and deep learning tutorial." [Online]. Available: http://ufldl.stanford.edu/tutorial/supervised/OptimizationStochasticGradientDescent/. Accessed: Jan. 13, 2017.