Industrial Marketing Management

journal homepage: www.elsevier.com/locate/indmarman

Research paper

An empirical case study on Indian consumers' sentiment towards electric vehicles: A big data analytics approach
Rabindra Jena
IT and Analytics, Institute of Management Technology, Nagpur, India


Keywords: Electric vehicles; Deep learning; Big data; Sentiment analysis

Abstract: Today, climate change due to global warming is a significant concern to all of us. India's greenhouse gas emissions are increasing day by day, placing India among the top ten emitters in the world. Air pollution is one of the significant contributors to the greenhouse effect. Transportation contributes about 10% of the air pollution in India. The Indian government is taking steps to reduce air pollution by encouraging the use of electric vehicles. However, success depends on consumers' sentiment, perception and understanding of Electric Vehicles (EV). This case study tried to capture the feelings, attitudes, and emotions of Indian consumers towards electric vehicles. The main objective of this study was to extract opinions valuable to prospective buyers (to know what is best for them), marketers (for determining what features should be advertised) and manufacturers (for deciding what features should be improved) using Deep Learning techniques (e.g. the Doc2Vec algorithm, Recurrent Neural Network (RNN), Convolutional Neural Network (CNN)). Due to the very nature of social media data, a big data platform was chosen to analyze the sentiment towards EV. Deep Learning based techniques were preferred over traditional machine learning algorithms (Support Vector Machine, Logistic regression, Decision tree, etc.) due to their superior text mining capabilities. Two years of data (2016 to 2018) were collected from different social media platforms for this case study. The results showed the efficiency of deep learning algorithms and found that CNN yields better results than the others. The proposed optimal model will help consumers, designers and manufacturers in their decisions to choose, design and manufacture EV.

Received 15 November 2018; Received in revised form 10 December 2019; Accepted 27 December 2019
0019-8501/ © 2019 Elsevier Inc. All rights reserved.
Please cite this article as: Rabindra Jena, Industrial Marketing Management, https://doi.org/10.1016/j.indmarman.2019.12.012

1. Introduction

The Electric Vehicle (EV) concept is receiving colossal interest from different stakeholders due to its promising future. Electric Vehicles are becoming a topical concept (ACEA, 2017) for meeting future long-term emission targets. To promote this cause, an automobile event was organised under The International Geneva Motor Show (2011) in March 2011 to showcase the future of the eco-friendly transport industry. The event had a “Green Vision” section that displayed more than twelve brands of eco-friendly cars. The event stressed the importance of the Electric Vehicle concept and showed the future automobile footprint. But the idea of an eco-friendly vehicle is not new and has been present for more than a century (Anderson & Anderson, 2010; Green Car Institute, 2003). Now, due to the need for alternative eco-friendly transportation, there is a real consumer market waiting for these products. The enormous benefits of Electric Vehicles in terms of emissions and energy consumption are undoubtedly shaping the future of environmentally friendly transportation. The interest in Electric Vehicles is increasing day by day due to global warming and climate change concerns.

EVs have quite different technical specifications from vehicles operated by internal combustion engine (ICE) propulsion systems (Dijk, Orsato, & Kemp, 2013). Larminie and Lowry (2003) studied the features of Electric Vehicle technology and succeeded in conveying the essentials of this concept to stakeholders. There are several types of Electric Vehicles, such as Battery Electric Vehicles (BEV), Zero-Emission Vehicles (ZEVs) and Pure Electric Vehicles (PEV) (Ewing & Sarigollu, 2000), but this study focused on Battery Electric Vehicles (BEV). BEVs are the type mostly used in India and are popularly known as EVs. The EV consists of different building blocks, i.e. an electric battery for energy storage, an electric motor, and a controller. The batteries are generally recharged from mains electricity via a plug. The battery charging unit can either be carried on board or fitted at the charging point. The controller typically controls the power supplied to the motor in the forward and reverse directions; such a controller is usually known as a two-quadrant controller. It is generally desirable to use regenerative braking to recoup energy and provide frictionless braking. A controller that also allows regenerative braking in both the forward and reverse directions is known as a four-quadrant controller (Larminie & Lowry, 2003). Highly efficient electric motors nowadays use permanent magnetic materials from the rare earth elements (neodymium and samarium) to make them more reliable and efficient. The efficiency of different EV engines can be measured in energy units (MJ/km), by CO2-equivalent emissions per unit of distance (e.g., CO2/km), or by the percentage of energy transformed into motion.

Apart from the technical specifications of EVs discussed above, price premiums, vehicle range, operating costs, refuelling behaviours and stated environmental benefits are the unique features that distinguish EVs from conventional vehicles. These features have led marketers, designers, manufacturers and commentators to define EVs as a disruptive innovation (Christensen, 1997). Due to the advantages of eco-friendly electric vehicles, their market share is increasing very quickly. Countries around the world are now promoting the use of eco-friendly transportation. Therefore, the government of India is also focusing on improving the footfall of electric vehicles in India. The next section discusses the market for EVs in India.

2. Electric vehicles in India

In 2013, the Electric Vehicle Industry (EVI) unveiled the ‘National Electric Mobility Mission Plan (NEMMP) 2020’ to address the issues of vehicular pollution and national energy security in India. It aimed to enhance the growth of domestic manufacturing capabilities for environmentally friendly vehicles. Following the Paris agreement, the government of India is planning to make a significant shift to electric vehicles by 2030. Indian car manufacturers (Reva Electric Car Company) and Indian app-based transportation network companies (‘Ola’) are working together to produce more energy-efficient and reliable electric cars. In pursuit of a clean and pollution-free environment, the government has also started encouraging the faster adoption and manufacturing of hybrid electric vehicles by providing incentives for purchasing electric vehicles. Electric Vehicles covered under the Government of India's FAME scheme attract incentives of Rs.1800 to Rs.29,000 for scooters & motorcycles and Rs.1.38 Lac for cars (PTI, 2015). Over the last couple of years, the Government has been releasing tenders to increase charging facilities in different parts of the country (PTI, 2015). In 2017, Karnataka (a state of India) approved the Electric Vehicle and Energy Storage Policy to promote cleaner energy (Sharan Poovanna, 2017).

According to Bloomberg Business (2018), India is considered one of Asia's biggest powerhouses, with 7.3% economic growth over the last few years. But the people of India want to do more while spending less on fuel and oil. The government is planning to reduce spending on petroleum products by shifting to 100% electric vehicles before 2030. The Indian government has already started initiatives to provide electric vehicles on a zero down payment option to promote their sales. In addition to the financial incentives, the government of India has agreed to sponsor up to 60% of the research and development (R&D) costs for developing indigenous low-cost electric vehicle technology. This initiative will help India switch to 100% electric vehicles by 2030. The logic behind this initiative is to reduce fossil fuel dependency and oil imports. In recent years, the Indian government has invited tenders for setting up manufacturing facilities for 10,000 EVs in India. Additionally, Energy Efficiency Services Limited (EESL), Power Finance Corporation, Rural Electrification Corporation, and POWERGRID are working to accelerate the growth of the country's energy-efficient car market. Six of the country's leading automakers (Tata Motors, Renault, Hyundai, Nissan, Maruti Suzuki, and Mahindra & Mahindra) have already expressed their interest in partnering with the government in the eco-friendly transportation domain.

From the above discussion, it is clear that the government of India is trying hard to work in line with the Paris agreement to promote cleaner energy by encouraging the use of electric vehicles. The government will succeed in its endeavour if the people of India show a positive mindset towards electric vehicles. Therefore, this case study attempted to sense people's sentiment towards EVs. Different internet platforms were used to collect data for this research. Due to the very nature of the data (e.g. velocity, volume, variety, veracity), a Big Data platform was chosen for the sentiment analysis. So far, a large number of studies have been published on extracting polarity from social media data. But the majority of this research focuses on either lexicon-based methods or traditional data mining methods such as Logistic regression, Support Vector Machine and Decision Tree. These methods are based on shallow learning techniques, which are not able to extract sophisticated features due to the problem of vanishing gradient. Therefore, in this case study, Deep Learning techniques were adopted to overcome this vanishing gradient problem. Deep learning techniques have the capability to generalize the problem in global ways and to generate learning patterns beyond the immediate neighbors. The following section discusses the detailed concepts and framework.

3. Overview and related work

3.1. Consumer perspectives towards electric vehicles

Due to global warming and climate change, the major auto manufacturers are looking for alternative environment-friendly vehicles for transportation (Takanori Okada, Tamaki, & Managi, 2019; Kasim, Ibrahim, & Al-Ghaili, 2020). Many automobile companies such as Chevrolet, Mahindra, Tesla and Nissan have launched different EV models into the market. Many more competing manufacturers specialised in building EVs are now entering the market. However, the sales of these vehicles are not encouraging compared with conventional car sales, due to the easy availability of oil and people's mindset. Since the beginning of the twenty-first century, many countries and automakers have started giving more attention to EVs and hybrid EVs due to the increasing threats from climate change, CO2 emissions and rising fuel prices (Fernández, 2018; Lee & Madanat, 2017). The widespread distribution and adoption of EVs may be the most promising solution for future transportation systems, but the mass distribution of EVs is more difficult than that of traditional vehicles because of several infrastructure limitations and technological differences (Proost & Van Dender, 2010).

Public perception of EVs and general willingness to use them can play an important role in promoting these vehicles. Not only should the technical limitations of EVs, including battery capacity and weight, be improved, but people's personal and social issues should also be investigated to increase commercial distribution. Over the past few years, many studies have investigated the public's sentiment towards the EV or plug-in hybrid electric vehicle (PHEV) and estimated the market for those carbon-free clean energy-efficient vehicles. Many researchers have studied and identified factors influencing the acceptance and rejection of EVs (Min, Qiang, & Yisi, 2017; Yang, Deng, Tang, & Qian, 2013). Degirmenci and Breitner (2017) investigated the size and characteristics of the potential market for EVs in commercial fleets, such as fleet size, range requirements and use patterns. Yang et al. (2013) found that price, driving range and charging rate would influence the ownership and usage of EVs. Ravi and Ravi (2015), in their study, concluded that the adaptability of EVs depended heavily on the public's lifestyle. Franke and Krems (2013) found that the actual range preferences of EV users were higher than their average range need. Larson, Viáfara, Parsons, and Elias (2014) revealed that consumers were unwilling to pay the price of EVs in the context of future fuel-saving advantages. Liu, Zou, Liu, et al. (2015) studied the influence of environmental consciousness and attitudes to transportation on purchase intention. Min et al. (2017) revealed that, unlike in other countries, the high cost of the Certificate of Entitlement (COE) and the purchase price of EVs were the primary concerns among consumers in Singapore. In their study, many respondents also listed driving range and resale value as primary concerns.

On the other hand, consumers will purchase EVs only after they become aware of the technologies and accept them as a suitable alternative to conventional internal combustion engine-based vehicles. The technology used in EVs is indeed an innovation in the field of transportation. Therefore, the findings and guidelines of the theory of diffusion of innovations can help to make this innovation acceptable to consumers (Rogers, 2010; Vij, 2020). Accordingly, a measure of exposure can serve as a prerequisite and proxy measure for future vehicle purchases. Mark Singer (2017) studied how respondents reacted to the new technology compared to the current standard technology (i.e., traditional internal combustion engine-based vehicles). The results showed that consumers' attitudes towards EVs' technical features and perceptions about EVs' utility are the significant factors that have influenced the rate of adoption. Several researchers have studied consumer perceptions about the technical or functional attributes of EVs (Krupa et al., 2014). The limited range of EVs is a well-known adoption barrier in the technical domain (Skippon & Garwood, 2011). Skippon and Garwood (2011), in their study among UK consumers, revealed that a limited range of 100 miles was perceived to be sufficient to own an EV as a second car. 34% of the participants in their study stated that 150 miles would make an EV suitable as the first car. It has also been debated that the limited driving range of EVs is more of a perceived barrier than an actual one. However, a study of 369 Danish drivers who drove EVs for a trial period found that the range of EVs is a real concern since it is less than what users wish to have in an EV (Jensen, Cherchi, & Mabit, 2013). The other important factor for EV adoption is government policy (Sun & Xu, 2018). However, researchers in consumer behaviour have concerns about consumers' opinions and expectations of these policies. Lane and Poter (2007) argued that government regulations, fuel prices and financial incentives for the consumer could influence adoption. Moreover, researchers also argue that policies should be well explained to consumers, otherwise the policies will fail to increase adoption (Lane & Poter, 2007; David Greene, Hossain, Hofmann, Helfand, & Beach, 2018). On the other hand, frequent changes in policies can also create uncertainty in consumers' minds, make them resistant and consequently affect adoption (Sun & Xu, 2018). However, apart from consumers being attracted by financial incentives like tax rebates or government cash refunds on the purchase of EVs in the US, UK and India, some EV adopters in these countries cite national independence from foreign oil as a motivating factor to adopt EVs (Skippon & Garwood, 2011; J.S. Krupa et al., 2014; Breetz & Salon, 2018).

In spite of the EV trend and its advantages, many researchers have identified essential barriers to the wide diffusion of EVs (Egbue & Long, 2012; Eoin O'Neill, Moore, Kelleher, & Brereton, 2019). Market familiarisation plays a vital role in EV adoption before EVs are considered by potential consumers (Daramy-Williams, Anable, & Grant-Muller, 2019). Understanding the barriers to acceptance of EVs can help to determine the market for those vehicles and their future potential. The electric range (the distance an EV can travel on a single charge), which is limited by battery size or capacity, is one of the significant barriers to EV adoption. The time to charge is another barrier to adoption. It depends on the size of the EV battery and the charging equipment available. By contrast, a gasoline/petroleum-based vehicle is limited by its fuel tank capacity, but filling stations are relatively prevalent, and vehicles are relatively quick to refuel. Other than range and time to recharge, EVs' performance, safety, size and style have been reported as barriers to adoption (Egbue & Long, 2012). For some UK consumers, factors like acceleration, smoothness and low noise were viewed positively (Skippon & Garwood, 2011), while for other UK consumers, the performance and safety of EVs were evaluated negatively (Graham-Rowe et al., 2012). In another study, in Denmark, Jensen et al. (2013) found that hands-on experience with EVs would alter consumers' sentiment and perception of EVs in a positive way. Therefore, providing opportunities to get hands-on experience with EVs can potentially change consumers' attitudes towards EVs. A study by Mark Singer (2017) found that recharging opportunity largely influences EV acceptance. The purchase price of EVs is also a deterrent to purchase. Even though EVs are available in a range of prices, they are often more expensive than similar traditional vehicles due to their new technologies. For instance, the cost of a battery is a significant barrier to EV acceptance (Axsen, Kurani, & Burke, 2010). Battery capacity and vehicle weight are further obstacles to popularising EVs (Sovacool & Hirsh, 2009). This discussion has shifted the focus of engineers and researchers towards the technical issues of EVs. The research has yielded improvements in technologies related to engines and batteries (Eberle & Von Helmolt, 2010). These improvements to the design and technical aspects of EVs have changed users' sentiments about these vehicles. EVs have diffused rapidly in several developed countries, including Japan (Brown, 2013; Carley, Krause, Lane, & Graham, 2013), the Netherlands, and the US, whereas in some nations, such as South Korea, China and India, they have propagated slowly (Thein & Chang, 2014; Eunil Park, Lim, & Cho, 2018).

Consumer sentiment and feelings have also been found to affect the adoption of EVs (Christidis & Focas, 2019). Positive feelings in driving an EV were positively correlated with consumer attitudes and intentions to adopt EVs among potential buyers (Graham-Rowe et al., 2012). However, this study did not provide information on the type of positive feelings that consumers anticipated experiencing with EVs. Graham-Rowe et al. (2012), in their research, mentioned the various emotions expressed by consumers who drove EVs (BEVs or PHEVs) for a trial period. On the one hand, some consumers expressed sentiments like “feeling good” or “less guilt” after driving an EV. On the other hand, consumers stated a feeling of “embarrassment” after driving a small EV car (Graham-Rowe et al., 2012). Similarly, Schuitema, Anable, Skippon, and Kinnear (2013) examined the role of emotional attributes of EVs in consumer intentions to adopt EV cars. They used different emotions (excitement, pleasantness and joy, embarrassment and pride) for measuring consumers' sentiment towards EVs. The findings showed that more positive attitudes towards EVs lead to more positive emotions towards EVs. This positive perception in turn positively influences the intention to adopt EVs (G. Schuitema et al., 2013). Consumer emotions were shown to be important in the domains of electric car purchase (Elena Higueras-Castillo, Molinillo, Coca-Stefaniak, & Liébana-Cabanillas, 2019), pro-environmental behavior (M.C. Onwezen, Antonides, & Bartels, 2013; Degirmenci & Breitner, 2017), consumer behavior towards adoption of innovations (Watson & Spence, 2007; Shih & Schau, 2011; Carlucci, Cirà, & Lanza, 2018; White & Sintov, 2017; Yang, Zhang, Fu, Fan, & Ji, 2018), and consumer adoption of EVs (Graham-Rowe et al., 2012; Schuitema et al., 2013). However, the study of consumer emotions has been overlooked in EV adoption research. The antecedents and consequences of these critical factors have not been thoroughly investigated. Previous investigations of consumer adoption of innovations and pro-environmental behaviour found different precursors to the emotions. These antecedents were consumers' environmental beliefs and norms, internal attribution, social patterns, and perceptions of uncertainty and change of technologies (Bamberg & Möser, 2007; Watson & Spence, 2007; Shih & Schau, 2011). Moreover, the influence of emotions on intentions to adopt EVs has been theorised and shown to vary across different pro-environmental behaviour contexts (Bamberg & Möser, 2007).

Finally, it was observed that consumer sentiment was an overlooked aspect of consumer EV adoption research. A study by G. Schuitema et al. (2013) was a stepping stone for research in the area of sentiment and emotions and their antecedents in consumer EV adoption behaviour. Ma, Fan, Guo, Xu, and Zhu (2019) studied consumer preference for EVs in China. They found that, in addition to other parameters like price and technical specifications, EV aesthetics play a significant role in consumer choice. Understanding the cognitive and emotional responses can help marketing specialists and policymakers in designing
communication, education and policies to overcome barriers to EV adoption. Thus, from the above discussion, it was believed that investigating the antecedents and consequences of sentiments towards EV adoption behaviour can help to design the communication, education and policy related to the diffusion of EVs. Thus, the current study aimed to explore the sentiment of Indian consumers towards acceptance of EVs using deep learning algorithms.

3.2. Sentiment analysis

Sentiment Analysis (SA) or Opinion Mining (OM) is the analytical study of people's opinions, attitudes and emotions towards an entity. The entity can represent individuals, events or topics. The two terms, SA and OM, are used interchangeably in the research community and usually express the same meaning. However, some researchers state that OM and SA have slightly different notions (Mikalai & Palpanas, 2012). Opinion Mining is a process that extracts and analyses people's opinions about an entity, whereas Sentiment Analysis identifies the sentiment expressed in a text and then explains it (Alsaeedi & Khan, 2019; Medhat, Hassan, & Korashy, 2014). Therefore, the primary objective of SA is to find opinions, identify the sentiments, and then classify their polarity as positive, negative or neutral (Agnihotri, Dingus, Hu, & Krush, 2015; Harrigan, Soutar, Choudhury, & Lowe, 2014; He, Wu, Yan, Akula, & Shen, 2015). Sentiment Analysis also involves recognising the evaluative nature of a passage or part of a text. For example, a product (electric vehicle) review can express a positive, negative, or neutral sentiment. Discovering sentiment in a document has several applications, e.g. tracking sentiment towards products, movies, diseases etc. Knowing the sentiment about a product can help to improve Customer Relationship Management (CRM). Over the past few years, there has been an extensive escalation in the use of social media services such as Twitter, Facebook and Instagram worldwide for expressing opinions and sentiment about different activities. Thus, there is increasing interest in sentiment analysis of short informal texts, such as tweets, comments and SMS messages, across a variety of domains (Kühl, Goutier, Ensslen, & Jochem, 2019). Nowadays the internet touches every dimension of human life. Today, if a consumer wants to buy a product, he or she goes to e-commerce sites or mobile applications and checks the product's rating, reviews and overall features before purchasing it. So, the evaluation of a product plays a vital role in selling it.

The Sentiment Analysis task is generally considered a sentiment classification problem. Extraction of text features is the main task of any sentiment analysis problem. The text features include the terms present and their frequencies, parts of speech (POS), opinion words and phrases, and negation in the statement. There are many feature selection techniques available in the literature. These feature selection methods can be divided into lexicon-based methods and machine learning-based methods. Lexicon-based approaches usually begin with a small set of ‘seed’ words. Then a vast lexicon is generated using bootstrapping. Casey, Navendu, and Shlomo (2005) presented many difficulties and problems in lexicon-based methods. The lexicon-based approach relies on a sentiment lexicon, a collection of public and precompiled sentiment terms. Further, the lexicon-based approach can be divided into the dictionary-based approach and the corpus-based approach. Both techniques use statistical methods to find the sentiment polarity. The dictionary-based approach depends on finding opinion seed words from a dictionary and then searching for their synonyms and antonyms. On the other hand, the corpus-based approach begins with a seed list of opinion words and then finds other opinion words in a large corpus.

On the other hand, Machine Learning (ML) based statistical approaches are fully automatic. The feature selection techniques of ML treat documents either as a group of words (Bag of Words (BOW)) or as a string. BOW is used more often in ML due to its simplicity in the classification process (Aggarwal Charu, Zhai, & Xiang, 2012). In this case study, deep learning-based techniques were used for sentiment analysis. The next section discusses the importance of a Big Data platform for sentiment analysis.

3.3. Big data framework for sentiment analysis

The term Big Data has been in use for the last few decades. In 2012 Gartner defined big data as follows: “Big Data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision-making, and process automation”. The size and complexity of Big Data are growing day by day. Four significant attributes, namely volume, variety, velocity, and veracity, mainly define Big Data (Chen, Chiang, & Storey, 2012). In addition to the above characteristics, there are many other challenges that Big Data technologies tackle in data processing, e.g. data cleansing, feature engineering, high dimensionality, and data redundancy. In Big Data environments, it is vital to process and act quickly on the available data. Big Data has the potential to bring massive change to every dimension of science, but mining data from Big Data is not a trivial task. According to Fan, Han, and Liu (2014), independent data sources and decentralised control are two other essential characteristics of Big Data. Today, NLP and Deep Learning models are playing a vital role in Big Data analysis. Big Data technologies bring transformative potential and significant opportunities for the various dimensions of human life.

With the advent of Big Data Analytics and Machine Learning, it has become possible to model complex systems and create intelligent systems that can analyse texts and derive insight from business and society. Meanwhile, e-commerce platforms are becoming a popular medium for business transactions. Every day, more than 2 billion visitors hit e-commerce platforms and generate a lot of raw data. These data should be analyzed to understand end-user behaviour for the benefit of organizations. If the users' clicks, reviews, tags, blog posts, and ratings are gathered and processed correctly, then even convoluted indicators can be converted into key performance indicators for the business. Customers are the sole focus of any organization. Measuring their satisfaction level through different techniques can help both the consumers and the organization.

SA techniques, on the other hand, mainly focus on identifying the sentiment of the user. SA approaches in big data can be divided into two categories, namely content-specific and content-free. According to Sharef and Haghanikhameneh (2014), SA can be defined as a pentamerous opinion consisting of a target object, a feature of the object etc. Even though opinion mining was introduced a decade back, SA has gained increasing attention in big data because of the commercial value emphasised by organizations (Agnihotri et al., 2015; He et al., 2015; Fang, Chen, Song, Wang, & Cao, 2017). The reason is the increasing use of social media for reviewing products and services. Unfortunately, studies on SA progressed for over a decade without recognising the strength of Big Data. Recently, due to the complex nature of the data (in volume, velocity, veracity, and variety) and the availability of several SA platforms for big data, users have started taking an interest in social media analytics using Big Data platforms (Batrinca & Treleaven, 2014; Sharef & Haghanikhameneh, 2014).

In spite of increasing awareness and acceptance of using Big Data analytics platforms for SA to improve organizations' productivity and profit, it is essential to find whether there is a gap between the existing Big Data frameworks and the SA techniques. The proper identification of gaps will add new dimensions to the current research and also encourage detailed future studies for knowledge creation. This is mainly because research in SA took root long before Big Data frameworks were developed and has focused primarily on text analysis. Many review-based studies (Medhat et al., 2014; Ravi & Ravi, 2015; Serrano-Guerrero, Olivas, Romero, & Herrera-Viedma, 2015) on
SA have focused on the techniques, applications and web services, but none of the available studies has focused on the adaptability of SA for Big Data platforms. To make SA suitable for Big Data, the first point to consider is whether the SA approaches can handle the very nature of Big Data. Several researchers have started to explore SA issues in Big Data, such as scalability (Bing and Chan, 2014; Conejero, Burnap, Rana, & Morgan, 2013), Big Data tools for SA (Ding, Song, Guo, Xiong, & Hu, 2013; Mihanović, Gabelica, & Krstić, 2014; Prom-on, Ranong, & Jenviriyakul, 2014), distributed SA approaches (Bravo-Marquez, Mendoza, & Poblete, 2014; Fulse, Sugandhi & Mahajan, 2014), and improved SA models for Big Data (Bing and Chan, 2014; Ding et al., 2013; Liu, Blasch, Chen, Shen, & Chen, 2013; Mukkamala, Hussain, & Vatrapu, 2014). Undoubtedly, these papers are all dated after 2010, which marks the boom of the Big Data era. Generally, SA does not specifically concentrate on the amount of data, and SA applications are expected to work on both small- and large-scale data. Therefore, the volume issue of Big Data does not have any negative impact on the adaptation of SA techniques, since SA techniques range from content-specific to content-free approaches. On the contrary, the performance (precision) of an SA model on large-scale data increases because more training data are available. The issue of scalability was studied in depth by Liu et al. (2013), who observed that volume limits SA less than velocity and variety do. The velocity aspect, on the other hand, is closely related to SA because of the increasing popularity and frequency of social media usage. Therefore, the velocity aspect of Big Data has become one of the significant issues studied by researchers (Bravo-Marquez et al., 2014; Kranjc et al., 2014; Yu & Wang, 2015). The velocity issue is more important in Big Data sentiment analysis and relates closely to volume and variety. Hence, there is an increasing possibility of new linguistic features, such as new acronyms, emoticons, idioms and terminologies, being created, which requires the SA model to be updated.

In addition, social media messages are, by nature, short and generally not constructed with proper grammatical rules. This makes such data very challenging to classify and hence decreases the classification accuracy (Bing and Chan, 2014). Therefore, more advanced SA models need to be explored to discover new linguistic features in large texts. Nowadays, most researchers use data mining techniques such as logistic regression, support vector machines and decision trees to determine polarity in text. These techniques primarily see each word in the text in isolation, resulting in sparse classification. Therefore, this case study used deep learning based techniques to determine the sentiment in the text.

3.4. Deep learning and sentiment analytics in big data platform

Big Data analytics provides an opportunity to develop novel algorithms to address different issues in Big Data. Deep Learning algorithms are one of those potential solutions. Many Big Data tasks have been performed using Deep Learning algorithms. Deep Learning techniques are not new; they date back to the 1940s. They appear new only because, firstly, Deep Learning techniques were relatively unpopular for several years due to their complexity and, secondly, Deep Learning went through many different names before recently being called "Deep Learning". It has been rebranded many times based on the research perspectives of multiple researchers. Some basic history is useful to understand this journey. During the 1940s–1960s, Deep Learning was known as cybernetics. It was known as connectionism during the 1980s–1990s, and the current resurgence under the name "Deep Learning" began in 2006 (Goodfellow, Bengio, & Courville, 2016). Deep learning algorithms work through a multi-level hierarchical learning pattern. Deep Learning is useful for mining information from Big Data because of its multi-layer architecture. Once a Deep Learning model has learned from unsupervised data (Big Data), it can help traditional models to learn quickly from a smaller amount of labelled data (Bengio, 2013; Bengio, Courville, & Vincent, 2013). Deep Learning algorithms can be used effectively to address the problems of volume, veracity and variety of Big Data. Deep Learning algorithms handle massive amounts of data very efficiently (volume in Big Data). Deep Learning deals with data abstraction at an abstract level. Therefore, it capably handles raw data in different formats and from different sources (variety in Big Data) and also minimises the requirement of feature selection for the new data types observed in data. However, Big Data has some other characteristics to deal with when working with SA. The streaming and fast-moving nature of data can lead to some challenges for adopting Deep Learning in Big Data. Many studies have mainly focused on domain adaptation during the learning process (Glorot, Bordes, & Bengio, 2011). Glorot et al. (2011) revealed that Deep Learning could find intermediate data representations during the hierarchical learning process and that these representations can be useful in other application domains.

Sentiment mining (feature engineering) and prediction are the most important and most challenging tasks in analytics. Feature engineering is the main task of sentiment mining, and Deep Learning can learn the features automatically. The hierarchical feature learning process in Deep Learning extracts multiple layers of non-linear features, and then a classifier combines all the features to make predictions (Larochelle, Bengio, Louradour, & Lamblin, 2009). Data mining models (e.g. Support Vector Machines and Decision Trees) are based on shallow learning and are unable to extract sophisticated features. Deep learning algorithms, on the other hand, can generalize relations in global ways (relationships beyond immediate neighbors) using Big Data. The process of Deep Learning is very close to human brain activities. In other words, Deep Learning learns the representation of data using a deep architecture with more layers. However, the hierarchical feature learning process suffers from the vanishing gradient problem. This problem can make these models perform poorly in comparison to shallow machine learning algorithms. Recurrent Neural Networks (RNN) are one example of neural networks that can overcome the vanishing gradient problem and increase accuracy by considering all past activities. Another major issue in Big Data is storing and retrieving data effectively. Deep Learning algorithms can overcome those problems by generating high-level abstract vector representations for faster information retrieval. Deep Learning algorithms can help to reduce data dimension and extract semantic features from a massive amount of text data effectively.

This section was devoted to understanding the concepts of Big Data, sentiment analysis and Deep Learning methods. The discussion also stressed the need for Deep Learning algorithms for sentiment analysis on a Big Data platform. The next section explains the detailed framework adopted for this case study.

4. Proposed framework

The proposed framework was implemented on the Hadoop big data platform (Fig. 1). Several APIs were used to supplement its functionality. Details of each component are discussed below.

Fig. 1. Proposed framework for sentiment analysis. [Layered diagram; labels: Prediction; Classification Modeling Using Deep Learning (Training); Data Preprocessing (Data Cleaning, Tokenizer, Sentiment Influencer Identification, Rule-based Score Engine, affect and sentiment extractors; affect, sentiment and linguistic lexicons); Data Acquisition (stream processor, APIs and message filter; Internet search engine and crawlers; social media sites); Stage: Hadoop Distributed File System and HBase (Data Node-1, Data Node-2, ..., Data Node-N).]

4.1. Data acquisition

People across the globe now use social media to express their sentiment about anything they feel like sharing with friends and society. There are various advantages and disadvantages of collecting and analysing social media data. Some of the benefits are: (1) people generally do not feel any pressure when expressing their views, as there is no limitation or compulsion; (2) the data are readily available and cost-effective. There are also some limitations: (1) protection of users' privacy limits data collection; (2) data are fragmented across platforms, so it is difficult to summarise them all at once. The second limitation is due to the variety and veracity of data. These features can be tackled very well by a Big Data platform. Therefore, this was another
reason to adopt Big Data platform for sentiment analysis.

The first step of sentiment analysis was to collect data from a variety of social media sources (Twitter, Facebook, Internet portals, etc.). The majority of the data for this work was collected from Twitter and Facebook. The Python Scrapy package (Scrapy, 2018) was used to collect EV reviews from the Web. Scrapy is a system in which the user writes spiders containing two sets of regular expressions (regex). Crawled URLs that match an expression in the first set are treated as content pages, and their information is sent to a parsing pipeline. Similarly, crawled URLs that match any expression in the second set are treated as linking pages, which hold links to other (content and linking) pages. In light of Facebook's promises to protect its users' data in the fallout of the Cambridge Analytica debacle, it was no longer easy to scrape data from Twitter and Facebook. Nevertheless, data for this research were collected from Facebook and Twitter, with some limitations. In this study, the messages posted during 2016–2018 were collected. Each message includes a messageID, a userID, user_country and user_text. A query string was applied to obtain only English-language messages while discarding the rest. Raw texts extracted from the messages were obtained at the end of this stage and stored in HDFS, which served as input to the next stage (preprocessing). Mathematically, the collected set of texts was represented as T = {t1, t2, …, tn}.

4.2. Data pre-processing

4.2.1. Data cleaning
First of all, the sentiments collected in the form of tweets or Facebook comments were filtered to obtain only the messages from India. Sentiment analysis of words generally encounters several issues such as poor spelling, use of abbreviations, poor punctuation, poor grammar, use of slang, etc. Therefore, these hurdles had to be removed using data preprocessing to ensure better results. Several subtasks were performed in the proposed pre-processing step. The first subtask was the removal of URLs from the text. For example, suppose the text before URL removal looks like this: "Mileage vs Cost | Tesla http://tesla/1CQm19V #EV". After removing the URL, the text becomes "Mileage vs Cost | Tesla #EV". The next subtask was hashtag removal, so that the text becomes cleaner: "Mileage vs Cost | Tesla EV". Then all characters were converted to lowercase. The next step was the removal of slang from the text. For this purpose, the WordNet dictionary was used to check the meaning of each word and decide whether it is a proper word or slang. The next step was to check for emoticons in the text. Any emoticon present in the text was replaced with the corresponding word. For example, ':)' was replaced with 'happy' and ':(' was replaced with 'sad'. For this purpose, a simple emoticon dictionary was created, which contains the most commonly used social media emoticons. If more than one emoticon was found in a single text, all of them were converted to text and then scored accordingly. For example, "Low-cost EV makes me happy :)" became "Low-cost EV makes me happy 'happy'". Therefore, the score of 'happy' was doubled because it occurs twice. Emoticons, when present, play an essential role in determining the polarity of a text.

The next step of preprocessing was to remove the stop words from the text. For this purpose, a testifier was used to check whether each word was a stop word, and the word was removed if identified as one. Finally, the obtained texts were filtered and stored in HDFS for further processing.

4.2.2. Sentiment influencers identification
The input to this task was the filtered text from the previous pre-processing phase. The filtered text was passed to a tokenizer. The
tokenizer separates each word in the text. Let W be the set of all words in a text, expressed as $W = \{w_1, w_2, \ldots, w_n\}$. Then a Part-of-Speech (POS) algorithm was applied to each text. A POS tagger is an algorithm that reads text in a particular language and assigns a part of speech, such as noun, verb, adverb or adjective, to every word in the text. The Stanford POS tagger (Python) was used for POS tagging in this framework.

This set of words (W) was passed to the POS tagger, which tags the words of the text as nouns (N), verbs (V), adverbs (Adv) and adjectives (Adj). These are also called sentiment influencers. A sentiment influencer may occur more than once in a single text. The output of this step was the set of tagged text (POS.TT) of sentiment influencers, expressed as $\mathrm{POS.TT} = \{N_{1..n}, V_{1..n}, \mathrm{Adj}_{1..n}, \mathrm{Adv}_{1..n}\}$.

4.2.3. Rule-based scoring engine
The output of the previous step (tagged text) was the input to this step. The score of the POS.TT was calculated using the SentiWordNet dictionary, which assigns a rating to each word in the text. If a word was not found in the SentiWordNet dictionary, it was then searched in the WordNet dictionary to check for its base words and synonyms. If found, the score was assigned to the word. The score obtained from the SentiWordNet dictionary was named the Standard Sentiment Score (S.S.S). Suppose 'α' is the score of a noun obtained from the SentiWordNet dictionary in the tagged text, 'β' is the score of a verb, 'γ' is the score of an adverb and 'δ' is the score of an adjective in POS.TT. Then the S.S.S was represented as $\{\alpha_{1..n}, \beta_{1..n}, \gamma_{1..n}, \delta_{1..n}\}$.

The next step was to create data corpora for each of the important features of EVs. A short survey with a sample size of 200 was conducted to identify the most exciting and determining EV features for Indian customers. The survey found that Price, Maintenance and Safety were the three important features when buying an EV. Therefore, the total data corpus was divided into three sub-corpora by comparing the noun terms to the three important features (Price, Maintenance and Safety). If a text contained more than one noun, the first noun was considered for the segregation. After dividing the total corpus into three sets (Price, Maintenance and Safety), the next step was to determine the polarity of each text. The sentiment polarity of each text in each sub-corpus was determined as follows. If a word was a verb, its score was multiplied by 2, because verbs describe the actions of nouns and play a significant role in determining the polarity of a text for sentiment analysis (Shamsudin, Basiron & Sa'aya, 2016; Ishtiaq, 2015). If a word was an adverb, its score was multiplied by 4, the highest weight, because adverbs modify verbs and adjectives and are used as intensifiers. Finally, if a word was an adjective, its score was multiplied by 3, i.e. between the weights of a verb and an adverb, because adjectives generally modify nouns. Mathematically, the Final Tagged Score (F.T.S) was represented as (Ishtiaq, 2015):

$$\mathrm{F.T.S} = \sum \left[ (\beta_{1..n} \times 2) + (\delta_{1..n} \times 3) + (\gamma_{1..n} \times 4) \right] \tag{1}$$

After the calculation of the final score, the text was checked for negation. If negation was found, the final score was inverted by multiplying it by −1.

Calculating the sentiment polarity of each text was crucial for further processing. The Final Sentiment Score (FSS) was calculated by first summing the tagged scores, each multiplied by its respective weight, and then dividing the sum by the total number of words (W) in the text. The mathematical representation of FSS is given below:

$$\mathrm{FSS} = \frac{\sum \mathrm{F.T.S}}{W} \tag{2}$$

The final sentiment score (positive, negative or neutral) of each text was appended at the end of each text. This score guides the classifier in the training phase. Each text was passed to the trained sentiment classifier, which classified the tweet as positive, negative or neutral. The sentiment classifier classifies the text as positive if the final score is positive, negative if the final score is negative, and neutral if the final score is 0. The output of this step was the three data sub-corpora (Price, Maintenance and Safety) and the polarity score of each text in the respective sub-corpus.

4.3. Sentiment analysis using deep learning

In the previous sections, some advantages of using Deep Learning on a Big Data platform were discussed. The specific characteristics of Big Data that can lead to challenges in adopting Deep Learning algorithms were also considered. In this section, the detailed steps of sentiment analysis using Deep Learning algorithms are discussed. In data mining, feature engineering is the most important and most challenging task. The effort involved in feature engineering determines the performance of the algorithms that can learn features by themselves. The hierarchical feature selection algorithm in Deep Learning extracts multiple layers of non-linear characteristics, and then a classifier combines all the elements to make predictions. On the other hand, data mining models (Support Vector Machines and Decision Trees) are based on shallow learning techniques. Therefore, they are not able to extract complex features due to the problem of vanishing gradient. However, Deep Learning methods can overcome the vanishing gradient problem and are capable of generating learning patterns and establishing relationships beyond immediate neighbors in big data (Bengio, 2013; Sohangir, Wang, Pomeranets, & Khoshgoftaar, 2018). Three Deep Learning techniques were implemented in this research. Details of these techniques are discussed below.

4.4. Doc2Vec algorithm

Doc2Vec was first applied to the sentiment analysis problem. Doc2Vec uses the paragraph as a memory to keep the order of the words in a sentence. Doc2Vec maps paragraphs and words to vectors. Le and Mikolov (2014) discussed the benefit of Doc2Vec architectures for sentiment classification. Each paragraph vector was a combination of two vectors, one learned by the distributed memory (DM) architecture and the other learned with the distributed bag-of-words (DBOW) architecture. The accuracy of the Doc2Vec model is generally affected by the window size, with a larger window size generally improving it. Doc2Vec was beneficial because, once paragraph vectors have been learned from labelled data, they can be used effectively for a task with unlabelled data. The distributed bag-of-words model ignores the input word context and instead predicts words randomly sampled from the paragraph. In each iteration of stochastic gradient descent, a text window and some random words from the chosen window were selected to achieve higher accuracy. At the end of the process, classes were formed based on the given paragraph vector. The distributed bag-of-words model was used in this study, as it is conceptually simple and needs less memory. Doc2Vec Deep Learning algorithms are powerful at extracting useful representations from various kinds of Big Data (Mikolov et al., 2013).

4.5. Recurrent neural network

Recurrent Neural Network (RNN) is another effective method in the deep learning framework. The most exciting feature of an RNN is that the input data are not treated as independent of each other. An RNN improves classification accuracy by using knowledge from previous iterations. Recurrent Neural Networks use the knowledge obtained in the last computation to perform the same task for every element of a sequence by allocating memory to capture the previously processed information. In practice, however, the vanishing gradient is a common problem in Deep Learning algorithms. Thankfully, a variety of methods help to address the vanishing gradient problem in RNNs. In this paper, the vanishing gradient problem was tackled by
(1) training the model with dozens of layers of non-linear hierarchical features; (2) generating learning patterns and establishing relationships beyond immediate neighbors in data (Bengio, 2013; Sohangir et al., 2018) using Long Short-Term Memory (LSTM); and (3) using the rectified linear unit (ReLU) (Cho et al., 2014; Nair & Hinton, 2010) instead of sigmoid activation functions in the RNN architectures.

RNNs have gained tremendous attention in the field of NLP. As discussed earlier, one of the characteristics of RNNs is that they can use their internal memory to process input sequences of arbitrary length. An RNN consists of a hidden state 'h' and an optional output 'y', and operates on a variable-length sequence $x = (x_1, \ldots, x_T)$. At each time step 't', the hidden state $h^{(t)}$ of the RNN is updated by $h^{(t)} = f(h^{(t-1)}, x_t)$, where f is a non-linear activation function. Unlike in other neural networks, quite complex activation functions, such as LSTM and GRU (Gated Recurrent Unit) cells, have been used for RNNs by many researchers. After the last word vector has been input to the model, the global polarity distribution of a text can be given by the softmax layer using $h^{(T)}$, as follows.

$$p(y_j = 1) = \frac{\exp(w_j h^{(T)})}{\sum_{j'=0}^{K-1} \exp(w_{j'} h^{(T)})} \tag{3}$$

where K is the number of classes, j = 0, …, K − 1, and $w_j$ is the j-th row of the weight matrix 'W' of the softmax layer. By reusing the hidden units of the previous layer, RNNs allow past information to be encoded inside the network. Such structures make it possible to compress a variable-length input into a fixed-length vector $h^{(T)}$.

4.6. Convolutional neural network

The fully-connected neural network is one of the most used deep learning technologies for classification tasks. The drawback of the fully-connected approach is the vast number of connections in these networks, which may lead to different problems. These problems can be further amplified by the high number of neurons required in text processing for big data. Also, it is believed that words which are close together in a sentence are more similar to each other than words which never appear close together in any sentence. But fully-connected neural networks treat all input words alike, whether they are far apart or close together in the sentence. The hierarchical learning process of Deep Learning makes it more expensive for high-dimensional data. In other words, these kinds of Deep Learning algorithms can stall when dealing with Big Data of large volume and variety. Convolutional Neural Networks (CNN), on the other hand, offer certain advantages that make them desirable for addressing the above-stated problems of the fully-connected neural network. First, each neuron in the first hidden layer is connected to a small region instead of to all available input neurons. This reduction in the number of connections reduces the overall complexity. Second, using equal weights for each of the hidden neurons helps to detect the same feature in different locations of the input text. Third, the information from the convolutional layers is transmitted to the output layer through pooling layers introduced in between (Kim, 2014). CNN is one of the Deep Learning methods that can be used efficiently for Big Data analysis. CNN is also one of the dominant models in Deep Learning, and it uses convolutional layers to filter inputs.

From the above discussion, it follows that CNNs require fewer parameters than fully connected networks with the same number of hidden layers, which makes them much easier to train and use.

Let $x_i$ be the k-dimensional word vector corresponding to the i-th word in a text; a text having 'n' words was represented as $x_{1:n} = x_1 \oplus x_2 \oplus \ldots \oplus x_n$, where ⊕ is the concatenation operator. To unify the matrix representation of texts of different lengths, the maximum length of all texts in the corpus was used as the fixed length for text matrices. For shorter texts, zero vectors were padded at the back of the text matrix. Next, a convolution operation across each text matrix was used to transform each window of words into a scalar $c_i$, formulated as follows.

$$c_i = f(w \cdot x_{i:i+h-1} + b) \tag{4}$$

$$c = [c_1, c_2, \ldots, c_{n-h+1}] \tag{5}$$

where w is a filter map, 'h' denotes the window size of a filter, 'f' is a non-linear activation function and 'b' represents a bias term. In this way, the regional word vectors $x_{i:i+h-1}$ in the text matrix were convolved to $c_i$. Then, the subsampling operation was conducted as follows:

$$c_{\max} = \max\{c\} \tag{6}$$

From the above equations, a filter generates one $c_{\max}$ from a text matrix. In practice, the convolution and subsampling operations were performed in pairs.

5. Experimental results

Two years (June 2016–May 2018) of social media data were collected from major auto manufacturers and service providers/operators of four-wheeler and two-wheeler companies in India, e.g. Chevrolet, Nissan, Tesla Motors, Tata Motors, Mahindra, Hero MotoCorp, ELECTROTHERM (INDIA) LTD, Xxplore Automotive Pvt. Ltd., AVERA New and Renewable Energy Moto Corp Tech Pvt. Ltd., Tunwal E Vehicles Private Limited, CEEON INDIA and alibaba.com etc.

In this case study, three essential features rated by Indian consumers (i.e. Price, Maintenance and Safety) were chosen for the analysis. A large corpus was built from online forums (≈98,000 texts), but only ≈37% of the corpus (over 36,000 texts) contained the required EV features. The main objective of the study was to extract opinions valuable to prospective buyers, marketers (for determining what features should be advertised) and manufacturers (for deciding what features should be improved) using deep learning techniques on the big data platform. Table 1 shows the polarity distribution of the corpus data set.

The average number of words in a text was 27, with a minimum of 8 and a maximum of 37. The instances of the dataset were given in the [<topic>, <text>, <polarity>] format, for example ["price", "the price is a little bit more in comparison to ….", "Negative"]. Before presenting the experimental results, let us first discuss the hyper-parameters used in our training. All the proposed models were trained for 100 epochs. The ADAM optimizer was used to optimize the loss function for RNN and CNN, with a learning rate of 0.0001 and a weight decay of 0.00001. For all the models, accuracy was used as the performance metric, and 10-fold cross-validation was used for model evaluation (Kwok Tai Chui, Fung, Miltiadis, & MiuLam, 2018). The Big Data platform used for this research consisted of 4 nodes. Each node had the following specification: Ubuntu 14.04.5 LTS (GNU/Linux 3.19.0-25-generic x86_64), Intel Core i7 CPU @ 3.40GHz, and 16GB RAM.

To compare the performance of deep learning methods, a Support Vector Machine (SVM) was used as a baseline model. SVM has been established as a robust machine learning sentiment miner in many past studies (Zainuddin and Selamat, 2014; R. Feldman, 2013; Tang, Tan & Cheng, 2009; Cortes & Vapnik, 1995). The machine learning methods based on SVM with n-gram features proposed by Pang, Lee, and Vaithyanathan (2002) and Go, Bhayani, and Huang (2009) were used in this study. The LibSVM-based implementation in the Python scikit-learn library was used for SVM, with cache_size = 200, decision_function_shape = 'ovo', degree = 3, gamma = 'scale' and kernel = 'rbf'.

Table 1
Polarity distribution of corpus.

Feature        Positive   Negative   Neutral
Price          6187       7756       2513
Maintenance    4590       5634       2056
Safety         2489       3124       2252
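
As a rough illustration of the cleaning subtasks in Section 4.2.1 and the rule-based scoring of Section 4.2.3 (Eqs. 1 and 2), the following Python sketch chains the two steps. It is a minimal sketch under stated assumptions, not the implementation used in this study: the emoticon dictionary, stop-word list, negation cues and the LEXICON table (which stands in for the POS tagger and the SentiWordNet lookup) are hypothetical placeholders, slang removal is omitted, nouns are given no weight, and W is taken as the number of tokens left after cleaning.

```python
import re

# Hypothetical placeholders for the resources described in the text.
EMOTICONS = {":)": "happy", ":(": "sad"}
STOP_WORDS = {"the", "a", "an", "is", "me"}
NEGATIONS = {"not", "no", "never"}                      # assumed negation cues
POS_WEIGHT = {"verb": 2, "adjective": 3, "adverb": 4}   # weights from Section 4.2.3

# Hypothetical (part of speech, score) entries standing in for the
# POS tagger and the SentiWordNet lookup.
LEXICON = {
    "happy": ("adjective", 0.5),
    "sad": ("adjective", -0.5),
    "saves": ("verb", 0.25),
    "really": ("adverb", 0.125),
}


def clean(text):
    """Apply the cleaning subtasks in order and return the remaining tokens."""
    text = re.sub(r"https?://\S+", " ", text)          # remove URLs
    text = text.replace("#", "")                       # drop the hashtag mark, keep the word
    text = text.lower()                                # lowercase
    for emoticon, word in EMOTICONS.items():           # emoticon -> word
        text = text.replace(emoticon, " " + word + " ")
    tokens = re.findall(r"[a-z0-9'-]+", text)          # simple tokenizer
    return [t for t in tokens if t not in STOP_WORDS]  # remove stop words


def final_sentiment_score(tokens):
    """Weighted sum of tagged scores (Eq. 1), negation flip, then division by W (Eq. 2)."""
    tagged = [LEXICON[t] for t in tokens if t in LEXICON]
    fts = sum(POS_WEIGHT[pos] * score for pos, score in tagged)
    if any(t in NEGATIONS for t in tokens):
        fts = -fts                                     # invert the score on negation
    return fts / len(tokens) if tokens else 0.0


def polarity(fss):
    """Map the final sentiment score to a class label."""
    if fss > 0:
        return "positive"
    if fss < 0:
        return "negative"
    return "neutral"


tokens = clean("Low-cost EV makes me happy :)")
fss = final_sentiment_score(tokens)
print(tokens, round(fss, 2), polarity(fss))
```

With these placeholder scores, the example text scores (3 × 0.5 + 3 × 0.5) / 5 = 0.6 and is labelled positive, because the emoticon contributes a second 'happy' to the five remaining tokens.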

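Equations (3) to (6) can be traced by hand on a toy input. The sketch below is a plain-Python illustration under stated assumptions, not the TensorFlow model used in this study: it uses a single filter, ReLU as the activation f, and made-up weights and word vectors. The function feature_map applies Eq. (4) to every window to build the vector of Eq. (5), max_pool is the subsampling of Eq. (6), and softmax normalises class scores as in Eq. (3).

```python
import math


def relu(z):
    """Rectified linear unit, used here as the non-linear activation f."""
    return max(0.0, z)


def feature_map(text_matrix, w, b, f=relu):
    """Slide one filter of window size h over the n word vectors (Eqs. 4 and 5)."""
    h = len(w)
    n = len(text_matrix)
    c = []
    for i in range(n - h + 1):
        window = text_matrix[i:i + h]
        # Element-wise product of the filter and the window, summed, plus the bias.
        z = sum(w[r][d] * window[r][d]
                for r in range(h)
                for d in range(len(w[r]))) + b
        c.append(f(z))
    return c


def max_pool(c):
    """Keep the largest activation of a feature map (Eq. 6)."""
    return max(c)


def softmax(scores):
    """Turn class scores into a probability distribution (Eq. 3)."""
    m = max(scores)                          # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy text matrix: n = 4 words, k = 2 dimensions; the last row is zero padding.
text_matrix = [[1, 0], [0, 1], [1, 1], [0, 0]]
w = [[1, 0], [0, 1]]                         # one filter with window size h = 2
c = feature_map(text_matrix, w, b=0.0)
print(c, max_pool(c))
print(softmax([2.0, 1.0, 0.0]))
```

Here the filter yields one value per window, n − h + 1 = 3 in total, and max-pooling keeps the largest, so each filter contributes a single feature regardless of text length.
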
The results of the SVM are presented in Table 2. The accuracy of SVM across all features lies between 0.63 and 0.72.

Table 2
SVM performance metrics (accuracy).

Feature        N-gram              Positive   Negative   Neutral
Price          Unigram             0.66       0.69       0.64
               Bigram              0.65       0.67       0.65
               Unigram + Bigram    0.67       0.70       0.66
Maintenance    Unigram             0.70       0.71       0.66
               Bigram              0.69       0.71       0.65
               Unigram + Bigram    0.70       0.72       0.67
Safety         Unigram             0.64       0.63       0.62
               Bigram              0.63       0.61       0.61
               Unigram + Bigram    0.64       0.64       0.65

At first, the Doc2Vec model was applied to the data. This model uses the paragraph/sentence as a memory to keep the order of the words in a sentence, and maps paragraphs, as well as words, to vectors. The model of Le and Mikolov (2014) was used for the Doc2Vec implementation in this research. In this experiment, each paragraph vector was a combination of two vectors: one was learned using the Distributed Memory (DM) architecture, and the other was learned with the distributed bag-of-words (DBOW) architecture. Generally, the accuracy of the Doc2Vec model is affected by the window size, with larger windows having higher accuracy. Window sizes of 5 and 10 were used to evaluate the model. The Gensim Python library was used to implement Doc2Vec. All words with a frequency of less than two were ignored. The results are shown in Table 3. As expected, the accuracy of Doc2Vec with a window size of 10 is higher than with a window size of 5, but the difference was negligible. There was not much difference between the accuracy of the Doc2Vec and SVM methods.

Table 3
Doc2vec performance metrics (accuracy).

Feature        Window size   Positive   Negative   Neutral
Price          5             0.68       0.71       0.67
               10            0.69       0.73       0.67
Maintenance    5             0.70       0.73       0.67
               10            0.70       0.73       0.67
Safety         5             0.64       0.64       0.63
               10            0.65       0.64       0.64

In the second part of the experiment, RNN was used to get better results in comparison to the SVM and Doc2Vec models. In this experiment, the LSTM (Gers, Schmidhuber, & Cummins, 2000; Graves, 2012) model was used instead of the basic RNN model due to its deeper memory structure. The Theano (Bergstra, Breuleux, Bastien, & Lamblin, 2015) Python library was used for LSTM. Average pooling was used as the pooling method. The size of the hidden layer was 100 for this experiment. The results of these experiments are presented in Table 4.

Table 4
RNN performance metrics (accuracy).

Feature        Positive   Negative   Neutral
Price          0.72       0.73       0.69
Maintenance    0.74       0.74       0.68
Safety         0.67       0.64       0.65

The results showed that LSTM increased the accuracy in comparison to Doc2Vec and SVM, but not significantly. Therefore, CNNs were adopted to see if they could help to improve the accuracy of EV sentiment analysis. CNN is very popular due to its ability to find the internal structures of Big Data. The Tensorflow (Pang, Lee, & Vaithyanathan, 2015) package in Python was used for CNN. To perform text classification with a CNN, usually the embeddings of the different words of a sentence (or paragraph) are stacked together to form a two-dimensional array, and then convolution filters (of varying length) are applied to a window of 'h' words to produce a new feature representation. Then pooling (usually max-pooling) is applied to the new features, and the pooled features from different filters are concatenated with each other to form the hidden representation. These representations are then followed by one (or multiple) fully connected layer(s) to make the final prediction. In this case study, word embedding was based on the pre-trained GloVe model, followed by a convolutional neural network with three filter sizes (3, 4, 5) and 100 feature maps for each filter. The hidden representation was then connected to two fully-connected layers and fed into a softmax classifier.

Table 5 shows the results of the CNN model. Comparing the accuracy of this model to the other three models (SVM, Doc2Vec and RNN), it was concluded that CNN outperformed the other models after 20,000 steps. After 20,000 steps, the accuracy of CNN was around 87%, which is considerably higher than that of the other models. From Table 5, it was observed that the prediction accuracy increases gradually with the number of steps. Comparative results of all four models (SVM, Doc2Vec, LSTM, and CNN) are presented in Table 6. Based on the results, it was observed that CNN was an effective model for EV sentiment analysis in this study.

Table 5
CNN performance metrics (accuracy).

Feature        Steps     Positive   Negative   Neutral
Price          200       0.51       0.49       0.52
               2000      0.79       0.73       0.69
               20,000    0.87       0.88       0.87
               60,000    0.89       0.91       0.88
Maintenance    200       0.53       0.54       0.51
               2000      0.81       0.79       0.73
               20,000    0.88       0.89       0.86
               60,000    0.92       0.92       0.88
Safety         200       0.51       0.48       0.52
               2000      0.78       0.77       0.71
               20,000    0.88       0.86       0.82
               60,000    0.90       0.88       0.87

Table 6
Performance comparison metrics (best accuracy).

Feature        Model     Positive   Negative   Neutral
Price          SVM       0.67       0.70       0.66
               Doc2vec   0.69       0.73       0.67
               RNN       0.72       0.73       0.69
               CNN       0.89       0.91       0.88
Maintenance    SVM       0.70       0.72       0.67
               Doc2vec   0.70       0.73       0.67
               RNN       0.74       0.74       0.68
               CNN       0.92       0.92       0.88
Safety         SVM       0.64       0.64       0.65
               Doc2vec   0.65       0.64       0.64
               RNN       0.67       0.64       0.65
               CNN       0.85       0.83       0.81

6. Discussion

This empirical case study applied a conceptual framework which
examines the sentiment of Indian consumers towards EVs. This
conceptual framework adopted deep learning algorithms to classify the
sentiment of consumers using a Big Data platform. Deep learning-based
algorithms were used to build the system due to their excellent
performance and promise in many areas, e.g. text processing, image
processing, and natural language processing. Deep learning is more
suitable for addressing the data analysis and learning problems in Big
Data. In contrast to data mining approaches with their shallow learning
process, deep learning is a robust technique that transforms inputs
through more layers for a better result. Hidden layers in deep learning
are generally used to extract features or data representations more
sensibly than traditional data mining techniques do. The hierarchical
learning process in deep learning provides the opportunity to find word
semantics and relations more accurately. These attributes make deep
learning one of the most desirable models for sentiment analysis. Based on
the results obtained in this study, it was concluded that convolutional
neural networks were the best model for finding sentiment polarity in the
EV data. In a standard data mining approach to text classification (such
as SVM), documents are represented as bag-of-words vectors. Unfortunately,
these vectors record which words appear in a document but do not consider
the order of the words in a sentence. Therefore, classification accuracies
were not satisfactory. CNN, in contrast, provides the opportunity to use
n-gram features to extract the sentiment of a document efficiently. CNN
benefits from the internal structure of the data, where each computation
unit responds to a small neighbourhood of the input data. Based on the
experimental results, the accuracy of the CNN algorithm was considerably
better than that of the other deep learning algorithms. The results
obtained from this study and the proposed model can be useful to
prospective buyers, marketers (for determining what features should be
advertised) and manufacturers (for deciding what features should be
improved).

From this case study, price, maintenance and safety were found to be
commonly cited features for EV adoption (A. G. Boulanger, Chu, Maxx,
& Waltz, 2011; Liao, Molin, & van Wee, 2017) (Fig. 2). For price and
maintenance, the majority of sentiments were negative. The price
mentioned here includes the on-road price of the vehicle. The negative
sentiment towards price was mainly due to the prevailing quoted price of
electric vehicles in India in comparison with conventional
(Petrol/Diesel/Gasoline) vehicles (Bansal & Agarwal, 2018; Liao et al.,
2017). Nowadays, different state governments and the central government
are declaring incentives for electric vehicles, but these are not enough
to attract a sizable number of consumers to adopt EVs. Maintenance is
commonly cited as a major selling point of EVs (Bansal & Agarwal, 2018;
Liao et al., 2017; Tamor, Moraal, Reprogle, & Milačić, 2015). This is
because the absence of a conventional engine in an EV implies fewer moving
parts to fail and fewer fluids to change.

On the other hand, the maintenance of the battery and the uncertainty
of its life due to immature battery technology adversely affected
the consumers' sentiments. Therefore, the majority of sentiments towards
maintenance were found to be negative (Fig. 2). Some comments showed that
consumers did not enjoy the EV charging process compared with the
refuelling process of conventional vehicles. This may be due to the time
and technologies associated with battery charging (Glerum, Stankovikj, &
Bierlaire, 2014) and the unavailability of on-road charging points in many
parts of the country. During result analysis, a few comments were observed
on consumers' experience with battery degradation due to the effect of
repeated charging and the impact of climate on battery capacity over
time. It is known that weather and charging cycles generally affect
battery life, but it is unknown to what extent climate affects battery
life (A. G. Boulanger et al., 2011). It was also observed from the
comments that some consumers were experiencing non-trivial battery
degradation problems, and about 60% of sentiments towards degradation
were negative. Designers and manufacturers can use these sentiments in
unison with Indian climate patterns to derive conclusions about the
effect of climate on battery capacity.

Further, it was observed that consumers in high-temperature regions
(central India) posted more often about battery degradation issues.
Warranty was another frequently discussed term in sentiment messages,
which were mostly negative because some consumers had filed for battery
replacements through battery warranties against capacity loss. It was
also observed that the majority of neutral sentiments concerned safety
issues. This may be because Indian consumers do not take safety issues
seriously or are unaware of the safety issues of EVs. Analysis of some
other classified posts revealed other vehicle problems, such as
replacement issues and the maintenance rate of various parts, which can be
useful to designers and manufacturers.

In this study, some features were hard to mine or classify. For
example, the proposed system performed poorly on classifying opinions
related to safety. The word 'issue' was found to be used often and to be
heavily overloaded. In some cases it was used synonymously with hazard,
for example 'that was a safety issue!'; in other instances, it was used
synonymously with another feature, for example, 'grounded charging is a
safety issue'. Therefore, it was difficult to detect programmatically
whether each such phrase expressed a negative or a neutral sentiment.
Moreover, it was noticed that price (even though it was cited as a
significant adoption barrier) had fewer genuine comments. Many posts
commented on the price of something other than the EV itself, such as the
price of charging or electricity. Finally, some words were
context-dependent even within the context of one feature, which induces
classification errors due to the ambiguity and complex structure of
English. Similarly, some ambiguous sentences are complicated to classify
even for human readers. For example, a parameter value such as "100 miles
per gallon" sounds favourable to some people, but it is hard to impose a
strict cutoff that other people would agree on, e.g., "all mileages over
X are positive". So, algorithmically, it was a considerable challenge to
classify them correctly.
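
The contrast drawn in this section between bag-of-words vectors and the n-gram features available to a CNN can be made concrete with a small sketch. It is illustrative only and is not the model used in this study: the two example sentences, the two-dimensional toy word vectors and the single hand-set filter are hypothetical, and the helper names (bag_of_words, ngrams, conv_max_pool, embed) are introduced here purely for illustration. The sketch assumes one filter spanning a window of h = 2 words followed by max-pooling, in the spirit of the stacking, convolution and pooling steps described for the CNN.

```python
from collections import Counter


def bag_of_words(tokens):
    """Count each word; word order is discarded."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Return the ordered list of n-word windows in the token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def conv_max_pool(embedded, filt):
    """Slide one filter over windows of h = len(filt) words, then max-pool.

    embedded: list of word vectors (one list of floats per word)
    filt:     list of h weight vectors with the same dimensionality
    Assumes the sentence has at least h words.
    """
    h = len(filt)
    responses = []
    for start in range(len(embedded) - h + 1):
        window = embedded[start:start + h]
        # Dot product between the filter and the stacked window of embeddings.
        score = sum(
            w * x
            for f_row, x_row in zip(filt, window)
            for w, x in zip(f_row, x_row)
        )
        responses.append(score)
    # Max-pooling keeps only the strongest response over all positions.
    return max(responses)


# Hypothetical toy embeddings: every other word maps to the zero vector.
TOY_VECTORS = {"not": [1.0, 0.0], "good": [0.0, 1.0]}


def embed(tokens):
    """Stack word vectors into a 2-D array (list of rows), one row per word."""
    return [TOY_VECTORS.get(t, [0.0, 0.0]) for t in tokens]


# Two hypothetical sentences with the same words in a different order.
sent_a = "charging is not cheap but good".split()
sent_b = "charging is not good but cheap".split()

# Hand-set filter that responds most strongly to "not" followed by "good".
NOT_GOOD_FILTER = [[1.0, 0.0], [0.0, 1.0]]

same_bow = bag_of_words(sent_a) == bag_of_words(sent_b)
same_bigrams = ngrams(sent_a, 2) == ngrams(sent_b, 2)
score_a = conv_max_pool(embed(sent_a), NOT_GOOD_FILTER)
score_b = conv_max_pool(embed(sent_b), NOT_GOOD_FILTER)
print(same_bow, same_bigrams, score_a, score_b)
```

In this toy setting the two sentences have identical bag-of-words counts, so a bag-of-words classifier cannot tell them apart, whereas their bigrams differ and the filter responds more strongly to the sentence containing the phrase "not good". In a trained CNN the filter weights would be learned from data rather than set by hand.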
Fig. 2. Sentiment polarity. [Figure: sentiment categories (Positive, Negative, Neutral) on the horizontal axis, with series for Price, Maintenance and Safety.]

7. Conclusions

Understanding consumers' sentiments and perceptions towards EVs
can be useful for building the latest generation of environmentally
friendly cars. Unfortunately, it is costly and time-consuming to conduct
field trials or targeted surveys of potential EV users. Therefore, a
sentiment mining framework was adopted in this study that classifies
opinions/sentiments towards EVs. The proposed system will help users
obtain a high-level overview of sentiments on various product features,
and dramatically reduces the length of the text that users are required to
read to extract opinions about the products/services. These opinions
will be useful to prospective buyers, marketers (what features should be
advertised?) and manufacturers (what features should be improved?).

There are several avenues for extending and improving this work,
such as (1) sentiments generally change over time, so analyzing
sentiments periodically, e.g., weekly or monthly, can help to find how
sentiment changes with vehicle performance and prices. Thus, tracking
consumers' opinions over time can reveal insights about how sentiment
changes with vehicle experience over a more extended period; (2) this
case study did not perform pronoun resolution. The present model only
categorised opinionated sentences in which an explicit or implicit feature
expressed using nouns, verbs, adjectives and adverbs was found. Thus,
pronoun resolution may be useful to infer sophisticated features more
accurately; (3) this study did not perform spam or malicious text
detection. All sentences were treated equally even though some text may
contain sentences injected by malicious sources; (4) in the feature
selection step, a simple method was adopted to partition the data by
comparing a noun to the required features such as price, maintenance and
safety. If more than one feature was found in a sentence, the sentence was
classified based on the first matching noun. This is not an efficient way
to organise the data into different corpora. More sensitive methods may
help to improve the model performance metrics in future.

References

ACEA (2017). Electric vehicles. ACEA - European Automobile Manufacturers’ Association. Retrieved November 30, 2017, from http://www.acea.be/industrytopics/tag/category/electric-vehicles.
Aggarwal Charu, C., Zhai, & Xiang, C. (2012). Mining text data. Springer New York Dordrecht Heidelberg London: Springer Science+Business Media.
Agnihotri, R., Dingus, R., Hu, M. Y., & Krush, M. T. (2015). Social media: Influencing customer satisfaction in B2B sales. Industrial Marketing Management, 53, 172–180.
Alsaeedi, A., & Khan, M. Z. (2019). A study on sentiment analysis techniques of twitter data. International Journal of Advanced Computer Science and Applications, 10(2), 361–374.
Anderson, C. D., & Anderson, J. (2010). Electric and hybrid cars: A history (2nd ed.). Jefferson: McFarland & Company, Inc.
Axsen, J., Kurani, K. S., & Burke, A. (2010). Are batteries ready for plug-in hybrid buyers? Transport Policy, 17, 173–182.
Bamberg, S., & Möser, G. (2007). Twenty years after Hines, Hungerford, and Tomera: A new meta-analysis of psycho-social determinants of pro-environmental behavior. Journal of Environmental Psychology, 27(1), 14–25.
Bansal, A., & Agarwal, A. (2018). Comparison of electric and conventional vehicles in Indian market: Total cost of ownership, consumer preference and best segment for electric vehicle. International Journal of Science and Research (IJSR), 7(8), 683–695.
Batrinca, B., & Treleaven, P. C. (2014). Social media analytics: A survey of techniques, tools and platforms. AI & Society, 30, 89–116.
Bengio, Y. (2013). Deep learning of representations: Looking forward. International conference on statistical language and speech processing (pp. 1–37). Berlin: Springer.
Bengio, Y., Courville, A., & Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828.
Bergstra, J., Breuleux, O., Bastien, F., & Lamblin, P. (2015). Thumbs up? Sentiment classification using machine learning techniques. Python in Science, 9, 23–27.
Bing, Li, & Chan, K. C. C. (2014). A Fuzzy Logic Approach for Opinion Mining on Large Scale Twitter Data. Proceedings of the 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing (UCC ’14) (pp. 652–657). USA: IEEE Computer Society. https://doi.org/10.1109/UCC.2014.105.
Bloomberg Business. India Races Forward in Electric Cars. (2011). Available at: https://www.bloomberg.com/opinion/articles/2018-12-22/india-gets-policies-on-electric-car-infrastructure-right (accessed on Aug, 2019).
Boulanger, A. G., Chu, A. C., Maxx, S., & Waltz, D. L. (2011). Vehicle electrification: Status and issues. Proceedings of the IEEE, 99(6), 1116–1138.
Bravo-Marquez, F., Mendoza, M., & Poblete, B. (2014). Meta-level sentiment models for big social data analysis. Knowledge-Based Systems, 69, 86–99.
Breetz, H. L., & Salon, D. (2018). Do electric vehicles need subsidies? Ownership costs for conventional, hybrid, and electric vehicles in 14 U.S. cities. Energy Policy, 120, 238–249.
Brown, M. (2013). Catching the PHEVer: Simulating electric vehicle diffusion with an agent-based mixed logit model of vehicle choice. Journal of Artificial Societies and Social Simulation, 16, 5.
Carley, S., Krause, R. M., Lane, B. W., & Graham, J. D. (2013). Intent to purchase a plug-in electric vehicle: A survey of early impressions in large US cities. Transportation Research Part D: Transport and Environment, 18, 39–45.
Carlucci, F., Cirà, A., & Lanza, G. (2018). Hybrid electric vehicles: Some theoretical considerations on consumption behaviour. Sustainability, 10, 1302.
Casey, W., Navendu, G., & Shlomo, A. (2005). Using appraisal groups for sentiment analysis. Proceedings of the ACM SIGIR Conference on Information and Knowledge Management (CIKM) (pp. 625–631).
Chen, H., Chiang, R. H. L., & Storey, V. C. (2012). Business intelligence and analytics: From big data to big impact. MIS Quarterly, 36–40.
Cho, K., Bart, M., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Christensen, C. M. (1997). The Innovator’s dilemma: When new technologies cause great firms to fail. Boston: Harvard Business School Press.
Christidis, P., & Focas, C. (2019). Factors affecting the uptake of hybrid and electric vehicles in the European Union. Energies, 12(18), 3414.
Chui, K. T., Fung, D. C. L., Miltiadis, D. L., & MiuLam, T. (2018). Predicting at-risk university students in a virtual learning environment via a machine learning algorithm. Computers in Human Behavior. https://doi.org/10.1016/j.chb.2018.06.032.
Conejero, J., Burnap, P., Rana, O., & Morgan, J. (2013). Scaling archived social media data analysis using a hadoop cloud. Proceedings of the IEEE 6th International Conference on Cloud Computing, Jun. 28-Jul. 3 (pp. 685–692). Santa Clara, CA: IEEE Xplore Press. https://doi.org/10.1109/CLOUD.2013.120.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Daramy-Williams, E., Anable, J., & Grant-Muller, S. (2019). A systematic review of the evidence on plug-in electric vehicle user experience. Transportation Research Part D: Transport and Environment, 71, 22–36.
Degirmenci, K., & Breitner, M. H. (2017). Consumer purchase intentions for electric vehicles: Is green more important than price and range? Transportation Research Part D, 51, 250–260.
Dijk, M., Orsato, R. J., & Kemp, R. (2013). The emergence of an electric mobility trajectory. Energy Policy, 52, 135–145.
Ding, W., Song, X., Guo, L., Xiong, Z., & Hu, X. (2013). A novel hybrid HDP-LDA model for sentiment analysis. Proceedings of the IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies, Nov. 17-20 (pp. 329–336). Atlanta, GA: IEEE Xplore Press. https://doi.org/10.1109/WI-IAT.2013.47.
Eberle, U., & Von Helmolt, R. (2010). Sustainable transportation based on electric vehicle concepts: A brief overview. Energy & Environmental Science, 3, 689–699.
Egbue, O., & Long, S. (2012). Barriers to widespread adoption of electric vehicles: An analysis of consumer attitudes and perceptions. Energy Policy, 48, 717–729.
Ewing, Gordon, & Sarigollu, Emine (2000). Assessing Consumer Preferences for Clean-Fuel Vehicles: A Discrete Choice Experiment. Journal of Public Policy & Marketing, 19, 106–118.
Fan, J., Han, F., & Liu, H. (2014). Challenges of big data analysis. National Science Review, 1(2), 293–314.
Fang, Y., Chen, X., Song, Z., Wang, T., & Cao, Y. (2017). Modelling propagation of public opinions on microblogging big data using sentiment analysis and compartmental models. International Journal on Semantic Web and Information Systems (IJSWIS), 13(1), 11–27.
Feldman, R. (2013). Techniques and applications for sentiment analysis. Communications of the ACM, 56(4), 82–89.
Fernández, R. (2018). A more realistic approach to electric vehicle contribution to greenhouse gas emissions in the city. Journal of Cleaner Production, 172, 949–959.
Franke, F., & Krems, J. F. (2013). What drives range preferences in electric vehicle users? Transport Policy, 30, 56–62.
Fulse, S., Sugandhi, R., & Mahajan, A. (2014). A survey on multimodal sentiment analysis. Int. J. Eng. Res. Technol., 3, 1233–1238.
Geneva Motor Show (2011). Geneva motor show 2011. Retrieved Feb, 2019, from https://en.wikipedia.org/wiki/Geneva_Motor_Show.
Gers, F., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12, 2451–2471.
Glerum, A., Stankovikj, L., & Bierlaire, M. (2014). Forecasting the demand for electric vehicles: Accounting for attitudes and perceptions. Transportation Science, 48(4), 483–499.
Glorot, X., Bordes, A., & Bengio, Y. (2011). Domain adaptation for large-scale sentiment classification: A deep learning approach. Proceedings of the 28th international conference on machine learning (ICML-11), 513–520.
Go, A., Bhayani, R., & Huang, L. (2009). Twitter sentiment classification using distant supervision. CS224N project report. Stanford, 1, 12.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. http://www.deeplearningbook.org.
Graham-Rowe, E., Gardner, B., Abraham, C., Skippon, S., Dittmar, H., Hutchins, R., & Stannard, J. (2012). Mainstream consumers driving plug-in battery-electric and plug-in hybrid electric cars: A qualitative analysis of responses and evaluations. Transportation Research Part A, 46, 140–153.
Graves, A. (2012). Supervised sequence labelling with recurrent neural networks. Heidelberg: Springer.
Green Car Institute (2003). Study of NEV user behavior in California. CA, USA: San Luis Obispo.
Greene, D., Hossain, A., Hofmann, J., Helfand, G., & Beach, R. (2018). Consumer willingness to pay for vehicle attributes: What do we know? Transportation Research Part A: Policy and Practice, 118, 258–279.
Harrigan, P., Soutar, G., Choudhury, M. M., & Lowe, M. (2014). Modelling CRM in a social media age. Australasian Marketing Journal, 23, 27–37.
He, W., Wu, H., Yan, G., Akula, V., & Shen, J. (2015). A novel social media competitive analytics framework with sentiment benchmarks. Information Management, 52, 801–812.
Higueras-Castillo, E., Molinillo, S., Coca-Stefaniak, J. A., & Liébana-Cabanillas, F. (2019). Perceived value and customer adoption of electric and hybrid vehicles. Sustainability, 11, 4956. https://doi.org/10.3390/su11184956.
Ishtiaq, M. (2015). Sentiment analysis of twitter data using sentiment influencers. Journal of Intelligent Computing, 6(1), 17–25.
Jensen, A. F., Cherchi, E., & Mabit, S. L. (2013). On the stability of preferences and attitudes before and after experiencing an electric vehicle. Transportation Research Part D: Transport and Environment, 25, 24–32.
Kasim, H., Ibrahim, Z.-A., & Al-Ghaili, A. M. (2020). Reducing climate change for future transportation: Roles of computing. Computational Science and Technology, 43–54.

Kim, Y. (2014). Convolutional neural networks for sentence classification. Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 1746–1751. October 25–29, 2014, Doha, Qatar.
Kranjc, J., Smailović, J., Podpečan, V., Grčar, M., Žnidaršič, M., et al. (2014). Active learning for sentiment analysis on data streams: Methodology and workflow implementation in the ClowdFlows platform. Inform. Process. Manage, 51, 187–203.
Krupa, J. S., Rizzo, D. M., Eppstein, M. J., Brad Lanute, D., Gaalema, D. E., Lakkaraju, K., & Warrender, C. E. (2014). Analysis of a consumer survey on plug-in hybrid electric vehicles. Transportation Research Part A: Policy and Practice, 64, 14–31.
Kühl, N., Goutier, M., Ensslen, A., & Jochem, P. (2019). Literature vs. twitter: Empirical insights on customer needs in e-mobility. Journal of Cleaner Production, 213, 508–520.
Lane, B., & Poter, S. (2007). The adoption of cleaner vehicles in the UK: Exploring the consumer attitude-action gap. Journal of Cleaner Production, 15, 1085–1092.
Larminie, J., & Lowry, J. (2003). Electric vehicle technology explained. John Wiley & Sons.
Larochelle, H., Bengio, Y., Louradour, J., & Lamblin, P. (2009). Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10, 1–40.
Larson, P. D., Viáfara, J., Parsons, R. V., & Elias, A. (2014). Consumer attitudes about electric cars: Pricing analysis and policy implications. Transportation Research Part A: Policy and Practice, 69, 299–314.
Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. International conference on machine learning, 31, 25–28.
Lee, J., & Madanat, S. (2017). Optimal design of electric vehicle public charging system in an urban network for greenhouse gas emission and cost minimization. Transp. Res. Part C Emerg. Technol, 85, 494–508.
Liao, F., Molin, E., & van Wee, B. (2017). Consumer preferences for electric vehicles: A literature review. Transport Reviews, 37(3), 252–275.
Liu, B., Blasch, E., Chen, Y., Shen, D., & Chen, G. (2013). Scalable sentiment classification for big data analysis using naive bayes classifier. Proceedings of the International Conference on Big Data, Oct. 6-9 (pp. 99–104). Silicon Valley, CA: IEEE Xplore Press. https://doi.org/10.1109/BigData.2013.6691740.
Liu, T., Zou, Y., Liu, D., et al. (2015). Reinforcement learning of adaptive energy management with transition probability for a hybrid electric tracked vehicle. IEEE Transactions on Industrial Electronics, 62(12), 7837–7846.
Ma, S.-C., Fan, Y., Guo, J.-F., Xu, J.-H., & Zhu, J. (2019). Analysing online behaviour to determine Chinese consumers’ preferences for electric vehicles. Journal of Cleaner Production, 229(20), 244–255.
Medhat, W., Hassan, A., & Korashy, H. (2014). Sentiment analysis algorithms and applications: A survey. Ain Shams Engineering Journal, 5, 1093–1113.
Mihanović, A., Gabelica, H., & Krstić, Ž. (2014). Big data and sentiment analysis using KNIME: Online reviews vs. social media. Proceedings of the 37th international convention on information and communication technology, electronics and microelectronics, May 26-30 (pp. 1464–1468). Opatija: IEEE Xplore Press. https://doi.org/10.1109/MIPRO.2014.6859797.
Mikalai, T., & Palpanas (2012). Themis. Survey on mining subjective data on the web. Data Mining and Knowledge Discovery, 24, 478–514.
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
Min, X., Qiang, M., & Yisi, L. (2017). Public’s perception of adopting electric vehicles: A case study of Singapore. Journal of the Eastern Asia Society for Transportation Studies, 12, 285–298.
Mukkamala, R. R., Hussain, A., & Vatrapu, R. (2014). Fuzzy-set based sentiment analysis of big social data. Proceedings of the 18th International Enterprise Distributed Object Computing Conference, Sept. 1–5 (pp. 71–80). Ulm: IEEE Xplore Press. https://doi.org/10.1109/EDOC.2014.19.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. Proceedings of the 27th international conference on machine learning. Israel: Haifa.
Okada, T., Tamaki, T., & Managi, S. (2019). Effect of environmental awareness on purchase intention and satisfaction pertaining to electric vehicles in Japan. Transportation Research Part D: Transport and Environment, 67, 503–513.
O’Neill, E., Moore, D., Kelleher, L., & Brereton, F. (2019). Barriers to electric vehicle uptake in Ireland: Perspectives of car-dealers and policy-makers. Case Studies on Transport Policy, 7(1), 117–128.
Onwezen, M. C., Antonides, G., & Bartels, J. (2013). The norm activation model: An exploration of the functions of anticipated pride and guilt in pro-environmental behavior. Journal of Economic Psychology, 39, 141–153.
Pang, B., Lee, L., & Vaithyanathan, S. (2015). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. Preliminary white paper, 9, 1–7.
Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up? Sentiment classification using machine learning techniques. EMNLP (pp. 79–86).
Park, E., Lim, J., & Cho, Y. (2018). Understanding the emergence and social acceptance of electric vehicles as next-generation models for the automobile industry. Sustainability, 10, 662–675.
Poovanna, S. (2017). Karnataka govt approves electric vehicle policy to reduce dependency on fossil fuels. Retrieved Nov 2017 from https://www.livemint.com/.
Prom-on, S., Ranong, S. N., & Jenviriyakul, P. (2014). DOM: A big data analytics framework for mining Thai public opinions. Proceedings of the International Conference on Computer, Control, Informatics and Its Applications, Oct. 21–23 (pp. 1–6). Bandung: IEEE Xplore Press. https://doi.org/10.1109/IC3INA.2014.7042591.
Proost, S., & Van Dender, K. (2010). What sustainable road transport future? Trends and policy options; OECD/ITF joint transport research centre discussion paper (No. 2010–14). Retrieved Jan 2018 from https://www.itfoecd.org/sites/default/files/docs/dp201014.pdf.
PTI (2015). FAME-India scheme launched to offer sops on hybrid, e-vehicles. Retrieved Nov 2017 from https://economictimes.indiatimes.com/industry/fame-india-scheme-launched-to-offer-sops-on-hybrid-e-vehicles/articleshow/46853934.cms.
Ravi, K., & Ravi, V. (2015). A survey on opinion mining and sentiment analysis: Tasks, approaches and applications. Knowledge-Based Systems, 89, 14–46.
Rogers, E. M. (2010). Diffusion of innovations (4th ed.). New York, NY: Simon and Schuster.
Schuitema, G., Anable, J., Skippon, S., & Kinnear, N. (2013). The role of instrumental, hedonic and symbolic attributes in the intention to adopt electric vehicles. Transportation Research Part A: Policy and Practice, 48, 39–49.
Serrano-Guerrero, J., Olivas, J. A., Romero, F. P., & Herrera-Viedma, E. (2015). Sentiment analysis: A review and comparative analysis of web services. Information Sciences, 311, 18–38.
Scrapy. Available at https://scrapy.org/ (Accessed on Jan 2019).
Shamsudin, Nurul Fathiyah, Basiron, Halizah, & Sa’aya, Zurina (2016). Lexical based sentiment analysis-verb, adverb & negation. Journal of Telecommunication, Electronic and Computer Engineering (JTEC), 8(2), 161–166.
Sharef, N. M., & Haghanikhameneh, F. (2014). Content-based analysis method for sentiment scoring in microblogging mining. In H. Fujita, & S. Corporation (Eds.). New trends in software methodologies, tools and techniques (pp. 398–414). Amsterdam: IOS Press. ISBN-10: 1607506297.
Shih, E., & Schau, H. J. (2011). To justify or not to justify: The role of anticipated regret on consumers’ decisions to upgrade technological innovations. Journal of Retailing, 87(2), 242–251.
Singer, M. (2017). The barriers to acceptance of plug-in electric vehicles: 2017 update, national renewable energy laboratory, technical report, NREL/TP-5400-70371.
Skippon, S., & Garwood, M. (2011). Responses to battery electric vehicles: UK consumer attitudes and attributions of symbolic meaning following direct experience to reduce psychological distance. Transportation Research Part D, 16, 525–531.
Sohangir, S., Wang, D., Pomeranets, A., & Khoshgoftaar, T. M. (2018). Big data: Deep learning for financial sentiment analysis. J Big Data, 5, 3. https://doi.org/10.1186/s40537-017-0111-6.
Sovacool, B. K., & Hirsh, R. F. (2009). Beyond batteries: An examination of the benefits and barriers to plug-in hybrid electric vehicles (PHEVs) and a vehicle-to-grid (V2G) transition. Energy Policy, 37, 1095–1103.
Sun, X., & Xu, S. (2018). The impact of government subsidies on consumer preferences for alternative fuel vehicles. J. Dalian Univ. Technol. (Soc. Sci.), 3, 8–16.
Tamor, M. A., Moraal, P. E., Reprogle, B., & Milačić, M. (2015). Rapid estimation of electric vehicle acceptance using a general description of driving patterns. Transportation Research Part C: Emerging Technologies, 51, 136–148.
Tang, H., Tan, S., & Cheng, X. (2009). A survey on sentiment detection of reviews. Expert Systems with Applications, 36, 10760–10773.
Thein, S., & Chang, Y. S. (2014). Decision making model for lifecycle assessment of lithium-ion battery for electric vehicle - A case study for smart electric bus project in Korea. Journal of Power Sources, 249, 142–147.
Vij, A. (2020). Understanding consumer demand for new transport technologies and services, and implications for the future of mobility. In N. Biloria (Ed.). Data-driven multivalence in the built environment (pp. 91–107). Cham: Springer.
Watson, L., & Spence, M. T. (2007). Causes and consequences of emotions on consumer behaviour: a review and integrative cognitive appraisal theory. European Journal of Marketing, 41(5/6), 487–511.
White, L. V., & Sintov, N. D. (2017). You are what you drive: Environmentalist and social innovator symbolism drives electric vehicle adoption intentions. Transp. Res. Part A Policy Pract., 99, 94–113.
Yang, S., Deng, C., Tang, T., & Qian, Y. (2013). Electrical vehicle’s energy consumption of car-following models. Nonlinear Dynamics, 71, 323–329.
Yang, S., Zhang, D., Fu, J., Fan, S., & Ji, Y. (2018). Market cultivation of electric vehicles in China: A survey based on consumer behavior. Sustainability, 10, 4056. https://doi.org/10.3390/su10114056.
Yu, Y., & Wang, X. (2015). World cup 2014 in the twitter world: A big data analysis of sentiments in U.S. sports fans’ tweets. Computers in Human Behavior, 48, 392–400.
Zainuddin, Nurulhuda, & Selamat, Ali (2014). Sentiment analysis using support vector machine. Proceedings of In Computer, Communications, and Control Technology (I4CT). IEEE, 333–337.


