Original Scientific Article Journal of Decision Analytics and Intelligent Computing

Vol. 3 issue 1, (2023) 167-184 https://doi.org/10.31181/jdaic10008102023n

Pre-owned car price prediction by employing machine learning techniques
Mauparna Nandan1,*, and Debolina Ghosh2

1 Department of Computer Applications, Techno Main Saltlake, Kolkata, India

2 Department of Information Technology, Haldia Institute of Technology, Haldia, India

* Correspondence: mauparna2011@gmail.com

Received 7 May 2023

Accepted for publication 27 September 2023
Published 8 October 2023

Abstract

Pre-owned automobiles, including cars, are becoming incredibly popular. Automobile production, particularly of
passenger cars, has increased steadily over the preceding decade, with more than 70 million passenger
cars manufactured in 2016 alone. This has given rise to the resale automobile market, which has become a
thriving business in its own right. Customers interested in purchasing a pre-owned car frequently have
difficulty locating a vehicle that fits within their financial constraints and estimating the price of a
specific pre-owned car. Access to accurate price projections allows customers to make better-informed
decisions about such a purchase. With the proliferation of digital
marketplaces, both buyers and sellers are better informed about the market trends and
patterns that affect the value of a used car. In this paper, we investigate this issue and propose a forecasting
system using machine learning techniques that enables a prospective buyer to anticipate the price of a
pre-owned vehicle of interest. The process begins with the collection and pre-processing of a dataset, followed
by an exploratory data analysis. Various machine learning regression techniques, such as Linear Regression,
LASSO (Least Absolute Shrinkage and Selection Operator) Regression, Decision Tree, Random Forest, and
Extreme Gradient Boosting, are then implemented and compared to
determine the best-performing model. Three error measures, namely MAE, MSE and RMSE, are calculated in
order to identify the best-fitted model.

Keywords: Price Prediction, Machine Learning, Mean Absolute Error (MAE), Mean Squared Error (MSE), Root
Mean Squared Error (RMSE)

1. Introduction

The pre-owned automobile market is an ever-growing industry that has almost doubled its market value in the
past decade, and second-hand cars have become very popular worldwide. The manufacturer sets
the prices of new cars in the market, and the government imposes additional taxes. As a result, customers who
purchase new cars can be confident that their investment is worthwhile. However, the high cost of new cars and
customers' inability to afford them due to financial constraints have led to a rise in global sales of used cars
(Arora et al., 2022). Middle-class buyers who cannot afford brand new, expensive cars
can buy used cars instead, and pre-owned car sales have increased considerably as a result. With
the proliferation of internet marketplaces such as CarDheko, Quikr, Carwale, Cars24, and many more, it is now
easier than ever for both buyers and sellers to learn about the factors that affect a used car's price (Hankar
et al., 2023). An efficient method is required to accurately evaluate the value of used cars by considering various
features. While many websites provide this service, their prediction techniques may not be ideal.
Moreover, the effectiveness of predicting the actual market value of a used car may vary with the model
and system used. It is therefore crucial to know the actual market value of a used car when purchasing or
selling one. The price of a used car is generally lower than the original price of the car. However,
estimating the value of a used car is a tedious job, as it depends on multiple factors such as car mileage
(number of kilometers travelled), manufacturing year, engine size, transmission type, power of the car and
several other factors.
However, with the advent of modern technologies such as artificial intelligence, the retail value of an
automobile can be estimated by applying various machine learning (ML) algorithms to a predefined set of
characteristics. There is no standard formula for estimating the selling price of used cars, since different websites
employ different algorithms to do so. By training statistical models to forecast prices, one can obtain an
approximate estimate of the price without actually entering the vehicle specifications into such a website.
The primary purpose of this research is to employ different ML prediction models to estimate the resale
value of a used car and compare their performance metrics. This can yield
substantial time and effort savings for both sellers and buyers interested in second-hand vehicles. Furthermore,
the proposed model can also predict the variation in used car prices for different body types with
respect to their manufacturing year. In addition, car manufacturers such as Mercedes-Benz, Toyota, and
Honda can determine which models should be produced in greater quantities if they wish to remain
competitive in the used car market.

2. Related Works

Pudaruth (2014) predicted the price of used cars using four machine learning algorithms, namely
Multiple Linear Regression Analysis, Naïve Bayes, Decision Trees and K-Nearest Neighbours. Pal et al. (2019)
proposed a methodology for car price prediction using Random Forest and concluded that it achieved good
accuracy in comparison with previous works. Shanti et al. (2021) proposed a machine learning-powered
app for predicting the prices of used cars, evaluating four models: Random Forest, Neural
Network, Gradient Boosting and Support Vector Regressor.
Venkatasubbu and Ganesh (2019) predicted used car prices using supervised learning
techniques. Using Lasso Regression, Regression Trees and Multiple Regression, they developed a statistical model
that predicts the price of a used car from a given set of features and previous consumer data.
Amik et al. (2021) applied machine learning techniques to predict the prices of
pre-owned cars in Bangladesh and concluded that XGBoost predicts the
resale prices of used cars with higher accuracy.
AlShared (2021) studied used car price prediction and valuation using data mining techniques,
focusing on the prices of used cars in Dubai. In that work, Random Forest achieved an accuracy of
95%, the highest among the models considered. Arefin (2021) studied second-hand price prediction for Tesla
vehicles, describing how machine learning techniques such as SVM and Random Forest, as well as deep learning
techniques, were applied to predict the price of a Tesla vehicle.

Salim and Abu (2020) developed an S-curve model for the maximum predictive pricing of used cars.
An S-shaped membership function was used as the base function to formulate the maximum-price equation of the
new S-curve model. Farrell (1954) analysed the demand for motor cars in the United States.
Monburinon et al. (2018) predicted used car prices using regression models. They conducted a comparative
study of the regression performance of supervised machine learning models, in which
Multiple Linear Regression, Random Forest Regression and Gradient Boosted Regression Trees were used to
build the used car price model. The results were compared using Mean Absolute Error (MAE).
Sun et al. (2017) proposed a price evaluation model for a second-hand car system based on
BP neural network theory. Their online second-hand car price evaluation model
helps to improve the speed and accuracy of evaluation.

3. Proposed Methodology

In the current research problem, a prediction model is constructed by implementing various machine learning
algorithms for predicting the prices of pre-owned cars by considering different parameters using regression
analysis. The architecture of the proposed system is depicted in Figure 1 below.

Figure 1. Control Flow Graph of the Proposed System

The proposed methodology is defined as follows:
- Data acquisition: Data on different cars are first collected, comprising the features and the target,
with price as the target variable.
- Data cleaning: Data cleaning comprises identifying and removing null values, filling in missing values
and removing outliers.
- Preprocessing: Preprocessing is performed through normalization or standardization.
- Exploratory Data Analysis (EDA): Exploratory Data Analysis involves conducting initial investigations on
data to identify patterns, detect anomalies, test hypotheses, and verify assumptions through the use of
summary statistics and graphical representations.
- Dividing into training and testing sets: The dataset obtained after preprocessing is split into
training and testing datasets.
- Model training: After the split, the models are trained on the training set using
different machine learning regression algorithms.
- Making predictions on the testing dataset: The trained models predict prices for the testing dataset, and
the predicted values are compared with the actual values to assess how well the price
can be predicted.
In the next section, each of these steps is illustrated with the results obtained.

4. Modeling and Result Analysis

4.1 Data Acquisition

The dataset employed in the current study was downloaded from Kaggle and comprises 426880 scraped
records of used cars. It is described as the world's largest collection of used vehicles for sale in the United States.

4.2 Data Cleaning

Data cleaning begins with removing irrelevant features from the dataset, such as 'URL',
'region_url', 'vin' and 'image_url'. Next, the null values are identified for
each feature, the missing values are filled in using appropriate methods and, finally,
the outliers are removed from the data. Figure 2 displays the missing values before the data cleaning process and Figure
3 depicts the corresponding null values in grey colour.

Figure 2. Missing values before data cleaning process
Figure 3. Distribution of null values in grey colour

To fill in the missing values in the data, the IterativeImputer technique is employed with several candidate
estimators, and the MSE of each is computed using cross_val_score. Mean Squared
Error (MSE) is the average squared deviation between the true values and the predicted values.
Typically, MSE values are computed for simple imputations based on
central tendency measures such as the mean and median, alongside those of the iterative imputation estimators. The
imputation estimators employed in the current study are BayesianRidge, DecisionTreeRegressor,
ExtraTreesRegressor and KNeighborsRegressor. Figure 4 displays the
MSE obtained with the 4 different imputation methods.
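The comparison described above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the data, the 15% missing rate, the downstream BayesianRidge regressor, and settings such as max_iter=5 and cv=3 are all assumptions made for the example.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the numerical car features (assumption, not the Kaggle data)
rng = np.random.default_rng(0)
X_full = rng.normal(size=(200, 4))
y = X_full @ np.array([1.5, -2.0, 0.5, 1.0]) + rng.normal(scale=0.1, size=200)

# Knock out roughly 15% of the entries to simulate missing values
X_missing = X_full.copy()
X_missing[rng.random(X_full.shape) < 0.15] = np.nan

# The four imputation estimators named in the text
estimators = {
    "BayesianRidge": BayesianRidge(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "ExtraTreesRegressor": ExtraTreesRegressor(n_estimators=10, random_state=0),
    "KNeighborsRegressor": KNeighborsRegressor(n_neighbors=5),
}

imputation_mse = {}
for name, estimator in estimators.items():
    # Impute, then fit a simple downstream regressor; score the whole pipeline
    pipeline = make_pipeline(
        IterativeImputer(estimator=estimator, max_iter=5, random_state=0),
        BayesianRidge(),
    )
    scores = cross_val_score(
        pipeline, X_missing, y, scoring="neg_mean_squared_error", cv=3
    )
    imputation_mse[name] = -scores.mean()  # flip sign: sklearn reports negative MSE

for name, mse in imputation_mse.items():
    print(f"{name}: MSE = {mse:.4f}")
```

The estimator with the lowest cross-validated MSE would be the one chosen for imputation; which estimator wins on this toy data says nothing about the ranking reported in the paper.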

Figure 4. MSE with 4 different Imputation Methods

Figure 4 suggests that the ExtraTreesRegressor estimator is preferable as the imputation
strategy for the missing values. The second step is to handle the missing values of the categorical variables.
One-hot encoding is employed for this, which converts each distinct value of a variable into its
own binary variable. Figures 5 and 6 depict the dataset after the missing values have been filled in at the end of the data
cleaning process.

Figure 5. Missing values after data cleaning process
Figure 6. No null values in the dataset

The third and final step is to remove the outliers from the data using the interquartile range (IQR)
method. Figures 7 and 8 depict the box plots of Price and Odometer, which reveal the outliers in each.
The price outliers in Figure 7 are those with a logarithmic value less than 6.55 or greater than 11.55. Since
no clear conclusion can be drawn from Figure 8, the IQR is computed to identify the
outliers, specifically odometer values that fall below 6.55 or above 11.55.
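A minimal sketch of IQR-based outlier filtering on log-transformed prices is given below. The sample prices are invented, and the 1.5 multiplier is the conventional Tukey fence, assumed here because the text does not state the multiplier used.

```python
import numpy as np

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical listing prices, including one implausibly low and one implausibly high
prices = np.array([500.0, 4000.0, 8000.0, 12000.0, 15000.0, 22000.0, 30000.0, 5_000_000.0])
log_price = np.log(prices)

low, high = iqr_bounds(log_price)
inlier_mask = (log_price >= low) & (log_price <= high)
kept = log_price[inlier_mask]

print(f"fences on log(price): [{low:.2f}, {high:.2f}]")
print(f"kept {kept.size} of {log_price.size} listings")
```

Working on the logarithm keeps a few very expensive listings from stretching the fences, which is consistent with the log-scale thresholds quoted in the text.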

Figure 7. Box Plot of Price with outliers
Figure 8. Box Plot of Odometer with outliers

Figure 9 displays the box plot and histogram for Year. From Figure 9, it can be observed that
years earlier than 1995 or later than 2020 are outliers.

Figure 9. Box Plot and Histogram Plot of Year

Finally, after processing the dataset, its shape changed from (435849, 25) to (374136, 18), indicating that a
total of 61713 rows and 7 columns were eliminated.

4.3 Data Pre-processing

Data pre-processing is achieved through the implementation of Label Encoder and Normalization techniques.

- Label Encoder: The dataset comprises 12 categorical features and 4 numerical
features (excluding the price column). Most machine learning models require numerical inputs, so
these categorical variables must be converted into numerical ones. The sklearn library's LabelEncoder is
used for this purpose.
- Normalization: The features are not normally distributed and each has a distinct range. If the data are
not normalized, a machine learning model may effectively ignore features with small values, as their impact is
negligible compared with that of features with larger values. To overcome this issue, the sklearn library's MinMaxScaler is
used to normalize the data.
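The two pre-processing steps can be illustrated with a short sketch. The column names and values below are made up for the example; only LabelEncoder and MinMaxScaler come from the text.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Hypothetical categorical and numerical columns
fuel = np.array(["gas", "diesel", "electric", "gas", "hybrid"])
odometer = np.array([[120000.0], [45000.0], [8000.0], [200000.0], [60000.0]])

# LabelEncoder maps each category to an integer, in sorted order of the class names
encoder = LabelEncoder()
fuel_codes = encoder.fit_transform(fuel)

# MinMaxScaler rescales each column to [0, 1] via (x - min) / (max - min)
scaler = MinMaxScaler()
odometer_scaled = scaler.fit_transform(odometer)

print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print(fuel_codes)
print(odometer_scaled.ravel())
```

Note that label encoding assigns integers alphabetically, so the resulting order carries no meaning; tree-based models are largely insensitive to this, whereas linear models may treat the codes as an ordinal scale.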

4.4 Exploratory Data Analysis

Let us now explore the various Exploratory Data Analysis (EDA) visualizations in the current dataset. Figure 10
depicts the correlation plot among the various feature variables in the dataset.

Figure 10. Correlation Matrix Plots among the different variables

It can be noted that there is low correlation among the features in the data. Next, the pair plots
between the various variables are illustrated in Figure 11.

Figure 11. Pair-plots to find correlation

The pair plot does not provide any conclusive evidence, as there is no apparent correlation between the
variables. Figure 12 shows the distribution of price.

Figure 12. Graph showing distribution of Price

Based on the distribution plot (Distplot), it can be inferred that the price undergoes a rapid hike
in the beginning, but after a certain time, it begins to depreciate. Next, Figure 13 shows the bar plot of
price for each fuel type.

Figure 13. Bar Plots displaying the price of each fuel type

Upon analysis of the graph, it can be concluded that diesel cars cost more than electric cars,
while hybrid vehicles are the least expensive. Figure 14 depicts the variation of car price with fuel type,
with the car condition as the hue variable.

Figure 14. Bar Plots of fuel and price with hue condition

From this bar plot, it can be concluded that the condition of a car (the hue variable) also plays a significant role in
determining its price for a given fuel type. Figure 15 depicts the variation of car prices with year.

Figure 15. Graph displaying car price variation per year

The first plot in Figure 15 indicates that car prices have risen consistently every year since 1995,
while the second plot illustrates an increasing trend in the number of cars per year. However, it can be observed
that around 2012 the number of cars seems to plateau and remain
relatively constant.

Figure 16. Bar Plot displaying the price with respect to the car condition

From Figure 16, it can be deduced that the price of a car is influenced by its condition, as the price
varies with the car's size and condition. Figure 17 depicts the car price with respect to transmission
type.

Figure 17. Bar Plot displaying the price with respect to the car transmission type

Upon analysis, it is evident that car prices differ by transmission type. Buyers are
willing to pay more for cars with automatic transmission, while cars with manual transmission are priced lower.

4.5 Splitting the dataset into training and testing set

During this procedure, 90% of the data was allocated for the training dataset, while the remaining 10% was
designated as the testing dataset.
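A 90/10 split of this kind can be sketched with scikit-learn's train_test_split. The arrays are placeholders, and the random_state value is an assumption added only to make the example reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder feature matrix and target (100 rows, 2 features)
X = np.arange(200).reshape(100, 2)
y = np.arange(100)

# test_size=0.10 holds out 10% of the rows for testing; rows are shuffled by default
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42
)

print(X_train.shape, X_test.shape)
```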

4.6 Training with ML Models

This section applies a set of supervised machine learning algorithms to predict the price of pre-owned cars,
which is the target variable in the current study.
(a) Linear Regression: Linear regression is a statistical technique that models the relationship
between a single outcome variable, also known as the dependent variable, and one or more explanatory
variables, also known as independent variables. The relationships are modeled using linear predictor
functions whose unknown parameters are estimated from the data. These
models are referred to as linear models. The sign of each coefficient shows whether the
relationship between a predictor variable and the response variable is positive or negative.
- A positive coefficient implies that an increase in the predictor variable is associated with an increase in the
response variable.
- A negative coefficient implies that an increase in the predictor variable is associated with a decrease in the
response variable.
Figure 18 illustrates the performance of the Linear Regression algorithm and Figure 19 displays the most
important features according to the Linear Regression algorithm.

Figure 18. Graph displaying performance of LR
Figure 19. Feature importance using LR

Based on the graph, it can be inferred from the Linear Regression analysis that the year, cylinder,
transmission, fuel, and odometer variables are the most significant ones.
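The interpretation of coefficient signs can be illustrated with a small sketch. The data are synthetic and the relationship between year, odometer and price is invented for the example; it is not fitted to the paper's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
year = rng.uniform(1995, 2020, size=300)
odometer = rng.uniform(0, 250000, size=300)

# Invented ground truth: newer cars cost more, higher mileage costs less
price = (
    15000.0
    + 800.0 * (year - 1995)
    - 0.05 * odometer
    + rng.normal(scale=500.0, size=300)
)

X = np.column_stack([year, odometer])
model = LinearRegression().fit(X, price)
coef_year, coef_odometer = model.coef_

# Positive sign: price rises with year. Negative sign: price falls with mileage.
print(f"year coefficient:     {coef_year:+.2f}")
print(f"odometer coefficient: {coef_odometer:+.4f}")
```

Coefficient magnitudes are only comparable across features once the features share a common scale, which is one reason the normalization step precedes model fitting.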
(b) Ridge Regression: Ridge Regression is a method for analysing multiple regression data that suffer
from multicollinearity. Under multicollinearity, the least squares estimates remain unbiased, but their
variances are large, so the estimates may be far from the true values.
To determine the optimal alpha value for Ridge Regression, the AlphaSelection tool from the
yellowbrick library was used.
Figure 20 displays the Ridge Regression error as a function of alpha, while Figure 21 represents the feature importance
corresponding to the Ridge model.

Figure 20. Graph displaying best value of Alpha
Figure 21. Feature importance using RR

According to Figure 20, the optimal alpha value for the dataset is 20.336. It should be noted
that the alpha value is not fixed and can change from run to run. The Ridge Regressor method is applied based upon this
alpha value. The feature importance plot also suggests that year, cylinder, transmission, fuel and odometer are the most
prominent feature variables.
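Selecting alpha by cross-validation can be sketched as below. Note that this uses scikit-learn's RidgeCV as a stand-in rather than the yellowbrick AlphaSelection visualizer named in the text; the synthetic data, the deliberately collinear column and the alpha grid are assumptions for the example.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 5))
# Make the last column a near-copy of the fourth to induce multicollinearity
X[:, 4] = X[:, 3] + rng.normal(scale=0.01, size=150)
y = X @ np.array([2.0, -1.0, 0.5, 1.0, 1.0]) + rng.normal(scale=0.5, size=150)

# Candidate regularization strengths on a log scale (assumed grid)
alphas = np.logspace(-3, 3, 50)

# RidgeCV evaluates every alpha by cross-validation and keeps the best one
ridge = RidgeCV(alphas=alphas).fit(X, y)
best_alpha = ridge.alpha_

print(f"selected alpha: {best_alpha:.4f}")
print("coefficients:", np.round(ridge.coef_, 3))
```

Because the selected alpha depends on the data and the candidate grid, it can differ between runs on resampled data, in line with the remark in the text that the value is not fixed.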
(c) Lasso Regression: Lasso Regression is a form of linear regression that applies shrinkage, in which
the coefficient estimates are pulled towards zero. The Lasso approach thereby promotes
simple, sparse models. The objective of Lasso Regression is to
identify the subset of predictors that yields the lowest prediction error for a quantitative response
variable. To achieve this, the Lasso imposes a constraint on the model parameters that causes the regression
coefficients of some variables to shrink to exactly zero.
Figure 22 depicts the most prominent features according to the Lasso model.
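The way the Lasso drives some coefficients to zero can be seen in a small sketch. The data are synthetic, with only two informative features, and alpha=0.5 is an arbitrary value chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
# Only the first two features influence the target; the other four are noise
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)

# Count coefficients that the L1 penalty has set exactly to zero
n_zero = int(np.sum(lasso.coef_ == 0.0))

print("coefficients:", np.round(lasso.coef_, 3))
print(f"{n_zero} of {lasso.coef_.size} coefficients are exactly zero")
```

The surviving coefficients are also smaller in magnitude than the values used to generate the data, which is the shrinkage effect described above.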

Figure 22. Lasso Regression feature importance

(d) K-Nearest Neighbor: The target is predicted by local interpolation of the targets of the nearest neighbours in the
training set. This approach, known as k-NN, is a form of lazy or instance-based learning:
the function is only approximated locally, and all computation is deferred until
prediction time.
Figure 23 represents the error plot of the k-NN model. The error is lowest at k=5, that is, with
n_neighbors=5 and metric='euclidean'. k-NN thus performs best at this setting, since
a lower error corresponds to a higher accuracy.
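An error curve over k = 1 to 9 can be produced along the following lines. The data are synthetic, RMSE is assumed as the error measure, and the best k found here is specific to this toy data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(4)
X = rng.uniform(0, 1, size=(400, 3))
y = np.sin(4 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.2, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0
)

# Fit one model per k and record its test RMSE
rmse_by_k = {}
for k in range(1, 10):
    knn = KNeighborsRegressor(n_neighbors=k, metric="euclidean")
    knn.fit(X_train, y_train)
    mse = mean_squared_error(y_test, knn.predict(X_test))
    rmse_by_k[k] = float(np.sqrt(mse))

best_k = min(rmse_by_k, key=rmse_by_k.get)
print({k: round(v, 3) for k, v in rmse_by_k.items()})
print(f"lowest error at k = {best_k}")
```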

Figure 23. Error Plot for KNN for k range 1-9

(e) Random Forest: Random Forest is an ensemble technique that combines multiple decision trees. To
generate a diverse set of trees with uncorrelated predictions, the algorithm employs bagging and feature
randomness during tree construction. By aggregating the predictions of all the trees, the Random Forest
algorithm aims to produce more accurate results than any single decision tree. Our model builds 180
decision trees using a maximum of 50% of the available features.
Figure 24 and Figure 25 show the performance and feature importance of the Random Forest model,
respectively.

Figure 24. Performance Graph of RF

Figure 25. Feature importance using RF

The bar chart shows that the year of the car is the most significant feature, followed by the
odometer variable and then the other variables. The Random Forest algorithm displayed improved performance,
with an increase in accuracy of approximately 10%. Since the algorithm uses bagging to build
each tree, the next step is to apply the Bagging Regressor.
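A forest of 180 trees restricted to half of the features can be sketched as follows. Reading the description above as n_estimators=180 and max_features=0.5 is an interpretation, not a confirmed setting, and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 4))
# Feature 0 dominates the target, feature 1 contributes a little, the rest are noise
y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.3, size=300)

# 180 trees; each split considers at most 50% of the features (assumed reading)
forest = RandomForestRegressor(
    n_estimators=180, max_features=0.5, random_state=0
).fit(X, y)

# Impurity-based importances, normalized to sum to 1
importances = forest.feature_importances_
for index, value in enumerate(importances):
    print(f"feature {index}: {value:.3f}")
```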
(f) Bagging Regressor: A Bagging Regressor is an ensemble meta-estimator that fits individual
regression models on random subsets of the original dataset and then combines their predictions
into a final prediction, either by voting or by averaging the individual
predictions. The purpose of this meta-estimator is to reduce the variance of a black-box estimator,
such as a decision tree, by introducing randomness into its construction and then building an ensemble from
it.
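The averaging step can be made explicit with a short sketch. The data and the choice of 25 decision trees are assumptions for illustration; the check at the end confirms that, for regression, the ensemble prediction is the mean of the individual tree predictions.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(6)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

# Each tree is trained on a bootstrap sample of the rows
bagging = BaggingRegressor(
    DecisionTreeRegressor(), n_estimators=25, random_state=0
).fit(X, y)

X_new = np.array([[0.5], [1.5]])

# Collect every tree's prediction and average them by hand
per_tree = np.array([tree.predict(X_new) for tree in bagging.estimators_])
averaged = per_tree.mean(axis=0)

print("manual average:   ", np.round(averaged, 4))
print("ensemble predict: ", np.round(bagging.predict(X_new), 4))
print("spread across trees (std):", np.round(per_tree.std(axis=0), 4))
```

The spread across individual trees is the variability that averaging is meant to reduce.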
(g) AdaBoost Regressor: AdaBoost is a machine learning technique that can improve the performance of
other learning algorithms by combining several "weak learners" into a single "strong learner".
Figure 26 shows the feature importance of the AdaBoost model. A quick
look at the bar chart reveals that year is the most influential factor, followed by the total mileage driven,
and then model, and so on.

Figure 26. Feature importance of AdaBoost classifier

(h) XGBoost Regressor: XGBoost is an ensemble learning method based on gradient-boosted decision
trees. Its key advantages are fast, efficient training through parallel and distributed
computing and effective use of memory. This scalability makes it
an attractive option for many applications. Figure 27 illustrates the feature importance of the XGBoost
model.

Figure 27. Feature importance of XGBoost classifier

The bar plot is a straightforward representation that ranks the car features in order of their importance,
showing which ones carry more weight. XGBoost analysis indicates that the Odometer is a significant feature,
while in earlier models, the year was identified as an essential factor.

4.7 Comparison of the Performance of ML Models

Assessing the accuracy of a machine learning model is a crucial step in its development, as it determines
how effective the model is at making predictions. The most commonly used metrics for evaluating
performance and prediction error in regression analysis are MSLE, RMSLE, MAE, MSE, RMSE, and the R2 score, which
are defined as follows:
- MSLE: Mean Squared Logarithmic Error (MSLE) is the mean of the squared differences between the logarithms
of the predicted and actual values. Because it measures relative rather than absolute error, it dampens the
penalty for large discrepancies when the values themselves are large. This makes it a suitable choice for
evaluating models that make predictions directly in unscaled quantities.
- RMSLE: Root Mean Squared Logarithmic Error (RMSLE) is the square root of the MSLE, that is, the
square root of the mean of the squared differences between the natural logarithms of the predicted and
actual values. To avoid taking the natural
logarithm of zero, 1 is added to both the actual and predicted values before the
transformation is performed.
- MAE: Mean Absolute Error (MAE) is the average absolute difference between the
predicted values and the actual values in a dataset. It is obtained by taking the absolute difference
between each predicted value and its corresponding actual value and averaging these differences.
- MSE: Mean Squared Error (MSE) is the average squared difference between the predicted
values and the actual values in a dataset. It is computed by taking the squared difference between each
predicted value and its corresponding actual value and averaging these squared differences.
- RMSE: Root Mean Squared Error (RMSE) is the square root of the MSE, that is, the square root of the
average squared difference between the
predicted values and the actual values in a dataset.
- R2 Score: The R2 score, also known as the coefficient of determination, is used to assess the
goodness of fit of a regression model. It measures how well the model
replicates the observed outcomes, as the proportion of the total variation in the outcomes that is
explained by the model.
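The definitions above translate directly into a few lines of NumPy. The helper name regression_metrics and the three sample prices are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute the metrics defined above for non-negative targets."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    error = y_pred - y_true
    # log1p(x) = log(1 + x), which avoids taking the logarithm of zero
    log_error = np.log1p(y_pred) - np.log1p(y_true)

    mse = np.mean(error ** 2)
    msle = np.mean(log_error ** 2)
    ss_res = np.sum(error ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)      # total sum of squares

    return {
        "MAE": float(np.mean(np.abs(error))),
        "MSE": float(mse),
        "RMSE": float(np.sqrt(mse)),
        "MSLE": float(msle),
        "RMSLE": float(np.sqrt(msle)),
        "R2": float(1.0 - ss_res / ss_tot),
    }

# Hypothetical actual and predicted prices
metrics = regression_metrics([10000, 15000, 20000], [11000, 14000, 20000])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these sample values the absolute errors are 1000, 1000 and 0, so the MAE is 2000/3 and the R2 score is 1 - 2,000,000/50,000,000 = 0.96.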
Let us now take a closer look at the performance of the various machine learning models
applied to the used car dataset. Figure 28 presents the accuracy parameters of the ML models
implemented in the current study and Figure 29 depicts the highest
accuracy achieved by each of them.

Figure 28. Accuracy Parameters of ML models

Figure 29. Performance of various ML models

Based on Figure 29, we can infer that the XGBoost regressor outperforms
the other models, with an accuracy of 86.87%.

5. Conclusion

The objective of this study was to predict the price of used cars using 25 predictors. To achieve the highest possible
accuracy and minimize errors, various machine learning models were evaluated.
First, the dataset underwent data cleaning to eliminate null values and outliers. Subsequently, machine
learning models were employed to predict car prices. Using data visualization tools, a
thorough examination of the features was also conducted to investigate the relationships between them. Based on the
results obtained, it can be inferred that XGBoost is the most suitable model for forecasting used car prices.
XGBoost, employed as a regression model, achieved the best MSLE and RMSE values.
As future work, deep learning algorithms could be applied to the same dataset to obtain more
accurate results with higher efficiency. Other datasets could also be used for a comparative study.

References
AlShared, A. (2021). Used Cars Price Prediction and Valuation using Data Mining Techniques. Thesis. Rochester:
Rochester Institute of Technology.
Amik, F. R., Lanard, A., Ismat, A., & Momen, S. (2021). Application of Machine Learning Techniques to Predict the
Price of Pre-Owned Cars in Bangladesh. Information, 12(12), 514.
Arefin, S. E. (2021). Second Hand Price Prediction for Tesla Vehicles. arXiv:2101.03788
Arora, P., Gupta, H., & Singh, A. (2022). Forecasting resale value of the car: Evaluating the proficiency under the
impact of machine learning model. Materials Today: Proceedings, 69(2), 441-445.
Farrell, M. J. (1954). The demand for motor-cars in the United States. Journal of the Royal Statistical Society.
Series A (General), 117(2), 171-201.
Hankar, M., Birjali, M., & Beni-Hssane, A. (2023). Machine Learning Modeling to Estimate Used Car Prices. In
Innovations in Smart Cities Applications Volume 6: The Proceedings of the 7th International Conference on
Smart City Applications (pp. 533-542). Springer.
Monburinon, N., Chertchom, P., Kaewkiriya, T., Rungpheung, S., Buya, S., & Boonpou, P. (2018). Prediction of
prices for used car by using regression models. The Proceedings of the 5th International Conference on
Business and Industrial Research (ICBIR) (pp. 115-119). Bangkok: IEEE.
Pal, N., Arora, P., Kohli, P., Sundararaman, D., & Palakurthy, S. S. (2019). How Much Is My Car Worth? A
Methodology for Predicting Used Cars’ Prices Using Random Forest. In: Arai, K., Kapoor, S., Bhatia, R. (eds)
Advances in Information and Communication Networks. FICC 2018. Advances in Intelligent Systems and
Computing, vol 886 (pp. 413-422). Cham: Springer.
Pudaruth, S. (2014). Predicting the price of used cars using machine learning techniques. International Journal of
Information and Computer Technology, 4(7), 753-764.
Salim, F., & Abu, N. A. (2020). An S-curve model on the maximum predictive pricing of used cars. European
Journal of Molecular and Clinical Medicine, 7(3), 907-921.
Shanti, N., Assi, A., Shakhshir, H., & Salman, A. (2021). Machine Learning-Powered Mobile App for Predicting
Used Car Prices. The Proceedings of the 3rd International Conference on Big-data Service and Intelligent
Computation (BDSIC 21) (pp. 52-60). New York: Association for Computing Machinery.
Sun, N., Bai, H., Geng, Y., & Shi, H. (2017). Price evaluation model in second-hand car system based on BP neural
network theory. The Proceedings of the 18th IEEE/ACIS International Conference on Software Engineering,
Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) (pp. 431-436). Kanazawa: IEEE.
Venkatasubbu, P., & Ganesh, M. (2019). Used Cars Price Prediction using Supervised Learning Techniques.
International Journal of Engineering and Advanced Technology, 9(1S3), 216-223.



