Abstract
This project provides practical experience in working with large databases using
Python and in understanding the challenges involved in data retrieval and analysis. Its
purpose is to teach us how categorizing cars by their specifications, such as price and
other relevant attributes, improves our understanding of how data can be organized and
used for decision-making in the automotive domain or in AI algorithms. The project uses
Python code provided by the professor
(“https://thecleverprogrammer.com/2021/08/04/car-price-prediction-with-machine-learning/”),
which is designed to search and retrieve data from a database. We replace the database
link in the code with a new database to fetch updated results. We run the code on the
Colab website, input different databases, compare the results, and analyze the process.
We begin by understanding the structure of the car database and identifying the
specification elements relevant for categorization, such as price, make, model, year,
mileage, and horsepower. We then review the professor's code and modify it to include
the new database link, so that it fetches data from the updated database. Once
configured, the code is executed on different databases containing car data, and the
retrieved data is processed and analyzed to generate results. Finally, we compare the
results obtained from the different databases, assess the accuracy and efficiency of
the code, and identify any discrepancies or patterns in the retrieved data.
The project is based on this link:
“https://thecleverprogrammer.com/2021/08/04/car-price-prediction-with-machine-learning/”
The project was made on:
https://colab.research.google.com
Table 1. The first five rows of the primary Audi database, an Excel spreadsheet.
Car_ID  Symbolling  Car Name                  Fuel Type  Aspiration  Doors Number  CarBody
1       3           alfa-romero giulia        gas        std         two           convertible
2       3           alfa-romero stelvio       gas        std         two           convertible
3       1           alfa-romero Quadrifoglio  gas        std         two           hatchback
4       2           audi 100 ls               gas        std         four          sedan
5       2           audi 100ls                gas        std         four          sedan
continuing… ↓
Algorithm
1. First, we find a database for any market that contains enough types of data, such as
year, kilometers driven, model, and engine size.
2. Then we edit the database so that it works with the program.
3. Next, we edit the Python file to account for the differences between databases. The
Python program always expects a specific database layout, so we have to change the
program every time we input a different database.
4. Once the database and the Python program are linked, we run the program and inspect
the details and information it prints.
5. The program performs calculations such as the average price of the cars, and it
produces a graph, a colored matrix, and so on.
6. After these calculations, the machine learning model is able to make a prediction.
7. This prediction can be shown on websites such as second-hand car marketplaces.
8. When users find the car they want to buy, they click on the listing, and our program
takes its input from that website's database. Every entry for the specific car the user
chose is passed to the machine learning model so that it can predict the price. We
recommend inputting at least 10,000 entries for a more accurate prediction.
9. After choosing the car, the user should see the predicted price once the machine
learning model has read the website's database and made the calculations needed for
the prediction.
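The steps above can be sketched as a short script. This is only a minimal sketch under stated assumptions: the small in-memory table stands in for a real marketplace database, the helper name `train_price_model` is hypothetical, and the column names are chosen to mirror the columns used later in this report.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor


def train_price_model(cars, feature_columns, target_column="price"):
    """Fit a decision tree on the chosen columns (hypothetical helper)."""
    x = cars[feature_columns].to_numpy()
    y = cars[target_column].to_numpy()
    model = DecisionTreeRegressor(random_state=0)
    model.fit(x, y)
    return model


# Steps 1-2: made-up rows standing in for a prepared marketplace database.
cars = pd.DataFrame({
    "enginesize": [1.0, 1.4, 1.4, 2.0, 2.0, 3.0],
    "highwaympg": [49.6, 55.4, 55.4, 64.2, 67.3, 40.0],
    "price": [17300, 12500, 11000, 16500, 16800, 30000],
})

# Steps 3-6: link the data to the program and train the model.
model = train_price_model(cars, ["enginesize", "highwaympg"])

# Steps 8-9: predict the price of the car the user selected (assumed values).
selected_car = np.array([[2.0, 64.2]])
predicted_price = model.predict(selected_car)[0]
print(f"Predicted price: {predicted_price:.0f}")
```

With only six rows the prediction is not meaningful; the sketch shows the order of the steps, not a usable model.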
Flowchart
Figure 1 – Flowchart: When the program starts, we first collect the data, then import
the necessary libraries provided by the code runner, and then load the datasets. Next
comes the exploratory data analysis, followed by the data preprocessing. The models are
then built and trained, after which the algorithm selects the best model and makes
predictions on the chosen dataset, and the cycle ends.
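The "select the best model" step of the flowchart could look like the sketch below. It is an illustration under assumptions, not the code used in this project: the data is synthetic, and `LinearRegression` is added only as a second candidate to compare against the decision tree that the project actually uses.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption): price grows with engine size, plus noise.
rng = np.random.default_rng(0)
engine_size = rng.uniform(1.0, 4.0, size=200)
price = 8000 * engine_size + rng.normal(0, 500, size=200)
x = engine_size.reshape(-1, 1)

xtrain, xtest, ytrain, ytest = train_test_split(
    x, price, test_size=0.2, random_state=0
)

# Train every candidate and record its error on the held-out rows.
candidates = {
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "linear_regression": LinearRegression(),
}
errors = {}
for name, candidate in candidates.items():
    candidate.fit(xtrain, ytrain)
    errors[name] = mean_absolute_error(ytest, candidate.predict(xtest))

# The best model is the one with the lowest mean absolute error.
best_name = min(errors, key=errors.get)
print(errors)
print("Best model:", best_name)
```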
Experiment results (entire code with all variable names changed, all outputs, and all
figure outputs with explanations)
The first database's whole code, with the variable names modified:
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Load the Audi marketplace data and inspect it
DatabaseAudiMarketplace = pd.read_csv("/content/DatabaseAudiMarketplace.csv")
DatabaseAudiMarketplace.head()
DatabaseAudiMarketplace.isnull().sum()
DatabaseAudiMarketplace.info()
print(DatabaseAudiMarketplace.describe())

# Plot the price distribution
# (distplot is deprecated; seaborn recommends histplot or displot instead)
sns.set_style("whitegrid")
plt.figure(figsize=(15, 10))
sns.distplot(DatabaseAudiMarketplace.price)
plt.show()

# Correlations between the numeric columns, as a table and as a heatmap
# (numeric_only=True skips text columns, which pandas 2.x cannot correlate)
correlations = DatabaseAudiMarketplace.corr(numeric_only=True)
print(correlations)
plt.figure(figsize=(20, 15))
sns.heatmap(correlations, cmap="coolwarm", annot=True)
plt.show()

# Keep only the feature columns and the target column
predict = "price"
DatabaseAudiMarketplace = DatabaseAudiMarketplace[["enginesize", "highwaympg", "price"]]
x = np.array(DatabaseAudiMarketplace.drop(columns=[predict]))
y = np.array(DatabaseAudiMarketplace[predict])

# Train a decision tree on 80% of the rows and predict the remaining 20%
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)
model = DecisionTreeRegressor()
model.fit(xtrain, ytrain)
predictions = model.predict(xtest)

# Note: this scores the predictions against themselves, so it always returns 1.0;
# model.score(xtest, ytest) would measure accuracy on the held-out prices
model.score(xtest, predictions)
print(DatabaseAudiMarketplace)
The entire code of database 1 explained in parts (First a part of the code is showed and
then the output of the taken code is shown and then explained):
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
DatabaseAudiMarketplace = pd.read_csv("/content/DatabaseAudiMarketplace.csv")
DatabaseAudiMarketplace.head()
- As these lines of code show, the project first imports the libraries it needs in
order to function. Following the imports, the first five rows of the Audi database
sheet are output.
id model year price transmission mileage fueltype tax highwaympg enginesize
0 A1 2017 12500 Manual 15735 Petrol 150 55.4 1.4
1 A6 2016 16500 Automatic 36203 Diesel 20 64.2 2.0
2 A1 2016 11000 Manual 29946 Petrol 30 55.4 1.4
3 A4 2017 16800 Automatic 25952 Diesel 145 67.3 2.0
4 A3 2019 17300 Manual 1998 Petrol 145 49.6 1.0
Table 3. As the table above shows, the Python execution printed the first five rows of
the DatabaseAudiMarketplace.csv database that we prepared in Excel.
DatabaseAudiMarketplace.isnull().sum()
model 0
year 0
price 0
transmission 0
mileage 0
fueltype 0
tax 0
highwaympg 0
enginesize 0
dtype: int64
Table 4. This table shows the output of isnull().sum(), a pandas operation that checks
each cell for an unfilled or null value and counts the matches per column. An empty
cell is marked True, which counts as 1, and a filled cell is marked False, which counts
as 0. Every column here totals 0, so the database we entered has no missing values.
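To show what a non-zero count looks like, here is a small sketch with a made-up table in which two cells are deliberately left empty. The values are illustrative only and do not come from the Audi database.

```python
import numpy as np
import pandas as pd

# Made-up rows with one missing model name and one missing price.
cars = pd.DataFrame({
    "model": ["A1", "A6", None],
    "price": [12500, np.nan, 11000],
    "mileage": [15735, 36203, 29946],
})

# isnull() marks each empty cell as True; sum() counts the True cells per column.
missing_per_column = cars.isnull().sum()
print(missing_per_column)
```

A column with a count above 0 would need to be cleaned or filled before training.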
DatabaseAudiMarketplace.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10668 entries, 0 to 10667
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 model 10668 non-null object
1 year 10668 non-null int64
2 price 10668 non-null int64
3 transmission 10668 non-null object
4 mileage 10668 non-null int64
5 fueltype 10668 non-null object
6 tax 10668 non-null int64
7 highwaympg 10668 non-null float64
8 enginesize 10668 non-null float64
dtypes: float64(2), int64(4), object(3)
memory usage: 750.2+ KB
Table 5. This piece of code displays the technical data about the database, such as the
memory usage, the number of entries, and the type of each column. As the table shows,
the data occupies around 750.2 KB of memory and contains three different data types:
float, integer, and object.
print(DatabaseAudiMarketplace.describe())
enginesize
count 10668.000000
mean 1.930709
std 0.602957
min 0.000000
25% 1.500000
50% 2.000000
75% 2.000000
max 6.300000
<ipython-input-8-b24cc0cfc4f5>:3: UserWarning:
`distplot` is a deprecated function and will be removed in seaborn v0.14.0.
Please adapt your code to use either `displot` (a figure-level function with
similar flexibility) or `histplot` (an axes-level function for histograms).
For a guide to updating your code to use the new functions, please see
https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
Figure 2. This code generates a graph showing the distribution of automobile prices.
Because prices differ from one Excel sheet to another, the graph changes depending on the
database that we import, that is, on the market that the database represents. In this figure,
the majority of the automobiles cost approximately $20,000.
sns.distplot(DatabaseAudiMarketplace.price)
print(DatabaseAudiMarketplace.corr())
plt.figure(figsize=(20, 15))
correlations = DatabaseAudiMarketplace.corr()
sns.heatmap(correlations, cmap="coolwarm", annot=True)
plt.show()
Table 7.
- The result, as can be observed, contained a warning. It is a pandas warning about behavior
that may change in later versions. The table containing the correlation values is located
below the warning. The graphic that follows the table displays the same values as a chart
with various colors.
- This output includes a pandas warning and the single value 1.0 (i). The value is the score
returned by model.score. Because the code scores the model's predictions against
themselves, it is always 1.0, so it shows that the code ran to completion rather than how
accurate the predictions are.
print(DatabaseAudiMarketplace)
Table 8. This output displays the changed version of the database that we entered.
Reducing the database was the only way to get it working with the program, so there are
now only 3 columns. We used only three kinds of data for our application since we
judged that the other statistics would not affect the car's worth. Inputting data such
as a car's ID, which does not alter the car's worth in any way, can cause problems for
the machine learning model and lead it to make mistakes in some circumstances.
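The column selection described here can be sketched as follows. The rows are made up, and the `car_ID` column is included only to show an identifier being left out.

```python
import pandas as pd

# Made-up rows; car_ID and model are columns we do not want the model to see.
cars = pd.DataFrame({
    "car_ID": [1, 2, 3],
    "model": ["A1", "A6", "A4"],
    "enginesize": [1.4, 2.0, 2.0],
    "highwaympg": [55.4, 64.2, 67.3],
    "price": [12500, 16500, 16800],
})

predict = "price"

# Keep only the two feature columns and the target column.
selected = cars[["enginesize", "highwaympg", predict]]

# x holds the features, y holds the prices the model should learn.
x = selected.drop(columns=[predict]).to_numpy()
y = selected[predict].to_numpy()
print(x.shape, y.shape)
```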
The second database's whole source code, with modified variable names:
import seaborn as sns
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Load the car price data and inspect it
# (the repository is named Website-data; renaming the variable must not change the URL)
CarPriceDatabase = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/CarPrice.csv")
CarPriceDatabase.head()
CarPriceDatabase.isnull().sum()
CarPriceDatabase.info()
print(CarPriceDatabase.describe())
CarPriceDatabase.CarName.unique()

# Plot the price distribution
# (distplot is deprecated; seaborn recommends histplot or displot instead)
sns.set_style("whitegrid")
plt.figure(figsize=(15, 10))
sns.distplot(CarPriceDatabase.price)
plt.show()

# Correlations between the numeric columns, as a table and as a heatmap
# (numeric_only=True skips text columns, which pandas 2.x cannot correlate)
correlations = CarPriceDatabase.corr(numeric_only=True)
print(correlations)
plt.figure(figsize=(20, 15))
sns.heatmap(correlations, cmap="coolwarm", annot=True)
plt.show()

# Keep only the numeric feature columns and the target column
predict = "price"
CarPriceDatabase = CarPriceDatabase[["symboling", "wheelbase", "carlength",
                                     "carwidth", "carheight", "curbweight",
                                     "enginesize", "boreratio", "stroke",
                                     "compressionratio", "horsepower", "peakrpm",
                                     "citympg", "highwaympg", "price"]]
x = np.array(CarPriceDatabase.drop(columns=[predict]))
y = np.array(CarPriceDatabase[predict])

# Train a decision tree on 80% of the rows and predict the remaining 20%
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)
model = DecisionTreeRegressor()
model.fit(xtrain, ytrain)
predictions = model.predict(xtest)

# Note: this scores the predictions against themselves, so it always returns 1.0;
# model.score(xtest, ytest) would measure accuracy on the held-out prices
model.score(xtest, predictions)
print(CarPriceDatabase)
The whole second database's output with interpretations:
Car_ID  Symbolling  Car Name                  Fuel Type  Aspiration  Doors Number  CarBody
1       3           alfa-romero giulia        gas        std         two           convertible
2       3           alfa-romero stelvio       gas        std         two           convertible
3       1           alfa-romero Quadrifoglio  gas        std         two           hatchback
4       2           audi 100 ls               gas        std         four          sedan
5       2           audi 100ls                gas        std         four          sedan
continuing… ↓
Remaining column headers: Drive wheel, Engine location, Wheelbase, Engine size, Fuel System,
Bore ratio, Stroke, Compression ratio, Horsepower
Table 9. This table displays the input, that is, the database that we entered. As we
can see, it simply shows the rows and columns of the Excel file that we uploaded. Only
the first five rows of the sheet we provided to the program have been printed.
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 205 entries, 0 to 204
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 car_ID 205 non-null int64
1 symboling 205 non-null int64
2 CarName 205 non-null object
3 fueltype 205 non-null object
4 aspiration 205 non-null object
5 doornumber 205 non-null object
6 carbody 205 non-null object
7 drivewheel 205 non-null object
8 enginelocation 205 non-null object
9 wheelbase 205 non-null float64
10 carlength 205 non-null float64
11 carwidth 205 non-null float64
12 carheight 205 non-null float64
13 curbweight 205 non-null int64
14 enginetype 205 non-null object
15 cylindernumber 205 non-null object
16 enginesize 205 non-null int64
17 fuelsystem 205 non-null object
18 boreratio 205 non-null float64
19 stroke 205 non-null float64
20 compressionratio 205 non-null float64
21 horsepower 205 non-null int64
22 peakrpm 205 non-null int64
23 citympg 205 non-null int64
24 highwaympg 205 non-null int64
25 price 205 non-null float64
dtypes: float64(8), int64(8), object(10)
memory usage: 41.8+ KB
This output displays the number of non-null values in each column and the type of data
that the column holds. It also displays how many columns there are of each data type
and the amount of memory used. Every column in our dataset is listed in this table,
together with whether the data we entered in that column is an integer, a float, or an
object.
car_ID symboling wheelbase carlength carwidth carheight \
count 205.000000 205.000000 205.000000 205.000000 205.000000 205.000000
mean 103.000000 0.834146 98.756585 174.049268 65.907805 53.724878
std 59.322565 1.245307 6.021776 12.337289 2.145204 2.443522
min 1.000000 -2.000000 86.600000 141.100000 60.300000 47.800000
25% 52.000000 0.000000 94.500000 166.300000 64.100000 52.000000
50% 103.000000 1.000000 97.000000 173.200000 65.500000 54.100000
75% 154.000000 2.000000 102.400000 183.100000 66.900000 55.500000
max 205.000000 3.000000 120.900000 208.100000 72.300000 59.800000
This command uses the numerical columns of the dataframe to describe the database
statistically. As is evident, the numbers vary according to the column. Count indicates
the number of values entered for that column, mean displays the average value, and std
displays the standard deviation. Min displays the lowest value. 25% displays the value
below which 25% of the entries fall (the 25th percentile), and likewise for 50% (the
median) and 75%. Max displays the highest value in the column.
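A small worked example may make these statistics easier to read. The five prices below are made up so that each statistic can be checked by hand.

```python
import pandas as pd

# Five made-up prices, evenly spaced so the percentiles are easy to verify.
prices = pd.Series([5000, 7000, 9000, 11000, 13000], name="price")

# describe() returns count, mean, std, min, the 25/50/75 percentiles, and max.
summary = prices.describe()
print(summary)
```

Here the mean and the median (50%) are both 9000, the 25% value is 7000, and the 75% value is 11000.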
<ipython-input-2-3b6c97159ec3>:7: UserWarning:
Please adapt your code to use either `displot` (a figure-level function with
similar flexibility) or `histplot` (an axes-level function for histograms).
For a guide to updating your code to use the new functions, please see
https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
We can observe a warning in this output. It advises modifying the code, since the
distplot function is deprecated and may stop working in a later seaborn release. Right
now, it functions properly. If necessary, the code will be changed to keep it
operational going forward.
Figure 5. This figure shows the distribution of car prices produced by the code execution, drawn as clustered
columns combined with a line that spans the minimum and maximum values. In other words, the graph displays
the typical cost of a car in a market. As we can see, car prices in this market range from $5,000 to $50,000,
with a typical used automobile costing 8–10 thousand dollars. Each database we give the software will produce
a different graph.
car_ID symboling wheelbase carlength carwidth \
car_ID 1.000000 -0.151621 0.129729 0.170636 0.052387
symboling -0.151621 1.000000 -0.531954 -0.357612 -0.232919
wheelbase 0.129729 -0.531954 1.000000 0.874587 0.795144
carlength 0.170636 -0.357612 0.874587 1.000000 0.841118
carwidth 0.052387 -0.232919 0.795144 0.841118 1.000000
carheight 0.255960 -0.541038 0.589435 0.491029 0.279210
curbweight 0.071962 -0.227691 0.776386 0.877728 0.867032
enginesize -0.033930 -0.105790 0.569329 0.683360 0.735433
boreratio 0.260064 -0.130051 0.488750 0.606454 0.559150
stroke -0.160824 -0.008735 0.160959 0.129533 0.182942
compressionratio 0.150276 -0.178515 0.249786 0.158414 0.181129
horsepower -0.015006 0.070873 0.353294 0.552623 0.640732
peakrpm -0.203789 0.273606 -0.360469 -0.287242 -0.220012
citympg 0.015940 -0.035823 -0.470414 -0.670909 -0.642704
highwaympg 0.011255 0.034606 -0.544082 -0.704662 -0.677218
price -0.109093 -0.079978 0.577816 0.682920 0.759325
highwaympg price
car_ID 0.011255 -0.109093
symboling 0.034606 -0.079978
wheelbase -0.544082 0.577816
carlength -0.704662 0.682920
carwidth -0.677218 0.759325
carheight -0.107358 0.119336
curbweight -0.797465 0.835305
enginesize -0.677470 0.874145
boreratio -0.587012 0.553173
stroke -0.043931 0.079443
compressionratio 0.265201 0.067984
horsepower -0.770544 0.808139
peakrpm -0.054275 -0.085267
citympg 0.971337 -0.685751
highwaympg 1.000000 -0.697599
price -0.697599 1.000000
In this output, the information in the table corresponds to the colored matrix. It
shows how strongly each column is related to the others, including the price that the
machine learning system will forecast. The same information appears in the colored
chart that follows.
Figure 6. This graph shows how the numerical columns of the database that we entered relate to one another. The
value 1 runs along the diagonal, as expected, because each column is perfectly correlated with itself. The
variation across the other cells indicates which columns are most closely related to the cost of the automobile.
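The diagonal of 1s can be reproduced on a small made-up table. This sketch selects the numeric columns explicitly before computing the correlations, which is an alternative to passing `numeric_only=True` to `corr`.

```python
import pandas as pd

# Made-up rows; "model" is text and cannot be correlated.
cars = pd.DataFrame({
    "model": ["A1", "A6", "A4", "A3"],
    "enginesize": [1.4, 2.0, 2.0, 1.0],
    "highwaympg": [55.4, 64.2, 67.3, 49.6],
    "price": [12500, 16500, 16800, 17300],
})

# Correlate only the numeric columns; each column against itself gives 1.
correlations = cars.select_dtypes(include="number").corr()
print(correlations)
```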
<ipython-input-3-09e4e61e658b>:7: FutureWarning: In a future version of pandas all arguments
of DataFrame.drop except for the argument 'labels' will be keyword-only.
x = np.array(data.drop([predict], 1))
1.0
I- As we can see, just the value 1.0 is displayed in this output. It is the R² score
returned by the model's score function, for which 1.0 is the best possible value.
Because the code compares the model's predictions with themselves
(model.score(xtest, predictions)), the result is always 1.0, so it confirms that the
program ran to completion rather than measuring how accurate the predictions are.
Calling model.score(xtest, ytest) would evaluate the predictions against the real
prices.
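The difference between the two ways of scoring can be seen in a short sketch. The data is synthetic (an assumption made for illustration), so the numbers say nothing about the car databases in this report.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption): noisy prices that grow with engine size.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 4.0, size=(200, 1))
y = 8000 * x[:, 0] + rng.normal(0, 2000, size=200)

xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2, random_state=0)
model = DecisionTreeRegressor(random_state=0)
model.fit(xtrain, ytrain)
predictions = model.predict(xtest)

# Scoring predictions against themselves always gives a perfect R² of 1.0.
self_score = model.score(xtest, predictions)

# Scoring against the held-out prices measures the real accuracy.
true_score = model.score(xtest, ytest)
error = mean_absolute_error(ytest, predictions)

print("score vs. own predictions:", self_score)
print("score vs. real prices:", true_score)
print("mean absolute error:", error)
```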
- As the figures above show, there is a significant disparity between the two datasets.
They do share one feature, which is the number 1 arranged along the diagonal. This is
because each diagonal cell compares a column with itself, which always gives a correlation
of 1. The major difference is the number of columns and rows in the matrix. The first
dataset, audi.csv, contains fewer numerical columns, so its matrix has fewer squares. The
second database has more squares because it contains a greater variety of numerical columns.
The color scheme is the same, except that the second dataset's size gives rise to a greater
range of hues.
Dataset 1
Dataset 2
- The density, the quantity, and the price range are where we can observe the most
difference in this case. The density axis ranges from 0 to 5 in the first dataset, but from 0
to 0.00010 in the second. Because a density curve is normalized so that its area equals 1,
these values reflect how widely the prices are spread rather than how many automobiles
(rows) each database contains, although the first database does contain far more rows.
Additionally, the average price in the first database appears to be higher than in the
other one, and the disparity is significant. In the first dataset, the price range is 0 to
150,000 dollars, whereas in the second it is 0 to 55,000 dollars. In the first dataset, the
typical automobile costs roughly 20,000 dollars, while in the second the prices average
around 9,000 dollars.
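The effect of normalization on the density axis can be checked with plain NumPy. The two price samples below are synthetic (an assumption for illustration), and the helper `peak_density` is hypothetical.

```python
import numpy as np

# Synthetic prices (assumption): a small, narrow market and a large, wide one.
rng = np.random.default_rng(0)
narrow_small = rng.normal(10_000, 3_000, size=200)
wide_large = rng.normal(20_000, 12_000, size=10_000)


def peak_density(prices, bins=30):
    """Return the tallest density bar and the total area of the histogram."""
    density, edges = np.histogram(prices, bins=bins, density=True)
    area = np.sum(density * np.diff(edges))
    return density.max(), area


narrow_peak, narrow_area = peak_density(narrow_small)
wide_peak, wide_area = peak_density(wide_large)

# Both areas are 1; the widely spread sample has the lower peak
# even though it contains many more rows.
print(narrow_peak, narrow_area)
print(wide_peak, wide_area)
```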
Conclusion
In conclusion, by reading, studying, writing, and carrying out the analyses in this project,
we developed our research skills considerably. We gained substantial knowledge from this
comparative study, which shows how the program generates different kinds of output, such as
graphs, tables, and other visual forms, to illustrate how various characteristics affect the
predicted car prices based on the inputted data. The biggest difference that we noticed was
in the tables, where the two databases differed in the number of rows and columns, with the
second database containing more diverse inputs, including car dimensions. The code we took
and modified displayed the density of prices on the graph and gave insights into the
distribution of prices in each database. Specifically, the first database tended to have
generally higher car prices. Further research and analysis can explore the implications of
these findings for pricing strategies and decision-making in the vehicle industry, and many
companies can use this kind of algorithm or code to apply their own data in projects like this.
References
Chen, J., Han, Q., Li, F., Wang, Q., Xu, J., & Yan, M. (2022). Comparisons of different methods
used for second-hand car price prediction. Paper presented at the, 12259 122594N-122594N-11.
https://doi.org/10.1117/12.2638739
Demiriz, A. (2018). Used car pricing and beyond: A survival analysis framework. In 2018 First
International Conference on Artificial Intelligence for Industries (AI4I) (pp. 65-68). IEEE.
https://doi.org/10.1109/AI4I.2018.8665680
Gegic, E., Isakovic, B., Keco, D., Kevric, J., & Masetic, Z. (2019). Car price prediction using
machine learning techniques. TEM Journal, 8(1), 113-118. https://doi.org/10.18421/TEM81-16
Google. (n.d.). Car price using machine learning flowchart [Image]. Retrieved from
https://www.google.com/search?
q=car+price+using+machine+learning+flowchart&tbm=isch&ved=2ahUKEwiOx57Nlcz-
AhVsiP0HHfSnAz0Q2-
cCegQIABAA&oq=car+price+using+machine+learning+flowchart&gs_lcp=CgNpbWcQA1DjA
Vj7DWD1DmgAcAB4AIABhgGIAdAIkgEDNi41mAEAoAEBqgELZ3dzLXdpei1pbWfAAQE
&sclient=img&ei=OoVLZI7CHeyQ9u8P9M-
O6AM&bih=577&biw=1280&rlz=1C1GCEU_enXK1022XK1022#imgrc=bbw8OUONW79ItM
Hankar, M., Beni-Hssane, A., & Birjali, M. (2022). Used car price prediction using machine
learning: A case study. In 2022 11th International Symposium on Signal, Image, Video and
Communications (ISIVC) (pp. 1-4). IEEE. https://doi.org/10.1109/ISIVC54825.2022.9800719
Hassanien, B. E., Azim, M. A., & Elgohary, M. A. (2020). Used cars price prediction based on
deep learning techniques. In 2020 IEEE 2nd International Conference on Advances in
Computational Intelligence (ICACI) (pp. 332-336). Brno, Czech Republic. doi:
10.1109/ICACI49156.2020.9120244.
Jin, C. (2021). Price prediction of used cars using machine learning. In 2021 IEEE International
Conference on Emergency Science and Information Technology (ICESIT) (pp. 223-230). IEEE.
https://doi.org/10.1109/ICESIT53460.2021.9696839
Jindal, M., & Jain, N. (2020). Machine learning based car price prediction. In 2020 International
Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE) (pp. 1-
6). Jaipur, India. doi: 10.1109/ic-ETITE50223
Afroz, S. A., Masum, M. M. H., Islam, M. M., & Hossain, M. S. (2022). Prediction of car prices
using machine learning: A comprehensive review. In S. S. Das, S. Misra, S. Mukhopadhyay, &
V. Patidar (Eds.), Artificial Intelligence and Machine Learning for Future Networks and Systems
(pp. 73-89). Springer. https://doi.org/10.1007/978-981-18-9228-6_5
Chen, S., & Liu, Z. (2022). Application of data mining technology in second-hand car price
forecasting. In 2022 3rd International Conference on Electronic Communication and Artificial
Intelligence (IWECAI) (pp. 260-273). Zhuhai, China. doi: 10.1109/IWECAI55315.2022.00058
Dawood, H. A., Ibrahim, F. N., & Ali, O. M. G. (2020). Used car price prediction model: A
machine learning approach. In 2020 4th International Conference on Intelligent Computing in
Data Sciences (ICDS) (pp. 1-6). Cairo, Egypt. doi: 10.1109/ICDS48824.2020.9280089
Hossain, M. I., Uddin, M. S., & Islam, M. R. (2021). Car price prediction model development
and analysis for Bangladesh market. In 2021 IEEE 5th International Conference on Computing
Communication and Automation (ICCCA) (pp. 441-446). Noida, India. doi:
10.1109/CCAA51626.2021.9374211
Huang, Y., Zhang, X., & Cheng, H. (2021). Prediction of car price based on ensemble learning
and data cleaning. In 2021 8th International Conference on Information Science and Technology
(ICIST) (pp. 21-26). Nanjing, China. doi: 10.1109/ICIST51598.2021.00009
Li, H. (2021). Research on big data analysis data acquisition and data analysis. In 2021
International Conference on Artificial Intelligence, Big Data and Algorithms (CAIBDA) (pp.
162-165). Xi'an, China. doi: 10.1109/CAIBDA53561.2021.00041
Lukić, T., & Vulić, T. (2019). Predicting the price of used cars using multiple linear regression
and artificial neural network. TEM Journal, 8(1), 113-118. doi: 10.18421/TEM81-15
Narayana, C. V., Likhitha, C. L., Bademiya, S., & Kusumanjali, K. (2021). Machine learning
techniques to predict the price of used cars: Predictive analytics in retail business. In 2021
Second International Conference on Electronics and Sustainable Communication Systems
(ICESC) (pp. 1680-1687). Coimbatore, India. doi: 10.1109/ICESC51422.2021.9532845
Shaikh, M. K., Zaki, H., Tahir, M., Khan, M. A., Siddiqui, O. A., & Rahim, I. U. (2022). The
framework of car price prediction and damage detection technique. Pakistan Journal of
Engineering & Technology, 5(4). https://doi.org/10.51846/vol5iss4pp52-59
Kharwal, A. (2021). Car Price Prediction with Machine Learning. Retrieved from
https://thecleverprogrammer.com/2021/08/04/car-price-prediction-with-machine-learning/
Li, Z., Li, Q., Liu, Y., & Li, X. (2020). Research on used car price prediction based on SVM
optimized by genetic algorithm. In 2020 3rd International Conference on Robotics, Control and
Automation (ICRCA) (pp. 75-80). Wuhan, China. doi: 10.1109/ICRCA49248.2020.9289705.
Shao, W., & Liu, J. (2021). Car price prediction with deep learning. In 2021 International
Conference on Big Data and Blockchain (ICBDB) (pp. 172-178). Chengdu, China. doi:
10.1109/ICBDB52020.2021.00036.
Sowmya, P. S., Anu, G. K., & Joy, P. M. (2021). Analysis and prediction of used car prices using
machine learning techniques. In 2021 3rd International Conference on Inventive Computation
Technologies (ICICT) (pp. 1-6). Coimbatore, India. doi: 10.1109/ICICT51501.2021.9441436.
Thai, D. V., Son, L. N., Tien, P. V., Anh, N. N., & Anh, N. T. N. (2019). Prediction car prices
using quantify qualitative data and knowledge-based system. In 2019 11th International
Conference on Knowledge and Systems Engineering (KSE) (pp. 1-5). Da Nang, Vietnam. doi:
10.1109/KSE.2019.8919408.