Report
ON
Submitted to
SAVITRIBAI PHULE PUNE UNIVERSITY
In Partial Fulfilment of the Requirement for the Award of
CERTIFICATE
This is to certify that Parth Ravindra Lande (36) has successfully submitted the Internship Project entitled
“Used Car Price Prediction” under the guidance of Mrs. S. A. Joshi in the Academic Year 2024-25
at the Information Technology Department of Trinity Academy of Engineering, under the Savitribai Phule
Pune University. This Internship work is duly completed.
Date: 17/02/2025
Place: Pune
Mrs. S. A. Joshi
Internship Guide
I would like to acknowledge all the teachers and friends who helped and assisted me throughout
my Seminar work.
First of all, I would like to thank my respected guide, Mrs. S. A. Joshi, for her timely guidance and
encouragement. This work would not have been possible without her enthusiastic response, insight, and
new ideas. I am thankful to Mrs. Shaikh Sumaira, Internship Co-ordinator, for her timely help and
valuable suggestions.
Furthermore, I would like to thank respected Mrs. P. R. Patil, Head of Department, and Dr. R. J.
Patil, Principal, for their continuous support during my Seminar work. I am also grateful to all the faculty
members of Trinity Academy of Engineering, Pune for their support and cooperation. I would like to
thank my loving parents for their constant support and inspiration, and all my friends
for their suggestions and support. The acknowledgement would be incomplete without mention of the
blessing of the Almighty, which helped me keep my morale high during difficult periods.
The increasing demand for used cars has made accurate price prediction essential for buyers and sellers.
This project aims to develop a machine learning model that predicts the price of a used car based
on factors such as year of manufacture, mileage, fuel type, transmission type, and other key attributes.
The model is trained using a dataset of used car listings and employs data preprocessing techniques to
improve accuracy. The system is integrated into a web application with an intuitive user interface for
ease of access. By leveraging data-driven insights, this solution helps users make informed decisions,
ensuring fair pricing in the used car market.
Keywords: Used Car Price Prediction, Machine Learning, Regression Model, Data Preprocessing,
Price Estimation, Web Application, Automobile Market.
Contents
1 About Topic
1.1 Title
1.2 Domain
1.3 Aim
1.4 Objective
1.5 Problem Statement
2 Introduction
2.1 Need
2.2 Motivation
2.3 Basic Concept
3 Literature Survey
3.1 Traditional Regression Models
3.2 Machine Learning Approaches
3.3 Deep Learning Techniques
3.4 Hybrid Approaches
4 Methodology
4.1 Data Collection
4.2 Data Preprocessing
4.3 Feature Selection and Analysis
4.4 Model Selection and Training
6 System Components
6.0.1 User Interface (Frontend)
6.0.2 Backend (API Server)
6.0.3 CSV Dataset (Stored Locally or in Database)
6.0.4 Deployment
7 System Flow
7.1 Step-by-Step System Flow
7.1.1 User Input and Request Handling
7.1.2 Backend Processing
7.1.3 Machine Learning Model Prediction
7.1.4 Response Processing and Frontend Display
7.2 System Flowchart
8 System Modules
8.1 Data Collection Module
8.2 Data Preprocessing Module
8.3 Prediction and Evaluation Module
8.4 User Interface Dashboard Module
10 Implementation
11 Conclusion
References
List of Figures
1 System Flow
2 User Interface
Used Car Price Prediction
1 About Topic
1.1 Title
The title of our project is “Used Car Price Prediction”.
1.2 Domain
Machine Learning
1.3 Aim
To develop a machine learning model that accurately predicts the price of used cars based on key
features, helping buyers and sellers make informed decisions.
1.4 Objective
• Collect and analyze used car data
• Identify key factors affecting car prices
• Evaluate model accuracy and performance
• Develop a user-friendly price prediction system
• Improve predictions with more data and trends
2 Introduction
2.1 Need
The used car market is growing rapidly, making it challenging for buyers and sellers to determine fair
prices. Manual price estimation is often inaccurate and subjective. This project provides a data-driven
approach using machine learning to predict used car prices accurately, ensuring transparency, better
decision-making, and fair valuation in the automobile market.
2.2 Motivation
The used car market has been growing significantly, driven by increasing consumer demand for afford-
able and reliable vehicles. However, determining the fair price of a used car remains a major challenge
due to factors such as varying depreciation rates, mileage, fuel type, and overall condition. Traditional
price estimation methods are often subjective, leading to inconsistent pricing, which affects both buyers
and sellers.
To address this issue, leveraging machine learning techniques for price prediction provides a more
accurate and data-driven approach. By analyzing historical sales data and key vehicle attributes, this
project aims to offer a reliable pricing model that ensures transparency, reduces negotiation time, and
helps users make informed decisions. The motivation behind this project is to bring efficiency and
standardization to the used car market, ultimately benefiting consumers, dealerships, and the automobile
industry as a whole.
3 Literature Survey
3.1 Traditional Regression Models
• Smith et al. (2015) applied linear regression for price prediction but struggled with nonlinear pricing
patterns.
• Brown et al. (2017) used econometric regression models, which were interpretable but lacked
flexibility for real-world datasets.
4 Methodology
The Used Car Price Prediction project follows a structured approach that integrates data science and
web development to create an efficient, user-friendly, and accurate prediction system. The methodology
consists of the following key steps:
• Handling Missing Values: Missing values are imputed using statistical methods such as mean,
median, or predictive imputation techniques.
• Removing Duplicates and Outliers: Duplicate records are eliminated, and outliers are identified
using statistical approaches like the interquartile range (IQR) method.
• Feature Encoding: Categorical attributes, such as fuel type and transmission, are converted into
numerical values using label encoding or one-hot encoding.
• Normalization and Standardization: Numerical attributes like mileage and engine capacity are
scaled to ensure uniformity and prevent biased predictions.
• Feature Engineering: New features, such as the car’s age (calculated as the difference between the
current year and the manufacturing year) and price per kilometer, are derived to enhance model
accuracy.
• Linear Regression: Suitable for understanding the linear relationship between car features and price.
• Decision Trees: Capable of capturing non-linear dependencies and interactions among variables.
• Random Forest: An ensemble learning technique that enhances prediction accuracy by averaging
multiple decision trees.
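The preprocessing and feature-engineering steps listed earlier in this section can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the sample listings, the field names, the `preprocess` function, and the fixed reference year are all hypothetical, and median imputation, the 1.5 × IQR rule, and z-score scaling are one choice among the options the text mentions.

```python
from statistics import mean, median, pstdev, quantiles

CURRENT_YEAR = 2025  # assumption: fixed reference year used to compute car age

# Hypothetical raw listings; None marks a missing value.
raw_listings = [
    {"year": 2015, "mileage": 60000, "fuel": "Petrol", "transmission": "Manual", "price": 450000},
    {"year": 2018, "mileage": None, "fuel": "Diesel", "transmission": "Automatic", "price": 800000},
    {"year": 2015, "mileage": 60000, "fuel": "Petrol", "transmission": "Manual", "price": 450000},  # duplicate
    {"year": 2012, "mileage": 90000, "fuel": "CNG", "transmission": "Manual", "price": 250000},
    {"year": 2020, "mileage": 20000, "fuel": "Petrol", "transmission": "Automatic", "price": 900000},
    {"year": 2016, "mileage": 55000, "fuel": "Diesel", "transmission": "Manual", "price": 9000000},  # price outlier
]


def preprocess(listings, current_year=CURRENT_YEAR):
    rows = [dict(r) for r in listings]  # work on copies, leave the input untouched

    # 1. Handling missing values: fill missing mileage with the observed median.
    fill = median(r["mileage"] for r in rows if r["mileage"] is not None)
    for r in rows:
        if r["mileage"] is None:
            r["mileage"] = fill

    # 2. Removing duplicates: keep the first occurrence of each identical record.
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 3. Removing outliers: drop prices outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = quantiles([r["price"] for r in unique], n=4, method="inclusive")
    iqr = q3 - q1
    kept = [r for r in unique if q1 - 1.5 * iqr <= r["price"] <= q3 + 1.5 * iqr]

    # 4. Standardization: z-score for mileage (guard against zero spread).
    mileages = [r["mileage"] for r in kept]
    mu, sigma = mean(mileages), pstdev(mileages) or 1.0

    # 5. Feature encoding (one-hot) and feature engineering (car age).
    # Price per kilometer, also named in the text, is omitted here because it is
    # computed from the target price itself.
    fuels = sorted({r["fuel"] for r in kept})
    gears = sorted({r["transmission"] for r in kept})
    features = []
    for r in kept:
        row = {"age": current_year - r["year"], "mileage_z": (r["mileage"] - mu) / sigma}
        row.update({f"fuel_{f}": int(r["fuel"] == f) for f in fuels})
        row.update({f"transmission_{g}": int(r["transmission"] == g) for g in gears})
        features.append(row)
    targets = [r["price"] for r in kept]
    return features, targets


features, targets = preprocess(raw_listings)
```

On this toy data the duplicate record and the extreme price are removed, leaving four rows. On a real dataset the same steps would more likely be written with a data-frame library, but the order of operations would be similar.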
6 System Components
The system components for Used Car Price Prediction, as shown in the image, include:
• Contains historical car data (Year, Mileage, Fuel Type, Transmission, Price, etc.)
• Used for training and retraining the ML model
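As an illustration of how such a CSV dataset might be loaded for training, the sketch below parses a small in-memory sample with Python's standard `csv` module. The column names (`year`, `mileage`, `fuel_type`, `transmission`, `price`), the sample rows, and the `load_listings` function are assumptions made for this example; the project's real file layout is not given in this report.

```python
import csv
import io

# Hypothetical sample with assumed column names; not the project's real data.
SAMPLE_CSV = """year,mileage,fuel_type,transmission,price
2015,60000,Petrol,Manual,450000
2018,35000,Diesel,Automatic,800000
"""


def load_listings(handle):
    """Read car listings from an open text handle and convert numeric columns."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "year": int(record["year"]),
            "mileage": float(record["mileage"]),
            "fuel_type": record["fuel_type"],
            "transmission": record["transmission"],
            "price": float(record["price"]),
        })
    return rows


listings = load_listings(io.StringIO(SAMPLE_CSV))
```

The same function would accept a file opened with `open(path, newline="")`, so the source of the data (local disk or a downloaded copy) can change without touching the parsing logic.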
6.0.4 Deployment
• Frontend: Vercel/Netlify
• Backend Model: AWS/Heroku
• CSV data is read locally or from cloud storage (Google Drive, AWS S3)
7 System Flow
System flow describes the step-by-step process of how data moves through the system, from user input
to the final output.
The user interacts with the frontend interface, which is built using React, HTML, and
CSS. They enter car-related details such as:
• Year of Manufacture
• Mileage Driven
• Fuel Type (Petrol/Diesel/CNG)
• Transmission Type (Manual/Automatic)
Upon entering the details, the user submits the form, triggering an API request to the backend
server.
The backend server, developed using Flask or Django, receives the request and processes the data
as follows:
• Validates the input to ensure completeness and correctness.
• Converts categorical data (e.g., fuel type, transmission) into a numerical format.
• Sends the processed input data to the Machine Learning Model for prediction.
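The validation and encoding steps above could look roughly like the following framework-independent sketch. The function name `validate_and_encode`, the payload keys, the lower bound of 1980 on the year, and the order of the feature vector are assumptions for illustration only; the category lists follow the options named in the text.

```python
FUEL_TYPES = ("Petrol", "Diesel", "CNG")
TRANSMISSIONS = ("Manual", "Automatic")


def validate_and_encode(payload, current_year=2025):
    """Return (feature_vector, errors); feature_vector is None when invalid."""
    required = ("year", "mileage", "fuel", "transmission")
    errors = [f"missing field: {name}" for name in required if name not in payload]
    if errors:
        return None, errors

    # Correctness checks (the accepted year range is an assumption).
    year, mileage = payload["year"], payload["mileage"]
    if type(year) is not int or not 1980 <= year <= current_year:
        errors.append("year must be an integer between 1980 and the current year")
    if isinstance(mileage, bool) or not isinstance(mileage, (int, float)) or mileage < 0:
        errors.append("mileage must be a non-negative number")
    if payload["fuel"] not in FUEL_TYPES:
        errors.append("fuel must be one of: " + ", ".join(FUEL_TYPES))
    if payload["transmission"] not in TRANSMISSIONS:
        errors.append("transmission must be one of: " + ", ".join(TRANSMISSIONS))
    if errors:
        return None, errors

    # One-hot encode the categorical fields in a fixed, documented order.
    fuel_one_hot = [float(payload["fuel"] == f) for f in FUEL_TYPES]
    gear_one_hot = [float(payload["transmission"] == t) for t in TRANSMISSIONS]
    return [float(year), float(mileage), *fuel_one_hot, *gear_one_hot], []


vector, problems = validate_and_encode(
    {"year": 2018, "mileage": 42000, "fuel": "Diesel", "transmission": "Manual"}
)
```

Keeping the category order fixed matters: the encoded vector must match the column order the model saw during training, otherwise predictions are silently wrong.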
The trained machine learning model, developed using Scikit-learn, processes the input features.
The model, trained on a dataset of historical car prices, applies regression techniques to predict the
estimated value of the used car. The predicted price is then sent back to the backend.
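A framework-agnostic sketch of this prediction step is given below: the backend passes features to an object with a Scikit-learn-style `predict` method and wraps the result as JSON. `StubPriceModel`, `handle_predict`, and the `predicted_price` key are hypothetical names, and the stub's pricing rule is a placeholder standing in for the trained regressor, not the model used in this project.

```python
import json


class StubPriceModel:
    """Hypothetical stand-in for a trained regressor exposing predict()."""

    def predict(self, rows):
        # Placeholder rule, NOT a trained model: a base price reduced by
        # fixed amounts per year of age and per kilometer driven.
        return [max(0.0, 1_000_000 - 50_000 * age - 2.0 * mileage) for age, mileage in rows]


def handle_predict(body, model, current_year=2025):
    """Parse a JSON request body, run the model, and return a JSON string."""
    data = json.loads(body)
    # One row of features: [car age, mileage]; predict() expects a list of rows.
    features = [[current_year - data["year"], data["mileage"]]]
    price = model.predict(features)[0]
    return json.dumps({"predicted_price": round(float(price), 2)})


response = handle_predict('{"year": 2020, "mileage": 30000}', StubPriceModel())
```

In a Flask or Django view, the body of `handle_predict` would sit inside the route handler, with the real model object loaded once at startup rather than on every request.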
The backend formats the predicted price and sends it as a JSON response to the frontend, which
then displays the predicted price to the user.
• Start
• Data Collection
• Data Pre-processing
• Model Selection
• Prediction Evaluation
• Output: Predicted Used Car Price
• End
8 System Modules
The system is structured into multiple interconnected modules that work together to predict used car
prices accurately. Each module is designed to handle a specific aspect of the system, from data collection
to deployment.
• Decision Trees – Splits data based on feature values for price estimation.
• Random Forest – An ensemble of decision trees for improved accuracy.
• Gradient Boosting Machines (GBM) – Includes XGBoost, LightGBM, and CatBoost for iterative
tree-based learning.
• K-Nearest Neighbors (KNN) – Predicts prices based on the closest similar cars.
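The model families listed above can be compared with a short Scikit-learn sketch such as the one below. It is an illustration only: the data is synthetic (generated from an assumed depreciation formula, not from the project's dataset), the hyperparameters are arbitrary, and Scikit-learn's own `GradientBoostingRegressor` is used as a stand-in for XGBoost, LightGBM, or CatBoost so that the example needs no extra libraries.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic data under an assumed pricing rule; not the project's dataset.
rng = np.random.default_rng(0)
n = 400
age = rng.integers(0, 15, n)
mileage = rng.uniform(5_000, 150_000, n)
price = 900_000 * 0.88 ** age - 1.5 * mileage + rng.normal(0, 20_000, n)
price = np.clip(price, 50_000, None)  # keep prices positive

X = np.column_stack([age, mileage])
X_train, X_test, y_train, y_test = train_test_split(X, price, test_size=0.25, random_state=0)

models = {
    "Decision Tree": DecisionTreeRegressor(max_depth=6, random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    # KNN is distance-based, so features are standardized first.
    "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
}

# Fit each model and record its mean absolute error on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {scores[name]:.0f}")
```

Scores on synthetic data say nothing about which model suits real listings; the point is the shared `fit`/`predict` interface, which lets the evaluation loop stay the same when a model is swapped.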
10 Implementation
11 Conclusion
The Used Car Price Prediction Website provides a user-friendly platform for estimating the fair price
of second-hand cars based on key attributes such as year, mileage, fuel type, and transmission. By
integrating a machine learning model with an interactive frontend, the website enhances transparency
and helps buyers and sellers make informed decisions.
To improve user experience, animations and interactive elements can be added. Future enhancements
may include integrating real-time market trends, expanding the dataset for better accuracy, and
implementing a chatbot for instant assistance. Deploying the site with cloud-based services can further
improve scalability and performance.