AN INTERNSHIP REPORT

ON

“USED CAR PRICE PREDICTION”

Submitted to
SAVITRIBAI PHULE PUNE UNIVERSITY
In Partial Fulfilment of the Requirement for the Award of

THIRD YEAR ENGINEERING IN


INFORMATION TECHNOLOGY
BY

Parth Ravindra Lande Roll no 36

UNDER THE GUIDANCE OF


Mrs. S. A. Joshi

DEPARTMENT OF INFORMATION TECHNOLOGY


TRINITY ACADEMY OF ENGINEERING
Kondhwa Annex, Pune - 411048
2024-2025
TRINITY ACADEMY OF ENGINEERING

Department of Information Technology

CERTIFICATE

This is to certify that Parth Ravindra Lande (36) has successfully submitted the Internship Project
entitled “Used Car Price Prediction” under the guidance of Mrs. S. A. Joshi in the Academic Year 2024-25
at the Information Technology Department of Trinity Academy of Engineering, under the Savitribai Phule
Pune University. This Internship work is duly completed.

Date: 17/02/2025
Place: Pune

Mrs. S. A. Joshi
Internship Guide

Mrs. Shaikh Sumaira, Internship Co-ordinator
Mrs. P. R. Patil, HOD of IT
Dr. R. J. Patil, Principal
Acknowledgements

I would like to acknowledge all the teachers and friends who helped and assisted me throughout
my internship work.

First of all, I would like to thank my respected guide, Mrs. S. A. Joshi, for her timely guidance and
encouragement. This work would not have been possible without her enthusiastic response, insight, and
new ideas. I am thankful to Mrs. Shaikh Sumaira, Internship Co-ordinator, for her timely help and
valuable suggestions.

Furthermore, I would like to thank respected Mrs. P. R. Patil, Head of Department, and Dr. R. J.
Patil, Principal, for their continuous support during my internship work. I am also grateful to all the
faculty members of Trinity Academy of Engineering, Pune for their support and cooperation. I would like
to thank my loving parents for their constant support and inspiration, and I would like to thank all my
friends for their suggestions and support. The acknowledgement would be incomplete without mention of
the blessing of the Almighty, which helped me keep my morale high during difficult periods.

Parth Ravindra Lande


ABSTRACT

The increasing demand for used cars has made accurate price prediction essential for buyers and
sellers. This project aims to develop a machine learning model that predicts the price of a used car based
on factors such as year of manufacture, mileage, fuel type, transmission type, and other key attributes.
The model is trained using a dataset of used car listings and employs data preprocessing techniques to
improve accuracy. The system is integrated into a web application with an intuitive user interface for
ease of access. By leveraging data-driven insights, this solution helps users make informed decisions,
ensuring fair pricing in the used car market.

Keywords: Used Car Price Prediction, Machine Learning, Regression Model, Data Preprocessing,
Price Estimation, Web Application, Automobile Market.
Contents
1 About Topic
1.1 Title
1.2 Domain
1.3 Aim
1.4 Objective
1.5 Problem Statement

2 Introduction
2.1 Need
2.2 Motivation
2.3 Basic Concept

3 Literature Survey
3.1 Traditional Regression Models
3.2 Machine Learning Approaches
3.3 Deep Learning Techniques
3.4 Hybrid Approaches

4 Methodology
4.1 Data Collection
4.2 Data Preprocessing
4.3 Feature Selection and Analysis
4.4 Model Selection and Training

5 Advantages and Disadvantages
5.1 Advantages of Used Car Price Prediction Using Machine Learning
5.2 Disadvantages of Used Car Price Prediction Using Machine Learning

6 System Components
6.0.1 User Interface (Frontend)
6.0.2 Backend (API Server)
6.0.3 CSV Dataset (Stored Locally or in Database)
6.0.4 Deployment

7 System Flow
7.1 Step-by-Step System Flow
7.1.1 User Input and Request Handling
7.1.2 Backend Processing
7.1.3 Machine Learning Model Prediction
7.1.4 Response Processing and Frontend Display
7.2 System Flowchart

8 System Modules
8.1 Data Collection Module
8.2 Data Preprocessing Module
8.3 Prediction and Evaluation Module
8.4 User Interface Dashboard Module

9 Algorithms Used in Car Price Prediction
9.1 Traditional Statistical Models
9.2 Machine Learning Models

10 Implementation

11 Conclusion

References

List of Figures
1 System Flow
2 User Interface
Used Car Price Prediction

1 About Topic
1.1 Title
The title of our project is “Used Car Price Prediction”.

1.2 Domain
Machine Learning

1.3 Aim
To develop a machine learning model that accurately predicts the price of used cars based on key
features, helping buyers and sellers make informed decisions.

1.4 Objective
• Collect and analyze used car data
• Identify key factors affecting car prices
• Evaluate model accuracy and performance
• Develop a user-friendly price prediction system
• Improve predictions with more data and trends

1.5 Problem Statement


Determining the fair price of a used car is challenging due to various influencing factors like brand,
model, year, mileage, and condition. Traditional valuation methods are inconsistent and subjective. This
project aims to develop a machine learning model to predict accurate used car prices based on historical
data and key vehicle attributes, ensuring a data-driven and reliable pricing system.


2 Introduction
2.1 Need
The used car market is growing rapidly, making it challenging for buyers and sellers to determine fair
prices. Manual price estimation is often inaccurate and subjective. This project provides a data-driven
approach using machine learning to predict used car prices accurately, ensuring transparency, better
decision-making, and fair valuation in the automobile market.

2.2 Motivation
The used car market has been growing significantly, driven by increasing consumer demand for
affordable and reliable vehicles. However, determining the fair price of a used car remains a major challenge
due to factors such as varying depreciation rates, mileage, fuel type, and overall condition. Traditional
price estimation methods are often subjective, leading to inconsistent pricing, which affects both buyers
and sellers.
To address this issue, leveraging machine learning techniques for price prediction provides a more
accurate and data-driven approach. By analyzing historical sales data and key vehicle attributes, this
project aims to offer a reliable pricing model that ensures transparency, reduces negotiation time, and
helps users make informed decisions. The motivation behind this project is to bring efficiency and
standardization to the used car market, ultimately benefiting consumers, dealerships, and the automobile
industry as a whole.

2.3 Basic Concept


The Used Car Price Prediction project is a web-based application that utilizes machine learning to
estimate the selling price of a used car based on key attributes such as manufacturing year, mileage, fuel
type, and transmission type. The system is designed to assist both buyers and sellers in making informed
decisions by providing accurate and data-driven price predictions.
The backend consists of a trained machine learning model that processes user inputs and predicts car
prices based on historical data. The frontend is a user-friendly website where users can enter car details
and instantly receive an estimated price. This integration of machine learning with a web interface
ensures accessibility, efficiency, and transparency in the used car market.


3 Literature Survey
3.1 Traditional Regression Models
• Smith et al. (2015) applied linear regression for price prediction but struggled with nonlinear pricing
patterns.
• Brown et al. (2017) used econometric regression models, which were interpretable but lacked
flexibility for real-world datasets.

3.2 Machine Learning Approaches


• Patel et al. (2018) compared Decision Trees, Random Forest, and SVM, finding that Random Forest
performed best with 92
• Gupta and Sharma (2020) tested XGBoost and LightGBM, concluding that XGBoost provided the most
efficient predictions.

3.3 Deep Learning Techniques


• Zhang et al. (2019) implemented Artificial Neural Networks (ANNs), which captured complex
pricing trends but required large datasets.
• Lee and Kim (2021) explored transformer-based models, which improved generalization but required
high computational resources.

3.4 Hybrid Approaches


• Ahmed et al. (2022) combined machine learning with sentiment analysis, showing that integrating
textual reviews improved prediction accuracy.


4 Methodology
The Used Car Price Prediction project follows a structured approach that integrates data science and
web development to create an efficient, user-friendly, and accurate prediction system. The methodology
consists of the following key steps:

4.1 Data Collection


The first step in developing the prediction model is gathering a comprehensive dataset containing details
of used cars and their corresponding selling prices. Data is sourced from various platforms, including
online car marketplaces, publicly available datasets, and web scraping techniques. The dataset comprises
key attributes such as the year of manufacture, mileage, fuel type, transmission type, engine capacity,
brand, and model. Additionally, price information and seller details are included to enhance predictive
accuracy.

4.2 Data Preprocessing


Raw data often contains missing values, duplicate records, and inconsistencies, which need to be
addressed before model training. The following preprocessing steps are applied:

• Handling Missing Values: Missing values are imputed using statistical methods such as mean,
median, or predictive imputation techniques.
• Removing Duplicates and Outliers: Duplicate records are eliminated, and outliers are identified
using statistical approaches like the interquartile range (IQR) method.
• Feature Encoding: Categorical attributes, such as fuel type and transmission, are converted into
numerical values using label encoding or one-hot encoding.
• Normalization and Standardization: Numerical attributes like mileage and engine capacity are
scaled to ensure uniformity and prevent biased predictions.
• Feature Engineering: New features, such as the car’s age (calculated as the difference between the
current year and the manufacturing year) and price per kilometer, are derived to enhance model
accuracy.
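The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the field names (`year`, `mileage`, `fuel`, `price`), the fixed `CURRENT_YEAR`, the choice to impute only mileage with the median, and the choice to apply the IQR rule to price are all hypothetical. The sample rows are made up for the example.

```python
from statistics import median, quantiles

CURRENT_YEAR = 2025  # assumption: reference year for computing car age


def preprocess(rows):
    """Clean a list of car records (dicts) and return model-ready rows."""
    # 1. Remove exact duplicates while preserving order.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # 2. Impute missing mileage with the median of the known values.
    known = [r["mileage"] for r in unique if r["mileage"] is not None]
    fill = median(known)
    for r in unique:
        if r["mileage"] is None:
            r["mileage"] = fill

    # 3. Drop price outliers outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1, _, q3 = quantiles([r["price"] for r in unique], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [r for r in unique if low <= r["price"] <= high]

    # 4. One-hot encode fuel type and derive the car's age.
    fuels = sorted({r["fuel"] for r in kept})
    for r in kept:
        for fuel in fuels:
            r[f"fuel_{fuel}"] = 1 if r["fuel"] == fuel else 0
        r["age"] = CURRENT_YEAR - r["year"]
    return kept


# Illustrative records only; not taken from the project's dataset.
rows = [
    {"year": 2018, "mileage": 40000, "fuel": "Petrol", "price": 500000},
    {"year": 2018, "mileage": 40000, "fuel": "Petrol", "price": 500000},  # duplicate
    {"year": 2016, "mileage": None, "fuel": "Diesel", "price": 450000},   # missing value
    {"year": 2020, "mileage": 20000, "fuel": "Petrol", "price": 600000},
    {"year": 2015, "mileage": 60000, "fuel": "CNG", "price": 400000},
    {"year": 2019, "mileage": 30000, "fuel": "Diesel", "price": 550000},
    {"year": 2017, "mileage": 50000, "fuel": "Petrol", "price": 9000000},  # outlier
]
cleaned = preprocess(rows)
```

Scaling of numerical attributes is left out to keep the sketch short; in practice it would follow the encoding step and be fitted on training data only.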

4.3 Feature Selection and Analysis


Feature selection plays a crucial role in optimizing model performance. Correlation analysis is performed
to identify significant variables affecting car prices. Additionally, techniques such as Recursive Feature
Elimination (RFE) and Principal Component Analysis (PCA) are utilized to select the most relevant
features while reducing dimensionality.
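As an illustration of the correlation analysis step only (RFE and PCA are not covered here), the sketch below ranks candidate features by the absolute value of their Pearson correlation with price. The feature names and toy values are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_features(features, prices):
    """Sort feature names by absolute correlation with price, strongest first."""
    scores = {name: pearson(values, prices) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Illustrative toy data, not taken from the project's dataset.
features = {
    "age": [1, 3, 5, 7, 9],
    "mileage": [10000, 35000, 42000, 80000, 90000],
    "owners": [1, 2, 1, 2, 1],
}
prices = [900000, 700000, 500000, 300000, 100000]
ranking = rank_features(features, prices)
```

In this toy data, age is perfectly negatively correlated with price and the number of owners is uncorrelated, so they land at the top and bottom of the ranking respectively.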

4.4 Model Selection and Training


To develop an efficient predictive model, various machine learning algorithms are explored. These
include:

• Linear Regression: Suitable for understanding the linear relationship between car features and price.

• Decision Trees: Capable of capturing non-linear dependencies and interactions among variables.
• Random Forest: An ensemble learning technique that enhances prediction accuracy by averaging
multiple decision trees.
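One way to compare the three candidate algorithms is to fit each on the same training split and score it on held-out data. The sketch below does this with Scikit-learn, which the report names as its modelling library. The synthetic data, the two features, the 80/20 split, and the use of mean absolute error as the comparison metric are assumptions for illustration, not the project's reported setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: price falls with age and mileage, plus noise.
rng = np.random.default_rng(0)
age = rng.integers(1, 15, size=300)
mileage = age * 10_000 + rng.normal(0, 5_000, size=300)
price = 900_000 - 40_000 * age - 2 * mileage + rng.normal(0, 20_000, size=300)
X = np.column_stack([age, mileage])

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# Fit each candidate and record its mean absolute error on held-out data.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = mean_absolute_error(y_test, model.predict(X_test))

# The candidate with the lowest held-out error on this particular split.
best = min(results, key=results.get)
```

Which model wins depends on the data: the synthetic prices here are linear by construction, so the outcome says nothing about how the models would rank on real listings.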


5 Advantages and Disadvantages


5.1 Advantages of Used Car Price Prediction Using Machine Learning
• Accurate Pricing Estimation: The system provides a data-driven approach to predict the fair
market value of used cars, reducing the chances of overpricing or underpricing.
• Time Efficiency: Manual price estimation requires research and comparison, whereas the machine
learning model instantly provides price predictions, saving time for buyers and sellers.
• Transparency and Fairness: Ensures fair transactions by eliminating biases in price determination,
making the buying and selling process more transparent.
• Reduction in Market Fluctuations Impact: The model considers various factors affecting pricing,
helping stabilize price variations due to uncertain market trends.
• Scalability and Flexibility: The system can be easily updated with new data to improve accuracy
and adapt to changing market conditions.
• User-Friendly Interface: The web application provides a simple and accessible interface, allowing
users with minimal technical knowledge to check car prices easily.

5.2 Disadvantages of Used Car Price Prediction Using Machine Learning


• Limited Accuracy Due to Data Quality: The prediction model relies on historical data, and
inaccuracies in the dataset (e.g., missing values, outdated prices) can lead to incorrect price estimations.
• Difficulty in Considering External Factors: Factors like sudden market demand, regional economic
conditions, or government policies (e.g., tax changes, emission norms) may not always be
captured in the model, affecting prediction reliability.
• Dependence on Feature Selection: If important features like accident history, vehicle maintenance
records, or modifications are not included in the dataset, the model may provide misleading price
estimates.
• Challenges in Handling Subjective Factors: Elements like interior condition, past usage (personal
or commercial), and brand perception, which influence price, are difficult to quantify and integrate
into the model.
• Security and Privacy Concerns: If the system collects user data (e.g., vehicle details, location),
proper security measures are required to protect sensitive information from breaches.


6 System Components
The system components for Used Car price prediction, as shown in the image, include:

6.0.1 User Interface (Frontend)

• React, HTML, CSS for input form and result display

6.0.2 Backend (API Server)

• Flask/Django handles user requests
• Loads the trained ML model
• Sends input features for prediction

6.0.3 CSV Dataset (Stored Locally or in Database)

• Contains historical car data (Year, Mileage, Fuel Type, Transmission, Price, etc.)
• Used for training and retraining the ML model

6.0.4 Deployment

• Frontend: Vercel/Netlify
• Backend Model: AWS/Heroku
• CSV data is read locally or from cloud storage (Google Drive, AWS S3)


7 System Flow
System flow describes the step-by-step process of how data moves through the system, from user input
to the final output.

Figure 1: System Flow

7.1 Step-by-Step System Flow


The provided image illustrates the system flow for used car price prediction, from user input on the
frontend through backend processing and model prediction to the displayed result. Here’s a breakdown of the flow:

7.1.1 User Input and Request Handling

The user interacts with the frontend interface, which is built using React, HTML, and
CSS. They enter car-related details such as:
• Year of Manufacture
• Mileage Driven
• Fuel Type (Petrol/Diesel/CNG)
• Transmission Type (Manual/Automatic)
Upon entering the details, the user submits the form, triggering an API request to the backend
server.

7.1.2 Backend Processing

The backend server, developed using Flask or Django, receives the request and processes the data
as follows:
• Validates the input to ensure completeness and correctness.
• Converts categorical data (e.g., fuel type, transmission) into a numerical format.
• Sends the processed input data to the Machine Learning Model for prediction.
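The validation and encoding steps above can be sketched as a framework-independent function that a Flask or Django request handler could call. The payload keys, the lower bound of 1980 for the year, and the layout of the feature vector are assumptions for illustration. The category lists mirror the input options named in Section 7.1.1.

```python
from datetime import date

# Assumed category sets, taken from the input options listed in Section 7.1.1.
FUEL_TYPES = ("Petrol", "Diesel", "CNG")
TRANSMISSIONS = ("Manual", "Automatic")


def prepare_features(payload, today=None):
    """Validate a request payload and return a numeric feature vector.

    Raises ValueError with a readable message when the input is invalid.
    """
    current_year = (today or date.today()).year

    # Completeness check: every expected field must be present.
    required = ("year", "mileage", "fuel_type", "transmission")
    missing = [field for field in required if field not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")

    # Correctness checks (the 1980 lower bound is a hypothetical choice).
    year, mileage = payload["year"], payload["mileage"]
    if not isinstance(year, int) or not 1980 <= year <= current_year:
        raise ValueError("year must be an integer between 1980 and the current year")
    if not isinstance(mileage, (int, float)) or mileage < 0:
        raise ValueError("mileage must be a non-negative number")
    if payload["fuel_type"] not in FUEL_TYPES:
        raise ValueError(f"fuel_type must be one of {FUEL_TYPES}")
    if payload["transmission"] not in TRANSMISSIONS:
        raise ValueError(f"transmission must be one of {TRANSMISSIONS}")

    # One-hot encode fuel type; encode transmission as a single 0/1 flag.
    fuel_one_hot = [1 if payload["fuel_type"] == f else 0 for f in FUEL_TYPES]
    is_automatic = 1 if payload["transmission"] == "Automatic" else 0
    return [year, float(mileage), *fuel_one_hot, is_automatic]


features = prepare_features(
    {"year": 2018, "mileage": 42000, "fuel_type": "Diesel", "transmission": "Manual"},
    today=date(2025, 2, 17),
)
```

The encoding used at request time has to match the one used during training, so in a real deployment the category order would be stored alongside the trained model rather than hard-coded.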

7.1.3 Machine Learning Model Prediction

The trained machine learning model, developed using Scikit-learn, processes the input features.
The model, trained on a dataset of historical car prices, applies regression techniques to predict the
estimated value of the used car. The predicted price is then sent back to the backend.


7.1.4 Response Processing and Frontend Display

The backend formats the predicted price and sends it as a JSON response to the frontend. The
frontend then:

• Extracts the predicted value from the response.
• Dynamically updates the UI to display the estimated car price.
• (Optional) Enhances the user experience with animations or visual effects.
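A minimal sketch of the response round trip is shown below. The report does not specify the JSON schema, so the key name `predicted_price` and the `currency` field are hypothetical. The frontend in the report is built with React; `parse_response` only mirrors in Python what that code would do with the response body.

```python
import json


def build_response(predicted_price, currency="INR"):
    """Serialize a prediction as the JSON body returned to the frontend."""
    body = {
        "predicted_price": round(float(predicted_price), 2),
        "currency": currency,  # assumed field, not specified in the report
    }
    return json.dumps(body)


def parse_response(raw):
    """Mirror of the frontend step: extract the predicted value."""
    return json.loads(raw)["predicted_price"]


raw = build_response(452318.6789)
price = parse_response(raw)
```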

7.2 System Flowchart


The system flowchart follows a structured pipeline:

• Start
• Data Collection
• Data Pre-processing
• Model Selection
• Prediction and Evaluation
• Output: Predicted Used Car Price
• End


8 System Modules
The system is structured into multiple interconnected modules that work together to predict used car
prices accurately. Each module is designed to handle a specific aspect of the system, from data collection
to deployment.

8.1 Data Collection Module


The Data Collection Module is responsible for gathering relevant data from various sources such as
online car-selling platforms, manufacturer databases, and public datasets. The collected data includes
essential attributes such as car make, model, year of manufacture, mileage, fuel type, transmission type,
and listed price. The data is stored in a structured format to facilitate further processing.

8.2 Data Preprocessing Module


This module processes raw data to ensure its quality and suitability for model training. It involves
handling missing values, removing duplicate entries, encoding categorical variables, and normalizing
numerical attributes. Data preprocessing is essential to improve the accuracy and efficiency of the machine
learning model. Feature engineering techniques may also be applied to enhance the predictive power of
the dataset.

8.3 Prediction and Evaluation Module


This module is responsible for making price predictions based on user input. Users provide details such
as car brand, model, year, and mileage, and the system returns an estimated price. The module also
evaluates the model’s performance using standard metrics such as Mean Absolute Error (MAE), Root
Mean Squared Error (RMSE), and R² Score. The evaluation results help in refining the model for better
accuracy.
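The three metrics named above follow directly from their standard definitions, as the sketch below shows. The actual and predicted prices are made-up values (in thousands) used only to show the arithmetic.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """R² Score: share of price variance explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


actual = [500, 400, 600, 300]      # illustrative prices, in thousands
predicted = [480, 420, 590, 330]

mae_value = mae(actual, predicted)        # errors 20, 20, 10, 30 -> 20.0
rmse_value = rmse(actual, predicted)      # sqrt(1800 / 4) = sqrt(450)
r2_value = r2_score(actual, predicted)    # 1 - 1800 / 50000 = 0.964
```

MAE and RMSE are in the same units as the price, while R² is unitless, which is why the three are usually reported together.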

8.4 User Interface Dashboard Module


This module provides an interactive and user-friendly interface for users to access the system. It allows
users to input car details and receive price predictions. The dashboard also visualizes key insights, such
as price trends, sentiment analysis results, and historical pricing comparisons. The UI is designed using
modern web technologies to ensure responsiveness and ease of use.


9 Algorithms Used in Car Price Prediction


The prediction of used car prices relies on various computational models that analyze historical data and
identify key factors influencing prices. These algorithms can be broadly classified into three categories:
Traditional Statistical Models, Machine Learning Models, and Deep Learning Models. The first two are
described below.

9.1 Traditional Statistical Models


• Linear Regression – Establishes a linear relationship between car price and features.
• Multiple Linear Regression (MLR) – Extends linear regression to multiple variables.
• Polynomial Regression – Fits non-linear price trends using polynomial equations.
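For reference, the three models above can be written in their standard textbook forms. The symbols are assumptions introduced for this illustration: \hat{y} is the predicted price, x and x_j are feature values such as car age or mileage, and the \beta terms are coefficients estimated from the data.

```latex
% Simple linear regression on one feature x (e.g. car age)
\hat{y} = \beta_0 + \beta_1 x

% Multiple linear regression on p features x_1, \dots, x_p
\hat{y} = \beta_0 + \sum_{j=1}^{p} \beta_j x_j

% Polynomial regression of degree d on one feature x
\hat{y} = \beta_0 + \sum_{k=1}^{d} \beta_k x^{k}

% Ordinary least squares: choose coefficients minimizing squared error over n cars
\min_{\beta} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```

All three are linear in the coefficients, so the same least-squares fitting procedure applies to each; polynomial regression only changes the inputs by adding powers of x.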

9.2 Machine Learning Models


These models learn patterns from historical data to make predictions:

• Decision Trees – Splits data based on feature values for price estimation.
• Random Forest – An ensemble of decision trees for improved accuracy.
• Gradient Boosting Machines (GBM) – Includes XGBoost, LightGBM, and CatBoost for iterative
tree-based learning.
• K-Nearest Neighbors (KNN) – Predicts prices based on the closest similar cars.
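Of the models listed, K-Nearest Neighbors is the simplest to show end to end. The sketch below predicts a price as the mean price of the k most similar cars under Euclidean distance. The two features, their scaling, and the training values are illustrative assumptions.

```python
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict a price as the mean price of the k nearest training cars."""
    # Pair each training car's distance to the query with its price.
    neighbors = sorted(
        (dist(features, query), price) for features, price in zip(train_X, train_y)
    )
    nearest = neighbors[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Features are (age in years, mileage in 10,000 km), pre-scaled so that both
# contribute comparably to the Euclidean distance. Values are illustrative.
train_X = [(1, 1.0), (2, 2.5), (3, 3.0), (8, 9.0), (10, 12.0)]
train_y = [850000, 760000, 700000, 320000, 210000]

estimate = knn_predict(train_X, train_y, query=(2, 2.0), k=3)
```

Because KNN relies on raw distances, the normalization step from the preprocessing stage matters more here than for tree-based models: unscaled mileage in kilometres would otherwise dominate the distance.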


10 Implementation

Figure 2: User Interface


11 Conclusion
The Used Car Price Prediction Website provides a user-friendly platform for estimating the fair price
of second-hand cars based on key attributes such as year, mileage, fuel type, and transmission. By
integrating a machine learning model with an interactive frontend, the website enhances transparency
and helps buyers and sellers make informed decisions.
To improve user experience, animations and interactive elements can be added. Future
enhancements may include integrating real-time market trends, expanding the dataset for better accuracy, and
implementing a chatbot for instant assistance. Deploying the site with cloud-based services can further
improve scalability and performance.


References
[1] Sameerchand Pudaruth, “Predicting the Price of Used Cars using Machine Learning Techniques”
(IJICT 2014)
[2] Enis Gegic, Becir Isakovic, Dino Keco, Zerina Masetic, Jasmin Kevric, “Car Price Prediction Using
Machine Learning” (TEM Journal 2019)
[3] Ning Sun, Hongxi Bai, Yuxia Geng, Huizhu Shi, “Price Evaluation Model In Second Hand Car
System Based On BP Neural Network Theory” (Hohai University Changzhou, China)
[4] Nitis Monburinon, Prajak Chertchom, Thongchai Kaewkiriya, Suwat Rungpheung, Sabir Buya,
Pitchayakit Boonpou, “Prediction of Prices for Used Car by using Regression Models” (ICBIR 2018)
[5] Doan Van Thai, Luong Ngoc Son, Pham Vu Tien, Nguyen Nhat Anh, Nguyen Thi Ngoc Anh,
“Prediction car prices using quantify qualitative data and knowledge-based system” (Hanoi National
University)
