Chapter 4: Data Science

Class X

Q1. What is data science?

Ans: Data science is a field that uses scientific methods, processes, algorithms, and systems to
extract knowledge and insights from structured and unstructured data for use in AI
applications.

Q2. What is targeted advertising?

Ans: Targeted advertising is a form of advertising, including online advertising, that is directed
towards an audience with certain traits, based on the product or person that the advertiser is
promoting. It uses past data about the user's needs and choices to select which products to
advertise and when to advertise them.

Q3. What is a recommendation system?

Ans: A recommendation system is a system that can predict a user's future preference among a
set of items and recommend the top items. It helps both retailers/sellers and users by
suggesting items similar to the ones a person likes, or by suggesting items liked by people who
are similar to the user.
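
The second idea above (suggesting items liked by similar people) can be sketched in a few lines of Python. This is only an illustration: the names, the liked items, and the `recommend` function are hypothetical, and similarity is assumed to mean "shares the most liked items", which is far simpler than what real systems use.

```python
# Hypothetical data: the items each person has liked.
likes = {
    "asha": {"book", "pen", "lamp"},
    "ravi": {"book", "pen", "bag"},
    "meena": {"shoes", "hat"},
}


def recommend(user):
    """Suggest items liked by the person most similar to `user`."""
    others = [name for name in likes if name != user]
    # Assumption: the most similar person shares the most liked items.
    most_similar = max(others, key=lambda name: len(likes[user] & likes[name]))
    # Recommend what that person likes and the user has not liked yet.
    return sorted(likes[most_similar] - likes[user])


suggestions = recommend("asha")
print(suggestions)
```

Here Ravi shares two liked items with Asha, so the item only Ravi likes is suggested to Asha.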

Q4. How has data science impacted the healthcare field?

Ans: Data science provides practical insights for crucial decision-making in healthcare.
Data-driven decision-making opens up new possibilities to boost healthcare quality.
Data science has improved healthcare in various ways, such as:

i. Improving diagnostic accuracy and efficiency

ii. Turning patient care into precision medicine

iii. Advancing pharmaceutical research to find cures

iv. Reducing hospital re-admissions by suggesting preventive care, and many more.

Q5. In what ways is data science helpful to the airline industry?

Ans: Data science has proved to be a boon to this industry, as it helps to:

* Predict flight delays

* Decide which class of airplanes to buy

* Decide whether to land directly at the destination or take a halt in between

* Effectively drive customer loyalty programs

Q6. Explain the term outliers. Give an example.

Ans: Outliers are data values that differ drastically from the rest of the data. Such unusual
data needs to be removed or replaced in the data set to get accurate results. For example, a
mark of zero recorded for a student who was absent, instead of marking the student as exempt,
is an outlier. Including it will not give an accurate class average.
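
The effect on the class average can be checked with a small sketch. The marks below are made up for illustration, and the code assumes that every zero in the list belongs to an absent student, which would not be safe if a student could genuinely score zero.

```python
from statistics import mean

# Hypothetical marks for five students; the 0 was recorded for an absentee.
marks = [78, 85, 0, 92, 81]

# Class average with the outlier included.
average_with_outlier = mean(marks)

# Assumption: every 0 is an absentee entry, so drop zeros before averaging.
present_marks = [m for m in marks if m != 0]
average_without_outlier = mean(present_marks)

print(average_with_outlier)     # 67.2
print(average_without_outlier)  # 84
```

The single zero pulls the average down from 84 to 67.2, which is why the outlier has to be handled before the average is reported.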

Q7. What type of data can be used by pandas?

Ans: Pandas can work with the following types of data:

* Tabular data with heterogeneously typed columns, as in an SQL table or Excel spreadsheet

* Ordered and unordered time series data

* Arbitrary matrix data with row and column labels

* Any other form of observation/statistical data sets.
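
A short sketch of the first three kinds of data, assuming pandas is installed. All names and values below are hypothetical examples, not part of the chapter.

```python
import pandas as pd

# Tabular data: each column holds a different data type.
students = pd.DataFrame(
    {
        "name": ["Asha", "Ravi", "Meena"],
        "marks": [88.5, 92.0, 79.5],
        "passed": [True, True, True],
    }
)

# Time series data: values indexed by dates.
dates = pd.date_range("2024-01-01", periods=3, freq="D")
temperature = pd.Series([21.5, 22.1, 20.8], index=dates)

# Matrix data with row and column labels.
matrix = pd.DataFrame([[1, 2], [3, 4]], index=["r1", "r2"], columns=["c1", "c2"])

print(students.dtypes)
print(temperature)
print(matrix.loc["r2", "c1"])
```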

Q8. Why is KNN called a lazy learner algorithm?

Ans: KNN is called a lazy learner algorithm because it does not learn from the training set
immediately. Instead, it stores the data set and performs its computation on the data set only
at the time of classification.
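
The "lazy" behaviour can be seen in a minimal sketch written with the standard library only. The class name `LazyKNN`, the sample points, and the labels are hypothetical; a real project would more likely use a machine learning library.

```python
from collections import Counter
from math import dist


class LazyKNN:
    def __init__(self, k=3):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # "Training" only stores the data set; nothing is learned here.
        self.points = list(points)
        self.labels = list(labels)

    def predict(self, query):
        # All the work happens at classification time:
        # sort the stored points by distance to the query and keep the k nearest.
        nearest = sorted(
            zip(self.points, self.labels),
            key=lambda pair: dist(pair[0], query),
        )[: self.k]
        # The most common label among the k nearest neighbours wins.
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


model = LazyKNN(k=3)
model.fit(
    [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)],
    ["A", "A", "A", "B", "B", "B"],
)
prediction = model.predict((2, 2))
print(prediction)
```

Note that `fit` does no calculation at all, while `predict` measures the distance to every stored point each time it is called.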

Q9. What are the important points to remember when data is collected?

Ans: While handling data online or offline, the following points should always be remembered:

* The source of data should be authentic and reliable, as a random data source could provide
wrong or unusable data.

* For proper training of an AI model, the authenticity of data is a must.

* The privacy of data sources should always be kept in mind, as it is a fundamental right of
everyone.

* The consent of the owner should be sought before using someone’s personal data set.

* Data present in the public domain should preferably be used, if available.

Q10. Explain the box plot graph.

Ans: The box plot graph represents a summary of a set of data values, where a box is created
for each data set using properties like minimum, first quartile, median, third quartile and
maximum. A line goes through the box at the median. Here, the X axis denotes the data to be
plotted while the Y axis shows the scale of values over which the data is distributed.
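
The five values that a box plot is drawn from can be computed as in the sketch below. The data values are hypothetical, and the `"inclusive"` quartile method is one of several conventions, so other tools may give slightly different quartiles for the same data.

```python
from statistics import quantiles

# Hypothetical data set to summarise.
values = [4, 7, 9, 10, 12, 15, 18, 21, 25]

# n=4 splits the data into quarters and returns the three cut points.
first_quartile, median, third_quartile = quantiles(values, n=4, method="inclusive")

summary = {
    "minimum": min(values),
    "first quartile": first_quartile,
    "median": median,
    "third quartile": third_quartile,
    "maximum": max(values),
}
print(summary)
```

For this data the box would run from 9 to 18, with the median line at 12 and the ends at 4 and 25.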

Q11. Differentiate between arrays and lists in Python.

Ans: The following are the differences between arrays and lists:

Array                                          | List

Array is a collection of homogeneous values.   | List is a collection of heterogeneous values.

In an array, data of one type cannot be mixed  | A list works with data of more than one type.
with data of another type.                     |

Arrays can be used only through the NumPy      | Lists occupy more memory space and can be used
package and occupy less memory space.          | directly in Python without any package support.

Mathematical operators can be applied          | Mathematical operators cannot be applied
directly to an array.                          | directly to a list; they must be applied
                                               | separately to individual elements.
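
These differences can be tried out in a short sketch, assuming NumPy is installed. The variable names and values are hypothetical.

```python
import numpy as np

marks_list = [70, 80, 90]
marks_array = np.array([70, 80, 90])

# An operator works on the whole array, element by element.
array_plus_five = marks_array + 5
array_doubled = marks_array * 2

# On a list, the operation has to be applied to each element separately.
list_plus_five = [m + 5 for m in marks_list]
# Note: * on a list repeats the list instead of multiplying its elements.
list_repeated = marks_list * 2

# A list keeps mixed types as they are.
mixed_list = [1, "two", 3.0]
# NumPy converts mixed input to one common type (here, text).
mixed_array = np.array([1, "two", 3.0])

print(array_plus_five, array_doubled)
print(list_plus_five, list_repeated)
print(mixed_array.dtype)
```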

Q12. What is pandas used for?

Ans: Pandas is an open-source Python library used for data manipulation and data analysis.

Q13. What is erroneous data? Explain its two types.

Ans: Erroneous data is test data that falls outside of what is acceptable and should be rejected by
the system.

The two types of erroneous data are:

Incorrect values: The values at random places in the dataset are not correct. Either the data is
mismatched or it is not relevant to that position.

Invalid or null values: The value is either corrupted or has no meaning. When such values occur
in a dataset, they need to be removed, as they hold no value for data processing.
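
A minimal sketch of rejecting such values is given below. The marks are hypothetical, `None` is assumed to stand for a null entry, and the accepted range of 0 to 100 is an assumption made for this example.

```python
# Hypothetical marks out of 100; None stands for a null entry.
raw_marks = [56, 101, None, 73, -4, 88]


def is_valid(mark):
    # Assumption: a valid mark is present and lies between 0 and 100.
    return mark is not None and 0 <= mark <= 100


clean_marks = [m for m in raw_marks if is_valid(m)]
rejected = [m for m in raw_marks if not is_valid(m)]

print(clean_marks)  # [56, 73, 88]
print(rejected)     # [101, None, -4]
```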

Q14. What are packages in Python?

Ans: Python packages are a way to organize and structure Python code into reusable
components. A package is like a folder that contains related Python files (modules) that work
together to provide certain functionality. Packages help keep code organized, make it easier to
manage and maintain, and allow us to share our code with others.
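
The difference between a package and a single module can be seen with two items from Python's standard library. This sketch relies on the fact that packages carry a `__path__` attribute while plain modules do not; the variable names are hypothetical.

```python
import json            # a package: a folder of related modules
import json.decoder    # one module inside the json package
import math            # a single module, not a package

# Packages have a __path__ attribute pointing at their folder.
json_is_package = hasattr(json, "__path__")
math_is_package = hasattr(math, "__path__")

print(json_is_package)  # True
print(math_is_package)  # False
```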

Q15. Explain the different formats in which a tabular dataset can be stored.

Ans: A tabular dataset can be stored in different formats. Some of the commonly used formats
are:

CSV: It stands for comma-separated values. It is a simple file format used to store tabular
data. Each line of the file is a data record, and each record consists of one or more fields
separated by commas.

Spreadsheet: A spreadsheet is a piece of paper or a computer program used for accounting and
recording data in rows and columns into which information can be entered.

SQL: Structured Query Language is a domain-specific language designed for managing data held
in different kinds of DBMS (database management systems). It is particularly useful in handling
structured data.
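
The CSV and SQL formats described above can be tried with Python's standard library, as in the sketch below. The table name `students`, the column names, and the records are hypothetical, and an in-memory SQLite database is used only so that the example needs no files.

```python
import csv
import io
import sqlite3

# CSV: each line is a record and the fields are separated by commas.
csv_text = "name,marks\nAsha,88\nRavi,92\n"
records = list(csv.DictReader(io.StringIO(csv_text)))
print(records)  # every field is read as text

# SQL: store the same records in a table and query them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(r["name"], int(r["marks"])) for r in records],
)
top_scorers = conn.execute(
    "SELECT name FROM students WHERE marks > 90"
).fetchall()
conn.close()
print(top_scorers)
```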

Extra Questions
