Python, Machine Learning and Statistics
Example: If you're testing whether a new drug is more effective than an old
one, a p-value of 0.03 means that, if there were truly no difference between
the drugs, a difference at least as large as the one observed would occur
about 3% of the time by random chance alone.
Interpretation: A p-value less than 0.05 is typically taken as evidence
against the null hypothesis, leading to its rejection. A higher p-value suggests
insufficient evidence to reject the null hypothesis.
A 95% confidence interval means that if you were to take 100 different
samples and compute a confidence interval for each sample, approximately
95 of those intervals would contain the true population parameter.
The CLT states that the distribution of the sample mean approaches a normal
distribution as the sample size becomes larger, regardless of the original
population distribution.
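The effect can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: it draws from an exponential distribution (clearly non-normal, with mean 1 and standard deviation 1), and the sample size of 50, the 2,000 repetitions, and the fixed seed are arbitrary choices, not values from the text.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

SAMPLE_SIZE = 50     # assumed size of each sample
REPETITIONS = 2000   # assumed number of repeated samples

def sample_mean(n):
    """Mean of n draws from an exponential distribution (mean 1, sd 1)."""
    return statistics.fmean(random.expovariate(1.0) for _ in range(n))

sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(REPETITIONS)]

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
theoretical_spread = 1 / SAMPLE_SIZE ** 0.5

# For a normal distribution, about 95% of values lie within
# 1.96 standard deviations of the mean.
within = sum(abs(m - 1.0) <= 1.96 * theoretical_spread for m in sample_means)
fraction_within = within / REPETITIONS

print(round(mean_of_means, 3), round(spread_of_means, 3), round(theoretical_spread, 3))
print(round(fraction_within, 3))
```

If the CLT holds, the sample means should centre near 1, their spread should be close to 1 / sqrt(50), and roughly 95% of them should fall within 1.96 theoretical standard deviations of the population mean.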
Type II Error (False Negative): Failing to reject the null hypothesis when it is
actually false. E.g. Concluding a drug is not effective when it actually is.
Statistical significance tells us whether the results we observe are likely due
to chance or if there is a real effect. If a result is statistically significant, it
means that it is unlikely to have occurred by random chance.
Example: In a clinical trial, a test with high power will more likely detect a real
difference in drug effectiveness if it exists.
Interpretation: Higher statistical power reduces the risk of Type II errors and
ensures that true effects are detected.
Outliers: Data points that differ significantly from other observations in the
dataset. They can result from variability in the data or errors in measurement.
Example: In a dataset of human weights, a weight of 500 kg would be an
outlier.
Example: Use a t-test for normally distributed data and a Mann-Whitney U test
for data that are not normally distributed.
14. You are working for an e-commerce company and want to determine if a new
marketing campaign significantly increases the average purchase amount
compared to the previous campaign. What statistical test would you use?
Example: Compare the average purchase amounts from customers who saw
the new campaign with those who saw the previous campaign.
15. A medical researcher wants to compare the blood pressure levels of patients
before and after taking a new medication. The same patients are measured
before and after the treatment. What type of statistical test should be used?
Answer: You should use a paired samples t-test (or dependent samples t-
test). This test is used to compare the means of two related groups.
Example: Measure the blood pressure of the same patients before starting
the medication and after completing the treatment, then compare the two
sets of measurements.
Interpretation: This test assesses whether the mean difference between the
two related groups is statistically significant.
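To make the mechanics concrete, here is a minimal sketch of the paired t statistic computed by hand with the standard library. The blood pressure readings are hypothetical values invented for illustration; in practice a statistics library would also supply the p-value from the t distribution with n - 1 degrees of freedom.

```python
import math
import statistics

# Hypothetical systolic blood pressure readings (mmHg) for the same five patients.
before = [150, 142, 138, 160, 155]
after = [144, 140, 135, 150, 149]

# The paired test works on the per-patient differences.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)            # sample standard deviation
t_statistic = mean_diff / (sd_diff / math.sqrt(n))
degrees_of_freedom = n - 1

print(differences, mean_diff, round(t_statistic, 3), degrees_of_freedom)
```

A large absolute t statistic relative to the t distribution with n - 1 degrees of freedom suggests that the mean difference is unlikely to be zero.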
16. You want to examine if there is an association between smoking and lung
cancer incidence in a study. You collect categorical data on smoking status
(smoker/non-smoker) and lung cancer status (present/absent) from a sample of
individuals. What statistical test should you apply?
Answer: You would use the Chi-Square Test of Independence. This test
evaluates if there is a significant association between two categorical
variables.
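As a rough illustration, the chi-square statistic for a 2x2 table can be computed directly from observed and expected counts. The counts below are hypothetical, and the sketch applies no continuity correction; the resulting statistic would still need to be compared against the chi-square distribution to obtain a p-value.

```python
# Hypothetical 2x2 table of counts: rows = smoker / non-smoker,
# columns = lung cancer present / absent.
observed = [
    [30, 70],
    [10, 90],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if smoking status and cancer status were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 3), degrees_of_freedom)
```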
17. A company wants to determine if the average number of daily hours spent on
training differs between three departments (Marketing, Sales, and Customer
Service). What statistical test would be appropriate for this situation?
Answer: You should use a one-way ANOVA (Analysis of Variance) test. This
test compares the means of three or more independent groups to see if at
least one group mean is different from the others.
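The F statistic behind a one-way ANOVA can be sketched as the ratio of between-group to within-group variance. The training hours below are hypothetical, and the group sizes are kept tiny so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical daily training hours per employee in each department.
groups = {
    "Marketing": [2, 3, 4],
    "Sales": [4, 5, 6],
    "Customer Service": [6, 7, 8],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = statistics.fmean(all_values)

# Variation of the group means around the grand mean.
ss_between = sum(
    len(v) * (statistics.fmean(v) - grand_mean) ** 2 for v in groups.values()
)
# Variation of individual values around their own group mean.
ss_within = sum(
    (x - statistics.fmean(v)) ** 2 for v in groups.values() for x in v
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f_statistic, df_between, df_within)
```

A large F value relative to the F distribution with these degrees of freedom indicates that at least one group mean differs from the others.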
18. You are analyzing customer satisfaction ratings from two different branches
of a restaurant chain. Each branch has collected ratings on a 5-point scale from
100 customers. What statistical test would you use to compare the ratings
between the two branches?
19. You have a dataset with several features and want to predict whether a
customer will buy a product (binary outcome: Yes or No). What model would you
choose for this classification task, and how would you evaluate its performance?
Answer: For a binary classification task, you could use models such as
Logistic Regression, Decision Trees, Random Forests, or Support Vector
Machines (SVMs).
Evaluation Metrics:
Interpretation: The chosen metrics provide insights into how well the model
performs and where it might need improvement, especially if the classes are
imbalanced.
20. You are working on a regression problem where you need to predict house
prices based on features like size, location, and number of bedrooms. How
would you select the appropriate regression model and evaluate its
performance?
Answer: For regression tasks, you might choose models such as Linear
Regression, Ridge Regression, Lasso Regression, or Random Forest
Regression.
Evaluation Metrics:
● Mean Absolute Error (MAE): Average absolute difference between predicted and
actual values.
● Mean Squared Error (MSE): Average squared difference between predicted and
actual values. MSE penalizes larger errors more than MAE.
● Root Mean Squared Error (RMSE): Square root of MSE, providing error in the same
units as the target variable.
● R² (Coefficient of Determination): Proportion of variance in the dependent variable
that is predictable from the independent variables.
Interpretation: These metrics help assess the accuracy and reliability of the
regression model and guide improvements in feature selection or model
complexity.
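The four metrics listed above can be computed with a few lines of plain Python. This is a minimal sketch: the function name regression_metrics and the price values are assumptions made for illustration, not part of the original answer.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)                     # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)     # total variation
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical house prices (in thousands) and model predictions.
actual = [200, 250, 300, 350]
predicted = [210, 240, 310, 340]
metrics = regression_metrics(actual, predicted)
print(metrics)
```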
21. What is the difference between list, tuple, and set in Python?
Answer: In Python, lists, tuples, and sets are data structures used to store collections of
items. Each has its own characteristics, which makes them suitable for different
situations.
List: A list is an ordered collection of items that can be modified (mutable). Lists allow
duplicate elements.
Characteristics:
○ Ordered: Items have a defined order, and you can access them by index.
○ Mutable: You can change, add, or remove items after the list is created.
Tuple
Characteristics:
○ Ordered: Items have a defined order, and you can access them by index.
Interpretation: Use tuples when you need a collection of items in a specific order but
don’t want to allow modifications to the collection (e.g., fixed data like coordinates or
database records).
Set
Definition: A set is an unordered collection of unique items. Sets are mutable, but they
do not allow duplicates.
Characteristics:
○ Unordered: Sets don’t maintain order, so you can’t access items by index.
Interpretation: Use sets when you need to store unique items and don’t care about the
order (e.g., storing a collection of unique user IDs or eliminating duplicates from a list).
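The differences between the three types can be seen in a short sketch. The variable names and values here are illustrative only.

```python
items_list = [3, 1, 3, 2]        # ordered, mutable, duplicates allowed
items_list.append(4)             # lists can grow in place
first_item = items_list[0]       # index access works

point = (10, 20)                 # ordered, immutable
try:
    point[0] = 99                # tuples reject item assignment
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

unique_items = set(items_list)   # duplicates are dropped, no index access
unique_items.add(2)              # adding an existing item changes nothing

print(items_list, first_item, point, tuple_was_modified, sorted(unique_items))
```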
Interpretation of Virtual Environments: Let’s say you are working on two projects:
Without virtual environments, installing Django globally would result in conflicts, as the
two versions would overwrite each other. Using virtual environments, however, you can:
Each project works independently with the correct version of Django, and they don’t
interfere with one another.
Conclusion:
Answer: The append() method adds a single element to the end of a list, while the
extend() method adds the elements of an iterable (e.g., list, tuple) to the end of the
list.
Example:
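A minimal sketch of the difference, using illustrative values:

```python
numbers = [1, 2, 3]
numbers.append([4, 5])      # the list [4, 5] is added as ONE element
print(numbers)              # [1, 2, 3, [4, 5]]

letters = ["a", "b"]
letters.extend(["c", "d"])  # each element of the iterable is added separately
print(letters)              # ['a', 'b', 'c', 'd']
```

Both methods modify the list in place and return None.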
24. How do the map(), reduce() and filter() functions work in Python? Provide an
example.
Answer: In Python, map(), filter(), and reduce() are higher-order functions, meaning
they take other functions as arguments. They are commonly used for applying
operations to collections like lists, tuples, etc.
map()
Purpose: map() applies a given function to all items in an iterable (like a list) and returns
a map object (which can be converted to a list, tuple, etc.).
1. A function.
Interpretation: In this example, the square() function is applied to each element in the
list numbers. The result is a map object, which is then converted to a list to view the
squared values.
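A minimal sketch consistent with this description, assuming illustrative values for numbers:

```python
def square(x):
    return x * x

numbers = [1, 2, 3, 4]            # illustrative values
squared = map(square, numbers)    # lazy map object, nothing computed yet
squared_list = list(squared)      # consuming the map object applies square()
print(squared_list)               # [1, 4, 9, 16]
```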
filter()
Purpose: filter() applies a given function to each item of an iterable and returns an
iterator over only those items for which the function returns a true value.
2. An iterable.
The function is applied to each element, and only the elements that make the function
return True are included in the result.
Example:
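A minimal sketch of filter() in use. The helper is_even and the input values are hypothetical, chosen only for illustration.

```python
def is_even(x):
    return x % 2 == 0

numbers = [1, 2, 3, 4, 5, 6]                    # illustrative values
even_numbers = list(filter(is_even, numbers))   # keep items where is_even() is True
print(even_numbers)                             # [2, 4, 6]
```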
reduce()
Purpose: reduce() applies a function cumulatively to the items of an iterable, reducing
the iterable to a single value.
How it works: It takes two arguments:
2. An iterable.
The function is applied cumulatively to the items, so the first two elements are combined,
then the result is combined with the next element, and so on.
Note: reduce() is part of the functools module in Python 3, so you need to import it
first.
Example:
reduce() applies the multiply() function cumulatively to the elements of the list. It
first multiplies 1 * 2, then multiplies the result by 3, and finally by 4, yielding 24.
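The computation described here can be sketched as follows. The variable names numbers and product are assumptions; multiply() and the values 1 to 4 follow the description.

```python
from functools import reduce

def multiply(x, y):
    return x * y

numbers = [1, 2, 3, 4]
product = reduce(multiply, numbers)   # ((1 * 2) * 3) * 4
print(product)                        # 24
```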
25. Python programming questions
FizzBuzz
Question: Write a Python function that prints the numbers from 1 to 100. But for
multiples of 3, print "Fizz" instead of the number, and for multiples of 5, print "Buzz". For
numbers which are multiples of both 3 and 5, print "FizzBuzz".
def fizz_buzz():
    for i in range(1, 101):
        if i % 3 == 0 and i % 5 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)
Palindrome Check
Question: Write a Python function to check if a given string is a palindrome (a string that
reads the same forwards and backwards).
def is_palindrome(s):
    # Compare the string with its reverse.
    return s == s[::-1]
Question: Write a Python function that prints all prime numbers up to a given number n.
def is_prime(n):
    if n < 2:
        return False
    # A composite number has a divisor no larger than its square root.
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

def print_prime_numbers(n):
    for i in range(2, n + 1):
        if is_prime(i):
            print(i)
Reverse a String
Question: Write a Python function to reverse a given string without using any built-in
functions.
def reverse_string(s):
    reversed_s = ""
    for char in s:
        # Prepend each character so the order ends up reversed.
        reversed_s = char + reversed_s
    return reversed_s
Sum of Digits
Question: Write a Python function that takes an integer and returns the sum of its digits.
def sum_of_digits(n):
    n = abs(n)  # handle negative integers by using the absolute value
    total = 0
    while n > 0:
        total += n % 10
        n = n // 10
    return total
Question: Write a Python function to find the largest element in a list without using any
built-in functions like max().
def find_largest_element(lst):
    # Assumes lst is non-empty.
    largest = lst[0]
    for num in lst:
        if num > largest:
            largest = num
    return largest
Factorial of a Number
Question: Write a Python function to compute the factorial of a given number using
recursion.
def factorial(n):
    # Assumes n is a non-negative integer.
    if n == 0 or n == 1:
        return 1
    else:
        return n * factorial(n - 1)
Question: Write a Python function to count the number of vowels (a, e, i, o, u) in a given
string.
def count_vowels(s):
    vowels = "aeiouAEIOU"
    count = 0
    for char in s:
        if char in vowels:
            count += 1
    return count
Question: Given a list of numbers from 1 to n with one number missing, write a Python
function to find the missing number.
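One possible approach, sketched here as an illustration rather than a definitive answer, compares the expected sum of 1 to n with the actual sum. The function name find_missing_number and the choice to pass n explicitly are assumptions.

```python
def find_missing_number(nums, n):
    """Return the number missing from nums, which holds 1..n minus one value."""
    expected_total = n * (n + 1) // 2   # sum of 1..n
    return expected_total - sum(nums)

print(find_missing_number([1, 2, 4, 5], 5))  # 3
```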
Question: Write a Python function that generates the first n numbers in the Fibonacci
sequence.
def fibonacci(n):
    fib_sequence = [0, 1]
    for i in range(2, n):
        fib_sequence.append(fib_sequence[-1] + fib_sequence[-2])
    # Slicing also covers n = 0 and n = 1.
    return fib_sequence[:n]
def count_occurrences(lst):
    counts = {}
    for item in lst:
        counts[item] = counts.get(item, 0) + 1
    return counts
Question: Write a Python function to remove duplicates from a list while maintaining the
original order.
def remove_duplicates(lst):
    unique_list = []
    for item in lst:
        if item not in unique_list:
            unique_list.append(item)
    return unique_list
Question: Write a Python function to find the intersection (common elements) of two
lists.
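A minimal sketch of one way to do this, assuming the elements are hashable and that the result should keep the order of the first list without repeats. The name find_intersection is hypothetical.

```python
def find_intersection(list1, list2):
    lookup = set(list2)   # fast membership tests
    seen = set()          # items already added to the result
    result = []
    for item in list1:
        if item in lookup and item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(find_intersection([1, 2, 2, 3, 4], [2, 4, 6]))  # [2, 4]
```

If order and duplicates do not matter, list(set(list1) & set(list2)) is a shorter alternative.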
Question: Write a Python function to convert two lists into a dictionary where one list
contains the keys and the other contains the values.
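A minimal sketch using zip(); the function name lists_to_dict is an assumption. Note that zip() stops at the shorter list, so extra keys or values are silently ignored.

```python
def lists_to_dict(keys, values):
    # Pair each key with the value at the same position.
    return dict(zip(keys, values))

print(lists_to_dict(["a", "b", "c"], [1, 2, 3]))  # {'a': 1, 'b': 2, 'c': 3}
```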
Question: Write a Python function that counts the frequency of each element in a list and
returns the result as a dictionary.
def count_frequency(lst):
    frequency_dict = {}
    for item in lst:
        frequency_dict[item] = frequency_dict.get(item, 0) + 1
    return frequency_dict
Question: Write a Python function to sort a list of tuples based on the second element in
each tuple.
def sort_by_second_element(tuples_list):
    return sorted(tuples_list, key=lambda x: x[1])
Question: Write a Python function to find the tuple with the maximum and minimum
values based on the first element of each tuple.
def min_max_tuple(tuples_list):
    min_tuple = min(tuples_list, key=lambda x: x[0])
    max_tuple = max(tuples_list, key=lambda x: x[0])
    return min_tuple, max_tuple
Question: Write a Python function to merge two dictionaries. If a key appears in both
dictionaries, the value from the second dictionary should overwrite the value from the
first.
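One way to sketch this, without modifying either input, is to copy the first dictionary and then update the copy. The function name merge_dicts is hypothetical.

```python
def merge_dicts(first, second):
    merged = dict(first)    # shallow copy so the inputs stay untouched
    merged.update(second)   # values from second win on shared keys
    return merged

print(merge_dicts({"a": 1, "b": 2}, {"b": 3, "c": 4}))  # {'a': 1, 'b': 3, 'c': 4}
```

On Python 3.9 and later, the expression first | second gives the same result.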
Question: Write a Python function to flatten a list of lists into a single list.
def flatten_list(lst):
    return [item for sublist in lst for item in sublist]
Question: Write a Python function to find the keys with the maximum and minimum
values in a dictionary.
def find_max_min_keys(d):
    max_key = max(d, key=d.get)
    min_key = min(d, key=d.get)
    return max_key, min_key
Unpack a List of Tuples into Two Separate Lists
Question: Write a Python function to unpack a list of tuples into two separate lists: one
containing all the first elements, and the other containing all the second elements.
def unpack_tuples(tuples_list):
    first_elements = [x[0] for x in tuples_list]
    second_elements = [x[1] for x in tuples_list]
    return first_elements, second_elements
Question: Write a Python function that takes a list of strings and creates a dictionary
where each key is the string and the value is the length of the string.
def list_to_dict(lst):
    return {item: len(item) for item in lst}
Question: Write a Python function that takes a list and returns a dictionary where the
keys are the indices and the values are the elements of the list.
def list_to_dict_with_index(lst):
    return {i: item for i, item in enumerate(lst)}
Question: Write a Python function that converts a list of strings into a dictionary where
the string is the key and the length of the string is the value, using a dictionary
comprehension.
Question: Write a Python function that takes two lists (one containing keys and the other
containing values) and merges them into a dictionary using a dictionary comprehension.
Question: Write a Python function that takes two dictionaries and creates a new
dictionary by using the keys from the first dictionary and the values from the second
dictionary. If a key doesn't exist in the second dictionary, set its value to None.
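A minimal sketch relying on dict.get(), which returns None when a key is absent. The function name combine_dicts is an assumption.

```python
def combine_dicts(first, second):
    # Keys come from first; values are looked up in second (None if missing).
    return {key: second.get(key) for key in first}

print(combine_dicts({"a": 1, "b": 2}, {"a": 10, "c": 30}))  # {'a': 10, 'b': None}
```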
Question: Write a Python function that swaps the keys and values in a dictionary.
Assume that all values are unique.
def swap_keys_values(d):
    return {value: key for key, value in d.items()}
Question: Write a Python function that converts a list of tuples into a dictionary using
dictionary comprehension. Each tuple should contain two elements: a key and a value.
def tuples_to_dict(tuples_list):
    return {key: value for key, value in tuples_list}
Question: Write a Python function that converts a nested list into a nested dictionary
where the first element of each sublist is the key and the remaining elements form a
sublist as the value.
def nested_list_to_dict(nested_list):
    return {sublist[0]: sublist[1:] for sublist in nested_list}
Question: Write a Python function that filters a dictionary by retaining only those
key-value pairs where the value is an even number.
def filter_even_values(d):
    return {key: value for key, value in d.items() if value % 2 == 0}
Question: Write a Python function that takes a list of dictionaries and creates a new
dictionary where the keys are unique values of a specific key in the dictionaries and the
values are lists of dictionaries that have the same key value.
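One way to sketch this grouping uses dict.setdefault(). The function name group_by_key and the sample records are hypothetical, and the sketch assumes every dictionary contains the grouping key.

```python
def group_by_key(dicts, key):
    grouped = {}
    for d in dicts:
        # Create the list on first sight of a value, then append to it.
        grouped.setdefault(d[key], []).append(d)
    return grouped

# Hypothetical records for illustration.
people = [
    {"name": "Ana", "city": "Lisbon"},
    {"name": "Ben", "city": "Oslo"},
    {"name": "Cara", "city": "Lisbon"},
]
by_city = group_by_key(people, "city")
print(by_city)
```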