Module 1 Computing Exercises - Intro To Ai
Directions
• Use an MS Word file for the following:
Cover page (course, course title, section, computing exercise #, group #, and names of members)
Discussion of your Python implementation (include screenshots with figure numbers and descriptions).
Link (short/tiny URL) to your .ipynb folder. The link may point to Google Drive if you are using Google Colab, or to OneDrive/Google Drive if
you are using Jupyter Notebook.
• Convert the MS Word file to PDF and sign above your names.
• Upload the signed PDF file to your OneDrive or Google Drive. Provide the tiny/short URL of the uploaded file in your Blackboard
assignment submission.
A. Discussion
Bayes Theorem
Scenario: Consider a human population that may or may not have cancer (Cancer is True or False) and a medical test that returns positive or
negative for detecting cancer (Test is Positive or Negative), e.g. a mammogram for detecting breast cancer.
Manual Calculation:
Medical diagnostic tests are not perfect; they have error. Sometimes a patient will have cancer, but the test will not detect it. This capability of the
test to detect cancer is referred to as the sensitivity, or the true positive rate.
In this case, we will contrive a sensitivity value for the test. The test is good, but not great, with a true positive rate or sensitivity of 85%. That is, of all
the people who have cancer and are tested, 85% of them will get a positive result from the test.
Given this information, our intuition would suggest that there is an 85% probability that the patient has cancer.
This type of error in interpreting probabilities is so common that it has its own name; it is referred to as the base rate fallacy.
It has this name because the error in estimating the probability of an event is caused by ignoring the base rate. That is, it ignores the probability of a
randomly selected person having cancer, regardless of the results of a diagnostic test.
In this case, we can assume the probability of breast cancer is low and use a contrived base rate of one person in 5,000, i.e. 0.0002 or 0.02%:
P(Cancer=True) = 0.0002 (0.02%)
We can correctly calculate the probability of a patient having cancer given a positive test result using Bayes Theorem.
P(A|B) = P(B|A) * P(A) / P(B)
wherein:
• P(A|B) = P(Cancer=True | Test=Positive) is the posterior
• P(B|A) = P(Test=Positive | Cancer=True) is the likelihood
• P(A) = P(Cancer=True) is the prior
• P(B) = P(Test=Positive) is the normalizer
CPE126 (Intro to Artificial Intelligence) Page 2 of 5
Module 1 Computing Exercises
We know the probability of the test being positive given that the patient has cancer is 85%, and we know the base rate or the prior probability of a
given patient having cancer is 0.02%. We can plug these values in:
P(Cancer=True | Test=Positive) = (0.85 * 0.0002) / P(Test=Positive)
We don't know P(Test=Positive); it is not given directly. Instead, we can estimate it using:
P(Test = Positive) = P(Test = Positive|Cancer = True) ∗ P(Cancer = True) + P(Test = Positive|Cancer = False) ∗ P(Cancer = False)
First, we can calculate P(Cancer=False) as the complement of P(Cancer=True), which we already know:
P(Cancer=False) = 1 – P(Cancer=True)
= 1 – 0.0002
= 0.9998
We still do not know the probability of a positive test result given no cancer.
Specifically, we need to know how good the test is at correctly identifying people that do not have cancer. That is, testing negative result
(Test=Negative) when the patient does not have cancer (Cancer=False), called the true negative rate or the specificity.
In this case, we will contrive a specificity of 95%: of all the people who do not have cancer and are tested, 95% of them will get a negative result.
With this final piece of information, we can calculate the false positive or false alarm rate as the complement of the true negative rate:
P(Test=Positive | Cancer=False) = 1 – 0.95 = 0.05
We can plug this false alarm rate into our calculation of P(Test=Positive) as follows:
P(Test=Positive) = 0.85 * 0.0002 + 0.05 * 0.9998
= 0.00017 + 0.04999
= 0.05016
Excellent, so the probability of the test returning a positive result, regardless of whether the person has cancer or not, is about 5%.
We now have enough information to calculate Bayes Theorem and estimate the probability of a randomly selected person having cancer if they get
a positive test result.
P(Cancer=True | Test=Positive) = P(Test=Positive | Cancer=True) * P(Cancer=True) / P(Test=Positive)
= (0.85 * 0.0002) / 0.05016
= 0.00017 / 0.05016
≈ 0.003389, i.e. about 0.33%
The calculation suggests that if a patient is told by this test that they have cancer, there is still only about a 0.33% chance that they actually have cancer.
The example also shows that calculating a conditional probability requires all the terms of Bayes Theorem, not just the likelihood.
If we already have all the values used in Bayes Theorem, we can use them directly. This is rarely the case; we typically have to calculate the
pieces we need and plug them in, as we did here. In our scenario we were given three pieces of information: the base rate, the sensitivity (or
true positive rate), and the specificity (or true negative rate).
Sensitivity: 85% of people with cancer will get a positive test result.
Base Rate: 0.02% of people have cancer.
Specificity: 95% of people without cancer will get a negative test result.
We did not have P(Test=Positive), but we calculated it from what we already knew.
We might imagine that Bayes Theorem allows us to be even more precise about a given scenario. For example, if we had more information about the
patient (e.g. their age) and about the domain (e.g. cancer rates for age ranges), we could offer an even more accurate probability estimate.
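As a sketch, the manual calculation above can be implemented in a few lines of Python (the function and variable names here are our own choices, not part of the assignment template):

```python
def bayes_theorem(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    P(B) is estimated via the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    """
    p_not_a = 1.0 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return (p_b_given_a * p_a) / p_b

p_cancer = 0.0002                 # base rate: P(Cancer=True)
sensitivity = 0.85                # P(Test=Positive | Cancer=True)
specificity = 0.95                # P(Test=Negative | Cancer=False)
false_alarm = 1.0 - specificity   # P(Test=Positive | Cancer=False)

posterior = bayes_theorem(p_cancer, sensitivity, false_alarm)
print(f"P(Cancer=True | Test=Positive) = {posterior * 100:.3f}%")
```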
Running the example calculates the probability that a patient has cancer given the test returns a positive result, matching our manual calculation.
Gradient Descent
Introduction
Gradient descent is an optimization algorithm used to find a local minimum of a differentiable function. It can be
used in linear regression as well as in neural networks.
In machine learning, it is used to find the parameter values of a differentiable function such that the loss is minimized.
Let us understand the gradient descent algorithm with a simple practical example. Imagine a blind hiker trying to get down a hill to its lowest point.
There is no way for the hiker to see which direction to go. However, there is one thing the hiker understands clearly: if he is going down, he is
making the right progress, and if he is going up, the wrong progress.
Therefore, if he keeps taking small steps that take him downwards, he will be able to get down to the lowest point on the hill.
Here, the size of each small step corresponds to the learning rate, and the height above the lowest point corresponds to the loss.
Reaching the lowest point of the hill corresponds to convergence, which indicates no further possibility of going down; the loss is at its
minimum.
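The hiker analogy can be sketched in a few lines of Python. The following is a minimal illustration (not a solution to the exercise below): gradient descent on the contrived differentiable function f(x) = (x − 2)², whose gradient is f′(x) = 2(x − 2) and whose minimum is at x = 2.

```python
def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Take repeated small steps downhill along the negative gradient."""
    x = x0
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)  # a small step in the downhill direction
    return x

# f(x) = (x - 2)**2 has gradient 2 * (x - 2); start the "hiker" at x = 10.
x_min = gradient_descent(lambda x: 2 * (x - 2), x0=10.0)
print(f"Converged near x = {x_min:.4f}")  # approaches 2.0
```

A smaller learning rate would need more steps to converge; too large a learning rate can overshoot the minimum and diverge.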
2. [Optimization]: Find the minimum value of the function f(x) = 7 + |x − 2| for x between 1 and 4,
inclusive.
Required: Make a Python program, using either Google Colab or Jupyter Notebook, that will output the
global minimum and any local minima.
C. Sources/ References
• https://machinelearningmastery.com/bayes-theorem-for-machine-learning/
• http://scipy-lectures.org/intro/scipy/auto_examples/plot_optimize_example2.html
• http://gtribello.github.io/mathNET/bayes-theorem-problems.html
• https://www.whitman.edu/mathematics/calculus_online/section06.01.html#exercises