Tung Wah College GEN3005 / GED3005 Big Data and Data Sciences

This document provides an overview and instructions for setting up Python and completing basic programming tasks. It introduces how to set up the PyCharm IDE for Python development. It then presents two programming tasks - a simple calculator program that performs basic math operations on user-input values, and a statistics program that calculates mean, population standard deviation, and sample standard deviation of values in a list using the NumPy package.

Uploaded by

Valiant Cheung
Copyright
© © All Rights Reserved

Tung Wah College

GEN3005 / GED3005 Big Data and Data Sciences


Tutorial 1: Basic Python Programming
Objective
After completing this tutorial, you should be able to:
(i) set up the programming environment for writing Python programs;
(ii) write Python programs that use functions for type conversion;
(iii) write Python programs that perform basic mathematical operations; and
(iv) import Python packages and use simple statistical functions in the NumPy package.

Overview
In this tutorial, you will learn how to set up an Integrated Development Environment (IDE)
for Python programming and write simple Python programs. The IDE used here is PyCharm.
Python is a programming language that can be used to build computer programs for data
processing, data analysis, and data visualization, which are the main steps in data science.

Setup of the IDE


1. PyCharm (Community Edition) can be downloaded from the PyCharm website:
https://www.jetbrains.com/pycharm/download/other.html. In addition, a Python
interpreter is needed to run Python programs. It can be downloaded from
https://www.python.org/downloads/. Please download version 3.11.7.

2. To create a project, you have to specify:

(i) the location of the project files
(ii) the base interpreter to be used

3. You can “untick” the option “Create a main.py welcome script”.

4. In my example, the location of the project files is

“…\GED3005 Big Data and Data Sciences\Programs”

The base interpreter is set as

“C:\Users\...\AppData\Local\Programs\Python\Python311”

5. After creating a new project, you can right-click the project name to create a directory
“Week 1”, and then a new Python file “Hello World.py” inside that directory.

6. In the “Hello World.py” file, you can type the following in line 1. This will print the
message “Hello World!” when you run the program.
print("Hello World!")

7. To run the program, right-click the file “Hello World.py” at the top and choose Run
“Hello World”. You will see the message printed out in the output screen at the bottom.

Programming Task (1) – A simple calculator


8. In this task, you are going to write a Python program that takes in two integers x and y and
computes the following:
(i) 𝑥 + 𝑦
(ii) 𝑥 − 𝑦
(iii) 𝑥 × 𝑦
(iv) 𝑥 / 𝑦
(v) 𝑥^𝑦 (𝑥 raised to the power 𝑦)

9. Create a new Python file named “MyCalculator.py” in the “Week 1” directory.

10. To capture the user’s input, you can use the input() function.

x = input("Please enter the x value:\n")

Here \n is the newline character.

In the above statement, the message “Please enter the x value:” is printed, followed by a
new line. The user can then type a value, which is captured by the program and stored in
the variable x.

11. The value of x is of the “string” type, that is, a piece of text. To do calculations,
however, the value of x has to be of the “integer” type. The type conversion can be done
with the following line.

x = int(x)  # the x passed to int() is still a string

The x on the right of the “=” operator is of the “string” type. The function int()
converts the value from the “string” type to the “integer” type. The integer value is then
stored in the variable x again.
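
To see why the conversion matters, the following minimal sketch compares the + operator on strings and on integers. The literals "5" and "3" stand in for what input() would return, and the variable names are illustrative only; they are not part of MyCalculator.py.

```python
# Values as input() would return them: always strings.
x_text = "5"
y_text = "3"

# With strings, + joins the two pieces of text instead of adding numbers.
joined = x_text + y_text

# After conversion with int(), + performs arithmetic.
total = int(x_text) + int(y_text)

print(type(x_text).__name__, joined)  # str 53
print(type(total).__name__, total)    # int 8
```

Note that int() raises a ValueError when the text is not a whole number, for example "2.5" or "abc".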

12. Similarly, you can write the following two lines of code to capture the y value from
the user.

y = input("Please enter the y value:\n")
y = int(y)

Note that print(x) is written without quotation marks, because we want to print the value
stored in x rather than the letter x itself.

13. To perform the desired operations, we can do the addition, subtraction, multiplication,
division, and exponentiation, as below.

addition_result = x + y
subtraction_result = x - y
multiplication_result = x * y
division_result = x / y
powering_result = x ** y

print("x + y = ", addition_result)
print("x - y = ", subtraction_result)
print("x * y = ", multiplication_result)
print("x/y = ", division_result)
print("x^y = ", powering_result)
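
The five operations can also be gathered into one function, which makes them easy to try without typing input each time. This is only a sketch: the function name calculate, the dictionary layout, and the choice to return None when y is 0 are assumptions made for illustration, not part of the tutorial's program.

```python
def calculate(x, y):
    """Return the five results for integers x and y as a dictionary."""
    results = {
        "x + y": x + y,
        "x - y": x - y,
        "x * y": x * y,
        "x^y": x ** y,
    }
    # x / y raises ZeroDivisionError when y is 0, so guard the division.
    results["x/y"] = x / y if y != 0 else None
    return results


# Example run with x = 7 and y = 2.
for label, value in calculate(7, 2).items():
    print(label, "=", value)
```

With x = 7 and y = 2 the results are 9, 5, 14, 49 and 3.5. The / operator always returns a floating-point number in Python 3, even when the division is exact (6 / 2 gives 3.0).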

Sample Outputs: (screenshot not included)

Programming Task (2) – Simple Statistics

14. Write a Python program that computes the mean, the population standard deviation,
and the sample standard deviation of the values stored in a list. To start with,
create a Python file named “StatCalculator.py” in the “Week 1” directory.

15. Import the NumPy package, which contains the functions required for computing these
statistical measures.

import numpy as np

The package is not installed in the project initially. To install it, you have to
download it using the Python package installer (for example, through the “Python Packages”
tool window in PyCharm).
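
If you are unsure whether the installation worked, a quick check such as the following can help. It is a minimal sketch using the standard-library module importlib; the helper name is_installed is hypothetical and chosen only for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named package."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("numpy"):
    print("numpy is available")
else:
    print("numpy is missing; install it before running StatCalculator.py")
```

Running the check first avoids a ModuleNotFoundError at the line "import numpy as np".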

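16. You can use the following line of code to create a list of integers. Note that the
variable name is lst (beginning with the lowercase letter “l”), not 1st (beginning with
the digit one).

lst = [1, 2, 3, 4, 5]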

17. The mean, the population standard deviation, and the sample standard deviation have the
following formulas.

The mean is

$$\mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum X}{N}$$

The population standard deviation is

$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

The sample standard deviation is

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
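
As a worked check of the formulas above, the following sketch evaluates them step by step in plain Python for the list [1, 2, 3, 4, 5]. The variable names are illustrative, and the standard-library statistics module is used here only as a cross-check; it is not part of the tutorial's program.

```python
import math
import statistics

values = [1, 2, 3, 4, 5]
n = len(values)

# Mean: the sum of the values divided by how many there are.
mean = sum(values) / n

# Sum of squared deviations from the mean, shared by both formulas.
squared_deviations = sum((x - mean) ** 2 for x in values)

# Population standard deviation divides by N.
population_sd = math.sqrt(squared_deviations / n)

# Sample standard deviation divides by n - 1.
sample_sd = math.sqrt(squared_deviations / (n - 1))

print(mean, round(population_sd, 4), round(sample_sd, 4))

# Cross-check against the standard library.
print(statistics.pstdev(values), statistics.stdev(values))
```

For this list the mean is 3.0, the sum of squared deviations is 10.0, and the two standard deviations are the square roots of 10/5 = 2 and 10/4 = 2.5 respectively.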

With NumPy, you can use the following lines of code to compute these three values.

mean = np.mean(lst)
p_std = np.std(lst)            # population standard deviation (divides by N)
s_std = np.std(lst, ddof=1)    # sample standard deviation (ddof=1 divides by n - 1)

This demonstrates the use of “functions” to get the desired results. The function has
inputs (known as “parameters”) and will return an output.
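
You can also define a function of your own in the same way. The sketch below wraps the three NumPy calls in one function; the name summarise and the parameter decimals are hypothetical and chosen only for this illustration.

```python
import numpy as np


def summarise(values, decimals=2):
    """Return (mean, population SD, sample SD), each rounded to `decimals` places."""
    mean = np.mean(values)
    p_std = np.std(values)            # ddof defaults to 0: divide by N
    s_std = np.std(values, ddof=1)    # ddof=1: divide by n - 1
    return (round(float(mean), decimals),
            round(float(p_std), decimals),
            round(float(s_std), decimals))


print(summarise([1, 2, 3, 4, 5]))  # (3.0, 1.41, 1.58)
```

Here values and decimals are the parameters, and the tuple after the return keyword is the output handed back to the caller.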

18. The results are printed using the following lines of code. The function round() is used
to perform the rounding. The first parameter is the value to be rounded, and the second
parameter is an integer that specifies the number of decimal places desired. Here the
values are rounded to 2 decimal places; a second parameter of 0 would round to a whole
number, and -1 would round to the nearest ten.

print("The mean is", round(mean, 2))
print("The population standard deviation is", round(p_std, 2))
print("The sample standard deviation is", round(s_std, 2))
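
The effect of the second parameter of round() can be seen in the short sketch below. The number 1234.5678 is an arbitrary example value, not taken from the tutorial.

```python
value = 1234.5678

print(round(value, 2))    # 1234.57  (two decimal places)
print(round(value, 0))    # 1235.0   (whole number, still a float)
print(round(value, -1))   # 1230.0   (nearest ten)
print(round(value))       # 1235     (no second parameter: returns an int)

# Exact halves go to the nearest even number, so round() does not always round up.
print(round(2.5), round(3.5))  # 2 4
```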

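Sample Outputs (for lst = [1, 2, 3, 4, 5]):

The mean is 3.0
The population standard deviation is 1.41
The sample standard deviation is 1.58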
