
DS231 Module 6

The document provides an introduction to Data Science Programming, focusing on open data resources and Python programming. It outlines ten resources for accessing open data, including the Saudi Open Data Portal, and discusses the popularity and tools for Python programming. The document also emphasizes the importance of open data for government transparency and citizen engagement.

Uploaded by

azooom64
Saudi Electronic University

26/12/2021
College of Computing and Informatics
Introduction to Data Science Programming

Module 6:
• Part 1: Ten Phenomenal Resources for Open Data
• Part 2: Starting with Python

Contents
Part 1: Ten Phenomenal Resources for Open Data
1. Exploring Data Worldwide
2. Discovering Saudi Open Data Portal
Part 2: Starting with Python
3. Why Python Is Hot
4. Choosing the Right Python
5. Tools for Success
6. Writing Python in VS Code
7. Using Jupyter Notebook for Coding

Weekly Learning Outcomes
1. Learning about several resources for open data
2. Discovering the Saudi Open Data Portal (https://data.gov.sa/en)
3. Discovering Python
4. Writing Python code

Required Reading
1. Chapter 19. Ten Phenomenal Resources for Open Data (Lillian
Pierson, Data Science, 3rd Edition, 2021)
2. Book 1, Chapter 1. Starting with Python (John C. Shovic and
Alan Simpson, Python All-in-One, 2nd Edition, 2021)

Videos
• Best Data Sources for your Next Analytic Projects | FREE
Open-Source Datasets
https://youtu.be/z3DsLSmkqOU
• How can statistics benefit you in your field?
https://youtu.be/8h7hWls5Nh4

Part 1: Ten Phenomenal Resources for Open Data
Introduction

• Think of open data as data that has been made publicly
available and is permitted to be used, reused, built on, and
shared with others.
• Maybe you’ve heard of open-source software, open
hardware, open-content creative work, open access to
scientific journals, and open science. Along with open data,
they’re all part of the aptly named open movement.
• A distinguishing feature of many open licenses is that they
apply copyleft terms rather than conventional, all-rights-reserved
copyright.
• Be aware that work labeled as open may not always fit the
accepted definition. You’re responsible for checking the
licensing rights and restrictions of the open data you use.
• As part of more recent open government initiatives,
governments around the world began releasing open
government data.
• This data can be used by volunteer analysts and programmers
who work collaboratively to build open-source solutions that
use open data to solve social problems.
• The open government movement promotes government
transparency and accountability.
1. Exploring Data Worldwide
Digging Through data.gov
• The Data.gov program (at www.data.gov) provides open access
to nonclassified US government data.
• By mid-2014, you could search for over 100,000 datasets by
using the Data.gov search. The website is an unparalleled
resource for the following indicators: Economic, Environmental,
STEM industry, Quality of life and Legal.
• You can also find over 60 open-source application programming
interfaces (APIs) available on the platform. You can use these
APIs to create tools and apps that pull data from government
departments listed in the Data.gov data catalog.
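
As a rough illustration of pulling data through such an API, the sketch below searches a catalog for datasets by keyword. It assumes the Data.gov catalog exposes the standard CKAN `package_search` action at the URL shown; that endpoint, the helper names, and the sample response are assumptions for illustration, not taken from the slides, so check the Data.gov developer pages before relying on them. Only the offline sample is executed here; `search_catalog` needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: a CKAN-style Action API on the Data.gov catalog.
CATALOG_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(query, rows=5):
    """Return a package_search URL for the given search terms."""
    return CATALOG_SEARCH_URL + "?" + urlencode({"q": query, "rows": rows})


def dataset_titles(payload):
    """Extract dataset titles from a decoded CKAN-style search response."""
    if not payload.get("success"):
        return []
    return [item["title"] for item in payload["result"]["results"]]


def search_catalog(query, rows=5):
    """Query the live catalog (requires network access) and return titles."""
    with urlopen(build_search_url(query, rows), timeout=30) as response:
        return dataset_titles(json.load(response))


# Offline demonstration with a made-up response in the CKAN layout.
sample = {
    "success": True,
    "result": {
        "count": 2,
        "results": [{"title": "Air Quality"}, {"title": "Fuel Prices"}],
    },
}
print(build_search_url("air quality", rows=2))
print(dataset_titles(sample))
```

Separating URL building and response parsing from the network call keeps the parsing logic easy to try out without an Internet connection.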
Checking Out Canada Open Data
• If you look at the Canada Open Data website
(http://open.canada.ca), you can find over 200,000 datasets. Among
the 25 most popular offerings on the site are datasets that cover
the following indicators:
  - Environmental: natural disasters and fuel consumption ratings,
    for example.
  - Citizenship: permanent resident applications, permanent resident
    counts, foreign student entry counts, and other items.
  - Quality of life: cost of living trends, automobile collision
    statistics, and disease surveillance, for example.
Diving into data.gov.uk
• The United Kingdom got off to a late start in the open
government movement. Its portal (http://data.gov.uk) was launched
in 2010, and by mid-2014 only about 20,000 datasets were
available.
• The website is a useful source of data on the following
indicators: environmental, government spending, societal,
health, education, and business and economic.
Checking Out US Census Bureau Data
• The demographic data provided by the US Census Bureau (at
www.census.gov) can be extremely helpful if you’re doing
marketing or advertising research and need to target your
audience according to the following classifications: age,
average annual income, household size, gender or race, and
level of education.
Accessing NASA Data
• Since its inception in 1958, NASA has made public all its
nonclassified project data. NASA datasets have been growing
even faster with recent improvements in satellite and
communication technology.
• In fact, NASA now generates 4 terabytes of new earth-science
data per day, which is equivalent to over a million MP3 files.
• NASA’s open data portal is at http://data.nasa.gov. This
portal is a source of all kinds of wonderful data, including data
about astronomy and space, climate, life sciences, geology,
and engineering.
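
The comparison between 4 terabytes and a million MP3 files can be checked with quick arithmetic. The sketch below assumes decimal units and an MP3 size of 4 MB; both figures are assumptions chosen for illustration, not values given by NASA or the slides. With slightly smaller files the count rises above one million.

```python
# Assumptions for illustration: decimal units, 4 MB per MP3 file.
TERABYTE = 10**12  # bytes
MEGABYTE = 10**6   # bytes

daily_volume = 4 * TERABYTE  # new earth-science data per day
mp3_size = 4 * MEGABYTE      # assumed size of one MP3 file

mp3_equivalent = daily_volume // mp3_size
print(f"{mp3_equivalent:,} MP3 files per day")
```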
Wrangling World Bank Data
• The World Bank is an international financial institution that
provides loans to developing countries.
• Its data is available to the public on the World Bank Open Data
page (http://data.worldbank.org).
• You can use the website to download entire datasets or simply
view the data visualizations online. You can also use the World
Bank’s Open Data API to access what you need.
• World Bank Open Data supplies data on the following indicators
(and many, many more):
  - Agriculture and rural development
  - Economy and growth
  - Environment
  - Science and technology
  - Financial sector
  - Poverty income
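
To give a feel for the World Bank Open Data API mentioned above, here is a minimal sketch that builds an indicator request and reads the yearly values out of a response. It assumes the version 2 API root shown below and its two-part JSON layout (paging metadata followed by a list of records). The helper names and the sample numbers are made up for illustration; `SAU` and `SP.POP.TOTL` (total population) are used only as example codes.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed API root for the World Bank indicators API (version 2).
API_ROOT = "https://api.worldbank.org/v2"


def build_indicator_url(country, indicator, per_page=100):
    """Return the request URL for one indicator of one country."""
    query = urlencode({"format": "json", "per_page": per_page})
    return f"{API_ROOT}/country/{country}/indicator/{indicator}?{query}"


def yearly_values(payload):
    """Map each date to its value, skipping records with no value."""
    if not isinstance(payload, list) or len(payload) < 2 or not payload[1]:
        return {}
    return {
        row["date"]: row["value"]
        for row in payload[1]
        if row["value"] is not None
    }


def fetch_indicator(country, indicator):
    """Download and parse an indicator (requires network access)."""
    with urlopen(build_indicator_url(country, indicator), timeout=30) as resp:
        return yearly_values(json.load(resp))


# Offline demonstration with made-up numbers in the assumed layout.
sample = [
    {"page": 1, "pages": 1, "per_page": 100, "total": 3},
    [
        {"date": "2021", "value": 102.5},
        {"date": "2020", "value": None},
        {"date": "2019", "value": 98.0},
    ],
]
print(build_indicator_url("SAU", "SP.POP.TOTL"))
print(yearly_values(sample))
```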
Getting to Know Knoema Data
• The Knoema platform houses a staggering 500+ databases, in
addition to its 150 million time series (that is, 150 million
collections of data on attribute values over time). Knoema
includes, but isn’t limited to, all these data sources:
  - Government data from industrial nations
  - National public data from developing nations
  - United Nations data
  - International organization data
  - Corporate data from global corporations
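
A time series, as described above, can be pictured as a mapping from points in time to the values of one attribute. The sketch below uses hypothetical numbers (not Knoema data) and an illustrative helper, `year_over_year_change`, to show one simple thing you can do with such a structure.

```python
# Hypothetical values for one attribute, keyed by year.
series = {2018: 10.0, 2019: 12.0, 2020: 9.0, 2021: 15.0}


def year_over_year_change(series):
    """Return the change in value between each pair of consecutive years."""
    years = sorted(series)
    return {
        later: series[later] - series[earlier]
        for earlier, later in zip(years, years[1:])
    }


print(year_over_year_change(series))
```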
Queuing Up with Quandl Data
• Quandl (www.quandl.com) is a Toronto-based website that
aims to be a search engine for numeric data.
• The Quandl database includes links to over 10 million datasets.
• Quandl links to 2.1 million UN datasets and many other sources,
including datasets in the Open Financial Data Project,
central banks, real estate organizations, and well-known think
tanks.
Exploring Exversion Data
• Exversion aims to provide the same collaborative functionality
around data that GitHub provides around code. The Exversion
platform offers version control functionality and hosting services
for uploading and sharing your data.
• All the data you upload to Exversion is public.
• Exversion is extremely useful in the data-cleanup stage.
Mapping OpenStreetMap Spatial Data
• OpenStreetMap (OSM) is an open, crowd-sourced alternative to
commercial mapping products such as Google Maps and ESRI
ArcGIS Online.
• To illustrate how a person can create data in OSM, imagine that
someone links the GPS system on their mobile phone to the
OSM application. Because of this authorization, OSM can
automatically trace the routes of roads while the person travels.
Later, this person (or another OSM user) can go to the OSM
online platform to verify and label the routes.
2. Discovering Saudi Open Data Portal

• The Open Data portal of Saudi Arabia is an important
initiative for the country, as it aims to implement a public
data hub and strategy to enable transparency, promote
e-participation, and inspire innovation.
• The primary role of the portal is to publish datasets from
ministries and government agencies in an open format and
make this data available to the public.

https://data.gov.sa/en/aboutodp

• The platform gives the public a central point of
access to find, download, and use datasets generated by the
ministries and governmental entities in the country.
Government open data helps bridge the gap between
governments and citizens.
• The public benefits from the data in different ways:
it gains a better understanding of how government agencies
work, it can evaluate the performance of various administrative
institutions, and citizens get the chance to make informed
decisions about government policies.

• The portal also allows the data to be used for research, reports,
feedback, and the development of web and smartphone applications
and solutions based on government open data.
• With this Open Data portal, the Saudi government seeks to
expand its e-Government services portfolio by extending its
growing efforts to individuals and the private sector, enhancing
transparency and allowing people to show their creativity.


• 6,544 datasets (August 2022)
• Total publishers: 147
• Use cases are available
• You may request a dataset
• Open data license: The open data license gives users the freedom
to distribute the datasets, produce works from them, and
transform and build upon them. The source must be cited in the
manner prescribed in the license.
https://data.gov.sa/sites/default/files/odp/Open%20Data%20License%201.02.pdf

Let’s discover the Saudi Open Data Portal together

https://data.gov.sa
Part 2: Starting with Python
3. Why Python Is Hot

• The Python language is becoming more and more popular,
and in 2017 it became the most popular language in the
world according to IEEE Spectrum.
• Machine learning, robotics, artificial intelligence, and data
science are the leading technologies today and for the future.
Python is popular mainly because it already has lots of
capabilities in these areas.

• The main reasons cited for Python’s current popularity are:
  - Python is relatively easy to learn.
  - Everything you need to learn (and do) in Python is free.
  - Python offers more ready-made tools for current hot
    technologies such as data science, machine learning,
    artificial intelligence, and robotics than most other
    languages.
• The following figure shows Google search trends over the last
five years. As you can see, Python has been gaining in popularity.
4. Choosing the Right Python

• This table shows the release dates of various Python versions.
• Go to the python.org website to download Python. The site tells
you the most current stable build (version). That’s the one they
recommend, and that’s the one you should use.
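
Once Python is installed, you can confirm which version you are actually running. The short sketch below uses only the standard library. The helper `is_python3` is a hypothetical name added for illustration.

```python
import platform
import sys

# Report the version and location of the interpreter running this code.
print("Python version:", platform.python_version())
print("Interpreter path:", sys.executable)


def is_python3():
    """Return True when the running interpreter is Python 3 or later."""
    return sys.version_info.major >= 3


print("Python 3 or later:", is_python3())
```

If the version printed differs from the one you downloaded, another Python installation on the machine is probably being picked up first.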
5. Tools for Success

• You’ll need a good Python interpreter and editor. The editor lets
you type the code, and the interpreter lets you run that code.
• A code editor is an app that lets you type code, but you also
need the Python interpreter to run it.
• Anaconda is a complete Python development environment with an
intuitive and easily managed graphical user interface, like those
on a Mac, on Windows, or on any phone or tablet.
• Anaconda is often referred to as a data science platform
because many of the packages that come with it are data
science oriented. It also comes with VS Code.
Installing Anaconda and VS Code
• Go to www.anaconda.com/download.
• Click Download under the largest version number.
• Follow any on-screen instructions to download the free version.
• During installation, choose whichever options make sense to you.
• When you come to a page that asks whether you want to install
Microsoft VS Code, click Install Microsoft VS Code (or whatever
option on your screen indicates that you want to install VS
Code). The installation may take quite a while.
• Using Anaconda Navigator: Anaconda Navigator lets you
navigate through the different features of the app and
choose what you want to run.
6. Writing Python in VS Code

• We suggest that you open VS Code from Anaconda. If necessary,
scroll down a little until you see the Launch button under VS
Code, and then click it.
• To use VS Code with Python and Anaconda, you need some VS
Code extensions. You should already have them because they
come with your Anaconda download. To verify that, click the
Extensions icon in the left pane (it looks like a puzzle piece). You
should see at least three extensions listed: Anaconda Extension
Pack, Python, and YAML.
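
With the editor set up, a first script to try might look like the sketch below. The file name hello.py and the function `greeting` are hypothetical choices for illustration. Save the code in a file and run it from VS Code to confirm that the editor and the interpreter work together.

```python
def greeting(name):
    """Build a short welcome message for the given name."""
    return f"Hello, {name}! Welcome to Python."


# This part runs only when the file is executed as a script.
if __name__ == "__main__":
    print(greeting("Data Science"))
```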
7. Using Jupyter Notebook for Coding

• Jupyter Notebook is another popular tool for writing Python
code. It supports writing code in three popular languages: Julia,
Python, and R.
• People often use Jupyter to share code on the Internet. It is free
and comes with Anaconda.
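
As a possible first notebook cell that ties the two parts of this module together, the sketch below summarizes a small CSV table of the kind you might download from an open data portal. The data is a made-up sample, and the column names and the helper `average_by_region` are assumptions for illustration only.

```python
import csv
import io
from statistics import mean

# Made-up sample standing in for a CSV file from an open data portal.
SAMPLE_CSV = """region,year,visitors
A,2021,120
B,2021,300
A,2022,180
B,2022,340
"""


def average_by_region(csv_text):
    """Return the average number of visitors for each region."""
    values_by_region = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        region_values = values_by_region.setdefault(row["region"], [])
        region_values.append(float(row["visitors"]))
    return {region: mean(vals) for region, vals in values_by_region.items()}


print(average_by_region(SAMPLE_CSV))
```

To work with a real download, you could pass the text of the downloaded file to the same function, provided its columns are renamed to match.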
Thank You
