Chapter 13 - Data Analysis


CHAPTER 13

DATA ANALYSIS
Topic list
1. Use of data in business
2. Sources of data and information
3. Qualities of good information
4. Data analysis
5. Potential problems with data
6. Presentation of information
7. Big data
8. Data science

201053_Data Analysis
1. Use of data in business
Data vs information
Data: distinct pieces of information, which can exist in a variety of forms:
▪ Piece of paper: number or text
▪ Electronic memory: bits or bytes
▪ Person's mind: facts
Data is the input. For example, in a bank, the transactions recording money transferred out and in are the data that the accountants process.
Information: the output of whatever system is used to process data or to organize it in a useful way.
Use of data in business
Types of data
Quantitative data
- Measurements or counts
- Numeric values
- Quantitative data lends itself to statistical and other analysis
e.g. the number of units sold per day; the height of individuals.
Qualitative data
- Cannot be expressed in numerical terms
e.g. gender; nationality, etc.
Discrete data (note: the data must be whole numbers)
- Can only take exact values (usually whole numbers)
e.g. the number of people in a group
Continuous variables (note: estimated, approximate amounts)
- Can take any value within a range
e.g. the height of an individual
Use of data in business

▪ Planning: businesses need to plan their activities.
▪ Decision-making, from short to long term (for example, whether or not to invest).
▪ Controlling: helping the organization to achieve its objectives, or making amendments to adjust the plan (for example, controlling costs).
2. Sources of data and information

▪ Useful data/information comes from both inside and outside the organization, from a variety of sources.
▪ The internet of things is an important source of data. It includes smart devices, software, sensors and security devices.
Internal data sources

▪ A system for collecting or measuring transaction data, for example sales, purchases, inventory, etc.
▪ Informal communication: private, off-the-record conversations between employees (unofficial, as opposed to official sources such as company documents or emails from the boss)
▪ Communication between managers
▪ From the accounting records
▪ From human resources and payroll records
▪ From machine logs and computer systems in production/operation
▪ From procurement data systems
▪ From timesheets in service businesses
▪ From staff

External data sources


Capturing data and information from outside the business may be formal or informal.
▪ Received from customers and suppliers
▪ Received from the government
▪ Received from advice or information bureaus (e.g. Reuters or Bloomberg)
▪ Received from data sharing portals
▪ Received from consultancies of all sorts
▪ Received from newspaper and magazine publishers
▪ Received from specific reference works which are used in a particular line of work
▪ Received from libraries and information services
▪ Received from the systems of other businesses
The internet of things

▪ Smart devices (gas and electricity meters, smartphones, etc.)
▪ Software (such as applications to control smart devices)
▪ Sensors (such as motorway traffic sensors, temperature sensors, smoke sensors, ...)
▪ Security devices (such as CCTV)


3. Qualities of good information
There are 8 key characteristics (ACCURATE)

1. Accurate
2. Complete
3. Cost-beneficial
4. User-targeted
5. Relevant
6. Authoritative - the source of the information should be a reliable one, for example a body with the authority to issue it
7. Timely (note: the information covers a stated period, from which date to which date)
8. Easy to use - clear and easy to read
4. Data analysis

What is data analysis?

➔ Data analysis involves obtaining useful information from data to give insight for management.

What is a well-planned data analysis program? It involves the following stages:

(1) Identifying the information needs of the business: dependent on the business's objectives and strategies
(2) Collecting the data
(3) Analysing the data
(4) Presenting the information
(5) Using the information

The process above is likely to be an iterative one. This means that rather than the stages occurring in strict sequential order, earlier stages may be repeated based on feedback obtained at later stages (e.g. during the data analysis stage (Stage 3), insights obtained may lead analysts to modify Stage 2 by identifying additional sources of data).

Data analysis may be based on the whole population or on a sample. The population is defined as the entire set of data from which a sample is selected for analysis.
Analyzing the data
Statistical analysis is commonly used, as follows:
1. Descriptive statistics: to summarise the data in a data set.
E.g. Measures of central tendency (mean, median, and mode)
Measures of dispersion (range, variance, and standard deviation)
2. Inferential statistics: to deduce the characteristics of a bigger population from a small but representative sample, and to draw conclusions on that basis.
3. Exploratory data analysis: identifying relationships in a set of data (whether one item of data affects another), for example a pattern in a business's data.
4. Confirmatory data analysis: using statistical methods to confirm a pre-determined hypothesis. A hypothesis is set out first, and information is then gathered to see whether it is correct.
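The descriptive measures listed above can be sketched in a few lines of Python using the standard library's statistics module. The daily sales figures are hypothetical and are not taken from the chapter. The sketch treats the week as the whole population, so it uses the population variance and standard deviation.

```python
from statistics import mean, median, mode, pvariance, pstdev

# Hypothetical data: units sold per day over one week (not from the chapter).
units_sold = [12, 15, 15, 18, 20, 22, 24]

# Measures of central tendency
central = {
    "mean": mean(units_sold),
    "median": median(units_sold),
    "mode": mode(units_sold),
}

# Measures of dispersion, treating the week as the whole population
dispersion = {
    "range": max(units_sold) - min(units_sold),
    "variance": pvariance(units_sold),
    "std_dev": pstdev(units_sold),
}

print(central)
print(dispersion)
```

If the seven days were instead a sample from a longer period, statistics.variance and statistics.stdev (the sample versions) would be the more usual choice.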
Sampling

Sampling: analyzing a sample of data from a population, and based on this, making inferences about the population.
When making inferences, the statistics obtained from the sample will not be exactly the same as those of the population ➔ it has to be recognized that they are an estimate.
How to make the estimates more reliable?
▪ The sample should be selected randomly to avoid introducing bias.
▪ Larger samples are likely to be more representative of the population than small samples (the more survey responses collected, the better).
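As a rough illustration of the two points above, the sketch below draws random samples of two different sizes from a simulated population and compares how far the sample mean typically falls from the population mean. The population (10,000 heights with mean 170 and standard deviation 10), the sample sizes and the helper name average_error are all assumptions made for this example.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: heights (cm) of 10,000 individuals.
population = [rng.gauss(170, 10) for _ in range(10_000)]


def average_error(population, n, repeats, rng):
    """Average absolute gap between a random sample's mean and the population mean."""
    true_mean = mean(population)
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # simple random sample, without replacement
        errors.append(abs(mean(sample) - true_mean))
    return mean(errors)


small_sample_error = average_error(population, n=10, repeats=200, rng=rng)
large_sample_error = average_error(population, n=1000, repeats=200, rng=rng)

print(f"typical error with n=10:   {small_sample_error:.2f} cm")
print(f"typical error with n=1000: {large_sample_error:.2f} cm")
```

Both sample sizes give only an estimate of the population mean, but the larger sample's estimate is typically much closer.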
5. Potential problems with data

▪ Professional skepticism: this involves assessing information, estimates, and explanations critically with a questioning mind (including being sceptical about the figures), and being alert to possible misstatements due to error or fraud.
▪ Comparability of data: the extent to which differences between statistics from different geographical areas, non-geographic domains, or over time, can be attributed to differences between true values of the statistics. For example, differences between economies or between subjects lead to differences in the data.
▪ Comparability might be distorted by two main sources:
- Use of different definitions (when comparing two companies, the same economic indicators should be used); or
- Use of different measuring tools, compilation, and presentation practices.
5. Potential problems with data

▪ If sample data is used to make inferences about the population then it is important that the sample is representative of the population. The larger the sample is, the more likely it is to be representative of the population.
▪ Data bias: where the data in the sample is not representative of the population for reasons other than the size of the sample.
▪ Some different types of data bias: selection bias; self-selection bias; observer bias; omitted variable bias; cognitive bias; confirmation bias; survivorship bias.
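A minimal sketch of selection bias, one of the types listed above, under assumed data: two samples of the same size are drawn from a simulated customer base, one at random and one only from customers who spend more than 500. The spending figures, the 500 threshold and all variable names are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: annual spend of 5,000 customers.
population = [rng.uniform(0, 1000) for _ in range(5000)]
population_mean = mean(population)

# Unbiased: every customer has the same chance of being selected.
random_sample = rng.sample(population, 500)

# Selection bias: only higher-spending customers can be selected,
# for example if the survey reaches loyalty-scheme members only.
high_spenders = [spend for spend in population if spend > 500]
biased_sample = rng.sample(high_spenders, 500)

random_gap = abs(mean(random_sample) - population_mean)
biased_gap = abs(mean(biased_sample) - population_mean)

print(f"population mean:    {population_mean:.0f}")
print(f"random sample mean: {mean(random_sample):.0f}")
print(f"biased sample mean: {mean(biased_sample):.0f}")
```

Both samples contain 500 customers, so the gap in the biased sample comes from how it was selected, not from its size.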
5. Potential problems with data

▪ Type I or Type II errors:
In hypothesis testing there is a risk that wrong conclusions are reached as follows:
A type I ('false positive') error occurs where the null hypothesis is true, but because the sample results are significantly different the null hypothesis is rejected.
A type II ('false negative') error occurs where the null hypothesis is false, but it is accepted because the sample results are not statistically significantly different from the null hypothesis.
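The two error types can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the chapter: it uses a two-sided z-test with a known standard deviation, a 5% significance level, and hypothetical values (null mean 100, standard deviation 15, samples of 30). The function names are made up for the example.

```python
import random
from statistics import NormalDist, mean


def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test: True if the null hypothesis (mean == mu0) is rejected."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mu, mu0, sigma, n, trials, seed):
    """Share of simulated samples for which the null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if z_test_rejects(sample, mu0, sigma):
            rejections += 1
    return rejections / trials


# Null hypothesis is TRUE (true mean equals 100): every rejection is a type I error.
type_i_rate = rejection_rate(true_mu=100, mu0=100, sigma=15, n=30, trials=2000, seed=1)

# Null hypothesis is FALSE (true mean is 105): every non-rejection is a type II error.
type_ii_rate = 1 - rejection_rate(true_mu=105, mu0=100, sigma=15, n=30, trials=2000, seed=2)

print(f"type I error rate:  {type_i_rate:.3f}")
print(f"type II error rate: {type_ii_rate:.3f}")
```

With these settings the type I rate should sit close to the chosen 5% significance level, while the type II rate depends on the sample size and on how far the true mean is from the null value.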
6. Presentation of information

Data visualisation: the use of charts and diagrams (graphs and charts drawn from the data) to present information.

The objectives of a good presentation are:
▪ It should be easy for the users to understand.
▪ The presentation should accurately reflect the underlying data.
▪ The information may need to be accompanied by some commentary to help users interpret the message.
Bar charts
Pie charts
Line charts
7. Big data
▪ What is big data?
Those datasets whose size is beyond the ability of typical … software to capture, store, manage and analyse (Manyika et al)

▪ Characteristics of big data:
Volume: Big data accessible to a business is vast.
Velocity: Big data can be streamed into the business at great speed.
Variety: Big data is available on a huge variety of issues.
Veracity: This is to do with the trustworthiness or accuracy of big data.
Types of big data
Structured data: is obtained with a particular purpose in mind, so it has an inherent structure.
✓ Created data: has been created on purpose by an organisation, usually for product or market research
✓ Provoked data: obtained from people who have been given the opportunity to express their views, for example through surveys
✓ Transacted data: collected about actual transactions such as sales
✓ Compiled data: collected by third parties, such as market research companies, credit rating agencies, etc.
Unstructured data: is obtained without a particular objective so has no inherent structure within itself:
✓ Captured data: gathered by tracking behaviour, for example something a user searches for that later reappears wherever they go online
✓ User-generated data: created by users themselves, for example taking photos and posting them on Facebook
Sources of Big data

Another way of classifying types of big data is to analyze its sources:


• Processed data from information systems.
• Open data refers to the release of large amounts of primarily public sector
data (e.g. GSO, transport data, government financial data, public service data)
• Human-sourced data from social networks, blogs, email, text messages, and
internet searches.
• Machine-generated data from the internet of things: from fixed and mobile
sensors, and from computer and website logs.

Importance of big data


• New sources of data (open data, social media, and the internet of things)
• Exponential growth in computing power and storage
• New infrastructure for knowledge creation
8. Data science

Data science covers the whole life cycle of data,


from acquisition and exploration to analysis and
communication of the results.

Data science deals with collecting, preparing, managing, analysing, interpreting and visualising large and complex datasets (Imperial College London, 2018)
Data analytics
Value is extracted from big data by data
scientists through the process of data analytics.

Data analytics: the process of using fields within the source data itself, rather than predetermined formats, to collect, organize and analyze large sets of data to discover patterns and other useful information which an organization can use for its future business decisions.
Benefits of big data, data science and data
analytics
Big data, data science and data analytics can help an organisation to:
1. Gain insights
2. Predict the future, and
3. Automate non-routine decision-making.
Benefits of big data, data science and data
analytics
Big data and data analytics can be used by a business to create value:
▪ Enhancing transparency
▪ Performance improvement
▪ Market segmentation and customisation
▪ Decision making
▪ Innovation
▪ Risk management
Risks of big data, data science and data
analytics

▪ Storage
▪ Workforce skills
▪ Data dependency
▪ Information overload
▪ Data privacy
▪ Data security
Data ethics

The increasing collection and analysis of data, particularly personal data held about individuals, raises complex ethical issues.
▪ Transparency: are businesses transparent about how they use data?
▪ Fairness: the processes for collecting, storing, and analysing data should aim to avoid unintended discriminatory effects.
▪ Privacy
▪ Ownership of data
▪ Consent
END OF CHAPTER 13
