STA 122 Notes Part I

STA122: Computational Methods and Data Analysis I

Course Outline
Computer graphics. Statistical packages and libraries. Role of
computers in data bases. Survey applications. Number systems;
errors and accuracy; interpolation; finite differences; difference
equations; successive approximation or iterative techniques.
Numerical solution of non-linear equations. Writing programs to
implement numerical algorithms. Application of numerical analysis,
software packages such as NAG.

Introduction
A defining feature of the modern world is the capacity to convert
observations into numbers. Statistics is the science that deals with
these numbers: it organises them in a meaningful way so that
information is generated, and this information builds up knowledge.
Advances in computing support this work by helping it to be done
accurately, promptly, effectively and convincingly.

The term “Statistics” is used for a “collection of numerical facts or
data”. It is also used for a “body of methods and techniques for
analysing numerical data”.

Statistical techniques serve many purposes, including methods and
procedures for summarising, simplifying, reducing and presenting raw
data. They are also used to make predictions, test hypotheses and infer
the characteristics of a population from the characteristics of a sample.

In other words, Statistics is generally thought of as serving two
functions. One is to describe sets of data; the other is to help in
drawing inferences.

When you study only a sample, there is a possibility that your
conclusions may not be accurate, and you can never be certain that you
have drawn the correct inference.
For this reason, the inferential use of statistics may be thought of as
helping you to make decisions under conditions of uncertainty.

It is different from guessing, because Statistics also provides you with
a method of estimating how reliable your conclusions are. With each
statistical statement that you make, you indicate the probability that
findings like yours could have been the result of chance factors.
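The idea of quantifying the chance that a finding arose by chance alone can be illustrated with a small permutation test. This is only a sketch using Python's standard library; the two groups of sample values are invented for illustration.

```python
import random

# Two invented samples: scores for two illustrative groups.
group_a = [12, 15, 14, 16, 13, 17, 15, 14]
group_b = [10, 11, 13, 12, 11, 12, 10, 13]

observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)

# Permutation test: shuffle the pooled data many times and count how
# often a mean difference at least as large arises purely by chance.
pooled = group_a + group_b
random.seed(0)  # fixed seed so the sketch is reproducible
count = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:8]) / 8 - sum(pooled[8:]) / 8
    if diff >= observed:
        count += 1

p_value = count / trials
print(f"observed difference: {observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value indicates that a difference as large as the one observed would rarely be produced by chance factors alone.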

Statistical Packages and Libraries


Computers can help immensely in statistical analysis. Numerous
statistical tools exist, and the need is to identify their appropriate
use. Even with a statistical package, many statistical procedures
require considerable prior knowledge and insight.

A statistical package is software for the collection, organisation,
interpretation, and presentation of numerical information.
Statistical software comprises specialized programs designed to
perform complex statistical analysis.
These tools assist in the organization, interpretation, and
presentation of selected data sets to provide science-based insights
into patterns and trends.

Statistical software applies statistical methodologies such as
regression analysis and time series analysis to perform data science
tasks.
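As a concrete illustration of one such methodology, the following sketch fits a simple linear regression by ordinary least squares using only Python's standard library; the data points are invented.

```python
# Ordinary least squares for y = a + b*x, using the closed-form
# formulas b = cov(x, y) / var(x) and a = mean(y) - b * mean(x).
x = [1, 2, 3, 4, 5]
y = [2.1, 4.0, 6.2, 7.9, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope: sum of cross-deviations over sum of squared x-deviations.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
    / sum((xi - mean_x) ** 2 for xi in x)
a = mean_y - b * mean_x

print(f"fitted line: y = {a:.3f} + {b:.3f} x")
```

A statistical package performs exactly this kind of calculation, along with standard errors and diagnostics, behind a single command or menu click.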

The need for statistical packages has arisen because of the complexity
of the calculations involved in making inferences from data. Advances
in computing technologies have made statistics an even more powerful
field.

Benefits of Statistical Software


 Increases the efficiency of work
 Greater accuracy in data analysis and management
 Saves time
 Easy customization
 Grants access to large databases
 Reduces sampling error
 Empowers data-driven decisions

Types of Statistical Software

The most widely used statistical packages include Excel, SPSS, SAS,
Minitab, GenStat, the Generalised Linear Interactive Modelling
Package (GLIM), Stata, S-PLUS, R, MATLAB (MATrix LABoratory),
Epi-data, Epi-info and NVivo.

1. Microsoft Excel
 The MS Excel worksheet is a collection of cells. There are
65,536 rows × 256 columns in an MS Excel worksheet (in
versions up to Excel 2003). Each row or column can be used
to enter data belonging to one category. Data entry in MS
Excel is as simple as writing on a piece of paper. MS Excel
assigns each column a field depending upon the type of
data. It supports various data formats; one can choose a
data format by formatting the cells.
 This worksheet can be used for data entry and for
performing calculations at the click of a button. It has a
“Paste Function” feature where you can pick any formula
from a large list of built-in functions.
 MS Excel can be used to create tables and graphs and to
perform statistical calculations. Work done in MS Excel
can easily be copied and pasted into many Windows-based
programs for further analysis.

 Spreadsheets are a useful and popular tool for processing
and presenting data. Microsoft Excel spreadsheets have
become somewhat of a standard for data storage, at least
for smaller data sets. The program is often packaged with
new computers, which increases its availability. Because of
this easy availability, many people, including professional
statisticians, use Excel, even on a daily basis, for quick and
easy statistical calculations.

 Excel is clearly not an adequate statistics package, because
many statistical methods are simply not available. This
lack of functionality makes it difficult to use Excel for more
than computing summary statistics, simple linear
regression and hypothesis testing.
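The kind of summary-statistic calculation that a spreadsheet handles well can also be done in a few lines with Python's standard `statistics` module, shown here as a neutral sketch with invented exam scores:

```python
import statistics

# Invented sample of exam scores for illustration.
scores = [56, 62, 71, 45, 88, 90, 67, 73, 59, 81]

print("mean:  ", statistics.mean(scores))    # arithmetic average
print("median:", statistics.median(scores))  # middle value
print("stdev: ", round(statistics.stdev(scores), 2))  # sample std. dev.
```

For anything beyond such summaries, a dedicated statistical package offers the procedures that Excel lacks.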

2. SPSS (Statistical Package for Social Sciences)

 It is a very popular package due to its features and
compatibility with other Windows-based programs. In the
late 1960s, three Stanford University graduate students
developed the SPSS statistical software system.
 SPSS is the most widely used powerful software for
complex statistical data analysis.
 It easily compiles descriptive statistics and parametric and
non-parametric analyses, and delivers graphs and
presentation-ready reports to communicate results easily.
 More accurate reports are achieved through the estimation
and uncovering of missing values in the data sets.
 SPSS is used for quantitative data analysis.
 SPSS can take data input from many packages like dBase
(*.dbf), Excel (*.xls), Lotus 123 (*.w*) and others like *.dat
and *.txt. It can filter the data and perform analysis only
on selected cases.
 SPSS also supports several statistical graphs. It displays
many statistics on the graph itself. It has a feature that
helps you to find a chart that is most suitable for your data,
which is called “Chart Galleries by Data Structure”.
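The idea of filtering data and analysing only selected cases, as SPSS does, can be sketched in plain Python. The records and field names below are hypothetical, standing in for data imported from, say, a *.xls or *.txt file:

```python
# Hypothetical survey records, as might be imported into SPSS.
records = [
    {"sex": "F", "age": 23, "score": 74},
    {"sex": "M", "age": 31, "score": 61},
    {"sex": "F", "age": 45, "score": 82},
    {"sex": "M", "age": 27, "score": 58},
]

# Filter: keep only the selected cases (here, sex == "F"),
# then run the analysis on that subset alone.
selected = [r for r in records if r["sex"] == "F"]
mean_score = sum(r["score"] for r in selected) / len(selected)
print(f"{len(selected)} cases selected, mean score {mean_score}")
```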
3. Stata
 Stata is also widely used software that enables users to
analyze, manage, store and produce graphical
visualizations of data.
 Coding knowledge is not necessary to use it.
 The presence of both a command-line and a graphical user
interface makes it intuitive to use.
 It is generally used by researchers in economics, the social
sciences and biomedicine to examine patterns in data.
 Stata is used for quantitative data analysis.
4. R
 ‘R’ is widely used free statistical software that provides
statistical and graphical techniques, including linear and
non-linear modelling.
 Toolboxes (essentially plugins) are available for a great
range of applications. Knowledge of coding is required.
 It provides interactive reports and applications, leverages
large amounts of data, and is compliant with security
practices and standards.
 R is used for quantitative data analysis.

5. SAS (Statistical Analysis Software)


 It is a cloud-based platform that provides ready-to-use
programs for data manipulation, information storage and
retrieval.
 Its procedures are multithreaded, performing multiple
operations at once.
 It is primarily used for statistical modeling, observing
trends and patterns in data, and aiding decision-making
by business analysts, statisticians, data scientists,
researchers and engineers.
 Coding can be difficult for those new to this approach.
 It is used for quantitative data analysis.
6. MATLAB (MATrix LABoratory)
 MATLAB is software that provides an analytical platform
and a programming language.
 It supports matrix and array mathematics, plotting of
functions and data, implementation of algorithms, and
creation of user interfaces.
 A Live Editor is also included, which creates scripts that
combine code, output, and formatted text in an executable
notebook.
 It is widely used by engineers and scientists.
 MATLAB is used for quantitative data analysis.

7. Epi-data
 Epi-data is free, widely used data software designed to
assist epidemiologists, public health investigators and
others to enter, manage and analyze data in the field.
 It performs basic statistical analysis, graphing and
comprehensive data management.
 Users can create their own forms and databases.
 Epi-data is used for quantitative data analysis.
8. Epi-info
 It is a public-domain software suite developed by the
Centers for Disease Control and Prevention (CDC) for
researchers and public health practitioners around the
globe.
 It provides easy data-entry forms and database
construction, and data analysis with epidemiologic
statistics, maps, and graphs, for those who may lack an
information technology background.
 It is used for outbreak investigations; for developing small
to mid-sized disease surveillance systems; and as the
analysis, visualization, and reporting (AVR) component of
larger systems.
 It is used for quantitative data analysis.

2. ROLE OF COMPUTERS IN DATA BASES.

Data - raw facts. The word raw indicates that the facts have not yet
been processed to reveal their meaning.
Information - the result of processing raw data to reveal its
meaning.
Database - a shared, integrated computer structure that stores a
collection of:
 End-user data, that is, raw facts of interest to the end user.
 Metadata, or data about data, through which the end-user data
are integrated and managed.
Metadata provide a description of the data characteristics and the
set of relationships that link the data found within the DB.
Database management system (DBMS):
A collection of programs that manages the DB structure and
controls access to the data stored in the database.
Role and advantages of the DBMS
- Improved data sharing
- Improved data security
- Better data integration
- Minimized data inconsistency
- Improved data access
- Improved decision making
- Increased end-user productivity

Types of DB (according to the number of users):
1- Single-user desktop database: one user on a desktop computer
2- Multiuser database: supports multiple users
- Workgroup DB: small number of users
- Enterprise DB: large number of users across many departments

DBMS Functions
A DBMS performs several important functions that guarantee the
integrity and consistency of the data in the DB.
1- Data dictionary management
The DBMS stores definitions of the data elements and their
relationships in a data dictionary.
2- Data storage management
The DBMS creates and manages the complex structures
required for data storage, thus relieving you of the difficult
task of defining and programming the physical data
characteristics.
3- Data transformation and presentation
The DBMS transforms entered data to conform to required data
structures
4- Security management
The DBMS creates a security system that enforces user
security and data privacy.
5- Multiuser access control
To provide data integrity and data consistency
6- Backup and recovery management
To ensure data safety and integrity
7- Data integrity management
Enforces integrity rules, thus minimizing data redundancy and
maximizing data consistency
8- DB access languages and application programming interfaces
9- DB communication interfaces
A current-generation DBMS accepts end-user requests via
multiple, different network environments.
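Several of these functions can be seen in miniature with SQLite, a small DBMS shipped with Python's standard library. In this sketch (the table and column names are invented for illustration) the schema constraints correspond to data integrity management, the `sqlite_master` catalogue to the data dictionary, and SQL itself to the DB access language:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Integrity rules are declared with the schema: the PRIMARY KEY and
# NOT NULL constraints keep the stored data consistent.
conn.execute("""
    CREATE TABLE student (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        score REAL
    )
""")
conn.execute("INSERT INTO student (name, score) VALUES (?, ?)",
             ("Amina", 71.5))

# The data dictionary: sqlite_master stores the definitions of the
# data elements (tables, columns) managed by the DBMS.
row = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchone()
print("table in data dictionary:", row[0])

# SQL is the DB access language used for every request above.
count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print("rows stored:", count)
```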

Disadvantages of DB systems
 Increased costs
 Management complexity
 Maintaining currency
 Vendor dependence
 Frequent upgrade/replacement cycles

3. SURVEY APPLICATIONS.

Surveys are research methods used for collecting data from a
predefined group of respondents to gain information and insights into
various topics of interest. They can have multiple purposes, and
researchers can conduct them in many ways depending on the
methodology chosen and the study’s goal.

The data are usually obtained through standardized procedures to
ensure that each respondent answers the questions on a level playing
field, avoiding biased opinions that could influence the outcome of
the research or study.

The process involves asking people for information through an online
or offline questionnaire. With the arrival of new technologies, it is
now common to distribute questionnaires using digital media such as
social networks, email, QR codes, or URLs.

Creating a Good Survey Tool

A survey usually has its beginnings when a person, company, or
organization faces a need for information and no sufficient existing
data are available. The following should be considered when creating a
survey tool.
1. Define objective:
A survey has little meaning if its aim and intended outcome are not
planned before it is deployed. The survey method and plan should
include actionable milestones and the sample planned for the research.
Appropriate distribution methods for these samples also have to
be put in place at the outset.
2. The number of questions:
The number of questions used in a market research study depends
on the research’s end objective. It is essential to avoid redundant
queries in every way possible. The length of the questionnaire has
to be dictated only by the core data metrics that have to be
collected.
3. Simple language:
One factor that can cause a high survey dropout rate is if the
respondent finds the language difficult to understand. Therefore, it
is imperative to use easily understandable text in the survey.
4. Question types:
There are several types of questions that can go into a survey. It is
essential to use the question types that offer the most value to the
research while being the easiest for a respondent to understand and
answer. Using close-ended questions, such as the Net Promoter
Score (NPS) question or multiple-choice questions, helps increase
the survey response rate.
5. Consistent scales:
If you use rating scale questions, ensure the scales are consistent
throughout the research study. Using scales from -5 to +5 in one
question and -3 to +3 in another question may confuse a
respondent.
6. Survey Logic:
Logic is one of the most critical aspects of the survey design. If the
logic is flawed, respondents will not be able to continue further or
the desired way. Logic must be applied and tested to ensure that
only the next logical question appears when selecting an option.
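As an illustration of a close-ended question type, the Net Promoter Score mentioned in point 4 is conventionally computed from 0–10 ratings as the percentage of promoters (ratings 9–10) minus the percentage of detractors (ratings 0–6). A small sketch with invented responses:

```python
# Invented NPS responses on the standard 0-10 scale.
ratings = [10, 9, 8, 7, 10, 6, 9, 3, 10, 8]

promoters = sum(1 for r in ratings if r >= 9)   # ratings 9-10
detractors = sum(1 for r in ratings if r <= 6)  # ratings 0-6
nps = 100 * (promoters - detractors) / len(ratings)
print(f"NPS = {nps:.0f}")
```

Passive responses (7–8) count toward the total number of responses but toward neither group, which is why the score can range from -100 to +100.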

Characteristics of Effective Surveys


1. Specific Objectives
The first step in any survey is deciding what you want to learn.
The goals of the project determine whom you will survey and
what you will ask. If your goals are unclear, the results will
probably be unclear. Each survey item must align with one or
more of the survey objectives.
2. Straightforward Questions
The best survey items are ones that respondents can
understand and respond to immediately. Keep the questions
clear and concise and avoid overly complex language and
structure. (See section on “Writing Good Questions.”)
3. Proper Sample
In order to understand the perspectives of an entire population,
it is necessary to gather responses from a representative sample
of that population. It is not necessary to survey the entire
population; bigger is not always better. Actively pursuing the
selected random sample through follow-up phone calls and
other forms of communication will have a greater impact on the
accuracy and generalizability of your findings than simply
expanding your sampling pool.
4. Reliable and Valid
Reliable surveys generate data that can be reproduced. Valid
surveys measure the construct that they are intended to
measure.
5. Accurate Reporting of Results
Survey results must be carefully analyzed and reported in order
to accurately represent the perspectives of the target
population. In order for a report to be accepted by its target
audience(s) (stakeholder groups such as district department
staff, school staff, parents, students, and/or the community at
large), it must be transparent that researcher/surveyor bias had
no role in the interpretation and reporting of results. Credible
reports include both positive and negative findings; reports that
only share the most positive survey results risk being
disregarded as public relations material and having minimal
impact. Often it is the negative or unexpected report findings
that offer the most guidance for program improvement.
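Drawing a representative random sample, as described in point 3, can be sketched with Python's standard library. The sampling frame here is a hypothetical list of ID numbers:

```python
import random

# Hypothetical sampling frame: ID numbers for a population of 500.
population = list(range(1, 501))

# Select a simple random sample of 50 respondents without replacement.
random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=50)

print("sample size:", len(sample))
print("first five IDs:", sorted(sample)[:5])
```

Every member of the frame has the same chance of selection, which is what makes follow-up with the selected respondents, rather than enlarging the pool, the better route to generalizable findings.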

Checklist of Typical Survey Steps


 Identify survey objectives
 Decide who will be included and how and when the
survey will be administered
 Write survey items
 Prepare survey instrument
 Pilot test the survey and make adjustments as indicated
 Administer survey
 Organize data
 Analyze data
 Report results
When to Use a Survey (Indications)
The best use of survey methodology is to investigate human
phenomena, such as emotions and opinions. These are data that are
neither directly observable, nor available in documents. Moreover, a
new survey instrument is only indicated when a prior instrument
does not exist or is determined empirically to have insufficient
validity and reliability evidence for the sampling frame of interest.

When properly constructed, a survey, regardless of topic and whether
it explores an emotion or an opinion, has rigor equivalent to that of a
psychometric instrument. A psychometric instrument can even be
used as a survey to explore emotion.

Similarly, an opinion, such as a preference for a product or a teaching
method, is a human quality and must be addressed by a survey. It is
worth stressing that opinion surveys require the same rigor as
psychometric instruments.

Application of Surveys
A survey instrument is any series of pre-defined questions intended
to collect information from people, whether in person, by Internet, or
any other media.
Surveys are applicable in the health professions, education research,
engineering, economics, finance and agriculture.
This is because of their:
 low cost,
 relative speed,
 simplicity of use.
