STA 122 Notes Part I
Course Outline
Computer graphics. Statistical packages and libraries. Role of
computers in databases. Survey applications. Number systems;
errors and accuracy; interpolation; finite differences; difference
equations; successive approximation or iterative techniques.
Numerical solution of non-linear equations. Writing programs to
implement numerical algorithms. Application of numerical analysis
software packages such as NAG.
Introduction
One feature of the development of the modern world is the
growing capacity to convert observations into numbers. The
science that deals with numbers is statistics.
It processes the numbers and organises them in a meaningful way so
that information is generated. This information builds up knowledge,
and so development continues. Advances in computing have come in
handy here, as they help to do this part of the job accurately, promptly,
effectively and convincingly.
When you are studying only a sample, there is a possibility that your
assumptions may not be accurate, and you can never be certain that you
have drawn the correct inference.
For this reason, the inferential use of statistics may be thought of as
helping you to make decisions under conditions of uncertainty.
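The uncertainty that comes with sampling can be illustrated with a small
simulation. The sketch below is an illustration only: the population
(10,000 values drawn from a normal distribution with mean 50 and standard
deviation 10), the sample size of 30, and the function name
sample_mean_with_interval are all assumptions made for this example. It
draws one sample and reports the sample mean with an approximate 95%
interval based on the normal critical value 1.96.

```python
import random
import statistics

def sample_mean_with_interval(population, n, rng):
    """Draw a simple random sample of size n and return
    (sample mean, lower bound, upper bound) of an approximate 95% interval."""
    sample = rng.sample(population, n)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    std_error = statistics.stdev(sample) / n ** 0.5
    margin = 1.96 * std_error  # normal approximation (assumption)
    return mean, mean - margin, mean + margin

rng = random.Random(122)  # fixed seed so that the run is repeatable
# Hypothetical population, generated only for this illustration
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

mean, low, high = sample_mean_with_interval(population, 30, rng)
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {mean:.2f}")
print(f"approx. 95% interval: ({low:.2f}, {high:.2f})")
```

The sample mean will usually differ from the population mean, and the
interval expresses how far off it might plausibly be. Changing the seed
or the sample size shows how the estimate varies from sample to sample.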
The need for a statistical package has arisen because of the complexity
of calculations involved in making inferences from the data. The
advances in computing technologies have made statistics an even more
powerful field.
1. Microsoft Excel
The MS Excel worksheet is a collection of cells. In Excel 2003
and earlier there are 65,536 (rows) x 256 (columns) cells in
a worksheet; later versions allow 1,048,576 rows x 16,384
columns. Each row or column can be used to enter data
belonging to one category. Data entry in MS Excel is as
simple as writing on a piece of paper. MS Excel assigns
each column a field depending upon the type of data. It
supports various data formats; one can choose a data
format by formatting the cells.
This worksheet can be used for data entry and for
performing calculations at the click of a button. It has a “paste
function” feature, where you can insert any formula from a large
list of built-in functions.
MS Excel can be used to create tables and graphs and to
perform statistical calculations. The work done in MS
Excel can easily be copied and pasted into many
Windows-based programs for further analysis.
7. Epi-data
Epi-data is free, widely used data software designed to
assist epidemiologists, public health investigators and
others to enter, manage and analyze data in the field.
It performs basic statistical analysis, produces graphs and
provides comprehensive data management.
Here the user creates their own forms and database.
Epi-data is used for quantitative data analysis.
8. Epi-info
It is a public domain suite of software tools designed for
researchers and public health practitioners around the globe,
developed by the Centers for Disease Control and Prevention
(CDC).
It provides easy data entry form and database
construction, and data analyses with epidemiologic
statistics, maps, and graphs for those who may lack an
information technology background.
It is used for outbreak investigations; for developing small
to mid-sized disease surveillance systems; and as the analysis,
visualization, and reporting (AVR) component of larger
systems.
It is used for quantitative data analysis.
Data - Raw facts. The word raw indicates that the facts have not yet
been processed to reveal their meaning.
Information - The result of processing raw data to reveal its
meaning.
Database - A shared, integrated computer structure that stores a
collection of:
End-user data, that is, raw facts of interest to the end user.
Metadata, or data about data, through which the end-user data
are integrated and managed.
Metadata - Provides a description of the data characteristics and the
set of relationships that link the data found within the DB.
Database management system (DBMS):
A collection of programs that manages the DB structure and
controls access to the data stored in the database.
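The difference between end-user data, metadata and information can be
seen in a small example. The sketch below uses Python's built-in sqlite3
module, with SQLite standing in for the DBMS. The patient table, its
columns and the three rows are made-up values used only for illustration.

```python
import sqlite3

# An in-memory SQLite database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")

# Creating a table records metadata (column names and types) in the database.
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, age INTEGER, district TEXT)"
)

# End-user data: raw facts of interest to the end user (made-up values).
rows = [(1, 34, "North"), (2, 51, "South"), (3, 27, "North")]
conn.executemany("INSERT INTO patient VALUES (?, ?, ?)", rows)

# Metadata: data about data, read back from the DBMS.
columns = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(patient)")
]
print(columns)

# Information: the result of processing the raw data to reveal its meaning.
average_age_by_district = dict(
    conn.execute("SELECT district, AVG(age) FROM patient GROUP BY district")
)
print(average_age_by_district)

conn.close()
```

The rows are the end-user data, the column names and types returned by
the PRAGMA query are metadata, and the average age per district is
information obtained by processing the raw facts.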
Role and advantages of the DBMS
-improved data sharing
-improved data security
-better data integration
-minimized data inconsistency
-improved data access
-improved decision making
-increased end-user productivity
Types of DB according to number of users:
1- Single-user (desktop) database: supports a single user on a desktop computer
2- Multiuser database: supports multiple users
- Workgroup DB: small number of users
- Enterprise DB: more than 50 users
DBMS Functions
A DBMS performs several important functions that guarantee the
integrity and consistency of the data in the DB.
1- Data dictionary management
The DBMS stores definitions of the data elements and their
relationship in data dictionary.
2- Data storage management
The DBMS creates and manages the complex structures
required for data storage, thus relieving you of the difficult
task of defining and programming the physical data
characteristics.
3- Data transformation and presentation
The DBMS transforms entered data to conform to required data
structures
4- Security management
The DBMS creates a security system that enforces user security
and data privacy
5- Multiuser access control
To provide data integrity and data consistency
6- Backup and recovery management
To ensure data safety and integrity
7- Data integrity management
Enforces integrity rules, thus minimizing data redundancy and
maximizing data consistency
8- DB access languages and application programming interfaces
9- DB communication interfaces
Current-generation DBMSs accept end-user requests via
multiple, different network environments
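Data integrity management (function 7 above) can be sketched with
Python's built-in sqlite3 module. This is an illustration under stated
assumptions, not a description of any particular DBMS product: the
district and patient tables, the age range 0 to 130 and the helper
try_insert are hypothetical names and values chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE district (name TEXT PRIMARY KEY)")
conn.execute(
    """CREATE TABLE patient (
           id INTEGER PRIMARY KEY,
           age INTEGER NOT NULL CHECK (age BETWEEN 0 AND 130),
           district TEXT NOT NULL REFERENCES district(name)
       )"""
)
conn.execute("INSERT INTO district VALUES ('North')")
conn.commit()

def try_insert(values):
    """Return True if the DBMS accepts the row, False if an integrity rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO patient VALUES (?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, 34, "North")),    # valid row
    try_insert((2, -5, "North")),    # violates the CHECK rule on age
    try_insert((3, 40, "Nowhere")),  # violates the foreign key rule
    try_insert((1, 60, "North")),    # violates the primary key rule (duplicate id)
]
print(results)

row_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
print(row_count)
```

Only the first row is stored. The rules are declared once, in the table
definitions, and the DBMS applies them to every insert, so individual
programs do not have to repeat the checks.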
Disadvantages of DB systems
Increased costs
Management complexity
Maintaining currency
Vendor dependence
Frequent upgrade/replacement cycles
3. SURVEY APPLICATIONS.
Application of Surveys
A survey instrument is any series of pre-defined questions intended
to collect information from people, whether in person, by Internet, or
through any other medium.
Surveys are applicable in the health professions, education research,
engineering, economics, finance and agriculture.
This is because of their:
low cost,
relative speed,
simplicity of use.