Cit304 Summary From Noungeeks
Most people associate data with statistics that have been organized in
tables. They are right, but only partially. You will also be right in
associating data with numbers, such as, 12.34, 0.3456, -111.01, 12 feet,
45.4 kilograms, 6000 miles, etc. But, would you agree that the words
that I have written since the beginning of this unit are also data?
This definition suggests that data are usually in the form of numbers
and statistics obtained from scientific experiments or computations.
This seems to agree with the popular
conception of data noted above. The second important message of the
definition is that data do not just fall from the sky like manna, but are
generated through activities that are associated with scientific research
such as experiments, measurement of variables, and the computation of
data to derive other data such as sums, averages, quotients, etc.
Data versus Precepts
Data are different from precepts. But, some people believe that data is
synonymous with information. Do you agree?
One thing you might have noticed about the definition of information is
that people and knowledge are involved when we talk about information.
Information is usually extracted from the knowledge possessed by
someone or some community, and then communicated through data to
other people or communities. Such other people must then interpret the
data towards gaining information for improving their own knowledge.
In other words, information is like a stream that flows from one pool of
knowledge to another.
You might need a break at this point to reflect further on the possible
similarity and differences between data and information.
Welcome back.
By way of summarizing for you what you have learnt so far, let me now
give you the definitions of data and information that you will find very
useful in this course.
You should also note the following subtle difference between data and
information. Data is plural; hence, we say 'data are ..'. By contrast,
information is always singular; hence, we say 'information is ..'. The
reason is that data are invariably a set of symbols, which could be
chunked or re-organized into smaller subsets as needed or convenient.
Let me now interpret for you what all these definitions suggest.
Knowledge is the extent to which particular individuals, communities,
or mankind as a whole are conversant or familiar with certain facts,
truths, principles or subjects. Knowledge consists of facts, truths and
ideas that fit together to form a coherent and meaningful whole. A
person's overall knowledge influences how she perceives the world
around her. In other words, knowledge is used by people to interpret
and evaluate new information. Knowledge is built up over time through
observation, experience, or reports of new facts, truths and ideas.
Knowledge serves as a pool from which people can extract specific truths,
facts and ideas (i.e., information) for informing or instructing other
people, thereby improving those other people's knowledge.
Conclusion
Now that you have understood what data are, and how they differ from
precepts, information and knowledge, you will be better able to
appreciate the fact that data are visible everywhere - in books,
newspapers, office documents, computers, billboards, posters, etc.
Nevertheless, the ideas or information implied by the creators of the
data are never known for certain. So also are the ideas or information
that different people can obtain from the data. Much of course depends
on the abilities of people to convey information with data, and to infer
information from data. These abilities depend on their knowledge.
Accordingly, this course, Data Organization and Management, aims to
improve your knowledge of the principles of effective creation,
organization and management of data, and to some extent, information
and knowledge.
Data organization
You will recall from unit 1 that data are invariably created to express or
convey information. However, data can be created out of different
combinations of symbols including words, numbers, graphs, pictures,
sound, etc. Data organization entails the analysis and application of
strategies for selecting, combining and using words, numbers and other
types of symbols to create data for expressing information.
3. Menu items, icons, tool bars, words, etc. are, or should be, displayed
in the windows of a computer screen;
Data management
They are:
Information systems
1. Input;
2. Processing;
3. Storage;
4. Output;
5. Communication.
Data input activities involve the inflow (i.e., input) of data to the system
from other systems. This entails the acquisition or capture of data from
the environment.
Data and information output entails sending out processed data and
information from the system to other systems (people, organizations,
communities).
So, what are the symbols, rules and usages of natural languages?
1. Alphabets and other special symbols (e.g. comma), which are the
symbols that can be used for creating and recording data to express
information.
Hence, the biologists, chemists and other scientists are trained in how to
use such special languages to create data. Of course, such special
languages, just like natural languages, have their peculiar symbols, rules
and usages.
Numeric symbols (0, 1, 2 ...), as well as rules on how the symbols can be
combined to express quantitative information (e.g., 102.237 or 13.4
metres, or 34 F). Hence, 100 is considered to be exactly ten times larger
than 10, and 10 is considered to be ten times larger than 1.
Pictorial and graphic symbols, such as lines, dots, and the like, that can
be used to create image data, such as maps, cartoons, line graphs,
drawings, paintings, etc. The Chinese and Japanese, for instance, use
graphic symbols to represent information.
Most literate people know about the decimal number and counting
system, which involves the use of the digits 0, 1 ... 9 in various combinations to
express integers (e.g. 1, 23, 567, -12 ...) as well as fractional numbers
(1.237, -12.67, 0.000212). Most people also know how to count in units,
tens, hundreds, and so on.
Move the decimal point a number of spaces from its original position in
the given number so that the original number is in the form of a derived
number 'N.NNNN', where the first N before the point is any of the
positive or negative digits (1, 2, 3 ... 9, -1, -2 ... -9), and the other N's could
be any positive digit (0, 1, 2...9).
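
The decimal-point shifting procedure described above can be sketched in Python. This is only an illustrative sketch: the function name `to_scientific` and the choice of four digits after the point (to match the 'N.NNNN' pattern) are assumptions made here, not part of the course text.

```python
def to_scientific(number):
    """Return (mantissa, exponent) so that number is about mantissa x 10**exponent."""
    # Python's 'e' format shifts the decimal point so that exactly one
    # non-zero digit sits before it, and reports how far the point moved.
    mantissa, exponent = f"{number:.4e}".split("e")
    return mantissa, int(exponent)

for value in (12345.678, -12.67, 0.000212):
    mantissa, exponent = to_scientific(value)
    print(f"{value} = {mantissa} x 10^{exponent}")
```

A positive exponent means the point was moved to the left, and a negative exponent means it was moved to the right.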
Similarly,
Note however that another way of saying '10 raised to a certain positive
or negative power' is to say '10 exponent the positive or negative power'.
Hence, the above equations can be rewritten as
The input devices perform the task of detecting signals produced by the
action of a computer operator, such as the press of a key on the keyboard,
or the movement of a mouse, or the clicking of a digital camera, or the
act of speaking into a microphone. The input devices immediately code
the incoming signals into electrical pulses that are transmitted through
some connecting wires to the CPU.
In other words, the CPU and the input (as well as output) devices use a
common language of electronic pulses to exchange data. Accordingly,
data are represented in computer systems not in the form of natural
language symbols, but in the form of electronic symbols, i.e., electronic
pulses. As noted above, the electronic pulses are of just two kinds, low
and high, and are represented by low and high electrical voltages
respectively.
To recap, CPUs and input and output devices are only able to
understand just two types of symbols, low and high voltage pulses,
and nothing else. By contrast, human beings use a wide variety of symbols
(alphabetical, numerical, special, pictorial, etc.) to express information
with data. It was precisely in order to bridge the gap in the number of
symbols used by humans (many symbols) and computer devices (only
two symbols) that early computer engineers and scientists adopted a
data coding system to be used by computer devices to translate data in
human languages into equivalent data that the devices can understand,
and vice versa.
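
As a rough illustration of such a coding system, the sketch below translates characters into patterns of 0s and 1s (standing for low and high pulses) and back again. It assumes an 8-bit, ASCII-style code; the course text does not name a particular coding scheme, and the function names `encode` and `decode` are invented for this example.

```python
def encode(text):
    """Translate each character into an 8-bit pattern of 0s (low) and 1s (high)."""
    return [format(ord(ch), "08b") for ch in text]

def decode(bit_patterns):
    """Translate 8-bit patterns back into the characters they stand for."""
    return "".join(chr(int(bits, 2)) for bits in bit_patterns)

codes = encode("Hi")
print(codes)          # one 8-bit pattern per character
print(decode(codes))  # the original text again
```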
The binary number system
You probably learned about the binary number system in high school.
But if you did not, or you have forgotten, here is an introduction.
However, in order to promote your understanding, let us review the
decimal system with which you are very familiar.
The decimal system has ten symbols or digits (0, 1, 2... 9). Hence, to
represent increasingly bigger units of numbers we begin at 0, and count
through 1 to 9, at which point we run out of digits. The next number is
nine plus one, and to represent it we write a 1 and then a 0 to get 10.
What the 10 means is that we now have one of tens and zero of units. We
then continue cycling through the remaining digits again, hence, 11, 12,
... 19. The last number, 19, means one of tens and nine of units. The
next number is then written as 20, meaning 2 tens and zero units. And
so on. Similarly, the next number after 99 is 100, meaning one of
hundreds and zero of units, and 1000 means one of thousands and zero
of units. And so on.
  1000
   200
    30
     4
--------------
  1234
--------------
the ten digits of the decimal number system. Another way to say this is
that in the decimal system we count in base 10. This is the familiar
number system that is used to represent, and also add, subtract,
multiply and divide quantities. Of course you learned about this system
as early as in primary school.
Now let us see how the rules for representing quantities in the binary
number system are very similar (but not identical) to those of the decimal
number system.
The binary number system has only two symbols or digits, 0 and 1.
Hence, to represent increasingly bigger units of numbers we begin at 0,
and then 1, at which point we run out of digits. The next number is one
plus one, and to represent it we write a 1 and then a 0 to get 10. Note
however, what the 10 here means is that we now have one of twos and zero
of units. The next number will then be 11, that is, one of twos and one of
units. Thereafter, the next number will be 100, meaning one of fours, zero
of twos and zero of units. We then continue counting 101, 110, and then 111.
The next number will be 1000, meaning one of eights, zero of fours, zero
of twos and zero of units. Next will be 1001, 1010, 1011, and so on up to
1111, followed by 10000 and 10001.
Now, what is the meaning of the last number? It means one of sixteens,
zero of eights, zero of fours, zero of twos and one of units.
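
The counting pattern and the place values described above can be checked with a short Python sketch. The helper name `place_values` is an assumption made for this illustration.

```python
def place_values(binary):
    """Pair each binary digit with the power of two that its position stands for."""
    highest = len(binary) - 1
    return [(int(digit), 2 ** (highest - i)) for i, digit in enumerate(binary)]

# Count from 0 to 17 in decimal and show the binary form of each number.
for n in range(18):
    print(n, format(n, "b"))

# 10001: one of sixteens, zero of eights, fours and twos, and one of units.
print(place_values("10001"))
```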
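This is the way the binary number system is built up, from zero
to very large numbers. Hence, a binary number such as 1001010 can be
decomposed as one of sixty-fours, zero of thirty-twos, zero of sixteens,
one of eights, zero of fours, one of twos and zero of units (that is,
64 + 8 + 2 = 74 in decimal).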
  10001
    111
-----------
  11000
-----------
The procedure is that we begin the addition from the rightmost column.
Adding one and one gives one of twos and zero of units. Hence, we
write 0 under the column and carry over one of twos to the next column
to the left. The process is repeated until the addition is complete.
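
The column-by-column procedure can be sketched as follows. This is a minimal illustration, with the function name `add_binary` and the sample operands 10001 and 111 chosen here as assumptions.

```python
def add_binary(a, b):
    """Add two binary strings column by column, starting from the rightmost column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter number with zeros
    carry = 0
    result = []
    for digit_a, digit_b in zip(reversed(a), reversed(b)):
        total = int(digit_a) + int(digit_b) + carry
        result.append(str(total % 2))   # digit written under the column
        carry = total // 2              # amount carried to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("10001", "111"))   # 17 + 7 in decimal
```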
You know already that in the decimal system, numbers are counted
from zero upwards, in units, tens, hundreds, thousands, and so on.
Hence the first twenty numbers, as well as a few other selected numbers
are shown in Table 5.1
Decimal   Binary
0         0
1         1
2         10
3         11
4         100
5         101
6         110
7         111
8         1000
9         1001
10        1010
16        10000
20        10100
30        11110
32        100000
64        1000000
99        1100011
100       1100100
128       10000000
1 of 128 (or 1 of 2^7) +
1 of 64 (or 1 of 2^6) +
1 of 32 (or 1 of 2^5) +
0 of 16 (or 0 of 2^4) +
1 of 8 (or 1 of 2^3) +
0 of 4 (or 0 of 2^2) +
1 of 2 (or 1 of 2^1) +
1 of 1 (or 1 of 2^0).
Thereafter, add up the binary number equivalent of each line on the right
of the equation, hence:
  10000000 +
   1000000 +
    100000 +
     00000 +
      1000 +
       000 +
        10 +
         1
--------------
  11101011
--------------
(i) Find the highest power of two that can be taken from the given decimal
number with a remainder (zero is a remainder).
(ii) Find the next lower multiples or powers of two, which can be taken from
the remainder in step (i).
(iii) Repeat step (ii) until you cycle through all the multiples or powers of
two lower than that found in step (i).
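
The steps above can be sketched in Python as repeated subtraction of powers of two. The function name `decimal_to_binary` is an assumption for this illustration, and the sketch handles only non-negative whole numbers.

```python
def decimal_to_binary(number):
    """Convert a non-negative whole number to binary by subtracting powers of two."""
    if number == 0:
        return "0"
    # Step (i): find the highest power of two that can be taken from the number.
    power = 1
    while power * 2 <= number:
        power *= 2
    digits = []
    remainder = number
    # Remaining steps: cycle through every lower power of two in turn.
    while power >= 1:
        if power <= remainder:
            digits.append("1")      # this power of two can be taken out
            remainder -= power
        else:
            digits.append("0")      # this power of two cannot be taken out
        power //= 2
    return "".join(digits)

print(decimal_to_binary(99))
print(decimal_to_binary(100))
```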
Re-arranging data
In the last unit we noted that data are symbols that have been used to
describe one or more entities. The data could be just one word, say
'elephant', or a set of words or numbers, such as 'An elephant is a very
big animal, indeed the biggest land animal. It has tusks, and eats grasses.'
You also learned that data can be defined, created and collected in
separately meaningful portions. In particular, you learned how data are
sometimes organized into data tables, data records and data fields.
Let us now look more closely at one such data table (Table 7.4), which shows
the sales of medicines recorded in a register by a sales clerk.
The data are provided as records and fields. How many data records and
data fields are there? There are eight records and six fields respectively.
By the reverse order of the date on which the purchase was made, i.e.,
the reverse of the natural time order.
Although the latter arrangement might be useful for some purpose, the
purpose is not clear. Is it clear to you? Now when you encounter such an
arrangement you probably will initially try to understand why the data
are arranged in that way. However, if you cannot understand, you
become less sure about what meaning the person who arranged the
names of the cities wanted you to obtain from such an unusual
arrangement.
Data modeling and data models
Suppose now that we have created the following data to describe the
different living things:
However, another person might disagree, arguing instead that the five
words should be classified into the following three distinct groups: father,
child (because they are human and members of a family); cat, dog
(because they are both domestic animals who dislike each other); house
(because it is the only non-living object). Furthermore, whereas
someone might say that forest, jackal and owl, should be grouped
together because both jackal and owl live in forests, another person may
argue that dog and jackal should be grouped together because they have
similar features. Everyone will probably be partly right, just like the
proverbial six blind men who went to 'see' (pardon me, touch) an
elephant.
The reason for the differences in the ways that they categorize, classify
or group the words (the data) is that they are viewing, structuring or
modelling the data in different ways. Each of the different groupings of
the data will be appropriate for some occasions and inappropriate for
others. In other words, data can often be categorized or modelled
differently for different purposes.
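
To make the point concrete, the sketch below groups the same five words in two different ways. The attribute names `living` and `kind`, and the values given to them, are hypothetical labels invented for this illustration.

```python
from collections import defaultdict

# Hypothetical attributes attached to the five words discussed in the text.
WORDS = {
    "father": {"living": True, "kind": "human"},
    "child": {"living": True, "kind": "human"},
    "cat": {"living": True, "kind": "domestic animal"},
    "dog": {"living": True, "kind": "domestic animal"},
    "house": {"living": False, "kind": "object"},
}

def group_by(attribute):
    """Group the words by the value each one has for the chosen attribute."""
    groups = defaultdict(list)
    for word, attributes in WORDS.items():
        groups[attributes[attribute]].append(word)
    return dict(groups)

print(group_by("living"))  # two groups: living and non-living
print(group_by("kind"))    # three groups: humans, domestic animals, objects
```

Neither grouping is the correct one; each simply reflects a different way of modelling the same data.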
Types of data structures or models
G1, F1, F2, M1, M2, C1, C2, C3, C4, C5, C6
G1    F1, F2    M1, M2
C1, C2, C3, C4, C5, C6
Suppose however that we know that some of the fathers, mothers and
children are related biologically. We can then use this knowledge to
structure the data differently as follows:
C1 C2 C3 C4 C5 C6
This last grouping and linking of the data records is what is known as the
hierarchical method of modelling data. In other words, a parent-child
hierarchy is established among the data. Hence, data record G1 is the
parent of data records F1 and M1. In turn, data record F1 is the parent of
data record C1, whereas data record M1 is the parent of data records C2,
C3, and C4. And so on.
The idea behind linking data or data records in a hierarchy is that, once
they are so linked, we can get to the data for a child from the data of the
father or mother.
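
A minimal sketch of such a parent-child hierarchy is shown below. It includes only the links named in the text (G1 to F1 and M1, F1 to C1, M1 to C2, C3 and C4); the dictionary layout and the function name `descendants` are assumptions made for illustration.

```python
# Parent-to-children links, limited to the ones named in the text.
CHILDREN = {
    "G1": ["F1", "M1"],
    "F1": ["C1"],
    "M1": ["C2", "C3", "C4"],
}

def descendants(record):
    """Return every record reachable from `record` by following parent-child links."""
    found = []
    for child in CHILDREN.get(record, []):
        found.append(child)
        found.extend(descendants(child))
    return found

print(descendants("M1"))   # the children of M1
print(descendants("G1"))   # everything below G1 in the hierarchy
```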
Firstly, we may want to regard each piece of data as separate data in its
own right. We can then go on to arrange the data alphabetically, as
explained above. However, we can go further to first group them into
meaningful or useful categories, and then arrange the data in each
category hierarchically from smallest place to largest place, as follows:
Plot 12
Of course, this is how addresses are usually written - from the
smallest place to the largest place. In other words, the addresses we
often see on envelopes are actually inverse hierarchical arrangements of
data, comprising the name of a person, who is living in a particular
house, on a particular street, in a particular town, in a particular country,
of the world. In fact, data on people and places in the world can be
arranged in a huge hierarchy beginning with data on the world, then
data on individual countries, then data on cities, towns and villages in
the countries, then data on streets in the cities, towns and villages, then
data on houses on the streets, and finally data on persons in the houses.
Data collection
Different data collection methods and instruments are used for data
collection. The major methods are direct observation, interviews, and
the administration of forms and questionnaires. You have probably used
or heard about some of the common instruments that are used by
natural scientists and technologists for collecting data in laboratories or
workshops, including calipers, scales, thermometers, voltmeters, etc. But
there are numerous other more sophisticated instruments. For example,
ultrasound equipment is used by medical scientists and technologists to
collect data about parts of a human body. X-ray equipment is also used
to obtain images of objects. The reason for these instruments is that
science and scientists aim for precise ways of measuring and recording
data about variables and constants. Hence, more accurate and
sophisticated instruments continue to be invented.
Data quality control refers to the processes and methods by which the
accuracy, validity and reliability of data are ensured at the different stages
of the data management cycle. (The data management cycle was first
explained in Unit 2). The aim of data quality control is to ensure that the
data that are created or collected, stored, processed, communicated and
used by an information system meet the system's minimum standards
of quality.
Data quality control aims to ensure that data are accurate, valid and
reliable. Each information system will often specify and work toward
achieving acceptable standards of data accuracy, validity and reliability.
To achieve this, information systems usually use different strategies and
methods for ensuring the quality of their data.
Three words have so far been used repeatedly - accuracy, validity and
reliability. You may now be wondering what these words mean. So let
me explain them.
Accuracy of data
Before I explain the meaning of data accuracy, you should recall from
Units 1 and 8 that data are symbols that have been used to describe or
express information about one or more entities. Entities could be
persons, objects, events, ideas, or even the attributes of the persons,
objects, events, ideas, etc. The word 'variable' is often also used to describe
an entity that varies from one situation to another, say the marks
obtained by different students in a course, or the names of some children.
Suppose for instance, that a thrown stone actually hits a man on the
nose, but a witness says that "the stone hit the man in the face". Of
course, a face includes the nose, so the witness is not telling a lie.
However, by mentioning 'a face' instead of 'a nose' the witness is not
being precise or exact. Hence, the data "the stone hit the victim in the
face" is not a very accurate description of the incident, although it is true.
Such imprecise data may be deemed adequately accurate or inaccurate
for different purposes.
Validity of data
The data will be considered valid if Abu is actually or truly older than
John. Conversely, the data will not be valid if Abu is not actually or truly
older than John. In other words, data that expresses untrue information
about an entity is invalid as far as that particular entity is concerned.
For the second example, suppose that a person actually eats bean stew
once every week. If the person, in response to a question, says 'four
times per month', the data may not be valid. This is because 'four times
per month' does not convey the same information as 'once per week'. A
person who eats bean stew once per week actually eats it regularly once
every seven days. However, a person who eats the food four times per
month might eat the food on four consecutive days in a month, or in
some other sequence of four days different from 'once per week'. This second
example illustrates the fact that the validity of data decreases as the data
becomes less and less accurate.
You will recall from the previous unit that instruments for measuring
and recording data include not only physical or mechanical instruments
such as tape rules, weight scales and voltmeters, but also social survey
questionnaires and educational achievement tests. In other words,
questionnaires and tests can also be described as accurate/inaccurate
and/or reliable/unreliable. Hence, a question in a questionnaire that
asks respondents about when they usually wake up in the morning will
be considered to be a very reliable instrument if each respondent
provides the same waking time each and every time the question is
asked. Also, a whole questionnaire will be considered to be highly
reliable if more or less the same data can be collected with the
questionnaire each time it is used. Finally, an examination test in a
subject will be considered to be a reliable instrument for determining
the achievement of students in that subject only if the students'
performance will be more or less the same if they take the test on two
or more occasions.
The qualities of data are affected by the quality of the instruments that
are used to measure, record or create data. Secondly, a good instrument
used in a wrong manner often also leads to data of low quality. Thirdly,
low data quality can result from errors made in recording the data that
have been measured with a good instrument. Finally, errors can also arise
when data are copied or transferred from one medium (say, paper) to
another medium (say, paper or computer). In other words, quality
instruments and procedures are required to ensure that quality data are
created or collected.
Data quality control can be performed before, during and after data are
collected. As explained above, one of the main tasks in planning for data
collection is to design effective data collection instruments and
procedures. Data quality can be controlled before data are actually
collected by implementing strategies for producing valid, accurate and
reliable data collection instruments and procedures. Planned and pre-
tested instruments and procedures are more likely to generate good
quality data than unplanned and untested instruments. Strategies for
controlling data quality before data are collected are explained in section
11.5.
Finally, data quality can be controlled after data have been collected by
cross checking the data for errors. This can be done by humans and/or
machines. Strategies for controlling data quality after data are created or
collected are explained in section 11.7.
The human brain was used to store data and information long before paper
was invented. However, the human brain provides a store not only for data,
but also knowledge.
You will recall from Unit 1 that knowledge is the extent of familiarity
possessed by a person with certain facts, truths, principles or subjects.
Knowledge consists of facts, truths and ideas that fit together to form a
coherent and meaningful whole. A person's overall knowledge influences
how he perceives the world around him. In other words, knowledge is used
by people to interpret and evaluate new information. Knowledge serves as a
pool from which people can extract specific truths, facts, ideas for
informing or instructing other people. Data are stored in human brains as
part of knowledge. Data are also created from a person's knowledge when
the person uses symbols to express information extracted from the
knowledge.
Nobody knows for sure how data and knowledge are organized and stored
in the human brain. Psychologists claim however that human beings store
data in either short-term or long-term memory. Data in short term
memories are stored for a short time, and are lost unless committed to long
term memory. Among the strategies used by people to store data in long
term memory is to say, read or write the data repeatedly. Is this not how
you commit data to memory? If not, how do you do it? There are of course
many other methods which we cannot explain here.
Paper
Paper became a medium for storing data when papyrus was invented. Data
were initially stored on paper by writing in longhand until printing was
invented. Data are still mostly recorded on paper in the form of published
or unpublished, as well as printed, typed and hand-written documents.
Among the well known paper-based data and information sources are
books, newspapers, journals, technical reports, correspondence, etc.
Microforms
Microforms is the general word used to describe all miniature but non-
computerized storage media such as film rolls, film slides, microfiche, etc.
Data are stored in these media as miniature or microscopic images, hence,
the name microforms. The data are stored on such media by photographing
or scanning pages of paper or computer documents, and then transferring
the images onto film rolls, film slides or microfiche. The major advantage of
microforms is that they require much less space to store than paper.
However, special equipment such as reading glasses or lenses, film projectors,
microfiche readers, etc., are required to access the data in these media.
Computer media
Data are increasingly being stored on computer media. Computer media
include tapes, disks, diskettes, compact disks (CDs), smart cards, mobile
phone recharge cards, etc. Data are stored on computer media as data files.
A data file on a computer medium is any collection of data stored under a
single name. The data might comprise alphabetical, numerical or special
characters, or digitized images. (You may review Unit 5 for an explanation of
how computers store digitized images). The data in the file may also be
subdivided into data records and fields, as you learned in Units 7 and 8.
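
As an illustration of a data file subdivided into records and fields, the sketch below reads a small comma-separated file with Python's standard `csv` module. The field names and the two records are invented for this example, and the file is held in memory rather than on a disk.

```python
import csv
import io

# A small in-memory "data file"; the field names and values are invented.
file_contents = (
    "surname,first_name,year_of_birth\n"
    "Bello,Amina,1990\n"
    "Okoro,Chidi,1988\n"
)

reader = csv.DictReader(io.StringIO(file_contents))
records = list(reader)          # each row of the file is one data record

print(reader.fieldnames)        # the fields shared by every record
for record in records:
    print(record["surname"], record["year_of_birth"])
```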
The gramophone record was a popular medium for storing sound data before
the arrival of audio and video tapes and CDs. Tapes and CDs are now used
for storing sound, image and voice data. You have most probably listened to
a Michael Jackson tape or CD before, but tapes and CDs are now used as
media for publishing books, dictionaries and encyclopedias. Indeed, some
of the study materials in some of your NOU courses might be provided on
audio or video tapes or CDs.
Let us now focus on just one organization, say the National Open
University of Nigeria. To function properly, the university will need to
collect, store and process data on many types of entities, including its
employees, equipment, buildings, vehicles, activities, sales and purchases,
projects, letters received, letters dispatched, office files, students, courses,
examinations, graduation ceremonies, student associations, books in the
libraries, study centres, tutors, tutorials, etc.
Now, notice that each type of entity is in the plural, meaning that for each
type of entity there will be many instances or members of that entity. For
example, there will be many employees, students, examinations, study
centres, etc.
Why a database?
Data on each of the entities that were listed for the university in the
previous section will invariably accumulate over time. As the years pass by,
the volume of data collected and accumulated on each entity by the
university will increase. Now, unless the university finds a way of
organizing and storing the data, the data will grow into a mass of
unorganized data from which it will be difficult or impossible to find
specific data. If that happens the university will not be able to locate, say,
data on specific students who graduated some years previously. What do
you think will happen then? One possibility is that some people might
claim that they graduated from the university, and the university will find it
very difficult or impossible to confirm their claims.
This is the reason why the data created and collected by an organization,
and which often accumulate over time, must be properly organized for
storage and retrieval. Before computers became popular most organizations
created, collected and stored data in paper documents - registers, forms,
sheets, letters, printed reports, hand written memos, etc. Organizations
also created office files, file cabinets and record centres for managing the
data in paper documents. Eventually, paper files became too voluminous
and demanded expensive storage space. So organizations used microfilm
and other types of microforms to store some of the data, particularly those
not needed frequently. They did this by filming their documents and
keeping the images on the microforms.
What is a database?
Your review of Unit 7 will have refreshed you with the following facts:
You have already learned about the concepts of database, DBMS, and data
tables, records, and fields. You now need to understand a few more.
Form:
This refers to a pre-defined format for entering data into one or more data
tables in a database. A DBMS can usually be used to design and display a
form on the computer screen to enable data to be entered into the records
of the table. For example, shown in Table 13.2 is a form that may be
designed and used to enter data into records of the table in Table 13.1. Such
a form can be used to enter data for each book, or to display the data for
each book.
View:
Query:
Display from the Books table the author and title of books published by
'ABC Publishers'.
You will note that a query comprises three main parts, as broken down
below:
The first line is the command to the DBMS telling it to display data from
the books table. The second line indicates the fields to be displayed. Finally,
the third line specifies the criteria or condition to be used by the DBMS to
determine whether a record should be displayed. Queries are sometimes
pre-defined and stored in the database.
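
In many DBMSs such a query is written in SQL. The sketch below uses Python's built-in `sqlite3` module as a stand-in DBMS; the choice of SQLite, the column names and the three sample books are assumptions made for illustration. Note that SQL lists the fields first and the table second, whereas the breakdown above names the table first.

```python
import sqlite3

connection = sqlite3.connect(":memory:")   # a temporary database held in memory
connection.execute("CREATE TABLE books (author TEXT, title TEXT, publisher TEXT)")
connection.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [
        ("A. Author", "First Title", "ABC Publishers"),
        ("B. Writer", "Second Title", "XYZ Press"),
        ("C. Scholar", "Third Title", "ABC Publishers"),
    ],
)

query = (
    "SELECT author, title "                # the fields to be displayed
    "FROM books "                          # the table to take the data from
    "WHERE publisher = 'ABC Publishers'"   # the condition a record must meet
)
rows = connection.execute(query).fetchall()
print(rows)
```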
Conclusion
1. Creating a database.
2. Creating data tables.
3. Updating records in data tables.
4. Sorting the records in the tables.
5. Creating and using indexes to tables.
6. Displaying records and fields from tables.
Creating a database
Creating a database is the first task that you must perform when you begin
to use a DBMS to create and manage a database. The database is created
initially as an empty or blank container into which tables, queries, forms,
reports, etc. will be stored or saved.
Hence, the process of creating the database is very simple: you only need to
tell the DBMS three things: the type of database to be created, the name
you want to call the database (i.e., file name of the database), and where (i.e.,
in which folder) the database should be stored on a computer medium.
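
The three decisions can be seen in a short sketch that uses Python's built-in `sqlite3` module. Choosing SQLite stands in for the type of database, and the file name `library.db` and the temporary folder are assumptions made for this illustration.

```python
import os
import sqlite3
import tempfile

folder = tempfile.mkdtemp()                    # where the database is stored
file_name = "library.db"                       # the name given to the database
database_path = os.path.join(folder, file_name)

# Connecting to a path that does not yet exist gives a new, empty database.
connection = sqlite3.connect(database_path)
tables = connection.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
connection.close()

print(database_path)
print(tables)   # no tables yet: the database is an empty container
```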
Creating data tables
2. Using a DBMS to create and save the designed record structure in the
database.
Secondly, recall that for each entity you need to determine the different
attributes for which data will be stored, so that appropriate fields can be
created in the corresponding table.
Now, and thirdly, for each field that you decide to include in the table for an
entity, you need to determine its:
Field name;
Field type; and
Field width.
Field name (or Field label) refers to the name, label or heading for a field.
A field name should be one that reveals the type of data stored in the field
(e.g., 'Birth date' if the field is to contain the birth dates of people).
Field type (or Data type) refers to the type of data that can validly be
entered into the field. Field types include:
Character (or Text): for a field that will contain different types of characters
- alphabetical, numeric, special.
Number (or Numeric): for a field that will contain only numbers.
Date: for a field that will contain dates, such as birth dates, dates of
appointments, or dates of sales.
Yes/No (or Logical): for a field that will contain either 'Yes' or 'No' data, or
'True' or 'False' data. For example, if a field of a table is named 'Passed
English', valid data for the field will be 'Yes' or 'No'.
Field width (or Field size) refers to the number of character spaces that will
be provided for entering data in the field. An alphabetical character or a
numeric digit occupies one character space. Hence, if you specify a
field width of 10 for a field named 'Surname', the DBMS will allow you to
enter only ten characters for each surname in the field. Of course, with such a
small field width, you will run into space problems if you must enter long
surnames, such as 'Abiola-Thompson', in the field. The 10 spaces will be
enough for entering only 'Abiola-Tho'. On the other hand, you will be
wasting storage space on computer media if you specify an unnecessarily
large field width for a field.
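The three field properties can be sketched with SQLite through Python's sqlite3 module. This is an illustration under stated assumptions, not the method of any particular DBMS: the table name student and its field names are hypothetical, SQLite has no dedicated Date or Yes/No type (so TEXT and a 0/1 INTEGER stand in for them), and SQLite does not enforce a field width by itself, so a CHECK constraint is used here to imitate one.

```python
import sqlite3

FIELD_WIDTH = 10  # width chosen for the 'Surname' field, as in the text

connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE student (
        student_number INTEGER PRIMARY KEY,                  -- Number field
        surname        TEXT CHECK (length(surname) <= 10),   -- Text field, width 10
        birth_date     TEXT,                                 -- Date stored as text
        passed_english INTEGER CHECK (passed_english IN (0, 1))  -- Yes/No field
    )
    """
)

long_surname = "Abiola-Thompson"
# Only the first ten characters would fit in the field.
truncated = long_surname[:FIELD_WIDTH]

try:
    connection.execute(
        "INSERT INTO student (student_number, surname) VALUES (?, ?)",
        (1, long_surname),
    )
    rejected = False
except sqlite3.IntegrityError:
    # The CHECK constraint refuses a surname longer than the field width.
    rejected = True

print(truncated, rejected)
```

Note the design difference: this sketch rejects an over-long surname outright, whereas the DBMS described above simply stops accepting characters after the tenth.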
Conclusion
Data and information retrieval provides a meeting point between data and
information, and also between data and information creators and
information searchers. On the one hand, people create data to express
information. Such data are invariably recorded, organized and stored on
paper, computer and other media using different kinds of strategies. For
example, data are usually organized in textbooks into chapters, sections,
sub-sections and paragraphs. Tables of contents and indexes are also
provided. Data are often also organized into the tables of a database, with
each table organized into records and fields. On the other hand, there are
information searchers who want specific data or information to use for
various purposes, and are willing to search for information from various
media and data stores. The important question is whether searchers will be
able to obtain the data and information that they need by searching the
appropriate data stores.
In order to search for, and retrieve, information from a textbook, you are
most likely to go through the following process:
You will, first of all, need to determine and describe the type of information
that you want to find in the textbook. This may be a particular single
word or phrase, paragraph, table, or any useful information on a topic or
subject. You will then describe what you are looking for with some data, such
as 'water yam', or 'Chapter two', or 'Figure 2.3', or 'politics and corruption
in Nigeria'. In information retrieval, what you want to find in a data store,
and which you describe with data (such as 'water yam'), is referred to as
your search term.
Let us assume that you have decided to search for your search term in
the index. However, you still need to make up your mind as to how
you will decide if a word or phrase in the index matches your search
term. You might be looking for words or phrases in the index that match
your search term exactly, or for words or phrases that are only
approximately close to your search term. This decision defines your
search criteria. Your search criteria are the conditions that must be
met before you will accept that a word or phrase in the index is likely to
lead you to useful data and information in the textbook. For instance,
suppose 'water yam' is your search term. Your search criteria might then be
any one of the following:
Notice that the first of the above criteria is the most specific, followed by
the second. The third criterion is the least specific. The more specific your
search criteria, the lower will be your chances of finding words or phrases
that satisfy the criteria. Conversely, the less specific your search criteria,
the higher will be your chances of finding words or phrases that satisfy the
criteria. For instance, if (c) were your search criterion, you would accept and
follow up on the following words when you come across them in the index:
yam, white yam, coco-yam, potato, cassava, and carrot.
In this step, you will use both your search term and your search criteria to
browse through the index. You will do this by inspecting words and phrases in
the index, and then deciding for each one whether it satisfies your search
criteria. Of course, being a human being, you might miss some words or phrases
as you browse. You might also change your search term or search criteria as you
browse. If you are lucky, you will find words or phrases that satisfy your
search criteria. You will then note the page numbers corresponding to the
words or phrases that you have found. Finally, you will refer to the various
pages in the textbook, and locate where the words or phrases occur on each
page.
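The browsing step can be sketched in Python as follows. Everything in this sketch is an assumption made for illustration: the index entries, the page numbers, and the two criteria (an exact match, and a looser match on the single word 'yam') are hypothetical and are not the criteria listed in the course text.

```python
# A hypothetical textbook index: each entry maps to the pages where it occurs.
INDEX = {
    "cassava": [88],
    "coco-yam": [45],
    "water yam": [41, 43],
    "white yam": [40],
    "yam": [38, 39],
}


def exact_match(term):
    """Criterion: the index entry must equal the search term."""
    return lambda entry: entry.lower() == term.lower()


def contains_word(word):
    """Looser criterion: the index entry must contain the given word."""
    return lambda entry: word.lower() in entry.lower().replace("-", " ").split()


def search_index(index, criterion):
    """Inspect every entry and keep the ones that satisfy the criterion."""
    return {entry: pages for entry, pages in index.items() if criterion(entry)}


specific = search_index(INDEX, exact_match("water yam"))
loose = search_index(INDEX, contains_word("yam"))

print(specific)
print(sorted(loose))
```

The more specific criterion finds a single entry, while the looser one finds four, which mirrors the trade-off between specificity and the chance of finding a match. Unlike a human reader, the program does not miss entries as it browses.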
Next, you will assess or evaluate the data and information that you have
found on the various pages. You do this usually by noting and evaluating
the other words, phrases and sentences associated with your search term in
the textbook. For example, suppose your search term is 'white yam', and
you have found it in a particular paragraph of the textbook. You will usually
read the paragraph, as well as other nearby paragraphs toward gaining
information about your search term.
Importance of data to organizations
Let us now focus on just one organization, say the National Open
University of Nigeria. To function properly, the university will need to
collect, store and process data on many types of entities, including its
employees, equipment, buildings, vehicles, activities, sales and purchases,
projects, letters received, letters dispatched, office files, students, courses,
examinations, graduation ceremonies, student associations, books in the
libraries, study centres, tutors, tutorials, etc.
Now, notice that each type of entity is in the plural, meaning that for each
type of entity there will be many instances or members of that entity. For
example, there will be many employees, students, examinations, study
centres, etc.
Updating a data table
Data entry into a table can be performed only after the table's record
structure has been defined and saved in the database. Data entry may be
performed as the data become available, for example, as a new applicant
submits an application letter or a completed application form. Often the
data are entered in batches of, say, ten or fifty records at a time. The data
might also be captured into the records of a table using various computer
input devices, such as scanners, cameras and microphones.
DBMS software usually provides a form on the computer screen that can be
used to enter data into the records of a table. Data are keyed or scanned
into the form in much the same way as one would complete a paper form. The
software often also provides automatic checking of the data as they are
keyed in. For example, if a data entry clerk attempted to enter alphabetical
data in a field that is expected to contain only numeric data, the computer
would beep a warning and reject the data.
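The automatic check described above can be sketched in Python. This is a minimal illustration, not the behaviour of any specific DBMS: the function names and the sample batch of student numbers are hypothetical, and the "beep" is represented by collecting the rejected entries.

```python
def check_numeric_entry(raw_value: str) -> int:
    """Accept the entry only if it consists entirely of decimal digits."""
    text = raw_value.strip()
    if not text.isdecimal():
        raise ValueError(f"rejected non-numeric entry: {raw_value!r}")
    return int(text)


def enter_batch(raw_values):
    """Key in a batch of entries, separating accepted from rejected data."""
    accepted, rejected = [], []
    for raw in raw_values:
        try:
            accepted.append(check_numeric_entry(raw))
        except ValueError:
            rejected.append(raw)
    return accepted, rejected


# Hypothetical batch: the second entry contains the letter O instead of a zero.
accepted, rejected = enter_batch(["1001", "10O2", "1003"])
print(accepted, rejected)
```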
You learned about views of a data table in the previous unit (in section 13.7).
To refresh your memory, a view is a way of displaying some or all the
records and fields in a data table. Accordingly, each different sorting of the
records in a data table is a different view of the table.
The DBMS usually arranges the records in a table automatically in the order
of the data in the primary key field. For example, if 'Student Number' is the
primary key field of a table, the records in the table will be automatically
arranged in order of student numbers any time records are added to,
modified in, or deleted from the table.
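The idea that each sorting of the records is a different view can be sketched with SQLite through Python's sqlite3 module. The table, the student numbers and the surnames below are hypothetical sample data. The sketch states the ordering explicitly with ORDER BY rather than relying on any automatic ordering, since that is the only ordering SQL guarantees.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, surname TEXT)"
)
# Records are deliberately entered out of primary key order.
connection.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1003, "Okafor"), (1001, "Yusuf"), (1002, "Bello")],
)

# View 1: records in order of the primary key field.
by_key = connection.execute(
    "SELECT student_number, surname FROM student ORDER BY student_number"
).fetchall()

# View 2: the same records sorted by surname instead.
by_surname = connection.execute(
    "SELECT student_number, surname FROM student ORDER BY surname"
).fetchall()

print(by_key)
print(by_surname)
```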
After data are entered into the records of a table, the DBMS can be used to
display the records.
Records can usually be displayed in two ways: the datasheet method and
the form method. In the datasheet method, the DBMS uses the table format
to display the records, that is, in the form of rows and columns of data. As
many of the records and fields as can fit into a display window are displayed
at a time, and scroll bars and cursor keys can be used to display more records
or fields as desired. In the form method, one record is displayed at a
time and in a form, as shown in Table 14.3.
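The two display methods can be contrasted with a small Python sketch. The field names and records are hypothetical sample data, and the layout is only one plausible rendering of each method, not a reproduction of Table 14.3.

```python
FIELDS = ["Student Number", "Surname"]
RECORDS = [(1001, "Yusuf"), (1002, "Bello")]


def datasheet_view(fields, records):
    """Show all records as rows and columns, one record per row."""
    widths = [
        max([len(field)] + [len(str(row[i])) for row in records])
        for i, field in enumerate(fields)
    ]

    def format_row(values):
        return " | ".join(str(v).ljust(w) for v, w in zip(values, widths))

    lines = [format_row(fields)]
    lines.extend(format_row(row) for row in records)
    return "\n".join(lines)


def form_view(fields, record):
    """Show a single record, one labelled field per line."""
    return "\n".join(f"{field}: {value}" for field, value in zip(fields, record))


print(datasheet_view(FIELDS, RECORDS))
print()
print(form_view(FIELDS, RECORDS[0]))
```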
Conclusion
Creating and using databases is a key aspect of data organization and
management. The reason is that all types of data often must be organized
and stored, either temporarily or permanently. Such data can be stored on both
computer and non-computer (e.g., paper) media. However, computer
databases provide a platform for structuring data into tables, records and
fields, and for creating different queries, views, forms, reports, etc., for
updating, displaying and printing the data in the tables.