CSC 214 Week 9
Introduction
Having gone through the previous units, you should have a basic knowledge of computer file
processing. The aim of this unit is to introduce the file management system and the file
system architecture, stating their objectives and functions.
Learning Outcomes
Main Content
File Management System
The file management system (FMS) is the subsystem of an operating system that manages the
organization of data on secondary storage and provides processes with services related to data
access. In this sense, it interfaces application programs with the low-level media-I/O (e.g.
disk I/O) subsystem, freeing application programmers from having to deal with low-level
intricacies and allowing them to implement I/O using convenient data-organization abstractions
such as files and records. On the other hand, FMS services are often the only way through
which applications can access the data stored in files, thus achieving an encapsulation of the
data that can be usefully exploited for the purposes of data protection, maintenance and control.
Typically, the only way that a user or application may access files is through the file
management system. This relieves the user or programmer of the necessity of developing
special-purpose software for each application and provides the system with a consistent, well-
defined means of controlling its most important asset.
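For illustration, here is a minimal Python sketch (the file name employees.dat and the record layout are hypothetical) of an application that works entirely through the file abstraction provided to it, never touching disk blocks or device-level I/O:

```python
# Minimal sketch: an application relies on the file/record abstractions the
# system provides instead of programming low-level disk I/O itself.
# The file name and record format below are hypothetical.
import json

RECORDS = "employees.dat"

def append_record(record: dict) -> None:
    """Append one record; block allocation and disk layout are handled for us."""
    with open(RECORDS, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

def read_records() -> list:
    """Read every record back; the file name is resolved to storage locations for us."""
    with open(RECORDS, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

if __name__ == "__main__":
    append_record({"emp_no": 1, "name": "Ada"})
    print(read_records())
```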
Objectives of File Management System
The objectives of a File Management System are as follows:
1. Data Management. An FMS should provide data management services to applications
through convenient abstractions, simplifying and making device-independent the
common operations involved in data access and modification.
2. Generality with respect to storage devices. The FMS data abstractions and access
methods should remain unchanged irrespective of the devices involved in data storage.
3. Validity. An FMS should guarantee that at any given moment the stored data reflect the
operations performed on them, regardless of the time delays involved in actually
performing those operations. Appropriate access synchronization mechanisms should be
used to enforce validity when multiple accesses from independent processes are possible.
4. Protection. Illegal or potentially dangerous operations on the data should be controlled
by the FMS by enforcing a well-defined data protection policy.
5. Concurrency. In multiprogramming systems, concurrent access to the data should be
allowed with minimal differences with respect to single-process access, save for access
synchronization enforcement (a file-locking sketch follows this list).
6. Performance. The above functionalities should be offered while at the same time achieving
a good compromise in terms of data access speed and data transfer rate.
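To make the validity and concurrency objectives concrete, the following is a minimal Python sketch, assuming a POSIX system (the fcntl module is not available on Windows). It illustrates access synchronization with an advisory file lock in general terms; it is not the mechanism of any particular FMS.

```python
# Minimal sketch, POSIX only: serialize concurrent writers with an advisory
# exclusive lock so their updates cannot interleave and invalidate the file.
import fcntl

def append_line_exclusively(path: str, line: str) -> None:
    """Hold an exclusive advisory lock for the duration of the update."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # block until no other process holds the lock
        try:
            f.write(line + "\n")
            f.flush()                   # push the data towards stable storage
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

if __name__ == "__main__":
    append_line_exclusively("shared.log", "record written under an exclusive lock")
```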
Summary
The file management system is the subsystem of an operating system that manages the
organization of data on secondary storage and provides processes with services related to data
access. In this sense, it interfaces application programs with the low-level media-I/O (e.g.
disk I/O) subsystem. The objectives of a file management system include: data management,
generality with respect to storage devices, validity, protection, concurrency and performance.
Self-Assessment Questions
Tutor-Marked Assignment
1. Explain any four objectives of a file management system that you know.
2. Explain the lowest level of the file architecture.
3. What level in the file architecture is closest to the user programs?
References
Retrieved on 5th February 2019 from www.tn.nic.intnhomeprojectfilesfilemgmnt
Further Reading
https://www.globodox.com/file-management system
https://www.canto.com/file-management system
UNIT 2 Data Management Facilities
Introduction
Data management, an aspect often neglected in file processing, is briefly described in this unit.
The reader should therefore have background knowledge of computer files and the operations
performed on a file to achieve a good understanding of this unit.
Learning Outcomes
Main Content
Data Management
Data management includes all aspects of data planning, handling, analysis,
documentation and storage, and takes place during all stages of a study. The objective is to create
a reliable database containing high-quality data. Data management is a too often neglected part
of study design, and includes:
(a) Planning the data needs of the study
(b) Data collection
(c) Data entry
(d) Data validation and checking
(e) Data manipulation
(f) Data files backup
(g) Data documentation
Each of these processes requires thought and time; each requires painstaking attention to detail.
The main elements of data management are database files.
Database files contain text, numerical values, images, and other data in machine-readable form. Such
files should be viewed as part of a database management system (DBMS), which allows for a
broad range of data functions, including data entry, checking, updating, documentation, and
analysis.
Data processing errors are errors that occur after data have been collected. Examples of
data processing errors include:
1. Transpositions (e.g., 19 becomes 91 during data entry)
2. Copying errors (e.g., 0 (zero) becomes O during data entry)
3. Coding errors (e.g., a racial group gets improperly coded because of changes in
the coding scheme)
4. Routing errors (e.g., the interviewer asks the wrong question or asks questions in
the wrong order)
5. Consistency errors (contradictory responses, such as the reporting of a
hysterectomy after the respondent has identified himself as a male)
6. Range errors (responses outside of the range of plausible answers, such as a
reported age of 290)
To prevent such errors, you must identify the stage at which they occur and correct the
problem. Useful checks at each stage include:
1. Manual checks during data collection (e.g., checks for completeness, handwriting
legibility)
2. Range and consistency checking during data entry (e.g., preventing impossible results,
such as ages greater than 110; see the sketch after this list)
3. Double entry and validation following data entry
4. Screening for outliers during data analysis
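The following Python sketch (the field names and limits are illustrative and are not EpiData's) shows how range and consistency checks of this kind can be applied to each record as it is entered:

```python
# Illustrative range and consistency checks applied to a single record.
# Field names and limits are assumptions, not part of any particular tool.
def check_record(record: dict) -> list:
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    age = record.get("age", -1)          # a missing age is treated as invalid
    # Range check: reject implausible ages (e.g., a reported age of 290).
    if not 0 <= age <= 110:
        errors.append(f"age out of range: {age}")
    # Consistency check: a hysterectomy cannot be reported for a male respondent.
    if record.get("sex") == "male" and record.get("hysterectomy") == "yes":
        errors.append("inconsistent: hysterectomy reported for a male respondent")
    return errors

if __name__ == "__main__":
    print(check_record({"age": 290, "sex": "male", "hysterectomy": "yes"}))
```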
EpiData provides a range and consistency checking program and allows for double entry and
validation, as demonstrated in the accompanying lab.
Advantages of DBMS
1) Centralized Control of Data:
The organization can exert, via the DBA, centralized management and control over the data.
The database administrator is the focus of this centralized control. Any application requiring a
change in the structure of a data record requires an arrangement with the DBA, who makes the
necessary modification. Such modifications do not affect other applications or users of the record
in question.
2) Reduction of Redundancies:
Centralized control of data by the DBA avoids unnecessary duplication of data and effectively
reduces the total amount of data storage required. It also eliminates the extra processing
necessary to trace the required data in a large mass of data. Another advantage of avoiding
duplication is the elimination of the inconsistencies that tend to be present in redundant data
files. Any redundancies that exist in the DBMS are controlled and the system ensures that these
multiple copies are consistent.
3) Shared Data:
A database allows the sharing of data under its control by any number of application programs or
users, e.g. the applications for the public relations and payroll departments could share the data
contained in the record type EMPLOYEE.
4) Integrity:
Centralized control can also ensure that adequate checks are incorporated in the DBMS to
provide data integrity. Data integrity means that the data contained in the database is both
accurate and consistent. Therefore, data values being entered for storage could be checked to
ensure that they fall within a specified range and are of the correct format, e.g. the value for the
age of an employee may be required to lie in the range 16 to 65.
Another integrity check that should be incorporated in the database is to ensure that if there is a
reference to a certain object, that object must exist. In the case of an automatic teller machine, for
example, a user is not allowed to transfer funds from a nonexistent savings account to a checking
account.
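As a sketch of how a DBMS can enforce both kinds of check, the example below uses Python's built-in sqlite3 module (the table and column names are hypothetical): a range constraint on an employee's age, and a referential constraint so that a transfer may only name accounts that exist.

```python
# Sketch of declarative integrity checks in SQLite (hypothetical schema):
# a range check on age and a referential check on account numbers.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable referential checks in SQLite
conn.execute("CREATE TABLE employee ("
             " emp_no INTEGER PRIMARY KEY,"
             " age INTEGER CHECK (age BETWEEN 16 AND 65))")
conn.execute("CREATE TABLE account (acct_no INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE transfer ("
             " from_acct INTEGER REFERENCES account(acct_no),"
             " to_acct INTEGER REFERENCES account(acct_no),"
             " amount REAL CHECK (amount > 0))")
conn.execute("INSERT INTO account VALUES (1)")

try:
    conn.execute("INSERT INTO employee VALUES (1, 290)")       # age outside 16-65
except sqlite3.IntegrityError as e:
    print("range check rejected the row:", e)

try:
    conn.execute("INSERT INTO transfer VALUES (99, 1, 50.0)")  # account 99 does not exist
except sqlite3.IntegrityError as e:
    print("referential check rejected the row:", e)
```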
5) Security
Data is of vital importance to an organization and may be confidential. Such confidential data
must not be accessed by unauthorized persons.
The DBA who has the ultimate responsibility for the data in the DBMS can ensure that proper
access procedures are followed, including proper authentication schemes for access to the DBMS
and additional checks before permitting access to sensitive data.
Different levels of security could be implemented for various types of data and operations. The
enforcement of security could be data-value dependent (e.g. a manager has access to the salary
details of the employees in his or her department only) as well as data-type dependent (e.g. the
manager cannot access the medical history of any employee, including those in his or her
department).
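The short Python sketch below (the roles, departments and field names are hypothetical) illustrates the difference between the two kinds of rule; it is a conceptual illustration only, not how any particular DBMS implements its security checks.

```python
# Conceptual sketch of value-dependent and type-dependent access rules.
# Roles, departments and field names are assumptions for illustration.
def may_read(user: dict, record: dict, field: str) -> bool:
    if user["role"] != "manager":
        return False
    if field == "medical_history":      # data-type dependent: never visible to managers
        return False
    if field == "salary":               # data-value dependent: own department only
        return record["department"] == user["department"]
    return True

if __name__ == "__main__":
    manager = {"role": "manager", "department": "payroll"}
    employee = {"name": "Ada", "department": "payroll", "salary": 90000}
    print(may_read(manager, employee, "salary"))           # True: same department
    print(may_read(manager, employee, "medical_history"))  # False: type-dependent rule
```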
6) Conflict Resolution
Since the database is under the control of the DBA, he or she can resolve the conflicting
requirements of various users and applications. In essence, the DBA chooses the best file
structure and access method to get optimal performance for the response-critical applications,
while permitting less critical applications to continue to use the database, with a relatively
slower response.
Disadvantages of DBMS
1) A significant disadvantage of a DBMS is its cost.
2) In addition to cost of purchasing or developing the software, the hardware has to be upgraded
to allow for the extensive programs and work spaces required for their execution and storage.
3) The processing overhead introduced by the DBMS to implement security, integrity and
sharing of the data causes a degradation of the response and throughput times.
4) An additional cost is that of migration from a traditionally separate application environment to
an integrated one.
5) While centralization reduces duplication, the lack of duplication requires that the database be
adequately backed up so that in the case of failure the data can be recovered. Backup and
recovery operations are fairly complex in the DBMS environment. Furthermore, a database
system requires a certain amount of controlled redundancies and duplication to enable access to
related data items.
6) Centralization also means that the data is accessible from a single source, namely the database.
This increases the potential severity of security breaches and disruption of the operation of the
organization because of downtimes and failures.
7) The replacement of a monolithic centralized database by a federation of independent and
cooperating distributed databases resolves some of the problems resulting from failures and
downtimes.
Configuring Database Logging
Two ways of configuring logging for a database are circular logging and archive logging.
Circular logging: circular logging allows only full, offline backups of the database. The
database must be offline, i.e. inaccessible to users, when the full backup is taken. As the
name suggests, circular logging uses a "ring" of online logs to provide recovery from crashes.
The logs are used and retained only to the point of ensuring the integrity of current transactions.
Only crash recovery and version recovery are supported using this type of logging.
Archive logging: archive logging supports a recoverable database by archiving logs after they
have been written to; that is to say, log files are not reused. Archive logging is used
specifically for roll-forward recovery. Enabling the logretain and/or the userexit
database configuration parameter results in archive logging. To archive logs, you can
choose to have the database leave the log files in the active path and then manually archive
them, or you can install a user exit program to automate the archiving. Archived logs are
logs that were active but are no longer required for crash recovery.
Log files can be characterized as one of the following. Active: the log files written by the
database manager that support crash recovery. They contain information related to units of
work that have not yet been committed.
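The sketch below is purely conceptual, written in Python with hypothetical file names; it is not the configuration syntax of any real DBMS. It contrasts the two approaches: circular logging reuses a fixed ring of log files, while archive logging moves each filled log aside so it remains available for roll-forward recovery.

```python
# Conceptual contrast only (hypothetical file names, no real DBMS interface):
# circular logging reuses a fixed ring of logs; archive logging retains them.
import os
import shutil

LOG_RING = ["log0.txt", "log1.txt", "log2.txt"]   # the fixed ring reused by circular logging

def next_log(current_index: int):
    """Circular logging: after the last file in the ring, wrap around and reuse the first."""
    index = (current_index + 1) % len(LOG_RING)
    return LOG_RING[index], index

def archive_log(filled_log: str, archive_dir: str = "archived_logs") -> None:
    """Archive logging: a filled log file is moved aside and retained, never reused."""
    os.makedirs(archive_dir, exist_ok=True)
    shutil.move(filled_log, os.path.join(archive_dir, os.path.basename(filled_log)))

if __name__ == "__main__":
    name, _ = next_log(current_index=2)   # wraps from log2.txt back to log0.txt
    print("circular logging would now reuse:", name)
```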
Summary
You have learnt about data management, which includes all aspects of data planning, handling,
analysis, documentation and storage, and takes place during all stages of a study. The database
management system (DBMS) has also been described, stating its advantages and disadvantages.
The types of software used for database management activities were also described, together
with methods to prevent data entry errors.
For databases that are very large, contain large objects such as photos, or require high
performance, you need to use advanced methods to store your data. The database provides table
spaces, containers, and buffer pools for you to define how data is stored on your system. Two
ways of configuring logging for a database are circular logging and archive logging.
Self-Assessment Questions
Tutor-Marked Assignment
References
Olowu, T. C. Database Management (DBM). Proceedings of the 1st International
Technology, Education and Environment Conference. African Society for Scientific Research
(ASSR).
Bennett, S., Myatt, M., Jolley, D., & Radalowicz, A. (2001). Data Management for Surveys and
Trials.
Further Reading
http://www.epidata.dk/downloads/dmepidata.pdf.
http://repository.essex.ac.uk/2398/1/TrainingResourcesPack.pdf