Overall System Structure
Database Users
1. The database users fall into several categories:
   o Application programmers are computer professionals who interact with the system through DML calls embedded in a program written in a host language (e.g. C, PL/1, Pascal). These programs are called application programs. The DML precompiler converts DML calls (prefaced by a special character such as $ or #) to normal procedure calls in the host language. The host language compiler then generates the object code. Some special types of programming languages combine Pascal-like control structures with control structures for the manipulation of a database. These are sometimes called fourth-generation languages. They often include features to help generate forms and display data.
   o Sophisticated users interact with the system without writing programs. They form requests by writing queries in a database query language. These are submitted to a query processor that breaks a DML statement down into instructions for the database manager module.
   o Specialized users are sophisticated users who write special database application programs. These may be CADD systems, knowledge-based and expert systems, complex data systems (audio/video), etc.
   o Naive users are unsophisticated users who interact with the system by using permanent application programs (e.g. automated teller machine).
Database Administrator
1. The database administrator is a person having central control over data and programs accessing that data. Duties of the database administrator include:
   o Scheme definition: the creation of the original database scheme. This involves writing a set of definitions in a DDL (data definition language), compiled by the DDL compiler into a set of tables stored in the data dictionary.
   o Storage structure and access method definition: writing a set of definitions translated by the data storage and definition language compiler.
   o Scheme and physical organization modification: writing a set of definitions used by the DDL compiler to generate modifications to appropriate internal system tables (e.g. data dictionary). This is done rarely, but sometimes the database scheme or physical organization must be modified.
   o Granting of authorization for data access: granting different types of authorization for data access to various users.
   o Integrity constraint specification: generating integrity constraints. These are consulted by the database manager module whenever updates occur.
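As a rough illustration of scheme definition, the sketch below uses Python's built-in sqlite3 module as a stand-in DBMS. The table and column names are assumptions made for the example, not part of these notes. SQLite keeps its definitions in a catalog table called sqlite_master, which plays the role of the data dictionary here.

```python
import sqlite3

# Hypothetical scheme for illustration; names are assumptions.
DDL = """
CREATE TABLE customer (
    name   TEXT PRIMARY KEY,
    street TEXT,
    city   TEXT
);
CREATE TABLE account (
    number  INTEGER PRIMARY KEY,
    balance REAL NOT NULL
);
"""

def define_scheme(conn):
    """Run the DDL; SQLite records each definition in its catalog."""
    conn.executescript(DDL)

def data_dictionary(conn):
    """Return the names of the tables recorded in the sqlite_master catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
define_scheme(conn)
tables = data_dictionary(conn)
print(tables)
```

The point of the sketch is only that definitions written in the DDL end up as rows in system tables that can themselves be queried.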
Database Manager
1. The database manager is a program module which provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
2. Databases typically require a large amount of storage space (gigabytes), so the data must be stored on disk. Data is moved between disk and main memory (MM) as needed.
3. The goal of the database system is to simplify and facilitate access to data. Performance is important. Views provide simplification.
4. So the database manager module is responsible for
   o Interaction with the file manager: Storing raw data on disk using the file system usually provided by a conventional operating system. The database manager must translate DML statements into low-level file system commands (for storing, retrieving and updating data in the database).
   o Integrity enforcement: Checking that updates in the database do not violate consistency constraints (e.g. no bank account balance below $25).
   o Security enforcement: Ensuring that users only have access to information they are permitted to see.
   o Backup and recovery: Detecting failures due to power failure, disk crash, software errors, etc., and restoring the database to its state before the failure.
   o Concurrency control: Preserving data consistency when there are concurrent users.
5. Some small database systems may lack some of these features, resulting in simpler database managers. (For example, no concurrency control is required on a PC running MS-DOS.) These features are necessary on larger systems.
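The integrity enforcement duty can be sketched with the "no balance below $25" rule from the list above. This is a minimal example using Python's sqlite3 module as a stand-in database manager. The account table, the account number 101 and the withdraw helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause states the consistency constraint once, in the scheme.
conn.execute(
    "CREATE TABLE account ("
    " number INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 25))"
)
conn.execute("INSERT INTO account VALUES (101, 500)")
conn.commit()

def withdraw(conn, number, amount):
    """Try a withdrawal; return True if applied, False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE number = ?",
                (amount, number),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balance(conn, number):
    """Read the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE number = ?", (number,)
    ).fetchone()
    return row[0]

first = withdraw(conn, 101, 100)   # leaves 400, allowed
second = withdraw(conn, 101, 400)  # would leave 0, violates the constraint
final_balance = balance(conn, 101)
```

Because the check lives in the database rather than in each application program, every update path is subject to it.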
1. Databases change over time.
2. The information in a database at a particular point in time is called an instance of the database.
3. The overall design of the database is called the database scheme.
4. Analogy with programming languages:
   o Data type definition - scheme
   o Value of a variable - instance
5. There are several schemes, corresponding to levels of abstraction:
   o Physical scheme
   o Conceptual scheme
   o Subscheme (can be many)
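The programming-language analogy can be made concrete with a small Python sketch. The Account type and the sample values are assumptions made for illustration: the type definition stands for the scheme, and each list of values stands for an instance at one point in time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Account:
    """The 'scheme': which fields exist and what types they have."""
    number: int
    balance: float

# Two 'instances': the contents of the database at two points in time.
instance_monday = [Account(101, 500.0), Account(102, 75.0)]
instance_tuesday = [Account(101, 400.0), Account(102, 75.0), Account(103, 25.0)]

def conforms(instance):
    """Check that every row in an instance follows the Account scheme."""
    return all(isinstance(row, Account) for row in instance)
```

The instance changes from day to day while the scheme stays the same.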
The hierarchical model is similar to the network model, except that the records are organized as a collection of trees rather than arbitrary graphs. Figure 1.5 shows a sample hierarchical database that is the equivalent of the relational database of Figure 1.3.
Figure 1.5: A sample hierarchical database
The relational model does not use pointers or links, but relates records by the values they contain. This allows a formal mathematical foundation to be defined.
In the network model, data are represented by collections of records. Relationships among data are represented by links. The organization is that of an arbitrary graph.
Figure 1.4 shows a sample network database that is the equivalent of the relational database of Figure 1.3.
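A minimal sketch of records connected by links, written in Python. The record types and sample data are assumptions made for illustration and are not taken from Figure 1.4. One account is linked from two customers, which is what makes the structure a graph rather than a tree.

```python
from dataclasses import dataclass, field

@dataclass
class AccountRecord:
    number: int
    balance: float

@dataclass
class CustomerRecord:
    name: str
    # Links (object references) to the accounts this customer holds.
    accounts: list = field(default_factory=list)

shared = AccountRecord(101, 500.0)
alice = CustomerRecord("Alice", [shared])
bob = CustomerRecord("Bob", [shared, AccountRecord(102, 75.0)])

def total_balance(customer):
    """Follow the links from a customer record to its account records."""
    return sum(account.balance for account in customer.accounts)
```

Here total_balance(bob) follows two links, and the shared account is the same record reached from both customers.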
In the relational model, data and relationships are represented by a collection of tables. Each table has a number of columns with unique names, e.g. customer, account. Figure 1.3 shows a sample relational database.
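A small sketch of tables related by the values they contain, using Python's sqlite3 module. The tables, columns and rows are assumptions made for illustration and are not the contents of Figure 1.3. The customer and account rows are connected only because they hold the same account number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (name TEXT, city TEXT, account_number INTEGER);
CREATE TABLE account  (number INTEGER, balance REAL);
INSERT INTO customer VALUES ('Alice', 'Burnaby', 101), ('Bob', 'Surrey', 102);
INSERT INTO account  VALUES (101, 500), (102, 75);
""")

# No pointers or links: rows are matched by comparing column values.
rows = conn.execute(
    "SELECT c.name, a.balance"
    " FROM customer AS c JOIN account AS a ON a.number = c.account_number"
    " ORDER BY c.name"
).fetchall()
print(rows)
```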
Another essential element of the E-R diagram is the mapping cardinalities, which express the number of entities to which another entity can be associated via a relationship set.
We'll see later how well this model works to describe real-world situations.
2. The overall logical structure of a database can be expressed graphically by an E-R diagram:
   o rectangles: represent entity sets.
   o ellipses: represent attributes.
   o diamonds: represent relationships among entity sets.
   o lines: link attributes to entity sets and entity sets to relationships.
   See figure 1.2 for an example.
Data Models
1. Data models are a collection of conceptual tools for describing data, data relationships, data semantics and data constraints. There are three different groups:
   1. Object-based Logical Models.
   2. Record-based Logical Models.
   3. Physical Data Models.
   We'll look at them in more detail now.
Data Abstraction
1. The major purpose of a database system is to provide users with an abstract view of the system. The system hides certain details of how data is stored, created and maintained. Complexity should be hidden from database users.
2. There are several levels of abstraction:
   1. Physical Level: How the data are stored. E.g. index, B-tree, hashing. Lowest level of abstraction. Complex low-level structures described in detail.
   2. Conceptual Level: Next highest level of abstraction. Describes what data are stored. Describes the relationships among data. Database administrator level.
   3. View Level: Highest level. Describes part of the database for a particular group of users. Can be many different views of a database. E.g. tellers in a bank get a view of customer accounts, but not of payroll data.
   Fig. 1.1 (figure 1.1 in the text) illustrates the three levels.
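The view level can be sketched with Python's sqlite3 module. The table names, the teller_view name and the sample rows are assumptions made for illustration. Note that SQLite has no per-user permissions, so this only models what a teller's view would contain; it does not enforce access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: everything the bank stores (hypothetical tables).
CREATE TABLE account (number INTEGER PRIMARY KEY, owner TEXT, balance REAL);
CREATE TABLE payroll (employee TEXT, salary REAL);
INSERT INTO account VALUES (101, 'Alice', 500), (102, 'Bob', 75);
INSERT INTO payroll VALUES ('Carol', 4200);

-- View level: the part of the database a teller works with.
CREATE VIEW teller_view AS
    SELECT number, owner, balance FROM account;
""")

teller_rows = conn.execute(
    "SELECT owner, balance FROM teller_view ORDER BY number"
).fetchall()
print(teller_rows)
```

The teller's queries name only teller_view, so payroll data never appears in their results, and the physical storage of either table stays hidden.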
   o Multiple users: Concurrency is wanted for faster response time, but concurrent updates need protection. E.g. two customers withdraw funds from the same account at the same time: the account has $500 in it, and they withdraw $100 and $50. Without protection, the result could be $350, $400 or $450.
   o Security problems: Every user of the system should be able to access only the data they are permitted to see. E.g. payroll people only handle employee records, and cannot see customer accounts; tellers only access account data and cannot see payroll data. This is difficult to enforce with application programs.
   o Integrity problems: Data may be required to satisfy constraints. E.g. no account balance below $25.00. Again, it is difficult to enforce or to change constraints with the file-processing approach.
These problems and others led to the development of database management systems.
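The concurrent withdrawal example ($500 in the account, withdrawals of $100 and $50) can be replayed deterministically. The sketch below is an illustration in Python, not a real concurrency mechanism: each withdrawal is split into a read step and a write step, and the schedule names are assumptions made for the example.

```python
def run(schedule, start=500):
    """Replay an interleaving of read/write steps for two withdrawals."""
    amounts = {"A": 100, "B": 50}
    balance = start
    seen = {}  # the balance each customer read before writing
    for who, step in schedule:
        if step == "read":
            seen[who] = balance
        else:  # "write": store the balance computed from the earlier read
            balance = seen[who] - amounts[who]
    return balance

# One withdrawal finishes before the other starts: both are applied.
serial = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
# Both read $500 first; whichever writes last overwrites the other.
b_writes_last = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
a_writes_last = [("A", "read"), ("B", "read"), ("B", "write"), ("A", "write")]

results = [run(serial), run(b_writes_last), run(a_writes_last)]
print(results)
```

Only the serial schedule gives the correct $350. The other two lose one of the updates, which is the outcome concurrency control is meant to prevent.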