Finals Reviewer
Join - relational operation that causes two tables with a common domain to be combined into
a single table or view. It is the most frequently used relational operation.
Equi-join - join in which the joining condition is based on equality between values in the
common columns. Common columns appear (redundantly) in the result table.
Cartesian join - join with no joining condition, so every row of one table is paired with every
row of the other. The number of rows in the result is the product of the row counts of the two
tables.
Natural join - join that is the same as an equi-join except that one of the duplicate columns is
eliminated in the result table.
Outer join - join in which rows that do not have matching values in common columns are
nevertheless included in the result table.
Union join - join whose result is a table that includes all data from each table that is joined.
The result table contains all columns from each table and an instance for each row of data
included from each table.
Self join - join that requires matching rows in a table with other rows in that same table;
that is, joining a table with itself.
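The join types above can be compared side by side on a small data set. The following is a minimal sketch that runs the SQL through Python's built-in sqlite3 module; the customer, orders, and employee tables and their rows are hypothetical and are not taken from the reviewer.

```python
import sqlite3

# Hypothetical sample tables; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT,
                           supervisor_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cara');
    INSERT INTO orders   VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO employee VALUES (1, 'Dee', NULL), (2, 'Eli', 1), (3, 'Fay', 1);
""")


def run(sql):
    """Execute one SELECT and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# Equi-join: the common column cust_id appears twice (redundantly).
equi = run("""
    SELECT c.cust_id, c.cust_name, o.order_id, o.cust_id
    FROM customer c JOIN orders o ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""")

# Natural join: same matching rows, but cust_id is listed only once.
natural = run("SELECT * FROM customer NATURAL JOIN orders ORDER BY order_id")

# Outer join: Cara has no orders but is still included (order_id is NULL).
outer = run("""
    SELECT c.cust_name, o.order_id
    FROM customer c LEFT OUTER JOIN orders o ON c.cust_id = o.cust_id
    ORDER BY c.cust_id, o.order_id
""")

# Cartesian join: 3 customers x 3 orders = 9 rows.
cartesian_count = run("SELECT COUNT(*) FROM customer CROSS JOIN orders")[0][0]

# Self join: employee is joined with itself to find each supervisor.
self_join = run("""
    SELECT e.emp_name, s.emp_name
    FROM employee e JOIN employee s ON e.supervisor_id = s.emp_id
    ORDER BY e.emp_id
""")

print(equi, natural, outer, cartesian_count, self_join, sep="\n")
```

Note that the left outer join keeps unmatched rows from the customer table only; which table's unmatched rows are kept depends on the outer join variant used.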
Subqueries - placing an inner query (SELECT . . . FROM . . . WHERE) within a
WHERE or HAVING clause of another (outer) query.
The inner query provides a set of one or more values for the search condition of the
outer query.
Such queries are referred to as subqueries or nested subqueries.
Subqueries can be nested multiple times.
Subqueries are prime examples of why SQL is a set-oriented language.
Correlated subquery - In SQL, a subquery in which processing the inner query depends on data
from the outer query.
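The difference between an ordinary (noncorrelated) subquery and a correlated subquery is easier to see with data. This is a minimal sketch using Python's sqlite3 module; the product table, its columns, and its rows are hypothetical.

```python
import sqlite3

# Hypothetical product table; names and prices are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                          line TEXT, price REAL);
    INSERT INTO product VALUES
        (1, 'Desk',  'Office', 300),
        (2, 'Chair', 'Office', 100),
        (3, 'Sofa',  'Living', 900),
        (4, 'Lamp',  'Living', 100);
""")

# Noncorrelated subquery: the inner query can be evaluated once, on its own,
# and supplies a single value (the overall average price) to the outer query.
above_overall_avg = [row[0] for row in conn.execute("""
    SELECT name FROM product
    WHERE price > (SELECT AVG(price) FROM product)
    ORDER BY product_id
""")]

# Correlated subquery: the inner query refers to p.line from the outer query,
# so its result depends on the current outer row.
above_line_avg = [row[0] for row in conn.execute("""
    SELECT p.name FROM product p
    WHERE p.price > (SELECT AVG(p2.price) FROM product p2
                     WHERE p2.line = p.line)
    ORDER BY p.product_id
""")]

print(above_overall_avg)  # ['Sofa']
print(above_line_avg)     # ['Desk', 'Sofa']
```

Desk is below the overall average (350) but above the average of its own product line (200), which is why it appears only in the correlated result.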
Derived Tables
Combining Queries
Conditional Expressions - IF-THEN-ELSE logical processing within an SQL statement can now be
accomplished by using the CASE keyword in a statement
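As an illustration of IF-THEN-ELSE logic with CASE, the sketch below labels each row with a price tier. It uses Python's sqlite3 module; the product table and the tier thresholds are assumptions made up for the example.

```python
import sqlite3

# Hypothetical product table; thresholds below are arbitrary examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO product VALUES (1, 'Desk', 300), (2, 'Chair', 100), (3, 'Sofa', 900);
""")

# CASE evaluates its WHEN conditions in order and returns the first match;
# ELSE covers every row that matched none of them.
tiers = conn.execute("""
    SELECT name,
           CASE
               WHEN price >= 500 THEN 'premium'
               WHEN price >= 200 THEN 'standard'
               ELSE 'budget'
           END AS tier
    FROM product
    ORDER BY product_id
""").fetchall()

print(tiers)  # [('Desk', 'standard'), ('Chair', 'budget'), ('Sofa', 'premium')]
```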
TIPS FOR DEVELOPING QUERIES:
Familiarize yourself with the data model and the entities and relationships that have
been established.
Be sure that you understand what results you want from your query.
Figure out what attributes you want in your query result.
Locate within the data model the attributes you want and identify the entity where the
required data are stored.
Review the ERD and all the entities identified in the previous step.
Construct a WHERE equality for each link (relationship) between the tables involved.
When you have a basic result set to work with, you can begin to fine-tune your query by
adding GROUP BY and HAVING clauses, DISTINCT, NOT IN, and so forth.
Until you gain query writing experience, your first draft of a query will tend to work with
the data you expect to encounter.
Rather than use the SELECT * option, take the time to include the column names of the
attributes you need in a query.
Try to build your queries so that your intended result is obtained from one query.
Rather than obtain those data in several separate queries, create a single query that
retrieves all the data that will be needed.
Guidelines for Better Query Design
Sharability - Routines may be cached on the server and made available to all users so
that they do not have to be rewritten.
Applicability - Routines are stored as part of the database and may apply to the entire
database rather than be limited to one application. This advantage is a corollary to
sharability.
Embedded SQL - Hard-coded SQL statements included in a program written in another
language, such as C or Java.
Dynamic SQL - Specific SQL code generated on the fly while an application is processing.
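The contrast between embedded and dynamic SQL can be sketched in a host language. The example below uses Python with sqlite3 as a stand-in for the C or Java host programs mentioned above, so it is only an analogy to true precompiled embedded SQL. The product table, the ALLOWED column list, and the build_query helper are hypothetical.

```python
import sqlite3

# Hypothetical product table; names and data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT,
                          line TEXT, price REAL);
    INSERT INTO product VALUES
        (1, 'Desk', 'Office', 300), (2, 'Chair', 'Office', 100),
        (3, 'Sofa', 'Living', 900);
""")

# Embedded-style SQL: the statement text is fixed when the program is written;
# only the parameter value changes at run time.
EMBEDDED_SQL = "SELECT name FROM product WHERE line = ? ORDER BY product_id"
office = [row[0] for row in conn.execute(EMBEDDED_SQL, ("Office",))]

# Dynamic SQL: the statement text itself is assembled while the program runs.
# Column names come only from a fixed whitelist; values are still passed as
# parameters rather than pasted into the string.
ALLOWED = ("line", "name")


def build_query(filters):
    """Return (sql, params) for the filters the caller actually supplied."""
    clauses, params = [], []
    for col in ALLOWED:
        if col in filters:
            clauses.append(f"{col} = ?")
            params.append(filters[col])
    sql = "SELECT name FROM product"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY product_id", params


sql, params = build_query({"line": "Living"})
living = conn.execute(sql, params).fetchall()
print(office, sql, living, sep="\n")
```

Keeping user-supplied values in parameters, as done here, is one common way to avoid SQL injection when statements are generated at run time.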
ANSWERS TO REVIEW QUESTIONS
Define each of the following key terms:
a. Dynamic SQL. The process of making an application capable of generating specific SQL code on the fly,
as the application is processed
b. Correlated subquery. This type of subquery is processed outside in, rather than inside out. That is, the
inner query is executed for each row in the outer query, and the inner query depends in some way on
values from the current row in the outer query. The results of the inner query will, in turn, affect the
final results of the outer query
c. Embedded SQL. The process of including hard-coded SQL statements in a program written in another
language such as C or Java
f. Equi-join. A join in which the joining condition is based on equality between values in the common
columns. It produces a table of rows composed of columns from two other tables, where common
columns appear (redundantly) in the result table.
g. Self-join. A join that requires matching rows in a table with other rows in the same table. This is a join
that joins a table with itself and often occurs with the presence of a unary relationship in the database,
such as a Supervisor or Manager of Employees within an Employee table.
h. Outer join. A join in which rows that do not have matching values in common columns are
nevertheless included in the result table. Outer joins return all the values in one of the tables included in
the join, regardless of whether a match exists in the other table(s) or not.
i. Function. A stored subroutine that returns one value and has only input parameters
j. Persistent Stored Modules (SQL/PSM). Extensions defined in SQL:1999 that include the capability to
create and drop modules of code stored in the database schema across user sessions
e 1. equi-join
i 2. natural join
d 3. outer join
j 4. trigger
k 5. procedure
g 6. Embedded SQL
b 7. UDT
f 8. COMMIT
c 9. SQL/PSM
h 10. Dynamic SQL
a 11. ROLLBACK
CHAPTER 8
Client/server system - networked computing model that distributes processes between clients
and servers, which supply the requested services. In a database system, the database generally
resides on a server that processes the DBMS. The clients may process the application systems
or request services from another server that holds the application programs.
Application partitioning - The process of assigning portions of application code to client or
server partitions after it is written to achieve better performance and interoperability (ability of
a component to function on different platforms).
Fat client - client PC that is responsible for processing presentation logic, extensive application
and business rules logic, and many DBMS functions.
Database server - computer that is responsible for database storage, access, and processing in
a client/server environment. Some people also use this term to describe a two-tier client/server
application.
Middleware - Software that allows an application to interoperate with other software without
requiring the user to understand and code the low-level operations necessary to achieve
interoperability.
Application program interface (API) - Sets of routines that an application program uses to
direct the performance of procedures by the computer’s operating system.
Open database connectivity (ODBC) - An application programming interface that provides a
common language for application programs to access and process SQL databases independent
of the particular DBMS that is accessed.
Three-tier architecture - client/server configuration that includes three layers: a client layer
and two server layers. Although the nature of the server layers differs, a common configuration
contains an application server and a database server.
Thin client - application where the client (PC) accessing the application primarily provides the
user interfaces and some application processing, usually with no or limited local data storage.
WEB APPLICATION COMPONENTS
Database server - This server hosts the storage logic for the application and hosts the
DBMS.
Web server - provides the basic functionality needed to receive and respond to requests
from browser clients.
Application server - software that provides the building blocks for creating dynamic Web
sites and Web-based applications.
Web browser - Microsoft’s Internet Explorer, Mozilla’s Firefox, Apple’s Safari, Google’s
Chrome, and Opera are examples.
World Wide Web Consortium (W3C) - An international consortium of companies working to
develop open standards that foster the development of Web conventions so that Web
documents can be consistently displayed across all platforms.
XHTML - A hybrid scripting language that extends HTML code to make it XML compliant.
Java servlet - Java program that is stored on the server and contains the business and database
logic for a Java-based application.
Extensible Markup Language (XML) - text-based scripting language used to describe data
structures hierarchically, using HTML-like tags.
XML Schema Definition (XSD) - Language used for defining XML databases that has been
recommended by the W3C.
XPath - One of a set of XML technologies that supports XQuery development. XPath expressions
are used to locate data in XML documents.
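To show what locating data with an XPath expression looks like, here is a minimal sketch using Python's xml.etree.ElementTree, which supports only a limited subset of XPath. The catalog document and its element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; element and attribute names are illustrative.
doc = """<catalog>
  <product line="Office"><name>Desk</name><price>300</price></product>
  <product line="Living"><name>Sofa</name><price>900</price></product>
  <product line="Office"><name>Chair</name><price>100</price></product>
</catalog>"""

root = ET.fromstring(doc)

# Path expression: from the root, select each <product> whose line attribute
# is 'Office', then step down to its <name> child.
office_names = [n.text for n in root.findall("./product[@line='Office']/name")]

print(office_names)  # ['Desk', 'Chair']
```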
XQuery - An XML transformation language that allows applications to query both relational
databases and XML data.
Extensible Stylesheet Language Transformation (XSLT) - A language used to transform
complex XML documents and also used to create HTML pages from XML documents.
Web services - set of emerging standards that define protocols for automatic communication
between software programs over the Web. Web services are XML-based and usually run in the
background to establish transparent communication among computers.
Universal Description, Discovery, and Integration (UDDI) - technical specification for creating a
distributed registry of Web services and businesses that are open to communicating through
Web services.
Web Services Description Language (WSDL) - An XML-based grammar or language used to
describe a Web service and specify a public interface for that service.
Simple Object Access Protocol (SOAP) - An XML-based communication protocol used for
sending messages between applications via the Internet.
Service-oriented architecture (SOA) - collection of services that communicate with each other
in some manner, usually by passing data or coordinating a business activity.
Java Database Connectivity (JDBC) - industry-standard call-level application
programming interface (API) with which Java programs can access relational databases.
REVIEW QUESTIONS
a. Application partitioning. Assigning portions of application code to client or server partitions in order
to achieve better performance and interoperability
b. Application program interface (API). Type of software that allows a specific front-end program
development platform to communicate with a particular back-end database server, even when the front
end and back end were not built to be compatible
c. Client/server system. A common solution for hardware and software organization that implements
the idea of distributed computing. Many client/server environments use a local area network (LAN) to
support a network of personal computers—each with its own storage—that are also able to share
common devices (such as a hard disk or printer) and software (such as a DBMS) attached to the LAN.
Several client/server architectures have evolved; they can be distinguished by the distribution of
application logic components across clients and servers.
d. Middleware. A type of software that allows an application to interoperate with other software
without requiring the user to understand and code the low-level operations
e. Stored procedure. A module of code, usually written in a proprietary language such as Oracle’s
PL/SQL or Sybase’s Transact-SQL. It implements application logic or a business rule and is stored on the
server, where it runs when called.
f. Three-tier architecture. A client/server configuration that includes three layers: a client layer and two
server layers. Although the nature of the server layers differs, common configurations include an
application server or a transaction server.
g. Open Database Connectivity (ODBC). A standard interface through which an application program can
access SQL databases.
h. XML Schema. A specification of the structure of an XML document type.
i. Web services. Set of emerging standards that define protocols for communication between software
programs over the Web
j. XSLT. A language used to transform complex XML documents and also used to create HTML pages from
XML documents
k. SOAP/Simple Object Access Protocol. An XML-based communication protocol used for sending
messages between applications via the Internet
Match terms with appropriate definitions:
g 1. client/server system
h 2. application program interface (API)
a 3. fat client
f 4. database server
e 5. middleware
i 6. three-tier architecture
c 7. thin client
j 8. XSD
d 9. SOA
b 10. W3C
CHAPTER 9
Data warehouse - subject-oriented, integrated, time-variant, nonupdateable collection of data
used in support of management decision-making processes.
Subject-oriented - A data warehouse is organized around the key subjects (or high-level
entities) of the enterprise.
Integrated - The data housed in the data warehouse are defined using consistent
naming conventions, formats, encoding structures, and related characteristics gathered
from several internal systems of record and also often from sources external to the
organization. This means that the data warehouse holds the one version of “the truth.”
Time-variant - Data in the data warehouse contain a time dimension so that they may
be used to study trends and changes.
Nonupdateable - Data in the data warehouse are loaded and refreshed from
operational systems, but cannot be updated by end users.
Operational system - system that is used to run a business in real-time, based on current data.
Also called a system of record.
Informational system - system designed to support decision making based on historical point-
in-time and prediction data for complex queries or data-mining applications.
Data mart - data warehouse that is limited in scope, whose data are obtained by selecting and
summarizing data from a data warehouse or from separate extract, transform, and load
processes from source data systems.
Independent data mart - data mart filled with data extracted from the operational
environment, without the benefit of a data warehouse.
Dependent data mart - data mart filled exclusively from an enterprise data warehouse and its
reconciled data.
Enterprise data warehouse (EDW) - centralized, integrated data warehouse that is the control
point and single source of all data made available to end users for decision support
applications.
Operational data store (ODS) - integrated, subject-oriented, continuously updateable,
current-valued (with recent history), enterprise-wide, detailed database designed to serve
operational users as they do decision support processing.
Logical data mart - data mart created by a relational view of a data warehouse.
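A logical data mart can be pictured as nothing more than a view defined over warehouse tables. The sketch below uses Python's sqlite3 module; the warehouse_sales table, the east_sales_mart view, and the data are all hypothetical.

```python
import sqlite3

# Hypothetical warehouse table and view; names and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse_sales (region TEXT, product TEXT, amount REAL);
    INSERT INTO warehouse_sales VALUES
        ('East', 'Desk', 300), ('West', 'Desk', 300), ('East', 'Sofa', 900);

    -- The "mart" stores no data of its own: it is a relational view that
    -- selects and summarizes rows from the warehouse table.
    CREATE VIEW east_sales_mart AS
        SELECT product, SUM(amount) AS total_amount
        FROM warehouse_sales
        WHERE region = 'East'
        GROUP BY product;
""")

mart_rows = conn.execute(
    "SELECT product, total_amount FROM east_sales_mart ORDER BY product"
).fetchall()

print(mart_rows)  # [('Desk', 300.0), ('Sofa', 900.0)]
```

Because the view is evaluated against the warehouse each time it is queried, it always reflects the current warehouse contents.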
Real-time data warehouse - enterprise data warehouse that accepts near-real-time feeds of
transactional data from the systems of record, analyzes warehouse data, and in near-real-time
relays business rules to the data warehouse and systems of record so that immediate action
can be taken in response to business events.
Reconciled Data - Detailed, current data intended to be the single, authoritative source for all
decision support applications.
Derived data - Data that have been selected, formatted, and aggregated for end-user decision
support applications.
Transient data - Data in which changes to existing records are written over previous records,
thus destroying the previous data content.
Periodic data - Data that are never physically altered or deleted once they have been added to
the store.
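The contrast between transient and periodic data shows up in how a change is recorded. This is a minimal sketch with Python's sqlite3 module; the two balance tables, their columns, and the values are hypothetical.

```python
import sqlite3

# Hypothetical tables; names, dates, and balances are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transient_balance (acct_id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE periodic_balance  (acct_id INTEGER, as_of TEXT, balance REAL,
                                    PRIMARY KEY (acct_id, as_of));
""")

# Transient data: the change is written over the existing record,
# so the earlier balance of 100.0 is lost.
conn.execute("INSERT INTO transient_balance VALUES (1, 100.0)")
conn.execute("UPDATE transient_balance SET balance = 75.0 WHERE acct_id = 1")

# Periodic data: each change is added as a new time-stamped row,
# and existing rows are never altered or deleted.
conn.execute("INSERT INTO periodic_balance VALUES (1, '2024-01-01', 100.0)")
conn.execute("INSERT INTO periodic_balance VALUES (1, '2024-01-02', 75.0)")

transient_rows = conn.execute(
    "SELECT balance FROM transient_balance").fetchall()
periodic_rows = conn.execute(
    "SELECT as_of, balance FROM periodic_balance ORDER BY as_of").fetchall()

print(transient_rows)  # [(75.0,)]
print(periodic_rows)   # [('2024-01-01', 100.0), ('2024-01-02', 75.0)]
```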
Star schema - simple database design in which dimensional data are separated from fact or
event data. A dimensional model is another name for a star schema.
Grain - The level of detail in a fact table, determined by the intersection of all the components
of the primary key, including all foreign keys and any other primary key elements.
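A small star schema makes the ideas of fact table, dimension tables, and grain concrete. The sketch below uses Python's sqlite3 module; the table names (fact_sales, dim_product, dim_date), columns, and rows are hypothetical. The grain assumed here is one fact row per product per day, set by the two foreign keys that make up the fact table's primary key.

```python
import sqlite3

# Hypothetical star schema; all names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables hold descriptive data.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY,
                              product_name TEXT, product_line TEXT);
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY,
                              full_date TEXT, month TEXT);

    -- Fact table holds the measures. Its primary key is the combination of
    -- the foreign keys, which fixes the grain at one row per product per day.
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product (product_key),
        date_key     INTEGER REFERENCES dim_date (date_key),
        units_sold   INTEGER,
        dollars_sold REAL,
        PRIMARY KEY (product_key, date_key)
    );

    INSERT INTO dim_product VALUES (1, 'Desk', 'Office'), (2, 'Sofa', 'Living');
    INSERT INTO dim_date VALUES
        (20240101, '2024-01-01', '2024-01'),
        (20240102, '2024-01-02', '2024-01'),
        (20240201, '2024-02-01', '2024-02');
    INSERT INTO fact_sales VALUES
        (1, 20240101, 2, 600.0), (1, 20240102, 1, 300.0),
        (2, 20240101, 1, 900.0), (1, 20240201, 3, 900.0);
""")

# Typical star query: join the fact table to its dimensions, then roll the
# daily facts up to month and product line.
monthly = conn.execute("""
    SELECT d.month, p.product_line, SUM(f.dollars_sold)
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    JOIN dim_date d    ON f.date_key = d.date_key
    GROUP BY d.month, p.product_line
    ORDER BY d.month, p.product_line
""").fetchall()

print(monthly)
```

Facts stored at a daily grain can be summarized to months as shown, but facts stored only at a monthly grain could not be broken back down into days.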
Conformed dimension - One or more dimension tables associated with two or more fact tables
for which the dimension tables have the same business meaning and primary key with each fact
table.
Snowflake schema - An expanded version of a star schema in which dimension tables are
normalized into several related tables.
Online analytical processing (OLAP) - The use of a set of graphical tools that provides users
with multidimensional views of their data and allows them to analyze the data using simple
windowing techniques.
Relational OLAP (ROLAP) - OLAP tools that view the database as a traditional relational
database in either a star schema or other normalized or denormalized set of tables.
Multidimensional OLAP (MOLAP) - OLAP tools that load data into an intermediate structure,
usually a three- or higher-dimensional array.
Data visualization - The representation of data in graphical and multimedia formats for human
analysis.
Data mining - Knowledge discovery, using a sophisticated blend of techniques from traditional
statistics, artificial intelligence, and computer graphics
REVIEW QUESTIONS
a. Data warehouse. A subject-oriented, integrated, time-variant, non-volatile collection of data used in
support of management decision-making processes (Inmon and Hackathorn, 1994)
b. Data mart. A data warehouse that is limited in scope, whose data is obtained by selecting and (where
appropriate) summarizing data from the enterprise data warehouse
c. Reconciled data. Detailed, historical data that are intended to be the single, authoritative source for
all decision support applications and not generally intended to be accessed directly by end users
d. Derived data. Data that have been selected, formatted, and aggregated for end-user decision support
applications
e. Enterprise data warehouse. A centralized, integrated data warehouse for the entire enterprise that
provides data to all end users of decision support applications.
f. Real-time data warehouse. A data warehouse that receives and analyzes (near) real-time data feeds
from systems of record (instead of using batch ETL), making data available for very quick responses to
business events as they take place.
g. Star schema. A simple database design in which dimensional data are separated from fact or event
data. A dimensional model is another name for star schema
h. Snowflake schema. An expanded version of a star schema in which all of the tables are fully
normalized
i. Grain. The length of time (or other meaning) associated with each record in the table
j. Conformed dimension. One or more dimension tables associated with two or more fact tables for
which the dimension tables have the same business meaning and primary key with each fact table