Relational Database Operations Modeling With UML
schema and it is more expressive than ER modeling. Yet current UML database modeling techniques mainly focus on static schema modeling. Dynamic database operations are modeled in an ad hoc manner.

Figure 1. Air toxics data archive UML data model (a Site table with columns SiteID:INT (PK), SiteName:VARCHAR(50), Address:VARCHAR(50), City:VARCHAR(50), State:VARCHAR(2), Zip:CHAR(5), Latitude:FLOAT and Longitude:FLOAT, and primary key constraint <<PK>>PK_Site(); an <<SP Container>> SP_1 holding <<SP>>ReportByPollutant(PollutantID:INT):Void and <<SP>>ReportBySite(SiteID:INT):Void)

The major contribution of this work is to identify a weakness of the UML Data Modeling Profile, namely its lack of support for modeling operations. We propose a framework for modeling operations at the database level. SQL is essentially built upon relational theory. We show how to model atomic operations in the SQL Data Manipulation Language (DML) based on relational algebra and set theory.

SQL is a declarative language; it allows us to express what we want without going into the details about where it's located or how to get it [7]. We are not trying to model internal execution details such as parsing the SQL statement, validating it, optimizing it, generating an execution plan and executing that plan. These details should not be exposed to application developers. On the other hand, modeling database operations at the SQL statement level makes it easy for end users to understand the database operations.

The rest of this paper is organized as follows. We describe how to model atomic SQL DML constructs in Section 2. We use an example to show how to model complex database operations in Section 3. Finally, we present our conclusions in Section 4.

2.1. Relational Theory

Relational database theory is based on the Relational Model [4] and set theory. According to the Relational Model, relational tables are sets. The rows of a table can be considered elements of the set. Operations on sets can also be performed on relational tables. The eight relational operations are union, difference, intersection, product, projection, selection, join and division. SQL translates relational theory into practice. SQL is a language that is a loose implementation of relational theory, and has been further modified in its actual implementation by the RDBMS software that uses it [7]. We will use the operations defined in relational theory to model SQL DML operations.

2.2. SQL Data Manipulation Language

Essentially all database operations are implemented using atomic SQL data manipulation language constructs such as INSERT, UPDATE, DELETE, and SELECT. In this section we model the standard SQL data manipulation language constructs and use them as building blocks to model more complex operations in the next section.

An RDBMS has no notion of objects, and operations are separated from tables. On the other hand, we can think of a row as an anonymous object of a class defined by the table. There are four predefined operations on this object: INSERT, UPDATE, DELETE and SELECT.

2.3. Insert Operation Modeling

The INSERT statement is the simplest operation in DML. Here is an example:
Figure 2. Insert operation use case diagram

Figure 3. Insert operation sequence diagram

can also be modeled using a sequence diagram, communication diagram, activity diagram, interaction overview diagram or state machine diagram. Figure 3 is a sequence diagram example.

An anonymous stereotyped object is used to represent a data row in the Pollutant table. The INSERT operation is considered a predefined operation of the Pollutant table. We encapsulate the whole statement in a named framed part, a newly added feature in UML 2.0 [18], called INSERT. SET is considered an atomic operation in our model; it assigns values to a single row or a data set. The SET operation is treated as a synchronous operation here.

A communication diagram is shown in Figure 4. Again we use an anonymous stereotyped object to represent the Pollutant table. The same operation can be represented

Figure 5. Update operation sequence diagram (an UPDATE frame containing the steps SELECTION:PollutantID=2 and SET:CASNumber='103333')

[Figure 6: update operation communication diagram, including message 1.1:SELECTION:PollutantID=2]

The UPDATE operation can be more complex than the INSERT operation, since it can include an optional WHERE clause. In that case, an update operation can be decomposed into the following steps: 1) Selection: find the rows to be updated based on the specified conditions; 2) Set: replace the old value with the new value. Here is an example:

UPDATE Pollutant SET CASNumber = '103333'
WHERE PollutantID = 2

In this example, we use a framed part UPDATE to represent the update statement as a whole and use SELECTION and SET as the individual steps that implement the statement. SELECTION has the same semantic meaning as defined in relational theory. Again, SET is an atomic operation which updates a data set. An equivalent communication diagram is shown in Figure 6. Multiple columns can be modified in one statement; in this case, the assignment expressions are separated by commas, as in SQL. The label <<become>> means the Pollutant data set is changed.
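The two-step decomposition of UPDATE described above (SELECTION, then SET) can be illustrated with a minimal sketch using Python's standard sqlite3 module. The two-column Pollutant table, its sample rows, and the helper names selection and set_values are assumptions made for illustration only; they are not part of the paper's schema or notation.

```python
import sqlite3

# Hypothetical, reduced Pollutant table with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pollutant (PollutantID INTEGER PRIMARY KEY, CASNumber TEXT)"
)
conn.executemany(
    "INSERT INTO Pollutant (PollutantID, CASNumber) VALUES (?, ?)",
    [(1, "000001"), (2, "000002"), (3, "000003")],
)


def selection(conn, pollutant_id):
    """Step 1 (SELECTION): find the rows that satisfy the WHERE condition."""
    rows = conn.execute(
        "SELECT PollutantID FROM Pollutant WHERE PollutantID = ?",
        (pollutant_id,),
    ).fetchall()
    return [row[0] for row in rows]


def set_values(conn, pollutant_ids, cas_number):
    """Step 2 (SET): assign the new value to every selected row."""
    conn.executemany(
        "UPDATE Pollutant SET CASNumber = ? WHERE PollutantID = ?",
        [(cas_number, pid) for pid in pollutant_ids],
    )


# UPDATE Pollutant SET CASNumber = '103333' WHERE PollutantID = 2
selected = selection(conn, 2)
set_values(conn, selected, "103333")

result = conn.execute(
    "SELECT PollutantID, CASNumber FROM Pollutant ORDER BY PollutantID"
).fetchall()
print(result)
```

A real RDBMS performs both steps inside a single statement; the split here only mirrors the modeling view, in which SELECTION leaves the data set unchanged and SET is the step that changes it.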
Figure 7. Delete operation sequence diagram (a DELETE frame containing the steps SELECTION:PollutantID=11 and DESTROY)

Figure 8. Delete operation communication diagram (messages 1:DELETE, 1.1:SELECTION:PollutantID=11 and 1.2:DESTROY sent to :Pollutant, with a <<become>> link to the changed :Pollutant data set)

A select operation can be very complex, involving nested queries, outer joins, etc. The simplest select operation includes the SELECT, FROM and WHERE clauses. We define additional operations to model its internal execution. PROJECTION is another implicit operation, with the same semantic meaning as in relational theory. Both SELECTION and PROJECTION are executed on a data set. These two operations do not change the state of the original data set.

A typical SELECT statement can also include GROUP BY, HAVING, and ORDER BY clauses. Here is another example; its sequence diagram is shown in Figure 9.

SELECT SiteID, PollutantID,
AVG (ConcentrationValue)
FROM Concentration
GROUP BY SiteID, PollutantID
HAVING SiteID >= 1
ORDER BY AVG (ConcentrationValue)

Figure 9. Sequence diagram for select statement with GROUP BY (a SELECT frame containing the steps GROUP BY: SiteID, PollutantID; HAVING SiteID >= 1; PROJECTION: SiteID, PollutantID, AVG(ConcentrationValue); ORDER BY: AVG(ConcentrationValue))

Let's look at another example, which contains a join operation.

SELECT P.PollutantName, C.ObservationDate,
C.SiteID, C.ConcentrationValue
FROM Concentration C, Pollutant P
WHERE C.PollutantID = P.PollutantID

[Figure 10: a SELECT frame over C:Concentration and P:Pollutant, with JOIN:C.PollutantID=P.PollutantID and the selection list P.PollutantName, C.SiteID, C.ObservationDate, C.ConcentrationValue]

We model the JOIN operation using another framed part, as shown in Figure 10. The join condition and the selection list are listed inside the box. C:Concentration defines an alias; it is not a named object of the kind normally found in OO modeling. Other JOIN conditions can be modeled similarly. A subquery can be modeled using a nested framed part.

The UNION operator combines two union-compatible queries. Its semantic meaning is the same as in relational theory. Here is an example using UNION.

SELECT PollutantID FROM Pollutant
WHERE PollutantName LIKE 'A%'
UNION
SELECT PollutantID FROM Pollutant
WHERE PollutantName LIKE 'B%'

We model UNION as another framed part. A horizontal dotted line in the compartment means that the two operations inside the compartment can be executed in parallel. Similarly, we can model INTERSECTION, DIFFERENCE and PRODUCT by defining new framed parts.
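As a rough check of the claim that UNION carries its set-theoretic meaning, the sketch below runs the two SELECT branches separately, combines them with Python's set union, and compares the result with the SQL UNION query. It uses the standard sqlite3 module; the reduced Pollutant table and its four sample rows are assumptions made for illustration, not data from the paper.

```python
import sqlite3

# Hypothetical, reduced Pollutant table with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pollutant (PollutantID INTEGER PRIMARY KEY, PollutantName TEXT)"
)
conn.executemany(
    "INSERT INTO Pollutant (PollutantID, PollutantName) VALUES (?, ?)",
    [(1, "Acrolein"), (2, "Benzene"), (3, "Chloroform"), (4, "Arsenic")],
)


def select_ids(pattern):
    """One SELECT branch: SELECTION on the name, PROJECTION onto PollutantID."""
    rows = conn.execute(
        "SELECT PollutantID FROM Pollutant WHERE PollutantName LIKE ?",
        (pattern,),
    )
    return {row[0] for row in rows}


# The two branches are independent of each other, which is why the model
# allows them to be executed in parallel.
a_ids = select_ids("A%")
b_ids = select_ids("B%")

# The same query expressed with SQL UNION.
union_sql = {
    row[0]
    for row in conn.execute(
        "SELECT PollutantID FROM Pollutant WHERE PollutantName LIKE 'A%' "
        "UNION "
        "SELECT PollutantID FROM Pollutant WHERE PollutantName LIKE 'B%'"
    )
}

# Set union of the branch results matches the SQL UNION result.
print(sorted(a_ids | b_ids), sorted(union_sql))
```

Under the same assumptions, Python's `&` and `-` set operators would play the corresponding role for INTERSECTION and DIFFERENCE.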
Figure 11. Union operation sequence diagram (a UNION frame over two :Pollutant objects, divided into two SELECT frames: SELECTION PollutantName LIKE 'A%' with PROJECTION:PollutantID, and SELECTION PollutantName LIKE 'B%' with PROJECTION:PollutantID)

Table 1. Stored procedure dt_setpropertybyid use case

Operation Name       dt_setpropertybyid
Owner                Dbo
Operation type       Stored procedure
Involved Tables      dtproperties
Overview             If the property already exists, reset the value;
                     otherwise add the property
Precondition         dtproperties table has been created
Post Condition       If the property already exists, its value is reset;
                     otherwise, a new record is added to the dtproperties table
Input Parameters     @id int, @property varchar(64),
                     @value varchar(255), @lvalue image
Output Parameters    None
Return Value         None
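The behavior summarized in Table 1 (reset the value if the property exists, otherwise add it) can be sketched as follows with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the dtproperties table is reduced to four columns with guessed types, the uvalue and lvalue columns are omitted for brevity, and set_property_by_id is a hypothetical stand-in, not the actual stored procedure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for dtproperties; column types are assumptions.
conn.execute(
    """
    CREATE TABLE dtproperties (
        objectid INTEGER,
        property TEXT,
        value    TEXT,
        version  INTEGER NOT NULL DEFAULT 0
    )
    """
)


def set_property_by_id(conn, obj_id, prop, value):
    """Hypothetical sketch of the reset-or-add logic described in Table 1."""
    exists = conn.execute(
        "SELECT 1 FROM dtproperties WHERE objectid = ? AND property = ?",
        (obj_id, prop),
    ).fetchone()
    if exists:
        # Property exists: SELECTION on (objectid, property), then SET.
        conn.execute(
            "UPDATE dtproperties SET value = ?, version = version + 1 "
            "WHERE objectid = ? AND property = ?",
            (value, obj_id, prop),
        )
    else:
        # Property does not exist: INSERT a new row.
        conn.execute(
            "INSERT INTO dtproperties (objectid, property, value) "
            "VALUES (?, ?, ?)",
            (obj_id, prop, value),
        )


set_property_by_id(conn, 1, "name", "first")   # takes the INSERT branch
set_property_by_id(conn, 1, "name", "second")  # takes the UPDATE branch

rows = conn.execute(
    "SELECT objectid, property, value, version FROM dtproperties"
).fetchall()
print(rows)
```

The two calls exercise both branches: the first adds the row, the second resets its value and increments version, leaving a single row in the table.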
Most database products define functions, packages, or

erations can be very complex and should be modeled in the

[Diagram: after the step "Declare and assign variables", an alt fragment guarded by [property exists] with two branches. Branch [exist]: an UPDATE frame with SELECTION:objectid=@id AND property=@property and SET:value=@value, uvalue=@uvalue, lvalue=@lvalue, version=version+1. Branch [not exist]: an INSERT frame with SET:property=@property, objectid=@id, value=@value, uvalue=@uvalue, lvalue=@lvalue.]

[4] Edgar F. Codd, "A relational model of data for large shared data banks", Communications of the ACM, Vol. 13, No. 6, June 1970, pp. 377-387.

[5] Hans-Erik Eriksson, et al., UML 2 Toolkit, Wiley Publishing, Inc., Indianapolis, Indiana, USA, 2004.

[6] Gordon C. Everest, Database Management: Objectives, System Functions, and Administration, McGraw-Hill, New York, USA, 1986.

[7] Martin Gruber, Understanding SQL, SYBEX Inc., Alameda, California, USA, 1990.

[8] Davor Gornik, "UML Data Modeling Profile", White Paper, Rational Software, May 2002.