Relational Database Operations Modeling With UML


Shuxin Yin Indrakshi Ray


Department of Computer Science
Colorado State University
{yin, iray}@cs.colostate.edu


Abstract

Many existing software applications combine a complex application layer implemented in object-oriented (OO) programming languages with a relational database system as the back-end data store. Modeling the whole system in a consistent manner helps developers and end users better understand the application. In some legacy systems the application layer and the database layer are tightly coupled; however, most people model the two layers with UML and ER modeling respectively, which creates many inconsistencies. Moreover, database operations cannot be properly modeled using ER modeling. In this work we present an extension to the UML Data Modeling Profile and use concrete examples to illustrate how to model relational database operations using UML. Atomic database operations are modeled based on our framework and are used as building blocks to model more complex database operations.

1. Introduction

In the present world of software development, software applications are becoming more and more complex. Most enterprise software systems include presentation and application layers implemented in object-oriented programming languages. Many of these applications process huge amounts of data stored in database systems. In fact, many legacy systems still use a relational database management system (RDBMS). Because of the dominant position of RDBMS, this hybrid model is still used in many newly developed applications. The Unified Modeling Language (UML) has become widely used for modeling object-oriented systems such as J2EE and .NET applications. Yet the Entity-Relationship (ER) model still dominates relational database application modeling. The two techniques are not fully compatible, and the confusion that arises among developers and end users when both are used at the same time affects overall system development and maintenance. Modeling the whole application using one standard will greatly enhance program understandability and facilitate communication between the application development team and the database team.

Relational database systems play an important role in enterprise software applications. Business information is processed and manipulated at the database level before being used by applications. These database operations, implemented in packages, stored procedures, triggers and functions, can be very complex. Modeling these operations improves the understandability, reusability and maintainability of the database system. ER modeling captures only the static schema and cannot model dynamic operations. The current UML modeling standard does not fully address this problem either. There are many kinds of database systems; we limit our discussion to relational databases in this paper.

With the development of programming languages and software engineering techniques, software development has evolved from simple monolithic applications into powerful systems which can contain millions of lines of code. No matter what programming languages we use, there are still plenty of cases where application code and database code are coupled together. We need to reverse engineer these systems for various reasons; modeling this kind of hybrid system properly will improve the understandability of the whole system and facilitate future system enhancement.

Most enterprise-level applications involve end users, business analysts, an application development team and a database team; modeling the whole system using one standard will make communication among team members much easier. Collaboration and cooperation have always been a key aspect of overall system success. The UML gives us the ability to model, in a single language, the business, application, database, and architecture of the system. By having one single language, everybody involved can communicate their thoughts, ideas, and requirements [11].

The UML Data Modeling Profile [8][16] was proposed by Rational Software from IBM. The use of the Data Modeling Profile for the UML has helped open the UML to database design [11]. UML can be used to model relational database
schema and it is more expressive than ER modeling. Yet the current UML database modeling techniques mainly focus on static schema modeling. Dynamic database operations are modeled in an ad hoc manner.

The major contribution of this work is to identify the weakness of the UML Data Modeling Profile, i.e., its lack of ability to model operations. We propose a framework for modeling operations at the database level. SQL is essentially built upon relational theory. We show how to model atomic operations in the SQL Data Manipulation Language (DML) based on relational algebra and set theory.

SQL is a declarative language; it allows us to express what we want without going into the details about where it is located or how to get it [7]. We are not trying to model the internal execution details such as parsing the SQL statement, validating it, optimizing it, generating an execution plan and executing that plan. These details should not be exposed to application developers. On the other hand, modeling database operations at the SQL statement level makes it easy for end users to understand them.

The rest of this paper is organized as follows. We describe how to model atomic SQL DML constructs in Section 2. We use an example to show how to model complex database operations in Section 3. Finally we present our conclusions in Section 4.

2. Atomic DML Operations Modeling

We use the example of an Air Toxics Data Archive to explain our ideas. Figure 1 is an example of a UML data model generated by Rational Rose. There are two stored procedures in this example, indicated by the stereotype <<SP Container>>. Using Rational Rose, we can model stored procedures using activity diagrams and state machine diagrams. However, actions and states are then represented in English, which is ambiguous. Moreover, this representation does not capture the relationships between the tables, the SQL DML and the actions in the activity diagram.

    Pollutant
        PK  PollutantID : INT
            PollutantName : VARCHAR(50)
            CASNumber : VARCHAR(50)
        <<PK>> PK_Pollutant()

    Concentration
        PK  ConcentrationID : INT
            SiteID : INT
            PollutantID : INT
            ObservationDate : DATETIME
            ConcentrationValue : FLOAT
        <<FK>> FK_Concentration3()
        <<FK>> FK_Concentration2()
        <<PK>> PK_Concentration()

    Site
        PK  SiteID : INT
            SiteName : VARCHAR(50)
            Address : VARCHAR(50)
            City : VARCHAR(50)
            State : VARCHAR(2)
            Zip : CHAR(5)
            Latitude : FLOAT
            Longitude : FLOAT
        <<PK>> PK_Site()

    <<SP Container>> SP_1
        <<SP>> ReportByPollutant(PollutantID:INT):Void
        <<SP>> ReportBySite(SiteID:INT):Void

    Associations: Pollutant (1) to Concentration (0..*); Site (1) to Concentration (0..*)

Figure 1. Air toxics data archive UML data model

2.1. Relational Theory

Relational database theory is based on the Relational Model [4] and set theory. According to the Relational Model, relational tables are sets. The rows of a table can be considered as elements of the set. Operations on sets can also be performed on relational tables. The eight relational operations are union, difference, intersection, product, projection, selection, join and division. SQL translates relational theory into practice. SQL is a language that is a loose implementation of relational theory, and has been further modified in its actual implementation by the RDBMS software that uses it [7]. We will use the operations defined in relational theory to model SQL DML operations.

2.2. SQL Data Manipulation Language

Essentially all database operations are implemented using atomic SQL data manipulation language constructs such as INSERT, UPDATE, DELETE, and SELECT. In this section we model the standard SQL DML constructs and use them as building blocks to model more complex operations in the next section.

An RDBMS has no notion of objects, and operations are separated from tables. Nevertheless, we can think of a row as an anonymous object of a class defined by the table. There are four predefined operations upon this object: INSERT, UPDATE, DELETE and SELECT.

2.3. Insert Operation Modeling

The INSERT statement is the simplest operation in DML. Here is an example:

    INSERT INTO Pollutant (PollutantID, PollutantName, CASNumber)
    VALUES (1, 'Ethylbenzene', '100414')

At a higher level, we can model this operation using a use case diagram as shown in Figure 2. An insert operation
can also be modeled using a sequence diagram, communication diagram, activity diagram, interaction overview diagram or state machine diagram. Figure 3 is a sequence diagram example.

    [Actor] ---- (Add a new Pollutant), inside the System boundary

Figure 2. Insert operation use case diagram

    Lifeline :Pollutant
    Framed part INSERT:
        SET: PollutantID = 1,
             PollutantName = 'Ethylbenzene',
             CASNumber = '100414'

Figure 3. Insert operation sequence diagram

An anonymous stereotyped object is used to represent a data row in the Pollutant table. The INSERT operation is considered a predefined operation of the Pollutant table. We encapsulate the whole statement in a named framed part called INSERT; framed parts are a newly added feature in UML 2.0 [18]. SET is considered an atomic operation in our model which assigns values to a single row or a data set. The SET operation is treated as a synchronous operation here. A communication diagram is shown in Figure 4. Again we use an anonymous stereotyped object to represent the Pollutant table. The same operation can be represented using a state machine diagram or an activity diagram as well.

    1:INSERT --> :Pollutant

Figure 4. Insert operation communication diagram

2.4. Update Operation Modeling

The UPDATE operation can be more complex than the INSERT operation since it can include an optional WHERE clause. In that case, an update operation can be decomposed into the following steps: 1) Selection: find the rows to be updated based on the specified conditions; 2) Set: replace the old value with the new value. Here is an example:

    UPDATE Pollutant SET CASNumber = '103333'
    WHERE PollutantID = 2

In this example, we use a framed part UPDATE to represent the update statement as a whole and use SELECTION and SET as the individual steps that implement the statement, as shown in the sequence diagram in Figure 5. SELECTION has the same semantic meaning as defined in relational theory. Again, SET is an atomic operation which updates a data set. An equivalent communication diagram is shown in Figure 6. Multiple columns can be modified in one statement; in this case, we separate the expressions using commas, the same as in SQL. The label <<become>> means the Pollutant data set is changed.

    Lifeline :Pollutant
    Framed part UPDATE:
        SELECTION: PollutantID = 2
        SET: CASNumber = '103333'

Figure 5. Update operation sequence diagram

    1:UPDATE
        1.1:SELECTION: PollutantID = 2
        1.2:SET CASNumber = '103333'
    :Pollutant <<become>> :Pollutant

Figure 6. Update operation communication diagram

2.5. Delete Operation Modeling

The delete operation is similar to the update operation except that the rows are removed from the table and a DESTROY operation is executed upon the table. A typical delete statement is listed below; its sequence diagram and communication diagram are shown in Figure 7 and Figure 8.

    DELETE FROM Pollutant
    WHERE PollutantID = 11
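The decomposition of INSERT, UPDATE and DELETE into SELECTION, SET and DESTROY steps can be illustrated with a minimal Python sketch. It is not part of the proposed UML notation: it assumes a table is an in-memory list of rows, and the function names and the placeholder rows with IDs 2 and 11 are hypothetical, chosen only so that the example statements in the text have something to act on.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]
Predicate = Callable[[Row], bool]


def selection(table: Table, predicate: Predicate) -> List[Row]:
    """SELECTION: return the rows satisfying the predicate; the table is unchanged."""
    return [row for row in table if predicate(row)]


def set_values(rows: List[Row], values: Row) -> None:
    """SET: assign the given column values to every row of the data set."""
    for row in rows:
        row.update(values)


def destroy(table: Table, rows: List[Row]) -> None:
    """DESTROY: remove the selected rows from the table."""
    doomed = {id(row) for row in rows}
    table[:] = [row for row in table if id(row) not in doomed]


def insert(table: Table, values: Row) -> None:
    """INSERT framed part: create an anonymous row, then SET its columns."""
    row: Row = {}
    set_values([row], values)
    table.append(row)


def update(table: Table, predicate: Predicate, values: Row) -> None:
    """UPDATE framed part: 1) SELECTION, 2) SET."""
    set_values(selection(table, predicate), values)


def delete(table: Table, predicate: Predicate) -> None:
    """DELETE framed part: 1) SELECTION, 2) DESTROY."""
    destroy(table, selection(table, predicate))


pollutant: Table = []
insert(pollutant, {"PollutantID": 1, "PollutantName": "Ethylbenzene",
                   "CASNumber": "100414"})
# Hypothetical placeholder rows (not from the paper) for the UPDATE and DELETE examples.
insert(pollutant, {"PollutantID": 2, "PollutantName": "Placeholder-2",
                   "CASNumber": "000000"})
insert(pollutant, {"PollutantID": 11, "PollutantName": "Placeholder-11",
                   "CASNumber": "000000"})

update(pollutant, lambda r: r["PollutantID"] == 2, {"CASNumber": "103333"})
delete(pollutant, lambda r: r["PollutantID"] == 11)
```

In this sketch each framed part is a function whose body lists its steps in order, which mirrors how SELECTION, SET and DESTROY appear as nested messages inside the INSERT, UPDATE and DELETE frames of the diagrams.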
    Lifeline :Pollutant
    Framed part DELETE:
        SELECTION: PollutantID = 11
        DESTROY

Figure 7. Delete operation sequence diagram

    1:DELETE
        1.1:SELECTION: PollutantID = 11
        1.2:DESTROY
    :Pollutant <<become>> :Pollutant

Figure 8. Delete operation communication diagram

Here we use DESTROY in place of SET. DESTROY is considered another predefined operation upon a table; it deletes the data set produced by SELECTION.

2.6. Select Operation Modeling

A select operation can be very complex, involving nested queries, outer joins, etc. The simplest select operation includes SELECT, FROM and WHERE clauses. We define additional operations to model its internal execution. PROJECTION is another implicit operation which has the same semantic meaning as in relational theory. Both SELECTION and PROJECTION are executed upon a data set. These two operations do not change the state of the original data set.

A typical select statement can also include GROUP BY, HAVING, and ORDER BY clauses. Here is another example; its sequence diagram is shown in Figure 9.

    SELECT SiteID, PollutantID,
           AVG(ConcentrationValue)
    FROM Concentration
    GROUP BY SiteID, PollutantID
    HAVING SiteID >= 1
    ORDER BY AVG(ConcentrationValue)

    Framed part SELECT:
        GROUP BY: SiteID, PollutantID
        HAVING SiteID >= 1
        PROJECTION: SiteID, PollutantID, AVG(ConcentrationValue)
        ORDER BY: AVG(ConcentrationValue)

Figure 9. Sequence diagram for select statement with GROUP BY

Let us look at another example, which contains a join operation.

    SELECT P.PollutantName, C.ObservationDate,
           C.SiteID, C.ConcentrationValue
    FROM Concentration C, Pollutant P
    WHERE C.PollutantID = P.PollutantID

    Lifelines C:Concentration and P:Pollutant
    Framed part SELECT:
        JOIN: C.PollutantID = P.PollutantID
        SELECTION: P.PollutantName, C.SiteID,
                   C.ObservationDate, C.ConcentrationValue

Figure 10. Sequence diagram for select statement containing JOIN operation

We model the JOIN operation using another framed part, as shown in Figure 10. The join condition and the selection list are listed inside the box. C:Concentration defines an alias; it is not a named object of the kind normally found in OO modeling. Other JOIN conditions can be modeled similarly. A subquery can be modeled using a nested framed part.

The union operator combines two union-compatible queries. Its semantic meaning is the same as in relational theory. Here is an example using union.

    SELECT PollutantID FROM Pollutant
    WHERE PollutantName LIKE 'A%'
    UNION
    SELECT PollutantID FROM Pollutant
    WHERE PollutantName LIKE 'B%'

We model UNION as another framed part, as shown in Figure 11. A horizontal dotted line in the compartment means that the two operations inside the compartment can be executed in parallel. Similarly, we can model INTERSECTION, DIFFERENCE and PRODUCT by defining new framed parts.
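The relational reading of SELECTION, PROJECTION, JOIN and UNION used in these diagrams can be sketched in Python as follows. This is only an illustration under stated assumptions: rows are plain dictionaries, the helper names are hypothetical, the sample rows (including "Acrolein" and "Benzene") are invented for the example, and `LIKE 'A%'` is approximated with `str.startswith`.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Dict[str, object]


def selection(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """SELECTION: keep the rows satisfying the predicate; the input is unchanged."""
    return [row for row in rows if predicate(row)]


def projection(rows: List[Row], columns: List[str]) -> List[Tuple]:
    """PROJECTION: keep only the listed columns of every row."""
    return [tuple(row[c] for c in columns) for row in rows]


def join(left: List[Row], left_alias: str, right: List[Row], right_alias: str,
         condition: Callable[[Row, Row], bool]) -> List[Row]:
    """JOIN: pair rows that satisfy the condition, qualifying columns by alias."""
    result = []
    for l in left:
        for r in right:
            if condition(l, r):
                merged = {f"{left_alias}.{k}": v for k, v in l.items()}
                merged.update({f"{right_alias}.{k}": v for k, v in r.items()})
                result.append(merged)
    return result


def union(a: List[Tuple], b: List[Tuple]) -> Set[Tuple]:
    """UNION: set union of two union-compatible results (duplicates removed)."""
    return set(a) | set(b)


# Hypothetical sample data, not taken from the paper.
pollutant = [
    {"PollutantID": 1, "PollutantName": "Ethylbenzene"},
    {"PollutantID": 2, "PollutantName": "Acrolein"},
    {"PollutantID": 3, "PollutantName": "Benzene"},
]
concentration = [
    {"ConcentrationID": 10, "SiteID": 1, "PollutantID": 3,
     "ObservationDate": "2004-01-01", "ConcentrationValue": 0.5},
]

# UNION example: the two SELECT frames are independent of each other.
ids_a = projection(selection(pollutant, lambda r: r["PollutantName"].startswith("A")),
                   ["PollutantID"])
ids_b = projection(selection(pollutant, lambda r: r["PollutantName"].startswith("B")),
                   ["PollutantID"])
union_ids = union(ids_a, ids_b)

# JOIN example: Concentration C joined with Pollutant P on PollutantID.
joined = join(concentration, "C", pollutant, "P",
              lambda c, p: c["PollutantID"] == p["PollutantID"])
report = projection(joined, ["P.PollutantName", "C.ObservationDate",
                             "C.SiteID", "C.ConcentrationValue"])
```

Because neither branch of the union reads the other's result, the two selections could be evaluated in either order or concurrently, which is the property the dotted line in the UNION frame is meant to express.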
    Lifeline :Pollutant
    Framed part UNION:
        SELECT
            SELECTION: PollutantName LIKE 'A%'
            PROJECTION: PollutantID
        - - - - - - - - - - - - - - - - - -
        SELECT
            SELECTION: PollutantName LIKE 'B%'
            PROJECTION: PollutantID

Figure 11. Union operation sequence diagram

3. Database Operations Modeling

Most database products define functions, packages, or stored procedures to isolate portions of operations, hide implementation details and improve performance. These operations can be very complex and should be modeled in the same way as application-level operations. We can use a use case to represent an operation at the requirement analysis level and use activity diagrams, sequence diagrams, communication diagrams and state machine diagrams at the design and implementation levels. Here is a system stored procedure, dt_setpropertybyid, used in SQL Server 7.0. It is implemented in Transact-SQL.

    CREATE PROCEDURE dbo.dt_setpropertybyid
        @id int,
        @property varchar(64),
        @value varchar(255),
        @lvalue image
    as
        set nocount on
        declare @uvalue nvarchar(255)
        set @uvalue = convert(nvarchar(255), @value)
        if exists (SELECT * FROM dbo.dtproperties
                   WHERE objectid=@id AND property=@property)
        begin
            UPDATE dbo.dtproperties
            SET value=@value, uvalue=@uvalue,
                lvalue=@lvalue, version=version+1
            WHERE objectid=@id AND property=@property
        end
        else
        begin
            INSERT dbo.dtproperties (property,
                objectid, value, uvalue, lvalue)
            VALUES (@property, @id, @value,
                @uvalue, @lvalue)
        end

The above stored procedure can be modeled using a use case as shown in Table 1. We can use an activity diagram (Figure 12), a sequence diagram and other diagrams to model this stored procedure at the requirement, analysis, design and implementation levels. By combining all these diagrams, we obtain a much better understanding of complex database operations.

Table 1. Stored procedure dt_setpropertybyid use case

    Operation Name      dt_setpropertybyid
    Owner               dbo
    Operation Type      Stored procedure
    Involved Tables     dtproperties
    Overview            If the property already exists, reset the value;
                        otherwise add the property
    Precondition        dtproperties table has been created
    Postcondition       A new record is added to the dtproperties table if
                        the property does not already exist; otherwise, the
                        property is reset.
    Input Parameters    @id int, @property varchar(64),
                        @value varchar(255), @lvalue image
    Output Parameters   None
    Return Value        None

    Declare and assign variables
        [property exists]         --> Update property, version++
        [property doesn't exist]  --> Insert property

Figure 12. Stored procedure activity diagram

Some CASE tools such as Rational Rose provide the functionality to attach activity diagrams and state machine diagrams to stored procedures, but forward and reverse engineering capability has not been implemented yet. Figure 13 shows the sequence diagram at the implementation level. We use Select, Update and Insert statements as building blocks
and place them inside a predefined framed part such as alt to represent conditional execution. Input and output parameters are listed in a separate compartment below the diagram name.

    sd dt_setpropertybyid
    Parameters: @id int, @property varchar(64),
                @value varchar(255), @lvalue image
    Lifeline :dtproperties
    Framed part SELECT:
        SELECTION: objectid=@id AND property=@property
    Framed part alt:
        [exist]
            UPDATE
                SELECTION: objectid=@id AND property=@property
                SET: value=@value, uvalue=@uvalue,
                     lvalue=@lvalue, version=version+1
        [not exist]
            INSERT
                SET: property=@property, objectid=@id,
                     value=@value, uvalue=@uvalue, lvalue=@lvalue

Figure 13. Stored procedure sequence diagram

4. Conclusion

In this paper, we proposed a framework that can be used to model database-related operations using a set of UML diagrams. We believe that this approach will provide end users and developers with a unified view of the whole system and bring the power of UML to the database domain. As the next step of our research, we plan to analyze more complex database operations in detail. We will formalize our model using the UML metamodel. We hope to develop algorithms for reverse engineering existing applications. Eventually tools can be built based on these algorithms.

References

[1] Andreas Behm, et al., "On Migration of Relational Schemas and Data to Object-oriented Database Systems", Proceedings of the 5th International Conference on Re-Technologies for Information Systems, December 1997, pp.13-33.

[2] Thomas A. Bruce, Designing Quality Databases with IDEF1X Information Models, Dorset House Publishing Company, Incorporated, New York, USA, October 1992.

[3] Peter Chen, "The Entity-Relationship model - Toward a unified view of data", ACM Transactions on Database Systems, Vol.1, No.1, March 1976, pp.9-36.

[4] Edgar F. Codd, "A relational model of data for large shared data banks", Communications of the ACM, Vol.13, No.6, June 1970, pp.377-387.

[5] Hans-Erik Eriksson, et al., UML 2 Toolkit, Wiley Publishing, Inc., Indianapolis, Indiana, USA, 2004.

[6] Gordon C. Everest, Database Management: Objectives, System Functions, and Administration, McGraw-Hill, New York, USA, 1986.

[7] Martin Gruber, Understanding SQL, SYBEX Inc., Alameda, California, USA, 1990.

[8] Davor Gornik, "UML Data Modeling Profile", White Paper, Rational Software, May 2002.

[9] Terry Halpin and Anthony Bloesch, "Data modeling in UML and ORM: a comparison", Journal of Database Management, Vol.10, No.4, Idea Group Publishing Company, Hershey, USA, October 1999, pp.4-13.

[10] Robert J. Muller, Database Design for Smarties: Using UML for Data Modeling, Morgan Kaufmann Publishers, Inc., San Francisco, California, USA, 1999.

[11] Eric J. Naiburg and Robert A. Maksimchuk, UML for Database Design, Addison-Wesley, Boston, MA, USA, 2001.

[12] William J. Premerlani, et al., "An Approach for Reverse Engineering of Relational Databases", Communications of the ACM, Vol.37, No.5, May 1994, pp.42-49.

[13] Shekar Ramanathan, et al., "Reverse Engineering Relational Schemas to Object-oriented Schemas", Technical Report No. MSU-960701, Mississippi State University, 1996.

[14] James Rumbaugh, Ivar Jacobson, and Grady Booch, The Unified Modeling Language Reference Manual, Addison-Wesley, Reading, MA, USA, 1999.

[15] Toby J. Teorey, Dongqing Yang, and James P. Fry, "A logical design methodology for relational databases using the extended entity-relationship model", Computing Surveys, Vol.18, Issue 2, June 1986, pp.198-222.

[16] The UML and Data Modeling, White Paper, Rational Software, 2003.

[17] Unified Modeling Language specification, Object Management Group, http://www.omg.com

[18] Unified Modeling Language: Superstructure 2.0 pct/03-07-06, Object Management Group, 2003.
