
Do More With Your Data

Measuring Data Management Practice Maturity:
A Community's Self-Assessment

Data Management & Information Quality, Philadelphia, November 2008
By Peter A. Aiken, PhD, Associate Professor, Virginia Commonwealth University, and Founding Director of Data Blueprint

About Data Blueprint

Data Blueprint leverages decades of data management (DM) experience to develop and apply client-designed solutions that address information management requirements. Data Blueprint clients gain immediate benefits from skillfully integrated and packaged data expertise designed to enhance organizational data management, moving organizations toward data empowerment with the following suite of products and services:

Data Assessments: We evaluate the current state of an organization and provide risk and opportunity information in Data Management Practices, Data Strategy, Data Quality Engineering, Data Risk, and Data Training.

Data Consulting/Engineering: We provide a continuum of specialized Data Blueprint skills in Enterprise Data Architecture, Data Quality, Audit, Cleansing, Migration, Integration, Metadata, XML, Business Process Analysis, and other strategic areas.

Data Instruction: We transfer knowledge to your organization through classroom training, educational courses and workshops (some with accredited university credits), mentoring, conferences, and research lectures. Our goal is to empower you to manage your own data.

Data Blueprint's methodology enables clients to learn and master processes, facilitated by our products and services, to achieve enhanced data management. The result of our team approach is the empowerment of both your data and your organization to successfully create and maintain premier data management capabilities. The value to the organization is a positive return on your investment in data, one of your most critical assets.

"Data Blueprint has developed a unique technical approach; further, they have implemented an extremely difficult technical solution, a technical feat that no contractor has been able to implement in the more than three decades that our system has been running."
Tony Berta, Program Manager
Technical Quality, Headquarters Defense Logistics Agency

Do more with your data. Call +1.804.521.4056 or email info@datablueprint.com.

datablueprint.com | info@datablueprint.com | phone +1.804.521.4056 | fax +1.804.521.4004
Maggie Walker Business & Technology Center, 501 E Franklin St, Ste 414, Richmond, VA 23219
RESEARCH FEATURE

Measuring Data Management Practice Maturity:
A Community's Self-Assessment

Peter Aiken, Virginia Commonwealth University/Institute for Data Research
M. David Allen, Data Blueprint
Burt Parker, Independent consultant
Angela Mattia, J. Sergeant Reynolds Community College

Increasing data management practice maturity levels can positively impact the
coordination of data flow among organizations, individuals, and systems. Results
from a self-assessment provide a roadmap for improving organizational data
management practices.

As increasing amounts of data flow within and between organizations, the problems that can result from poor data management practices are becoming more apparent. Studies have shown that such poor practices are widespread. For example:

- PricewaterhouseCoopers reported that in 2004, only one in three organizations were highly confident in their own data, and only 18 percent were very confident in data received from other organizations. Further, just two in five companies have a documented board-approved data strategy (www.pwc.com/extweb/pwcpublications.nsf/docid/15383D6E748A727DCA2571B6002F6EE9).
- Michael Blaha1 and others in the research community have cited past organizational data management education and practices as the cause for poor database design being the norm.
- According to industry pioneer John Zachman,2 organizations typically spend between 20 and 40 percent of their information technology budgets evolving their data via migration (changing data locations), conversion (changing data into other forms, states, or products), or scrubbing (inspecting and manipulating, recoding, or rekeying data to prepare it for subsequent use).
- Approximately two-thirds of organizational data managers have formal data management training; slightly more than two-thirds of organizations use or plan to apply formal metadata management techniques; and slightly fewer than one-half manage their metadata using computer-aided software engineering tools and repository technologies.3

When combined with our personal observations, these results suggest that most organizations can benefit from the application of organization-wide data management practices. Failure to manage data as an enterprise-, corporate-, or organization-wide asset is costly in terms of market share, profit, strategic opportunity, stock price, and so on. To the extent that world-class organizations have shown that opportunities can be created through the effective use of data, investing in data as the only organizational asset that can't be depleted should be of great interest.

Computer, published by the IEEE Computer Society, December 2006. 0018-9162/06/$20.00 © 2006 IEEE


Table 1. Data management processes.4

- Data program coordination (focus: direction). Provide appropriate data management process and technological infrastructure. Data type: program data, the descriptive propositions or observations needed to establish, document, sustain, control, and improve organizational data-oriented activities (such as vision, goals, policies, and metrics).
- Organizational data integration (focus: direction). Achieve organizational sharing of appropriate data. Data type: development data, the descriptive facts, propositions, or observations used to develop and document the structures and interrelationships of data (for example, data models, database designs, and specifications).
- Data stewardship (focus: direction and implementation). Achieve business-entity subject area data integration. Data type: stewardship data, the descriptive facts about data documenting semantics and syntax (such as name, definition, and format).
- Data development (focus: implementation). Achieve data sharing within a business area. Data type: business data, the facts and their constructs used to accomplish enterprise business activities (such as data elements, records, and files).
- Data support operations (focus: implementation). Provide reliable access to data.
- Data asset use (focus: implementation). Leverage data in business activities.

DATA MANAGEMENT DEFINITION AND EVOLUTION

As Table 1 shows, data management consists of six interrelated and coordinated processes, primarily derived by Burt Parker from sponsored research he led for the US Department of Defense at the MITRE Corporation.4

Figure 1 supports the similarly standardized definition: enterprise-wide management of data is "understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities."4

The figure illustrates how organizational strategies guide other data management processes. Two of these processes, data program coordination and organizational data integration, provide direction to the implementation processes: data development, data support operations, and data asset use. The data stewardship process straddles the line between direction and implementation. All processes exchange feedback designed to improve and fine-tune overall data management practices.

[Figure 1. Interrelationships among data management processes (adapted from Burt Parker's earlier work4). Blue lines indicate guidance, red lines indicate feedback, and green lines indicate data.]

Data management has existed in some form since the 1950s and has been recognized as a discipline since the 1970s. Data management is thus a young discipline compared to, for example, the relatively mature accounting practices that have been practiced for thousands of years. As Figure 2 shows, data management's scope has expanded over time, and this expansion continues today.

[Figure 2. Data management's growth over time. The discipline has expanded from an initial focus on database development and operation in the 1950s to 1970s to include additional responsibilities in the periods 1970-1990, 1990-2000, and 2000 to the present: data requirements analysis, data modeling, enterprise data management coordination, enterprise data integration, enterprise data stewardship, enterprise data use, an explicit focus on data quality throughout, security, compliance, and other responsibilities.]

Ideally, organizations derive their data management requirements from enterprise-wide information and functional user requirements. Some of these requirements come from legacy systems and off-the-shelf software packages. An organization derives its future data requirements from an analysis of what it will deliver, as well as future capabilities it will need to implement organizational strategies. Data management guides the transformation of strategic organizational information needs into specific data requirements associated with particular technology system development projects.

All organizations have data architectures, whether explicitly documented or implicitly assumed. An important data management process is to document the architecture's capabilities, making it more useful to the organization.

In addition, data management

- must be viewed as a means to an end, not the end itself. Organizations must not practice data management as an abstract discipline, but as a process supporting specific enterprise objectives, in particular, to provide a shared-resource basis on which to build additional services;
- involves both process and policy. Data management tasks range from strategic data planning to the creation of data element standards to database design, implementation, and maintenance;
- has a technical component: interfacing with and facilitating interaction between software and hardware;
- has a specific focus: creating and maintaining data to provide useful information; and
- includes management of metadata, artifacts that address the data's form as well as its content.

Although data management serves the organization, the organization often doesn't appreciate the value it provides. Some data management staffs keep ahead of the layoff curve by demonstrating positive business value. Management's short-term focus has often made it difficult to secure funding for medium- and long-term data management investments. Tracing the discipline's efforts to direct and indirect organizational benefits has been difficult, so it hasn't been easy to present an articulate business case to management that justifies subsequent strategic investments in data management.

Viewing data management as a collection of processes, each with a role that provides value to the organization through data, makes it easier to trace value through those processes and point not only to a methodological "why" of data management practice improvement but also to a specific, concrete "how."

RESEARCH BASIS

Mark Gillenson has published three papers that serve as an excellent background to this research.5-7 Like earlier works, Gillenson focuses on the implementation half of Figure 1, adopting a more narrow definition of data administration. Over time, his work paints a picture of an industry attempting to catch up with technological implementation. Our work here updates and confirms his basic conclusions while changing the focus from whether a process is performed to the maturity with which it is performed.

Three other works also influenced our research: Ralph Keeney's value-focused thinking,8 Richard Nolan's six-stage theory of data processing,9 and the Capability Maturity Model Integration (CMMI).10,11

Keeney's value-focused thinking provides a methodological approach to analyzing and evaluating the various aspects of data management and their associated key process areas. We wove the concepts behind means and fundamental objectives into our assessment's construction to connect how we measure data management with what customers require from it.

In Stage VI of his six-stage theory of data processing, Nolan defined maturity as data resource management. Although Nolan's theory predates and is similar to the CMMI, it contains several ideas that we adapted and reused in the larger data management context. However, CMMI refinement remains our primary influence.

Most technologists are familiar with the CMM (and its upgrade to the CMMI), developed at Carnegie Mellon's Software Engineering Institute with assistance from the MITRE Corporation.10,11 The CMMI itself was derived from work that Ron Radice and Watts Humphrey performed while at IBM. Dennis Goldenson and Diane Gibson presented results pointing to a link between CMMI process maturity and organizational success.12 In addition, Cyndy Billings and Jeanie Clifton demonstrated the long-term effects for organizations that successfully sustain process improvement for more than a decade.13

CMMI-based maturity models exist for human resources, security, training, and several other areas of the software-related development process. Our colleague,

Brett Champlin, contributed a list of dozens of maturity measurements derived from or influenced by the CMMI. This list includes maturity measurement frameworks for data warehousing, metadata management, and software systems deployment. The CMMI's successful adoption in other areas encouraged us to use it as the basis for our data management practice assessment.

Whereas the core ideas behind the CMMI present a reasonable base for data management practice maturity measurement, we can avoid some potential pitfalls by learning from the revisions and later work done with the CMMI. Examples of such improvements include general changes to how the CMMI makes interrelationships between process areas more explicit and how it presents results to a target organization.

Work by Cynthia Hauer14 and Walter Schnider and Klaus Schwinn15 also influenced our general approach to a data management maturity model. Hauer nicely articulated some examples of the value determination factors and results criteria that we have adopted. Schnider and Schwinn presented a rough but inspirational outline of what mature data management practices might look like and the accompanying motivations.

RESEARCH OBJECTIVES

Our research had six specific objectives, which we grouped into two types: community descriptive goals and self-improvement goals.

Community descriptive research goals help clarify our understanding of the data management community and associated practices. Specifically, we want to understand

- the range of practices within the data management community;
- the distribution of data management practices, specifically the various stages of organizational data management maturity; and
- the current state of data management practices: in what areas are the community data management practices weak, average, and strong?

Self-improvement research goals help the community as a whole improve its collective data management practices. Here, we desire to

- better understand what defines current data management practices;
- determine how the assessment informs our standing as a technical community (specifically, how does data management compare to software development?); and
- gain information useful for developing a roadmap for improving current practice.

The CMMI's stated goals are almost identical to ours: "[The CMMI] was designed to help developers select process-improvement strategies by determining their current process maturity and identifying the most critical issues to improving their software quality and process."10 Similarly, our goal was to aid data management practice improvement by presenting a scale for measuring data management accomplishments. Our assessment results can help data managers identify and implement process improvement strategies by recognizing their data management challenges.

DATA COLLECTION PROCESS AND RESEARCH TARGETS

Between 2000 and 2006, we assessed the data management practices of 175 organizations. Table 2 provides a breakdown of organization types.

Table 2. Organizations included in data management analysis, by type.

- Local government: 4 percent
- State government: 17 percent
- Federal government: 11 percent
- International organization: 10 percent
- Commercial organization: 58 percent

Students from some of our graduate and advanced undergraduate classes largely conducted the assessments. We provided detailed assessment instruction as part of the course work. Assessors used structured telephone and in-person interviews to assess specific organizational data management practices by soliciting evidence of processes, products, and common features. Key concepts sought included the presence of commitments, abilities, measurements, verification, and governance.

Assessors conducted the interviews with the person identified as having the best, firsthand knowledge of organizational data management practices. Tracking down these individuals required much legwork; identifying these individuals was often more difficult than securing the interview commitment.

The assessors attempted to locate evidence in the organization indicating the existence of key process areas within specific data management practices. During the evaluation, assessors observed strict confidentiality: they reported only compiled results, with no mention of specific organizations, individuals, groups, programs, or projects. Assessors and participants kept all information to themselves and observed proprietary rights, including several nondisclosure agreements.

All organizations implement their data management practice in ways that can be classified as one of five maturity model levels, detailed in Table 3. Specific evidence, organized by maturity level, helped identify the level of data management practiced.
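As a quick sanity check on Table 2, the reported shares of the 175 assessed organizations cover the whole sample. A minimal sketch (the dictionary layout is ours; the percentages are the article's):

```python
# Table 2's breakdown of the 175 assessed organizations, by type.
SHARES_PERCENT = {
    "Local government": 4,
    "State government": 17,
    "Federal government": 11,
    "International organization": 10,
    "Commercial organization": 58,
}

# The five organization types account for the full sample.
print(sum(SHARES_PERCENT.values()))  # -> 100
```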

Table 3. Data management practice assessment levels.

Level 1: Initial. The organization lacks the necessary processes for sustaining data management practices; data management is characterized as ad hoc or chaotic. The organization depends entirely on individuals, with little or no corporate visibility into cost or performance, or even awareness of data management practices. There is variable quality, low results predictability, and little to no repeatability.

Level 2: Repeatable. The organization might know where data management expertise exists internally and has some ability to duplicate good practices and successes. The organization exhibits variable quality with some predictability; the best individuals are assigned to critical projects to reduce risk and improve results.

Level 3: Defined. The organization uses a set of defined processes, which are published for recommended use. Results are of good quality and within expected tolerances most of the time; the poorest individual performers improve toward the best performers, and the best performers achieve more leverage.

Level 4: Managed. The organization statistically forecasts and directs data management based on defined processes and selected cost, schedule, and customer satisfaction levels. The use of defined data management processes within the organization is required and monitored. Reliability and predictability of results, such as the ability to determine progress or six sigma versus three sigma measurability, is significantly improved.

Level 5: Optimizing. The organization analyzes existing data management processes to determine whether they can be improved, makes changes in a controlled fashion, and reduces operating costs by improving current process performance or by introducing innovative services to maintain its competitive edge. The organization achieves high levels of results certainty.
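For readers who want to tabulate results in the same terms, Table 3's five levels map naturally onto a small lookup structure. This sketch is ours, not part of the published assessment instrument:

```python
# Table 3's five data management maturity levels, keyed by numeric rating.
MATURITY_LEVELS = {
    1: "Initial",
    2: "Repeatable",
    3: "Defined",
    4: "Managed",
    5: "Optimizing",
}

def level_name(rating):
    """Return the Table 3 name for a maturity rating of 1-5."""
    if rating not in MATURITY_LEVELS:
        raise ValueError(f"rating must be 1-5, got {rating!r}")
    return MATURITY_LEVELS[rating]
```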

For each data management process, the assessment used between four and six objective criteria to probe for evidence. Assessed outside the data collection process, the presence or absence of this evidence indicated organizational performance at a corresponding maturity level.

ASSESSMENT RESULTS

The assessment results reported for the various practice areas show that overall scores are repeatable (level 2) in all data management practice areas.

Figure 3 shows assessment averages of the individual response scores. We used a composite chart to group the averages by practice area. Such groupings facilitate numerous comparisons, which organizations can use to plan improvements to their data management practices. We present sample results (blue) for an assessed organization (disguised as "Mystery Airline"), whose management was interested in not only how the organization scored but also how it compared to other assessed airlines (red) and other organizations (white).

We grouped 19 individual responses according to the five data management maturity levels in the horizontal bar charts. Most numbers are averages. That is, for an individual organization, we surveyed multiple data management operations, combined the individual assessment results, and presented them as averages. We reported assessments of organizations with only one data management function as integers.

For example, the data program coordination practice area results include the following:

- Mystery Airline achieved level 1 on responses 1, 2, and 5, and level 2 on responses 3 and 4.
- The airline industry performed above both Mystery Airline and all respondents on responses 1 through 3.
- The airline industry performed below both Mystery Airline and all respondents on response 4, and well below all respondents and just those in the airline industry on response 5.

Figure 3f illustrates the range of results for all organizations surveyed for each data management process; for example, the assessment results for data program coordination ranged from 2.06 to 3.31.

The maturity measurement framework dictates that a data program can achieve no greater rating than the lowest rating achieved, hence the translation to the scores for Mystery Airline of 1, 2, 2, 2, and 2, combining for an overall rating of 1. This is congruent with CMMI application.

Although this might seem a tough standard, the rating reflects the adage that a chain is only as strong as its weakest link. Mature data management programs can't rely on immature or ad hoc processes in related areas. The lowest rating received becomes the highest possible

[Figure 3. Assessment results useful to Mystery Airline: (a) data program coordination, (b) enterprise data integration, (c) data stewardship, (d) data development, (e) data support operations, and (f) assessment ranges. Horizontal bars compare Mystery Airline, the airline industry, and all respondents on responses 1 through 19; panel (f) shows, for each process, the lowest and highest organizational averages (for data program coordination, 2.06 and 3.31).]

overall rating. This also explains why many organizations are at level 1 with regard to their software development practices. While the CMMI process results in a single overall rating for the organization, data management requires a more fine-grained feedback mechanism. Knowing that some data management processes perform better than others can help an organization develop incentives as well as a roadmap for improving individual ratings.

Taken as a whole, these numbers show that no data management process or subprocess measured on average higher than the data program coordination process, at 3.31. It's also the only data management process that performed on average at a defined level (greater than 3). The results show a community that is approaching the ability to repeat its processes across all of data management.

Results analysis

Perhaps the most important general fact represented in Figure 3 is that organizations gave themselves relatively low scores. The assessment results are based on self-reporting and, although our 15-percent validation sample is adequate to verify accurate industry-wide assessment results, 85 percent of the assessment is based on facts that were described but not observed. Although direct observables for all survey respondents would have provided valuable confirming evidence, the cost of such a survey and the required organizational access would have been prohibitive.

We held in-person, follow-up assessment validation sessions with about 15 percent of the assessed organizations. These sessions helped us validate the collection method and refine the technique. They also let us gauge the assessment's accuracy.

Although the assessors strove to accurately measure each subprocess's maturity level, some interviews inevitably were skewed toward the positive end of the scale. This occurred most often because interviewees reported on milestones that they wanted to or would soon achieve as opposed to what they had achieved. We suspected, and confirmed during the validation sessions, that responses were typically exaggerated by one point on the five-point scale.

When we factor in the one-point inflation, the numbers in Table 4 become important. Knowing that the bar is so low will hopefully inspire some organizations to invest in data management. Doing so might give them a strategic advantage if the competition is unlikely to be making a similar investment.

Table 4. Assessment scores adjusted for self-reporting inflation.

Response: Adjusted average
1: 1.72388
2: 1.57463
3: 1.0597
4: 1.8806
5: 2.31343
6: 1.66418
7: 1.33582
8: 1.57463
9: 1.1791
10a: 1.40299
10b: 1.14925
10c: 0.97761
10d: 1.20896
10e: 1.23134
10f: 1.12687
11: 1.32836
12: 0.57463
13: 1.00746
14: 1.46269
15: 1.24627
16: 1.65672
17: 1.66418
18: 1.04478
19: 1.17164

The relatively low scores reinforce the need for this data management assessment. Based on the overall scores in the data management practice areas, the community receives five Ds. These areas provide immediate targets for future data management investment.

WHERE ARE WE NOW?

We address our original research objectives according to our two goal categories.

Community descriptive research goals

First, we wanted to determine the range of practices within the data management community. A wide range of such practices exists. Some organizations are strong in some data management practices and weak in others (the range of practice is consistently inconsistent). The wide divergence of practices both within and between organizations can dilute results from otherwise strong data management programs. The assessment's applicability to longitudinal studies remains to be seen; this is an area for follow-up research. Although researchers might undertake formal studies of such trends in the future, evidence from ongoing assessments suggests that results are converging. Consequently, we feel that our sample constitutes a representation of community-wide data management practices.

Next, we wanted to know whether the distribution of practices informs us specifically about the various stages of organizational data management maturity. The assessment results confirm the framework's utility, as do the postassessment validation sessions. Building on the framework, we were able to specify target characteristics and objective measurements. We now have better information as to what comprises the various stages of organizational data management practice maturity. Organizations do clump together into the various maturity stages that Nolan originally described. We can now determine the investments required to predictably move organizations from one data management maturity level to another.

Finally, we wanted to determine in what areas the community data management practices are weak, average, and strong. Figure 4 shows an average of unadjusted rates summarizing the assessment results. As the figure shows, the data management community reports itself relatively and perhaps surprisingly strong in all five major data management processes when compared to the industry averages for software development. The range and averages indicate that the data management community has more mature data program coordination processes, followed by organizational data integration, support operations, stewardship, and then data development. The relatively lower data development scores might suggest data program coordination implementation difficulties.

Self-improvement research goals

Our first objective was to produce results that would help the community better understand current best practices. Organizations can use the assessment results to compare their specific performance against others in their industry and against the community results as a whole. Quantities and groupings indicate the relative state and robustness of the best practices within each process. Future research can use this information to identify specific practices that can be shared with the

community. Further study of these areas will provide leverageable benefits.

Next, we wanted to determine how the assessment informs our standing as a technical community. Our research gives some indication of the claimed current state of data management practices. However, given the validation session results, we believe that it's best to caution readers that the numbers presented probably more accurately describe the intended state of the data management community.

Figure 4. Average of unadjusted rates for the assessment results, by process.

Process: Initial / Repeatable / Defined
- Data program coordination: 2.06 / 2.71 / 3.31
- Enterprise data integration: 2.18 / 2.44 / 2.66
- Data stewardship: 1.98 / 2.18 / 2.40
- Data development: 1.57 / 2.12 / 2.46
- Data support operations: 2.04 / 2.38 / 2.66

As it turns out, the relative number of organizations above level 1 for both software and data management is approximately the same, but a more detailed analysis would be helpful. Given the belief that investment in software development practices will result in significant improvements, it's appropriate to anticipate similar benefits from investments in data management practices.

Finally, we hoped to gain information useful for developing a roadmap for improving current practice. Organizations can use the survey assessment information to develop roadmaps to improve their individual data management practices. Mystery Airline, for example, could develop a roadmap for achieving data management improvement by focusing on enterprise data integration, data stewardship, and data development practices.

SUGGESTIONS FOR FUTURE RESEARCH

Additional research must include a look at relationships between data management practice areas, which could indicate an efficient path to higher maturity levels. Research should also explore the success or failure of organizations' previous attempts to improve their data management practices. Organizations can use this data as a baseline from which to look for, describe, and measure improvements in the state of the practice. Such information can enhance their understanding of the relative development of organizational data management. Other investigations should probe further to see if patterns exist for specific industry or business focus types.

Building an effective business case for achieving a certain level of data management is now easier. The failure to adequately address enterprise-level data needs has hobbled past efforts.4 Data management has, at best, a business-area focus rather than an enterprise outlook. Likewise, applications development focuses almost exclusively on line-of-business needs, with little attention to cross-business-line data integration or enterprise-wide planning, analysis, and decision needs (other than within personnel, finance, and facilities management). In addition, data management staff is inexperienced in modern data management needs, focusing on data management rather than metadata management and on syntaxes instead of semantics and data usage.

Few organizations manage data as an asset. Instead, most consider data management a maintenance cost. A small shift in perception (from viewing data as a
nizational data management practices. cost to regarding it as an asset) can dramatically change
One of our goals was to determine why so many orga- how an organization manages data. Properly managed
nizational data management practices are below expec- data is an organizational asset that cant be exhausted.
tations. Several current theses could spur investigation Although data can be polluted, retired, destroyed, or
of the root causes of poor data management practices. become obsolete, its the one organizational resource that
For example, can be repeatedly reused without deterioration, provided
that the appropriate safeguards are in place. Further, all
Are poor data management practices a result of the organizational activities depend on data.
organizations lack of understanding? To illustrate the potential payoff of the work presented
Does data management have a poor reputation or here, consider what 300 software professionals applying
track record in the organization? software process improvement over an 18-year period
Are the executive sponsors capable of understanding achieved:16
the subject?
How have personnel and project changes affected They predicted costs within 10 percent.
the organization efforts? They missed only one deadline in 15 years.
The relative cost to fix a defect is 1X during inspec-
Our assessment results suggest a need for a more for- tion, 13X during system testing, and 92X during
malized feedback loop that organizations can use to operation.

December 2006 55
Early error detection rose from 45 to 95 percent 10. Carnegie Mellon Univ. Software Eng. Inst., Capability Matu-
between 1982 and 1993. rity Model: Guidelines for Improving the Software Process,
Product error rate (measured as defects per 1,000 1st ed., Addison-Wesley Professional, 1995.
lines of code) dropped from 2.0 to 0.01 between 11. M.C. Paulk and B. Curtis, Capability Maturity Model, Ver-
1982 and 1993. sion 1.1, IEEE Software, vol. 10, 1993, pp. 18-28.
12. D.R. Goldenson and D.L. Gibson, Demonstrating the Impact
If improvements in data management can produce and Benefits of CMM: An Update and Preliminary Results,
similar results, organizations should increase their matu- special report CMU/SEI-2003-SR-009, Carnegie Mellon Univ.
rity efforts. Software Eng. Inst., 2003, pp. 1-55.
13. C. Billings and J. Clifton, Journey to a Mature Software
Process, IBM Systems J., vol. 33, 1994, pp. 46-62.
Acknowledgments 14. C.C. Hauer, Data Management and the CMM/CMMI:
We thank Graham Blevins, David Rafner, and Santa Translating Capability Maturity Models to Organizational
Susarapu for their assistance in preparing some of the Functions, presented at National Defense Industrial Assoc.
reported data. We are greatly indebted to many of Peter Technical Information Division Symp., 2003; www.dtic.mil/
Aikens classes in data reengineering and related topics ndia/2003technical/hauer1.ppt.
at Virginia Commonwealth University for the careful 15. W. Schnider and K. Schwinn, Der Reifegrad des Datenman-
work and excellent results obtained as a result of their agements [The Data Management Maturity Model], KPP
various contributions to this research. This article also Consulting; www.kpp-consulting.ch/downloadbereich/DM%
benefited from the suggestions of several anonymous 20Maturity%20Model.pdf, 2004 (in German).
reviewers. We also acknowledge the helpful, continuing 16. H. Krasner, J. Pyles, and H. Wohlwend, A Case History of
work of Brett Chaplin at Allstate in collecting, apply- the Space Shuttle Onboard Systems Project, Technology
ing, and assessing CMMI-related efforts. Transfer 94092551A-TR, Sematech, 31 Oct. 1994.

References Peter Aiken is an associate professor of information systems


1. M. Blaha, A Retrospective on Industrial Database Reverse at Virginia Commonwealth University and founding direc-
Engineering ProjectsParts 1 & 2, Proc. 8th Working Conf. tor of Data Blueprint. His research interests include data
Reverse Eng., IEEE Press, 2001, pp. 147-164. and systems reengineering. Aiken received a PhD in infor-
2. J. Zachman, A Framework for Information Systems Archi- mation technology from George Mason University. He is a
tecture, IBM Systems J., vol. 26, 1987, pp. 276-292. senior member of the IEEE, the ACM, and the Data Man-
3. P.H. Aiken, Keynote Address to the 2002 DAMA Interna- agement Association (DAMA) International. Contact him
tional Conference: Trends in Metadata, Proc. 2002 DAMA at peter@datablueprint.com.
Intl/Metadata Conf., CD-ROM, Wilshire Conf., 2002, pp.
1-32.
4. B. Parker, Enterprise Data Management Process Maturity, M. David Allen is chief operating officer of Data Blueprint.
Handbook of Data Management, S. Purba, ed., Auerbach His research interests include data and systems reengineer-
Publications, CRC Press, 1999, pp. 824-843. ing. Allen received an MS in information systems from
5. M. Gillenson, The State of Practice of Data Administration Virginia Commonwealth University. He is a member of
1981, Comm. ACM, vol. 25, no. 10, 1982, pp. 699-706. DAMA. Contact him at mda@datablueprint.com.
6. M. Gillenson, Trends in Data Administration, MIS Quar-
terly, Dec. 1985, pp. 317-325.
7. M. Gillenson, Database Administration at the Crossroads: Burt Parker is an independent consultant based in Wash-
The Era of End-User-Oriented, Decentralized Data Process- ington, D.C. His technical interests include enterprise data
ing, J. Database Administration, Fall 1991, pp. 1-11. management program development. Parker received an
8. R.L. Keeney, Value-Focused Thinking A Path to Creative MBA in operations research/systems analysis (general sys-
Decisionmaking, Harvard Univ. Press, 1992. tems theory) from the University of Michigan. He is a mem-
9. R. Nolan, Managing the Crisis in Data Processing, Har- ber of DAMA. Contact him at parkerbg@comcast.net.
vard Business Rev., Mar./Apr. 1979, pp. 115-126.

Angela Mattia is a professor of information systems at J.


Sergeant Reynolds Community College. Her research inter-
ests include data and systems reengineering and maturity
models. Mattia received an MS in information systems from
Virginia Commonwealth University. She is a member of
DAMA. Contact her at amattia@jsr.vccs.edu.

Improving Data
Management
Practices

1 - datablueprint.com Copyright 01/1/08 and previous years by Data Blueprint - all rights reserved!

Please Help with a Research Project!

Data Management
Practices Assessment
peter@datablueprint.com


Peter Aiken
Full time in information technology since 1981
IT engineering research and project background
University teaching experience since 1979
Seven books and dozens of articles
Research Areas
reengineering, data reverse engineering, software requirements engineering, information engineering, human-
computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
Director
George Mason University/Hypermedia Laboratory (1989-1993)
Published Papers
Communications of the ACM, IBM Systems Journal, InformationWEEK, Information & Management, Information
Resources Management Journal, Hypermedia, Information Systems Management, Journal of Computer
Information Systems and IEEE Computer & Software
DoD Computer Scientist
Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
Visiting Scientist
Software Engineering Institute/Carnegie Mellon University (2001-2002)
DAMA International Advisor/Board Member (http://dama.org)
2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
2005 DAMA Community Award
Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
Founding Director Data Blueprint 1999

http://peteraiken.net

Contact Information:

Peter Aiken, Ph.D.

Department of Information Systems


School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000

Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com

office: +1.804.883.759
cell: +1.804.382.5957

e-mail: peter@datablueprint.com
http://peteraiken.net

Organizations Surveyed
- Results from more than 400 organizations
- 32% government
- Appropriate public company representation
- Enough data to demonstrate European organization DM practices are generally more mature

[Pie chart] Public Companies 58%, State Government Agencies 17%, Federal Government 11%, International Organizations 10%, Local Government 4%

% of DM organizations labeled "successful"

[Bar chart comparing 1981 and 2007 responses (25 years apart), y-axis 0 to 0.45, categories: Successful, Partial Success, Don't know/too soon to tell, Unsuccessful, Does not exist]
Largely Ineffective DM Investments

[Pie chart] Investment <= Return: 10%; Return ≈ 0: 70%; Investment > Return: 20%

- Approximately 10 percent of organizations achieve parity (and potential positive returns) on their DM investments.
- Only 30% of DM investments achieve tangible returns at all.
- Seventy percent of organizations have very small or no tangible return on their DM investments.

September 21, 2004


Hmm ... Confusion

Correct Name: Yusuf Islam
TSA No Fly Listing: Youssouf Islam
- 15,000 people appealed to be removed from the US terror watch list
- 2,000/month requesting removal
- TSA promised a 30-day review process
- Actual time is 44 days
- American Civil Liberties Union estimates 1 million people on US government watch lists

US Terror Watch List Facts

Fall 2008 comments:


Fewer than 2,500 people on US "no-fly" list
10% of those are US citizens
16,000 people on "selectee" list (additional screening)
Transfer responsibility of comparing names on lists from dozens of
airlines to TSA
IT Project Failure Rates
Recent IT project failure rate statistics can be summarized as follows:
- Carr (1994): 16% of IT projects completed on time, within budget, with full functionality
- OASIG Study (1995): 7 out of 10 IT projects "fail" in some respect
- The Chaos Report (1995): 75% blew their schedules by 30% or more; 31% of projects will be canceled before they ever get completed; 53% of projects will cost over 189% of their original estimates; 16% of projects are completed on time and on budget
- KPMG Canada Survey (1997): 61% of IT projects were deemed to have failed
- Conference Board Survey (2001): Only 1 in 3 large IT project customers were "very satisfied"
- Robbins-Gioia Survey (2001): 51% of respondents viewed their large IT implementation project as unsuccessful
- McDonald's "Innovate" (2002): Automate the fast-food network, from fry temperature to number of burgers sold; $180M USD write-off
- Ford Everest (2004): Replacing internal purchasing systems; $200 million over budget
- FBI (2005): Blew $170M USD on suspected terrorist database; "start over from scratch" (New York Times, 1/22/05, p. A31)

Source: http://www.it-cortex.com/stat_failure_rate.htm (accessed 9/14/02)

DM Involvement

[Bar chart, participation percentage 0 to 50, showing DM as Initiative Leader, Initiative Involvement, or Not Involved for: Data Warehousing, XML, Data Quality, Customer Relationship Management, Master Data Management, Customer Data Integration, Enterprise Resource Planning, Enterprise Application Integration]
Misunderstanding Data Management


Link business objectives to technical capabilities

Data Management: "Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities"

Aiken, P., Allen, M.D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment," IEEE Computer (research feature), April 2007

A Model Specifying Relationships Among Important Terms

(Wisdom and knowledge are often used synonymously.)

1. Each FACT combines with one or more MEANINGS.
2. Each specific FACT and MEANING combination is referred to as a DATUM.
3. An INFORMATION is one or more DATA that are returned in response to a specific REQUEST.
4. INFORMATION REUSE is enabled when one FACT is combined with more than one MEANING.
5. INTELLIGENCE is INFORMATION associated with its USES.

[Built on definition by Dan Appleton 1983]
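Read as a data structure, the five definitions above can be sketched in a few lines. This is an illustrative interpretation only, not code from the presentation; the class name `Datum` and the sample fact "193" are assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datum:
    """A DATUM combines one FACT with one MEANING (definitions 1-2)."""
    fact: str
    meaning: str

def information(data, request):
    """INFORMATION: the data returned in response to a specific REQUEST (definition 3)."""
    return [d for d in data if d.meaning == request]

# One FACT ("193") carrying two MEANINGS enables information reuse (definition 4).
data = [Datum("193", "employee count"), Datum("193", "parking spaces")]

# INTELLIGENCE pairs information with its uses (definition 5).
intelligence = {"capacity planning": information(data, "parking spaces")}
```

The same fact serves two requests, which is exactly the reuse point definition 4 makes.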
Expanding Scope

1950-1970: Database design; database operation
1970-1990: Data requirements analysis; data modeling
1990-2000: Enterprise data management coordination; enterprise data integration; data stewardship; data use
2000-: Data quality, data security, data compliance, mashups (more)

DM Practice Evolution

[Line chart, Jan 1978 through Jan 2007, y-axis 0.0 to 1.0: inferred and representative percentages of organizations 'practicing' DM by year]
Organizational DM Functions and their Inter-relationships

[Diagram] Functions: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations, Data Asset Use. Flows between them: organizational strategies and goals; implementation guidance; integrated models; standard data; application models & designs; direction and feedback; data; business value.

Do you know the game Twister?

Canada, Chile, Colombia, Egypt, Estonia, Finland, France, Germany, Great Britain, Ireland, Italy, Japan, Qatar, Scotland, Switzerland, Thailand, Turkey, UAE, US
Typical System Evolution

- Payroll Application (3rd GL): Payroll Data (database)
- Finance Application (3rd GL, batch system, no source): Finance Data (indexed)
- Marketing Application (4th GL, query facilities, no reporting, very large): Marketing Data (external database)
- Personnel App. (20 years old, un-normalized data): Personnel Data (database)
- Mfg. Applications (contractor supported): Mfg. Data (home-grown database)
- R&D Applications (researcher supported, no documentation): R&D Data (raw)

Niccolò Machiavelli (1469-1527)

"He who doesn't lay his foundations beforehand may by great abilities do so afterward, although with great trouble to the architect and danger to the building."

Machiavelli, Niccolò. The Prince. 19 Mar. 2004 http://pd.sparknotes.com/philosophy/prince
Information Architectures are plans, guiding the transformation of strategic organizational information needs into specific information systems development projects. (Source: Internet)

"Information architecture is a foundation discipline describing the theory, principles, guidelines, standards, conventions, and factors for managing information as a resource. It produces drawings, charts, plans, documents, designs, blueprints, and templates, helping everyone make efficient, effective, productive and innovative use of all types of information." (Source: Information First by Roger & Elaine Evernden, 2003, ISBN 0 7506 5858 4, p. 1)

Information architecture (IA) is the art of expressing a model or concept of information used in activities that require explicit details of complex systems. (wikipedia.org)

All organizations have information architectures; some are better understood and documented (and therefore more useful to the organization) than others.

Building from the Top

Sample Conversation (Developing Constraints)

"I'd like to build a building."
"What kind of building? Do you want to sleep in it? Eat in it? Work in it?"
"I'd like to sleep in it."
"Oh, you want to build a house?"
"Yes, I'd like a house."
"How large a house do you have in mind?"
"Well, my lot size is 100 feet by 300 feet."
"Then you want a house about 50 feet by 100 feet."
"Yes, that's about right."
"How many bedrooms do you need?"
"Well, I have two children, so I'd like three bedrooms ..."

GAO Has Identified the Problem

Concrete Block & Engineering Continuity


Look Familiar?

Finance Example

Business Rule: A customer may have one and only one account.
Bank Manager: "The customer is always right ... and this one needs multiple accounts!"

 #   Account ID   Sorted IDs
 1   peter        peter
 2   peter1       peter1
 3   peter2       peter10
 4   peter3       peter2
 5   peter4       peter3
 6   peter5       peter4
 7   peter6       peter5
 8   peter7       peter6
 9   peter8       peter7
10   peter9       peter8
11   peter10      peter9
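The "Sorted IDs" column above is plain lexicographic string sorting, which is why "peter10" lands between "peter1" and "peter2". A minimal sketch of that behavior; the account IDs come from the table, while `natural_key` is an illustrative fix, not part of the slide:

```python
import re

account_ids = ["peter", "peter1", "peter2", "peter3", "peter4", "peter5",
               "peter6", "peter7", "peter8", "peter9", "peter10"]

# Plain string sorting compares character by character, so "peter10"
# sorts before "peter2" -- reproducing the "Sorted IDs" column.
print(sorted(account_ids))

def natural_key(s):
    # Split out digit runs and compare them numerically.
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", s)]

# A natural-sort key restores the order a human expects.
print(sorted(account_ids, key=natural_key))
```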

Architecture Jargon

Avoiding Unnecessary Work Using Business Rule Metadata

Entities: Person, Employee, Job Class, Position, Job Sharing ('Mond-Licht' or 'Mondschein', German for "moonlight")

BR1) Zero, one, or more EMPLOYEES can be associated with one PERSON.
BR2) Zero, one, or more EMPLOYEES can be associated with one JOB CLASS.
BR3) Zero, one, or more EMPLOYEES can be associated with one POSITION.
BR4) One or more POSITIONS can be associated with one JOB CLASS.
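One way to read "business rule metadata" is to record each rule's cardinality as data and check instances against it. The sketch below is hypothetical: the rule IDs and entity names follow the slide, but the checker and the sample instances are invented for illustration:

```python
# (child entity, parent entity, min children per parent, max or None for "many")
RULES = {
    "BR1": ("EMPLOYEE", "PERSON", 0, None),     # zero, one, or more
    "BR2": ("EMPLOYEE", "JOB CLASS", 0, None),
    "BR3": ("EMPLOYEE", "POSITION", 0, None),
    "BR4": ("POSITION", "JOB CLASS", 1, None),  # one or more
}

def violations(rule_id, links):
    """links maps each parent instance to its list of child instances."""
    child, parent, lo, hi = RULES[rule_id]
    bad = []
    for p, children in links.items():
        n = len(children)
        if n < lo or (hi is not None and n > hi):
            bad.append((p, n))
    return bad

# A JOB CLASS with no POSITIONS breaks BR4; BR1 tolerates zero children.
print(violations("BR4", {"Clerk": ["P-1", "P-2"], "Auditor": []}))
```

Keeping the rules as metadata means a new rule is a table entry, not new code, which is the "avoiding unnecessary work" point of the slide.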

Student System Data Model

Proposed Data Model

IBM's AD/Cycle Information Model

[Diagram showing how the models below interrelate]

Application Build Model: Defines the tools, parameters and environment required to build an automated Business Application.
Applications Structure Model: Defines the overall scope of an automated Business Application, the components of the application and how they fit together.
Business Goals Model: Defines the mission of the enterprise, its long-range goals, and the business policies and assumptions that affect its operations.
Business Rules Model: Records rules that govern the operation of the business and the Business Events that trigger execution of Business Processes.
Data Structures Model: Defines the data structures and their elements used in an automated Business Application.
DB2 Model: Refines the definition of a Relational Database design to a DB2-specific design.
Derivations/Constraints Model: Records the rules for deriving legal values for instances of Entity-Relationship Model components, and for controlling the use or existence of E-R instances.
Enterprise Structure Model: Defines the scope of the enterprise to be modeled. Assigns a name to the model that serves to qualify each component of the model.
Entity-Relationship Model: Defines the Business Entities, their properties (attributes) and the relationships they have with other Business Entities.
Extension Support Model: Provides for tactical Information Model extensions to support special tool needs.
Flow Model: Specifies which of the Entity-Relationship Model component instances are passed between Process Model components.
Global Text Model: Supports recording of extended descriptive text for many of the Information Model components.
IMS Structures Model: Defines the component structures and elements and the application program views of an IMS Database.
Info Usage Model: Specifies which of the Entity-Relationship Model component instances are used by other Information Model components.
Library Model: Records the existence of non-repository files and the role they play in defining and building an automated Business Application.
Organization/Location Model: Records the organization structure and location definitions for use in describing the enterprise.
Panel/Screen Model: Identifies the Panels and Screens and the fields they contain as elements used in an automated Business Application.
Process Model: Defines Business Processes, their sub-processes and components.
Program Elements Model: Identifies the various pieces and elements of application program source that serve as input to the application build process.
Relational Database Model: Describes the components of a Relational Database design in terms common to all SAA relational DBMSs.
Resource/Problem Model: Identifies the problems and needs of the enterprise, the projects designed to address those needs, and the resources required.
Strategy Model: Records business strategies to resolve problems, address goals, and take advantage of business opportunities. It also records the actions and steps to be taken.
Test Model: Identifies the various files (test procedures, test cases, etc.) affiliated with an automated Business Application for use in testing that application.
Value Domain Model: Defines the data characteristics and allowed values for information items.

Archeology-based Transformations: Solve a Puzzle

Primary sources of guidance:
- The edge pieces are easy to identify
- Distinct physical piece features exist, such as colors, patterns, pictures, etc.

Steps for solving:
1. Physically segregate all identified edge pieces (not always present in an existing environment).
2. Create the puzzle framework, connecting edge pieces using the puzzle picture.
3. Within the frame, physically group remaining pieces by distinct physical features.
4. Solve a smaller section of the puzzle containing just a portion of the picture focused on similar physical features, such as a ball or a puppy. This is an effective approach because the focus is on a common domain (one distinct aspect of the entire picture), and because the analysis covers a smaller number of puzzle pieces, it is proportionately smaller than attempting to solve the overall puzzle at once.
5. As the components are assembled, combine them to solve the complete puzzle.

How was this bridge constructed?

Flood


New River Bridge
Bridge Engineering


Oct 2004 IRS Accomplishment

- Unified five definitions of "child"
- Reduced 5 definitions to 1 for tax return preparations such as: dependent, earned income tax credit, child credit
- Each definition existed for different reasons; either it "was developed to carry out social policy objective(s), or someone perceived it was going to save revenue"
- "[It] is easier for (customers) to understand and it is easier for IRS to audit and there are lots of things like that we can do"
- Initiative started in 1991; it took 13 years, including 2.5 years moving as legislation!

Source: Pamela F. Olson, former Assistant Secretary for Tax Policy (quote from the Diane Rehm Show, 11/29/04, http://www.wamu.org/programs/dr/04/11/29.php)

Data Integration/Exchange Challenges

"Customer" typically has had different meanings to different parts of the organization:
- Accounting: organization that buys products or services
- Service: client
- Sales: prospect

Assigning the same mission, "Secure the building," to the DoD lines of business elicits very different results from each line of business:
- Army: Posts guards at all entrances and ensures no unauthorized access
- Navy: Turns out all the lights, locks up, and leaves
- Marines: Sends in a company to clear the building room-by-room; forms a perimeter defense around the building
- Air Force: Signs a three-year lease with an option to buy

[Second example courtesy of Burt Parker]
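The departmental meanings of "customer" can themselves be recorded as glossary metadata, so an integration effort sees the semantic conflict explicitly instead of discovering it mid-project. A hypothetical sketch; the three definitions come from the slide, everything else is assumed:

```python
# Glossary keyed by (department, term); values are the local definitions.
glossary = {
    ("Accounting", "customer"): "organization that buys products or services",
    ("Service", "customer"): "client",
    ("Sales", "customer"): "prospect",
}

def definitions_of(term):
    """Collect every department's definition of a term."""
    return {dept: d for (dept, t), d in glossary.items() if t == term}

# More than one distinct definition means the term cannot be integrated
# without explicit reconciliation.
conflicting = len(set(definitions_of("customer").values())) > 1
print(conflicting)
```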

FBI & Canadian Social Security Gender Codes

1. Male
2. Female
3. Formerly male now female
4. Formerly female now male
5. Uncertain
6. Won't tell
7. Doesn't know
8. Male soon to be female
9. Female soon to be male

Mapping rule: If column 1 in source = "m" then set value of target data to "male"; else set value of target data to "female".

Hypothesized extensions contributed by a Chicago DAMA member:
10. Both soon to be female
11. Both soon to be male
12. Psychologically female, biologically male
13. Psychologically male, biologically female

Predicting Engineering Problem Characteristics

Legacy System #1: Payroll
- Platform: Amdahl; OS: MVS; 1998 age: 15
- Data structure: VSAM/virtual database tables
- Physical records: 780,000; logical records: 60,000
- Relationships: 64; entities: 4/350; attributes: 683

Legacy System #2: Personnel
- Platform: UniSys; OS: OS; 1998 age: 21
- Data structure: DMS (Network)
- Physical records: 4,950,000; logical records: 250,000
- Relationships: 62; entities: 57; attributes: 1,478

New System
- Platform: WinTel; OS: Win'95; 1998 age: new
- Data structure: client/server RDBMS

Characteristics     Logical     Physical
Records:            250,000     600,000
Relationships:      1,034       1,020
Entities:           1,600       2,706
Attributes:         15,000      7,073
[Extraction residue of a Microsoft Project plan (dated Thu 9/28/00): summary tasks 1000 Organization, 2000 Establish Development Environment, 3000 Plan Change Management, 4000 Perform Configuration Test, and 5000 Preliminary System & Process Design, each with durations, costs, work, and resource assignments]

"Extreme" Data Engineering

- 2 person months = 40 person days
- 2,000 attributes mapped onto 15,000
- 2,000 / 40 person days = 50 attributes per person day, or 50 / 8 hours = 6.25 attributes/hour
- 15,000 / 40 person days = 375 attributes per person day, or 375 / 8 hours = 46.875 attributes/hour
- Locate, identify, understand, map, transform, document, and QA at a combined rate of roughly 53 attributes every 60 minutes, or about 0.89 attributes/minute!
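The throughput arithmetic above can be checked in a few lines (a sketch; the 8-hour day and 40-person-day budget are the slide's own figures):

```python
# Throughput implied by the "extreme" mapping effort described above.
PERSON_DAYS = 40          # 2 person months = 40 person days (slide's figure)
HOURS_PER_DAY = 8         # assumed 8-hour working day

source_attrs = 2_000      # attributes to be mapped...
target_attrs = 15_000     # ...onto this many target attributes

source_rate = source_attrs / (PERSON_DAYS * HOURS_PER_DAY)  # attrs/hour, source side
target_rate = target_attrs / (PERSON_DAYS * HOURS_PER_DAY)  # attrs/hour, target side
combined = source_rate + target_rate

print(combined)                 # 53.125 attributes/hour
print(round(combined / 60, 2))  # 0.89 attributes/minute
```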
50 - datablueprint.com Copyright 01/1/08 and previous years by Data Blueprint - all rights reserved!
Why Data Projects Fail, by Joseph R. Hudicka

- Assessed 1,200 migration projects
- Surveyed only experienced migration specialists who have done at least four migration projects
- The median project costs over 10 times the amount planned! (Chart: median project expense, $0 to $500,000 scale)
- Biggest challenges: bad data, missing data, duplicate data
- The survey did not consider projects that were cancelled, largely due to data migration difficulties
- "...problems are encountered rather than discovered"

Joseph R. Hudicka, "Why ETL and Data Migration Projects Fail," Oracle Developers Technical Users Group Journal, June 2005, pp. 29-31.

Organizational DM Functions and their Inter-relationships

(Diagram: five functions - Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations - linked by organizational strategies and goals, implementation guidance, integrated models, standard data, application models & designs, and direction/feedback, flowing from coordination through business data asset use to business value. These inter-relationships are a major focus of study and research.)
New Technical Expertise Required

- The focus has been on new systems development: guidance and technical expertise required to develop new data applications and components.
- The new domain focus is on maintenance of existing environments: understanding what the existing systems were originally designed to accomplish (the requirements) and how those systems accomplish it (the design).

Why?

Metadata Engineering

- O1-O3: reconstitute original metadata
- O4-O5: improve the current metadata
- O6-O9: improve system data capabilities based on the improved metadata

(Diagram: reverse engineering works backward from existing as-is data implementation assets through as-is data design assets to as-is information requirements assets - O1 recreate data implementation, O2 recreate data design, O3 recreate requirements, then O4/O5 reconstitute the data design and requirements. Forward engineering then produces new to-be requirements, design, metadata, and data implementation assets - O6 redesign, O7 redevelop, O8 redesign data, O9 reimplement data.)

Common Metadata Model

(Diagram: architectural components are derived from selected system components A-F; the common metadata model implementation (CM2) components are repurposed for use on other integration efforts, supporting organizational data & software business performance optimization, architecture evolution, and reverse engineering.)
Structured Data Engineering

Phase I: Archeology-based transformations designed to understand the existing environment
- T1: component structure is unknown (an unknown collection of components, here Components 1-25)
- T2: component structure is discovered
- T3: a Pareto subset is hypothesized (Pareto analysis and filtering)
- T4: potential technology solutions planning analyses

Phase II: Developing the desired architecture
- T5: capabilities modeling, integration, and combining
- T6: gap analyses
- T7: repeatability and reusability engineering
- T8: CM2-based component implementation engineering

Major Subject Areas

- Consumer Goods: Customers, Suppliers, Products, Stores, Inventory, Loyalty Programs
- Federal Government: Citizens, Taxpayers, Terrorists, Visas, Locations
- Hospitality: Guests, Services, Stays, Facilities
- Health Insurance: Policies, Policyholders, Groups, Claims, Claimants, Providers, Services
- Telephony: Customers, Products, Services, Equipment
- Distribution: Customers, Assets, Supply Chain, Warehousing, Inventory
- State/Local Government: Citizens, Taxpayers, Service Recipients, Properties
- Insurance: Policies, Policyholders, Incidents, Claims, Beneficiaries
- Manufacturing/Distribution: Customers, Suppliers, Distributors, Products, Parts, Inventory
- Transportation: Customers, Suppliers, Vehicle Inventory, Transportation Routes
- Finance/Banking: Customers, Accounts, Products, Branches
- Universities: Students, Instructors, Courses, Enrollments, Facilities, Classrooms, Exams & Testing
- Healthcare: Patients, Suppliers, Treatments, Hospitals, Doctors, Nurses, Medications
- Retailers: Customers, Loyalty Programs, Suppliers, Products, Orders, Inventory
- Utilities: Customers, Suppliers, Services, Installations, Utilization

Adapted from Data Strategy by Sid Adelman, Larissa Moss and Majid Abai (2005), Addison-Wesley Professional, ISBN: 0321240995.
How many interfaces are required to solve this integration problem?

- Six applications (Applications 1-6) connected point-to-point require 15 interfaces: (N * (N - 1)) / 2
- RBC: 200 applications - 4,900 batch interfaces
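The (N * (N - 1)) / 2 growth, and the saving a hub provides, can be checked directly (a sketch; RBC's 4,900 actual batch interfaces are of course a subset of the possible pairs):

```python
# Interface counts for point-to-point versus hub-based integration.
def point_to_point_interfaces(n: int) -> int:
    """Every pair of N applications needs its own interface: (N*(N-1))/2."""
    return n * (n - 1) // 2

def hub_interfaces(n: int) -> int:
    """A hub (integration processor) needs one connection per application."""
    return n

print(point_to_point_interfaces(6))    # 15, matching the six-application example
print(hub_interfaces(6))               # 6
print(point_to_point_interfaces(200))  # 19900 possible pairs among 200 applications
```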

XML-based Integration Solution

Application 1 Application 2 Application 3

Integration Processor

Application 4 Application 5 Application 6

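The integration-processor idea can be sketched as a canonical-model hub: each application contributes one adapter to and from a shared record shape, so N applications need N adapters rather than N*(N-1)/2 translators. (The field names and applications below are invented for illustration.)

```python
# A minimal hub-and-spoke "integration processor" sketch (illustrative only).
CANONICAL_FIELDS = {"customer_id", "name"}   # the hub's shared record shape

def from_app1(rec):
    """Adapter: App 1's 'cust_no'/'cust_name' layout -> canonical record."""
    return {"customer_id": rec["cust_no"], "name": rec["cust_name"]}

def to_app2(canonical):
    """Adapter: canonical record -> App 2's expected 'id'/'full_name' layout."""
    return {"id": canonical["customer_id"], "full_name": canonical["name"]}

def integrate(rec, inbound, outbound):
    """Route a record through the hub: source format -> canonical -> target."""
    canonical = inbound(rec)
    assert set(canonical) == CANONICAL_FIELDS  # hub enforces the shared shape
    return outbound(canonical)

result = integrate({"cust_no": 42, "cust_name": "Joan Smith"}, from_app1, to_app2)
print(result)   # {'id': 42, 'full_name': 'Joan Smith'}
```

Adding a seventh application means writing one more adapter pair, not six more point-to-point translators.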
Typical System Evolution

- Finance Application (3rd GL, batch system, no source) with Finance Data (indexed)
- Payroll Application (3rd GL) with Payroll Data (database)
- Marketing Application (4th GL, query facilities, no reporting, very large) with Marketing Data (external database)
- Personnel App. (20 years old, un-normalized data) with Personnel Data (database)
- Mfg. Applications (contractor supported) with Mfg. Data (home-grown database)
- R&D Applications (researcher supported, no documentation) with R&D Data (raw)

Becomes this

(Diagram: the same six applications and data stores - Finance, Payroll, Marketing, Personnel, Mfg., and R&D - now exchange data through XML Processors instead of direct point-to-point interfaces.)
3-Way Scalability of the XML-based Integration Solution

Expand the:
1. Number of data items from each system - how many individual data items are tagged?
2. Number of interconnections between the systems and the hub - how many systems are connected to the hub?
3. Amount of interconnectability among hub-connected systems - how many inter-system data item transformations exist in the rule collection?

XML-Based Meta Data Management

(Diagram: six existing systems, System 1 through System 6.)
XML-Based Meta Data Management

(Diagram: the six existing systems now connect through XSLT transformations; system-to-system transformation program knowledge is captured in a transformations data store, from which new programs are generated.)

XML-based Portals: Portal Motivation

Portals do for the web what Windows did for DOS.

[Adapted from Terry Lanham, "Designing Innovative Enterprise Portals and Implementing Them Into Your Content Strategies: Lockheed Martin's Compelling Case Study," Web Content II: Leveraging Best-of-Breed Content Strategies, San Francisco, CA, 23 January 2001]

Portal Solution

[Adapted from Terry Lanham, "Designing Innovative Enterprise Portals and Implementing Them Into Your Content Strategies: Lockheed Martin's Compelling Case Study," Web Content II: Leveraging Best-of-Breed Content Strategies, San Francisco, CA, 23 January 2001]

Top Tier Demo


Cruiser Collector

Capability Maturity Model Levels

- Optimizing (5): We have a process for improving our DM capabilities
- Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance
- Defined (3): We have experience that we have standardized so that all in the organization can follow it
- Repeatable (2): We have DM experience and have the ability to implement disciplined processes
- Initial (1): Our DM practices are ad hoc and dependent upon "heroes"

This is one concept for process improvement; others include Nolan Stage Theory, TQM, TQdM, TDQM, and ISO 9000, all of which focus on understanding current processes and determining where improvements can be made.

(Sidebar: the 1996 Council of American Building Officials (CABO) and 2000 International Code Council recommendations call for stair unit runs to be not less than 10 inches and unit rises not more than 7¾ inches.)
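The level names above can be attached to the fractional assessment scores reported later in the deck. A hypothetical helper (the names are from the slide; the score-to-level rounding rule is an assumption for illustration):

```python
# Hypothetical mapping from an assessed maturity score (1.0-5.0,
# fractional scores allowed) onto CMM-style level names.
LEVELS = ["Initial", "Repeatable", "Defined", "Managed", "Optimizing"]

def level_name(score: float) -> str:
    if not 1.0 <= score <= 5.0:
        raise ValueError("maturity scores run from 1 to 5")
    # A score of, say, 2.4 means level 2 has been reached but not level 3.
    return LEVELS[int(score) - 1] if score < 5 else LEVELS[4]

print(level_name(2.4))  # Repeatable
print(level_name(5.0))  # Optimizing
```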

Key Finding: Process Frameworks are not Created Equal


With the exception of CMM and ITIL, use of process-efficiency
frameworks does not predict higher on-budget project delivery
Percentage of Projects on Budget
By Process Framework Adoption

while the same pattern generally holds true for on-time performance
Percentage of Projects on Time
By Process Framework Adoption

Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey
Results, Washington D.C.: Corporate Executive Board 2006, p. 23.

Organizational DM Functions and their Inter-relationships

(Diagram repeated: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations, flowing from organizational strategies and goals through business data asset use to business value.)

Organizational DM Functions and their Inter-relationships

- Data Program Coordination: defining, coordinating, resourcing, implementing, and monitoring organizational data program strategies, policies, plans, etc. as a coherent set of activities.
- Organizational Data Integration: identifying, modeling, coordinating, organizing, distributing, and architecting data shared across business areas or organizational boundaries.
- Data Stewardship: ensuring that specific individuals are assigned the responsibility for the maintenance of specific data as organizational assets, and that those individuals are provided the requisite knowledge, skills, and abilities to accomplish these goals in conjunction with other data stewards in the organization.
- Data Development: specifying and designing appropriately architected data assets that are engineered to be capable of supporting organizational needs.
- Data Support Operations: initiation, operation, tuning, maintenance, backup/recovery, archiving, and disposal of data assets in support of organizational activities.
Organizational DM Functions and their Inter-relationships

- Data Program Coordination: data management processes and infrastructure
- Organizational Data Integration: achieve sharing of data within a business area (organizational-entity subject area data integration), combining multiple assets to produce extra value
- Data Support Operations: provide reliable access to data
- Business Data Asset Use: leverage data in organizational activities, yielding business value

How is it done?

- Follows the form of a semi-structured interview
- Approximately one hour is required to complete each interview
- Examines organizational data management practices in five areas
- A branched series of questions explores capabilities, execution, and ongoing efforts
- Total time to results typically ranges from 1 week to 1 month
Council Hill Road sign

Photo from William J. Manon Jr., .pbase.com/g3/91/555491/ 2/66430431.telWKGJG.jpg

Assessment Benefits

Quantitative Benefits
- Objective determination of baseline BI/analytic capabilities
- Gap analysis indicates specific actions required to achieve the "next" level
- Available comparisons with similar organizations
- Provides facts useful when prioritizing subsequent investments

Qualitative Benefits
- Highlights strengths, weaknesses, capabilities, and limitations of existing BI/analytic practices
Data Management Practices Measurement (DMPA)

- Collaboration with CMU's Software Engineering Institute (SEI)
- Results from more than 400 organizations: public companies, state government agencies, federal government, international organizations
- Scored against the maturity levels Initial (I), Repeatable (II), Defined (III), Managed (IV), and Optimizing (V), using a defined industry standard
- Five practice areas: Data Program Coordination, Organizational Data Integration, and Data Stewardship (focus: guidance and facilitation); Data Development and Data Support Operations (focus: implementation and access)

Sample Perception vs. Fact Chart

(Bar chart, 0-5 scale, comparing "Verified" and "Average" scores across Development Guidance, Data Support Administration, Systems Capability, Asset Recovery, and Development Training; average scores ranged from 2.0 to 3.0, while verified scores ranged from 1.0 to 2.4.)
Comparative Assessment Results

(Bar chart, 0-5 scale, comparing Client, Nokia, Industry Competition, and All Respondents across the five practice areas; Data Program Coordination, Organizational Data Integration, and Data Support Operations are flagged as challenges.)

High Marks for IFC's Program

Data Mgmt Audit 2006 (bar chart, 0-5 scale): Leadership & Guidance, Asset Creation, Metadata Management, Quality Assurance, Change Management, Data Quality. Series: Overall Benchmarks, Industry Benchmarks, TRE, IFC, ISG.

"These IFC scores represent the highest aggregate scores in the area of data stewardship recorded in our database of hundreds of assessments, which has been recognized as a representative scientific sample."
The challenge ahead

(Chart, 0.00-5.00 scale, for 28 organizations.) The chart represents the average scores presented on the previous slide; it is interesting that none have apparently reached level 3.

After more than a decade

- Question: How many software practices (surveyed) are above level 1 on the CMM?
- Answer: By far most organizations (95%) surveyed are producing software using informal processes.
- Question: How many organizations have demonstrated at least some proficiency according to the DM3 (i.e., scored above level 1)?
- Answer: One in ten organizations has scored above level 1.
Service Orient or Be Doomed!

Service Orient or Be Doomed! How Service Orientation Will Change Your Business (hardcover), by Jason Bloomberg & Ronald Schmelzer.

"I'm not quite sure what 'doom' awaits by not service orienting, other than remaining mired in archaic, calcified and siloed processes, which a lot of..."

Services

Integration possibilities: User Interface, Business Process, Application, Data, AV Component

Well-defined components: self-contained, with no interdependencies

Analogy derived from D. Barry, "Web Services," Intelligent Enterprise, 10/10/03, pp. 26-47; wiring diagram from sunflowerbroadband.com
Contractor Implemented Wiring


Concise Notes on Software Engineering

- Published in 1979
- 93 pages including appendices & references
- Out of print; $1.99 at half.com
- Principles of Information Hiding (pp. 32-33):
  - Conceal complex data structures whenever possible
  - Allow only selected service modules to know about the concealed data structures
  - Bind together modules that know about concealed data structures
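The three information-hiding principles above can be sketched in a few lines (the module and field names here are invented for illustration):

```python
# Information-hiding sketch: a concealed data structure, selected service
# operations, and the knowing code bound together in one place.
class AccountStore:
    """Binds together the operations that know the concealed layout."""

    def __init__(self):
        self._accounts = {}   # concealed: callers never touch this dict directly

    # Selected service operations - the only sanctioned access paths.
    def deposit(self, account_id, amount):
        self._accounts[account_id] = self._accounts.get(account_id, 0) + amount

    def balance(self, account_id):
        return self._accounts.get(account_id, 0)

store = AccountStore()
store.deposit("A-1", 100)
store.deposit("A-1", 50)
print(store.balance("A-1"))   # 150
```

Because callers depend only on `deposit` and `balance`, the internal dict could later be replaced (say, by a database table) without changing any client code; this is exactly the property services rely on.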
How Does SOA Fit In Existing Architectures?

The basketball and golfball slide

Bank


Evolving applications from stove pipe to web-service-based architectures

(Screenshot: an organizational portal - news, press releases, newsletters, IT service desk, email, knowledge network, employee assistance, IT procurement, stock quotes, and state/regional reporting - backed by systems of 16 million and 2.1 million lines of legacy code.)
Legacy Systems Transformed Into Web-services Accessed Through a Portal

(Diagram: Legacy Applications 1-5 are wrapped as Web Services - 1.1-1.3, 2.1-2.2, 3.1-3.2, 4.1-4.2, and 5.1-5.3 - which feed the same organizational portal.)
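Wrapping a legacy routine as a web service can be sketched as follows (the legacy payroll function, field names, and WSGI wiring are all invented for this example; real wrappers would sit in front of the 3rd-GL code itself):

```python
# Minimal sketch: exposing a legacy routine as a web service via WSGI.
import json

def legacy_payroll_lookup(employee_id):
    """Stand-in for the legacy 3rd-GL payroll logic."""
    rates = {"E100": 52000, "E200": 61000}
    return rates.get(employee_id)

def payroll_service(environ, start_response):
    """WSGI app playing the role of 'Web Service 1.1' in the diagram."""
    emp = environ.get("QUERY_STRING", "").removeprefix("employee_id=")
    salary = legacy_payroll_lookup(emp)
    status = "200 OK" if salary is not None else "404 Not Found"
    body = json.dumps({"employee_id": emp, "salary": salary}).encode()
    start_response(status, [("Content-Type", "application/json")])
    return [body]

# Exercise the service without a network, as a WSGI server would:
captured = {}
resp = payroll_service({"QUERY_STRING": "employee_id=E100"},
                       lambda status, headers: captured.update(status=status))
print(captured["status"])  # 200 OK
```

The portal then consumes such endpoints uniformly, without knowing which legacy application sits behind each one.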

Solution Framework

(Diagram: systems of record SOR 1-8 feed an indicator extraction service, which could be segmented by day of week, month, system, etc.; an external address validation repository, a latency check service, and customer contact processing update addresses and serve processing channels Ch 1-8.)
Logical Extension

(Two image-only slides.)
(InformationWeek contents page, Issue 1,198, Aug. 11, 2008. Cover story, "Simpler Than SOA": stymied by the complexity of SOAs, some IT departments are taking the Web-oriented architecture route. "Smart Web App Development": Web-oriented architectures are easier to implement and offer a similar flexibility to SOA.)

WOA

http://hinchcliffe.org/archive/2008/02/27/16617.aspx
SOA & Data & ???


SOA Requirements

(Bar chart, 0-5.00 scale: Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, Data Support Operations.)
Predictive Analysis

"I'm a little surprised; with such extensive experience in predictive analysis, you should've known we would hire you."

What is Analytics?

Analytics: something that is analytic.

Analytic:
- Of or relating to analysis; especially, separating or breaking up a whole or a compound into its component parts or constituent elements
- Skilled in or using analysis
- The science of logical analysis

Car Maxx in Doha, Qatar

BI/Analytic Capabilities

Business Intelligence (BI) refers to technologies, applications, and practices for the collection, integration, analysis, and presentation of business information, and sometimes to the information itself. The purpose of business intelligence (a term that dates at least to 1958) is to support better business decision making.

Analytics: the simplest definition of analytics is "the science of analysis." A simple and practical definition, however, would be how an entity (i.e., a business) arrives at an optimal or realistic decision based on existing data.

BI/Analytic Capabilities

- Analytics: strategy formulation
- Business Intelligence: strategy implementation
BI/Analytic Capabilities

Wine quality = 12.145 + 0.00117 × winter rainfall + 0.0614 × growing-season temperature - 0.00386 × harvest rainfall (Orley Ashenfelter)

- Outperforms the experts, specifically Robert Parker (http://www.erobertparker.com/), and most everyone else
- Clinical Versus Statistical Prediction (Paul Meehl): in only 8 of 136 studies were the experts more accurate
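Ashenfelter's regression is simple enough to run directly; the sketch below uses the coefficients quoted on the slide with illustrative (not real vintage) inputs:

```python
# Ashenfelter's wine-quality regression, as quoted above.
def wine_quality(winter_rain_mm, growing_temp_c, harvest_rain_mm):
    return (12.145
            + 0.00117 * winter_rain_mm     # wetter winters help
            + 0.0614 * growing_temp_c      # warmer growing seasons help
            - 0.00386 * harvest_rain_mm)   # rain at harvest hurts

# Illustrative comparison: a wet winter, warm season, dry harvest
# versus a dry winter, cool season, wet harvest.
good = wine_quality(600, 17.5, 100)
poor = wine_quality(400, 15.0, 300)
print(round(good, 3), round(poor, 3))
```

The point of the slide survives the sketch: the formula is mechanical, yet it beat expert tasters.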

BI Challenges

Technical Challenges
Poor quality data
Poor understanding of architectural constructs
Poor quality data management practices
New technical expertise is required
Non-Technical Challenges
Architecture is under appreciated
BI perceived as a "technology" project
Inability to link technical capabilities to
business objectives
Putting BI initiatives in context
Obstacles to Real-Time BI: Lessons from Deployment

- Business case, high cost, or budget issues: 60%
- Non-integrated data sources: 47%
- Education and understanding of real-time BI by business users: 46%
- Lack of infrastructure for handling real-time processing: 46%
- Poor quality data: 43%
- Education and understanding of real-time BI by IT staff: 36%
- Lack of tools for doing real-time processing: 35%
- Immature technology: 28%
- Performance and scalability: 24%

TDWI, The Real Time Enterprise Report, 2003

Cost of Poor Data Quality $600 Billion Annually!

Thanks to Bret Champlin
Who is Joan Smith?

http://www.sas.com

Defining Customer Challenges

- Purchased an A4 on June 15, 2007
- Had not done business with the dealership prior
- "makes them seem sleazy when I get a letter in the mail before I've even made the first payment on the car advertising lower payments than I got"

How to solve this data quality problem using just tools?

Retail price for the unit was $40

A congratulations letter from another bank

Problems:
- The bank did not know it made an error
- Tools alone could not have prevented this error
- Lost confidence in the ability of the bank to manage customer funds

From my retirement plan

Rolling Stone Magazine


Quantitative Benefits

Tomorrow's Data Management

(Closing collage of earlier material: evidence types - system components, component elements, locations, user types, information, and business processes - feeding logical data models, process decompositions, business rules, and data assets; XML-based portals and repositories; increased business perception of DM value resulting from better business systems, including repositories, warehouses, and ERP implementations; and revised data management goals spanning data analysis technologies, quality, and challenges 1-4. http://peteraiken.net)

Contact Information:

Peter Aiken, Ph.D.
Department of Information Systems
School of Business
Virginia Commonwealth University
1015 Floyd Avenue - Room 4170
Richmond, Virginia 23284-4000

Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com

office: +1.804.883.759
cell: +1.804.382.5957
e-mail: peter@datablueprint.com
http://peteraiken.net
