Measuring Data Management Practice Maturity
Data Blueprint's methodology enables clients to learn and master processes, facilitated by our products and services, to achieve enhanced data management. The result of our team approach is empowerment of both your data and your organization to create and maintain premiere data management capabilities. The value to the organization is a positive return on your investment in data, one of your most critical assets.
Increasing data management practice maturity levels can positively impact the
coordination of data flow among organizations, individuals, and systems. Results
from a self-assessment provide a roadmap for improving organizational data
management practices.
As increasing amounts of data flow within and between organizations, the problems that can result from poor data management practices are becoming more apparent. Studies have shown that such poor practices are widespread. For example, PricewaterhouseCoopers reported that in 2004, only one in three organizations were highly confident in their own data, and only 18 percent were very confident in data received from other organizations. Further, just two in five companies have a documented, board-approved data strategy (www.pwc.com/extweb/pwcpublications.nsf/docid/15383D6E748A727DCA2571B6002F6EE9).

Michael Blaha1 and others in the research community have cited past organizational data management education and practices as the cause for poor database design being the norm.

According to industry pioneer John Zachman,2 organizations typically spend between 20 and 40 percent of their information technology budgets evolving their data via migration (changing data locations), conversion (changing data into other forms, states, or products), or scrubbing (inspecting and manipulating, recoding, or rekeying data to prepare it for subsequent use).

Approximately two-thirds of organizational data managers have formal data management training; slightly more than two-thirds of organizations use or plan to apply formal metadata management techniques; and slightly fewer than one-half manage their metadata using computer-aided software engineering tools and repository technologies.3

When combined with our personal observations, these results suggest that most organizations can benefit from the application of organization-wide data management practices. Failure to manage data as an enterprise-, corporate-, or organization-wide asset is costly in terms of market share, profit, strategic opportunity, stock price, and so on. To the extent that world-class organizations have shown that opportunities can be created through the effective use of data, investing in data as the only organizational asset that can't be depleted should be of great interest.
DATA MANAGEMENT DEFINITION AND EVOLUTION

As Table 1 shows, data management consists of six interrelated and coordinated processes, primarily derived by Burt Parker from sponsored research he led for the US Department of Defense at the MITRE Corporation.4

Figure 1 supports the similarly standardized definition: Enterprise-wide management of data is "understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities."4

The figure illustrates how organizational strategies guide other data management processes. Two of these processes, data program coordination and organizational data integration, provide direction to the implementation processes: data development, data support operations, and data asset use. The data stewardship process straddles the line between direction and implementation. All processes exchange feedback designed to improve and fine-tune overall data management practices.

Figure 1. Interrelationships among data management processes (adapted from Burt Parker's earlier work4). Organizational strategies supply goals to data program coordination and organizational data integration, whose guidance, integrated models, and standard data direct data stewardship, data development, data support operations, and data asset use; business value and feedback flow back from implementation. Blue lines indicate guidance, red lines indicate feedback, and green lines indicate data.

Data management has existed in some form since the 1950s and has been recognized as a discipline since the 1970s. Data management is thus a young discipline compared to, for example, the relatively mature accounting practices that have been practiced for thousands of years. As Figure 2 shows, data management's scope has expanded over time, and this expansion continues today.

Figure 2. Data management's growth over time. The discipline has expanded from an initial focus on database development and operation in the 1950s to 1970s to include additional responsibilities in the periods 1970-1990, 1990-2000, and from 2000 to the present: data requirements analysis, data modeling, enterprise data management coordination, enterprise data integration, enterprise data stewardship, enterprise data use, an explicit focus on data quality throughout, security, compliance, and other responsibilities.

Ideally, organizations derive their data management requirements from enterprise-wide information and functional user requirements. Some of these requirements come from legacy systems and off-the-shelf software packages. An organization derives its future data requirements from an analysis of what it will deliver, as well as future capabilities it will need to implement organizational strategies. Data management guides the transformation of strategic organizational information needs into specific data requirements associated with particular technology system development projects.
All organizations have data architectures, whether explicitly documented or implicitly assumed. An important data management process is to document the architecture's capabilities, making it more useful to the organization.

In addition, data management

- must be viewed as a means to an end, not the end itself. Organizations must not practice data management as an abstract discipline, but as a process supporting specific enterprise objectives, in particular, to provide a shared-resource basis on which to build additional services.
- involves both process and policy. Data management tasks range from strategic data planning to the creation of data element standards to database design, implementation, and maintenance.
- has a technical component: interfacing with and facilitating interaction between software and hardware.
- has a specific focus: creating and maintaining data to provide useful information.
- includes management of metadata artifacts that address the data's form as well as its content.

Although data management serves the organization, the organization often doesn't appreciate the value it provides. Some data management staffs keep ahead of the layoff curve by demonstrating positive business value. Management's short-term focus has often made it difficult to secure funding for medium- and long-term data management investments. Tracing the discipline's efforts to direct and indirect organizational benefits has been difficult, so it hasn't been easy to present an articulate business case to management that justifies subsequent strategic investments in data management.

Viewing data management as a collection of processes, each with a role that provides value to the organization through data, makes it easier to trace value through those processes and point not only to a methodological why of data management practice improvement but also to a specific, concrete how.

RESEARCH BASIS

Mark Gillenson has published three papers that serve as an excellent background to this research.5-7 Like earlier works, Gillenson focuses on the implementation half of Figure 1, adopting a more narrow definition of data administration. Over time, his work paints a picture of an industry attempting to catch up with technological implementation. Our work here updates and confirms his basic conclusions while changing the focus from whether a process is performed to the maturity with which it is performed.

Three other works also influenced our research: Ralph Keeney's value-focused thinking,8 Richard Nolan's six-stage theory of data processing,9 and the Capability Maturity Model Integration (CMMI).10,11

Keeney's value-focused thinking provides a methodological approach to analyzing and evaluating the various aspects of data management and their associated key process areas. We wove the concepts behind means and fundamental objectives into our assessment's construction to connect how we measure data management with what customers require from it.

In Stage VI of his six-stage theory of data processing, Nolan defined maturity as data resource management. Although Nolan's theory predates and is similar to the CMMI, it contains several ideas that we adapted and reused in the larger data management context. However, CMMI refinement remains our primary influence.

Most technologists are familiar with the CMM (and its upgrade to the CMMI), developed at Carnegie Mellon's Software Engineering Institute with assistance from the MITRE Corporation.10,11 The CMMI itself was derived from work that Ron Radice and Watts Humphrey performed while at IBM. Dennis Goldenson and Diane Gibson presented results pointing to a link between CMMI process maturity and organizational success.12 In addition, Cyndy Billings and Jeanie Clifton demonstrated the long-term effects for organizations that successfully sustain process improvement for more than a decade.13

CMMI-based maturity models exist for human resources, security, training, and several other areas of the software-related development process.
Our colleague, Brett Champlin, contributed a list of dozens of maturity measurements derived from or influenced by the CMMI. This list includes maturity measurement frameworks for data warehousing, metadata management, and software systems deployment. The CMMI's successful adoption in other areas encouraged us to use it as the basis for our data management practice assessment.

Whereas the core ideas behind the CMMI present a reasonable base for data management practice maturity measurement, we can avoid some potential pitfalls by learning from the revisions and later work done with the CMMI. Examples of such improvements include general changes to how the CMMI makes interrelationships between process areas more explicit and how it presents results to a target organization.

Work by Cynthia Hauer14 and Walter Schnider and Klaus Schwinn15 also influenced our general approach to a data management maturity model. Hauer nicely articulated some examples of the value determination factors and results criteria that we have adopted. Schnider and Schwinn presented a rough but inspirational outline of what mature data management practices might look like and the accompanying motivations.

RESEARCH OBJECTIVES

Our research had six specific objectives, which we grouped into two types: community descriptive goals and self-improvement goals.

Community descriptive research goals help clarify our understanding of the data management community and associated practices. Specifically, we want to understand

- the range of practices within the data management community;
- the distribution of data management practices, specifically the various stages of organizational data management maturity; and
- the current state of data management practices: in what areas are the community data management practices weak, average, and strong?

Self-improvement research goals help the community as a whole improve its collective data management practices. Here, we desire to

- better understand what defines current data management practices;
- determine how the assessment informs our standing as a technical community (specifically, how does data management compare to software development?); and
- gain information useful for developing a roadmap for improving current practice.

The CMMI's stated goals are almost identical to ours: the CMMI was designed to help developers "select process-improvement strategies by determining their current process maturity and identifying the most critical issues to improving their software quality and process."10 Similarly, our goal was to aid data management practice improvement by presenting a scale for measuring data management accomplishments. Our assessment results can help data managers identify and implement process improvement strategies by recognizing their data management challenges.

DATA COLLECTION PROCESS AND RESEARCH TARGETS

Between 2000 and 2006, we assessed the data management practices of 175 organizations. Table 2 provides a breakdown of organization types.

Table 2. Organizations included in data management analysis, by type.

Organization type            Percent
Local government             4
State government             17
Federal government           11
International organization   10
Commercial organization      58

Students from some of our graduate and advanced undergraduate classes largely conducted the assessments. We provided detailed assessment instruction as part of the course work. Assessors used structured telephone and in-person interviews to assess specific organizational data management practices by soliciting evidence of processes, products, and common features. Key concepts sought included the presence of commitments, abilities, measurements, verification, and governance.

Assessors conducted the interviews with the person identified as having the best, firsthand knowledge of organizational data management practices. Tracking down these individuals required much legwork; identifying them was often more difficult than securing the interview commitment.

The assessors attempted to locate evidence in the organization indicating the existence of key process areas within specific data management practices. During the evaluation, assessors observed strict confidentiality: they reported only compiled results, with no mention of specific organizations, individuals, groups, programs, or projects. Assessors and participants kept all information to themselves and observed proprietary rights, including several nondisclosure agreements.

All organizations implement their data management practice in ways that can be classified as one of five maturity model levels, detailed in Table 3. Specific evidence, organized by maturity level, helped identify the level of data management practiced.
Table 3. Data management practice assessment levels.

1 Initial. The organization lacks the necessary processes for sustaining data management practices; data management is characterized as ad hoc or chaotic. The organization depends entirely on individuals, with little or no corporate visibility into cost or performance, or even awareness of data management practices. There is variable quality, low results predictability, and little to no repeatability.

2 Repeatable. The organization might know where data management expertise exists internally and has some ability to duplicate good practices and successes. The organization exhibits variable quality with some predictability. The best individuals are assigned to critical projects to reduce risk and improve results.

3 Defined. The organization uses a set of defined processes, which are published for recommended use. Results are of good quality and within expected tolerances most of the time. The poorest individual performers improve toward the best performers, and the best performers achieve more leverage.

4 Managed. The organization statistically forecasts and directs data management based on defined processes and selected cost, schedule, and customer satisfaction levels. The use of defined data management processes within the organization is required and monitored. Reliability and predictability of results, such as the ability to determine progress or six sigma versus three sigma measurability, are significantly improved.

5 Optimizing. The organization analyzes existing data management processes to determine whether they can be improved, makes changes in a controlled fashion, and reduces operating costs by improving current process performance or by introducing innovative services to maintain its competitive edge. The organization achieves high levels of results certainty.
For each data management process, the assessment used between four and six objective criteria to probe for evidence. Assessed outside the data collection process, the presence or absence of this evidence indicated organizational performance at a corresponding maturity level.

ASSESSMENT RESULTS

The assessment results reported for the various practice areas show that overall scores are repeatable (level 2) in all data management practice areas.

Figure 3 shows assessment averages of the individual response scores. We used a composite chart to group the averages by practice area. Such groupings facilitate numerous comparisons, which organizations can use to plan improvements to their data management practices.

We present sample results (blue) for an assessed organization (disguised as "Mystery Airline"), whose management was interested in not only how the organization scored but also how it compared to other assessed airlines (red) and other organizations (white).

We grouped 19 individual responses according to the five data management maturity levels in the horizontal bar charts. Most numbers are averages. That is, for an individual organization, we surveyed multiple data management operations, combined the individual assessment results, and presented them as averages. We reported assessments of organizations with only one data management function as integers.

For example, the data program coordination practice area results include:

- Mystery Airline achieved level 1 on responses 1, 2, and 5, and level 2 on responses 3 and 4.
- The airline industry performed above both Mystery Airline and all respondents on responses 1 through 3.
- The airline industry performed below both Mystery Airline and all respondents on response 4, and well below all respondents and just those in the airline industry on response 5.

Figure 3f illustrates the range of results for all organizations surveyed for each data management process; for example, the assessment results for data program coordination ranged from 2.06 to 3.31.

The maturity measurement framework dictates that a data program can achieve no greater rating than the lowest rating achieved, hence the translation to the scores for Mystery Airline of 1, 2, 2, 2, and 2, combining for an overall rating of 1. This is congruent with CMMI application.

Although this might seem a tough standard, the rating reflects the adage that a chain is only as strong as its weakest link. Mature data management programs can't rely on immature or ad hoc processes in related areas. The lowest rating received becomes the highest possible overall rating.
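To make the rollup concrete, here is a minimal sketch of the weakest-link scoring described above; the response scores are invented for illustration, and this is not the article's actual assessment instrument.

```python
# Minimal sketch of the "weakest link" rollup described above.
# Response scores (levels 1-5) are invented; practice areas follow the article.
responses = {
    "data program coordination":   [1, 1, 2, 2, 1],
    "enterprise data integration": [2, 2, 3, 2],
    "data stewardship":            [2, 3, 2, 2],
    "data development":            [2, 2, 3, 3],
    "data support operations":     [2, 3, 2, 2],
}

# A practice area rates no higher than its weakest response ...
area_ratings = {area: min(scores) for area, scores in responses.items()}

# ... and the program rates no higher than its weakest practice area.
overall_rating = min(area_ratings.values())

for area, rating in area_ratings.items():
    print(f"{area:28s} rating {rating}")
print("overall data management rating:", overall_rating)   # -> 1
```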
Figure 3. Assessment results useful to Mystery Airline: (a) data program coordination, (b) enterprise data integration, (c) data stewardship, (d) data development, (e) data support operations, and (f) assessments' range. Each panel plots the individual response scores (0-4 scale) for Mystery Airline, the airline industry, and all respondents; panel (f) shows, for each data management process, the range of results across all organizations surveyed (for example, 2.06 to 3.31 for data program coordination).
The weakest-link rule also explains why many organizations are at level 1 with regard to their software development practices. While the CMMI process results in a single overall rating for the organization, data management requires a more fine-grained feedback mechanism. Knowing that some data management processes perform better than others can help an organization develop incentives as well as a roadmap for improving individual ratings.

Taken as a whole, these numbers show that no data management process or subprocess measured on average higher than the data program coordination process, at 3.31. It's also the only data management process that performed on average at a defined level (greater than 3). The results show a community that is approaching the ability to repeat its processes across all of data management.

Results analysis

Perhaps the most important general fact represented in Figure 3 is that organizations gave themselves relatively low scores. The assessment results are based on self-reporting and, although our 15-percent validation sample is adequate to verify accurate industry-wide assessment results, 85 percent of the assessment is based on facts that were described but not observed. Although direct observables for all survey respondents would have provided valuable confirming evidence, the cost of such a survey and the required organizational access would have been prohibitive.

We held in-person, follow-up assessment validation sessions with about 15 percent of the assessed organizations. These sessions helped us validate the collection method and refine the technique. They also let us gauge the assessment's accuracy.
Table 4. Assessment scores adjusted for self-reporting inflation.

Response   Adjusted average
1          1.72388
2          1.57463
3          1.0597
4          1.8806
5          2.31343
6          1.66418
7          1.33582
8          1.57463
9          1.1791
10a        1.40299
10b        1.14925
10c        0.97761
10d        1.20896
10e        1.23134
10f        1.12687
11         1.32836
12         0.57463
13         1.00746
14         1.46269
15         1.24627
16         1.65672
17         1.66418
18         1.04478
19         1.17164

Although the assessors strove to accurately measure each subprocess's maturity level, some interviews inevitably were skewed toward the positive end of the scale. This occurred most often because interviewees reported on milestones that they wanted to or would soon achieve as opposed to what they had achieved. We suspected, and confirmed during the validation sessions, that responses were typically exaggerated by one point on the five-point scale.

When we factor in the one-point inflation, the numbers in Table 4 become important. Knowing that the bar is so low will hopefully inspire some organizations to invest in data management. Doing so might give them a strategic advantage if the competition is unlikely to be making a similar investment.

The relatively low scores reinforce the need for this data management assessment. Based on the overall scores in the data management practice areas, the community receives five Ds. These areas provide immediate targets for future data management investment.

WHERE ARE WE NOW?

We address our original research objectives according to our two goal categories.

Community descriptive research goals

First, we wanted to determine the range of practices within the data management community. A wide range of such practices exists. Some organizations are strong in some data management practices and weak in others (the range of practice is consistently inconsistent). The wide divergence of practices both within and between organizations can dilute results from otherwise strong data management programs. The assessment's applicability to longitudinal studies remains to be seen; this is an area for follow-up research. Although researchers might undertake formal studies of such trends in the future, evidence from ongoing assessments suggests that results are converging. Consequently, we feel that our sample constitutes a representation of community-wide data management practices.

Next, we wanted to know whether the distribution of practices informs us specifically about the various stages of organizational data management maturity. The assessment results confirm the framework's utility, as do the postassessment validation sessions. Building on the framework, we were able to specify target characteristics and objective measurements. We now have better information as to what comprises the various stages of organizational data management practice maturity. Organizations do clump together into the various maturity stages that Nolan originally described. We can now determine the investments required to predictably move organizations from one data management maturity level to another.

Finally, we wanted to determine in what areas the community data management practices are weak, average, and strong. Figure 4 shows an average of unadjusted rates summarizing the assessment results. As the figure shows, the data management community reports itself relatively and perhaps surprisingly strong in all five major data management processes when compared to the industry averages for software development. The range and averages indicate that the data management community has more mature data program coordination processes, followed by organizational data integration, support operations, stewardship, and then data development. The relatively lower data development scores might suggest data program coordination implementation difficulties.

Figure 4. Average of unadjusted rates for the assessment results, by process (low, average, and high scores shown against the Initial, Repeatable, and Defined bands): data program coordination 2.06, 2.71, 3.31; enterprise data integration 2.18, 2.44, 2.66; data stewardship 1.98, 2.18, 2.40; data development 1.57, 2.12, 2.46; data support operations 2.04, 2.38, 2.66.

Self-improvement research goals

Our first objective was to produce results that would help the community better understand current best practices. Organizations can use the assessment results to compare their specific performance against others in their industry and against the community results as a whole. Quantities and groupings indicate the relative state and robustness of the best practices within each process. Future research can use this information to identify specific practices that can be shared with the community.
Further study of these areas will provide leverageable benefits.

Next, we wanted to determine how the assessment informs our standing as a technical community. Our research gives some indication of the claimed current state of data management practices. However, given the validation session results, we believe that it's best to caution readers that the numbers presented probably more accurately describe the intended state of the data management community.

As it turns out, the relative numbers of organizations above level 1 for both software and data management are approximately the same, but a more detailed analysis would be helpful. Given the belief that investment in software development practices will result in significant improvements, it's appropriate to anticipate similar benefits from investments in data management practices.

Finally, we hoped to gain information useful for developing a roadmap for improving current practice. Organizations can use the survey assessment information to develop roadmaps to improve their individual data management practices. Mystery Airline, for example, could develop a roadmap for achieving data management improvement by focusing on enterprise data integration, data stewardship, and data development practices.

SUGGESTIONS FOR FUTURE RESEARCH

Additional research must include a look at relationships between data management practice areas, which could indicate an efficient path to higher maturity levels. Research should also explore the success or failure of previous attempts to raise the maturity levels of organizational data management practices.

One of our goals was to determine why so many organizational data management practices are below expectations. Several current theses could spur investigation of the root causes of poor data management practices. For example,

- Are poor data management practices a result of the organization's lack of understanding?
- Does data management have a poor reputation or track record in the organization?
- Are the executive sponsors capable of understanding the subject?
- How have personnel and project changes affected the organization's efforts?

Our assessment results suggest a need for a more formalized feedback loop that organizations can use to improve their data management practices. Organizations can use this data as a baseline from which to look for, describe, and measure improvements in the state of the practice. Such information can enhance their understanding of the relative development of organizational data management. Other investigations should probe further to see if patterns exist for specific industry or business focus types.

Building an effective business case for achieving a certain level of data management is now easier. The failure to adequately address enterprise-level data needs has hobbled past efforts.4 Data management has, at best, a business-area focus rather than an enterprise outlook. Likewise, applications development focuses almost exclusively on line-of-business needs, with little attention to cross-business-line data integration or enterprise-wide planning, analysis, and decision needs (other than within personnel, finance, and facilities management). In addition, data management staff is inexperienced in modern data management needs, focusing on data management rather than metadata management and on syntaxes instead of semantics and data usage.

Few organizations manage data as an asset. Instead, most consider data management a maintenance cost. A small shift in perception (from viewing data as a cost to regarding it as an asset) can dramatically change how an organization manages data. Properly managed data is an organizational asset that can't be exhausted. Although data can be polluted, retired, destroyed, or become obsolete, it's the one organizational resource that can be repeatedly reused without deterioration, provided that the appropriate safeguards are in place. Further, all organizational activities depend on data.

To illustrate the potential payoff of the work presented here, consider what 300 software professionals applying software process improvement over an 18-year period achieved:16

- They predicted costs within 10 percent.
- They missed only one deadline in 15 years.
- The relative cost to fix a defect is 1X during inspection, 13X during system testing, and 92X during operation.
- Early error detection rose from 45 to 95 percent between 1982 and 1993.
- Product error rate (measured as defects per 1,000 lines of code) dropped from 2.0 to 0.01 between 1982 and 1993.

If improvements in data management can produce similar results, organizations should increase their maturity efforts.

Acknowledgments

We thank Graham Blevins, David Rafner, and Santa Susarapu for their assistance in preparing some of the reported data. We are greatly indebted to many of Peter Aiken's classes in data reengineering and related topics at Virginia Commonwealth University for the careful work and excellent results obtained as a result of their various contributions to this research. This article also benefited from the suggestions of several anonymous reviewers. We also acknowledge the helpful, continuing work of Brett Champlin at Allstate in collecting, applying, and assessing CMMI-related efforts.

References

10. Carnegie Mellon Univ. Software Eng. Inst., Capability Maturity Model: Guidelines for Improving the Software Process, 1st ed., Addison-Wesley Professional, 1995.
11. M.C. Paulk and B. Curtis, "Capability Maturity Model, Version 1.1," IEEE Software, vol. 10, 1993, pp. 18-28.
12. D.R. Goldenson and D.L. Gibson, "Demonstrating the Impact and Benefits of CMM: An Update and Preliminary Results," special report CMU/SEI-2003-SR-009, Carnegie Mellon Univ. Software Eng. Inst., 2003, pp. 1-55.
13. C. Billings and J. Clifton, "Journey to a Mature Software Process," IBM Systems J., vol. 33, 1994, pp. 46-62.
14. C.C. Hauer, "Data Management and the CMM/CMMI: Translating Capability Maturity Models to Organizational Functions," presented at National Defense Industrial Assoc. Technical Information Division Symp., 2003; www.dtic.mil/ndia/2003technical/hauer1.ppt.
15. W. Schnider and K. Schwinn, "Der Reifegrad des Datenmanagements" [The Data Management Maturity Model], KPP Consulting, 2004 (in German); www.kpp-consulting.ch/downloadbereich/DM%20Maturity%20Model.pdf.
16. H. Krasner, J. Pyles, and H. Wohlwend, "A Case History of the Space Shuttle Onboard Systems Project," Technology Transfer 94092551A-TR, Sematech, 31 Oct. 1994.
Improving Data Management Practices

Data Management Practices Assessment
peter@datablueprint.com
Peter Aiken
Full time in information technology since 1981
IT engineering research and project background
University teaching experience since 1979
Seven books and dozens of articles
Research Areas
reengineering, data reverse engineering, software requirements engineering, information engineering, human-
computer interaction, systems integration/systems engineering, strategic planning, and DSS/BI
Director
George Mason University/Hypermedia Laboratory (1989-1993)
Published Papers
Communications of the ACM, IBM Systems Journal, InformationWEEK, Information & Management, Information
Resources Management Journal, Hypermedia, Information Systems Management, Journal of Computer
Information Systems and IEEE Computer & Software
DoD Computer Scientist
Reverse Engineering Program Manager/Office of the Chief Information Officer (1992-1997)
Visiting Scientist
Software Engineering Institute/Carnegie Mellon University (2001-2002)
DAMA International Advisor/Board Member (http://dama.org)
2001 DAMA International Individual Achievement Award (with Dr. E. F. "Ted" Codd)
2005 DAMA Community Award
Founding Advisor/International Association for Information and Data Quality (http://iaidq.org)
Founding Advisor/Meta-data Professionals Organization (http://metadataprofessional.org)
Founding Director Data Blueprint 1999
Contact Information:
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office: +1.804.883.759
cell: +1.804.382.5957
e-mail: peter@datablueprint.com
http://peteraiken.net
Organizations Surveyed
Results from more than 400 organizations: Public Companies 58%, State Government Agencies 17%, Federal Government 11%, International Organizations 10%, Local Government 4% (government organizations 32% in total).
Appropriate government and public company representation.
Enough data to demonstrate that European organization DM practices are generally more mature.
In 25 years (1981 to 2007): chart comparing the distribution of project outcomes (Successful, Partial Success, Don't know/too soon to tell, Unsuccessful, Does not exist), with values ranging from 0 to about 0.36.
Largely Ineffective DM Investments
Return approximately 0: 70%
Investment > Return: 20%
Investment <= Return: 10%
September 21, 2004
Confusion: correct name, Yusuf Islam; TSA No Fly listing, Youssouf Islam.
15,000 people appealed to be removed from the US terror watch list.
2,000 per month are requesting removal.
TSA promised a 30-day review process; the actual time is 44 days.
The American Civil Liberties Union estimates 1 million people are on US government watch lists.
DM Involvement
Chart: for each initiative (Data Warehousing, XML, Data Quality), the share of organizations where DM is the initiative leader, is involved in the initiative, or is not involved.
Data Management
"Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities."
Aiken, P., Allen, M.D., Parker, B., and Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment," IEEE Computer (research feature), April 2007.
Diagram: relationships among fact, meaning, data, data requests, information use, and intelligence.
Database design
Database operation
Data requirements analysis
Data modeling
Enterprise data management coordination
Enterprise data integration
Data stewardship
Data use
Data Quality, Data Security
Data Compliance, Mashups
(more)
DM Practice Evolution
Chart: growth of data management practices plotted annually from 1978 through 2007.
Organizational DM Functions and their Inter-relationships
Diagram: organizational strategies supply goals and guidance to data program coordination and organizational data integration; integrated models, standard data, and application models and designs direct data stewardship and data development; implementation returns business value.
Diagram: a typical legacy environment of payroll, finance, marketing, personnel, R&D, and manufacturing applications and data stores (detailed on the "Typical System Evolution" slide later).
Niccolò Machiavelli (1469-1527)
Sample Conversation (Developing Constraints)
Concrete Block & Engineering Continuity
Look Familiar?
Finance Example
Business Rule: A customer may have one and only one account.
Bank Manager: "The customer is always right ... and this one needs multiple accounts!"

  #   Account ID   Sorted IDs
  1   peter        peter
  2   peter1       peter1
  3   peter2       peter10
  4   peter3       peter2
  5   peter4       peter3
  6   peter5       peter4
  7   peter6       peter5
  8   peter7       peter6
  9   peter8       peter7
 10   peter9       peter8
 11   peter10      peter9
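A side effect worth noting in the table above: once the one-account rule is bent by minting peter1 through peter10, plain text sorting scatters the customer's accounts (peter10 lands between peter1 and peter2). A small sketch, using the slide's hypothetical IDs, of lexicographic versus natural ordering:

```python
import re

account_ids = ["peter", "peter1", "peter2", "peter3", "peter4", "peter5",
               "peter6", "peter7", "peter8", "peter9", "peter10"]

# Plain string sort: "peter10" lands between "peter1" and "peter2".
print(sorted(account_ids))

# Natural sort: split the trailing number out and sort on (prefix, number).
def natural_key(account_id):
    match = re.match(r"([a-zA-Z]*)(\d*)$", account_id)
    prefix, digits = match.group(1), match.group(2)
    return (prefix, int(digits) if digits else -1)

print(sorted(account_ids, key=natural_key))
```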
Architecture Jargon
Avoiding Unnecessary Work Using Business Rule Metadata
BR1) Zero, one, or more EMPLOYEES can be associated with one PERSON.
BR2) Zero, one, or more EMPLOYEES can be associated with one JOB CLASS.
BR3) Zero, one, or more EMPLOYEES can be associated with one POSITION.
BR4) One or more POSITIONS can be associated with one JOB CLASS.
(Diagram labels: Person, Job Class, Position; 'Mond-Licht' or 'Mondschein'.)
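One reading of this slide: cardinality rules such as BR1-BR3 can be captured once as metadata and checked mechanically, rather than being re-encoded in every program that touches the data. A minimal sketch, with the rule encoding and records invented for illustration:

```python
# Hypothetical sketch: business-rule metadata (maximum parent references per
# employee attribute, per BR1-BR3 above) applied as a generic data check.
rules = {
    "person":    1,   # BR1: an EMPLOYEE is associated with one PERSON
    "job_class": 1,   # BR2: an EMPLOYEE is associated with one JOB CLASS
    "position":  1,   # BR3: an EMPLOYEE is associated with one POSITION
}

employees = [  # invented records
    {"id": 1, "person": "P-100", "job_class": "JC-7", "position": "POS-3"},
    {"id": 2, "person": ["P-101", "P-102"], "job_class": "JC-7", "position": "POS-4"},
]

def violations(records, rules):
    """Yield (record id, attribute) where a record references too many parents."""
    for record in records:
        for attribute, max_refs in rules.items():
            value = record.get(attribute)
            refs = value if isinstance(value, list) else [value]
            if len([r for r in refs if r is not None]) > max_refs:
                yield record["id"], attribute

print(list(violations(employees, rules)))   # -> [(2, 'person')]
```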
Student System Data Model
Proposed Data Model
Derivations/Constraints Model: Records the rules for deriving legal values for instances of Entity-Relationship Model components, and for controlling the use or existence of E-R instances.
Enterprise Structure Model: Defines the scope of the enterprise to be modeled and assigns a name to the model that serves to qualify each component of the model.
Entity-Relationship Model: Defines the Business Entities, their properties (attributes), and the relationships they have with other Business Entities.
Extension Support Model: Provides for tactical Information Model extensions to support special tool needs.
Flow Model: Specifies which of the Entity-Relationship Model component instances are passed between Process Model components.
Global Text Model: Supports recording of extended descriptive text for many of the Information Model components.
IMS Structures Model: Defines the component structures and elements and the application program views of an IMS Database.
Info Usage Model: Specifies which of the Entity-Relationship Model component instances are used by other Information Model components.
Library Model: Records the existence of non-repository files and the role they play in defining and building an automated Business Application.
Organization/Location Model: Records the organization structure and location definitions for use in describing the enterprise.
Panel/Screen Model: Identifies the Panels and Screens and the fields they contain as elements used in an automated Business Application.
Process Model: Defines Business Processes, their subprocesses, and components.
Program Elements Model: Identifies the various pieces and elements of application program source that serve as input to the application build process.
Relational Database Model: Describes the components of a Relational Database design in terms common to all SAA relational DBMSs.
Resource/Problem Model: Identifies the problems and needs of the enterprise, the projects designed to address those needs, and the resources required.
Strategy Model: Records business strategies to resolve problems, address goals, and take advantage of business opportunities; it also records the actions and steps to be taken.
Test Model: Identifies the various files (test procedures, test cases, etc.) affiliated with an automated Business Application for use in testing that application.
Value Domain Model: Defines the data characteristics and allowed values for information items.
Archeology-based Transformations Solve a Puzzle
Primary sources of guidance:
- The edge pieces are easy to identify.
- Distinct physical piece features exist, such as colors, patterns, and pictures.
Steps for solving:
- Physically segregate all identified edge pieces (not always present in an existing environment).
- Create the puzzle framework by connecting edge pieces, using the puzzle picture as a guide.
- Within the frame, physically group the remaining pieces by distinct physical features.
- Solve a smaller section of the puzzle containing just a portion of the picture focused on similar physical features (a ball or a puppy in the picture, for example). This is effective because the focus is on a common domain, one distinct aspect of the entire picture, and because the analysis is limited to a smaller number of puzzle pieces, it is proportionately easier than attempting to solve the overall puzzle at once.
- As the sections are assembled, combine them to solve the complete puzzle.
Flood
New River Bridge
Bridge Engineering
Source: Pamela F. Olson, former Assistant Secretary for Tax Policy (quote from the Diane Rehm Show, 11/29/04, http://www.wamu.org/programs/dr/04/11/29.php)
Data Integration/Exchange Challenges
FBI & Canadian Social Security Gender Codes
1. Male
2. Female
3. Formerly male, now female
4. Formerly female, now male
5. Uncertain
6. Won't tell
7. Doesn't know
8. Male soon to be female

Transformation rule: if column 1 in source = "m", then set value of target data to "male", else set value of target data to "female".

Legacy System #1: Payroll. Platform: Amdahl; OS: MVS; 1998 age: 15; data structure: VSAM/virtual database tables; physical records: 780,000; logical records: 60,000; relationships: 64; entities: 4/350; attributes: 683.
Legacy System #2: Personnel. Platform: UniSys; OS: OS; 1998 age: 21; data structure: DMS (network); physical records: 4,950,000; logical records: 250,000; relationships: 62; entities: 57; attributes: 1,478.
New System
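A quick sketch, in the spirit of the transformation rule above (field names and the exact mapping are invented), makes the information loss visible: eight source meanings collapse into two target values.

```python
# Hypothetical sketch of the lossy mapping suggested by the slide.
source_codes = {
    1: "Male",
    2: "Female",
    3: "Formerly male, now female",
    4: "Formerly female, now male",
    5: "Uncertain",
    6: "Won't tell",
    7: "Doesn't know",
    8: "Male soon to be female",
}

def to_target(code: int) -> str:
    # Anything that is not explicitly "male" becomes "female".
    return "male" if code == 1 else "female"

mapped = {code: to_target(code) for code in source_codes}
print(mapped)
print(len(source_codes), "source meanings ->", len(set(mapped.values())), "target values")
```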
Project plan excerpt (week of May 27 to June 3, 1996):
ID 1, task 1000 ORGANIZATION: duration 18.01d, cost $128,335.99, work 82.44d.
ID 2, task 1100 Organize Project: duration 18d, cost $42,585.33, work 27.36d (Technology Consultant [0.55], Experi...).
ID 3, task 1200 Complete Work Program: duration 18d, cost $71,739.42, work 46.08d (Technology Consultant [0.92], Consu...).

Joseph R. Hudicka, "Why ETL and Data Migration Projects Fail," Oracle Developers Technical Users Group Journal, June 2005, pp. 29-31.
Diagram detail: data stewardship and data development exchange standard data and application models and designs, producing business value.
New Technical Expertise Required
Why?
Metadata Engineering
O1-O3: reconstitute original metadata.
O4-O5: improve the current metadata.
O6-O9: improve system data capabilities based on the improved metadata.
Diagram: reverse engineering recreates the existing data implementation (O1), data design (O2), and requirements (O3) as to-be requirements, design, and metadata assets; forward engineering then redevelops requirements (O7), redesigns data (O6, O8), and reimplements data (O9) for the new system.
Structured Data Engineering
Phase I, archeology-based transformations designed to understand the existing environment: the structure of an unknown collection of components (Component 1 through Component 25) is unknown (T1), then discovered (T2); Pareto filtering (T3) hypothesizes a potential subset (T4).
Phase II, developing the desired architecture: planning, modeling, and combing (T5); technology capabilities and integration analyses (T6); gap, repeatability, and reusability analyses (T7); and CM2-based component solution and implementation engineering (T8).
Beneficiaries
Finance/Banking: Customers, Accounts, Products, Branches
Universities: Students, Instructors, Courses, Enrollments, Facilities, Classrooms, Exams & testing
Healthcare: Patients, Suppliers, Treatments, Hospitals, Doctors, Nurses, Medications
Retailers: Customers, Loyalty Programs, Suppliers, Products, Orders, Inventory
Utilities: Customers, Suppliers, Services, Installations, Utilization, Parts Inventory
Adapted from Data Strategy by Sid Adelman, Larissa Moss, and Majid Abai (2005), Addison-Wesley Professional, ISBN 0321240995.
How many interfaces are required to solve this integration problem?
15 interfaces: (N * (N - 1)) / 2, with N = 6 systems.
Integration Processor
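Checking the arithmetic behind the slide: with N systems, point-to-point integration needs N(N-1)/2 interfaces, while routing everything through an integration processor (hub) needs only N connections. A small sketch (the function names are mine):

```python
def point_to_point(n: int) -> int:
    """Pairwise interfaces needed when every system talks to every other system."""
    return n * (n - 1) // 2

def via_hub(n: int) -> int:
    """Connections needed when every system talks only to a central hub."""
    return n

for n in (6, 10, 20):
    print(f"{n} systems: {point_to_point(n)} point-to-point interfaces vs {via_hub(n)} hub connections")
# 6 systems -> 15 point-to-point interfaces, matching the slide.
```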
Typical System Evolution
Diagram: a Finance Application (3rd GL, batch system, no source) over indexed Finance Data; a Payroll Application (3rd GL) with a payroll database; a Marketing Application (4th GL with query facilities, no reporting, very large) using an external marketing database; a 20-year-old Personnel Application with un-normalized data; researcher-supported, undocumented R&D Applications over raw R&D Data; and contractor-supported Manufacturing Applications with a home-grown manufacturing database.
Diagram: the same environment with an XML processor mediating each application's data exchanges (payroll, finance, marketing, personnel, R&D, and manufacturing).
3-Way Scalability: XML-based Integration Solution
Expand the:
1. Number of data items from each system: how many individual data items are tagged?
2. Number of interconnections between the systems and the hub: how many systems are connected to the hub?
3. Amount of interconnectability among hub-connected systems: how many inter-system data item transformations exist in the rule collection?
(Diagram: an XML processor hub connected to Applications 4, 5, and 6.)
Existing environment: System 1 through System 6.
XML-Based Meta Data Management
Diagram: existing systems feed XSLT transformations into a system-to-system transformation knowledge data store; the store drives program transformations and generated programs for the new system.
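The hub pattern sketched on this slide amounts to: each system maps its records into a shared XML form once, and system-to-system moves become transformations over that form. Below is a minimal sketch of that shape using Python's standard library with invented element names; in practice the per-system rules would typically live in XSLT stylesheets, as the slide suggests.

```python
# Hypothetical sketch of hub-style integration: a source system's record layout
# is mapped to a shared "hub" XML form, and targets read from that form.
import xml.etree.ElementTree as ET

payroll_record = '<emp><empno>1042</empno><nm>Codd, Ted</nm></emp>'

def payroll_to_hub(xml_text: str) -> ET.Element:
    src = ET.fromstring(xml_text)
    hub = ET.Element("employee")
    ET.SubElement(hub, "id").text = src.findtext("empno")
    ET.SubElement(hub, "name").text = src.findtext("nm")
    return hub

def hub_to_personnel(hub: ET.Element) -> str:
    dst = ET.Element("person", attrib={"id": hub.findtext("id")})
    ET.SubElement(dst, "fullName").text = hub.findtext("name")
    return ET.tostring(dst, encoding="unicode")

print(hub_to_personnel(payroll_to_hub(payroll_record)))
# -> <person id="1042"><fullName>Codd, Ted</fullName></person>
```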
Portal Solution
[Adapted from Terry Lanham, "Designing Innovative Enterprise Portals and Implementing Them Into Your Content Strategies: Lockheed Martin's Compelling Case Study," Web Content II: Leveraging Best-of-Breed Content Strategies, San Francisco, CA, 23 January 2001]
Cruiser Collector
Capability Maturity Model Levels
The 1996 Council of American Building Officials (CABO) and 2000 International Code Council recommendations call for stair unit runs to be not less than 10 inches and unit rises not more than 7-3/4 inches.
Optimizing (5): We have a process for improving our DM capabilities.
Managed (4): We manage our DM processes so that the whole organization can follow our standard DM guidance.
Defined (3): We have experience that we have standardized so that all in the organization can follow it.
The CMM is one concept for process improvement; others include Nolan's stage theory, TQM, TQdM, TDQM, and ISO 9000, which focus on understanding current processes and determining where improvements can be made.
...while the same pattern generally holds true for on-time performance.
Percentage of Projects on Time, by Process Framework Adoption
Source: Applications Executive Council, Applications Budget, Spend, and Performance Benchmarks: 2005 Member Survey Results, Washington, D.C.: Corporate Executive Board, 2006, p. 23.
Organizational DM Functions and their Inter-relationships
Diagram: organizational strategies set goals for data program coordination and organizational data integration; their guidance, integrated models, and standard data direct data stewardship and data development, whose application models and designs flow to implementation and return business value.
Organizational DM Functions and their Inter-relationships
Data Program Coordination: defining, coordinating, resourcing, implementing, and monitoring organizational data program strategies, policies, plans, etc. as a coherent set of activities.
Organizational Data Integration: identifying, modeling, coordinating, organizing, distributing, and architecting data shared across business areas or organizational boundaries.
Organizational DM Functions and their Inter-relationships
Diagram detail; labels include "data management processes and infrastructure" and "organizational-entity subject area data integration."
How is it done?
Assessment Benefits
Quantitative benefits:
Objective determination of baseline BI/Analytic capabilities
Gap analysis indicates specific actions required to achieve the "next" level
Available comparisons with similar organizations
Provides facts useful when prioritizing subsequent investments
Qualitative benefits:
Highlights strengths, weaknesses, capabilities, and limitations of existing BI/...
Data Management Practices Assessment (DMPA)
Collaboration with CMU's Software Engineering Institute (SEI)
Results from more than 400 organizations: public companies, state government agencies, federal government, and international organizations
Maturity levels (a defined industry standard): Initial (I), Repeatable (II), Defined (III), Managed (IV), Optimizing (V)
Practice areas assessed:
Data Program Coordination and Organizational Data Integration (focus: guidance and facilitation)
Data Stewardship
Data Development and Data Support Operations (focus: implementation and access)
[Bar chart: verified average maturity scores from 1.2 to 3.0 across areas such as development guidance, data support administration, systems capability, asset recovery, and development training.]
Comparative Assessment Results
[Bar chart, scale 0-5 (legend: Client, Nokia, Industry Competition, All Respondents), covering Data Program Coordination, Organizational Data Integration, Data Stewardship, Data Development, and Data Support Operations; Data Program Coordination, Organizational Data Integration, and Data Support Operations are flagged as challenges.]
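As a rough illustration of how such a comparison might be tallied, here is a small Python sketch; the practice areas come from the slides, but every score below is invented and the 2.0 benchmark is a placeholder, not assessment data.

from statistics import mean

PRACTICE_AREAS = [
    "Data Program Coordination",
    "Organizational Data Integration",
    "Data Stewardship",
    "Data Development",
    "Data Support Operations",
]

# Hypothetical per-respondent scores; real values would come from the assessment instrument.
client_scores = {
    "Data Program Coordination":       [1.0, 1.5, 1.2],
    "Organizational Data Integration": [2.0, 1.8, 2.1],
    "Data Stewardship":                [2.5, 2.4, 2.6],
    "Data Development":                [2.2, 2.0, 2.3],
    "Data Support Operations":         [1.4, 1.6, 1.3],
}
benchmark = {area: 2.0 for area in PRACTICE_AREAS}  # placeholder benchmark value

for area in PRACTICE_AREAS:
    avg = mean(client_scores[area])
    flag = "  <-- challenge" if avg < benchmark[area] else ""
    print(f"{area:32s} client={avg:.1f}  benchmark={benchmark[area]:.1f}{flag}")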
Data Quality
[Bar chart, scale 0-5: overall benchmarks, industry benchmarks, and scores for TRE, IFC, and ISG.]
The challenge ahead
[Chart, scale 0.00-5.00, across 28 respondents: the chart represents the average scores presented on the previous slide; interestingly, none have apparently reached level 3.]
Services: Integration Possibilities
Integration can occur at the user interface, business process, application, or data level.
A/V component analogy: well-defined components, self-contained, no interdependencies.
[Analogy derived from D. Barry, "Web Services," Intelligent Enterprise, 10/10/03, pp. 26-47; wiring diagram from sunflowerbroadband.com]
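A minimal Python sketch of that component idea follows; the AddressValidator interface and its implementation are hypothetical names used only to illustrate "well-defined, self-contained, no interdependencies."

from typing import Protocol

class AddressValidator(Protocol):
    """The well-defined interface; callers depend only on this contract."""
    def validate(self, address: str) -> bool: ...

class InMemoryAddressValidator:
    """A self-contained component; it could be swapped for an external web service
    without changing any caller, because there are no interdependencies."""
    def __init__(self, known_streets: set):
        self.known_streets = known_streets

    def validate(self, address: str) -> bool:
        return any(street in address for street in self.known_streets)

def route_order(address: str, validator: AddressValidator) -> str:
    # The caller knows only the interface, not the implementation behind it.
    return "ship" if validator.validate(address) else "hold for correction"

print(route_order("501 East Franklin Street", InMemoryAddressValidator({"Franklin"})))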
Contractor Implemented Wiring
Bank
Organizational Portal
Organizational News
Organizational IT Email
Legacy Systems Transformed Into Web-services Accessed Through a Portal
Solution Framework
[Diagram: eight systems of record (SOR 1-8) feed a central repository; services for external address validation processing, indicator extraction, a latency check (which could be segmented by day of week, month, system, etc.), customer contact, and address updates sit between the repository and eight delivery channels (Ch 1-8).]
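The following Python sketch wires the framework's labeled pieces together in one plausible order; the exact flow in the original diagram is not recoverable, so the function names and the made-up records below are illustrative assumptions only.

def extract_from_sors(sors):
    """Pull address records from each system of record (SOR 1-8) into the repository."""
    return [record for sor in sors for record in sor]

def validate_addresses(repository, external_validator):
    """Apply the external address-validation processing to every record."""
    return [dict(r, valid=external_validator(r["address"])) for r in repository]

def latency_check(repository, max_age_days=30):
    """Flag stale records; could be segmented by day of week, month, system, etc."""
    return [dict(r, stale=r["age_days"] > max_age_days) for r in repository]

def publish_to_channels(repository, channels):
    """Deliver validated, current records to each channel (Ch 1-8)."""
    return {ch: [r for r in repository if r["valid"] and not r["stale"]] for ch in channels}

# Tiny end-to-end run with invented data.
sors = [[{"address": "501 East Franklin Street", "age_days": 3}],
        [{"address": "unknown", "age_days": 90}]]
repo = extract_from_sors(sors)
repo = validate_addresses(repo, lambda a: "Street" in a)
repo = latency_check(repo)
print(publish_to_channels(repo, ["Ch 1", "Ch 2"]))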
Logical Extension
[Magazine cover: "Defining the Business Value of Technology," infrastructures from the likes of Amazon.com and Google]
WOA (Web-Oriented Architecture)
http://hinchcliffe.org/archive/2008/02/27/16617.aspx
SOA & Data & ???
SOA Requirements
Data Stewardship
Data Development
What is Analytics?
Analytics: something that is analytic
Analytic:
Of or relating to analysis; especially, separating or breaking up a whole or a compound into its component parts or constituent elements
Skilled in or using analysis
The science of logical analysis
BI/Analytic Capabilities
BI Challenges
Technical challenges:
Poor quality data
Poor understanding of architectural constructs
Poor quality data management practices
New technical expertise is required
Non-technical challenges:
Architecture is underappreciated
BI perceived as a "technology" project
Inability to link technical capabilities to business objectives
Putting BI initiatives in context
Obstacles to Real-Time BI: Lessons from Deployment
Thanks to Bret Champlin
Who is Joan Smith?
http://www.sas.com
Defining Customer Challenges
Purchased an A4 on June 15, 2007
Had not done business with the dealership prior
"makes them seem sleazy when I get a letter in the mail before I've even made the first payment on the car advertising lower payments than I got"
A congratulations letter from another bank
Problems:
Bank did not know it made an error
Tools alone could not have prevented this error
Lost confidence in the ability of the bank to manage customer funds
Rolling Stone Magazine
Quantitative Benefits
Data Assets and XML-based Portals and Repositories
[Diagram: data assets (evidence and evidence types, system components and component elements, locations, user types, information, business processes, logical processes, data type attributes, logical data entity models, decompositions, and business rules) exposed through XML-based portals and XML-based repositories.]
Business Intelligence: increased business perception of DM value resulting from better business systems, including repositories, warehouses, and ERP implementations.
http://peteraiken.net
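To make the XML-based repository idea above concrete, here is a minimal Python sketch that writes one data asset's metadata as XML; every element and attribute name is invented for illustration and is not a schema from the original material.

import xml.etree.ElementTree as ET

# Build a small metadata record for a single, hypothetical data asset.
asset = ET.Element("DataAsset", name="CustomerAddress")
ET.SubElement(asset, "SystemComponent").text = "Billing system"
ET.SubElement(asset, "LogicalDataEntity").text = "Customer"
ET.SubElement(asset, "DataTypeAttribute", type="string").text = "postal_address"
ET.SubElement(asset, "BusinessRule").text = "Address must pass external validation"
ET.SubElement(asset, "Steward").text = "Customer data stewardship team"

# Serialize the record as the kind of document an XML-based repository or portal might serve.
print(ET.tostring(asset, encoding="unicode"))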
Tomorrow's Data Management
[Diagram: revised data management goals addressing Challenges #1 through #4, including data analysis technologies and quality.]
Contact Information:
Data Blueprint
Maggie L. Walker Business & Technology Center
501 East Franklin Street
Richmond, VA 23219
804.521.4056
http://datablueprint.com
office: +1.804.883.759
cell: +1.804.382.5957
e-mail: peter@datablueprint.com
http://peteraiken.net