Data Warehousing: Engr. Madeha Mushtaq Department of Computer Science Iqra National University
LECTURE 4
• Imagine a filing cabinet stuffed with documents but with no folders or
labels.
• Without metadata, your data warehouse is like such a filing cabinet.
• It is probably filled with information very useful for your users and for IT
developers and administrators.
• But without any easy means to know what is there, the data warehouse is of
very limited value.
WHO NEEDS METADATA?
METADATA IS LIKE A NERVE CENTER
• As data moves from the data sources, through the data staging area, to the
data warehouse database, several processes occur.
• In a typical data warehouse, appropriate tools assist in these processes.
• Each tool records its own metadata as data movement takes place.
• The metadata recorded by one tool drives one or more processes that
follow. This is how metadata assumes an active role and assists in the
automation of data warehouse processes.
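To make the idea of metadata driving later processes concrete, here is a minimal Python sketch. The `MetadataStore` class, the `extract` and `transform` functions, and the sample rows are all hypothetical names invented for illustration; real warehouse tools record far richer metadata.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical shared store where each tool records its metadata."""
    entries: dict = field(default_factory=dict)

    def record(self, tool: str, metadata: dict) -> None:
        self.entries[tool] = metadata

    def lookup(self, tool: str) -> dict:
        return self.entries[tool]


def extract(rows, store):
    """Extraction step: passes rows through and records the source field names."""
    store.record("extraction", {"fields": sorted(rows[0].keys()), "row_count": len(rows)})
    return rows


def transform(rows, store):
    """Transformation step: driven by the metadata the extraction step recorded."""
    fields = store.lookup("extraction")["fields"]
    # Upper-case every field name listed in the extraction metadata.
    transformed = [{name.upper(): row[name] for name in fields} for row in rows]
    store.record("transformation", {"fields": [name.upper() for name in fields]})
    return transformed


store = MetadataStore()
source_rows = [{"cust_id": 1, "name": "Ali"}, {"cust_id": 2, "name": "Sara"}]
result = transform(extract(source_rows, store), store)
print(result[0])
```

The transformation step never inspects the source system itself; it relies only on what the extraction step recorded, which is the sense in which metadata plays an active role.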
AUTOMATION OF WAREHOUSING TASKS
• Here is a list of back-end processes shown in the order in which they generally
occur:
• Source data structure definition
• Data extraction
• Initial reformatting/merging
• Preliminary data cleansing
• Data transformation and consolidation
• Validation and quality check
• Data warehouse structure definition
• Load image creation
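The ordering of the back-end processes listed above can be sketched in Python as an ordered enumeration. The `BackEndStep` enum and the `next_step` helper are hypothetical; the sketch assumes a scheduler that only knows which step was last completed and uses that piece of metadata to decide what runs next.

```python
from enum import IntEnum
from typing import Optional


class BackEndStep(IntEnum):
    """The back-end processes from the list above, numbered in their general order."""
    SOURCE_STRUCTURE_DEFINITION = 1
    DATA_EXTRACTION = 2
    INITIAL_REFORMATTING_MERGING = 3
    PRELIMINARY_CLEANSING = 4
    TRANSFORMATION_CONSOLIDATION = 5
    VALIDATION_QUALITY_CHECK = 6
    WAREHOUSE_STRUCTURE_DEFINITION = 7
    LOAD_IMAGE_CREATION = 8


def next_step(last_completed: Optional[BackEndStep]) -> Optional[BackEndStep]:
    """Return the step that follows the last completed one, or None when done."""
    if last_completed is None:
        return BackEndStep.SOURCE_STRUCTURE_DEFINITION
    if last_completed is BackEndStep.LOAD_IMAGE_CREATION:
        return None
    return BackEndStep(last_completed + 1)


# Walk the whole sequence, as a scheduler reading "last completed step" metadata might.
completed = []
step = next_step(None)
while step is not None:
    completed.append(step.name)
    step = next_step(step)
print(completed)
```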
DATA ACQUISITION
• In this area, the data warehouse processes relate to the following functions:
• Data extraction
• Data transformation
• Data cleansing
• Data integration
• Data staging
DATA ACQUISITION
The figure shows the metadata types recorded and used in the data acquisition area.
DATA STORAGE
• In this area, the data warehouse processes relate to the following functions:
• Data loading
• Data archiving
• Data management
DATA STORAGE
• Just as in the other areas, as processes take place in the data storage
functional area, the appropriate tools record the metadata elements relating
to the processes.
• Metadata recorded by processes in the data storage area is used for
development, for administration, and by the users.
• You will be using the metadata from this area for designing the full data
refreshes and the incremental data loads.
• The DBA will be using metadata for the processes of backup and recovery.
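As an illustration of how storage-area metadata can inform the choice between a full refresh and an incremental load, here is a minimal Python sketch. It assumes the metadata includes a last-load timestamp and that each source row carries a `modified_at` value; both are assumptions for this example, as are the names `select_rows_to_load` and `load_metadata`.

```python
from datetime import datetime
from typing import Optional


def select_rows_to_load(source_rows, last_load_time: Optional[datetime]):
    """Pick rows for the next load based on recorded load metadata.

    With no recorded load time, fall back to a full refresh (all rows);
    otherwise return only rows modified after the last load (incremental).
    """
    if last_load_time is None:
        return list(source_rows)
    return [row for row in source_rows if row["modified_at"] > last_load_time]


source_rows = [
    {"id": 1, "modified_at": datetime(2024, 1, 5)},
    {"id": 2, "modified_at": datetime(2024, 1, 20)},
    {"id": 3, "modified_at": datetime(2024, 2, 1)},
]

# Hypothetical storage-area metadata: when the warehouse table was last loaded.
load_metadata = {"last_load_time": datetime(2024, 1, 15)}

incremental = select_rows_to_load(source_rows, load_metadata["last_load_time"])
full_refresh = select_rows_to_load(source_rows, None)
print([row["id"] for row in incremental])
print(len(full_refresh))
```

Only the rows changed since the recorded load time are picked up in the incremental case; without that piece of metadata, the only safe option is to reload everything.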
INFORMATION DELIVERY
• In this area, the data warehouse processes relate to the following functions:
• Report generation
• Query processing
• Complex analysis
INFORMATION DELIVERY
The figure shows the metadata types recorded and used in the information delivery area.
METADATA TYPES
• Technical metadata is meant for the IT staff responsible for the development
and administration of the data warehouse.
• The technical personnel need information to design each process.
• These processes occur in every functional area of the data warehouse.
• Technical metadata is more structured than business metadata.
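One way to picture the structured nature of technical metadata is as a typed record that a tool can act on directly. The Python sketch below is illustrative only: the `ColumnMetadata` fields, the `create_table_sql` helper, and the sample table and source names are assumptions, not a standard metadata format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Hypothetical technical metadata for one warehouse column."""
    name: str
    data_type: str
    nullable: bool
    source_field: str  # where the value comes from in the source system


def create_table_sql(table: str, columns: list) -> str:
    """Build a CREATE TABLE statement from the column metadata."""
    parts = []
    for col in columns:
        null_clause = "" if col.nullable else " NOT NULL"
        parts.append(f"    {col.name} {col.data_type}{null_clause}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(parts) + "\n);"


columns = [
    ColumnMetadata("customer_key", "INTEGER", False, "CRM.CUST.ID"),
    ColumnMetadata("customer_name", "VARCHAR(100)", True, "CRM.CUST.NAME"),
]
ddl = create_table_sql("customer_dim", columns)
print(ddl)
```

Because every field has a fixed name and type, a program can generate table definitions from the record without human interpretation, which would not be possible from a free-text business description of the same column.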
METADATA REPOSITORY