
Characteristics and Functions of Data Warehouse

The document discusses the key characteristics and functions of data warehouses. It describes data warehouses as subject-oriented, integrated, and time-variant stores of non-volatile data that are used for analysis and insights rather than transactions. The major characteristics are focused on a specific subject or theme, integrated from multiple sources, organized over time, and do not allow updating of stored data. The key functions are consolidating, cleaning, and integrating data from different sources to facilitate analysis and inform decision-making.

Uploaded by

Mustefa Mohammed
Characteristics and Functions of Data Warehouse
Last Updated: 22 Oct, 2018
Prerequisite – Data Warehousing
A data warehouse becomes manageable when its users share a common way of describing the trends it captures for a specific subject. The major characteristics of a data warehouse are:

1. Subject-oriented –
A data warehouse is subject-oriented: it delivers information about a theme rather than about the organization's current operations. The warehousing process is designed around a well-defined theme, such as sales, distribution, or marketing.

A data warehouse does not focus on day-to-day operations. Instead, it focuses on modelling and analyzing data to support decisions, and it gives a simple, precise view of a particular theme by leaving out data that is not needed for those decisions.
2. Integrated –
Integration is closely related to subject orientation: the data must be held in a consistent format. Integration means establishing a shared unit of measure for all similar data drawn from different databases, and storing that data in the warehouse in a common, generally accepted form.

A data warehouse is built by integrating data from various sources, such as a mainframe and a relational database. It must therefore use consistent naming conventions, formats, and codes; consistency in naming, column scaling, and encoding structure should be verified. Integrated data makes analysis more effective, and it lets the warehouse serve several subject areas.
3. Time-Variant –
Data in the warehouse is maintained over different intervals of time, such as weekly, monthly, or annually. The time horizon of a data warehouse is much wider than that of operational online transaction processing (OLTP) systems. Data in the warehouse is tied to a specific interval of time and delivers information from a historical perspective, so it contains an element of time either explicitly or implicitly. Another aspect of time-variance is that once data is stored in the data warehouse it is not modified, altered, or updated.
4. Non-Volatile –
As the name suggests, data in a data warehouse is permanent: it is not erased or deleted when new data is inserted. The warehouse therefore accumulates a very large quantity of data, which supports analysis of the business over time.

Data is read-only and refreshed at particular intervals. This helps in analyzing historical data and understanding how the business has behaved. A data warehouse does not need transaction processing, recovery, or concurrency control mechanisms. Operations such as delete, update, and insert, which are performed in an operational application, are absent from the data warehouse environment. Only two types of data operations are performed in the data warehouse:
• Data Loading
• Data Access
Functions of Data Warehouse:
A data warehouse works as a collection of data organized for the groups of users who need to retrieve it. It stores facts drawn from tables with high transaction volumes, which are observed in order to choose the warehousing techniques applied. The major functions involved are:
1. Data Consolidation
2. Data Cleaning
3. Data Integration
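The three functions above, together with the load-only behaviour described under Non-Volatile, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source extracts, their field names, the gender code table, and the helper functions are all hypothetical and do not come from any particular warehouse product.

```python
from datetime import date

# Hypothetical extracts from two source systems with inconsistent conventions.
crm_rows = [
    {"cust_id": "C1", "gender": "M", "amount": "120.50"},
    {"cust_id": "C2", "gender": "F", "amount": None},  # missing amount
]
billing_rows = [
    {"customer": "C3", "sex": "male", "amount_cents": 9900},
    {"customer": "C1", "sex": "male", "amount_cents": 500},
]

# Assumed shared encoding for the warehouse schema.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def integrate_crm(row):
    """Integration: map a CRM row onto the shared warehouse schema."""
    amount = float(row["amount"]) if row["amount"] is not None else None
    return {"customer_id": row["cust_id"],
            "gender": GENDER_CODES[row["gender"].lower()],
            "amount": amount}


def integrate_billing(row):
    """Integration: map a billing row onto the same schema (cents to units)."""
    return {"customer_id": row["customer"],
            "gender": GENDER_CODES[row["sex"].lower()],
            "amount": row["amount_cents"] / 100}


def clean(rows):
    """Cleaning: drop rows whose amount is missing."""
    return [r for r in rows if r["amount"] is not None]


def load(warehouse, rows, load_date):
    """Loading: append only; existing rows are never updated or deleted."""
    for r in rows:
        warehouse.append({**r, "load_date": load_date})


warehouse = []
# Consolidation: bring both sources together in one place.
consolidated = ([integrate_crm(r) for r in crm_rows]
                + [integrate_billing(r) for r in billing_rows])
load(warehouse, clean(consolidated), date(2021, 6, 28))
```

Stamping each row with a load date is one simple way to keep the time element that the Time-Variant characteristic calls for.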
Data Warehousing
Last Updated: 28 Jun, 2021
Background
A Database Management System (DBMS) stores data in tables, is typically designed with an ER model, and aims to guarantee the ACID properties for transactions. For example, a college DBMS has tables for students, faculty, etc.
A Data Warehouse is kept separate from the operational DBMS. It stores a huge amount of data, typically collected from multiple heterogeneous sources such as files and DBMSs. The goal is to produce statistical results that help in decision making. For example, a college might want quick answers to questions such as how the placement of CS students has improved over the last 10 years in terms of salaries, counts, etc.
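As a rough illustration of the college example, the following sketch uses Python's built-in sqlite3 module to compute placement counts and average salaries per year. The placements table, its columns, and the sample rows are assumptions made up for the example, and an in-memory SQLite database merely stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical historical table: one row per placed student.
conn.execute("CREATE TABLE placements (year INTEGER, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO placements VALUES (?, ?, ?)",
    [
        (2019, "CS", 50000.0), (2019, "CS", 60000.0), (2019, "EE", 45000.0),
        (2020, "CS", 70000.0), (2020, "CS", 80000.0), (2020, "CS", 90000.0),
    ],
)

# Analytical query: per-year count and average salary for CS placements.
trend = conn.execute(
    "SELECT year, COUNT(*), AVG(salary) FROM placements "
    "WHERE dept = 'CS' GROUP BY year ORDER BY year"
).fetchall()

for year, count, avg_salary in trend:
    print(year, count, avg_salary)
```

The query reads many historical rows and returns a small summary, which is the typical shape of warehouse workloads, in contrast to the single-row reads and writes of a transactional system.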
Need of Data Warehouse
An ordinary operational database typically stores megabytes to gigabytes of data for a specific purpose. When data grows to terabytes, storage usually shifts to a Data Warehouse. Besides this, a transactional database doesn't lend itself to analytics. To perform analytics effectively, an organization keeps a central Data Warehouse to study its business closely by organizing, understanding, and using its historical data for making strategic decisions and analyzing trends.
Data Warehouse vs DBMS
[Comparison table not preserved in the source.]

Example Applications of Data Warehousing
Data Warehousing can be applied wherever there is a huge amount of data and a need for statistical results that help in decision making.
• Social Media Websites: Social networking websites like Facebook, Twitter, LinkedIn, etc. rely on analyzing large data sets. These sites gather data related to members, groups, locations, etc., and store it in a single central repository. Because the volume of data is so large, a Data Warehouse is needed to implement this.
• Banking: Most banks these days use warehouses to see the spending patterns of account holders and cardholders, and use this to provide them with special offers, deals, etc.
• Government: Governments use a data warehouse to store and analyze tax payments, which helps detect tax evasion.
There can be many more applications in sectors like E-Commerce, Telecommunications, Transportation Services, Marketing and Distribution, Healthcare, and Retail.

KDD Process in Data Mining
Last Updated: 02 Aug, 2021
Data Mining – Knowledge Discovery in Databases (KDD).
Why do we need Data Mining?
The volume of information we have to handle is increasing every day: business transactions, scientific data, sensor data, pictures, videos, etc. We therefore need a system that can extract the essence of the available information and automatically generate reports, views, or summaries of the data for better decision-making.
Why is Data Mining used in Business?
Data mining is used in business to make better managerial decisions by:
• Automatically summarizing data.
• Extracting the essence of the stored information.
• Discovering patterns in raw data.
Data Mining, also known as Knowledge Discovery in Databases, refers to the nontrivial extraction of implicit, previously unknown, and potentially useful information from data stored in databases.
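To make "discovering patterns in raw data" concrete, here is a minimal sketch that counts which pairs of items appear together in at least a given number of transactions. The basket data, the frequent_pairs helper, and the min_support threshold are hypothetical, and simple pair counting is only one basic form of pattern discovery, not the method the article prescribes.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data: each transaction is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "milk", "eggs"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


patterns = frequent_pairs(transactions, min_support=2)
print(patterns)
```

The result is implicit in the raw transactions but not stored anywhere explicitly, which is the sense of "extraction" in the definition above.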
Steps Involved in KDD Process:
[Figure: KDD process]
1. Data Cleaning: Data cleaning is the removal of noisy and irrelevant data from the collection.
• Cleaning in case of missing values.
• Cleaning noisy data, where noise is a random error or variance in a measured variable.
• Cleaning with data discrepancy detection and data transformation tools.
2. Data Integration: Data integration is the combination of heterogeneous data from multiple sources into a common source (the Data Warehouse).
• Data integration using data migration tools.
• Data integration using data synchronization tools.
• Data integration using the ETL (Extract-Transform-Load) process.
3. Data Selection: Data selection is the process in which data relevant to the analysis is decided on and retrieved from the data collection.
• Data selection using neural networks.
• Data selection using decision trees.
• Data selection using Naive Bayes.
• Data selection using clustering, regression, etc.
4. Data Transformation: Data transformation is the process of transforming data into the form required by the mining procedure.
Data transformation is a two-step process:
• Data Mapping: Assigning elements from the source base to the destination to capture transformations.
• Code Generation: Creating the actual transformation program.
5. Data Mining: Data mining is the application of intelligent techniques to extract potentially useful patterns.
• Transforms task-relevant data into patterns.
• Decides the purpose of the model using classification or characterization.
6. Pattern Evaluation: Pattern evaluation is the identification of the truly interesting patterns that represent knowledge, based on given measures.
• Finds the interestingness score of each pattern.
• Uses summarization and visualization to make the data understandable to the user.
7. Knowledge Representation: Knowledge representation is a technique that uses visualization tools to present data mining results.
• Generate reports.
• Generate tables.
• Generate discriminant rules, classification rules, characterization rules, etc.
Note:
• KDD is an iterative process: evaluation measures can be enhanced, mining can be refined, and new data can be integrated and transformed in order to get different and more appropriate results.
• Preprocessing of databases consists of data cleaning and data integration.

Benefits Of A Data Warehouse
1. Enables Historical Insight
No business can thrive without a large and accurate store of historical data, from sales and inventory data to personnel and intellectual property records. If a business executive suddenly needs to know the sales of a key product 24 months ago, the rich historical data provided by a data warehouse makes this possible.
Just as important, a data warehouse can add context to this historical data by listing the key performance trends that surround the period being researched. This kind of efficiency is hard to match with a legacy database.
2. Enhances Conformity And Quality Of Data
Your business generates data in myriad forms, including structured and unstructured data, data from social media, and data from sales campaigns. A data warehouse converts this data into the consistent formats required by your analytics platforms. Moreover, by ensuring this conformity, a data warehouse ensures that the data produced by different business divisions is of the same quality and standard, allowing a more efficient feed for analytics.
3. Boosts Efficiency
It is very time-consuming for a business user or a data scientist to gather data from multiple sources. It is far more advantageous for this data to be gathered in one place, hence the benefit of a data warehouse.
Additionally, if your data scientist needs data to run a quick report, they don't need assistance from tech support to get it. A data warehouse makes this data readily available in the correct format, improving the efficiency of the entire process.
4. Increases The Power And Speed Of Data Analytics
Business intelligence and data analytics are the opposite of instinct and intuition. BI and analytics require high-quality, standardized data, delivered on time and available for rapid data mining. A data warehouse enables this power and speed, providing a competitive advantage in key business areas ranging from CRM to HR to sales to quarterly reporting.
5. Drives Revenue
A tech pundit opined that "data is the new oil," referring to the high dollar value of data in today's world. Creating more standardized, better-quality data is the key strength of a data warehouse, and this strength can translate into significant revenue gains. The formula works like this: better business intelligence supports better decisions, and better decisions in turn create a higher return on investment across any sector of your business.
Most important, these revenue gains build on themselves over time, as better decisions strengthen the business.
In short, a high-quality, fully scalable data warehouse can be seen as less of a cost and more of an investment, one that adds compounding value like few other investments that businesses make.
6. Scalability
The top keyword in the cloud era is "scalable," and a data warehouse is a critical component in driving this scale. A top-flight data warehouse is itself scalable and also enables greater scalability in the business overall.
That is, today's sophisticated data warehouses are built to scale, handling ever more queries as the business grows (though this will require more supporting hardware). Additionally, the efficient data flow enabled by a data warehouse supports a business's growth, and this growth is the core of business scalability.
7. Interoperates With On-Premise And Cloud
Unlike the legacy databases of yesteryear, today's data warehouses are built with multicloud and hybrid cloud in mind. Many data warehouses are now fully cloud-based, and even those built for on-premise use typically interoperate well with the cloud-based portion of a company's infrastructure. As an additional side point, this cloud-based focus also means that mobile users are better able to access the data warehouse, which is particularly beneficial for sales reps.
8. Data Security
A number of key advances in data warehouses have enhanced their security, which in turn enhances the overall security of company data. Among these advances are techniques like a "slave read only" setup, which helps block malicious SQL code, and encrypted columns, which protect confidential data.
Some businesses set up custom user groups on their data warehouses, which can include or exclude various data pools and even grant permissions on a row-by-row basis.
9. Much Higher Query Performance And Insight
The constant business intelligence queries that are part of today's business can put a major strain on an analytics infrastructure, from the legacy databases to the data marts. Having a data warehouse handle queries more effectively removes some of the pressure on the system.
Furthermore, since a data warehouse is specifically geared to handle massive volumes of data and myriad complex queries, it is the high-functioning core of any business's data analytics practice.
10. Provides Major Competitive Advantage
This is the bottom-line benefit of a data warehouse: it allows a business to strategize and execute more effectively against other vendors in its sector.
With the quality, speed, and historical context provided by a data warehouse, the greater insight from data mining can drive decisions that create more sales, more targeted products, and faster response times.

In short, a data warehouse improves business decision making, which in turn gives any business a key competitive advantage.

Importance and Benefits Of Data Warehousing

Why Data Warehouse?
Implementing a data warehouse can help a company avoid various challenges. In an era of intense competition, it isn't sufficient just to make decisions; they must be made on time, because if you run out of time you will watch your competitors get ahead of you in the marathon.

Let's assume that a supermarket chain has not implemented a data warehouse. The chain then finds it very difficult to analyze which products are selling, which are not, when sales go up, which age group is buying a particular product, and several other such queries. This is where the challenges begin, because a decision has to be made as to whether a particular product is a hit among the 18-25 age group or not. If the analysis shows that sales have fallen, steps have to be taken to investigate the cause.

Turning to the strategic value given to a company, let's take the example of procurement. Every company procures certain products, such as laptops and desktops, from a supplier. Before making a purchase, the company contacts the supplier to negotiate the price and inquire about the terms. How sure can the company be that the supplier adheres to the terms of the contract? After the purchase is made, the supplier issues an invoice. If the invoice shows that the agreed discount hasn't been given, or otherwise doesn't match the terms of the contract, the two parties can discuss the discrepancy.
Hence, the main reason for a company to have a data warehouse is to gain an extra edge. That edge comes from making smarter decisions in a smarter manner, which is possible only if the executives responsible for such decisions have the data at their disposal. There was a time when intuition-based and experience-based decisions were much more prevalent. Moving away from that, we have entered an era in which fact-based decisions have gained importance.

There are certain questions that a manager or executive has to answer to get an extra edge over competitors. These questions may not be needed to run a business, but they are needed for its survival and growth:

• How can the company increase its market share by 5%?
• Which product is not doing well in the market?
• Which agent needs help with selling policies?
• What is the quality of the customer service provided, and what improvements are needed?

Why is Data Warehouse Crucial?

What is the quality of the customer service provided? This is one of the questions a manager strives to answer. He breaks it down into smaller questions, such as: how many pieces of customer feedback did we receive in the last 6 months? He runs a query against the database, which holds every piece of customer feedback received.

The second sub-question is: how many customers rated the service excellent, how many average, and how many bad? A further column holds comments, which is needed for the third question: what comments or improvement areas did customers highlight? Taken together, these three questions give a picture of the customer service and of the improvements needed.
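The three sub-questions above map naturally onto three queries. The sketch below uses Python's built-in sqlite3 module; the feedback table, its columns, the sample rows, and the fixed cutoff date standing in for "the last 6 months" are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical feedback table: date received, rating, free-text comments.
conn.execute("CREATE TABLE feedback (received TEXT, rating TEXT, comments TEXT)")
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?)",
    [
        ("2021-01-15", "excellent", ""),
        ("2021-02-03", "bad", "Long wait on the phone"),
        ("2021-03-20", "average", "Website is slow"),
        ("2021-04-11", "excellent", ""),
        ("2020-05-01", "bad", "Outside the window"),
    ],
)

since = "2021-01-01"  # assumed start of the six-month window

# Question 1: how many pieces of feedback were received in the window?
total = conn.execute(
    "SELECT COUNT(*) FROM feedback WHERE received >= ?", (since,)
).fetchone()[0]

# Question 2: how many were excellent, average, and bad?
by_rating = dict(conn.execute(
    "SELECT rating, COUNT(*) FROM feedback WHERE received >= ? GROUP BY rating",
    (since,),
).fetchall())

# Question 3: which comments or improvement areas were highlighted?
comments = [row[0] for row in conn.execute(
    "SELECT comments FROM feedback "
    "WHERE received >= ? AND comments <> '' ORDER BY received",
    (since,),
)]
```

Storing dates as ISO 8601 text keeps string comparison consistent with chronological order, which is why the simple `>=` filter works here.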

Data Warehouse for Decision Support & OLAP

Does a data warehouse support OLAP?
The data warehouse supports on-line analytical processing (OLAP), the functional and performance requirements of which are quite different from those of the on-line transaction processing (OLTP) applications traditionally supported by operational databases.

On-Line Analytical Processing (OLAP) and Data Warehousing are decision support technologies. Their goal is to enable enterprises to gain competitive advantage by exploiting the ever-growing amount of data that is collected and stored in corporate databases and files for better and faster decision making. Over the past few years, these technologies have experienced explosive growth, both in the number of products and services offered and in the extent of coverage in the trade press. Vendors, including all database companies, are paying increasing attention to all aspects of decision support.
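One way to see why OLAP requirements differ from OLTP is to pre-aggregate a small fact table along every combination of its dimensions, so that analytical questions become simple lookups. The sketch below is a toy version of that idea; the sales rows, the three dimensions, and the build_cube helper are hypothetical and not taken from any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales amount).
sales = [
    ("2021", "North", "laptop", 1200.0),
    ("2021", "North", "desktop", 800.0),
    ("2021", "South", "laptop", 1000.0),
    ("2020", "North", "laptop", 900.0),
]


def build_cube(rows):
    """Pre-aggregate totals for every combination of dimension values.

    The marker "*" stands for "all values" of a dimension.
    """
    cube = defaultdict(float)
    for year, region, product, amount in rows:
        for y in (year, "*"):
            for r in (region, "*"):
                for p in (product, "*"):
                    cube[(y, r, p)] += amount
    return cube


cube = build_cube(sales)

# Analytical questions are now dictionary lookups instead of table scans.
print(cube[("2021", "*", "*")])     # total sales in 2021
print(cube[("*", "North", "*")])    # total sales in the North region
print(cube[("*", "*", "laptop")])   # total laptop sales
```

The trade-off is extra storage and work at load time in exchange for fast reads, which suits a system that is loaded periodically and queried often.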

Thread: OLAP vs. Data Warehouse

To consolidate the topic, let's review the most frequently asked questions about OLAP in the data warehouse.
1. Are OLAP and Data Warehouse the same thing?
No, they are different. A data warehouse is an archive where historical corporate data is stored so that it can be analyzed. It can use different technologies for data extraction and analysis, and OLAP is one of those technologies: it analyzes and evaluates data from the data warehouse.
2. What is OLAP in a data warehouse?
OLAP is a computing technology that allows querying data and analyzing it from different perspectives. It provides fast analysis with the help of pre-aggregated and pre-calculated data. Online analytical processing is one of the tools used in data warehousing.
3. What is the purpose of the data warehouse?
A data warehouse is an archive of historical corporate data stored together for analysis and querying. It serves the following purposes:
• Increase data quality;
• Improve the organization's performance;
• Make reporting faster and better;
• Integrate information coming from different sources into one convenient form.

4. What is the difference between a data warehouse, a database, and ETL?
A data warehouse is not the same thing as a database or ETL. Let's clarify the difference by reviewing their definitions.
ETL, which stands for Extract, Transform, and Load, is the process of extracting data from various sources, converting it to a suitable state, and loading it into a data warehouse. So ETL is a process used to populate the data warehouse.
A database, by contrast, is the wider term. In the general sense, it is a repository of information stored for some specific purpose, while a data warehouse is a type of database focused on one particular application: analysis and reporting.
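The Extract, Transform, and Load steps described above can be sketched with Python's standard csv and sqlite3 modules. Everything specific here is an assumption for illustration: the inline CSV text stands in for a source system, the fact_orders table name and columns are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with inconsistent formatting and a missing value.
SOURCE_CSV = """order_id,country,amount
1, us ,10.50
2,DE,7.25
3,Us,
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: skip rows without an amount and normalize types and codes."""
    out = []
    for row in rows:
        if not row["amount"]:
            continue
        out.append((int(row["order_id"]),
                    row["country"].strip().upper(),
                    float(row["amount"])))
    return out


def load(conn, rows):
    """Load: append the transformed rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
rows = conn.execute(
    "SELECT order_id, country, amount FROM fact_orders ORDER BY order_id"
).fetchall()
print(rows)
```

Keeping the three steps as separate functions mirrors the definition: each source can get its own extract and transform, while the load step stays the same.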
