Introduction On Data Warehouse With OLTP and OLAP: Arpit Parekh
in
International Journal Of Engineering And Computer Science ISSN:2319-7242
Volume 2 Issue 8 August, 2013 Page No. 2569-2573
Maharaja Krishnakumar Sinhji Bhavnagar University, Department of Computer Application, Shree Swaminaranyan Naimisharanaya
College of Management and IT, Bhavnagar, India
aparekh5888@gmail.com
Abstract: Data warehouses, on-line transaction processing (OLTP) systems and on-line analytical processing (OLAP) are core
requirements for managing collected data in business, corporate and many other fields. Many services, products and new techniques
are now available and offer many ideas in the DBMS area. This paper studies data warehousing together with OLAP and OLTP, covering
the basic needs and main parts of business database management. The main topics considered are data warehouse usage and process,
the data warehouse with Meta data, on-line transaction processing (OLTP), on-line analytical processing (OLAP), and OLTP vs. OLAP
with advantages and disadvantages [5].
Keywords: Meta data, on-line transaction processing system, on-line analytical processing, Data Warehouse
OLTP, OLAP, Meta data and the data warehouse are essential elements of decision support systems, which have increasingly become a focus of the database industry. Many commercial products and services are now available, and all of the principal database management system providers now have offerings in these areas. Decision support places somewhat different requirements on database technology than traditional on-line transaction processing applications do [2].

Meta data is “data about data”, or “the data used to define other data”. It specifies the source, values, usage and features of DWH data and defines how data can be changed and processed at every architecture layer. Meta data is stored in a Meta data repository which all the other architecture components can access. Meta data is critical for using, building, and administering the data warehouse. For end-users, Meta data is like a roadmap to the data warehouse contents. A Meta data repository is like a general-purpose

According to Kelly, a tool for Meta data management should:
• Allow administrators to perform system administration operations and manage security.
• Allow end users to navigate and query Meta data.
• Allow end users to use a GUI.
• Allow end users to extend Meta data.
• Allow Meta data to be imported from and exported to other standards and formats.

1.5 Data Warehouse: According to Barry Devlin, a data warehouse is a single, complete and consistent store of data obtained from a variety of different sources and made available to end users in a way they can understand and use in a business context [1].

2. Research Methodology
2.1 Data Warehouse Usage
Arpit Parekh, IJECS Volume 2 Issue 8 August, 2013 Page No.2569-2573 Page 2569
• Data warehouses and data marts are used in a wide range of applications.
• Business executives use the data in data warehouses and data marts to perform data analysis and make strategic decisions.
• In many areas, data warehouses are used as an integral part of enterprise management.
• Initially, the data warehouse is mainly used for generating reports and answering predefined queries.
• It is then used to analyze summarized and detailed data, where the results are presented in the form of reports and charts.
• Later, the data warehouse is used for strategic purposes, performing multidimensional analysis and sophisticated operations.
• Finally, the data warehouse may be employed for knowledge discovery and strategic decision making using data mining tools.
• In this context, the tools for data warehousing can be categorized into access and retrieval tools, database reporting tools, data analysis tools, and data mining tools [3].

2.2 Need for Data Warehousing:
• Industry has a huge amount of operational data.
• A data warehouse is a platform for consolidated historical data for analysis.
• It stores data of good quality so that knowledge workers can make correct strategic decisions.
• Better business intelligence for end-users.
• Reduction in time to locate, access, and analyze information.
• Consolidation of disparate information sources.
• Strategic advantage over competitors.
• Faster time-to-market for products and services.
• Replacement of older, less-responsive decision support systems.
• Reduction in demand on IS to generate reports [3].

2.3 The Process of Data Warehouse Design [5]:
A DWH can be built using a top-down approach, a bottom-up approach, or a combination of both.
Top-Down Approach:
The top-down approach starts with the overall design and planning. It is useful in cases where the technology is mature and well known, and the business problems that must be solved are clear and well understood.
Bottom-Up Approach:
The bottom-up approach starts with experiments and prototypes. This is useful in the early stage of business modeling and technology development. It allows organizations to move forward at considerably less expense and to evaluate the benefits of the technology before making significant commitments.
Combined Approach:
In the combined approach, an organization can exploit the planned and strategic nature of the top-down approach while retaining the rapid implementation and opportunistic application of the bottom-up approach.
The design and construction of a data warehouse may consist of the following steps:
• Planning
• Requirements study
• Problem analysis
• Warehouse design
• Data integration
• Testing
• Deployment of the data warehouse

Software systems can be developed using two methodologies, the waterfall method and the spiral method:
The waterfall method performs a structured and systematic analysis at each step before proceeding to the next, which is like a waterfall, falling from one step to the next.
The spiral method involves the rapid generation of increasingly functional systems, with short intervals between successive releases. This is considered a good choice for data warehouse development, especially for data marts, because the turnaround time is short, modifications can be done quickly, and new designs and technologies can be adapted in a timely manner.
So, the warehouse design process consists of the following steps:
1.] Business process model:
Choose a business process to model, for example, orders, invoices, shipments, inventory, account administration and sales. If the business process is organizational and involves multiple complex object collections, a data warehouse model should be followed.
2.] Choose the grain of the business process:
The grain is the fundamental, atomic level of data to be represented in the fact table for this process, for example, individual transactions, individual daily snapshots, and so on.
3.] Choose the dimensions:
These are the dimensions that will apply to each fact table record. Typical dimensions are time, item, customer, supplier, warehouse, transaction type, and status.
4.] Choose the measures:
These are the measures that will populate each fact table record. Typical measures are numeric additive quantities like dollars sold and units sold [5].

2.4 The major features distinguishing OLTP and OLAP [5]:
1. User and system orientation:
An OLTP system is customer-oriented and is used by clerks, clients and information technology professionals, whereas an OLAP system is market-oriented and is used by knowledge workers, analysts and managers.
2. Data contents:
An OLTP system manages current data and is not used for decision-making purposes. An OLAP system manages large amounts of historical data with facilities for summarization and aggregation.
3. Database design:
An OLTP system usually adopts an entity-relationship model and an application-oriented
database design. An OLAP system uses a star or a snowflake model.
4. View:
An OLTP system works on the current data within an organization, without using historical data or data in different organizations. An OLAP system deals with information that originates from different organizations, adding information from many data stores. Because of their huge volume, OLAP data are stored on multiple storage media.
5. Access patterns:
The access patterns of an OLTP system consist mainly of short, atomic transactions. Such a system requires concurrency control and recovery mechanisms. Accesses to OLAP systems are mostly read-only operations.

2.5 OLAP
• Logically, OLAP servers present business users with multidimensional data from a DW or data marts, without concerns regarding how or where the data are stored.
• So an OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multidimensional data structures.
• However, the physical architecture and implementation of OLAP servers must consider data storage issues.
• OLAP stands for On-Line Analytical Processing. The key feature is "multidimensional": the ability to analyze metrics in different dimensions such as time, geography, gender, product, etc.
• Implementations of a warehouse server for OLAP processing include MOLAP (Multidimensional OLAP), ROLAP (Relational OLAP) and Hybrid OLAP (HOLAP) [5].

2.6 OLAP Types:
1. Relational OLAP (ROLAP) servers:
• These are intermediate servers that stand in between a relational back-end server and client front-end tools.
• They use a relational or extended-relational DBMS to store and manage warehouse data, and OLAP middleware to support missing pieces.
• ROLAP servers include optimization for each DBMS back end, implementation of aggregation navigation logic, and additional tools and services.
• ROLAP technology tends to have greater scalability than MOLAP technology.
• The DSS server of MicroStrategy, for example, adopts the ROLAP approach [3].
Advantages:
• Can handle large amounts of data: The data size limitation of ROLAP technology depends on the data size limitation of the underlying RDBMS. So, ROLAP itself places no limitation on data amount.
• Can leverage functionalities inherent in the relational database: An RDBMS already comes with a lot of functionality. So ROLAP technologies, which work on top of the RDBMS, can leverage these functionalities.
Disadvantages:
• Performance can be slow: Because each ROLAP report is a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large.
• Limited by SQL functionalities: ROLAP technology relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs. ROLAP technologies are therefore limited by what SQL can do. ROLAP vendors mitigate this risk by building into the tool the ability for users to define their own functions.

2. Multidimensional OLAP (MOLAP) servers:
• These servers support multidimensional views of data through array-based multidimensional storage engines. They map multidimensional views directly to data cube array structures.
• The advantage of using a data cube is that it allows fast indexing to pre-computed summarized data.
• With multidimensional data stores, storage utilization may be low if the data set is sparse.
• In such cases, sparse matrix compression techniques should be explored. Many MOLAP servers adopt a two-level storage representation to handle dense and sparse data sets: denser subcubes are identified and stored as array structures, whereas sparse subcubes employ compression technology for efficient storage utilization [3].
Advantages:
• Excellent performance: A MOLAP cube is built for fast data retrieval, and is optimal for slicing and dicing operations.
• Can perform complex calculations: All calculations have been pre-generated when the cube is created. Hence, complex calculations are not only feasible, but they return quickly.
Disadvantages:
• Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This does not mean that the data in the cube cannot be derived from a large amount of data; indeed, this is possible. In that case, however, only summary-level information will be included in the cube itself.
• Requires additional investment: Cube technology is often proprietary and does not already exist in the organization. Therefore, to adopt MOLAP technology, chances are that additional investments in human and capital resources are needed.

In an OLAP database there is aggregated, historical data, stored in multi-dimensional schemas (usually a star schema).
The following table summarizes the major differences between OLTP and OLAP system design [7].
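
The contrast in access patterns described in Section 2.4, and the choice of dimensions and measures from Section 2.3, can also be illustrated in code. The following is a minimal sketch, not taken from the paper, using Python's built-in sqlite3 module. The table and column names (dim_time, dim_item, fact_sales, units_sold, dollars_sold) and the sample rows are hypothetical and chosen only for illustration. It shows an OLTP-style short atomic write next to an OLAP-style read-only aggregation over a small star schema.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
# All names and sample values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (
        time_id      INTEGER REFERENCES dim_time(time_id),
        item_id      INTEGER REFERENCES dim_item(item_id),
        units_sold   INTEGER,   -- additive measure
        dollars_sold REAL       -- additive measure
    );
""")
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)", [(1, 2013, 7), (2, 2013, 8)])
conn.executemany("INSERT INTO dim_item VALUES (?, ?)", [(1, "pen"), (2, "book")])


def record_sale(time_id, item_id, units, dollars):
    """OLTP-style access: a short, atomic write of one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (time_id, item_id, units, dollars),
        )


def sales_by_month_and_item():
    """OLAP-style access: a read-only aggregation across dimensions."""
    return conn.execute("""
        SELECT t.year, t.month, i.name,
               SUM(f.units_sold), SUM(f.dollars_sold)
        FROM fact_sales AS f
        JOIN dim_time AS t ON t.time_id = f.time_id
        JOIN dim_item AS i ON i.item_id = f.item_id
        GROUP BY t.year, t.month, i.name
        ORDER BY t.year, t.month, i.name
    """).fetchall()


# The grain here is one row per individual sale transaction.
for sale in [(1, 1, 10, 15.0), (1, 1, 5, 7.5), (1, 2, 2, 40.0), (2, 2, 1, 20.0)]:
    record_sale(*sale)

summary = sales_by_month_and_item()
for row in summary:
    print(row)
```

In this sketch the fact table is kept at transaction grain, so the same data can be rolled up by month, by item, or by both. A MOLAP server would instead pre-compute such summaries when the cube is built.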
In the above discussion we see that business database management and data warehousing depend on the
performance of Meta data, OLTP and OLAP.
In the data warehousing approach, information is requested, processed, and merged continuously, so the
information is readily available for direct querying, OLAP, and analysis at the warehouse.
5. REFERENCES
BOOKS:
[1] "The Story So Far". 2002-04-15. Retrieved 2008-09-21.
[7] http://datawarehouse4u.info/OLTP-vs-OLAP.html
WEBSITES:
[8] http://en.wikipedia.org.