Data Modeling
LEADER:
GONATISE, CARL
MEMBERS:
GONGORA, JORIZ
GUIUO, MARLEN
JAMIAS, MARLYN
INTRODUCTION TO DATA
MODELING
• Data modeling is the process of creating a conceptual
representation of real-world entities and their relationships
within a particular domain. It is crucial for database design,
software development, and data management. Data models
visually represent information systems, showing how data is
used, stored, organized, and related. These models are built
based on business needs and stakeholder feedback, and
evolve with changing business requirements.
• Data modeling uses standardized schemas and techniques to
maintain consistency. There are different types of models:
conceptual models offer a high-level overview, logical models
define the structure of the data independently of any technology,
and physical models describe how it is implemented in a specific
system. Techniques like Entity-Relationship (ER) Modeling and
Unified Modeling Language (UML) help graphically represent
entities, attributes, and relationships.
What Makes a Good Model?
1. Keep Business Objectives in Mind:
Always start with a clear understanding of the business
requirements and objectives.
Consider how the data model will support business
processes, reporting, and analytics.
Align your data model design with the organization’s
strategic goals.
2. Properly Document Your Data Model:
Comprehensive documentation is essential for
understanding and maintaining data models.
Document the purpose of each table, relationships,
attributes, and any assumptions made during modeling.
Clear documentation aids collaboration among
stakeholders.
3. Design Your Data Model to Be Adjustable Over Time:
Business needs evolve, and data models should be flexible
enough to accommodate changes.
Avoid rigid structures that hinder adaptability.
Plan for scalability and future enhancements.
4. Have Different Levels of Abstraction for Different People:
Data models serve various audiences: business users,
analysts, developers, and database administrators.
Provide high-level conceptual models for executives and
detailed logical/physical models for technical teams.
Tailor the level of detail based on the audience’s expertise.
5. Choose the Right Data Modeling Technique:
Use standard notations such as Entity-Relationship
Diagrams (ERDs) or Unified Modeling Language (UML).
ERDs visually represent entities, attributes, and
relationships.
UML diagrams provide a broader view, including use cases,
classes, and interactions.
6. Think about Data Governance and Security:
Incorporate data governance principles into your
modeling process.
Define access controls, data ownership, and privacy rules.
Ensure compliance with regulations (e.g., GDPR, HIPAA).
7. Avoid Premature Optimization:
Focus on meeting business requirements first.
Optimize for performance only when necessary.
Premature optimization can lead to complex and
over-engineered models.
TYPES OF DATA MODELS
1. Conceptual Data Model
A conceptual data model (CDM) is a high-level
representation of the business requirements and the
connected data sets and relationships, independent of any
specific implementation or technology.
The Entity-Relationship Diagram (ERD) is a graphical
representation of the conceptual data model, depicting entities,
attributes, and relationships.
2. Logical Data Model
A logical data model (LDM) is an abstract representation of a possible
implementation, without being bound to any specific implementation or
technology.
The logical design stage translates the conceptual model into a logical
data model, following specific data modeling rules and conventions.
3. Physical Data Model
A physical data model (PDM) is a detailed representation of how the data
is actually stored and manipulated in a specific system.
In a physical data model, the entities are transformed into tables, while
the attributes are transformed into columns.
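The entity-to-table mapping can be sketched as follows. This is only an illustration under stated assumptions: the Customer and Order entities, their attributes, and the table and column names are hypothetical, and SQLite is used simply because it ships with Python.

```python
import sqlite3

# Hypothetical entities (not from the slides): Customer and Order,
# where one customer places many orders.
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- entity identifier becomes the primary key
    name        TEXT NOT NULL,         -- each attribute becomes a column
    email       TEXT UNIQUE
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    -- the one-to-many relationship becomes a foreign key column
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
"""


def build_physical_model():
    """Create the tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(DDL)
    return conn


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Calling `table_columns(build_physical_model(), "customer")` lists the columns that the Customer entity's attributes were mapped to.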
DATA MODELING TECHNIQUES
Entity-Relationship Modeling (ER Modeling)
Entity-Relationship Modeling (ER Modeling) is a widely used data
modeling technique, especially for relational database design. It
represents real-world objects or concepts as entities and their
associations as relationships. ER models use graphical notations,
such as rectangles for entities, diamonds for relationships, and
connecting lines.
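To make the vocabulary concrete, an ER model can also be written down as plain data structures before any diagram is drawn. The sketch below is one possible representation, not a standard notation; the Student and Course entities, the `enrolls_in` relationship, and the text format produced by `describe` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """A real-world object or concept (a rectangle in an ER diagram)."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Relationship:
    """An association between two entities (a diamond in an ER diagram)."""
    name: str
    source: Entity
    target: Entity
    cardinality: str  # for example "1:1", "1:N", or "M:N"


def describe(rel):
    """Render a relationship as a single line of text."""
    return f"{rel.source.name} --<{rel.name} {rel.cardinality}>-- {rel.target.name}"


# Hypothetical example: students enroll in courses (many-to-many).
student = Entity("Student", ["student_id", "name"])
course = Entity("Course", ["course_id", "title"])
enrolls = Relationship("enrolls_in", student, course, "M:N")
```

Here `describe(enrolls)` yields `Student --<enrolls_in M:N>-- Course`, a textual stand-in for the rectangle, diamond, and connecting lines.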
Unified Modeling Language (UML)
Unified Modeling Language (UML) is a standardized modeling language
used in software engineering and system design. It includes various
diagram types, with class diagrams being useful for data modeling.
Class diagrams represent entities (classes) along with their
attributes, operations, and relationships. UML also supports advanced
concepts like inheritance, composition, and aggregation, offering a
comprehensive and flexible approach to modeling both data and
behavior.
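The three relationship kinds named above map fairly directly onto code. The following is a minimal sketch with hypothetical classes (Person, Employee, Order, OrderLine, Department), none of which come from the slides; it shows one common way to express each relationship in Python, not the only one.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def label(self):
        return self.name


class Employee(Person):
    """Inheritance: an Employee is a Person with extra attributes."""

    def __init__(self, name, employee_id):
        super().__init__(name)
        self.employee_id = employee_id

    def label(self):
        return f"{self.name} (#{self.employee_id})"


class OrderLine:
    def __init__(self, product, quantity):
        self.product = product
        self.quantity = quantity


class Order:
    """Composition: an Order creates and owns its lines."""

    def __init__(self):
        self.lines = []

    def add_line(self, product, quantity):
        self.lines.append(OrderLine(product, quantity))

    def total_quantity(self):
        return sum(line.quantity for line in self.lines)


class Department:
    """Aggregation: a Department refers to employees that exist on their own."""

    def __init__(self, name):
        self.name = name
        self.members = []

    def add_member(self, employee):
        self.members.append(employee)
```

The difference between the last two is ownership: `Order` builds its `OrderLine` objects itself, while `Department` only holds references to `Employee` objects created elsewhere.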
Data Flow Diagrams (DFDs)
Data Flow Diagrams (DFDs) are used in structured systems analysis to
model the flow of data through a system and the processes that
transform it. They consist of external entities, data stores,
processes, and data flows, helping to clarify data requirements and
system interactions.
Object-Role Modeling (ORM)
Object-Role Modeling (ORM) is a fact-based data modeling approach
that represents information using natural language sentences. It
captures business rules and constraints through predicate logic and
set theory, supporting advanced concepts like subtypes, mandatory
roles, and value constraints. ORM is particularly useful for
capturing complex business rules and ensuring data integrity.
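As a rough illustration of the fact-based idea, elementary facts can be stored as (subject, predicate, object) triples and checked against constraints with ordinary set operations. This is a simplified sketch, not ORM notation or tooling; the fact types "Employee works in Department" and "Employee has rating Rating", the sample data, and the allowed rating values are all assumptions.

```python
# Elementary facts, each readable as a sentence: "E1 works in Sales".
facts = [
    ("E1", "works in", "Sales"),
    ("E1", "has rating", "A"),
    ("E2", "has rating", "D"),
]
employees = {"E1", "E2"}

# Hypothetical value constraint: a rating must be one of these values.
ALLOWED_RATINGS = {"A", "B", "C"}


def mandatory_role_violations(population, facts, predicate):
    """Return members of the population that never play the given role."""
    playing = {subject for subject, pred, _ in facts if pred == predicate}
    return sorted(population - playing)


def value_constraint_violations(facts, predicate, allowed):
    """Return (subject, value) pairs whose value is outside the allowed set."""
    return [
        (subject, value)
        for subject, pred, value in facts
        if pred == predicate and value not in allowed
    ]
```

With the sample data, `mandatory_role_violations(employees, facts, "works in")` reports `E2`, which has no department, and `value_constraint_violations(facts, "has rating", ALLOWED_RATINGS)` reports the pair `("E2", "D")`.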
Dimensional Modeling
Dimensional modeling is a technique designed for data warehousing and
business intelligence. It organizes data into facts (measurable
metrics) and dimensions (descriptive attributes) and uses star or
snowflake schemas for efficient querying. This approach focuses on
supporting analytical queries and reporting while addressing the
integration of data from multiple sources.
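A small star schema makes the fact and dimension split easier to see. The sketch below assumes a hypothetical sales data set; the table names (`fact_sales`, `dim_product`, `dim_date`), columns, and sample rows are invented for illustration, and SQLite stands in for a real data warehouse.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER
);
-- The fact table holds the measures plus one key per dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    quantity    INTEGER,
    amount      REAL
);
"""


def build_star_schema():
    """Create the schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Pen", "Stationery"), (2, "Notebook", "Stationery"), (3, "Mug", "Kitchen")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240101, "2024-01-01", 2024), (20240102, "2024-01-02", 2024)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20240101, 10, 15.0), (2, 20240101, 2, 8.0),
         (3, 20240102, 1, 6.5), (1, 20240102, 4, 6.0)],
    )
    return conn


def sales_by_category(conn):
    """Aggregate a fact measure by a dimension attribute."""
    query = """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
    """
    return conn.execute(query).fetchall()
```

With the sample rows, `sales_by_category(build_star_schema())` returns `[("Kitchen", 6.5), ("Stationery", 29.0)]`. A snowflake schema would further split `dim_product` so that `category` lives in its own table.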
DATA MODELING PROCESS
Data modeling techniques have different conventions that dictate
which symbols are used to represent the data, how models are laid
out, and how business requirements are conveyed. All approaches
provide formalized workflows that include a sequence of tasks to be
performed in an iterative manner.
• Identify the entities. The process of data modeling begins with the
identification of the things, events or concepts that are represented in the
data set that is to be modeled. Each entity should be cohesive and logically
discrete from all others.
PROBLEM OF DATA MANAGEMENT
• Data quality issues, such as incomplete, inaccurate, or outdated data, can have
significant impacts on business operations and decision-making.
• Data redundancy and inconsistencies can arise when data is duplicated or maintained
separately in different systems or departments.
• Data silos and lack of integration can hinder data sharing, analysis, and decision-making
processes.
• Data cleansing and data quality management processes are essential for ensuring the
reliability and usability of data.
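Two of the problems listed above, incomplete records and duplicated records, are simple enough to detect in a few lines. The sketch below is a deliberately minimal example, not a full data quality process; the record layout, the field names `id` and `email`, and the rule "keep the first complete record per key" are assumptions.

```python
def is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def find_quality_issues(records, required_fields, key_field):
    """Return (incomplete, duplicates) found in a list of dict records."""
    incomplete = [
        r for r in records if any(is_missing(r.get(f)) for f in required_fields)
    ]
    seen = set()
    duplicates = []
    for r in records:
        key = r.get(key_field)
        if key in seen:
            duplicates.append(r)  # same key already encountered earlier
        else:
            seen.add(key)
    return incomplete, duplicates


def cleanse(records, required_fields, key_field):
    """Keep the first complete record for each key and drop the rest."""
    cleaned = {}
    for r in records:
        complete = not any(is_missing(r.get(f)) for f in required_fields)
        key = r.get(key_field)
        if complete and key not in cleaned:
            cleaned[key] = r
    return list(cleaned.values())


# Hypothetical sample data.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},               # incomplete: email is empty
    {"id": 1, "email": "a@example.com"},  # duplicate of the first record
]
```

Running `cleanse(records, ["id", "email"], "id")` on the sample keeps only the first record. Outdated or inaccurate values cannot be caught this way, since they need a trusted reference source to compare against.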
MANAGING DATA AS SHARED RESOURCE
Managing data as a shared resource refers to the practice of treating data as a valuable, common
resource that can be accessed, used, and maintained across different systems, departments,
and stakeholders within an organization.
This approach promotes data integration, consistency, and collaboration. Key aspects of
managing data as a shared resource include:
• Planning/Requirements Gathering
• Analysis
• Design
• Development/Implementation
• Testing/Quality Assurance
• Implementation/Release
• Maintenance/Support
EXPERTISE REQUIREMENTS
Here are some key expertise areas:
• SQL Proficiency: Writing and optimizing SQL queries to interact with databases.
• ETL Processes: Experience with Extract, Transform, Load (ETL) processes that move data
between systems.
• Data Vault Modeling: A methodology for designing enterprise data warehouses that
focuses on long-term historical storage of data.
• Star Schema and Snowflake Schema: Common dimensional modeling techniques
used in data warehousing to organize data into fact and dimension tables.
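To show how the SQL and ETL skills fit together, here is a minimal ETL sketch. It assumes a hypothetical CSV feed with `id`, `name`, and `amount` columns and a hypothetical `payment` target table; real pipelines add validation, logging, and error handling that are left out here.

```python
import csv
import io
import sqlite3

# Hypothetical source data with inconsistent spacing and casing.
RAW_CSV = "id,name,amount\n1, alice ,10.50\n2,BOB,3\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert types and normalize names."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payment "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO payment VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text):
    """Run the three steps against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(text)))
    return conn
```

After `run_etl(RAW_CSV)`, the query `SELECT id, name, amount FROM payment ORDER BY id` returns `[(1, "Alice", 10.5), (2, "Bob", 3.0)]`.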
THANK YOU FOR
LISTENING