Schema Chalk Talk

Alvin Richards
alvin@10gen.com

Wednesday, May 25, 2011 1


Topics

Common patterns
• Single table inheritance
• One-to-Many & Many-to-Many
• Trees
• Queues



So why model data?

http://www.flickr.com/photos/42304632@N00/493639870/



A brief history of normalization
• 1970 E.F.Codd introduces 1st Normal Form (1NF)
• 1971 E.F.Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
• 1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)
• 2002 Date, Darwen, Lorentzos define 6th Normal Form (6NF)

Goals:
• Avoid anomalies when inserting, updating or deleting
• Minimize redesign when extending the schema
• Make the model informative to users
• Avoid bias towards a particular style of query

* source : wikipedia

Relational made normalized data look like this



Document databases make
normalized data look like this



Terminology

RDBMS           MongoDB
Table           Collection
Row(s)          JSON Document
Index           Index
Join            Embedding & Linking
Partition       Shard
Partition Key   Shard Key



DB Considerations

How can we manipulate this data?
• Dynamic Queries
• Secondary Indexes
• Atomic Updates
• Map Reduce

Access Patterns?
• Read / Write Ratio
• Types of updates
• Types of queries
• Data life-cycle

Considerations
• No Joins
• Document writes are atomic



Inheritance



Single Table Inheritance - RDBMS

shapes table

id  type    area  radius  d  length  width
1   circle  3.14  1
2   square  4             2
3   rect    10               5       2

Single Table Inheritance
> db.shapes.find()
{ _id: "1", type: "circle", area: 3.14, radius: 1 }
{ _id: "2", type: "square", area: 4, d: 2 }
{ _id: "3", type: "rect", area: 10, length: 5, width: 2 }

// find shapes where radius > 0
> db.shapes.find({radius: {$gt: 0}})

// create index
> db.shapes.ensureIndex({radius: 1})



Single Table Inheritance

Considerations
• Simple to query across sub-types
• Indexes on specialized values can stay small (a sparse index skips documents that lack the field)
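
The considerations above can be sketched in plain JavaScript, outside the database. The `shapes` array mirrors the documents on the slide; `findGreaterThan` is a hypothetical helper, not a MongoDB API, that only imitates how a `$gt` filter passes over documents lacking the field.

```javascript
// In-memory stand-in for the shapes collection from the slides.
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, d: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// Hypothetical helper: keep documents whose `field` is a number above `min`.
// Documents without the field never match, as with {radius: {$gt: 0}}.
function findGreaterThan(docs, field, min) {
  return docs.filter(
    (doc) => typeof doc[field] === "number" && doc[field] > min
  );
}

// A query on a shared field spans every sub-type.
const largeShapes = findGreaterThan(shapes, "area", 3.5);

// A query on a specialized field only ever sees the sub-type that has it.
const withRadius = findGreaterThan(shapes, "radius", 0);
```

Here `largeShapes` holds the square and the rectangle, while `withRadius` holds only the circle.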



One to Many
One to Many relationships can specify
• degree of association between objects
• containment
• life-cycle



One to Many
- Embedded Array / Array Keys
  - $slice operator to return a subset of the array
  - some queries are harder,
    e.g. find the latest comments across all documents

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        {
            author: "Kyle",
            date: "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
            text: "great book"
        }
    ]}
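
A rough illustration of why the cross-document query is harder, written as plain JavaScript over an in-memory array rather than against MongoDB. The second post, the extra comments, and both helper names are assumptions made up for the sketch.

```javascript
// In-memory stand-in for a blogs collection with embedded comments.
// Only the first comment comes from the slide; the rest is illustrative.
const blogs = [
  {
    author: "Hergé",
    comments: [
      { author: "Kyle", date: new Date("2010-07-24T20:51:03-07:00"), text: "great book" },
      { author: "James", date: new Date("2010-07-25T09:00:00-07:00"), text: "agreed" },
    ],
  },
  {
    author: "Alvin",
    comments: [
      { author: "Kyle", date: new Date("2010-07-26T08:30:00-07:00"), text: "nice talk" },
    ],
  },
];

// Easy case: the last n comments of a single post, which is the kind of
// subset a slice on the embedded array returns.
function lastComments(post, n) {
  return n > 0 ? post.comments.slice(-n) : [];
}

// Harder case: the latest n comments across all posts. Every embedded
// array has to be flattened before the comments can be sorted together.
function latestAcrossPosts(posts, n) {
  return posts
    .flatMap((post) => post.comments)
    .sort((a, b) => b.date - a.date)
    .slice(0, n);
}
```

The single-post helper touches one document, whereas the cross-post helper has to read every post, which is the trade-off the slide points at.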

One to Many
- Embedded tree
  - Single document
  - Natural
  - Hard to query

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        {
            author: "Kyle",
            date: "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
            text: "great book",
            replies: [ { author: "James", ... } ]
        }
    ]}



One to Many
- Normalized (2 collections)
  - most flexible
  - more queries

blogs: {
    author: "Hergé",
    date: "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    comments: [
        { comment: ObjectId("1") }
    ]}

comments: { _id: ObjectId("1"),
    author: "James",
    date: "Sat Jul 24 2010 20:51:03 ..." }



One to Many - patterns

- Embedded Array / Array Keys
- Embedded tree
- Normalized



Many - Many
Example:

- Product can be in many categories


- Category can have many products

Many - Many
products:
    { _id: ObjectId("10"),
      name: "Destination Moon",
      category_ids: [ ObjectId("20"),
                      ObjectId("30") ] }

categories:
    { _id: ObjectId("20"),
      name: "adventure",
      product_ids: [ ObjectId("10"),
                     ObjectId("11"),
                     ObjectId("12") ] }

// All categories for a given product
> db.categories.find({product_ids: ObjectId("10")})

Alternative
products:
    { _id: ObjectId("10"),
      name: "Destination Moon",
      category_ids: [ ObjectId("20"),
                      ObjectId("30") ] }

categories:
    { _id: ObjectId("20"),
      name: "adventure" }

// All products for a given category
> db.products.find({category_ids: ObjectId("20")})

// All categories for a given product
> product = db.products.findOne({_id: some_id})
> db.categories.find({_id: {$in: product.category_ids}})
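
The two lookups above can be mimicked in plain JavaScript to show their shape: one direction is a single pass, the other takes two steps. This is only an in-memory sketch; ids are plain strings instead of ObjectIds, and the second product, the "space" category, and both function names are invented for illustration.

```javascript
// In-memory stand-ins for the two collections.
const products = [
  { _id: "10", name: "Destination Moon", category_ids: ["20", "30"] },
  { _id: "11", name: "Explorers on the Moon", category_ids: ["20"] },
];
const categories = [
  { _id: "20", name: "adventure" },
  { _id: "30", name: "space" },
];

// All products for a given category: a single pass, matching any
// element of each product's category_ids array.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// All categories for a given product: two steps. First fetch the product,
// then fetch every category whose _id is in its category_ids (like $in).
function categoriesOfProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}
```

Keeping the references on one side only means there is a single array to maintain, at the cost of the extra step in `categoriesOfProduct`.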



Embedding versus Linking

Embedding
• Simple data structure
• Limited to 16MB
• Larger documents
• How often do you update?
• Will the document grow and grow?

Linking
• More complex data structure
• Unlimited data size
• More, smaller documents
• What are the maintenance needs?

Trees
Full Tree in Document

{ comments: [
    { author: "Kyle", text: "...",
      replies: [
          { author: "James", text: "...",
            replies: [] }
      ] }
  ]
}

Pros: Single document, performance, intuitive

Cons: Hard to search, partial results, 16MB limit
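
To make the "hard to search" point concrete, here is a minimal plain-JavaScript sketch of what a search over the nested structure involves once the document is in the application. The `commentsBy` helper is hypothetical and the `post` object simply copies the slide's shape.

```javascript
// One post with a full reply tree embedded, shaped like the slide's document.
const post = {
  comments: [
    {
      author: "Kyle",
      text: "...",
      replies: [{ author: "James", text: "...", replies: [] }],
    },
  ],
};

// Depth-first walk over the nested replies, collecting comments by author.
// The depth is not known in advance, so the search has to recurse.
function commentsBy(author, comments, found = []) {
  for (const comment of comments) {
    if (comment.author === author) found.push(comment);
    commentsBy(author, comment.replies || [], found);
  }
  return found;
}

const byJames = commentsBy("James", post.comments);
```

Because a match can sit at any depth, a fixed field path cannot express this search, which is why the whole tree tends to be loaded and walked in code.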

   
Trees
Parent Links
- Each node is stored as a document
- Contains the id of the parent

Child Links
- Each node contains the ids of the children
- Can support graphs (multiple parents / children)
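
As a small sketch of the parent-link variant, assuming each node document carries a `parent` field that is `null` at the root (the field name and the `ancestorsOf` helper are assumptions), the ancestors of a node can be found by following links one level at a time:

```javascript
// Nodes stored one per document, keyed by _id, each holding its parent's id.
const nodes = new Map([
  ["a", { _id: "a", parent: null }],
  ["b", { _id: "b", parent: "a" }],
  ["c", { _id: "c", parent: "b" }],
]);

// Walk the parent links upwards. Against a database this would cost
// one lookup per level of the tree.
function ancestorsOf(id) {
  const result = [];
  let current = nodes.get(id);
  while (current && current.parent !== null) {
    result.push(current.parent);
    current = nodes.get(current.parent);
  }
  return result; // nearest ancestor first
}
```

For example, `ancestorsOf("c")` walks from "c" to "b" to "a" and returns both, nearest first.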

Array of Ancestors
- Store all ancestors of a node
  { _id: "a" }
  { _id: "b", ancestors: [ "a" ], parent: "a" }
  { _id: "c", ancestors: [ "a", "b" ], parent: "b" }
  { _id: "d", ancestors: [ "a", "b" ], parent: "b" }
  { _id: "e", ancestors: [ "a" ], parent: "a" }
  { _id: "f", ancestors: [ "a", "e" ], parent: "e" }

// find all descendants of b:
> db.tree2.find({ancestors: 'b'})

// find all direct descendants of b:
> db.tree2.find({parent: 'b'})

// find all ancestors of f:
> ancestors = db.tree2.findOne({_id: 'f'}).ancestors
> db.tree2.find({_id: {$in: ancestors}})
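
One thing the queries leave implicit is how the `ancestors` array gets filled in. A plausible approach, sketched here in plain JavaScript over an in-memory array, is to copy the parent's array and append the parent's id on insert. The `addNode` and `descendantsOf` helpers are assumptions for illustration, not part of the talk.

```javascript
// In-memory stand-in for the tree2 collection.
const tree = [];

// Insert a node, deriving its ancestors from the parent's ancestors.
function addNode(id, parentId) {
  if (parentId === undefined) {
    tree.push({ _id: id }); // root node: no ancestors, no parent
    return;
  }
  const parent = tree.find((n) => n._id === parentId);
  if (!parent) throw new Error(`unknown parent: ${parentId}`);
  const ancestors = [...(parent.ancestors || []), parentId];
  tree.push({ _id: id, ancestors, parent: parentId });
}

// Rebuild the tree shown on the slide.
addNode("a");
addNode("b", "a");
addNode("c", "b");
addNode("d", "b");
addNode("e", "a");
addNode("f", "e");

// All descendants of a node: any document listing it among its ancestors.
function descendantsOf(id) {
  return tree
    .filter((n) => (n.ancestors || []).includes(id))
    .map((n) => n._id);
}
```

The cost of this layout shows up on writes: moving a subtree means rewriting the `ancestors` array of every node beneath it.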

Trees as Paths
Store hierarchy as a path expression
- Separate each node by a delimiter, e.g. "/"
- Use a prefix regular expression to find parts of a tree

{ comments: [
    { author: "Kyle", text: "initial post",
      path: "/" },
    { author: "Jim", text: "jim's comment",
      path: "/jim" },
    { author: "Kyle", text: "Kyle's reply to Jim",
      path: "/jim/kyle" } ] }

// Find the conversations Jim was part of
> db.posts.find({"comments.path": /^\/jim/i})
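
The prefix match can be tried out in plain JavaScript. This sketch assumes paths always start with the "/" delimiter, as in the slide's data; `escapeRegExp` and `subtree` are hypothetical helpers. The pattern is slightly stricter than a bare prefix so that "/jim" does not also match a path such as "/jimmy".

```javascript
// The comments from the slide, each carrying its path in the tree.
const comments = [
  { author: "Kyle", text: "initial post", path: "/" },
  { author: "Jim", text: "jim's comment", path: "/jim" },
  { author: "Kyle", text: "Kyle's reply to Jim", path: "/jim/kyle" },
];

// Escape regex metacharacters so a path is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Everything at or below `prefix`: anchored at the start, case-insensitive,
// and followed by either a delimiter or the end of the path.
function subtree(prefix) {
  const pattern = new RegExp("^" + escapeRegExp(prefix) + "(/|$)", "i");
  return comments.filter((c) => pattern.test(c.path));
}

const jimThread = subtree("/jim");
```

`jimThread` contains Jim's comment and Kyle's reply to it, but not the initial post.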

Queue
• Need to maintain order and state
• Ensure that updates to the queue are atomic
    { inprogress: false,
      priority: 1,
      ...
    }

// find highest priority job and mark as in-progress
job = db.jobs.findAndModify({
    query:  {inprogress: false},
    sort:   {priority: -1},
    update: {$set: {inprogress: true,
                    started: new Date()}},
    new: true})
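
The claim-a-job step can be imitated in plain JavaScript to show what the call does to the data. This is an in-memory sketch, not the MongoDB implementation: the sample jobs and the `claimNextJob` name are made up, and the atomicity here comes only from JavaScript running the function as one synchronous step.

```javascript
// In-memory stand-in for the jobs collection.
const jobs = [
  { _id: 1, inprogress: false, priority: 1 },
  { _id: 2, inprogress: false, priority: 5 },
  { _id: 3, inprogress: true, priority: 9 },
];

// Claim the highest-priority job that is not yet in progress.
// Selecting and marking happen together, so no other caller can pick
// the same job in between; that is the property findAndModify provides.
function claimNextJob(now = new Date()) {
  const candidates = jobs.filter((job) => !job.inprogress);
  if (candidates.length === 0) return null;
  const job = candidates.reduce((best, j) =>
    j.priority > best.priority ? j : best
  );
  job.inprogress = true;
  job.started = now;
  return job; // the updated job, like `new: true`
}
```

Calling `claimNextJob()` repeatedly hands out job 2, then job 1, then `null`; job 3 is skipped because it is already in progress.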



download at mongodb.org

We’re Hiring !
alvin@10gen.com

conferences,  appearances,  and  meetups


http://www.10gen.com/events

Facebook                    |                  Twitter                  |                  LinkedIn


http://bit.ly/mongo>   @mongodb http://linkd.in/joinmongo
