DSW Mongodb1 Part II
• MongoDB (from "humongous") is a cross-platform,
document-oriented database, classified as a NoSQL database.
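• JSON is built on two structures:
• A collection of name/value pairs. In various
languages, this is realized as an object, record,
dictionary, hash table, keyed list, or associative
array.
• An ordered list of values. In most languages, this is
realized as an array, vector, list, or sequence.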
• JSON is used primarily to transmit data between a server and a web
application, as an alternative to XML.
• Although originally derived from the JavaScript scripting
language, JSON is a language-independent data format, and code
for parsing and generating JSON data is readily available in a
large variety of programming languages.
• The following JSON syntax defines an employees object with an
array of 2 employee records (objects):
{"employees":[
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
] }
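As a minimal sketch of how such text is consumed in practice, the snippet below parses the employees JSON with JavaScript's built-in JSON.parse and serializes it back with JSON.stringify. The variable names are illustrative and not part of the slides.

```javascript
// The employees JSON from the example above, held as a string.
const text = `{"employees":[
  {"firstName":"Anna", "lastName":"Smith"},
  {"firstName":"Peter", "lastName":"Jones"}
]}`;

// JSON.parse turns JSON text into ordinary JavaScript objects and arrays.
const data = JSON.parse(text);
console.log(data.employees.length);       // 2
console.log(data.employees[1].firstName); // Peter

// JSON.stringify goes the other way and drops insignificant whitespace.
const compact = JSON.stringify(data);
console.log(compact);
```

The same round trip is available in most languages through their own JSON libraries, which is what makes the format language independent.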
Uses of JSON
• It is used when writing JavaScript-based applications, including
browser extensions and websites.
• The JSON format is used for serializing and transmitting structured
data over a network connection.
• It is primarily used to transmit data between a server and a
web application.
Characteristics of JSON
• JSON is easy to read and write.
• It is language independent.
• The following example shows book information stored as JSON,
including the language and edition of each book:
{ "book": [
{ "id": 1,
"language": "Java",
"edition": "third",
"author": "Herbert Schildt"
},
{ "id": 7,
"language": "C++",
"edition": "second",
"author": "E.Balagurusamy"
}
]
}
• The following data types are supported by the JSON format:
Type         Description
Number       double-precision floating-point format in JavaScript
String       double-quoted Unicode with backslash escaping
Boolean      true or false
Array        an ordered sequence of values
Value        a string, a number, true or false, null, etc.
Object       an unordered collection of key:value pairs
Whitespace   can be used between any pair of tokens
null         empty
• Number
• It is a double-precision floating-point format in JavaScript; the
exact range and precision depend on the implementation.
• Octal and hexadecimal formats are not used.
• NaN (not a number) and Infinity are not valid Number values.
Type       Description
Integer    Digits 1-9 and 0, positive or negative
Fraction   Fractions like 0.3, 0.9
Exponent   Exponent like e, e+, e-, E, E+, E-
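The number rules above can be checked against a real parser. The following is a small sketch using JavaScript's built-in JSON.parse; the helper name isValidJson is an assumption made for illustration.

```javascript
// Returns true when the text is accepted by the standard JSON parser.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidJson("97"));       // true: integer
console.log(isValidJson("-0.3"));     // true: fraction with a leading digit
console.log(isValidJson("1.5e+3"));   // true: exponent
console.log(isValidJson("0x1F"));     // false: hexadecimal is not allowed
console.log(isValidJson("017"));      // false: leading zero (octal style)
console.log(isValidJson("NaN"));      // false: NaN is not a JSON number
console.log(isValidJson("Infinity")); // false

// When serializing, JavaScript writes NaN and Infinity as null.
console.log(JSON.stringify({ ratio: NaN })); // {"ratio":null}
```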
• SYNTAX:
var json-object-name = { string : number_value, .......}
• EXAMPLE:
Example showing the Number data type; the value should not be
quoted:
var obj = {marks: 97}
• String
• It is a sequence of zero or more double-quoted Unicode
characters with backslash escaping.
• A character is represented as a single-character string, i.e. a
string of length 1.
The following table shows the escape sequences allowed in strings; each is written after a backslash:
Type   Description
"      double quotation
\      reverse solidus
/      solidus
b      backspace
f      form feed
n      new line
r      carriage return
t      horizontal tab
u      four hexadecimal digits
• SYNTAX:
var json-object-name = { string : "string value", .......}
• EXAMPLE:
Example showing the String data type:
var obj = {name: "Amit"}
• Boolean
• It includes true or false values.
• SYNTAX:
var json-object-name = { string : true/false, .......}
• EXAMPLE:
var obj = {name: "Amit", marks: 97, distinction: true}
• Array
• It is an ordered collection of values.
• Arrays are enclosed in square brackets: an array begins
with '[' and ends with ']'.
• The values are separated by ',' (comma).
• JSON itself does not define indexing; the consuming language may
index arrays from 0 or 1.
• Arrays should be used when the key names are sequential integers.
• SYNTAX:
[ value, .......]
• EXAMPLE:
• Example showing an array containing multiple objects:
{ "books": [ { "language":"Java" , "edition":"second" },
{ "language":"C++" , "edition":"fifth" },
{ "language":"C" , "edition":"third" }
]
}
• Object
• It is an unordered set of name/value pairs.
• Objects are enclosed in curly braces; an object starts with '{' and
ends with '}'.
• Each name is followed by ':' (colon), and the name/value pairs
are separated by ',' (comma).
• The keys must be strings and should be different from each
other.
• SYNTAX:
{ string : value, .......}
• EXAMPLE:
• Example showing an object:
{
"id": "011A",
"language": "JAVA",
"price": 500
}
• Whitespace
• It can be inserted between any pair of tokens.
• It can be added to make code more readable.
• The following example shows a declaration with whitespace:
• SYNTAX:
{string:" ",....}
• EXAMPLE:
var i = " Rohan";
var j = " Harsh";
• null
• It means empty type.
• SYNTAX:
null
JSON Value
It includes:
• number (integer or floating point)
• string
• boolean
• array
• object
• null
• SYNTAX:
• String | Number | Object | Array | true |
false | null
• EXAMPLE:
var i = 1;
• A possible JSON representation describing a person.
{ "firstName": "John",
"lastName": "Smith",
"isAlive": true,
"age": 25,
"height_cm": 167.6,
"address":
{ "streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
},
"phoneNumbers":
[ { "type": "home", "number": "212 555-1234" },
{ "type": "office", "number": "646 555-4567" }
]
}
• This XML syntax also defines an employees object with 3 employee
records:
• XML Example
<employees>
<employee>
<firstName>John</firstName> <lastName>Doe</lastName>
</employee>
<employee>
<firstName>Anna</firstName> <lastName>Smith</lastName>
</employee>
<employee>
<firstName>Peter</firstName> <lastName>Jones</lastName>
</employee>
</employees>
• Example
• JSON
{ "company": "Volkswagen",
"name": "Vento",
"price": 800000
}
• XML
<car>
<company>Volkswagen</company>
<name>Vento</name>
<price>800000</price>
</car>
MongoDB
• MongoDB is a cross-platform, document-oriented
database that provides high
performance, high availability, and easy
scalability. MongoDB works on the concepts of
collections and documents.
Database - A database is a physical container for collections.
Each database gets its own set of files on the file system. A
single MongoDB server typically has multiple databases.
Document
• A record in MongoDB is a document,
which is a data structure composed of field and
value pairs. MongoDB documents are similar
to JSON objects.
The relationship of RDBMS terminology to MongoDB:
RDBMS         MongoDB
Database      Database
Table         Collection
Tuple/Row     Document
Column        Field
Table Join    Embedded Documents
Primary Key   Primary Key (default key _id provided by MongoDB itself)
Database Server and Client
         RDBMS           MongoDB
Server   mysqld/Oracle   mongod
Client   mysql/sqlplus   mongo
Features
High Performance
• MongoDB provides high-performance data persistence.
In particular, support for embedded data models reduces
I/O activity on the database system.
• Indexes support faster queries and can include keys
from embedded documents and arrays.
High Availability
• To provide high availability, MongoDB’s replication
facility, called replica sets, provides:
- automatic failover.
- data redundancy.
• A replica set is a group of MongoDB servers that
maintain the same data set, providing redundancy and
increasing data availability.
MongoDB - Replication
• Replication is the process of synchronizing
data across multiple servers.
1) A replica set is a group of two or more nodes (generally a
minimum of 3 nodes is required).
2) In a replica set, one node is the primary node and the remaining nodes
are secondaries.
The diagram of MongoDB replication shows that the
client application always interacts with the primary node, and the
primary node then replicates the data to the secondary nodes.
• Automatic Scaling
• To address the issues of scale, database systems have two basic
approaches: vertical scaling and sharding.
• Sharding - Sharding is the process of storing data records
across multiple machines, and it is MongoDB's approach to
meeting the demands of data growth.
• Sharding addresses the challenge of scaling to support high
throughput and large data sets:
-Sharding reduces the number of operations each shard
handles. Each shard processes fewer operations as the cluster
grows. As a result, a cluster can increase capacity and throughput
horizontally.
For example, to insert data, the application only needs
to access the shard responsible for that record.
-Sharding reduces the amount of data that each server
needs to store. Each shard stores less data as the cluster grows.
For example, if a database has a 1 terabyte data set, and
there are 4 shards, then each shard might hold only 256GB of
data. If there are 40 shards, then each shard might hold only 25GB
of data.
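The arithmetic behind these figures can be sketched in a few lines. The function below assumes data is spread evenly across shards and treats 1 terabyte as 1024 GB; the function name is illustrative.

```javascript
// Approximate data held by each shard, assuming an even distribution.
function perShardGB(totalGB, shardCount) {
  return totalGB / shardCount;
}

const oneTerabyteGB = 1024; // 1 TB expressed in GB

console.log(perShardGB(oneTerabyteGB, 4));  // 256
console.log(perShardGB(oneTerabyteGB, 40)); // 25.6
```

Real clusters rarely balance perfectly, so these numbers are best read as rough per-shard targets.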
• Why Sharding?
• In replication all writes go to master node
• Memory can't be large enough when active
dataset is big
• Local Disk is not big enough
• Vertical scaling is too expensive
• The diagram shows sharding in MongoDB using a sharded cluster.
• In the given diagram there are three main components, which are
described below:
• Shards: Shards are used to store data. They provide high availability
and data consistency. In a production environment, each shard is a separate
replica set.
• Deep query-ability. MongoDB supports dynamic
queries on documents using a document-based
query language that's nearly as powerful as SQL
• Why use MongoDB
• Document-Oriented Storage: Data is stored in the form of JSON-style
documents.
• Index on any attribute
• Replication & High Availability
• Auto-Sharding
• Professional Support by MongoDB
• MongoDB Data Modeling
• Data in MongoDB has a flexible schema.
Documents in the same collection do not need to
have the same set of fields or structure, and
common fields in a collection’s documents may
hold different types of data.
• In MongoDB, by contrast, the schema design has a single collection, post, with the following
structure:
{ _id: POST_ID,
title: TITLE_OF_POST,
description: POST_DESCRIPTION,
by: POST_BY,
url: URL_OF_POST,
tags: [TAG1, TAG2, TAG3],
likes: TOTAL_LIKES,
comments: [ {user:'COMMENT_BY',
message: TEXT,
dateCreated: DATE_TIME,
like: LIKES
},
{ user:'COMMENT_BY',
message: TEXT,
dateCreated: DATE_TIME,
like: LIKES
} ]
}
Sample document
The following example shows the document structure of a blog
site, which is simply a set of comma-separated key-value pairs.
{
_id: ObjectId(7df78ad8902c),
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100,
comments: [
{
user:'user1',
message: 'My first comment',
dateCreated: new Date(2011,1,20,2,15),
like: 0
},
{
user:'user2',
message: 'My second comments',
dateCreated: new Date(2011,1,25,7,45),
like: 5
}
]
}
• _id is a 12-byte hexadecimal number which assures the
uniqueness of every document. You can provide _id while
inserting the document. If you don't provide one, MongoDB
provides a unique id for every document. Of these 12 bytes, the first 4
bytes hold the current timestamp, the next 3 bytes the machine id,
the next 2 bytes the process id of the MongoDB server, and the remaining 3
bytes are a simple incremental value.
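To make the byte layout concrete, here is a sketch that splits an ObjectId's hexadecimal text into the four parts just described. It uses the 24-digit ObjectId value that appears in the upsert example later in this deck; the function name parseObjectId is an assumption, and the split of the middle bytes follows the description above (it may differ in other MongoDB versions).

```javascript
// Splits a 24-digit hexadecimal ObjectId string into the four parts
// described above: timestamp (4 bytes), machine id (3), process id (2),
// and counter (3). Two hex digits encode one byte.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal digits");
  }
  return {
    timestamp: parseInt(hex.slice(0, 8), 16), // seconds since the Unix epoch
    machineId: hex.slice(8, 14),
    processId: hex.slice(14, 18),
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

const parts = parseObjectId("51e8636953dbe31d5f34a38a");
console.log(parts.timestamp); // 1374184297
console.log(new Date(parts.timestamp * 1000).toISOString());
```

Because the timestamp comes first, sorting documents by _id roughly orders them by creation time.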
MongoDB Create Database
• The use Command
• The use DATABASE_NAME command is used to create a database.
The command creates a new database if it doesn't exist;
otherwise it switches to the existing database.
• SYNTAX:
use DATABASE_NAME
• EXAMPLE:
• If you want to create a database named mydb, then the use
DATABASE statement would be as follows:
>use mydb
switched to db mydb
• To check your currently selected database, use the command db:
>db
mydb
• To check your list of databases, use the command show dbs:
>show dbs
local 0.78125GB
test 0.23012GB
• Your created database (mydb) is not present in the list. To display the
database, you need to insert at least one document into it.
>db.tab1.insert({name:"tutorials point"})
>show dbs
local 0.78125GB
mydb 0.23012GB
test 0.23012GB
MongoDB Drop Database
• The dropDatabase() Method
• The db.dropDatabase() command is
used to drop an existing database.
• SYNTAX:
db.dropDatabase()
• EXAMPLE:
• First, check the list of available databases by using the
command show dbs:
>show dbs
local 0.78125GB
mydb 0.23012GB
test 0.23012GB
• If you want to delete the new database mydb,
then the dropDatabase() command would be as follows:
>use mydb
switched to db mydb
>db.dropDatabase()
{ "dropped" : "mydb", "ok" : 1 }
>
• Now check the list of databases:
>show dbs
local 0.78125GB
test 0.23012GB
MongoDB Create Collection
• The createCollection() Method
• The db.createCollection(name, options) method is used to
create a collection.
• SYNTAX:
db.createCollection(name, options)
Parameter   Type       Description
name        String     Name of the collection to be created
options     Document   (Optional) Specify options about
                       memory size and indexing
Field Type Description
While inserting a document, MongoDB first checks the size field of a capped
collection, then it checks the max field.
• EXAMPLES:
• The basic syntax of the createCollection() method
without options is as follows:
>use test
switched to db test
>db.createCollection("mycollection")
{ "ok" : 1 }
• The following example shows createCollection() with options:
>db.createCollection("mycol",
{ capped : true,
autoIndexId : true,
size : 6142800,
max : 10000 } )
{ "ok" : 1 }
>
• In MongoDB you don't need to create a collection explicitly.
MongoDB creates the collection automatically when
you insert a document.
>db.demo.insert({"name" :"tutorials"})
>show collections
mycol
mycollection
system.indexes
demo
>
• MongoDB Drop Collection
• The drop() Method
• MongoDB's db.collection.drop() is used to drop a collection from the
database.
• SYNTAX:
db.COLLECTION_NAME.drop()
• EXAMPLE:
• First, check the available collections in your database mydb:
>use mydb
switched to db mydb
>show collections
mycol
mycollection
system.indexes
demo
>
• Now drop the collection with the name mycollection:
>db.mycollection.drop()
true
>
• Again check the list of collections in the database:
>show collections
mycol
system.indexes
demo
>
• The drop() method returns true if the selected collection is
dropped successfully; otherwise it returns false.
MongoDB - Insert Document
• The insert() Method
• To insert data into a MongoDB collection, you need to use
MongoDB's insert() or save() method.
• SYNTAX:
>db.COLLECTION_NAME.insert(document)
• EXAMPLE
>db.mycol.insert(
{ _id: ObjectId(7df78ad8902c),
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
}
)
• Here mycol is our collection name. If the collection doesn't exist
in the database, then MongoDB will create this collection and
then insert the document into it.
• Insert Multiple Documents
• The following example performs a bulk insert of three documents by
passing an array of documents to the insert() method.
• The documents in the array do not need to have the same fields.
• For instance, the first document in the array has an _id field and
a type field. Because the second and third documents do not contain
an _id field, mongod will create the _id field for the second and third
documents during the insert:
db.products.insert(
[
{ _id: 11, item: "pencil", qty: 50, type: "no.2" },
{ item: "pen", qty: 20 },
{ item: "eraser", qty: 25 }
] )
• The operation inserted the following three documents:
• Insert a Document with update() Method
• The following example creates a new document if no document in
the inventory collection contains
{ type: "book", item : "journal" }:
db.inventory.update(
{ type: "book", item : "journal" },
{ $set : { qty: 10 } },
{ upsert : true } )
• MongoDB adds the _id field and assigns as its value a unique ObjectId. The
new document includes the item and type fields from the <query> criteria and
the qty field from the <update> parameter:
{ "_id" : ObjectId("51e8636953dbe31d5f34a38a"),
"item" : "journal", "qty" : 10, "type" : "book"
}
MongoDB Update Document
• MongoDB's update() and save() methods are used to update a
document in a collection. The update() method updates values in
the existing document, while the save() method replaces the
existing document with the document passed to save().
• SYNTAX:
>db.COLLECTION_NAME.update(SELECTION_CRITERIA,
UPDATED_DATA)
• EXAMPLE
• Consider that the mycol collection has the following data.
• The following example sets the title to 'New MongoDB Tutorial' for the
document whose title is 'MongoDB Overview':
>db.mycol.update({'title':'MongoDB Overview'},{$set:{'title':'New MongoDB Tutorial'}})
>db.mycol.find()
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"New MongoDB Tutorial"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"Tutorials Point Overview"}
>
• By default MongoDB updates only a single document; to update
multiple documents you need to set the parameter 'multi' to true.
>db.mycol.update({'title':'MongoDB Overview'},
{$set:{'title':'New MongoDB Tutorial'}},
{multi:true}
)
• MongoDB save() Method
• The save() method replaces the existing document with the
new document passed to save().
• SYNTAX
>db.COLLECTION_NAME.save({_id:ObjectId(),NEW_DATA})
• EXAMPLE
• The following example replaces the document with the _id
'5983548781331adf45ec7':
>db.mycol.save( { "_id" : ObjectId(5983548781331adf45ec7),
"title":"Tutorials Point New Topic",
"by":"Tutorials Point"
}
)
>db.mycol.find()
{ "_id" : ObjectId(5983548781331adf45ec5),
"title":"New MongoDB Tutorial"
}
{ "_id" : ObjectId(5983548781331adf45ec6),
"title":"NoSQL Overview"
}
{ "_id" : ObjectId(5983548781331adf45ec7),
"title":"Tutorials Point New Topic",
"by":"Tutorials Point"
}
>
• Insert a Document with save() Method
• The following example creates a new document in
the inventory collection:
{ "_id" : ObjectId("51e866e48737f72b32ae4fbc"),
"type" : "book",
"item" : "notebook",
"qty" : 40
}
• Replace an Existing Document
• The products collection contains the following document:
{ "_id" : 100, "item" : "water", "qty" : 30 }
MongoDB - Query Document
• The find() Method
• To query data from MongoDB collection, you need to
use MongoDB's find() method.
• SYNTAX
>db.COLLECTION_NAME.find()
• The pretty() Method
• To display the results in a formatted way, you
can use pretty() method.
• SYNTAX:
>db.mycol.find().pretty()
• Example
>db.mycol.find().pretty()
{ "_id": ObjectId(7df78ad8902c),
"title": "MongoDB Overview",
"description": "MongoDB is no sql database",
"by": "tutorials point",
"url": "http://www.tutorialspoint.com",
"tags": ["mongodb", "database", "NoSQL"],
"likes": 100
}
>
• Specify Equality Condition
• Specify Conditions Using Query Operators
• A query document can use the query operators to specify
conditions in a MongoDB query.
• The $or operator performs a logical OR operation on an array
of two or more <expressions> and selects the documents that
satisfy at least one of the <expressions>.
• Example:
db.inventory.find( { $or: [ { quantity: { $lt: 20 }} , { price: 10 } ] } )
• RDBMS Where Clause Equivalents in MongoDB
• To query the document on the basis of some condition, you
can use following operations
• Specify AND Conditions
• A compound query can specify conditions for more than one field
in the collection’s documents.
• Implicitly, a logical AND conjunction connects the clauses of a
compound query so that the query selects the documents in the
collection that match all the conditions.
• Specify OR Conditions
• Using the $or operator, you can specify a compound query that
joins each clause with a logical OR conjunction so that the query
selects the documents in the collection that match at least one
condition.
• Specify AND as well as OR Conditions
• With additional clauses, you can specify precise conditions for
matching documents.
• In the following example, the compound query document selects
all documents in the collection where the value of the type field
is 'food' and either the qty has a value greater than ($gt) 100
or the value of the price field is less than ($lt) 9.95:
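A sketch of such a compound condition follows. The query object shows the shape one would pass to db.inventory.find(); the matches function restates the same logic in plain JavaScript so it can run without a server. The sample documents and item names are hypothetical, invented only for illustration.

```javascript
// Query document as it would be passed to db.inventory.find(...);
// the top-level fields are ANDed, the $or array supplies the alternatives.
const query = {
  type: "food",
  $or: [{ qty: { $gt: 100 } }, { price: { $lt: 9.95 } }],
};

// Plain-JavaScript predicate with the same meaning (no MongoDB required).
function matches(doc) {
  return doc.type === "food" && (doc.qty > 100 || doc.price < 9.95);
}

// Hypothetical sample documents, invented for illustration.
const inventory = [
  { item: "rice", type: "food", qty: 250, price: 12.0 },
  { item: "salt", type: "food", qty: 40, price: 1.5 },
  { item: "tea", type: "food", qty: 10, price: 15.0 },
  { item: "pen", type: "office", qty: 500, price: 0.5 },
];

const selected = inventory.filter(matches).map((d) => d.item);
console.log(selected); // [ 'rice', 'salt' ]
```

Here "tea" fails both OR branches and "pen" fails the type condition, so neither is selected.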
MongoDB Delete Document
• The remove() Method
• MongoDB's remove() method is used to remove documents from
a collection. The remove() method accepts two parameters: one is the
deletion criteria and the second is the justOne flag.
• deletion criteria : (Optional) the criteria according to which
documents will be removed.
• justOne : (Optional) if set to true or 1, then only one
document is removed.
• SYNTAX:
>db.COLLECTION_NAME.remove(DELETION_CRITERIA)
• EXAMPLE
• Consider the mycol collection has following data.
{ "_id" : ObjectId(5983548781331adf45ec5),
"title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6),
"title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7),
"title":"Tutorials Point Overview"}
• The following example removes all the documents whose title is
'MongoDB Overview':
>db.mycol.remove({'title':'MongoDB Overview'})
>db.mycol.find()
Remove All documents
• If you don't specify deletion criteria, then MongoDB will
delete all documents from the collection.
• This is the equivalent of SQL's truncate command.
• To remove all documents from a collection, pass an
empty query document {} to the remove() method.
• The remove() method does not remove the indexes.
• The following example removes all documents from
the inventory collection:
db.inventory.remove({})
• Remove a Single Document that Matches a Condition
• To remove a single document, call the remove() method with
the justOne parameter set to true or 1.
MongoDB Projection
• In MongoDB, projection means selecting only the necessary data
rather than selecting the whole of the data of a document. If a
document has 5 fields and you need to show only 3, then select
only those 3 fields.
• SYNTAX:
>db.COLLECTION_NAME.find({},{KEY:1})
• EXAMPLE
• Consider that the collection mycol has the following data:
{ "_id" : ObjectId(5983548781331adf45ec5), "title":"MongoDB Overview"}
{ "_id" : ObjectId(5983548781331adf45ec6), "title":"NoSQL Overview"}
{ "_id" : ObjectId(5983548781331adf45ec7), "title":"Tutorials Point Overview"}
• The following example displays only the title of each document when querying
the collection:
>db.mycol.find({},{"title":1,_id:0})
{"title":"MongoDB Overview"}
{"title":"NoSQL Overview"}
{"title":"Tutorials Point Overview"}
>
• The _id field is always displayed when executing the find() method; if you don't
want this field, then you need to set it to 0.
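The projection rule can be mimicked on plain objects. The sketch below is an in-memory approximation, not MongoDB's implementation: the project helper and the shortened _id values are assumptions made for illustration, and only inclusion-style projections are handled.

```javascript
// Hypothetical stand-in for the mycol collection, with shortened _id values.
const mycol = [
  { _id: "ec5", title: "MongoDB Overview" },
  { _id: "ec6", title: "NoSQL Overview" },
  { _id: "ec7", title: "Tutorials Point Overview" },
];

// Inclusion-style projection: keep fields marked 1; keep _id unless it is marked 0.
function project(doc, spec) {
  const out = {};
  if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(spec)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// _id suppressed explicitly, as in find({}, {"title":1, _id:0}).
console.log(mycol.map((doc) => project(doc, { title: 1, _id: 0 })));
// _id kept by default, as in find({}, {"title":1}).
console.log(mycol.map((doc) => project(doc, { title: 1 })));
```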
• MongoDB Limit Records
• The limit() Method
• To limit the records in MongoDB, you need to
use the limit() method. The limit() method accepts one number-type
argument, which is the number of documents that you want
displayed.
• SYNTAX:
>db.COLLECTION_NAME.find().limit(NUMBER)
• EXAMPLE
• Consider that the collection mycol has the following data:
{ "_id" : 1, "title":"MongoDB Overview"}
{ "_id" : 2, "title":"NoSQL Overview"}
{ "_id" : 3, "title":"Tutorials Point Overview"}
• The following example displays only two documents when querying the collection:
>db.mycol.find({},{"title":1,_id:0}).limit(2)
{"title":"MongoDB Overview"}
{"title":"NoSQL Overview"}
• If you don't specify the number argument in the limit() method,
then it will display all documents from the collection.
• MongoDB skip() Method
• Apart from the limit() method, there is one more
method, skip(), which also accepts a number-type argument
and is used to skip a number of documents.
• SYNTAX:
>db.COLLECTION_NAME.find().limit(NUMBER).skip(NUMBER)
• EXAMPLE:
• The following example displays only the second
document:
>db.mycol.find({},{"title":1,_id:0}).limit(1).skip(1)
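The combined effect of skip and limit can be pictured as an array slice. The sketch below works on an in-memory list of titles and uses an illustrative helper name; it applies the skip first and then the limit, which is how the example above ends up showing the second document.

```javascript
// Documents from the example above, already projected to their titles.
const titles = ["MongoDB Overview", "NoSQL Overview", "Tutorials Point Overview"];

// skip(n) drops the first n results; limit(m) keeps at most m of what remains.
function skipThenLimit(items, skip, limit) {
  return items.slice(skip, skip + limit);
}

console.log(skipThenLimit(titles, 1, 1)); // [ 'NoSQL Overview' ]
console.log(skipThenLimit(titles, 0, 2)); // [ 'MongoDB Overview', 'NoSQL Overview' ]
```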
MongoDB Sort Documents
• SYNTAX:
>db.COLLECTION_NAME.find().sort({KEY:1})
• The following example displays the documents sorted by title in
descending order:
>db.mycol.find({},{"title":1,_id:0}).sort({"title":-1})
MongoDB Aggregation
• Aggregation operations process data records and return computed
results.
• In SQL, count(*) combined with group by is the equivalent
of MongoDB aggregation.
• SYNTAX:
db.COLLECTION_NAME.aggregate(AGGREGATE_OPERATION)
• EXAMPLE:
• In the collection you have the following data:
{ _id: ObjectId(7df78ad8902c),
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by_user: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{ _id: ObjectId(7df78ad8902d),
title: 'NoSQL Overview',
description: 'No sql database is very fast',
by_user: 'tutorials point',
url: 'http://www.tutorialspoint.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 10
},
{ _id: ObjectId(7df78ad8902e),
title: 'Neo4j Overview',
description: 'Neo4j is no sql database',
by_user: 'Neo4j',
url: 'http://www.neo4j.com',
tags: ['neo4j', 'database', 'NoSQL'],
likes: 750
}
• Now, if you want to display a list showing how many tutorials
were written by each user in the above collection, use the following
aggregate() method:
db.mycol.aggregate(
[
{
$group :
{_id : "$by_user", num_tutorial : {$sum : 1}}
}
] )
Output:
{
"result" : [
{
"_id" : "tutorials point",
"num_tutorial" : 2
},
{
"_id" : "Neo4j",
"num_tutorial" : 1
}
],
"ok" : 1 }
• The SQL equivalent query for the above use case would be:
select by_user, count(*) from mycol group by by_user
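The grouping step can also be sketched in plain JavaScript, which may help when reasoning about what $group with {$sum : 1} computes. The countBy helper is an assumption for illustration; the documents are the three shown above, reduced to the fields that matter here.

```javascript
// Documents reduced to the one field the $group stage reads.
const tutorials = [
  { title: "MongoDB Overview", by_user: "tutorials point" },
  { title: "NoSQL Overview", by_user: "tutorials point" },
  { title: "Neo4j Overview", by_user: "Neo4j" },
];

// Equivalent of { $group: { _id: "$by_user", num_tutorial: { $sum: 1 } } }.
function countBy(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([key, n]) => ({ _id: key, num_tutorial: n }));
}

const grouped = countBy(tutorials, "by_user");
console.log(grouped);
```

Each distinct by_user value becomes one output document, and every input document adds 1 to its group's total.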
Expression   Description and example
$sum         Sums up the defined value from all documents in the collection.
             db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : "$likes"}}}])
$avg         Calculates the average of all given values from all documents in the collection.
             db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$avg : "$likes"}}}])
$min         Gets the minimum of the corresponding values from all documents in the collection.
             db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$min : "$likes"}}}])
$max         Gets the maximum of the corresponding values from all documents in the collection.
             db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$max : "$likes"}}}])
$push        Inserts the value to an array in the resulting document.
             db.mycol.aggregate([{$group : {_id : "$by_user", url : {$push : "$url"}}}])
$addToSet    Inserts the value to an array in the resulting document but does not create duplicates.
             db.mycol.aggregate([{$group : {_id : "$by_user", url : {$addToSet : "$url"}}}])
$first       Gets the first document from the source documents according to the grouping.
             db.mycol.aggregate([{$group : {_id : "$by_user", first_url : {$first : "$url"}}}])
$last        Gets the last document from the source documents according to the grouping.
             db.mycol.aggregate([{$group : {_id : "$by_user", last_url : {$last : "$url"}}}])
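Example
• A collection books contains the following documents:
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }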
• Group Documents by author
• The following aggregation operation uses the $$ROOT system
variable to group the documents by authors. The resulting
documents must not exceed the BSON Document Size limit.
db.books.aggregate(
[
{
$group :
{ _id : "$author", books: { $push: "$$ROOT" } }
}
]
)
The operation returns the following documents:
{ "_id" : "Homer", "books" :
[
{ "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
{ "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
]
}
{ "_id" : "Dante", "books" :
[
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
{ "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
{ "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
]
}
• Aggregation Pipeline
Map-Reduce
• Map-reduce is a data processing paradigm for condensing large
volumes of data into useful aggregated results.
• limit specifies the optional maximum number of documents to be
returned
function()
{
if (this.status == 'A')
emit(this.cust_id, 1);
}
• Requirements for the reduce Function
• The reduce function has the following prototype:
function(key, values)
{ ... return result; }
• The optional finalize function has the following prototype:
function(key, reducedValue)
{ ... return modifiedObject; }
Using MapReduce:
• Consider the following document structure storing user posts.
• The document stores the user_name of the user and the status of the post.
{
"post_text": "tutorialspoint is an awesome website",
"user_name": "mark",
"status":"active"
}
• Now, we will use a mapReduce function on
our posts collection to select all the active
posts, group them on the basis of user_name
and then count the number of posts by each
user using the following code:
db.posts.mapReduce(
function()
{ emit(this.user_name,1); },
function(key, values) {return Array.sum(values)},
{
query:{status:"active"},
out:"post_total"
}
)
• Output:
{
"result" : "post_total",
"timeMillis" : 9,
"counts" :
{
"input" : 4,
"emit" : 4,
"reduce" : 2,
"output" : 2
},
"ok" : 1 }
• Consider the following map-reduce operation:
• In this map-reduce operation, MongoDB applies the map phase to
each input document (i.e. the documents in the collection that
match the query condition).
• The map function emits key-value pairs. For those keys that have
multiple values, MongoDB applies the reduce phase, which
collects and condenses the aggregated data. MongoDB then stores
the results in a collection.
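The phases described above can be sketched as a small in-memory stand-in. This is not MongoDB's implementation: the mapReduce helper, the sample posts, and the convention of passing emit as an argument (in the shell it is a global) are all assumptions made so the example runs on its own.

```javascript
// Hypothetical posts collection; only the fields used by the job are shown.
const posts = [
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "runner", status: "active" },
  { user_name: "runner", status: "disabled" },
  { user_name: "mark", status: "active" },
];

// Minimal in-memory stand-in for mapReduce: filter, map (emit), group, reduce.
function mapReduce(docs, map, reduce, options) {
  const emitted = new Map();
  const emit = (key, value) => {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  };
  // Map phase: runs once per document that matches the query.
  docs.filter(options.query).forEach((doc) => map.call(doc, emit));
  // Reduce phase: only keys with more than one emitted value are reduced.
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: values.length > 1 ? reduce(key, values) : values[0],
  }));
}

const postTotal = mapReduce(
  posts,
  function (emit) { emit(this.user_name, 1); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: (doc) => doc.status === "active" }
);
console.log(postTotal);
```

With this sample data, "mark" has three active posts and "runner" has one, so reduce is called only for "mark".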
• All map-reduce functions in MongoDB are JavaScript and
run within the mongod process.
• Map-Reduce Examples
{ _id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A', price: 25,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 }
]
}
Return the Total Price Per Customer
• Perform the map-reduce operation on the orders collection to
group by the cust_id, and calculate the sum of the price for each
cust_id:
3)Perform the map-reduce on all documents in the orders
collection using the mapFunction1 map function and the
reduceFunction1 reduce function.
db.orders.mapReduce( mapFunction1,
reduceFunction1,
{
out: "map_reduce_example"
})
• Single Purpose Aggregation Operations
• Aggregation refers to a broad class of data manipulation
operations that compute a result based on an input and a
specific procedure.
Count
• MongoDB can return a count of the number of documents that match a query.
The count command as well as the count() and cursor.count() methods provide
access to counts in the mongo shell.
• Example
• Given a collection named records with only the following documents:
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
• The following operation would count all documents in the collection and return
the number 4:
db.records.count()
• The following operation will count only the documents where the value of the
field a is 1 and return 3:
db.records.count( { a: 1 } )
• Distinct
• The distinct operation takes a number of documents that match
a query and returns all of the unique values for a field in the
matching documents.
• Example
• Given a collection named records with only the following
documents:
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
{ a: 2, b: 2 }
• The following operation returns the distinct values of the field b:
db.records.distinct( "b" )
• Output: [ 0, 1, 4, 2 ]
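The same result can be reproduced in plain JavaScript with a Set, which is a convenient way to check the expected output by hand. The distinct helper and its optional query argument are illustrative assumptions, not the shell API.

```javascript
// The records collection from the example above.
const records = [
  { a: 1, b: 0 }, { a: 1, b: 1 }, { a: 1, b: 1 },
  { a: 1, b: 4 }, { a: 2, b: 2 }, { a: 2, b: 2 },
];

// A Set keeps one copy of each value, in first-seen order.
function distinct(docs, field, query = () => true) {
  return [...new Set(docs.filter(query).map((doc) => doc[field]))];
}

console.log(distinct(records, "b"));                       // [ 0, 1, 4, 2 ]
console.log(distinct(records, "b", (doc) => doc.a === 1)); // [ 0, 1, 4 ]
```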
Group
• The group operation takes a number of documents that match a
query, and then collects groups of documents based on the value
of a field or fields.
• Groups documents in a collection by the specified keys and
performs simple aggregation functions such as computing
counts and sums.
• Definition
Field      Type       Description
key        document   The field or fields to group. Returns a “key
                      object” for use as the grouping key.
reduce     function   An aggregation function that operates on the
                      documents during the grouping operation. These
                      functions may return a sum or a count. The
                      function takes two arguments: the current
                      document and an aggregation result document for
                      that group.
initial    document   Initializes the aggregation result document.
keyf       function   Optional. Alternative to the key field. Specifies a
                      function that creates a “key object” for use as
                      the grouping key. Use keyf instead of key to
                      group by calculated fields rather than existing
                      document fields.
cond       document   Optional. The selection criteria to determine which
                      documents in the collection to process. If you omit
                      the cond field, db.collection.group() processes all
                      the documents in the collection for the group
                      operation.
finalize   function   Optional. A function that runs on each item in the
                      result set before db.collection.group() returns the
                      final value. This function can either modify the
                      result document or replace the result document as a
                      whole.
ns         string     The collection from which to perform the group by
                      operation.
• The db.collection.group() method is a shell wrapper for
the group command.
162
• Example
• Given a collection named records with the
following documents:
{ a: 1, count: 4 }
{ a: 1, count: 2 }
{ a: 1, count: 4 }
{ a: 2, count: 3 }
{ a: 2, count: 1 }
{ a: 1, count: 5 }
{ a: 4, count: 4 }
163
• The following group operation groups documents by the field a, where a is less than 3, and sums the field count for each group:
db.records.group(
   {
     key: { a: 1 },
     cond: { a: { $lt: 3 } },
     reduce: function(cur, result) { result.count += cur.count },
     initial: { count: 0 }
   }
)
• Output: [ { a: 1, count: 15 }, { a: 2, count: 4 } ]
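• The roles of key, cond, reduce and initial can be traced with a small plain-JavaScript emulation. This is only a sketch: groupDocs is a hypothetical helper, and cond is a JavaScript predicate here rather than a MongoDB query document:

```javascript
// The records collection from the example above.
const records = [
  { a: 1, count: 4 }, { a: 1, count: 2 }, { a: 1, count: 4 },
  { a: 2, count: 3 }, { a: 2, count: 1 }, { a: 1, count: 5 },
  { a: 4, count: 4 },
];

// Hypothetical helper mirroring the key / cond / reduce / initial options.
function groupDocs(docs, { key, cond = () => true, reduce, initial }) {
  const groups = new Map();
  for (const doc of docs) {
    if (!cond(doc)) continue;
    // Build the "key object" from the fields named in `key`.
    const keyObject = {};
    for (const field of Object.keys(key)) keyObject[field] = doc[field];
    const id = JSON.stringify(keyObject);
    // Each new group starts from a copy of `initial`.
    if (!groups.has(id)) groups.set(id, { ...keyObject, ...initial });
    // `reduce` folds the current document into the group's result document.
    reduce(doc, groups.get(id));
  }
  return [...groups.values()];
}

const grouped = groupDocs(records, {
  key: { a: 1 },
  cond: (doc) => doc.a < 3,
  reduce: (cur, result) => { result.count += cur.count; },
  initial: { count: 0 },
});
console.log(grouped); // [ { a: 1, count: 15 }, { a: 2, count: 4 } ]
```

The document with a: 4 never reaches reduce because cond rejects it first.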
164
Group command:
• Groups documents in a collection by the specified key and
performs simple aggregation functions, such as computing
counts and sums.
165
• Syntax:
{
  group:
   {
     ns: <namespace>,
     key: <key>,
     $reduce: <reduce function>,
     $keyf: <key function>,
     cond: <query>,
     initial: <initial document>,
     finalize: <finalize function>
   }
}
166
Group by Two Fields
• The following example groups by the ord_dt and item.sku fields those documents that have ord_dt greater than 01/07/2015:
db.runCommand(
   {
     group:
       {
         ns: 'orders',
         key: { ord_dt: 1, 'item.sku': 1 },
         cond: { ord_dt: { $gt: new Date( '01/07/2015' ) } },
         $reduce: function ( curr, result ) { },
         initial: { }
       }
   }
)
167
• db.runCommand() runs the command in the context of the current database. Some commands are only applicable in the context of the admin database, and you must change your db object to the admin database before running these commands.
168
{ "_id" : 1, "domainName" : "test1.com", "hosting" : "hostgator.com" }
169
• The following example groups the documents of the website collection by the "hosting" field and displays the number of documents for each hosting value:
> db.website.aggregate(
{
$group : {_id : "$hosting", total : { $sum : 1 } }
}
);
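• As a rough illustration of why $sum : 1 produces a count, the grouping can be emulated in plain JavaScript. Only the first document below appears in the text; the other four, and the helper countBy, are hypothetical:

```javascript
// Sample documents: _id 1 is from the text above, the rest are hypothetical.
const website = [
  { _id: 1, domainName: "test1.com", hosting: "hostgator.com" },
  { _id: 2, domainName: "test2.com", hosting: "aws.amazon.com" },
  { _id: 3, domainName: "test3.com", hosting: "aws.amazon.com" },
  { _id: 4, domainName: "test4.com", hosting: "hostgator.com" },
  { _id: 5, domainName: "test5.com", hosting: "aws.amazon.com" },
];

// Emulates { $group: { _id: "$hosting", total: { $sum: 1 } } }:
// adding 1 for every document yields a per-group document count.
function countBy(docs, field) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc[field], (totals.get(doc[field]) || 0) + 1);
  }
  return [...totals].map(([value, total]) => ({ _id: value, total }));
}

const totals = countBy(website, "hosting");
console.log(JSON.stringify(totals));
```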
171
MongoDB Indexing
• Indexes support the efficient execution of queries. Without indexes, MongoDB must scan every document of a collection to select those documents that match the query statement. This scan is highly inefficient and requires the mongod to process a large volume of data.
• An index stores the value of a specific field or set of fields, ordered by the value of the field as specified in the index.
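• The difference between a collection scan and an index lookup can be sketched with a sorted array and binary search. This is a toy model under stated assumptions: the collection, the score field and all function names are hypothetical, and MongoDB's real indexes are B-tree structures, not sorted arrays:

```javascript
// Hypothetical collection: 1000 documents, each with a distinct score.
const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i, score: (i * 37) % 1000 }));

// Without an index: every document is examined.
function collectionScan(docs, field, value) {
  let examined = 0;
  const matches = [];
  for (const doc of docs) {
    examined++;
    if (doc[field] === value) matches.push(doc);
  }
  return { matches, examined };
}

// A toy index: [value, position] pairs kept in ascending order of value.
function buildIndex(docs, field) {
  return docs.map((doc, pos) => [doc[field], pos]).sort((x, y) => x[0] - y[0]);
}

// With the index: binary search for the first entry >= value,
// then read the run of entries equal to value.
function indexLookup(docs, index, value) {
  let lo = 0, hi = index.length, examined = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid][0] < value) lo = mid + 1; else hi = mid;
  }
  const matches = [];
  while (lo < index.length && index[lo][0] === value) {
    matches.push(docs[index[lo][1]]);
    examined++;
    lo++;
  }
  return { matches, examined };
}

const scan = collectionScan(docs, "score", 500);
const lookup = indexLookup(docs, buildIndex(docs, "score"), 500);
console.log(scan.examined);                   // 1000
console.log(lookup.examined < scan.examined); // true
```

Both calls return the same document; only the amount of work differs.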
172
• Index Types
• MongoDB provides a number of different index types.
• Example
• Consider the friends collection, which holds documents such as:
{
  "_id": ObjectId(...),
  "name": "John",
  "address": {
    "street": "Main",
    "zipcode": "53511",
    "state": "WI"
  }
}
• You can create an index on the address.zipcode field, using the following specification:
db.friends.ensureIndex( { "address.zipcode": 1 } )
177
Indexes on Subdocuments
• For example, the factories collection contains documents that
contain a metro field, such as:
{
_id: ObjectId(...),
metro: { city: "New York",
state: "NY" },
name: "Giant Factory"
}
178
• With an index on the whole metro field, created with db.factories.ensureIndex( { metro: 1 } ), the following query can use that index:
db.factories.find(
{ metro:
{ city: "New York",
state: "NY"
}
})
179
• The ensureIndex() Method
• SYNTAX:
>db.COLLECTION_NAME.ensureIndex({KEY:1})
• Here KEY is the name of the field on which you want to create the index, and 1 specifies ascending order. To create an index in descending order, use -1.
• EXAMPLE
>db.mycol.ensureIndex({"title":1})
>db.mycol.ensureIndex({"title":1,"description":-1})
180
Compound Indexes
• MongoDB supports compound indexes, where a single index
structure holds references to multiple fields within a collection’s
documents.
db.collection.ensureIndex( { a: 1, b: 1, c: 1 } )
181
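• The value of the field in the index specification describes the kind of index for that field. For example, 1 specifies ascending order and -1 specifies descending order.
• Example
• The following operation will create an ascending index on the item, category, and price fields of the products collection:
db.products.ensureIndex( { item: 1, category: 1, price: 1 } )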
183
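• For example, a compound index { username: 1, date: -1 } on the events collection can support queries that return results sorted first by ascending username values and then by descending date values, or queries that return results sorted first by descending username values and then by ascending date values, such as:
db.events.find().sort( { username: -1, date: 1 } )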
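• What a sort specification such as { username: -1, date: 1 } means can be shown with a plain-JavaScript comparator. The sample events and the helper compareBySpec are hypothetical, and dates are simplified to numbers:

```javascript
// Build a comparator from a sort specification: fields are compared in the
// order listed; 1 means ascending and -1 means descending.
function compareBySpec(spec) {
  const fields = Object.entries(spec);
  return (x, y) => {
    for (const [field, direction] of fields) {
      if (x[field] < y[field]) return -direction;
      if (x[field] > y[field]) return direction;
    }
    return 0;
  };
}

// Hypothetical events; date is a plain number to keep the sketch short.
const events = [
  { username: "amy", date: 3 },
  { username: "bob", date: 2 },
  { username: "bob", date: 1 },
  { username: "amy", date: 1 },
];

const sorted = [...events].sort(compareBySpec({ username: -1, date: 1 }));
console.log(JSON.stringify(sorted.map((e) => [e.username, e.date])));
// [["bob",1],["bob",2],["amy",1],["amy",3]]
```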
184
Multikey Indexes
• To index a field that holds an array value, MongoDB adds
index items for each item in the array.
185
Limitations
• Interactions between Compound and Multikey Indexes
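• While you can create multikey compound indexes, at most one field in a compound index may hold an array. For example, given an index on { a: 1, b: 1 }, the following documents are permissible:
{ a: [ 1, 2 ], b: 1 }
{ a: 1, b: [ 1, 2 ] }
• However, the following document is impermissible, and MongoDB cannot insert such a document into a collection with the { a: 1, b: 1 } index:
{ a: [ 1, 2 ], b: [ 1, 2 ] }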
187
• Examples
• Index Basic Arrays
• Given the following document:
{ "_id" : ObjectId("..."),
"name" : "Warm Weather",
"author" : "Steve",
"tags" : [ "weather", "hot", "record", "april" ]
}
188
• Then an index on the tags field, { tags: 1 }, would be a multikey
index and would include these four separate entries for that
document:
"weather",
"hot",
"record", and
"april".
• Queries could use the multikey index to return documents that match any of the above values.
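• A small sketch of how one document yields several index entries. The helper indexEntries and the numeric _id are assumptions for illustration; this is not MongoDB's internal index format:

```javascript
// The document from the example above, with a hypothetical numeric _id.
const post = {
  _id: 1,
  name: "Warm Weather",
  author: "Steve",
  tags: ["weather", "hot", "record", "april"],
};

// One [key, _id] entry per array element, or a single entry for a scalar.
function indexEntries(doc, field) {
  const value = doc[field];
  const keys = Array.isArray(value) ? value : [value];
  return keys.map((key) => [key, doc._id]);
}

console.log(JSON.stringify(indexEntries(post, "tags")));
// [["weather",1],["hot",1],["record",1],["april",1]]
console.log(JSON.stringify(indexEntries(post, "author")));
// [["Steve",1]]
```

Because every tag has its own entry pointing back to the document, a lookup for any single tag finds it.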
189
Index Arrays with Embedded Documents
• You can create multikey indexes on fields in objects embedded
in arrays, as in the following example:
• Consider a feedback collection with documents in the
following form:
{ "_id": ObjectId(...),
  "title": "Grocery Quality",
  "comments": [
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the selection."
    },
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the mustard selection."
    },
    { author_id: ObjectId(...),
      date: Date(...),
      text: "Please expand the olive selection."
    }
  ]
}
• An index on the comments.text field would be a multikey index
and would add items to the index for all embedded documents in
the array.
191
• With the index { "comments.text": 1 } on the feedback
collection, consider the following query:
db.feedback.find(
{ "comments.text": "Please expand the olive selection." } )
192
Array of Embedded Documents
• Consider that the inventory collection includes the following
documents:
{
_id: 100,
type: "food",
item: "xyz",
qty: 25,
price: 2.5,
ratings: [ 5, 8, 9 ],
memos: [ { memo: "on time", by: "shipping" },
{ memo: "approved", by: "billing" } ]
}
193
{
_id: 101,
type: "fruit",
item: "jkl",
qty: 10,
price: 4.25,
ratings: [ 5, 9 ],
memos: [ { memo: "on time", by: "payment" },
{ memo: "delayed", by: "shipping" }
]
}
194
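Match a Field in the Embedded Document Using the Array Index
• If you know the array index of the embedded document, you can specify the document by its position using dot notation. The following operation selects all documents where the first element (index 0) of the memos array is a document whose by field is 'shipping':
db.inventory.find( { 'memos.0.by': 'shipping' } )
• The operation returns the following document: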
{
_id: 100,
type: "food",
item: "xyz",
qty: 25,
price: 2.5,
ratings: [ 5, 8, 9 ],
memos: [ { memo: "on time", by: "shipping" },
{ memo: "approved", by: "billing" }
]
}
196
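Match a Field Without Specifying Array Index
• If you do not know the index position of the document in the array, concatenate the name of the field that contains the array, a dot (.), and the name of the field in the subdocument. The following operation selects all documents where the memos array contains at least one embedded document whose by field is 'shipping':
db.inventory.find( { 'memos.by': 'shipping' } )
• The operation returns the following documents: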
{
_id: 100,
type: "food",
item: "xyz",
qty: 25,
price: 2.5,
ratings: [ 5, 8, 9 ],
memos: [ { memo: "on time", by: "shipping" }, { memo: "approved", by:
"billing" } ]
}
{
_id: 101,
type: "fruit",
item: "jkl",
qty: 10,
price: 4.25,
ratings: [ 5, 9 ],
memos: [ { memo: "on time", by: "payment" }, { memo: "delayed", by:
"shipping" } ]
}
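• The two matching styles, with and without an array position, can be emulated in plain JavaScript. This is a simplified sketch: valuesAt and findIds are hypothetical helpers, and only equality matching is covered:

```javascript
// The two inventory documents from the text above.
const inventory = [
  { _id: 100, type: "food", item: "xyz", qty: 25, price: 2.5, ratings: [5, 8, 9],
    memos: [{ memo: "on time", by: "shipping" }, { memo: "approved", by: "billing" }] },
  { _id: 101, type: "fruit", item: "jkl", qty: 10, price: 4.25, ratings: [5, 9],
    memos: [{ memo: "on time", by: "payment" }, { memo: "delayed", by: "shipping" }] },
];

// Collect every value a dotted path can reach. Inside an array, a numeric
// part selects one element; any other part is applied to every element.
function valuesAt(value, parts) {
  if (parts.length === 0) {
    // An array at the end of the path matches as a whole or by any element.
    return Array.isArray(value) ? [value, ...value] : [value];
  }
  const [head, ...rest] = parts;
  if (Array.isArray(value)) {
    if (/^\d+$/.test(head)) {
      return Number(head) < value.length ? valuesAt(value[Number(head)], rest) : [];
    }
    return value.flatMap((element) => valuesAt(element, parts));
  }
  if (value !== null && typeof value === "object" && head in value) {
    return valuesAt(value[head], rest);
  }
  return [];
}

// Equality match on a dotted path; returns the _id of each matching document.
function findIds(docs, path, expected) {
  return docs
    .filter((doc) => valuesAt(doc, path.split(".")).includes(expected))
    .map((doc) => doc._id);
}

console.log(findIds(inventory, "memos.0.by", "shipping")); // [ 100 ]
console.log(findIds(inventory, "memos.by", "shipping"));   // [ 100, 101 ]
```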
Create a Unique Index
• MongoDB allows you to specify a unique constraint on an index. These constraints prevent applications from inserting documents that have duplicate values for the indexed fields.
• Unique Indexes
• To create a unique index, consider the following prototype:
db.collection.ensureIndex( { a: 1 }, { unique: true } )
199
• For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent storing multiple account records for the same legal entity:
db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } )
200
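• Unique Constraint Across Separate Documents
• The unique constraint applies to separate documents in the collection. The unique index prevents separate documents from having the same value for the indexed key, but it does not prevent a single document from containing multiple elements or embedded documents with the same value.
• For example, suppose a collection has a unique index on a.b. The index permits the following insert, provided no other document in the collection has an a.b value of 5:
db.collection.insert( { a: [ { b: 5 }, { b: 5 } ] } )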
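• A minimal sketch of a unique constraint that is checked across documents but not within one document. The class UniqueIndex and its error message are hypothetical and do not reproduce MongoDB's actual duplicate-key error:

```javascript
// Toy unique index: remembers which document owns each key value.
class UniqueIndex {
  constructor(keysOf) {
    this.keysOf = keysOf;     // function that extracts the indexed values of a document
    this.owners = new Map();  // key value -> _id of the owning document
  }

  insert(doc) {
    // Repeated values inside ONE document collapse to a single key.
    const keys = new Set(this.keysOf(doc));
    for (const key of keys) {
      if (this.owners.has(key)) {
        throw new Error("duplicate key: " + key);
      }
    }
    for (const key of keys) this.owners.set(key, doc._id);
  }
}

// Index on a.b: the b value of every element of the a array.
const uniqueAB = new UniqueIndex((doc) => doc.a.map((element) => element.b));

uniqueAB.insert({ _id: 1, a: [{ b: 5 }, { b: 5 }] }); // accepted: 5 repeats only within one document

let rejected = false;
try {
  uniqueAB.insert({ _id: 2, a: [{ b: 5 }] }); // a second document with a.b = 5
} catch (err) {
  rejected = true;
}
console.log(rejected); // true
```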
202
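• Drop Duplicates
• MongoDB cannot create a unique index on a field that has duplicate values. To force the creation of a unique index, you can specify the dropDups option, which indexes only the first occurrence of each key value and deletes all subsequent documents with that value.
• To create a unique index that drops duplicates on the username field of the accounts collection, use a command in the following form: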
db.accounts.ensureIndex(
{ username: 1 },
{ unique: true,
dropDups: true }
)
204
Index Names
• The default name for an index is the concatenation of the
indexed keys and each key’s direction in the index, 1 or -1.
• Example
• Consider the following command to create an index on item and quantity:
db.products.ensureIndex( { item: 1, quantity: -1 } )
• The resulting index is named item_1_quantity_-1.
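• The naming rule can be written out as a one-line function. The helper defaultIndexName is hypothetical and only illustrates the concatenation described above:

```javascript
// Join each indexed key and its direction with underscores.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, direction]) => field + "_" + direction)
    .join("_");
}

console.log(defaultIndexName({ item: 1, quantity: -1 })); // item_1_quantity_-1
console.log(defaultIndexName({ title: 1 }));              // title_1
```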
205
• Example
• Issue the following command to create an index
on item and quantity and specify inventory as the
index name:
db.products.ensureIndex(
{ item: 1, quantity: -1 } ,
{ name: "inventory" }
)
• The resulting index has the name inventory.
207
• Example
• The following operation creates a sparse index on the users collection that includes a document in the index only if the twitter_name field exists in that document.
db.users.ensureIndex(
{ twitter_name: 1 }, { sparse: true } )
208
• Example: Create a Sparse Index On A Collection
• Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "abc" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "xyz", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "lmn", "score" : 90 }
• The collection has a sparse index on the field score:
db.scores.ensureIndex( { score: 1 }, { sparse: true } )
• The following query selects the documents that have a score less than 90:
db.scores.find( { score: { $lt: 90 } } )
• Because the document for the userid "abc" does not contain the score field and thus does not meet the query criteria, the query can use the sparse index to return the results:
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "xyz", "score" : 82 }
210
• Sparse Index On A Collection Cannot Return Complete
Results
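• Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "abc" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "xyz", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "lmn", "score" : 90 }
• The collection has a sparse index on the field score:
db.scores.ensureIndex( { score: 1 }, { sparse: true } )
• Because the document for the userid "abc" does not contain the score field, the sparse index does not contain an entry for that document.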
212
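• A query that returns all documents sorted by score, such as db.scores.find().sort( { score: -1 } ), needs every document in the collection, so the sparse index alone cannot give complete results for it. To use the sparse index anyway, explicitly specify the index with hint():
db.scores.find().sort( { score: -1 } ).hint( { score: 1 } )
• Using the index returns only those documents that have the score field, so the document for the userid "abc" is omitted.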
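• Why a sparse index gives incomplete results for some queries can be seen in a small plain-JavaScript model. The helper names are hypothetical, and placing documents without a score last in the full sort is an assumption of this sketch:

```javascript
// The scores collection, without the ObjectId values.
const scores = [
  { userid: "abc" },
  { userid: "xyz", score: 82 },
  { userid: "lmn", score: 90 },
];

// A sparse index holds entries only for documents that contain the field.
function buildSparseIndex(docs, field) {
  return docs
    .filter((doc) => field in doc)
    .map((doc) => ({ key: doc[field], doc }))
    .sort((x, y) => x.key - y.key);
}

const scoreIndex = buildSparseIndex(scores, "score");

// A query on score < 90 only needs documents that have a score,
// so the sparse index can answer it completely.
const below90 = scoreIndex.filter((e) => e.key < 90).map((e) => e.doc.userid);

// Sorting by walking the index backwards silently skips "abc".
const viaIndex = [...scoreIndex].reverse().map((e) => e.doc.userid);

// Sorting the whole collection keeps "abc" (placed last in this sketch).
const scoreOf = (doc) => ("score" in doc ? doc.score : -Infinity);
const fullSort = [...scores].sort((x, y) => scoreOf(y) - scoreOf(x)).map((doc) => doc.userid);

console.log(below90);  // [ 'xyz' ]
console.log(viaIndex); // [ 'lmn', 'xyz' ]
console.log(fullSort); // [ 'lmn', 'xyz', 'abc' ]
```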
213
• Consider the following operation:
db.users.find().hint( { score: 1 } )
214
• Sparse Index with Unique Constraint
• Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"),
"userid" : "newbie"
}
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"),
"userid" : "abby", "score" : 82
}
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"),
"userid" : "nina", "score" : 90 }
• You could create an index with a unique constraint and sparse filter on the score field using the following operation:
db.scores.ensureIndex( { score: 1 }, { sparse: true, unique: true } )
• This index would permit the insertion of documents that had unique values for the score field or did not include a score field.
• However, the index would not permit the addition of documents with a score of 82 or 90, since documents with those score values already exist.
217
• Hashed indexes maintain entries with hashes of the values of the indexed field.
• Example
• Specify a hashed index on _id:
db.collection.ensureIndex( { _id: "hashed" } )
218
END
219
• MongoDB Java
Installation
• Before we start using MongoDB in our Java programs, we need to make sure that the MongoDB Java driver and Java are set up on the machine.
• You need to download the driver jar file (mongo.jar). Make sure to download the latest release of it.
• You need to include mongo.jar in your classpath.
Connect to database
• To connect to a database, you need to specify the database name. If the database doesn't exist, MongoDB creates it automatically.
220
import com.mongodb.MongoClient;
import com.mongodb.MongoException;
import com.mongodb.WriteConcern;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBCursor;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class MongoDBJDBC {
    public static void main(String[] args) {
        try {
            // Connect to the MongoDB server
            MongoClient mongoClient = new MongoClient("localhost", 27017);

            // Now connect to your database
            DB db = mongoClient.getDB("test");
            System.out.println("Connect to database successfully");

            // Placeholder credentials; replace with your own user name and password
            String myUserName = "myUserName";
            char[] myPassword = "myPassword".toCharArray();
            boolean auth = db.authenticate(myUserName, myPassword);
            System.out.println("Authentication: " + auth);
        } catch (Exception e) {
            System.err.println(e.getClass().getName() + ": " + e.getMessage());
        }
    }
}
222
• Now, let's compile and run the above program to connect to our database test. You can change the classpath as per your requirement.
$ javac -classpath "mongo-2.10.1.jar" MongoDBJDBC.java
$ java -classpath ".:mongo-2.10.1.jar" MongoDBJDBC
Connect to database successfully
Authentication: true
223
• The MongoClient instance actually represents
a pool of connections to the database; you will
only need one instance of class MongoClient
even with multiple threads.
• To dispose of an instance, call MongoClient.close() to clean up resources.
224
• Authentication (Optional)
• MongoDB can be run in a secure mode where access to
databases is controlled through name and password
authentication. When run in this mode, any client application
must provide a name and password before doing any
operations.
• If the name and password are valid for the database, auth will
be true. Otherwise, it will be false.
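• Getting a List Of Collections
• Each database has zero or more collections. You can retrieve a list of them from the db:
for (String s : db.getCollectionNames()) {
    System.out.println(s);
}
• Getting a Collection
• To get a collection to use, pass its name to the getCollection() method:
DBCollection coll = db.getCollection("testCollection");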
227
• Inserting a Document
• Once you have the collection object, you can insert
documents into the collection.
• For example,
{
"name" : "MongoDB",
"type" : "database",
"count" : 1,
"info" :
{ x : 203,
y : 102
}
}
228
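• The above has an "inner" document embedded within it. We can build the document with the BasicDBObject class and then insert it:
BasicDBObject doc = new BasicDBObject("name", "MongoDB")
        .append("type", "database")
        .append("count", 1)
        .append("info", new BasicDBObject("x", 203).append("y", 102));
coll.insert(doc);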
229
Finding the First Document in a Collection Using findOne()
230
DBObject myDoc = coll.findOne();
System.out.println(myDoc);
• Output:
{ "_id" : "49902cde5162504500b45c2c" ,
"name" : "MongoDB" ,
"type" : "database" ,
"count" : 1 ,
"info" : { "x" : 203 , "y" : 102}
}
231
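• Adding Multiple Documents
• These documents will just be { "i" : value }, and we can insert them in a loop:
for (int i = 0; i < 100; i++) {
    coll.insert(new BasicDBObject("i", i));
}
• Counting Documents in a Collection
• We can check the number of documents in the collection with the getCount() method:
System.out.println(coll.getCount());
• It should print 100.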
233
• Using a Cursor to Get All the Documents
• In order to get all the documents in the collection, we will use the
find() method. The find() method returns a DBCursor object which
allows us to iterate over the set of documents that matched our
query. So to query all of the documents and print them out :
DBCursor cursor = coll.find();
try {
    while (cursor.hasNext()) {
        System.out.println(cursor.next());
    }
} finally {
    cursor.close();
}
• That should print all 100 documents in the collection.
234
• Getting A Single Document with A Query
• We can create a query to pass to the find() method to get a subset
of the documents in our collection.
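• For example, if we wanted to find the document for which the value of the "i" field is 72, we could write the following and print the result with the same cursor loop as before:
cursor = coll.find(new BasicDBObject("i", 72));
• It should print just one document.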
• Shell examples often use $ operators, such as:
db.things.find(
   {
     j: { $ne: 3 },
     k: { $gt: 10 }
   }
);
• These are represented as regular String keys in the Java driver, using embedded DBObjects:
BasicDBObject query = new BasicDBObject("j", new BasicDBObject("$ne", 3))
        .append("k", new BasicDBObject("$gt", 10));
cursor = coll.find(query);
try {
    while (cursor.hasNext()) {
        System.out.println(cursor.next());
    }
} finally {
    cursor.close();
}
238
• Getting A Set of Documents With a Query
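• We can use the query to get a set of documents from our collection. For example, if we wanted to get all documents where "i" > 50, we could write:
// find all where i > 50
cursor = coll.find(new BasicDBObject("i", new BasicDBObject("$gt", 50)));
• Iterating over the cursor as before should print the documents where i > 50.
• We could also get a range, say 20 < i <= 30:
cursor = coll.find(new BasicDBObject("i", new BasicDBObject("$gt", 20).append("$lte", 30)));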
240
• Getting A List of Databases
• You can get a list of the available databases:
for (String s : mongoClient.getDatabaseNames()) {
    System.out.println(s);
}
241
• Dropping A Database
• You can drop a database by name using a MongoClient
instance:
mongoClient.dropDatabase("databaseToBeDropped");
242
• Creating A Collection
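• There are two ways to create a collection. You can simply get the collection from the DB with getCollection(), in which case it is created when the first document is inserted, or you can call createCollection() to set options such as a capped size:
db = mongoClient.getDB("mydb");
db.createCollection("testCollection",
        new BasicDBObject("capped", true)
                .append("size", 1048576));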
243
• Getting A List of Collections
• You can get a list of the available collections in a database:
for (String s : db.getCollectionNames()) {
    System.out.println(s);
}
• It should output
system.indexes
testCollection
244
• Dropping A Collection
• You can drop a collection by using the drop() method:
DBCollection test1 =
db.getCollection("testCollection");
test1.drop();
System.out.println(db.getCollectionNames());
245
• Creating An Index
246
• Insert Multiple Documents Using a For Loop
• You can add documents to a new or existing collection by using a
JavaScript for loop run from the mongo shell.
• From the mongo shell, insert new documents into
the testData collection using the following for loop.
• If the testData collection does not exist, MongoDB creates the
collection implicitly.
247
• Insert Multiple Documents with a mongo Shell Function
• You can create a JavaScript function in your shell session to
generate the above data.
• The insertData() JavaScript function, creates new data for use in
testing or training by either creating a new collection or appending
data to an existing collection:
249
• EXAMPLE
• Specify database name, collection name, and the
number of documents to insert as arguments to
insertData().
• Select a Database
• After starting the mongo shell, the session will use the test database by default. Issue the following operation at the mongo shell to get the name of the current database:
db
251
• From the mongo shell, display the list of
databases, with the following operation:
show dbs
252
• Create a Collection and Insert Documents
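• Insert documents into a new collection named testData within the new database named mydb.
• 1) From the mongo shell, confirm that the current database is mydb with the following operation:
db
• 2) If mongo does not return mydb for the previous operation, set the context to the mydb database with the following operation:
use mydb
• 3) Create two documents named j and k with the following JavaScript operations:
j = { name : "mongo" }
k = { x : 3 }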
254
• 4) Insert the j and k documents into the testData collection with the following sequence of operations:
db.testData.insert( j )
db.testData.insert( k )
• When you insert the first document, the mongod will create both the mydb database and the testData collection.
256
Insert Documents using a For Loop or a JavaScript Function: Working with the Cursor
• When you query a collection, MongoDB returns a cursor object that contains the results of the query. The mongo shell then iterates over the cursor to display the results.
• Rather than returning all results at once, the shell iterates over the cursor 20 times to display the first 20 results and then waits for a request to iterate over the remaining results. In the shell, type it to iterate over the next set of results.
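• The batching behaviour can be imitated with a small array-backed cursor. This is a sketch only: ArrayCursor and nextBatch are hypothetical stand-ins for the shell's cursor object, and the 25 sample documents are made up:

```javascript
// A minimal cursor over an in-memory array.
class ArrayCursor {
  constructor(docs) {
    this.docs = docs;
    this.position = 0;
  }
  // True while documents remain to be read.
  hasNext() {
    return this.position < this.docs.length;
  }
  // Return the next document and advance the cursor.
  next() {
    if (!this.hasNext()) throw new Error("cursor exhausted");
    return this.docs[this.position++];
  }
}

// Read up to batchSize documents, like one screenful in the shell.
function nextBatch(cursor, batchSize = 20) {
  const batch = [];
  while (batch.length < batchSize && cursor.hasNext()) {
    batch.push(cursor.next());
  }
  return batch;
}

// Hypothetical test data: 25 documents of the form { x: n }.
const testData = Array.from({ length: 25 }, (_, i) => ({ x: i + 1 }));
const cursor = new ArrayCursor(testData);

const firstBatch = nextBatch(cursor);  // what the shell shows first
const secondBatch = nextBatch(cursor); // what typing "it" would show next
console.log(firstBatch.length, secondBatch.length, cursor.hasNext()); // 20 5 false
```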
257
• Iterate over the Cursor with a Loop
var c = db.testData.find()
while ( c.hasNext() ) printjson( c.next() )
258
• The hasNext() function returns true if the cursor has documents.
• The next() method returns the next document.
• The printjson() method renders the document in a JSON-like
format.
• The operation displays 20 documents. For example, if the
documents have a single field named x, the operation displays the
field as well as each document’s ObjectId:
259
• Use Array Operations with the Cursor
• In the mongo shell, query the testData collection and assign
the resulting cursor object to the c variable:
var c = db.testData.find()
260
MongoDB Datatypes
• String : This is the most commonly used datatype to store data.
• Integer : This type is used to store a numerical value. An integer can be 32-bit or 64-bit depending upon your server.
• Min/Max keys : This type is used to compare a value against the lowest and highest BSON elements.
• Arrays : This type is used to store arrays, lists, or multiple values in one key.
• Timestamp : This can be used for recording when a document has been
modified or added.
261
• Object : This datatype is used for embedded documents.
• Date : This datatype is used to store the current date or time in UNIX time format. You can specify your own date and time by creating a Date object and passing the day, month, and year into it.
References
• http://docs.mongodb.org/manual/
• https://www.w3schools.com/ (SQL/XML/MongoDB)
• https://www.tutorialspoint.com/mongodb/
• https://www.json.org/
• https://www.tutorialspoint.com/json/
263
END
264