
Chapter 3. MongoDB

MongoDB is a document database that stores data in JSON-like documents rather than in tables. It does not enforce schemas and stores data in collections as documents rather than normalizing data across tables. The document provides examples of connecting to a MongoDB database from Python, performing CRUD operations, running queries with filters and sorting, and aggregating data using grouping and averages.

MongoDB

MongoDB: a document database
• It is not a Relational Database Management System (RDBMS).
• It is a "NoSQL" database.
• Unlike SQL-based databases, it does not normalize data under schemas and tables where every table has a fixed structure.
• Instead, it stores data in collections as JSON-like documents and does not enforce schemas.
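MongoDB structure
• Database: a container for collections
• Collection: a group of documents (comparable to a table)
• Document: a JSON-like record of field-value pairs (comparable to a row)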
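The hierarchy above can be pictured with plain Python data. This is only an illustrative sketch that runs without a MongoDB server: the collection name "table1" follows the programs in this chapter, while the field values are made up. It shows that documents in one collection need not share the same fields.

```python
import json

# Hypothetical in-memory picture of one database: collection name -> list of documents.
# The field values below are invented for illustration.
database = {
    "table1": [
        {"SlNo": 1, "Genre": "Thriller", "Budget": 60},
        {"SlNo": 2, "Budget": 95},                           # no "Genre" field
        {"SlNo": 3, "Genre": "Action", "Cast": ["A", "B"]},  # extra field, no "Budget"
    ]
}

# Each document is a JSON-like record; print them one per line.
for document in database["table1"]:
    print(json.dumps(document))

# The set of field names differs from document to document (no enforced schema).
field_sets = [set(document) for document in database["table1"]]
print(field_sets)
```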
MongoDB installation on Windows
STEP 1: Follow the installation video: https://www.youtube.com/watch?v=gB6WLkSrtJk
STEP 2: Create a database and a collection
STEP 3: Install pymongo in the Python environment so that Python (PySpark) programs can connect to MongoDB
Pyspark+MongoDB-pgm1
import pymongo
# Connect to the MongoDB server running on the local machine
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["sample"]  # Replace with your collection name
# Print every document in the collection
cursor = collection.find()
for document in cursor:
    print(document)
# Print only the documents whose "target" field equals 0; change the filter to your condition
print("Filtered results\n")
cursor = collection.find({"target": 0})
for document in cursor:
    print(document)
client.close()
Pyspark+MongoDB-pgm2
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Print only the documents whose "Genre" field equals "Thriller"
print("Filtered results\n")
cursor = collection.find({"Genre": "Thriller"})
for document in cursor:
    print(document)
client.close()
Pyspark+MongoDB-pgm3
import pymongo
# Connect to the MongoDB server
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Count the number of documents with the "Genre" field equal to "Thriller"
thriller_count = collection.count_documents({"Genre": "Thriller"})
print("Count of Thriller Genre", thriller_count)
# Find and print documents where "Budget" is greater than 50
cursor = collection.find({"Budget": {"$gt": 50}})
for document in cursor:
    print(document)
client.close()
INSERT A DOCUMENT
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Insert a new record with values for "slNo" and "Budget"
new_record = {
    "slNo": 171,  # Replace with your desired slNo value
    "Budget": 95  # Replace with your desired budget value
}
collection.insert_one(new_record)
print("\nNew record inserted successfully.")
client.close()
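INSERT MULTIPLE DOCUMENTS
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Insert several new records in one call; the documents may have different fields
new_records = [
    {"SlNo": 181, "Budget": 95},
    {"SlNo": 191, "Budget": 95},
    {"SlNo": 171, "Genre": "Action"},
]
collection.insert_many(new_records)
print("\nNew records inserted successfully.")
client.close()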
DELETE DOCUMENTS
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]
collection = db["table1"]
# Print the documents currently in the collection
cursor = collection.find()
for document in cursor:
    print(document)
# Search for documents with "Budget" less than 10 and delete them
delete_result = collection.delete_many({"Budget": {"$lt": 10}})
print(f"\nDeleted {delete_result.deleted_count} records with budget less than 10.")
client.close()
Note: delete_one deletes only the first matching document.
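The difference between deleting one match and deleting all matches can be sketched on an ordinary Python list. This is an in-memory analogy, not pymongo code: the helper functions and the sample data are hypothetical and only mimic the behaviour described above.

```python
def delete_many(documents, predicate):
    """Remove every document matching predicate; return how many were removed."""
    kept = [doc for doc in documents if not predicate(doc)]
    deleted = len(documents) - len(kept)
    documents[:] = kept  # modify the list in place
    return deleted


def delete_one(documents, predicate):
    """Remove only the first document matching predicate; return 0 or 1."""
    for index, doc in enumerate(documents):
        if predicate(doc):
            del documents[index]
            return 1
    return 0


def low_budget(doc):
    """Plain-Python stand-in for the filter {"Budget": {"$lt": 10}}."""
    return "Budget" in doc and doc["Budget"] < 10


# Made-up sample data: two documents have a budget below 10.
sample = [{"SlNo": 1, "Budget": 5}, {"SlNo": 2, "Budget": 8}, {"SlNo": 3, "Budget": 40}]

after_one = list(sample)
removed_one = delete_one(after_one, low_budget)

after_many = list(sample)
removed_many = delete_many(after_many, low_budget)

print(removed_one, after_one)
print(removed_many, after_many)
```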
UPDATE DOCUMENTS
import pymongo
# Connect to the MongoDB server
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]
collection = db["table1"]
# Update the "Budget" field by increasing it by 10 for all documents with "Genre" equal to "Thriller"
update_result = collection.update_many({"Genre": "Thriller"}, {"$inc": {"Budget": 10}})
# Print the number of documents updated
print(f"Updated {update_result.modified_count} documents")
client.close()
COMBINING CONDITIONS USING AND, OR
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Both conditions must hold: "Budget" greater than 20 and "Genre" equal to "Romance"
conditions = {
    "$and": [
        {"Budget": {"$gt": 20}},
        {"Genre": "Romance"}
    ]
}
# Find and print documents that meet the conditions
cursor = collection.find(conditions)
for document in cursor:
    print(document)
client.close()
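The program above shows only $and. As a sketch, the query documents below show how an $or condition could be written and nested; they are plain Python dictionaries, so the block runs without a MongoDB server. The "Comedy" value and the variable names are illustrative assumptions.

```python
# Explicit $and, as in the program above.
and_query = {"$and": [{"Budget": {"$gt": 20}}, {"Genre": "Romance"}]}

# Implicit AND: several fields in one query document must all match,
# so this selects the same documents as and_query.
implicit_and_query = {"Budget": {"$gt": 20}, "Genre": "Romance"}

# $or: a document matches if at least one of the listed conditions holds.
or_query = {"$or": [{"Genre": "Romance"}, {"Genre": "Comedy"}]}

# Nesting: ("Genre" is Romance OR Comedy) AND "Budget" greater than 20.
combined_query = {"$and": [or_query, {"Budget": {"$gt": 20}}]}

for name, query in [
    ("and", and_query),
    ("implicit and", implicit_and_query),
    ("or", or_query),
    ("combined", combined_query),
]:
    print(name, query)
```

Any of these dictionaries can be passed to collection.find() in place of conditions.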
Operators for queries
Comparison
• $eq: Value is equal to the specified value
• $ne: Value is not equal to the specified value
• $gt: Value is greater than the specified value
• $gte: Value is greater than or equal to the specified value
• $lt: Value is less than the specified value
• $lte: Value is less than or equal to the specified value
• $in: Value matches any of the values in the given array

Logical
• $and: Returns documents that match all of the given conditions
• $or: Returns documents that match at least one of the given conditions
• $nor: Returns documents that match none of the given conditions
• $not: Returns documents that do not match the given operator expression

Evaluation
The following operators assist in evaluating documents.
• $regex: Allows the use of regular expressions when evaluating field values
• $text: Performs a text search
• $where: Uses a JavaScript expression to match documents
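To make the meaning of the comparison and logical operators concrete, here is a minimal in-memory matcher written in plain Python. It is a simplified sketch, not how MongoDB evaluates queries: the function names and sample data are hypothetical, and array fields, $not and the evaluation operators are left out.

```python
import operator

# Comparison operators mapped to plain Python functions.
COMPARISONS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, choices: value in choices,
}

_MISSING = object()  # marker for a field that is absent from the document


def compare(op, value, operand):
    """Apply one comparison operator; a missing field only satisfies $ne."""
    if value is _MISSING:
        return op == "$ne"
    try:
        return COMPARISONS[op](value, operand)
    except TypeError:  # values of incomparable types do not match
        return False


def matches(document, query):
    """Return True if the document satisfies every part of the query."""
    for key, condition in query.items():
        if key == "$and":
            ok = all(matches(document, sub) for sub in condition)
        elif key == "$or":
            ok = any(matches(document, sub) for sub in condition)
        elif key == "$nor":
            ok = not any(matches(document, sub) for sub in condition)
        elif isinstance(condition, dict):  # e.g. {"Budget": {"$gt": 50}}
            value = document.get(key, _MISSING)
            ok = all(compare(op, value, arg) for op, arg in condition.items())
        else:  # plain equality, e.g. {"Genre": "Thriller"}
            ok = document.get(key, _MISSING) == condition
        if not ok:
            return False
    return True


# Made-up sample documents.
movies = [
    {"SlNo": 1, "Genre": "Thriller", "Budget": 60},
    {"SlNo": 2, "Genre": "Romance", "Budget": 25},
    {"SlNo": 3, "Genre": "Comedy", "Budget": 8},
    {"SlNo": 4, "Budget": 95},
]


def select(query):
    """Return the SlNo of every sample document that matches the query."""
    return [doc["SlNo"] for doc in movies if matches(doc, query)]


print(select({"Budget": {"$gt": 50}}))
print(select({"$or": [{"Genre": "Thriller"}, {"Budget": {"$lt": 10}}]}))
print(select({"$nor": [{"Genre": "Thriller"}, {"Budget": {"$lt": 10}}]}))
```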
AGGREGATIONS (GROUPING)
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Define the aggregation pipeline: group by "Genre" and average the "Budget" of each group
pipeline = [
    {"$group": {
        "_id": "$Genre",
        "average_budget": {"$avg": "$Budget"}
    }}
]
result = list(collection.aggregate(pipeline))
for entry in result:
    genre = entry["_id"]
    average_budget = entry["average_budget"]
    print(f"Genre: {genre}, Average Budget: {average_budget}")
client.close()
OUTPUT:
Genre: Drama, Average Budget: 25.242424242424242
Genre: Action, Average Budget: 62.285714285714285
Genre: Comedy, Average Budget: 25.27777777777778
Genre: Romance, Average Budget: 25.16
Genre: Thriller, Average Budget: 31.307692307692307
Genre: None, Average Budget: 95.0
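The grouping step can be mimicked in plain Python to see what $group with $avg computes. The sketch below is an in-memory approximation with made-up data and a hypothetical function name. It also suggests where a "Genre: None" row can come from: documents that have no "Genre" field, such as records inserted with only a serial number and a budget, end up in one group whose key is None.

```python
from collections import defaultdict


def average_budget_by_genre(documents):
    """Group documents by "Genre" and average the numeric "Budget" values."""
    budgets = defaultdict(list)
    for doc in documents:
        genre = doc.get("Genre")  # a missing "Genre" becomes the key None
        budgets[genre]  # create the group even if this document has no budget
        if isinstance(doc.get("Budget"), (int, float)):
            budgets[genre].append(doc["Budget"])
    # Documents without a numeric "Budget" are left out of the average.
    return {
        genre: (sum(values) / len(values) if values else None)
        for genre, values in budgets.items()
    }


# Made-up sample documents.
documents = [
    {"Genre": "Action", "Budget": 60},
    {"Genre": "Action", "Budget": 70},
    {"Genre": "Romance", "Budget": 25},
    {"SlNo": 181, "Budget": 95},        # no "Genre"
    {"SlNo": 191, "Budget": 95},        # no "Genre"
    {"SlNo": 171, "Genre": "Action"},   # no "Budget"
]

averages = average_budget_by_genre(documents)
for genre, average_budget in averages.items():
    print(f"Genre: {genre}, Average Budget: {average_budget}")
```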
SORT
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]  # Replace with your database name
collection = db["table1"]  # Replace with your collection name
# Define the aggregation pipeline
pipeline = [
    {"$sort": {"Budget": -1}},  # Sort by "Budget" in descending order
    {"$limit": 1}  # Limit the result to one document
]
# Perform the aggregation
result = list(collection.aggregate(pipeline))
# Print the first document (the one with the highest budget)
if result:
    document = result[0]
    print("Document with highest budget:")
    print(document)
else:
    print("No documents found.")
client.close()
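The effect of a descending sort followed by a limit can be sketched on an in-memory list. This is a plain-Python approximation with made-up data and a hypothetical helper name; as a simplification it skips documents that have no "Budget" field.

```python
def top_by_budget(documents, limit=1):
    """Return up to `limit` documents with the highest "Budget", highest first."""
    with_budget = [doc for doc in documents if "Budget" in doc]
    ranked = sorted(with_budget, key=lambda doc: doc["Budget"], reverse=True)
    return ranked[:limit]


# Made-up sample documents.
movies = [
    {"SlNo": 1, "Genre": "Thriller", "Budget": 60},
    {"SlNo": 2, "Genre": "Romance", "Budget": 25},
    {"SlNo": 3, "Genre": "Action"},
    {"SlNo": 4, "Budget": 95},
]

result = top_by_budget(movies)
if result:
    print("Document with highest budget:")
    print(result[0])
else:
    print("No documents found.")
```

With pymongo, the same query can also be written on a cursor instead of a pipeline: collection.find().sort("Budget", pymongo.DESCENDING).limit(1).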
CREATE AND INSERT
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client["manjudb"]
# Create a new collection named "movie" and insert several documents into it
db.create_collection("movie")
collection = db["movie"]
new_records = [
    {"mid": 181, "Budget": 95},
    {"mid": 191, "Budget": 95},
    {"mid": 171, "Genre": "Action"},
]
collection.insert_many(new_records)
client.close()
LIMIT
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017")
# Select the database and collection
db = client["manjudb"]
collection = db["sample"]
# Perform the find operation, returning at most 2 documents
results = collection.find().limit(2)
for document in results:
    print(document)
client.close()
INDEXING
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017")
db = client["manjudb"]
collection = db["table1"]
# Create an ascending index on the "Budget" field (use pymongo.DESCENDING for descending order)
index_key = [("Budget", pymongo.ASCENDING)]
collection.create_index(index_key)
# Example query that filters on the indexed field
query = {"Budget": 10}
results = collection.find(query).limit(5)
for document in results:
    print(document)
client.close()
