Unit 4
If you have downloaded the source package to your computer, you can install it with setup.py as
follows:
Second, use the following statement to create a new database named suppliers on the
PostgreSQL database server:
CREATE DATABASE suppliers;
The connect() function creates a new database session and returns a new instance of the
connection class. By using the connection object, you can create a new cursor to execute
any SQL statements.
To call the connect() function, you specify the PostgreSQL connection parameters as keyword
arguments (or as a single connection string) and pass them to the function like this:
import psycopg2

conn = psycopg2.connect(
    host="localhost",
    database="suppliers",
    user="postgres",
    password="Abcd1234")
To make it more convenient, you can use a configuration file to store all connection
parameters.
[postgresql]
host=localhost
database=suppliers
user=postgres
password=SecurePas$1
By using the database.ini, you can change the PostgreSQL connection parameters when
you move the code to the production environment without modifying the code.
Note that if you use Git, you need to add database.ini to the .gitignore file so that you do
not commit the sensitive information to a public repository such as GitHub. The .gitignore
file will contain a line like this:
database.ini
The following config() function reads the database.ini file and returns the connection
parameters. The config() function is placed in the config.py file:
#!/usr/bin/python
from configparser import ConfigParser
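The listing above shows only the opening lines of config.py. A minimal sketch of how such a config() function might be completed, assuming the database.ini layout shown earlier. The default arguments (filename, section) and the error raised for a missing section are assumptions, not part of the original listing:

```python
from configparser import ConfigParser


def config(filename="database.ini", section="postgresql"):
    """Read one section of an INI file and return it as a dict."""
    # interpolation=None keeps characters such as '%' in passwords literal
    parser = ConfigParser(interpolation=None)
    # read() silently skips missing files, so the section check below
    # also covers the "file not found" case
    parser.read(filename)
    if not parser.has_section(section):
        raise Exception(f"Section {section} not found in the {filename} file")
    # items() yields (key, value) pairs for the requested section
    return dict(parser.items(section))
```

The returned dictionary uses the same keys as the keyword arguments of psycopg2.connect(), so it can be passed on as psycopg2.connect(**params).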
The following connect() function connects to the suppliers database and prints out the
PostgreSQL database version.
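#!/usr/bin/python
import psycopg2
from config import config

def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
        # connect to the PostgreSQL server
        conn = psycopg2.connect(**params)
        # create a cursor
        cur = conn.cursor()
        # execute a statement
        print('PostgreSQL database version:')
        cur.execute('SELECT version()')
        # display the PostgreSQL database server version
        print(cur.fetchone())
        # close the cursor
        cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        # always close the connection, even after an error
        if conn is not None:
            conn.close()

if __name__ == '__main__':
    connect()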
How it works.
If the program prints the PostgreSQL version, you have successfully connected to the PostgreSQL database server.
Troubleshooting
The connect() function raises the DatabaseError exception if an error occurs. To see
how it works, you can change the connection parameters in the database.ini file.
For example, if you change the host to localhosts, the program will output the following
message:
The following error message is displayed when you change the database to one that does
not exist, e.g., supplier:
If you change the user to postgress, authentication fails as follows:
Understanding psycopg2
To connect to a database that already exists on your system or on the Internet, you
have to tell Python how to talk to it. In other words, you have to tell Python that the
database you are interested in is a PostgreSQL database.
In Python, you have several options to choose from. In this case, we will use
psycopg2, probably the most popular PostgreSQL database adapter for Python. Psycopg2
requires a few prerequisites to work properly on your computer. Once you have installed
them (read the documentation for more information), you can install psycopg2 just like any
other Python package:
pip install psycopg2
However, if you want to get started with psycopg2 quickly, you can instead install
psycopg2-binary, a stand-alone version of the package that does not require a compiler or
external libraries. This is the preferred installation for new users:
pip install psycopg2-binary
Finally, if you are using Python in a Conda environment, you should install psycopg2 using
the Anaconda installation:
Now that you’re all set, let’s create your first connection to a PostgreSQL database with
psycopg2!
For this tutorial, we will connect to a database called “datacamp_courses” that is hosted
locally.
The specification gives us quite a bit of information about the table's columns. The table's
primary key should be course_id (in the specification, it is the only column shown in bold),
and its data type should be integer. A primary key is a constraint that forces the column
values to be non-null and unique. It lets you uniquely identify each row in the table.
The remaining columns provide information about the course name, the name of the course
instructor, and the topic of the course.
Before creating the table, it’s important to explain how the connection instance you’ve just
created works. In essence, the connection encapsulates a database session. It allows you
to execute SQL commands and queries, such as SELECT, INSERT, CREATE, UPDATE,
or DELETE, through a cursor created with the cursor() method, and to make changes
persistent using the commit() method.
Once you have created the cursor instance, you can send commands to the database using the
execute() method and retrieve data from a table using fetchone(), fetchmany(), or fetchall().
Finally, it’s important to close the cursor and the connection to the database whenever you’ve
finished your operations. Otherwise, they will continue to hold server-side resources. To do
so, you can use the close() method.
Below you can find the code to create the data_courses table:
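The table-creation listing is not reproduced here. As a hedged sketch of the cursor, execute, commit, and close sequence described above, the following uses Python's built-in sqlite3 module as a stand-in, because it follows the same DB-API pattern and runs without a server. The column types are assumptions based on the specification discussed earlier. With psycopg2 you would obtain conn from psycopg2.connect(...) instead, and would typically declare course_id as SERIAL so that PostgreSQL numbers the rows automatically.

```python
import sqlite3

# Stand-in for psycopg2.connect(...): an in-memory SQLite database
conn = sqlite3.connect(":memory:")

# The cursor sends commands to the database
cur = conn.cursor()
cur.execute(
    """
    CREATE TABLE data_courses (
        course_id INTEGER PRIMARY KEY,
        course_name TEXT,
        course_instructor TEXT,
        topic TEXT
    )
    """
)

# Make the change persistent
conn.commit()

# Confirm the table exists by listing its columns (SQLite-specific query)
cur.execute("PRAGMA table_info(data_courses)")
columns = [row[1] for row in cur.fetchall()]
print(columns)

# Release the cursor and the connection
cur.close()
conn.close()
```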
INSERT
You may have noticed that the table has no values so far. To create records in the
data_courses table, we need the INSERT command.
cur = conn.cursor()
conn.commit()
cur.close()
conn.close()
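The listing above omits the INSERT statements themselves. A minimal sketch of inserting rows with query parameters rather than string formatting, using sqlite3 as a stand-in for a PostgreSQL connection. The instructor names are placeholders and the rows are hypothetical sample data. Note that sqlite3 uses ? placeholders, whereas psycopg2 uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for psycopg2.connect(...)
cur = conn.cursor()
cur.execute(
    "CREATE TABLE data_courses ("
    "course_id INTEGER PRIMARY KEY, course_name TEXT, "
    "course_instructor TEXT, topic TEXT)"
)

# Hypothetical sample rows; the instructor names are placeholders
courses = [
    ("Introduction to SQL", "Instructor A", "Julia"),
    ("Introduction to Statistics in R", "Instructor B", "R"),
]

# Placeholders let the driver quote the values safely.
# sqlite3 uses "?"; with psycopg2 the placeholder would be "%s".
cur.executemany(
    "INSERT INTO data_courses (course_name, course_instructor, topic) "
    "VALUES (?, ?, ?)",
    courses,
)
conn.commit()

# Count the rows to confirm that both inserts were stored
cur.execute("SELECT COUNT(*) FROM data_courses")
inserted = cur.fetchone()[0]
print(inserted)

cur.close()
conn.close()
```

Passing values as parameters instead of building the SQL string by hand avoids quoting mistakes and SQL injection.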
SELECT
Reading data in SQL databases is probably something you will do a lot in your data science
journey. This is generally called a SELECT query. For now, let's see how the table
data_courses is holding up.
We will use the classic SELECT * FROM table_name statement to read all the data
available in the table. Then, we will use the fetchall() method to fetch all the available
rows. Notice that PostgreSQL automatically creates a numerical index for the course_id
column.
cur = conn.cursor()
cur.execute('SELECT * FROM data_courses;')
rows = cur.fetchall()
conn.commit()
conn.close()
for row in rows:
    print(row)
UPDATE
Data often comes with errors. You may have noticed in the previous section that the topic
associated with the course “Introduction to SQL” is Julia. After checking the information
about the course, we discovered the mistake. We need to change it and write “SQL” instead.
This can be done with the UPDATE statement, as follows:
cur = conn.cursor()
cur.execute("UPDATE data_courses SET topic = 'SQL' "
            "WHERE course_name = 'Introduction to SQL';")
conn.commit()
conn.close()
DELETE
Finally, you may want to delete one of the records in your table. For example, let’s delete the
course Introduction to Statistics in R:
cur = conn.cursor()
cur.execute("DELETE FROM data_courses "
            "WHERE course_name = 'Introduction to Statistics in R'")
conn.commit()
cur.close()
ORDER BY
Say you want to sort the rows of your table by the name of the instructor. You can use the
ORDER BY clause:
cur = conn.cursor()
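The listing above stops after the cursor is created. A sketch of the complete pattern, again using sqlite3 as a stand-in for a PostgreSQL connection. The course names are hypothetical; the query itself would look the same when sent through psycopg2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
cur = conn.cursor()
cur.execute(
    "CREATE TABLE data_courses (course_name TEXT, course_instructor TEXT)"
)

# Hypothetical rows, inserted out of order on purpose
cur.executemany(
    "INSERT INTO data_courses VALUES (?, ?)",
    [
        ("Course 1", "Izzy Weber"),
        ("Course 2", "EbunOluwa Andrew"),
        ("Course 3", "James Chapman"),
    ],
)

# ORDER BY sorts the result set; the stored table is not changed
cur.execute(
    "SELECT course_instructor, course_name "
    "FROM data_courses ORDER BY course_instructor"
)
rows = cur.fetchall()
for row in rows:
    print(row)

cur.close()
conn.close()
```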
GROUP BY
You may want to perform some aggregate functions within different groups of data. For
example, you may be interested in calculating the number of courses by the different course
instructors. You can do this kind of operation with the GROUP BY clause.
cur = conn.cursor()
cur.execute('SELECT course_instructor, COUNT(*) FROM data_courses '
            'GROUP BY course_instructor')
rows = cur.fetchall()
for row in rows:
    print(row)
('James Chapman', 2)
('Izzy Weber', 1)
('EbunOluwa Andrew', 1)
JOIN
Up to this point, we’ve only worked with the data_courses table. But you only start
leveraging the full potential of relational databases, like PostgreSQL, when you work with
multiple tables at once.
The magic tool to combine multiple tables is the JOIN operation. Imagine that we have a
second table in our database called programming_languages that contains basic information
about the top programming languages for data science, including the name, the position in the
TIOBE Index, and the number of courses about the programming language on DataCamp. The
table looks like this:
cur = conn.cursor()
cur.execute("""SELECT course_name, course_instructor, topic, tiobe_ranking
    FROM data_courses
    INNER JOIN programming_languages
    ON data_courses.topic = programming_languages.language_name""")
rows = cur.fetchall()
for row in rows:
    print(row)
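To see what an INNER JOIN keeps and what it drops, here is a self-contained sketch that uses sqlite3 as a stand-in for PostgreSQL. All rows, including the ranking values, are hypothetical and chosen only for illustration; the column names language_name and tiobe_ranking follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
cur = conn.cursor()
cur.execute("CREATE TABLE data_courses (course_name TEXT, topic TEXT)")
cur.execute(
    "CREATE TABLE programming_languages "
    "(language_name TEXT, tiobe_ranking INTEGER)"
)

# Hypothetical courses; "Theory" has no matching language row
cur.executemany(
    "INSERT INTO data_courses VALUES (?, ?)",
    [("Course A", "Python"), ("Course B", "SQL"), ("Course C", "Theory")],
)
# Hypothetical ranking values, not real TIOBE positions
cur.executemany(
    "INSERT INTO programming_languages VALUES (?, ?)",
    [("Python", 1), ("SQL", 2)],
)

# INNER JOIN returns only the rows that have a match in both tables
cur.execute(
    """
    SELECT course_name, topic, tiobe_ranking
    FROM data_courses
    INNER JOIN programming_languages
        ON data_courses.topic = programming_languages.language_name
    ORDER BY course_name
    """
)
rows = cur.fetchall()
for row in rows:
    print(row)

cur.close()
conn.close()
```

"Course C" is missing from the result because its topic has no counterpart in programming_languages.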
Triggers resemble stored procedures. A trigger can execute SQL and PL/SQL statements as a
unit and invoke stored procedures. However, procedures and triggers are activated in
different ways.
A cursor is a database object that retrieves rows from the database row by row; it is
primarily used to reduce network traffic. In addition to traversing records in a
database, cursors facilitate data retrieval and the addition and deletion of records.
Advantages:
They are useful for processing each row individually and validating each row
individually.
Utilizing cursors allows for enhanced concurrent control.
While loops execute more slowly than cursors.
Features:
A cursor keeps track of the current position in the result set. It enables you to perform
multiple operations row by row against a result set, either with or without returning to
the original table.
Disadvantages:
They use additional resources each time, which may result in a network round trip.
An increase in the number of network round trips can degrade performance and slow
down the network.
Triggers can be defined on the table, view, schema, or database with which the event is
associated.
Benefits of Triggers
Creating Triggers
The syntax for creating a trigger is −
Where,
Example
To start with, we will be using the CUSTOMERS table we had created and used in the
previous chapters −
+----+----------+-----+-----------+----------+
| ID | NAME | AGE | ADDRESS | SALARY |
+----+----------+-----+-----------+----------+
| 1 | Ramesh | 32 | Ahmedabad | 2000.00 |
| 2 | Khilan | 25 | Delhi | 1500.00 |
| 3 | kaushik | 23 | Kota | 2000.00 |
| 4 | Chaitali | 25 | Mumbai | 6500.00 |
| 5 | Hardik | 27 | Bhopal | 8500.00 |
| 6 | Komal | 22 | MP | 4500.00 |
+----+----------+-----+-----------+----------+
The following program creates a row-level trigger for the customers table that would fire for
INSERT or UPDATE or DELETE operations performed on the CUSTOMERS table. This
trigger will display the salary difference between the old values and new values −
When the above code is executed at the SQL prompt, it produces the following result −
Trigger created.
OLD and NEW references are not available for table-level triggers; you can use them
only in row-level triggers.
If you want to query the table in the same trigger, you should use the AFTER
keyword, because triggers can query the table or change it again only after the initial
changes are applied and the table is back in a consistent state.
The above trigger has been written so that it fires before any DELETE,
INSERT, or UPDATE operation on the table, but you can write a trigger for a
single operation or for several, for example BEFORE DELETE, which fires
whenever a record is deleted from the table.
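The OLD and NEW references of a row-level trigger can be tried out without an Oracle instance. The following is a minimal sketch that uses SQLite (through Python's sqlite3 module) as a stand-in, not the PL/SQL trigger from this section. SQLite has no equivalent of dbms_output, so the sketch writes the values to a hypothetical salary_log table; the trigger name log_salary_changes is also an assumption. Unlike Oracle, a SQLite trigger is bound to a single event, here UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
# Hypothetical audit table that receives the trigger's output
cur.execute(
    "CREATE TABLE salary_log "
    "(id INTEGER, old_salary REAL, new_salary REAL, diff REAL)"
)

# Row-level trigger: OLD holds the row before the update, NEW the row after
cur.execute(
    """
    CREATE TRIGGER log_salary_changes
    AFTER UPDATE OF salary ON customers
    FOR EACH ROW
    BEGIN
        INSERT INTO salary_log
        VALUES (OLD.id, OLD.salary, NEW.salary, NEW.salary - OLD.salary);
    END
    """
)

cur.execute("INSERT INTO customers VALUES (2, 'Khilan', 1500.00)")
# This UPDATE fires the trigger once for the affected row
cur.execute("UPDATE customers SET salary = salary + 500 WHERE id = 2")
conn.commit()

cur.execute("SELECT old_salary, new_salary, diff FROM salary_log")
log = cur.fetchall()
print(log)  # [(1500.0, 2000.0, 500.0)]

cur.close()
conn.close()
```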
Triggering a Trigger
Let us perform some DML operations on the CUSTOMERS table. Here is one INSERT
statement, which will create a new record in the table −
When a record is created in the CUSTOMERS table, the trigger created above,
display_salary_changes, fires and displays the following result −
Old salary:
New salary: 7500
Salary difference:
Because this is a new record, the old salary is not available, so the old salary and the salary difference are shown as null.
Let us now perform one more DML operation on the CUSTOMERS table. The UPDATE
statement will update an existing record in the table −
UPDATE customers
SET salary = salary + 500
WHERE id = 2;
When a record is updated in the CUSTOMERS table, the trigger created above,
display_salary_changes, fires and displays the old salary (1500), the new salary (2000),
and the salary difference (500).
CURSORS
When Oracle processes an SQL statement, it creates a memory area known as the context area.
A cursor is a pointer to this context area. It contains all the information needed for
processing the statement. In PL/SQL, the context area is controlled through the cursor. A
cursor holds information about a SELECT statement and the rows of data accessed by it.
A cursor lets a program fetch and process the rows returned by the SQL
statement, one at a time. There are two types of cursors:
Implicit Cursors
Explicit Cursors
Implicit cursors are created by default to process DML statements such as INSERT,
UPDATE, and DELETE when they are executed.
Oracle provides attributes, known as implicit cursor attributes, to check the status of
DML operations. Some of them are: %FOUND, %NOTFOUND, %ROWCOUNT, and
%ISOPEN.
For example, when you execute SQL statements such as INSERT, UPDATE, or DELETE,
the cursor attributes tell you whether any rows were affected and how many. If you run a
SELECT INTO statement in a PL/SQL block, the implicit cursor attributes
can be used to find out whether any row was returned by the SELECT statement. The
statement raises an error if no data is selected.
The following table specifies the status of the cursor for each of its attributes.
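+-----------+--------------------------------------------------------------------+
| Attribute | Description                                                        |
+-----------+--------------------------------------------------------------------+
| %FOUND    | Returns TRUE if a DML statement such as INSERT, DELETE, or UPDATE  |
|           | affected at least one row, or a SELECT INTO statement returned one |
|           | or more rows. Otherwise it returns FALSE.                          |
+-----------+--------------------------------------------------------------------+
| %NOTFOUND | Returns TRUE if a DML statement such as INSERT, DELETE, or UPDATE  |
|           | affected no rows, or a SELECT INTO statement returned no rows.     |
|           | Otherwise it returns FALSE. It is the logical opposite of %FOUND.  |
+-----------+--------------------------------------------------------------------+
| %ISOPEN   | Always returns FALSE for implicit cursors, because the SQL cursor  |
|           | is closed automatically after its SQL statement has executed.      |
+-----------+--------------------------------------------------------------------+
| %ROWCOUNT | Returns the number of rows affected by a DML statement such as     |
|           | INSERT, DELETE, or UPDATE, or returned by a SELECT INTO statement. |
+-----------+--------------------------------------------------------------------+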
Let's execute the following program to update the table and increase salary of each customer
by 5000. Here, SQL%ROWCOUNT attribute is used to determine the number of rows
affected:
Create procedure:
DECLARE
   total_rows number(2);
BEGIN
   UPDATE customers
   SET salary = salary + 5000;
   IF sql%notfound THEN
      dbms_output.put_line('no customers updated');
   ELSIF sql%found THEN
      total_rows := sql%rowcount;
      dbms_output.put_line( total_rows || ' customers updated ');
   END IF;
END;
/
Output:
6 customers updated
PL/SQL procedure successfully completed.
Now, if you check the records in the customers table, you will find that the rows have been updated.
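A similar check can be made from Python, where DB-API cursors expose a rowcount attribute that plays a role comparable to SQL%ROWCOUNT. The following is a minimal sketch that uses sqlite3 as a stand-in database with three hypothetical rows; it is an analogy, not a translation of the PL/SQL block above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, salary REAL)")
# Hypothetical sample rows
cur.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, 2000.0), (2, 1500.0), (3, 2000.0)],
)

# After a DML statement, rowcount holds the number of affected rows
cur.execute("UPDATE customers SET salary = salary + 5000")
total_rows = cur.rowcount
if total_rows == 0:
    print("no customers updated")
else:
    print(f"{total_rows} customers updated")

# An UPDATE that matches nothing reports zero affected rows
cur.execute("UPDATE customers SET salary = 0 WHERE id = 99")
no_match = cur.rowcount

conn.commit()
cur.close()
conn.close()
```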
Steps:
You must follow these steps when working with an explicit cursor.
1. Declare the cursor to initialize it in memory.
2. Open the cursor to allocate memory.
3. Fetch rows from the cursor to retrieve data.
4. Close the cursor to release the allocated memory.
The syntax for each step is:
CURSOR cursor_name IS
   SELECT statement;
OPEN cursor_name;
FETCH cursor_name INTO variable_list;
CLOSE cursor_name;
Create procedure:
Execute the following program to retrieve the customer name and address.
DECLARE
   c_id customers.id%type;
   c_name customers.name%type;
   c_addr customers.address%type;
   CURSOR c_customers is
      SELECT id, name, address FROM customers;
BEGIN
   OPEN c_customers;
   LOOP
      FETCH c_customers into c_id, c_name, c_addr;
      EXIT WHEN c_customers%notfound;
      dbms_output.put_line(c_id || ' ' || c_name || ' ' || c_addr);
   END LOOP;
   CLOSE c_customers;
END;
/
Output:
1 Ramesh Allahabad
2 Suresh Kanpur
3 Mahesh Ghaziabad
4 Chandan Noida
5 Alex Paris
6 Sunita Delhi
PL/SQL procedure successfully completed.
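The open, fetch, and close steps of an explicit cursor map closely onto a Python DB-API cursor. The following is a hedged sketch of the same loop, using sqlite3 as a stand-in database and two rows taken from the CUSTOMERS table shown earlier; it is an analogy to the PL/SQL block above, not a replacement for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ramesh", "Ahmedabad"), (2, "Khilan", "Delhi")],
)

# Declare and open: create the cursor and run the query
cur = conn.cursor()
cur.execute("SELECT id, name, address FROM customers ORDER BY id")

lines = []
while True:
    # Fetch: retrieve one row at a time
    row = cur.fetchone()
    # fetchone() returns None when no rows are left (like %NOTFOUND)
    if row is None:
        break
    c_id, c_name, c_addr = row
    lines.append(f"{c_id} {c_name} {c_addr}")

# Close: release the cursor and the connection
cur.close()
conn.close()

print("\n".join(lines))
```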