Final Assignment

The document describes a final project that involves processing images within a ZIP file to search for keywords and faces. The task is to write Python code to extract images from a ZIP file of newspaper pages, detect faces within each image using OpenCV, perform optical character recognition (OCR) on the text using Tesseract, and generate contact sheets of faces found on pages mentioning the searched keywords. Example output is provided showing contact sheets of faces detected on pages containing the words "Christopher" and "Mark" when searching the small and large ZIP files, respectively. Code is provided to implement the necessary classes and functions to complete these tasks, including loading the images from the ZIP, detecting faces, recognizing text, and generating the contact sheets.

final_assignment file:///C:/Users/gsiddharth/Desktop/final_assignment.html

The Project
1. This is a project with minimal scaffolding. Expect to use the discussion forums to gain insights! It's not cheating to ask others for opinions or perspectives!
2. Be inquisitive, try out new things.
3. Use the previous modules for insights into how to complete the functions! You'll have to combine Pillow, OpenCV, and Pytesseract.
4. There are 4 functions you need to complete to have a working project. These functions are described using the RE formula, which stands for Requires and Effects. Each function will have its own RE located directly above the function definition. The Requires section describes what is needed in the function arguments (in between the parentheses of the function definition). The Effects portion outlines what the function is supposed to do.
5. There are hints provided in Coursera; feel free to explore the hints if needed. Each hint provides progressively more detail on how to solve the issue. This project is intended to be comprehensive and difficult if you do it without the hints.
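As an illustration of the RE convention described in item 4, a function header might look like the sketch below. The function name `count_keyword` and its parameters are hypothetical and are not part of the assignment scaffolding; only the Requires/Effects layout is taken from the description above.

```python
# Requires: text is a str (for example, the OCR output for one page);
#           keyword is a non-empty str.
# Effects:  returns the number of times keyword occurs in text,
#           ignoring letter case.
def count_keyword(text, keyword):
    return text.lower().count(keyword.lower())


print(count_keyword("Pizza night: free pizza for all", "pizza"))
```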

The Assignment
Take a ZIP file (https://en.wikipedia.org/wiki/Zip_(file_format)) of images and process them, using a library built into Python (https://docs.python.org/3/library/zipfile.html) that you need to learn how to use. A ZIP file takes several different files and compresses them, thus saving space, into one single file. The files in the ZIP file we provide are newspaper images (like you saw in week 3). Your task is to write Python code which allows one to search through the images looking for the occurrences of keywords and faces. E.g. if you search for "pizza" it will return a contact sheet of all of the faces which were located on the newspaper page which mentions "pizza". This will test your ability to learn a new library (https://docs.python.org/3/library/zipfile.html), your ability to use OpenCV to detect faces, your ability to use tesseract to do optical character recognition, and your ability to use PIL to composite images together into contact sheets.
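A minimal sketch of the `zipfile` part of the task, using only the standard library. The helper name `read_members` and the in-memory test archive are assumptions made for illustration; the member contents here are placeholder bytes, not real PNG data.

```python
import io
import zipfile


def read_members(zip_source):
    """Return {member name: raw bytes} for every file in a ZIP archive.

    zip_source may be a path or a file-like object; zipfile.ZipFile accepts both.
    """
    contents = {}
    with zipfile.ZipFile(zip_source, "r") as archive:
        for info in archive.infolist():
            if info.is_dir():  # skip directory entries, keep only files
                continue
            with archive.open(info) as member:
                contents[info.filename] = member.read()
    return contents


# Build a tiny archive in memory so the sketch runs without any files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a-0.png", b"placeholder bytes, not a real image")
    archive.writestr("a-1.png", b"more placeholder bytes")

pages = read_members(buffer)
print(sorted(pages))
```

With real image data, each value could then be handed to Pillow, for example via `Image.open(io.BytesIO(data))`.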

Each page of the newspapers is saved as a single PNG image in a file called images.zip (./readonly/images.zip). These newspapers are in English, and contain a variety of stories, advertisements and images. Note: This file is fairly large (~200 MB) and may take some time to work with; I would encourage you to use small_img.zip (./readonly/small_img.zip) for testing.

Here's an example of the output expected. Using the small_img.zip (./readonly/small_img.zip) file, if I search for the string
"Christopher" I should see the following image:

[Image: Christopher Search]
If I were to use the images.zip (./readonly/images.zip) file and search for "Mark" I should see the following image (note that
there are times when there are no faces on a page, but a word is found!):

[Image: Mark Search]

Note: That big file can take some time to process - for me it took nearly ten minutes! Use the small one for testing.

1 of 4 4/29/2019, 2:49 PM

In [1]:
import math
import zipfile

import cv2 as cv
import numpy as np
import pytesseract
from PIL import Image


class scanned_pages:
    # Load the face detection classifier once, shared by all instances.
    face_cascade = cv.CascadeClassifier('readonly/haarcascade_frontalface_default.xml')
    thumbnail_width = 128
    thumbnail_height = 128

    def __init__(self, zip_file_name):
        self.zip_file = zipfile.ZipFile(zip_file_name, 'r')
        self.archive_member_names = self.zip_file.namelist()
        # Every dictionary below is keyed by the file name inside the archive.
        self.image_files = {x.filename: Image.open(self.zip_file.open(x))
                            for x in self.zip_file.infolist()}
        self.image_nparrs = {key: np.asarray(value)
                             for key, value in self.image_files.items()}
        self.face_coods_in_files = {key: self.get_face_cood_list(value)
                                    for key, value in self.image_nparrs.items()}
        self.faces_in_files = {key: self.cut_faces(value, key)
                               for key, value in self.face_coods_in_files.items()}
        self.contact_sheets = {key: self.generate_contact_sheets(value)
                               for key, value in self.faces_in_files.items()}
        self.text_in_scanned = {key: self.ocr(value)
                                for key, value in self.image_files.items()}
        for key, value in self.image_files.items():
            value.close()
        self.zip_file.close()

    def get_face_cood_list(self, image_nparr):
        # Returns one (x, y, w, h) rectangle per detected face.
        return self.face_cascade.detectMultiScale(image_nparr, scaleFactor=1.3,
                                                  minNeighbors=5, minSize=(30, 30))

    def cut_faces(self, cood_list, filename):
        # Rows span y to y+h and columns span x to x+w. The original listing
        # swapped w and h, which only works when every rectangle is square.
        faces = [Image.fromarray(self.image_nparrs[filename][y:y + h, x:x + w])
                 for (x, y, w, h) in cood_list]
        return faces

    def generate_contact_sheets(self, faces_list):
        # Paste the face thumbnails into a grid five columns wide.
        if len(faces_list) == 0:
            return None
        for face in faces_list:
            # thumbnail() takes (width, height) and never enlarges an image.
            face.thumbnail((self.thumbnail_width, self.thumbnail_height))
        first_image = faces_list[0]
        num_rows = math.ceil(len(faces_list) / 5.0)
        contact_sheet = Image.new(first_image.mode,
                                  (self.thumbnail_width * 5, self.thumbnail_height * num_rows))
        x = 0
        y = 0
        for img in faces_list:
            contact_sheet.paste(img, (x, y))
            # Step by the fixed cell size: a face smaller than the thumbnail
            # size keeps its own width, so that width is not a reliable step.
            if x + self.thumbnail_width == contact_sheet.width:
                x = 0
                y = y + self.thumbnail_height
            else:
                x = x + self.thumbnail_width
        return contact_sheet

    def ocr(self, image):
        text = pytesseract.image_to_string(image)
        # The source listing breaks off after the line above; returning the
        # text is inferred from how __init__ stores the result of ocr().
        return text
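The two pieces of arithmetic the listing relies on, cropping a detected rectangle and placing thumbnails on a five-column grid, can be checked without OpenCV or Pillow. The sketch below is an illustration only: the helper names `crop_box`, `sheet_size`, and `paste_position` are hypothetical, while the 128-pixel thumbnail size and the five columns come from the listing.

```python
import math

COLUMNS = 5   # thumbnails per row, as in the listing
THUMB = 128   # thumbnail edge length in pixels, as in the listing


def crop_box(rect):
    """Convert an (x, y, w, h) rectangle to a (left, upper, right, lower) box."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


def sheet_size(num_faces):
    """Return (width, height) of a contact sheet holding num_faces thumbnails."""
    rows = math.ceil(num_faces / COLUMNS)
    return (COLUMNS * THUMB, rows * THUMB)


def paste_position(index):
    """Return the top-left (x, y) pixel position of the thumbnail at index."""
    row, col = divmod(index, COLUMNS)
    return (col * THUMB, row * THUMB)


print(crop_box((10, 20, 30, 40)))
print(sheet_size(7))
print([paste_position(i) for i in range(7)])
```

The `(left, upper, right, lower)` form is the box layout that Pillow's `Image.crop` expects, so the same conversion applies if the faces are cut from the Pillow image instead of the NumPy array.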


*****Testing for Chris in small file*****


Results found in file a-0.png

Results found in file a-3.png

*****Testing for Mark in large file*****


Results found in file a-0.png

Results found in file a-1.png

Results found in file a-10.png


But there were no faces in that file
Results found in file a-13.png


Results found in file a-2.png

Results found in file a-3.png

Results found in file a-8.png


But there were no faces in that file
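The code that produced this output is not included in the listing. A loop along the following lines would print messages of the same shape; this is a hypothetical sketch, not the author's implementation. The function name `search`, the sorted file order, the case-sensitive substring match, and the sample data are all assumptions; only the two message strings are taken from the output above.

```python
def search(keyword, text_by_file, sheet_by_file):
    """Return report lines for every file whose OCR text contains keyword.

    text_by_file maps file name -> recognized text.
    sheet_by_file maps file name -> contact sheet, or None when no faces were found.
    """
    lines = []
    for name in sorted(text_by_file):  # assumption: report files in sorted order
        if keyword in text_by_file[name]:
            lines.append("Results found in file {}".format(name))
            if sheet_by_file.get(name) is None:
                lines.append("But there were no faces in that file")
    return lines


# Made-up stand-ins for the text_in_scanned and contact_sheets dictionaries.
texts = {
    "a-0.png": "Christopher visited the fair",
    "a-1.png": "weather report",
    "a-2.png": "Christopher again",
}
sheets = {"a-0.png": "sheet-0", "a-1.png": None, "a-2.png": None}

report = search("Christopher", texts, sheets)
for line in report:
    print(line)
```

In a notebook, the contact sheet for each matching file would typically be shown with `display(...)` right after the "Results found" line.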
