Python and CSV: Readers

CSV (comma-separated values) is a common file format that represents data as rows of values separated by commas. It is often used to export data from spreadsheets and databases. Python's csv module reads and writes CSV data. It parses CSV files or strings into rows, where each row is a list of string values. The functions csv.reader() and csv.writer() provide an easy way to work with CSV data in Python programs.


Williams College Lecture 12 Brent Heeringa, Bill Jannen

CSV
Comma Separated Values (CSV) is a common data file format that represents data as rows of values, separated by
a delimiter, which is typically a comma. Data in spreadsheets and databases matches this format nicely, so CSV is
often used as an export format. If you work in data science, CSV is ubiquitous, so it makes sense to spend some
time learning more about the format and developing skills to manipulate data once it's read into memory.
Here is some example CSV data representing financial information from Apple Computer.

Date,Open,High,Low,Close,Volume,Adj Close
2009-12-31,213.13,213.35,210.56,210.73,88102700,28.40
2009-12-30,208.83,212.00,208.31,211.64,103021100,28.52
2009-12-29,212.63,212.72,208.73,209.10,111301400,28.18
2009-12-28,211.72,213.95,209.61,211.61,161141400,28.51

header: Many CSV files start with an initial header row, which gives column names for the data.

data: Data in CSVs is separated by commas, but any delimiter can be used.

Python and CSV: Readers

Suppose the contents of the above CSV were in a file called aapl.csv. One could open that CSV and read
through the data using the csv module and the following syntax.

    import csv

    with open('aapl.csv', 'r') as fin:
        print(list(csv.reader(fin)))

The reader object is iterable. Each row is a list of strings that were split by the delimiter, which by default is
the comma. This yields the following output.

    [['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close'],
     ['2009-12-31', '213.13', '213.35', '210.56', '210.73', '88102700', '28.40'],
     ['2009-12-30', '208.83', '212.00', '208.31', '211.64', '103021100', '28.52'],
     ['2009-12-29', '212.63', '212.72', '208.73', '209.10', '111301400', '28.18'],
     ['2009-12-28', '211.72', '213.95', '209.61', '211.61', '161141400', '28.51']]
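
Because the reader is iterable, rows can also be pulled out one at a time with next(). The sketch below is only an illustration: it uses an in-memory stand-in (the hypothetical name sample) for the first lines of aapl.csv so that it runs without the file, and the names header, first_row and record are likewise made up for the example.

```python
import csv
import io

# Hypothetical in-memory stand-in for the first lines of aapl.csv.
sample = io.StringIO(
    "Date,Open,High,Low,Close,Volume,Adj Close\n"
    "2009-12-31,213.13,213.35,210.56,210.73,88102700,28.40\n"
    "2009-12-30,208.83,212.00,208.31,211.64,103021100,28.52\n"
)

reader = csv.reader(sample)
header = next(reader)       # the first row holds the column names
first_row = next(reader)    # every later row is a list of strings

# Pair each column name with the matching value from the first data row.
record = dict(zip(header, first_row))
print(record["High"])       # the string '213.35', not a float
```

Note that every value is still a string at this point. Any arithmetic or numeric comparison needs an explicit int() or float() conversion.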

If I wanted to find the highest stock price over the time period given by this data, I could write:

    import csv
    HIGHPRICECOL = 2

    with open('aapl.csv', 'r') as fin:
        data = list(csv.reader(fin))
        # Convert to float: the reader returns strings, and strings compare alphabetically.
        prices = [float(row[HIGHPRICECOL]) for row in data[1:]]
        print(max(prices))

This code makes several assumptions and uses some new Python constructs, all of which are worth mentioning:

• all the data is held in memory at once so this is not a streaming algorithm;

• we use a list comprehension when assigning the prices variable; and

• we assume that we know the column index for price, which is 2.
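
One more point worth keeping in mind: csv.reader returns every field as a string, so a numeric comparison needs an explicit conversion. The small sketch below uses made-up prices (not taken from aapl.csv) to show how max behaves on strings versus floats.

```python
# Made-up prices, in the form csv.reader would return them (strings).
values = ["9.50", "10.25", "100.00"]

max_as_text = max(values)                       # compares character by character
max_as_number = max(float(v) for v in values)   # compares numerically

print(max_as_text)      # '9.50', because the character '9' sorts after '1'
print(max_as_number)    # 100.0
```

The Apple data happens to give the same answer either way because all of its prices have the same number of digits, but that is luck rather than design.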

Fall Semester 2016 1 CS 135: Diving into the Deluge of Data


Let's write this without loading all the data into memory and without knowing which column the 'High' price
occupies:

    import csv
    COLNAME = "High"

    with open('aapl.csv', 'r') as fin:
        maxprice = float('-inf')
        maxpricecol = None

        for rownum, row in enumerate(csv.reader(fin)):
            if rownum == 0:
                maxpricecol = row.index(COLNAME)
            else:
                maxprice = max(maxprice, float(row[maxpricecol]))

        print(maxprice)

This code also uses some Python constructs that are new to us. The first is float('-inf'), which is Python's
way of writing negative infinity, a value that is smaller than every other float. We declare the maxpricecol
variable and initialize it to None. The index method returns the index, or position, of the value 'High' in the
header row. Again, note that this algorithm is streaming, so we only consider one row at a time.
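
The same streaming idea can be packaged as a function. The sketch below is one possible variant, not the version from lecture: the name max_column is hypothetical, it uses next() instead of enumerate to peel off the header row, and it reads from an in-memory sample so that it runs without aapl.csv.

```python
import csv
import io

def max_column(lines, colname):
    """Return the largest value in the named column, reading one row at a time."""
    reader = csv.reader(lines)
    col = next(reader).index(colname)   # locate the column from the header row
    best = float("-inf")                # smaller than any finite float
    for row in reader:
        best = max(best, float(row[col]))
    return best

# Hypothetical sample data with the same shape as aapl.csv.
sample = io.StringIO(
    "Date,Open,High\n"
    "2009-12-31,213.13,213.35\n"
    "2009-12-28,211.72,213.95\n"
)
print(max_column(sample, "High"))   # 213.95
```

Since csv.reader accepts any iterable of lines, the same function works on an open file: max_column(fin, "High"). If the file has a header but no data rows, the function returns negative infinity, which is a reasonable signal that nothing was seen.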

Practice
Imagine I have a file called realestate.csv that contains real estate transactions in Sacramento over 5 days.
The format of the data is

street,city,zip,state,beds,baths,sqft,type,sale date,price,lat,long.

Write a short script that finds the average sale price over every transaction in the file. You know that price occurs
at index 9, that the statistics.mean function is available, and that the data can easily fit into memory.

    import sys
    import csv
    import statistics
    PRICECOL = 9

    def mean_sale(filename):
        with open(filename, 'r') as fin:
            rows = list(csv.reader(fin))[1:]    # skip the header row
            prices = [float(row[PRICECOL]) for row in rows]
            return statistics.mean(prices)

    if __name__ == '__main__':
        print(mean_sale(sys.argv[1]))

Now imagine writing a function that returned the mean sale price of houses over 2000 square feet. Square
footage is given by the column at index 6.

    import csv
    import sys
    import statistics

    PRICECOL = 9
    SQFTCOL = 6
    SQFTMIN = 2000

    def mean_sale_high(filename):
        with open(filename, 'r') as fin:
            rows = list(csv.reader(fin))[1:]    # skip the header row
            prices = []
            for row in rows:
                if int(row[SQFTCOL]) > SQFTMIN:
                    prices.append(float(row[PRICECOL]))
            return statistics.mean(prices)

    if __name__ == '__main__':
        print(mean_sale_high(sys.argv[1]))

You can also use an if statement in the list comprehension for some truly beautiful code.

    def mean_sale_high2(filename):
        with open(filename, 'r') as fin:
            rows = list(csv.reader(fin))[1:]
            prices = [float(row[9]) for row in rows if int(row[6]) > 2000]
            return statistics.mean(prices)
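
To see the filtering comprehension in isolation, here is a small sketch that skips the file entirely. The rows are made up for illustration and only follow the realestate.csv column order; the name big is hypothetical.

```python
import statistics

# Made-up rows in the realestate.csv column order; only sqft (index 6)
# and price (index 9) matter here.
rows = [
    ["1 A St", "SACRAMENTO", "95814", "CA", "3", "2", "2500", "Residential", "Mon", "300000", "0", "0"],
    ["2 B St", "SACRAMENTO", "95814", "CA", "2", "1", "900", "Residential", "Mon", "100000", "0", "0"],
    ["3 C St", "SACRAMENTO", "95814", "CA", "4", "3", "3100", "Residential", "Tue", "500000", "0", "0"],
]

# The trailing "if" keeps only the rows whose square footage exceeds 2000.
big = [float(row[9]) for row in rows if int(row[6]) > 2000]
print(big)                     # [300000.0, 500000.0]
print(statistics.mean(big))    # 400000.0
```

One caveat: statistics.mean raises StatisticsError when given an empty list, so if no house passes the filter the function fails rather than returning a number.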

CSV Data in Strings

Suppose, however, that the CSV data is in a string called data instead of a file. In this case, one would use the
io.StringIO type to wrap the string inside something that behaves like a file object. You can think of this
as buffering the string.

    import csv
    import io

    data = 'purple,cow,moo\nhappy,moose,grunt'
    reader = csv.reader(io.StringIO(data))
    for row in reader:
        print("*".join(row))

Reader Options
There are many options when creating a CSV reader. Here are some, with definitions coming directly from the API [1]:

delimiter: A one-character string used to separate fields. It defaults to ','.

escapechar: On reading, the escapechar removes any special meaning from the following character. It defaults
to None, which disables escaping.

lineterminator: The string used to terminate lines produced by the writer. It defaults to '\r\n'. Note: The
reader is hard-coded to recognize either '\r' or '\n' as end-of-line, and ignores lineterminator. This
behavior may change in the future.

[1] https://docs.python.org/3.4/library/csv.html#csv-fmt-params
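
As a quick illustration of the delimiter option, the sketch below reads tab-separated data. The data and the name tsv are made up for the example; the point is that once the delimiter is a tab, a comma inside a field is just ordinary text.

```python
import csv
import io

# Made-up tab-separated data; the comma in the second field is plain text.
tsv = io.StringIO("animal\tsounds\ncow\tmoo, low\n")
rows = list(csv.reader(tsv, delimiter="\t"))
print(rows)    # [['animal', 'sounds'], ['cow', 'moo, low']]
```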


As an extreme example, suppose we wanted to represent a bunch of data that was just commas. One could use a
different delimiter

,|,,|,
,,|,|,,

and use csv.reader(fin, delimiter="|"), where fin is the open file object (not the filename), to create the
correct reader. We could also escape the commas

\,,\,\,,\,
\,\,,\,,\,\,

and use csv.reader(fin, escapechar="\\") to create the correct reader. Notice that we
need to escape the backslash inside the string literal.
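
The two encodings above should decode to the same rows. The sketch below checks this with in-memory strings standing in for the files; the names piped, escaped, rows_piped and rows_escaped are hypothetical. Each backslash in the escaped data is doubled because it sits inside a Python string literal.

```python
import csv
import io

# The same all-comma data, encoded two ways.
piped = io.StringIO(",|,,|,\n,,|,|,,\n")
escaped = io.StringIO("\\,,\\,\\,,\\,\n\\,\\,,\\,,\\,\\,\n")

rows_piped = list(csv.reader(piped, delimiter="|"))
rows_escaped = list(csv.reader(escaped, escapechar="\\"))

print(rows_piped)
print(rows_piped == rows_escaped)
```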

Writers
CSV writer objects accept any object that has a write method (file objects, StringIO objects, etc.) and format
CSV data using the writerow or writerows method. Here's an example. Suppose that data is a list of NESCAC
school information.

    data = [['Williams', 'Ephs', 'Purple Cows'],
            ['Middlebury', 'Panthers', 'Panther']]

To write this to the file called nescac.csv we would use the following code:

    import csv

    with open('nescac.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',')
        writer.writerow(['School', 'Nickname', 'Mascot'])
        writer.writerows(data)
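
Since a writer only needs an object with a write method, we can point it at an io.StringIO and inspect exactly what gets written. This is a sketch for experimentation only; the names buffer and text are hypothetical.

```python
import csv
import io

data = [["Williams", "Ephs", "Purple Cows"],
        ["Middlebury", "Panthers", "Panther"]]

buffer = io.StringIO()                  # in-memory stand-in for nescac.csv
writer = csv.writer(buffer)
writer.writerow(["School", "Nickname", "Mascot"])
writer.writerows(data)

text = buffer.getvalue()
print(repr(text))    # repr makes the '\r\n' line endings visible
```

The output shows that each row ends with '\r\n', the writer's default lineterminator. That is why the file version opens nescac.csv with newline='': it stops Python from translating those line endings a second time.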

Practice
Suppose you had a list of constellations and their equatorial coordinates (right ascension and declination) in CSV
format.

constellation, right ascension, declination

Sagittarius,19,-25
Taurus, 4.9, 19
Perseus, 3, 45

Write a function that takes a filename file in CSV format and returns a list of constellations. Suppose that you
know one of the headers is labelled constellation, but not which one. Suppose further that you can easily fit
all the data in memory.

    import csv

    def constellations(file):
        with open(file, newline='') as fp:
            data = [row for row in csv.reader(fp)]    # pass the file object, not the filename
            col = data[0].index('constellation')
            return [row[col] for row in data[1:]]
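
One wrinkle in the sample data: some fields have a space after the comma. By default the reader keeps that space, so a header such as right ascension would come back with a leading blank and a lookup by name would fail. The csv module's skipinitialspace option drops whitespace that immediately follows a delimiter. The sketch below compares the two behaviors on an in-memory copy of the sample; the names text, plain and trimmed are hypothetical.

```python
import csv
import io

text = (
    "constellation, right ascension, declination\n"
    "Sagittarius,19,-25\n"
    "Taurus, 4.9, 19\n"
)

plain = list(csv.reader(io.StringIO(text)))
trimmed = list(csv.reader(io.StringIO(text), skipinitialspace=True))

print(plain[0])      # ['constellation', ' right ascension', ' declination']
print(trimmed[0])    # ['constellation', 'right ascension', 'declination']
print(trimmed[2])    # ['Taurus', '4.9', '19']
```

The practice solution still works without this option only because constellation is the first header and has no leading space.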
