01 - Worked Example Geodata Chapter 16.en

The document discusses a Python script that loads geographic data from a file into a SQLite database by making API requests to Google Maps. It then extracts the latitude and longitude from the database and writes it to a JavaScript file to display the locations on a map. The process is restartable if interrupted to avoid exceeding API rate limits.

Hello everybody, welcome to Python for Everybody.

This is another worked code example. You can download the sample code zip file if you
want to follow along. The
code that we're working on today is what I call the geodata code, and that is code
that is going to pull some locations from this file. We're simulating or using the
Google Places API to look places up so we can visualize them on a map, and so
this is the basic picture. If we take a look at this where.data file, it's just a
flat file that has a list of organizations, and it was actually pulled from one of
my MOOC surveys. We just let people type in where they went to school, and this is
just a sample of them. So this data is read in by this program geoload.py, and if
you recall, this Google geodata has rate limits. It also has API keys, which we'll
talk about in a bit too. So the idea is, this is a restartable, spider-like process,
and so we want to be able to run this and have it blow up and run it and start it
and not lose what we've got. So this is unlike some of the others, in that we're now
using a database as well as an API. But in order to work around the rate limits of this
API, we're going to use the database with a restartable process. Then, we'll make
some sense of this, and then we'll visualize this, but in the short-term, let's
start with the geoload.py code.

geoload.py, take a look here. So, a lot of this
hopefully by now is somewhat familiar to you: urllib, json, sqlite. I mentioned
that the Google APIs used to be free and did not require an API key, but
increasingly they're making you use API keys, especially for new ones. So what
happens is, you can go to Google Places, I mean, go to Google APIs, and get an
API key, and you can put it in here. It'll be this big, long thing; it looks
like that. Then if you have an API key, you can use the Places API. I've got a copy
of a subset, not all of it, a subset of it here at this URL. As a matter of fact,
you can just go to this URL in a browser, and it will tell you a list of the data
that it knows about. I made it so that it follows the same basic protocol, with the
address equals parameter, as the Google Places API. So this will just change how we
retrieve the data. We can retrieve it from my server; the nice thing about my server is
it's got no rate limit, it's really fast, and you're not fighting with Google all the time.
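
As a rough sketch of that switch between the two services, the snippet below picks a base URL depending on whether an API key is set and then builds the request URL. The mirror URL is a hypothetical placeholder (the real one is not given in this transcript), the Google endpoint and the `query` and `key` parameter names are assumptions, and `build_url` is a name made up for illustration.

```python
import urllib.parse

# Hypothetical placeholder for the lecturer's mirror; the real URL is not in the transcript.
MIRROR_URL = "https://example.com/json?"
# Assumed Google Places text-search endpoint; check the current Google documentation.
GOOGLE_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json?"


def build_url(address, api_key=None):
    """Return the request URL for one address, with or without an API key."""
    params = {"query": address}
    if api_key:
        # With a key, talk to Google and send the key along.
        service_url = GOOGLE_URL
        params["key"] = api_key
    else:
        # Without a key, fall back to the rate-limit-free mirror.
        service_url = MIRROR_URL
    # urlencode turns spaces into pluses and escapes characters such as commas.
    return service_url + urllib.parse.urlencode(params)


print(build_url("Ann Arbor, MI"))
# prints https://example.com/json?query=Ann+Arbor%2C+MI
```

Because both services are assumed to accept the same parameters, only the base URL changes; the rest of the program does not need to know which one it is talking to.
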
It means that perhaps, if you're in a country where Google is not well supported,
you can use my API. I mean, that's really strange, but somehow my API is more
reliable and available than the Google one. But it's true.

So we're going to make
a database. We're going to do a CREATE TABLE IF NOT EXISTS, and we'll have an
address, and we're really just caching the geographical data. We're going to cache the
JSON. One of the things we do when we build these processes is we tend to simplify
these things and not do all the calculation and parsing of the JSON; just load it and
get it in, and load it, and get it in. Fill the data up in this database, so that's
what we're going to do. Because Python doesn't ship with any legitimate
certificates, we have to sort of ignore certificate errors. We're going to open the
file, we're going to loop through it, pull out the address from the file, and we're
going to select the geodata where that address is the address. Let's move this
in a bit. We're going to do a select and pull out that address. The idea is, if
it's already in the database, we don't want to fetch it again, so we do a fetchone and
pull out that first thing, which will be the JSON right there. If we get that, we'll
continue; otherwise we'll keep going. pass just means don't blow up. So in the
except we just do a pass; that's a no-op. We're going to make a dictionary, because
that's what we do for the key-value pairs. In everything you've seen so far, I've used
constants here, but because we may or may not have an API key, it's query equals and
then the address, and then key equals and then the API key. If you recall,
urlencode adds the pluses, and ampersands, and all that nice stuff. We're
going to retrieve it, we're going to read it and decode it, print out how much data
we've got and add to the count, and then we're going to try to parse that JSON data and
print it if something goes wrong. As we've seen, the top level of this JSON data
from this geocoding API is an object, which we'll see a little bit of in a bit.
It has a status field in it, and the status is OK if things went well. If the
status is not there, that means our JSON is not well formed, or not how we
expect it. If the status is not OK and not ZERO_RESULTS, then we print out
failure to retrieve and then quit. Then we're simply going to insert this new
data that we just retrieved. Then we're going to commit it, and every tenth one, this is
count mod ten, we're going to pause for five seconds. We can hit Control-C here,
and then we're going to do the geodump.

So let's just run this: geodata, Python. So
let's do an ls. So we do have, let's get rid of, from a previous test, geodata.sqlite.
So we'll start with a fresh set of data, and run python geoload.py. Of
course, I'm forever making the mistake of forgetting python3, so you
can see that it's running. It's adding the query, and in this case I don't have the
API key. It's putting the pluses in, and that's this part here with all the pluses;
that's the urlencode. You notice that it's pausing a bit. Depending on how fast
your net connection is, this may or may not go so fast, but this is not that much
data, so it should. It's only like 2,000 or 3,000 characters. So it's working and
talking to my server. The interesting thing here is I can blow this up. I'm going
to hit Control-C. In Linux you'd hit Control-C and on Windows I think you'd hit
Control-Z, depending on which shell you're working in. But I'm going to hit Control-C,
and you see I sort of blew it up. That causes a KeyboardInterrupt traceback. I
do an ls -l, and you can see that now the geodata file is there. Now, in the name
of restarting, I will restart this. You will see that it checks and skips, and so
it runs this code here, right here. It grabs it, and finds it in the
database. So you'll see "found in database" really quick, chop, chop, chop,
and it goes really fast. Then it'll go back to catching up where it left off. So for all
those up there, it did not actually re-retrieve them, because it knew about those
things. Now it's catching up, and doing some more, and doing some more, and doing
some more. Then I'll hit Control-C. It has a little counter in here that basically,
if it hits 200, it stops, and you have to restart it. You could actually change this
code; you can make it so it doesn't sleep. It doesn't hurt to sleep for like a second
after every 100 or so if you want. You can change that code. Now let's just hit
Control-C and blow it up. ls -l.

There is another bit of code, and this code,
it's always good to write these really simple things. Now we're going to import
sqlite and json. We're going to connect ourselves up. We're going to open a file, except
this is UTF-8. Because it's UTF-8, we're going to open this with UTF-8
and we're going to read through. In this case, we're going to decode. We just
select star from Locations, and if you recall, Locations has a location and a
geodata. So sub zero will be the location and sub one will be the
geodata, and we're going to parse it: convert it to a string and then parse it. If
something goes wrong with the JSON we'll just skip it, and we check to see if we
have the status in our JSON.

Let me run the SQLite browser here. File, open
database. Let's take a look at what's in this database. Where are we? code3, geodata,
geodata.sqlite. So this is our data. I'll make this a little
bigger if I can. Can I make that bigger? Yes, I can, somewhat. So you can see
that these are the addresses, and the geodata that's just JSON. So that's the JSON that
we've got when it retrieves it. So this is a really simple database; this is just a
sort of spidering process: run, run, run.

But now we're going to run the geodump code,
which is going to read this and dump this stuff out, and write where.js, so it's
going to actually parse this stuff. That's code we've seen before. So we're
actually reading it, and this line goes into the results. results is an array, so if
we go into results, results is an array. We're going to go grab the zeroth item in
that array, then we're going to go find geometry, and then location, and then lat
and lng for the latitude and longitude. Then we're also going to take the actual
address out of the formatted_address right here. So, in this bit of code, we're
actually parsing the JSON, and we're going to clean things up, get rid of some
single quotes. This kind of data cleaning is just stuff where, after you play with it for
a while, you realize, oh, my data's ugly, or does this. Now we print it out, then I'm
going to write this out. I'm going to write it into a JavaScript file. So the
JavaScript file is this where.js. I'll show you what it looks like; it's going to
be overwritten. This is the one that came out of the zip file. It'll have the
latitude and the longitude, and we're going to use JavaScript to read this. In this
where.html file, it's going to actually read this right there, and pull that data
in. That's how we're going to visualize it. I'm not going to go into great detail on
how the visualization happens, but that's what's happening. So we're going to write
that; we're going to actually write this to a file.

Let's go ahead and run this
code, and say python3 geodump.py. Okay. So it wrote 120 records to where.js.
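
The parsing and dumping steps just described could be sketched roughly as follows. This is not the actual geodump.py: the function names are made up for illustration, the `myData = [...]` layout of where.js is an assumption about what where.html expects, and the sample record is invented rather than real API output.

```python
import json


def extract_point(geodata_text):
    """Return (lat, lng, address) from one cached JSON string, or None to skip it."""
    try:
        js = json.loads(geodata_text)
    except json.JSONDecodeError:
        return None  # badly formed JSON: skip this row
    if not isinstance(js, dict) or js.get("status") != "OK" or not js.get("results"):
        return None  # no status, a failure status, or no results
    first = js["results"][0]
    lat = first["geometry"]["location"]["lat"]
    lng = first["geometry"]["location"]["lng"]
    # Data cleaning: single quotes would break the quoted JavaScript string.
    where = first["formatted_address"].replace("'", "")
    return lat, lng, where


def to_js(points):
    """Format the points as a JavaScript array assignment (assumed layout)."""
    rows = ",\n".join(f"[{lat},{lng}, '{where}']" for lat, lng, where in points)
    return "myData = [\n" + rows + "\n];\n"


def write_where_js(path, points):
    """Write the JavaScript file as UTF-8 so non-ASCII place names survive."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(to_js(points))


# Invented sample record shaped like the JSON described above.
sample = json.dumps({
    "status": "OK",
    "results": [{
        "formatted_address": "Ann Arbor, MI, USA",
        "geometry": {"location": {"lat": 42.28, "lng": -83.74}},
    }],
})
print(to_js([extract_point(sample)]))
```

Keeping the parsing out of the loader and in this second step means the cached JSON can be re-parsed as often as needed without going back to the network.
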
So if we look at where.js, this is now the new data that I just downloaded moments
ago. It says to open where.html in a browser. Now, for this you'll need the Google Maps
API, and you might not be able to see this, depending on where you're at. But here you
go with the Google Maps locations. I think if you hover over this, you can see. You see
the UTF-8 characters there in that particular one; that's why we had to use UTF-8 when
we wrote the file, so that we didn't end up with trouble writing the file out. There you
go, and so that is a simple visualization. Just a simple visualization, and we wrote this
where.js. If you are smart with HTML and JavaScript, you can look at this
where.html file; it's really just reading through a bunch of data and putting up the
points. That's all there is, but I'm not going to go through that, at least not in
this video. I hope this was useful to you, and thanks for watching.
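
To recap the restartable caching idea from geoload.py, here is a minimal sketch, not the actual script. The table and column names (`Locations`, `address`, `geodata`) are assumptions based on the description above, `load_addresses` and `fake_fetch` are made-up names, the network call is replaced by a stand-in function so the example runs offline, and the pause after every tenth request is left out.

```python
import json
import sqlite3


def load_addresses(conn, addresses, fetch, limit=200):
    """Cache fetch(address) for each address not yet stored; return how many were fetched."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS Locations (address TEXT, geodata TEXT)")
    fetched = 0
    for address in addresses:
        if fetched >= limit:
            break  # stop after a batch; run again to continue
        cur.execute("SELECT geodata FROM Locations WHERE address = ?", (address,))
        if cur.fetchone() is not None:
            continue  # found in database, so skip the request
        data = fetch(address)
        try:
            js = json.loads(data)
        except json.JSONDecodeError:
            continue  # unparseable response: do not cache it
        if not isinstance(js, dict) or js.get("status") not in ("OK", "ZERO_RESULTS"):
            break  # failure to retrieve: quit and investigate
        cur.execute(
            "INSERT INTO Locations (address, geodata) VALUES (?, ?)", (address, data)
        )
        conn.commit()  # commit each row so an interrupt loses nothing
        fetched += 1
    return fetched


calls = []


def fake_fetch(address):
    """Stand-in for the real HTTP request; records which addresses were requested."""
    calls.append(address)
    return json.dumps({"status": "OK", "results": []})


conn = sqlite3.connect(":memory:")
first = load_addresses(conn, ["A", "B", "C"], fake_fetch, limit=2)
second = load_addresses(conn, ["A", "B", "C"], fake_fetch, limit=2)
print(first, second, calls)
# prints 2 1 ['A', 'B', 'C']
```

The second call skips A and B because they are already in the table, which is the same behavior seen when the loader was interrupted and restarted.
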
