CS 150 - Assignment 8 - Weather Report

Due: Wednesday 4/26 at the beginning of class

An important component of many scientific applications is data collection and data analysis. For this lab, we'll be looking at an example data collection application that collects weather data from the web and aggregates it into a data file. In addition, we'll also make a nice script that takes a zip code as a command-line parameter and will give you the current temperature.

As announced, you may work with a partner on this lab. If you do, you must both be there whenever you're working on the lab. Only one of you should submit the assignment, but make sure both your names are in the comments.

 

http://xkcd.com/1245/

An important disclaimer:

When writing a program like this that contacts an external server, you need to be respectful of that external resource. For testing purposes, I have put up a version of the web page you will be extracting the temperature from on our department web server. You should use this test web page until you have your program working. Even when you have your program working and change over to the external web address, please avoid making too many repeated calls.

Part 1: Getting the weather

For the first part of this lab, you are to write a program that reads the current weather from the web for a zip code entered by the user. I've broken the description of this program into two parts: the specification of what is required, and my suggestion about how to proceed on the implementation. Make sure to read both sections before starting!

Specifications

Write a program called weather_reader.py that has the following characteristics:

Implementation

In previous years, we had students extract the weather from the HTML source of a page from weather.com like the following http://www.weather.com/weather/today/l/05753.

However, the current version of this page does not encode the temperature in the HTML code directly, making this approach rather difficult. So, instead of "scraping" a web page (which is what this approach is called), we will use a cleaner approach: an API. API stands for "Application Programming Interface" and it means that a service (such as a weather data server on the web) provides a protocol specifically designed to be used by programs, rather than by humans.

In particular, for this lab, we will use the API by OpenWeatherMap. If you follow the link for "current weather data" and then scroll down to "by ZIP code" you will see that you can use a URL like

http://api.openweathermap.org/data/2.5/weather?zip=94040,us&appid=2de143494c0b295cca9337e1e96b00e0
to get the weather conditions for a given zip code. Here is a sample page I retrieved for Middlebury's zip code via the API:
http://www.cs.middlebury.edu/~schar/courses/cs150-s17/hw/hw8-data/weather-05753.txt

For now, your program should only use this page. If you follow this link you'll see it's a text encoding for the weather for Middlebury, with the current temperature being 49.55. Your job is to write a python module that extracts just the temperature from this data.
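As a starting point, fetching the text of the test page could look roughly like the sketch below. It assumes Python 3 and the standard urllib.request module, which may not match exactly what we used in class. The helper name read_page is made up for illustration and is not part of the specification.

```python
from urllib.request import urlopen

# The test copy of the API response on the department web server.
TEST_URL = ("http://www.cs.middlebury.edu/~schar/courses/cs150-s17/"
            "hw/hw8-data/weather-05753.txt")


def read_page(url):
    """Return the entire contents of the page at url as a single string.

    read_page is a hypothetical helper name, not required by the lab.
    """
    with urlopen(url) as webpage:
        # read() returns bytes, so decode them into a regular string
        return webpage.read().decode("utf-8")
```

Calling read_page(TEST_URL) should give back the single line of weather data for Middlebury.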

You may implement this module however you like, as long as it meets the specifications above. Here is one suggested approach:

  1. Write some code that opens the web page above and reads through it a line at a time (when I say "code" I mean some statements that may be stand-alone or may be in one or more functions). Note that all the information is contained on a single line, but it's still fine to use our standard approach of iterating over the lines; we will just get a single iteration. (If you are curious, you can also read the entire file at once using something like "contents = webpage.read()" instead of "for line in webpage:".)

  2. Once you have this working, you need to extract the temperature. There are two approaches. One is to use the string method "find" like the example we did in class of the program that extracts email addresses. You can for instance search for the string '"temp":' (including the double quotes) to find the start location of the temperature, and then find the location of the next comma to get the end location. Another, perhaps more elegant approach is to utilize the structure of the data returned by the API. Given your knowledge of Python, what do you notice about the data? Could it be interpreted as a certain Python data structure? I won't give you the full answer, but here's a hint: Python's "eval" function might come in handy to parse (translate) the data into Python, and then to extract the temperature.

    Either approach is valid — but be sure that you can extract the temperature, store it in a variable, and print it.

  3. Once you can print the temperature, put this all together to write the get_temperature function. Recall that it will take a zip code as a parameter. For now it will just ignore that parameter and always get the temperature data from the above web page, but go ahead and put the parameter in anyway.

  4. Finally, write the part of the program that checks to see if this program is being run vs. imported, checks the number of program parameters and prints the usage accordingly. Finish up your program so that when you run it with the correct number of arguments it prints out the temperature.
You should now be able to run your program from the command line with a zip code and it will give you a temperature (pretty cool!). Right now, it should always give you 49.55, but it will take only a small change to have it do the real thing. We'll get to that soon...
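To make the suggested steps concrete, here is a rough sketch of the "find" approach together with the run vs. import check. It is only one possible shape for the program. SAMPLE_DATA is an abbreviated, made-up stand-in for the page contents so that the sketch runs without contacting any server, and the helper names extract_temperature and main, as well as the wording of the usage message, are my own assumptions.

```python
import sys

# Abbreviated, made-up stand-in for the page contents; the real data has
# more fields. In your program this text would come from the web page.
SAMPLE_DATA = ('{"weather":[{"main":"Clear"}],'
               '"main":{"temp":49.55,"pressure":1021,"humidity":43},'
               '"name":"Middlebury"}')


def extract_temperature(data):
    """Pull out the number that follows '"temp":' using the find method.

    Assumes the key is present and is followed by a comma.
    """
    key = '"temp":'
    start = data.find(key) + len(key)  # index just past the key
    end = data.find(",", start)        # the next comma ends the number
    return float(data[start:end])


def get_temperature(zip_code):
    """Return the temperature; zip_code is ignored for now."""
    return extract_temperature(SAMPLE_DATA)


def main(args):
    """Print the temperature, or the usage if the argument count is wrong."""
    if len(args) != 2:
        print("Usage: python3 weather_reader.py <zip code>")
        return False
    print(get_temperature(args[1]))
    return True


if __name__ == "__main__":  # true when run directly, false when imported
    main(sys.argv)
```

Because the call to main sits under the __name__ check, importing the module only defines the functions and prints nothing.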

Part 2: Aggregating the weather

We now have a program that we can run to get the temperature, and a module that we can import to call the get_temperature function with a zip code. For the second part of this lab, we're going to write another program (i.e., in a different .py file) that can be run regularly to build up a file of temperature data aggregated over time.

Your program will be run with two command-line arguments, the name of a file and a zip code. The file will contain multiple entries collected over time. Each line in the file will consist of a date, an hour of the day and the temperature at that hour. For example, here is a short snippet of an example file:

4-18-2017       15      49.55
4-18-2017       16      50.23
4-18-2017       17      51.78
4-18-2017       18      51.52
4-18-2017       19      49.41
4-18-2017       20      42.83
4-18-2017       21      41.59
4-18-2017       22      40.72
Each time you run the program it will add at most one line to this file. So the file above would have been generated with at least 8 calls to the program (over 8 different hours).
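As an illustration of this file format, the sketch below builds one line from a date, hour and temperature, and splits a line back into its three parts. It assumes the columns are separated by tabs and uses Python's datetime module for the date, which may differ from the prelab approach. The function names are hypothetical.

```python
from datetime import datetime


def format_entry(when, temperature):
    """Build one line of the data file from a datetime and a temperature."""
    date_text = str(when.month) + "-" + str(when.day) + "-" + str(when.year)
    # Assumption: the three columns are separated by tabs
    return date_text + "\t" + str(when.hour) + "\t" + str(temperature)


def parse_entry(line):
    """Split one line of the data file into (date, hour, temperature)."""
    date_text, hour, temperature = line.split()
    return date_text, int(hour), float(temperature)


# An entry for the current date and hour (the temperature is made up).
current_entry = format_entry(datetime.now(), 49.55)
```

Since split() with no arguments breaks on any run of whitespace, parse_entry works whether the columns are separated by tabs or by spaces.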

We're setting the problem up this way since it is generally straightforward to get a program to run at some fixed interval. You won't be doing that for this lab, but I'm happy to talk to you offline about how that would work.

As with the first part, I've broken the description of this program into two parts, the specification and the implementation.

Specifications

Write a program called weather_aggregator.py that has the following characteristics:

Implementation

Here is one approach to implementing this program:
  1. Write the part of the program that checks to see if this program is being run vs. imported, checks the number of program parameters and prints the usage accordingly.

  2. Write some code to check whether the file has an entry for a given date and time. For testing purposes, it will likely be useful to create a version of the aggregate file manually (e.g., in Spyder or another text editor).

  3. Write some code that gets the current date and hour and checks to see if it's in the file (see the prelab for ideas on how to do this). Again, just modify the file by hand to test this.

  4. Finally, put this together so that you check to see if the current date and hour are in the file already. If they're not, use your weather_reader module to get the temperature and append it to the end of the file. When writing this file, you may either rewrite the entire thing from scratch each time (in that case you'd open the file with "w") or you can just append the one new entry (in that case you'd open the file with "a"). Don't forget to close your files after you write out the data.

  5. Add any finishing touches to the program to make sure it runs appropriately. When you run it, you won't see any output, but the file passed in may change.
NOTE: If the file doesn't exist at all and you call your program, you may get an error. To get around this, just create a blank text file with nothing in it before you run your program for the first time. This can be done with the unix command "touch", e.g., to create a new empty file called "temps.txt", type
touch temps.txt
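
The check-then-append logic from the steps above might be sketched as follows. This is a minimal version under a few assumptions: the file already exists, entries are tab-separated, and the function names has_entry and add_entry are made up. In your program the temperature would come from the get_temperature function in your weather_reader module; here it is passed in so that the sketch stands on its own.

```python
def has_entry(filename, date_text, hour):
    """Return True if filename already has a line for this date and hour."""
    data_file = open(filename, "r")
    found = False
    for line in data_file:
        fields = line.split()
        # a complete entry has three fields: date, hour, temperature
        if len(fields) == 3 and fields[0] == date_text and fields[1] == str(hour):
            found = True
    data_file.close()
    return found


def add_entry(filename, date_text, hour, temperature):
    """Append one entry unless that date and hour are already recorded."""
    if has_entry(filename, date_text, hour):
        return False
    data_file = open(filename, "a")  # "a" adds to the end of the file
    data_file.write(date_text + "\t" + str(hour) + "\t" + str(temperature) + "\n")
    data_file.close()
    return True
```

Running add_entry twice within the same hour leaves the file unchanged the second time, which is the "doesn't add repeated data" behavior.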

The real deal

So far, all of your testing should have been done on the copy on the computer science web server using the url above, always giving you the same temperature. When you're confident that you have everything working you can go back and change your weather_reader module to use the real web page. For a given zip code, the url should look as follows:

http://api.openweathermap.org/data/2.5/weather?zip=05753,us&appid=2de143494c0b295cca9337e1e96b00e0&units=imperial
Note that the URL has several "variable definitions" separated with ampersands. For instance, we specify the zip code via "zip=05753,us". At the end we request "imperial" units, i.e., Fahrenheit, since by default we get Kelvin which is not as useful. What about the "appid" variable? It turns out that the API asks you to create an account, which controls the number of requests you are allowed to make. A free account gives you up to 60 calls per minute. In order to use the API you'll have to create an account, which will give you your own unique appid to use in the URL. (The current value is simply copied from their documentation page, and won't work for arbitrary zip codes.)

Once you have an appid, to use the actual API you need to use a URL like the one above, but with the correct zip code substituted, which is not hard to do in Python.
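
For instance, the substitution can be done with plain string concatenation, as in the sketch below. The function name build_url and the "YOUR_APPID" placeholder are just for illustration; use the appid from your own account.

```python
BASE_URL = "http://api.openweathermap.org/data/2.5/weather"


def build_url(zip_code, appid):
    """Return the API URL for a zip code; both arguments are strings."""
    return BASE_URL + "?zip=" + zip_code + ",us&appid=" + appid + "&units=imperial"


# "YOUR_APPID" is a placeholder, not a working key.
example_url = build_url("05753", "YOUR_APPID")
```

Keep the zip code as a string (command-line arguments already are strings) so that a leading zero, as in 05753, is not lost.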

Change your get_temperature function in the weather_reader module to generate an appropriate url based on the zip code passed in and then use this url to get the temperature. You should now be able to query the current weather based on the zip code entered:

schar$ python3 weather_reader.py 05753
49.55
schar$ python3 weather_reader.py 04773
32.99
schar$ python3 weather_reader.py 33111
80.8
You should now be able to run your weather_reader.py program with a zip code to get the current temperature, and your weather_aggregator.py should now aggregate the real values.

Extra points

You may earn up to 2 extra points on this assignment. Below are some ideas, but you may incorporate your own if you'd like. Make sure to document your extra point additions in comments at the top of the file.

When you're done

Make sure that your programs are properly commented. In addition, make sure that you've used good style.

Submission procedure

For this assignment, you will have two .py files to submit: weather_reader.py and weather_aggregator.py. Please use these exact names, otherwise it is difficult for us to test your programs. Upload both files using the digital submission link, and report the total time taken for the entire assignment both times.

Grading

                                                         points
weather_reader.py
   run vs. import                                        1
   prints usage with incorrect number of arguments       2
   runs correctly with zip code entered                  2
   get temperature                                       5

weather_aggregator.py
   run vs. import                                        1
   prints usage with incorrect number of arguments       2
   date and hour formatted correctly in file             1
   appends temp to end of file                           3
   doesn't add repeated data                             3

Comments, style                                          5
Prelab                                                   3
Extra Points                                             2

Total                                                   28 + 2