Lab 7: Weather Report
Due: 11:59 PM on 2020-04-16

An important component of many scientific applications is data collection and data analysis. For this lab, we’ll look at an example data collection application that collects weather data from the web and aggregates it into a data file. We’ll also make a useful program that takes a zip code as a command-line parameter and gives you the current temperature for that zip code.

A note on the use of external resources:

When writing a program like this that contacts an external server, you need to be thoughtful about how you use that external resource. Many commercial services will have request limits. If someone is offering a service as a courtesy, we want to be respectful of that resource.

For testing purposes, we have put up a version of the web page you will be extracting the temperature from on our department web server. You should use this test web page until you have your program working. Even when you have your program working and change over to the external web address, please avoid making too many repeated calls.

Part 1: Getting the weather

For the first part of this lab, write a program that reads the current weather from the web for a zip code entered by the user. We’ve broken the description of this program into two parts: the specification of what is required, and suggestions about how to proceed on the implementation. Make sure to read both sections before starting.

Specifications

Write a program called weather_reader.py that has the following characteristics:

  1. Importing the module only defines functions and variables, i.e., on import your program should not query the weather (or invoke any functions or print anything to the shell).
  2. Your program should be able to be run from the command-line and take a single argument, which is the zip code:

    1. If your program is run with too few or too many arguments, it should print out the usage:

      >>> %Run weather_reader.py
      usage: python3 weather_reader.py <zip_code>
      
    2. If your program is run with the correct number of arguments (one), you should treat the argument as a zip code (you can assume it is a valid zip code) and the program should print out the current temperature at that zip code:

      >>> %Run weather_reader.py 05753
      39.71
      
  3. Your module must contain a function named get_temperature that takes a zip code as a string parameter and returns the temperature at that zip code as a float. (Some zip codes start with ‘0’, so it’s best to represent a zip code as a string.)

Guide

We will use an API to obtain weather data. API stands for “Application Programming Interface”, and it means that a service (such as a weather data server on the web) provides a protocol specifically designed to be used by programs rather than by humans.

In particular, for this lab, we will use the API by OpenWeatherMap. If you follow the link for “current weather data” and then scroll down to “by ZIP code” you will see that you can use a URL like

http://api.openweathermap.org/data/2.5/weather?zip=05753,us&APPID=420f0c6fa4f4d851e8d17537bf60771d&units=imperial

to get the weather conditions for a given zip code (notice zip=05753 in the URL specifying the zip code). Note that the whole URL is required, including the APPID portion (which is my API key); see below for more explanation of API keys and how to get your own. Here is a sample page we retrieved for Middlebury via the API:

http://www.cs.middlebury.edu/~briggs/Courses/CS150-S20/labs/lab7-test-data.json?zip=05753,us&APPID=420f0c6fa4f4d851e8d17537bf60771d&units=imperial

For now, your program should only use this sample page (on CS department servers). If you follow this link you’ll see it’s a text encoding for the weather for Middlebury, with a current temperature, indicated by the “temp” key, of 49.25. Your job is to write a Python module that extracts just the temperature from this data.

For context, this data is provided by the API as JSON. JSON (which stands for JavaScript Object Notation) is one of the most common data interchange formats, that is, specifications for communicating precisely formatted data between different programming languages (or computer systems). In our example, the weather website provides a JSON representation of the weather that can be sent as a string and then parsed (or understood) by many different programming languages as dictionaries, lists, numbers, etc.
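As a minimal sketch of what parsing JSON looks like in Python, the standard json module can turn a JSON string into dictionaries and numbers. The sample string below is a hand-written, abbreviated stand-in for the weather data; in particular, the nesting of “temp” under a “main” key is an assumption that you should check against the actual page.

```python
import json

# Hand-written, abbreviated stand-in for the weather data (an assumption,
# not the full response); check the real page for its actual structure.
SAMPLE = '{"name": "Middlebury", "main": {"temp": 49.25, "humidity": 40}}'

weather = json.loads(SAMPLE)            # JSON object -> Python dictionary
temperature = weather["main"]["temp"]   # JSON number -> Python float

print(type(weather).__name__, temperature)
```

Once the string is parsed, extracting a value is ordinary dictionary indexing.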

You may implement this module however you like as long as it meets the specifications above; however, here is one suggested approach to implementing it:

  1. Write some code that opens the web page above (on the CS department server) and reads it. You can read the entire file at once using something like contents = webpage.read() instead of for line in webpage:.
  2. Once you have this working, you need to extract the temperature. There are two approaches. One is to use the string method find, like the example program for extracting email addresses we discussed in class. You can, for instance, search for the string '"temp":' (including the double quotes) to find the start location of the temperature, and then find the location of the next comma to get the end location. Another, perhaps more elegant, approach is to utilize the structure of the data returned by the API. As described above, this data is formatted according to the JSON standard, and as you might expect, there is a Python module for parsing this representation.

    Either approach is valid and permitted, but be sure that you can extract the temperature and store it in a variable.

  3. Once you can obtain the temperature, put this all together to write the get_temperature function. Recall that it will take a zip code as a parameter. That parameter will need to be inserted into a properly formatted URL (you might find the string format method helpful here). Note that the sample page will always return the same data (even if you change the zip code), but you still want to generate a properly formatted URL so that you can obtain the correct data in the future.

  4. Finally, write the part of the program that checks whether the program is being run or imported, checks the number of command-line arguments, and prints the usage (exactly as shown above) if an incorrect number of arguments is provided. Finish up your program so that when you run it with the zip code command-line argument it prints out the temperature. You should now be able to run your program from the command line with a zip code and it will give you a temperature (pretty cool!). With the test URL, it should always give you the same temperature (the “temp” value in the sample page); however, it will just be a small change to have it do the real thing. We’ll get to that soon…
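The four steps above could fit together roughly as follows. This is only one possible sketch, using the string find approach; the helper names build_url and extract_temperature and the constants are illustrative choices, not requirements of the lab, and the code assumes the '"temp":' key is present and followed by a comma.

```python
import sys
from urllib.request import urlopen

# Test page from the lab, with {} marking where the zip code goes.
TEST_URL = ("http://www.cs.middlebury.edu/~briggs/Courses/CS150-S20/labs/"
            "lab7-test-data.json?zip={},us"
            "&APPID=420f0c6fa4f4d851e8d17537bf60771d&units=imperial")
TEMP_KEY = '"temp":'


def build_url(zip_code):
    """Return the test-page URL for zip_code (a string such as "05753")."""
    return TEST_URL.format(zip_code)


def extract_temperature(contents):
    """Return the number following "temp": in contents as a float.

    Assumes the key is present and its value is followed by a comma.
    """
    start = contents.find(TEMP_KEY) + len(TEMP_KEY)
    end = contents.find(",", start)  # the value ends at the next comma
    return float(contents[start:end])


def get_temperature(zip_code):
    """Fetch the page for zip_code and return its temperature as a float."""
    with urlopen(build_url(zip_code)) as webpage:
        contents = webpage.read().decode("utf-8")  # bytes -> str
    return extract_temperature(contents)


if __name__ == "__main__":
    # Only runs when executed as a program, not when imported.
    if len(sys.argv) != 2:
        print("usage: python3 weather_reader.py <zip_code>")
    else:
        print(get_temperature(sys.argv[1]))
```

Keeping the URL construction and the string extraction in separate functions means both can be tested without contacting the server.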

Part 2: Aggregating the weather

We now have a program that prints the temperature when run, and a module that we can import in order to call get_temperature with a zip code. For the second part of this lab, you will write another program (i.e., in a different “.py” file) that can be run regularly to build up a file of temperature data aggregated over time.

Your program will be run with two command-line arguments, the name of a file and a zip code. The file will contain multiple entries collected over time. Each line in the file will consist of a date, an hour of the day (in 24-hour time), and the temperature at that hour, separated by commas (this format is called CSV, for comma-separated values). For example, here is a short snippet of an example file:

4-7-2020,19,47.62
4-7-2020,21,42.04
4-8-2020,9,41.59
4-8-2020,11,45.27
4-9-2020,11,43.02
4-9-2020,12,42.57

Each time you run the program it will add at most one line to this file. So the file above would have been generated with at least six calls to the program (over 6 different hours). We’re setting the problem up this way since it is generally straightforward to get a program to run at some fixed interval. You won’t be doing that for this lab, but you may be interested in investigating how it could be done with Cron.

As with the first part, we’ve broken the description of this program into two parts, the specification and the guide.

Specifications

Write a program called weather_aggregator.py that has the following characteristics:

  1. Importing the module only defines functions and variables (no functions are invoked, nothing is printed in the shell).
  2. Your program should be able to be run from the command-line and take two arguments, the first a filename and the second a zip code.
  3. If the program is run with an incorrect number of command-line arguments it should print out the usage:

    >>> %Run weather_aggregator.py
    usage: python3 weather_aggregator.py <file> <zip_code>
    
  4. If the program is run with the correct arguments:
    1. Your program should work if the file doesn’t yet exist. In that case, there can’t possibly be an entry for the current date and time and so your program should create the file and write an entry with the correct information and formatting (i.e., comma-separated date, hour, and temperature). The date and hour should be formatted using (or matching) the functions from the prelab.
    2. If the file exists, the program should first check to make sure that there isn’t already an entry in the file for the current date and hour. If there is, the program should do nothing. This means that running the program repeatedly within the same hour will not alter the file after the first time when the current temperature is added for the current hour.
    3. If the file exists, but there is not an entry in the file for the current date and hour, the program should use the weather_reader module to get the current temperature for the zip code specified as a command line argument and add an entry to the file at the end with the appropriate formatting. Your program should only invoke get_temperature if it is going to write an entry to the file (to avoid slowing your program down with unneeded queries to the API).

Guide

Here is one approach to implementing this program:

  1. Write the part of the program that checks to see if this program is being run vs. imported, checks the number of program parameters and prints the usage accordingly.
  2. Write some code to check whether the file exists and has an entry for a specific date and time (not necessarily the current date and time). To check whether a file exists you can use the exists function within the os.path module (which returns True if the file specified by the string argument exists). For testing purposes, it may be useful to create a version of the aggregate file manually. You can do so with Thonny.
  3. Write some code that gets the current date and hour (see the Prelab).
  4. Write some code that checks to see if a given date and hour is in the file.
  5. Finally, put the above code together so that you check to see if an entry should be written to the file and if so use your weather_reader module to get the temperature and append it to the end of the file. When writing this file, you can either rewrite the entire file from scratch each time (in which case you’d open the file with “w”) or instead just append the one new entry (in which case you’d open the file with “a”). In either case you will use the write method on the file object to write a string to the file. Opening a file in append mode (with the “a” argument) will create the file if it doesn’t exist.
  6. Add any finishing touches to the program to make sure it runs appropriately. Note that when you run your program you won’t see any output, but the data file you provided as a command line argument may have been changed.
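Some of the building blocks from the guide can be sketched as below. The prelab functions are not reproduced here, so format_date_and_hour is a hypothetical stand-in that assumes the month-day-year format shown in the example file; read_entries and append_line are likewise illustrative names.

```python
import os.path
from datetime import datetime


def format_date_and_hour(moment):
    """Return (date, hour) strings like ("4-9-2020", "11") for a datetime."""
    date = "{}-{}-{}".format(moment.month, moment.day, moment.year)
    return date, str(moment.hour)


def read_entries(filename):
    """Return the lines of filename, or an empty list if it doesn't exist."""
    if not os.path.exists(filename):
        return []
    with open(filename, "r") as data_file:
        return data_file.read().splitlines()


def append_line(filename, text):
    """Append text to filename; append mode creates the file if needed."""
    with open(filename, "a") as data_file:
        data_file.write(text)


# The current date and hour, formatted like the entries in the data file.
print(format_date_and_hour(datetime.now()))
```

Checking os.path.exists before reading avoids an error on the first run, when the data file has not been created yet.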

The Real Deal

So far, all of your testing should have been done with the departmental web server using the URL above, always giving you the same temperature. When you’re confident that you have everything working you can go back and change your weather_reader module to use the real web page. For a given zip code, the URL should look as follows:

http://api.openweathermap.org/data/2.5/weather?zip=05753,us&APPID=9838b264525602b46f0b2ef8c191eef8&units=imperial

Note that the URL has several “query parameters” separated with ampersands. For instance, we specify the zip code via zip=05753,us. At the end we request imperial units, i.e., Fahrenheit, since by default we get Kelvin which is not as useful. What about the APPID variable? This API asks you to create an account, which controls the number of requests you are allowed to make. A free account gives you up to 60 requests per minute. We encourage you to create an account, which will give you your own unique APPID to use in the URL. The current value is Professor Briggs’s key.

In any case, to use the actual API you need to use a URL like the one above, but with the correct zip code substituted.
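As an alternative to the string format method, the query parameters can be assembled with urlencode from the standard urllib.parse module. This is only a sketch: build_api_url is an illustrative name, and "YOUR_APPID" is a placeholder for whatever key you use.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.openweathermap.org/data/2.5/weather"


def build_api_url(zip_code, app_id):
    """Return the API URL for zip_code (a string) using the key app_id."""
    query = {"zip": zip_code + ",us", "APPID": app_id, "units": "imperial"}
    # safe="," keeps the comma in "05753,us" from being percent-encoded.
    return BASE_URL + "?" + urlencode(query, safe=",")


print(build_api_url("05753", "YOUR_APPID"))
```

Each key/value pair becomes one key=value query parameter, and urlencode joins them with ampersands.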

Change your get_temperature function in the weather_reader module to generate an appropriate URL based on the zip code passed in and then use this URL to get the temperature. You should now be able to query the current weather based on the zip code entered:

>>> %Run weather_reader.py 05753
35.31
>>> %Run weather_reader.py 80424
50.58
>>> %Run weather_reader.py 33111
92.16

Again, please try not to run this program too many times (unless you created your own API account), but do play with it some. You should be able to run your weather_reader.py program with a zip code and it will give you the current temperature and your weather_aggregator.py should now aggregate the real values.

Creativity Points

You may earn up to 2 creativity points on this assignment. Below are some ideas, but you may incorporate your own if you’d like. Make sure to document your additions in the comment at the top of the files.

When you’re done

Make sure that your program is properly commented:

Remember that modules need docstrings too! Make sure you have a docstring at the top of your file that starts with a meaningful one sentence description of the functionality in that module. That is, the top of your file should now look like:

"""
[A brief description of your module here]

CS150 Lab 7

Name: Amy Briggs
Section:

Creativity:
"""

In addition, make sure that you’ve used good coding style (including meaningful variable names, constants where relevant, vertical white space, etc.).

Submit your programs via Gradescope. Your files must be named weather_reader.py and weather_aggregator.py, and you must submit both files at the same time. You can submit multiple times, with only the most recent submission (before the due date) graded. Note that the tests performed by Gradescope are limited. Passing all of the visible tests does not guarantee that your submission correctly satisfies all of the requirements of the assignment.

Grading

Feature                                            Points
weather_reader.py
  run vs. import                                   1
  prints usage if incorrect number of arguments    2
  runs correctly if zip code entered               2
  get temperature                                  5
weather_aggregator.py
  run vs. import                                   1
  prints usage if incorrect number of arguments    2
  date and hour formatted correctly in file        1
  appends temp to end of file                      3
  doesn’t add repeated data                        3
Comments, style                                    3
Creativity                                         2
Total                                              25

FAQs

Organizing the main program of weather_aggregator

In thinking about how to organize the main program of the weather_aggregator file, it may help to make a list of separate steps. Here is an example approach:

  1. get the filename from the command-line arguments
  2. get the zipcode from the command-line arguments
  3. get the current date and time (see the Prelab)
  4. check whether the file already contains this date and time
  5. if no: (a) get the current temp at the given zipcode, and (b) append the new information to the file

You could have a function (say, already_exists) that returns True/False for step 4, and a function (say, add_entry) for step 5 (b).

Recall that to construct a comma-separated string containing the variables a, b, and c, we define s = str(a) + ',' + str(b) + ',' + str(c) + '\n' – this is just like the string we want to append to our file.
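One way the steps above might be organized is sketched below. To keep the example self-contained, the temperature lookup is passed in as a function argument; in your program this role would be played by get_temperature from your weather_reader module. The name record_temperature is hypothetical.

```python
import os.path


def already_exists(filename, date, hour):
    """Return True if filename has an entry for the given date and hour."""
    if not os.path.exists(filename):
        return False
    with open(filename, "r") as data_file:
        for line in data_file:
            if line.startswith(date + "," + hour + ","):
                return True
    return False


def add_entry(filename, date, hour, temperature):
    """Append one comma-separated entry, ending in a newline."""
    entry = str(date) + "," + str(hour) + "," + str(temperature) + "\n"
    with open(filename, "a") as data_file:
        data_file.write(entry)


def record_temperature(filename, zip_code, date, hour, lookup):
    """Add an entry unless one exists; return True if a line was written.

    lookup is only called when an entry is actually going to be written.
    """
    if already_exists(filename, date, hour):
        return False
    add_entry(filename, date, hour, lookup(zip_code))
    return True
```

Because the lookup happens after the existence check, repeated runs within the same hour never query the API.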

Accurately Checking for the Date and Hour

To check if there is an entry with the current date and time, we need to check if any line in the file contains both the current date and the current time. However, just using the in operator to check for the presence of the date and hour in the string can fail some of the time. Instead we want to match the entire date and hour string at one time.

For example, if we have the following line in our data file as the variable line:

4-8-2020,9,41.59

the expression "4-8-2020" in line and "20" in line evaluates to True even though the hour, “20”, doesn’t match. Because in scans the entire line, the “20” matches part of the year within the date.

Two ideas:

  1. Match the entire date and hour string, including the comma that follows the hour, e.g., "4-8-2020,20," in line, or even better, line.startswith("4-8-2020,20,"). The trailing comma keeps an hour such as “2” from matching “20”; OR
  2. Use line.split(",") to separate the line into a list containing 3 string values, then compare the date and hour individually.
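The difference between these checks can be seen in a short sketch (the function names are illustrative). Note that the prefix version includes the comma after the hour, so that an hour like "2" cannot match an entry for hour "20".

```python
def matches_with_in(line, date, hour):
    """Flawed check: date and hour may each match anywhere in the line."""
    return date in line and hour in line


def matches_with_prefix(line, date, hour):
    """Match the date and hour together at the start of the line."""
    return line.startswith(date + "," + hour + ",")


def matches_with_split(line, date, hour):
    """Split on commas and compare the date and hour fields individually."""
    fields = line.strip().split(",")
    return len(fields) == 3 and fields[0] == date and fields[1] == hour


line = "4-8-2020,9,41.59\n"
print(matches_with_in(line, "4-8-2020", "20"))      # True (wrong: "20" is in the year)
print(matches_with_prefix(line, "4-8-2020", "20"))  # False
print(matches_with_split(line, "4-8-2020", "9"))    # True
```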

Formatting weather_aggregator File Entries

The lab specifies (and Gradescope tests for) a specific file format. Per the specification, the date, hour, temperature should be separated by commas (not spaces or other characters). Note, we use commas to make it easy for other tools or libraries, like datascience, to read in our data file. A compliant write would look like file.write("4-9-2020,11,43.02\n").

Note the newline (“\n”) at the end. Unlike print, write doesn’t automatically add a newline. We want to put the newline at the end of the line rather than at the beginning; a newline at the beginning would leave a blank line at the top of the file.

By using optional arguments, we can also use print to write to a file. The print documentation shows that it takes an optional file argument. We can pass the object returned by open as that argument to print to the file (with the newline added automatically at the end).
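A minimal sketch of the print-based approach follows; append_entry_with_print is an illustrative name. The sep argument replaces the default space separator with a comma, and print supplies the trailing newline.

```python
def append_entry_with_print(filename, date, hour, temperature):
    """Append "date,hour,temperature" plus a newline using print."""
    with open(filename, "a") as data_file:
        print(date, hour, temperature, sep=",", file=data_file)
```

For example, append_entry_with_print("temps.csv", "4-9-2020", 11, 43.02) would add the line 4-9-2020,11,43.02 to a (hypothetical) file named temps.csv.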