CS465 - Assignment One

Due: 2014-02-20, 11:00 a.m.

Objectives

[40 points] Get some insight

This assignment is very open-ended. The short version is that I want you to find some data that interests you, play with it in R, and report anything you discover that is interesting and, preferably, unexpected.

For the data, you are welcome to use any data set that you like. Some come with R (type ?datasets), and ggplot2 comes with a couple of data sets as well (see the ggplot2 documentation). I’ve also listed a big collection on the website. The only thing I ask is that the data set you use be a good size: on the order of a couple of hundred data points. I don’t want to see something like the income/degree scatterplot, where there is one obvious way to turn it into a visualization and then you just stare at it a bit. I want to see evidence that you were exploring the data, looking for interesting questions and answering them.
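
As a rough sketch of how you might check whether a built-in data set is large enough, something like the following should work in a plain R session. The choice of quakes and the variable names candidate and n_rows are only illustrative assumptions on my part, not a recommendation of what to use.

```r
# List the data sets that ship with base R (uncomment to browse the listing).
# data(package = "datasets")

# Pick a candidate; quakes is just an example choice.
candidate <- datasets::quakes

# How many observations are there? Aim for a couple of hundred or more.
n_rows <- nrow(candidate)
print(n_rows)

# Look at the variable names and types, and at a quick numeric summary.
str(candidate)
summary(candidate)
```

If the row count comes back well under a couple of hundred, it is probably worth looking for a different data set.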

What you will actually turn in to me is a 2-3 page document describing the process. Start by describing what the data set is and what kind of data is in it. Then describe the process you went through exploring the data. What did you try to visualize, and what did you learn? I particularly want you to call out anything you found interesting, strange, or unexpected. What questions does each finding lead you to ask? Can you answer them with the data? Obviously, this should be liberally illustrated with visualizations. I want to see a minimum of two different plot types. If you go over the 3 page limit so you can add more pictures, that is fine. If I were to isolate the text, however, I don’t want it to go over a page and a half.
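
To make "two different plot types" concrete, here is a minimal sketch using base R graphics and the built-in quakes data. Both the data set and the follow-up question are assumptions chosen for illustration; ggplot2 equivalents would serve just as well, and your own exploration should go well beyond this.

```r
# Example data only; substitute the data set you chose.
quakes_data <- datasets::quakes

# Plot type 1: a histogram showing how one variable is distributed.
hist(quakes_data$mag,
     breaks = 20,
     main = "Earthquake magnitudes",
     xlab = "Magnitude")

# Plot type 2: a scatterplot relating two variables.
plot(quakes_data$mag, quakes_data$stations,
     main = "Reporting stations vs. magnitude",
     xlab = "Magnitude",
     ylab = "Number of stations reporting")

# A follow-up question the scatterplot raises: how strong is the relationship?
mag_station_cor <- cor(quakes_data$mag, quakes_data$stations)
print(mag_station_cor)
```

A pair of plots like this is a starting point: each one should suggest a further question, which in turn suggests the next thing to plot.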


Turning in your work

Use whatever tool you feel most comfortable with to produce your document. However, you must convert the document to PDF before turning it in. Name the file username-hw1.pdf, where username is replaced with your username (the Middlebury username that you use to log into the lab computers).

I would like your document turned in to the DROPBOX on MiddFiles. You can refer to the reference on the LIS wiki to help you connect to the file server.