CS465 - Assignment Three

Due: 2016-03-04 01:45p

Objectives

[20 points] Randomize

This week, rather than working with existing data, you are going to be creating data. We want a good quantity of data, so I would like you to do this programmatically. I suggest (and will assume) Python, but you are welcome to use another language (R, Go, Ruby, etc.) if you are more comfortable with it. Python's csv module writes out CSV data; I suggest using it with the “excel” dialect. The random module can also draw values from a number of different distributions, which will make for more interesting data than uniform random values. Do your best to insert a story into your data that you can “reveal” with your visualization.
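As a rough starting point, here is a minimal sketch of writing non-uniform random data with the csv module and the “excel” dialect. The column names, distribution parameters, and seed are placeholders invented for illustration, not part of the assignment; pick your own.

```python
import csv
import io
import random


def make_rows(n, seed=465):
    """Return n rows drawn from three non-uniform distributions.

    The columns and parameters are hypothetical examples.
    """
    rng = random.Random(seed)  # seeded so the data set is reproducible
    rows = []
    for _ in range(n):
        temperature_c = round(rng.gauss(12.0, 4.0), 1)      # bell curve centered on 12
        wait_minutes = round(rng.expovariate(1 / 10.0), 1)  # long right tail, mean near 10
        crowd_size = int(rng.triangular(0, 500, 120))       # bounded, peaked near 120
        rows.append([temperature_c, wait_minutes, crowd_size])
    return rows


def write_csv(rows, handle):
    """Write a header line and the rows using the "excel" dialect."""
    writer = csv.writer(handle, dialect="excel")
    writer.writerow(["temperature_c", "wait_minutes", "crowd_size"])
    writer.writerows(rows)


# Written to an in-memory buffer here so the sketch has no side effects.
buffer = io.StringIO(newline="")
write_csv(make_rows(200), buffer)
csv_text = buffer.getvalue()
```

To produce a file that Tableau can open, pass a real file handle instead of the buffer, for example `open("data.csv", "w", newline="")`; the `newline=""` argument keeps the csv module in charge of line endings.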

I would like you to generate a couple hundred rows of data; the precise number is up to you. You can be creative and use whatever column headings and data values you like. Your data could contain various environmental factors surrounding UFO sightings, or demographic data for attendees of the MiddStock music festival, or the predicted results of Winter Carnival 2017, or you could cop out and just give me a collection of letters and numbers with no implied significance. The only requirement is that you have at least four columns (or variables), and that your synthetic data includes at least one variable at each of the nominal, ordinal, interval, and ratio levels. As we saw in class, much of this comes down to interpretation, as you could use integers for all of those things, but you need to be clear about which is which (this is where giving your columns some real meaning might be helpful).
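To make the four levels concrete, here is one possible sketch built on the UFO sighting idea. Every column name, category, and parameter below is made up for illustration, and the "orbs linger longer" pattern is just one example of a story you could hide in the data.

```python
import random

SHAPES = ["disc", "triangle", "orb", "cigar"]  # nominal: labels with no order
CREDIBILITY = ["low", "medium", "high"]        # ordinal: ordered, gaps not measurable


def make_sighting(rng):
    """Return one hypothetical sighting with a column at each measurement level."""
    shape = rng.choice(SHAPES)
    # Weighted so that low-credibility reports are the most common.
    credibility = rng.choices(CREDIBILITY, weights=[5, 3, 1])[0]
    # Interval: differences are meaningful, but 0 degrees C is an arbitrary zero.
    temperature_c = round(rng.gauss(8.0, 6.0), 1)
    # Hidden story: orbs tend to stay in view longer than other shapes.
    mean_duration = 90.0 if shape == "orb" else 30.0
    # Ratio: zero seconds really means no duration, so "twice as long" makes sense.
    duration_s = round(rng.expovariate(1 / mean_duration), 1)
    return {
        "shape": shape,
        "credibility": credibility,
        "temperature_c": temperature_c,
        "duration_s": duration_s,
    }


rng = random.Random(2016)
sightings = [make_sighting(rng) for _ in range(200)]
```

Rows shaped like this can be written out with `csv.DictWriter`, using the dictionary keys as the header.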

I would then like you to load this data into Tableau and visualize it. I would like to see at least four visualizations of this data. For each of the four levels of measurement, I would like an explanation of why you picked the visual representation and why it would or would not be appropriate for the other three. You can use the story tool to create this write-up, giving us one page per variable. Note that you do not need to think of the four visualizations as showcasing each of the four columns. Each visualization will probably involve multiple columns; you may even be able to cover all four variables in a single visualization (though I might advise against it). You can certainly use the same visualization to talk about multiple variables.

[5 points] Area or radius

For the second part of this assignment, I would like you to prove whether Tableau uses circle area or circle radius when you map a variable to “size” and you are using circles as the primary mark. You should hand-craft a very small dataset and then use Tableau to visualize it in a way that demonstrates how a variable is mapped to size. I recommend thinking literally about drawing circles in space. A small challenge is that the size mark in Tableau is relative only within that measure. So, if the value you have mapped to size is 5, then however Tableau determines the size of the circle, it will have nothing to do with the scale of the space in which you have drawn the circle (which is presumably determined by another variable). If you click on the size button, you will get a slider that changes the scale for all of the circles (it is a little fiddly, but you should be able to get it sized to a point where you can at least make comparisons). The consequence of this is that you can’t just draw a single circle. You will need at least two, so you can ask questions like: “If circle 1 is this size, what size do I expect circle 2 to be?”.
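The two-circle question comes down to a little arithmetic, sketched below. This only computes what each hypothesis predicts; it says nothing about what Tableau actually does. The function name is invented for illustration, and the sketch assumes the size mapping is proportional to the value, starting from zero, which you would need to check in your own workbook.

```python
import math


def expected_radius_ratio(value_1, value_2, mapping):
    """Predict radius(circle 2) / radius(circle 1) under one hypothesis.

    Assumes the mapped quantity is proportional to the data value.
    """
    ratio = value_2 / value_1
    if mapping == "radius":
        # The radius itself scales with the value.
        return ratio
    if mapping == "area":
        # Area = pi * r**2 scales with the value, so the radius scales with its square root.
        return math.sqrt(ratio)
    raise ValueError("mapping must be 'radius' or 'area'")


# With size values 1 and 4, the two hypotheses predict different pictures.
radius_prediction = expected_radius_ratio(1, 4, "radius")  # 4.0: four times as wide
area_prediction = expected_radius_ratio(1, 4, "area")      # 2.0: twice as wide
```

Choosing values whose ratio is a perfect square, such as 1 and 4, keeps both predictions whole numbers and makes them easy to tell apart by eye.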


Logistics

You will again be working on this assignment in pairs that I will assign shortly. As before, please do this pair-programming style, working together rather than splitting the work into tasks.

I would like both parts of the assignment done in the same Tableau workbook. You can add multiple data sources to a workbook. When you do, you will see a little list appear at the top of the Data tab of the worksheets that allows you to select which data source you are working with.

Remember to export your workbook as a “Packaged Workbook” so that I have your data!

Please submit the completed workbook on Moodle. I only need a single workbook per group.