CS465 - Assignment Six

Due: 2014-04-10 11:00a

Objectives

Get some more practice building visualizations in D3
Think about various techniques for visualizing multivariate data

This is a collaborative assignment. Find yourself a partner. You are welcome to make use of the same partner you are working with for the project. For part one, please discuss the various visualizations together, even if you split up the actual work of writing out the evaluations.

Part zero: Final projects

As I said in class, the next stage in the process is to get me a napkin-pitch-level mock-up (sans napkin). I want some rough sketches of what you think your final project will look like. If it has multiple views or pages, show me what they are. List the kinds of interactions that the visualizations support and how you transition between views.

Part one [24 points]: Analyze multivariate visualization techniques

For each of the following, briefly describe what problem it is trying to solve. In other words, what kind of data is it most appropriate for, approximately how many data items and how many data attributes can it support, what kinds of questions does it try to address, and how would we use it to answer those questions? Then describe any shortcomings. These could include types of data it doesn’t support, limitations on the number of data items or attributes, problems mapping back to the original data, and how well it can actually answer the questions it is trying to address. Keep these brief and to the point.

Parallel sets
Parallel coordinates
Radar chart
Chernoff Faces
Mosaic plot
Table lens

Part two [25 points]: Create a heatmap in D3

I would like you to build a heatmap showing the attributes in films50.json. This contains data about 50 recent films. We have each film’s name, genre, studio, and release year, as well as some ratings data and data about the budget and the film’s take. Here is an example record from the data:


{
  "film": "Ratatouille",
  "studio": "Disney",
  "year": "2007",
  "rt_rating": "97",
  "aud_rating": "84",
  "genre": "Animation",
  "openingtheatres": "3940",
  "bo_average_opening": "11935",
  "d_gross": "206.45",
  "f_gross": "414.98",
  "ww_gross": "621.42",
  "budget": "150.00",
  "profit": "414",
  "opening": "47"
}

Don’t be put off by the fact that this is a JSON file. Just change the d3.csv call to d3.json and you will never know the difference.
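One detail in the example record is worth handling right after loading: every value, including the numeric ones, is a quoted string. A possible way to deal with that, sketched here under the assumption that films50.json holds an array of records like the one above, is to convert the non-categorical fields to numbers once. The helper name coerceNumeric and the list of categorical keys are made up for illustration.

```javascript
// Hypothetical helper (not part of D3): returns a copy of a record in which
// every field not listed in `categorical` has been converted to a number.
function coerceNumeric(record, categorical) {
  var out = {};
  Object.keys(record).forEach(function (key) {
    var isCategorical = categorical.indexOf(key) !== -1;
    out[key] = isCategorical ? record[key] : +record[key];
  });
  return out;
}

// A few fields taken from the example record above.
var sample = { film: "Ratatouille", studio: "Disney", rt_rating: "97", budget: "150.00" };
var cleaned = coerceNumeric(sample, ["film", "studio", "genre"]);

// In the page this could run inside the load callback. The (error, data)
// callback shape assumes D3 version 3:
// d3.json("films50.json", function (error, data) {
//   data = data.map(function (d) {
//     return coerceNumeric(d, ["film", "studio", "genre"]);
//   });
// });
```

Without the conversion, comparisons happen character by character, so in JavaScript "97" > "621.42" is true, which is not what you want when looking for the largest value.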

You need to use this data to create a block of rectangles, with each row corresponding to a single film. My advice is to create a g for each row and populate it with rects for each individual value. This is a little different from what we have done with scatter plots and bar charts where we had one glyph per “row” of the data.

To get you started on how to do this, here is some code that loads the data into a conventional HTML table (try it).


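// `data` is the array of film records handed to the d3.json callback
var table = d3.select("body").append("table");

// one table row per film
var tr = table.selectAll("tr")
    .data(data)
  .enter().append("tr");

// one cell per key-value pair of the film bound to the row
var td = tr.selectAll("td")
    .data(function(d) { return d3.entries(d); })
  .enter().append("td")
    .text(function(d) { return d.key + ": " + d.value; });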

The first two parts should be fairly familiar territory. First we add a table to the page. Then we add rows (tr is a table row in HTML for the novices), binding them to the data items.

Things get more interesting in the final part. We are selecting all of the tds (table cells) within each of the trs. Rather than passing in data again (which would bind a full object to each cell), we pass in a function. This function is called on the current tr and is passed the tr's data object. We use the D3 function d3.entries(), which takes an object and turns it into an array of key-value pairs. For example, d3.entries({'A':50, 'B':12}) -> [{'key':'A', 'value':50}, {'key':'B', 'value':12}].

This array is then used by the data() function to bind things to the tds. Thus, each td gets just one key-value pair, which we load as the text into the td.
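To see the shape of the data at each level without opening a browser, here is a plain JavaScript stand-in for what the description above says d3.entries() does. The function name entries is an assumption for illustration; in the page you would keep calling d3.entries() itself.

```javascript
// Stand-in for d3.entries(): object -> array of {key, value} pairs.
function entries(obj) {
  return Object.keys(obj).map(function (key) {
    return { key: key, value: obj[key] };
  });
}

// Outer level: one element per film (what each tr is bound to).
// Only a few fields of the example record are used here.
var rows = [{ film: "Ratatouille", year: "2007", budget: "150.00" }];

// Inner level: one array of pairs per film (what the tds inside that tr
// are bound to, one pair per td).
var cellsPerRow = rows.map(entries);

console.log(JSON.stringify(entries({ A: 50, B: 12 })));
console.log(JSON.stringify(cellsPerRow[0]));
```

The point to notice is that the inner array is computed from the row's own object, which is why the function passed to data() receives d rather than the whole data set.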

Of course, you will not be implementing the heatmap as a table (though you could). Instead, use rows of rects inside of a dedicated g for each row. Otherwise, the structure will remain the same.
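The same two-level pattern carries over to SVG. The following is only a rough skeleton, assuming D3 is loaded in the page and that data is the loaded array of film records. The cell sizes and the names rowTransform, cellX, and drawHeatmap are invented for this sketch, and the fill color is deliberately left out.

```javascript
// Hypothetical cell dimensions in pixels.
var CELL_W = 40;
var CELL_H = 12;

// Position helpers: row i is shifted down, column j is shifted right.
function rowTransform(i) { return "translate(0," + (i * CELL_H) + ")"; }
function cellX(j) { return j * CELL_W; }

// Sketch only: needs a browser and D3, so it is defined here but not called.
function drawHeatmap(d3, data) {
  var svg = d3.select("body").append("svg")
      .attr("width", CELL_W * Object.keys(data[0]).length)
      .attr("height", CELL_H * data.length);

  // one g per film, moved to its own row
  var row = svg.selectAll("g")
      .data(data)
    .enter().append("g")
      .attr("transform", function (d, i) { return rowTransform(i); });

  // one rect per key-value pair of that film; j is the index within the row
  row.selectAll("rect")
      .data(function (d) { return d3.entries(d); })
    .enter().append("rect")
      .attr("x", function (d, j) { return cellX(j); })
      .attr("width", CELL_W)
      .attr("height", CELL_H);

  return svg;
}
```

Because no fill is set, every rect would render in the SVG default of black; choosing the fill from the data is the part that remains to be worked out.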

The next issue to solve is that we don’t want any of the categorical data in the heatmap. Here is a function that can help you filter it out.

 
function objToFilteredArray(d, unwanted){
	// convert object to array of associative pairs
	// {'A':50, 'B':12} -> [{'key':'A', 'value':50}, {'key':'B', 'value':12}]
	var ar = d3.entries(d);
	
	// remove the pairs corresponding to the categorical values
	// we do this by filtering and removing pairs that have a key in the unwanted list
	return ar.filter(function(v){ return unwanted.indexOf(v.key) === -1; });
}

This takes in an object and returns the key-value pair array, with the unwanted keys filtered out. Copy this function into your code and then replace the d3.entries(d) in the table code with objToFilteredArray(d, ['studio']). You should find that you have almost the same thing you had before, but the studio column is missing. You can add as many attributes to the list as you want.

The next piece to work out is the actual colors. The data is unscaled, so you will want to scale the values to color them properly. My recommendation is to create a collection of different color scales, one per attribute, stored in an object so the scales are keyed to the particular attribute of the data. In other words, if the scales are held in an object called colorScales, then you should be able to get the scale associated with the budget with colorScales['budget']. I created mine using a for loop that started for (prop in data[0]). This will iterate through every attribute of the data set, putting the name in the prop variable. Then just create the color scale as normal, using the extent of that particular attribute.
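One way the loop described above could be organized is sketched below. It is not necessarily how the reference solution does it: the names numericExtent, buildColorScales, and makeScale are assumptions, and the scale factory is passed in as a parameter so the looping logic can be tried without D3.

```javascript
// [min, max] of one attribute, coercing the string values to numbers first.
function numericExtent(data, prop) {
  var values = data.map(function (d) { return +d[prop]; });
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

// One scale per attribute, keyed by attribute name. Attributes listed in
// `unwanted` (the categorical ones) are skipped.
function buildColorScales(data, unwanted, makeScale) {
  var colorScales = {};
  for (var prop in data[0]) {
    if (unwanted.indexOf(prop) !== -1) { continue; }
    colorScales[prop] = makeScale(numericExtent(data, prop));
  }
  return colorScales;
}

// In the page, assuming D3 version 3 is loaded, the factory might be:
// function makeScale(extent) {
//   return d3.scale.linear().domain(extent).range(["white", "steelblue"]);
// }
// var colorScales = buildColorScales(data, ["film", "studio", "genre"], makeScale);
// colorScales["budget"](150) would then give the color for a budget of 150.
```

The coercion with + matters because d3.extent compares values in natural order, so string values would be compared as strings. In D3 version 4 and later the scale constructor is spelled d3.scaleLinear() instead of d3.scale.linear(). The two colors in the range are placeholders.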

For the final piece, add some labels. Make sure that we can see which movie is which and which attributes are in the columns.

Turning in your work

Hand this in as a single HTML file called username1+username2_hw6.html (if you need multiple files, you are welcome to use a zip file instead). Please add appropriate HTML text including your name, date, and the assignment. Yes, the solutions to part one should be written as HTML in the same file. You are welcome to include additional explanatory text as you feel appropriate.

I would like this document turned in to the DROPBOX on MiddFiles. You can refer to the reference on the LIS wiki to help you connect to the file server.