CS465 - Assignment seven

Due: 2016-04-20 01:45p

Objectives

Build some multivariate visualizations in D3

Part 0: Plan your final project

I would like you to find a partner (your choice this time) and start thinking about your final projects. The projects should be a step up in complexity from the work we have been doing in the assignments. While you can start from one of our assignments or one of the examples I gave you, they should not just be variations (e.g., a scatterplot of English census data). They should be interactive and make use of a technique such as dynamic queries or brushed histograms to permit exploration of some interesting data set. Pick something that you are interested in and either want to know more about or want to communicate to others. Alternatively, you could make a general purpose visualization that can load other people’s data on demand. Obviously, my preference is for the work to be done in D3, but if you make a compelling argument for something else, I will listen.

It is worth noting that while we have talked a lot about visualizations from a theoretical standpoint, we have been using a fairly small set of visualization forms. They are fundamental forms that most designers keep coming back to, but we will be talking about more exotic visual representations and techniques soon. If you want a little bit of inspiration, flip through the book, visit the visualization zoo or browse the D3 demo collection.

If you want to get into things like text visualization or geo-visualization, come talk with me first, since you will need to start on the project before we actually get to those topics in class.

Write a short proposal and email it to me by April 18th.

Part 1: [15 points] Draw a stacked bar chart

Yet another layout in the D3 library is the stack layout, which allows you to create visualizations with layers. Read its documentation to figure out what it does and how it works.

For this one, we will deal with some synthetic data. Here is a function that randomly generates some fake sales data for different regions over time. If you want to have a different conceptual model of what the data represents, that is fine (and if you have some real data that is appropriate for this form, even better).


var regions = ['East', 'West', 'North', 'South'];
var years = d3.range(2005, 2015);
var rand = d3.random.normal(35, 20);
var data = regions.map(function(region){
  var profits = years.map(function(year){
    return {x: year, y: Math.max(0, rand())};
  });
  return {region: region, profits: profits};
});

Other than the stacking, this should be a fairly conventional bar chart. Make sure to include a legend of what the various layers are.
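
To make the stacking concrete, here is a rough plain-JavaScript sketch of what a zero-baseline stack works out: for each year, every layer gets a y0 equal to the sum of the y values of the layers beneath it. The stackLayers name and the sample numbers are made up for illustration and are not part of D3; in your chart the stack layout should be doing this work for you.

```javascript
// Hypothetical helper (not part of D3): adds a y0 baseline to every point.
// Assumes each layer has a `profits` array of {x, y} in the same year order.
function stackLayers(layers) {
  var running = []; // running[i] = total height stacked so far at index i
  return layers.map(function(layer) {
    var profits = layer.profits.map(function(p, i) {
      var y0 = running[i] || 0;
      running[i] = y0 + p.y;
      return {x: p.x, y: p.y, y0: y0};
    });
    return {region: layer.region, profits: profits};
  });
}

// Toy values chosen by hand so the baselines are easy to check.
var sample = [
  {region: 'East',  profits: [{x: 2005, y: 10}, {x: 2006, y: 20}]},
  {region: 'West',  profits: [{x: 2005, y: 5},  {x: 2006, y: 0}]},
  {region: 'North', profits: [{x: 2005, y: 7},  {x: 2006, y: 3}]}
];
var stacked = stackLayers(sample);
```

Each bar segment then spans from y0 to y0 + y, so the top of the tallest stack is what the upper end of your y scale needs to cover.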

To up the complexity slightly, I would like you to add some interaction. When someone clicks on a layer, the graph should reorder the layers to put the clicked layer on the bottom. The other layers should remain in the same relative order.
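
The reordering rule itself is independent of the drawing code. A minimal sketch, assuming the first layer in the array is the one drawn at the bottom (moveToBottom is a hypothetical helper name, not a D3 function):

```javascript
// Hypothetical helper: returns a new array with the clicked layer first
// (assumed to be the bottom of the stack) and the rest in their old order.
function moveToBottom(layers, clickedRegion) {
  var clicked = layers.filter(function(l) { return l.region === clickedRegion; });
  var others = layers.filter(function(l) { return l.region !== clickedRegion; });
  return clicked.concat(others);
}

var layers = [{region: 'East'}, {region: 'West'}, {region: 'North'}, {region: 'South'}];
var reordered = moveToBottom(layers, 'North').map(function(l) { return l.region; });
// reordered is ['North', 'East', 'West', 'South']
```

After reordering, the baselines have to be recomputed from the new order before the bars are redrawn.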

Save this work in a file called hw7_barchart.html.

Part 2: [15 points] Build a heat map

I would like you to build a heatmap showing the attributes in films50.json. This contains data about 50 recent films. We have data about the film’s name, genre, studio, release year, as well as some ratings data and data about the budget and the film’s take. Here is an example record from the data:


{
  "film": "Ratatouille",
  "studio": "Disney",
  "year": "2007",
  "rt_rating": "97",
  "aud_rating": "84",
  "genre": "Animation",
  "openingtheatres": "3940",
  "bo_average_opening": "11935",
  "d_gross": "206.45",
  "f_gross": "414.98",
  "ww_gross": "621.42",
  "budget": "150.00",
  "profit": "414",
  "opening": "47"
}

Don’t be put off by the fact that this is a json file. Just change the d3.csv to d3.json and you will never know the difference.

You need to use this data to create a block of rectangles, with each row corresponding to a single film. My advice is to create a g for each row and populate it with rects for each individual value. This is a little different from what we have done with scatter plots and bar charts where we had one glyph per “row” of the data.

To get you started on how to do this, here is some code that loads the data into a conventional HTML table (try it).


var table = d3.select("body").append("table");

var tr = table.selectAll("tr")
    .data(data)
    .enter().append("tr");

var td = tr.selectAll("td")
    .data(function(d) { return d3.entries(d); })
    .enter().append("td")
    .text(function(d) { return d.key + ": " + d.value; });

The first two parts should be fairly familiar territory. First we add a table to the page. Then we add rows (tr is a table row in HTML for the novices), binding them to the data items.

Things get more interesting in the final part. We are selecting all of the tds (table cells) within each of the trs. Rather than passing in data again (which would bind a full object to each cell), we pass in a function. This function is called on the current tr and is passed the tr's data object. We use a d3 function, d3.entries(), which takes an object and turns it into an array of key-value pairs. For example, d3.entries({'A':50, 'B':12}) -> [{'key':'A', 'value':50}, {'key':'B', 'value':12}].

This array is then used by the data() function to bind things to the tds. Thus, each td gets just one key-value pair, which we load as the text into the td.

Of course, you will not be implementing the heatmap as a table (though you could). Instead, use rows of rects inside of a dedicated g for each row. Otherwise, the structure will remain the same.

The next issue to solve is that we don’t want any of the categorical data in the heatmap. Here is a function that can help you filter them out.

 
function objToFilteredArray(d, unwanted){
  // convert object to array of associative pairs
  // {'A':50, 'B':12} -> [{'key':'A', 'value':50}, {'key':'B', 'value':12}]
  var ar = d3.entries(d);

  // remove the pairs corresponding to the categorical values
  // we do this by filtering and removing pairs that have a key in the unwanted list
  return ar.filter(function(v){ return unwanted.indexOf(v.key) === -1; });
}

This takes in an object and returns the key value pair array, with the unwanted keys filtered out. Copy this function into your code and then replace the d3.entries(d) in the table code with objToFilteredArray(d, ['studio']). You should find that you have almost the same thing you had before, but the studio column is missing. You can add as many attributes to the list as you want.
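
As a quick check of what a longer list does, here is a small sketch that runs without D3 loaded. The entries function is a stand-in I wrote for d3.entries, filteredEntries mirrors objToFilteredArray on top of it, and the choice of which keys to treat as categorical is only a guess for illustration.

```javascript
// Stand-in for d3.entries so the sketch runs without D3 loaded.
function entries(obj) {
  return Object.keys(obj).map(function(k) { return {key: k, value: obj[k]}; });
}

// Same idea as objToFilteredArray, built on the stand-in above.
function filteredEntries(d, unwanted) {
  return entries(d).filter(function(v) { return unwanted.indexOf(v.key) === -1; });
}

// A shortened version of the example record.
var film = {film: 'Ratatouille', studio: 'Disney', year: '2007',
            rt_rating: '97', genre: 'Animation', budget: '150.00'};
var categorical = ['film', 'studio', 'genre']; // assumed list, adjust as needed
var keys = filteredEntries(film, categorical).map(function(e) { return e.key; });
// keys is ['year', 'rt_rating', 'budget']
```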

The next piece to work out is the actual colors. The data is unscaled, so you will want to scale the values to color them properly. My recommendation is to create a collection of different color scales, one per attribute, stored in an object so the scales are keyed to the particular attribute of the data. In other words, if the scales are held in an object called colorScales, then you should be able to get the scale associated with the budget with colorScales['budget']. I created mine using a for loop that started for (prop in data[0]). This will iterate through every attribute of the data set, putting the name in the prop variable. Then just create the color scale as normal, using the extent of that particular attribute.
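
One way the per-attribute collection could be organized is sketched below in plain JavaScript. Everything here is an assumption for illustration: buildScales and makeScale are hypothetical names, the three films are toy values, and each "scale" only maps a value to a number between 0 and 1, where a D3 linear scale with a color range would take its place in your code. Note that the values in the JSON are strings, so they are coerced to numbers before finding the minimum and maximum.

```javascript
// Hypothetical sketch: one scale per attribute, keyed by attribute name.
function buildScales(data, unwanted) {
  var scales = {};
  for (var prop in data[0]) {
    if (unwanted.indexOf(prop) !== -1) { continue; }
    scales[prop] = makeScale(data, prop);
  }
  return scales;
}

// Returns a function mapping a raw value of `prop` to a number in [0, 1].
function makeScale(data, prop) {
  // The JSON stores numbers as strings, so coerce with unary + first.
  var values = data.map(function(d) { return +d[prop]; });
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  return function(v) {
    // If every film has the same value, fall back to the midpoint.
    return max === min ? 0.5 : (+v - min) / (max - min);
  };
}

// Toy records, not taken from films50.json.
var films = [
  {film: 'A', budget: '50.00',  rt_rating: '97'},
  {film: 'B', budget: '150.00', rt_rating: '47'},
  {film: 'C', budget: '100.00', rt_rating: '72'}
];
var colorScales = buildScales(films, ['film']);
```

With this in place, colorScales['budget'] and colorScales['rt_rating'] are independent, so a large budget and a high rating both land at the strong end of their own column.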

Once the table has been created, add some labels. Make sure that we can see which movie is which and which attributes are in the columns.

Finally, when you click on an attribute label, the heatmap should re-sort its rows based on that attribute.
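
The ordering step on its own might look like the sketch below. sortByAttribute is a hypothetical helper, the films are toy values, and sorting largest-first is my assumption since the assignment does not say which direction to use.

```javascript
// Hypothetical helper: returns a sorted copy of the films, largest value first.
// Values are stored as strings in the JSON, so coerce to numbers to compare.
function sortByAttribute(data, attr) {
  return data.slice().sort(function(a, b) { return +b[attr] - +a[attr]; });
}

// Toy records, not taken from films50.json.
var films = [
  {film: 'A', budget: '50.00',  rt_rating: '97'},
  {film: 'B', budget: '150.00', rt_rating: '47'},
  {film: 'C', budget: '100.00', rt_rating: '72'}
];
var byBudget = sortByAttribute(films, 'budget').map(function(d) { return d.film; });
var byRating = sortByAttribute(films, 'rt_rating').map(function(d) { return d.film; });
// byBudget is ['B', 'C', 'A'] and byRating is ['A', 'C', 'B']
```

The sorted array still has to be turned into new row positions; how you do that (rebinding the data, or recomputing each row's vertical offset) is up to you.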

Call this file hw7_heatmap.html.

Logistics

We will return to working in pairs for this assignment. Again, you should be working pair programming style, not splitting up the work. Let me know if you are having trouble with your partner assignment.

Please call your HTML file hw7.html and submit it on Moodle.