CS 465 - Tutorial Six

Published

November 5, 2025

Due
November 12, 2025 at 11:59 PM

Objectives

  • Learn how to create a scrolling narrative visualization
  • Learn how to build a histogram in D3
  • Learn some more patterns for working with transitions and making flexible visualizations in D3

Hold on to your hats folks, there is a lot going on in here…

Getting started

  1. Create the git repository for your tutorial by accepting the assignment from GitHub Classroom. This will create a new repository for you with a bare bones npm package already set up for you.

  2. Clone the repository to your computer with git clone (get the name of the repository from GitHub).

  3. Open the directory with VSCode. You should see all of the files down the panel on the left in the Explorer.

  4. In the VSCode terminal, type pnpm install. This will install all of the necessary packages.

  5. In the terminal, type pnpm dev to start the development server.

What are we building?

We are building a very short narrative about Radiohead. Why Radiohead? I happen to like Radiohead and someone took the time to put together a dataset with data about every one of the songs on their studio albums (and he wrote a blog about it).

The data looks like this:

Column name         Interpretation
track_name          track name
album_name          album name
album_release_year  album release year
album_img           link to cover art
lyrics              full lyrics of the song
duration_ms         duration in ms
valence             Spotify ranking of the sound’s affect. High valence is happier, low is sad. [0-1]
pct_sad             % of the words in the lyrics that are classified as “sad”
word_count          number of words in the lyrics
lyrical_density     rough measure of importance of lyrics to the song – word count/duration [0-1]
gloom_index         measure of track’s “gloominess”, with lower scores indicating increased gloom [1-100]

As you will have gathered, we are focusing on a scrolling, author-controlled approach. It is certainly not the only way to do narrative visualization, but it is a popular one that is fairly effective for presenting a narrative that doesn’t feel like a PowerPoint slide deck.

For background reading, I suggest Mike Bostock’s How to Scroll article. It doesn’t cover the actual technical implementation details; instead, it discusses how to design a good scrolling interface.

Understanding index.md

I have already written the narrative for you. Fire up the dev server in the usual way and take a look at the site and the index.md that generates it. The first thing I want you to notice is that there is a visualization on the page and the text scrolls over it. In index.md I want you to notice that we are not using pure Markdown in there. I have broken out the HTML and added some <div> tags.

The overall structure looks like this:

<div id="scrollytelling-container">

  <div id="vis-container"> visualization in here </div>

  <div id="steps-container">
    <div class="step"> text build here </div>
    <div class="step"> text build here </div>
    <div class="step"> text build here </div>
  </div>
</div>

The visualization (in “vis-container”) has been styled to have its position be “sticky”, which means that it will stay fixed on the page while the container around it is visible. The steps have a relative position and a higher z-index (depth on the page) so they scroll over the visualization container. All of the steps (other than the first one) also have a large top margin, which creates a good amount of space between them.

Scrollama and Observable Framework

To build our scroller, we are going to use a new library called Scrollama, which was designed specifically for this kind of interface. The library installs a tracker that allows us to know where we are on the page. It works in terms of “steps”. We set up some kind of repeating element and it will inform us of when we have moved to each one in turn as we scroll down the page. The library can also be configured to provide a “progress” value that ranges between 0 and 1 in between the transitions, but we don’t need that for this tutorial (see the docs for more information).

I’ve already done all of the work for you there since there isn’t a lot of meat to this part. You will find the result in src/components/scroller.js. Let’s look at how we set up the scroll tracker:

const scroller = scrollama();
scroller
  .setup({
    step: ".step",  // Targets your narrative blocks in index.md
    offset: 0.2,    // Trigger when a step crosses a line 20% of the way down the viewport
  })
  .onStepEnter((status) => {
    notify(status.index);
  });

There are two pieces in here. The first is the initial setup. This just takes in an object with some options. The most critical is the step option. We are telling the scroller what kind of object it should consider to be a “step”. This uses the same CSS selectors we have been using. This is saying each time an element with the class “step” is encountered, it will be considered one step.

Then, we register a handler for onStepEnter that will fire when we transition to a new step. The handler receives an argument that contains element (the DOM element that constitutes the step), index (which step this is - starting at 0) and direction (“up” or “down”).

The notify is not part of Scrollama – that is the Observable piece. One of the things that we have not spent much time dealing with is that Observable Framework is reactive like the Observable Notebooks I use for our demos. That means that we don’t have to write lots of event listeners to connect up user interfaces. Instead, when a value is updated, then any cell on the page that uses that value is re-run automatically. So, if we have a user input where a user types in new values, we could have our visualization update automatically. In our case, we have a process that will occasionally generate new values as the user scrolls.

We are going to use Framework’s Generators to watch the values generated by our scroller. Again, I have put this in place for you. In index.md you will find this line:

const position = Generators.observe(scrollWatcher);

The observe function takes in a function (our scroll watching function in this instance) and calls it, passing a function to call if anything interesting happens. This is the notify function we saw in scrollWatcher. Whenever scrollWatcher calls notify with a value, that value will be assigned to position. We are passing in a simple number, but we can call notify with whatever we like. Notice that I declared position as a const. That is to reinforce that we are not just changing the value of a variable, we are re-executing the block it comes from. When that change happens, any other blocks that refer to that value will be re-run as well.
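To make the hand-off between the scroller and position concrete, here is a minimal sketch in plain JavaScript. Both observeSketch and fakeScrollWatcher are hypothetical stand-ins written for this illustration. They are not Framework's Generators.observe or the provided scrollWatcher; they only mimic the "call me back with each new value" contract described above.

```javascript
// Hypothetical stand-in for Generators.observe: it calls `initialize`
// with a `notify` callback and records every value that gets reported.
// (The real observe exposes the values reactively rather than in an array.)
function observeSketch(initialize) {
  const seen = [];
  const notify = (value) => {
    seen.push(value);
  };
  initialize(notify);
  return seen;
}

// Hypothetical stand-in for the scroll watcher: instead of listening to
// real scroll events, it pretends the reader entered steps 0, 1, and 2.
function fakeScrollWatcher(notify) {
  for (const stepIndex of [0, 1, 2]) {
    notify(stepIndex);
  }
}

// The values that `position` would take on, in order.
const positions = observeSketch(fakeScrollWatcher);
console.log(positions);
```

The point of the pattern is that the watcher never needs to know who is listening. It just reports values, and whatever depends on them reacts.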

Find the “vis-container” div in index.md and add the following inside of the div:

Current step: ${position}

Now as you scroll, you should see the value update. You can leave that in for the moment for debugging, but remove it before you turn this in.

Visualizations

Now we can worry about the visualizations. Our narrative just has four sections, so we need the following four visualizations (in order).

  • a bar chart showing the average gloom index for each album
  • a histogram showing the distribution of song valence
  • a histogram showing the distribution of songs based on percentage of sad lyrics
  • a histogram showing the distribution of songs based on the gloom index

To make our narrative more visually interesting, we will use animation to transition between the visualizations.

The visualizations are all controlled by updateVisualization. If you look at the bottom of index.md, you will see the call. Because it uses position as an argument, Framework will re-execute the cell when position changes. Go to components/visualizations.js to see the (partial) implementation.

The first thing the code does is make sure that the SVG is properly configured. The next thing it does is it creates two <g> tags – one for the bar chart, and one for the histogram. Think of these like layers. We will use opacity to show only one of these at a time.

Why only one histogram?

Aren’t we making three of them? Yes, we are, but we are going to reuse the same histogram and just update its values when we look at different distributions.

Why does this use the data/join pattern?

This function will be run multiple times as the user scrolls. It is important that we have both of those layers, but we don’t want to just keep making more and more layers. Doing the binding makes sure we only have those two layers. We could also just do a selection for each and check to see if it is empty, but this allows you to see data/join some more.

The final piece that is in there is

  barchart.call(drawBarchart, data, options);

which as you probably are imagining, actually draws the bar chart with code I have provided.

Reacting to the position

Now it is time to start adding some code to make the visualization react to the current position. At the bottom of the function, you want to add a conditional statement that will do different things depending on which of our four positions we are at. Now, you could just break out the if-else statement at this point. However, I am going to encourage you to use a switch statement here. Many languages have a switch statement, but I suspect it is not one of the first things you think to reach for. Switch statements are tailor-made for situations where you have a single expression (like a variable) and you want to specify different behaviors based on the different possible values of the expression.

Add the switch statement at the bottom of the function. For each section (0-3), use console.log to print out what the section should do (e.g., “gloom bar chart”, “valence histogram”, etc…). Scroll through the narrative and make sure it is displaying the correct text in the console (remember that you will need to open your developer tools in your browser to see it).
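If you have not written a switch statement in JavaScript recently, the following sketch shows the general shape. The function name logSection and its log parameter are made up for this illustration; in the tutorial the switch lives directly at the bottom of updateVisualization and switches on the position it receives.

```javascript
// Hypothetical helper illustrating the shape of a switch statement.
// `log` defaults to console.log but can be swapped out for testing.
function logSection(position, log = console.log) {
  switch (position) {
    case 0:
      log("gloom bar chart");
      break; // without break, execution falls through into the next case
    case 1:
      log("valence histogram");
      break;
    case 2:
      log("sad lyrics histogram");
      break;
    case 3:
      log("gloom histogram");
      break;
    default:
      // Runs when no case matches; handy for catching unexpected values.
      log(`unexpected position: ${position}`);
  }
}

logSection(1);
```

The default branch is optional, but it makes a mismatch between the number of steps in index.md and the number of cases easy to spot.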

Once you have things set up properly, move the line that draws the bar chart into your handler for section 0. This will… not really change much. We still only have one visualization, and it won’t go away because we are calling this function again (unless we explicitly cleared the contents of the SVG every time, which we aren’t going to do). We do know, however, that when we are in section 1, we don’t want to see it. So, in the section 1 handler, use the barchart variable and set the opacity to 0. Test it out – you should see the bar chart disappear when you scroll down.

Of course, when you scroll back up, it won’t reappear… So, add a line for section 0 that sets the opacity to 1. Now the visualization should toggle on and off as you scroll between sections. To make it look a little better, use what you learned in the last tutorial to add a transition to each change. For the duration, use the speed variable that came in with the options. Now the visualization should fade in and out.

Add a histogram

The second visualization that we are going to see is a histogram showing the distribution of valences for the songs in our dataset.

We actually will end up with three different histograms. Instead of fading between them using opacity, we are going to reuse the same elements and just adjust them to suit the different variables we are displaying. This will give us an effect you have probably seen before of the bars growing and shrinking as we transition between variables.

Creating the histograms

It should be obvious that we are going to use a little abstraction so we don’t actually code three separate histograms. In truth, we are going to go a little farther than that. We are going to make a single reconfigurable histogram so that as we transition from one step of the story to the next we are just updating it.

Under updateVisualization you will find the drawHistogram function, which I’ve started for you.

Like the drawBarchart function, this takes in a g that we will use as the base of the visualization. It also takes in the data and size. In addition, it takes in the metric we would like to look at, the title we would like to put on the graph, and the speed to use for any transitions.

The idea is that we will write this function so it can be called multiple times. If we call it with a different metric to look at, it should update the chart to reflect the change. As such, we want to be careful not to add things to the g multiple times.

Create the x scale

Creating the x scale is pretty straightforward. For our histogram, we are going to use a linear scale. Our data has been processed so that all of the values we will be looking at fit comfortably between 0 and 100.

So the x scale is just

const x = d3.scaleLinear()
  .domain([0,100]) // we've normalized the data to fit in this scale
  .range([margin.left, width-margin.right])
  .nice();

Create the bins

For the y scale, things are a little more complicated because we want to make a histogram. We’ve not yet made a histogram in D3. Recall that for our histogram, we need to “bin” the data, and the height of the bar is the number of items in each bin.

As you might imagine, there is a bin generator tool in D3.

const makeBins = d3.bin()
  .value((d) => +d[metric])
  .domain(x.domain())
  .thresholds(20);

const bins = makeBins(data);

The value method tells the generator which value to look at for binning purposes. The domain and the thresholds determine where the breaks will be made for the bins. In general, a threshold count is only a suggestion, but for our data, asking for 20 thresholds nicely gives us a break at every multiple of five.

When we call the generator with our data on the last line, we get back an array of bins. Each bin is itself an array of the binned items, so the length of the bin is the value we want. In addition, each bin has two attributes: x0, the lower bound of the bin (inclusive), and x1, the upper bound of the bin (exclusive, except for the last bin).
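To get a feel for the shape of the result without running D3, here is a hand-rolled sketch of fixed-width binning. The function binSketch and the sample values are made up for this illustration. It mimics the structure described above (arrays that carry x0 and x1), but it is not D3's implementation and it skips everything D3 does to choose nice thresholds.

```javascript
// Hand-rolled illustration of fixed-width binning (not D3's code).
// Each bin is an array of values with x0 and x1 attached as properties.
function binSketch(values, [lo, hi], binWidth) {
  const count = Math.ceil((hi - lo) / binWidth);
  const bins = Array.from({ length: count }, (_, i) => {
    const bin = [];
    bin.x0 = lo + i * binWidth;
    bin.x1 = Math.min(hi, bin.x0 + binWidth);
    return bin;
  });
  for (const v of values) {
    if (v < lo || v > hi) continue; // ignore values outside the domain
    // Upper bounds are exclusive, except that `hi` lands in the last bin.
    const i = Math.min(count - 1, Math.floor((v - lo) / binWidth));
    bins[i].push(v);
  }
  return bins;
}

// Made-up sample values on the 0-100 scale, binned in steps of 25.
const sampleBins = binSketch([3, 12, 30, 47, 49, 60, 88, 100], [0, 100], 25);
for (const bin of sampleBins) {
  console.log(`x0=${bin.x0} x1=${bin.x1} holds ${bin.length} value(s)`);
}
```

Because each bin is an array, bin.length is the bar height, and x0 and x1 give the left and right edges of the bar.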

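An important thing to realize is that the generator returns a bin for every interval, including intervals that contain no items (those bins are simply empty arrays with a length of 0). With a fixed domain and threshold count, every metric therefore produces the same set of intervals. If the domain or thresholds were to vary, the set of bins would vary with them, which is worth remembering for data binding.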

Create the y scale

Once we have the bins, we can make the y scale. There is nothing particularly new about this. The range is the same one we have used many times before, and for the domain we are just looking for the bin with the most items in it.

const y = d3.scaleLinear()
  .domain([0, d3.max(bins, (d) => d.length)])
  .range([height - margin.bottom, margin.top])
  .nice();
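If the flipped range still feels backwards, the arithmetic can be checked by hand. The sketch below uses a tiny stand-in for a linear scale (linearSketch, a hypothetical helper with no clamping and no nice()) and assumed numbers: a 400px tall SVG, 20px margins, and a tallest bin of 30 songs.

```javascript
// Minimal stand-in for a linear scale; an illustration of the mapping
// only, not d3.scaleLinear itself.
function linearSketch([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Assumed numbers for illustration only.
const sketchHeight = 400;
const sketchMargin = { top: 20, bottom: 20 };
const ySketch = linearSketch(
  [0, 30],
  [sketchHeight - sketchMargin.bottom, sketchMargin.top]
);

// SVG y grows downward, so a count of 0 maps to the bottom of the chart
// and larger counts map to smaller y values.
const baseline = ySketch(0);
const barTop = ySketch(12);          // top edge of a bar holding 12 songs
const barHeight = baseline - barTop; // distance from the top edge to the baseline
console.log({ baseline, barTop, barHeight });
```

This is why the bars use y(d.length) for the y attribute and y(0) - y(d.length) for the height.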

Making the bars

At this point, the process for making bars with rect elements should be fairly familiar.

There are some subtleties here, however.

We are going to use bins as the data source.

g.selectAll(".bin")
  .data(bins)

As such, we will use x0 and x1 for the x and width attributes of the bars, and the length of the bin for the height. To make our histogram look a little nicer, we will inset each bar by one pixel on both sides so that neighboring bars have a small gap between them.

We are also going to animate the bar creation. To do that, we will start each bar at the baseline (y set to y(0)) with a height of 0, and then add a transition to grow the bars to the correct values.


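enter => enter
  .append("rect")
  .attr("x", (d) => x(d.x0) + 1)                 // inset the left edge by 1px
  .attr("y", y(0))                               // start at the baseline
  .attr("width", (d) => x(d.x1) - x(d.x0) - 2)   // inset both sides, so 2px narrower
  .attr("height", 0)                             // start with no height
  .attr("class", "bin")
  .style("fill", "currentColor")
  .transition()
    .duration(speed)
    .attr("y", (d) => y(d.length))               // grow up to the final top edge
    .attr("height", (d) => y(0) - y(d.length))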
As we transition between histograms, what we are actually doing is swapping out the data. So, we need to include an update, which will just transition the bars to their new values.

    update=>update
      .transition()
      .duration(speed)
      .attr("y", d=>y(d.length))
      .attr("height", d=>y(0) - y(d.length))

Finally, the set of bins is not guaranteed to stay the same from one histogram to the next (for example, if the domain or thresholds change). So, we need to handle bins that disappear with an exit selection.

    exit=>exit
      .transition()
        .duration(speed)
        .attr("y", y(0))
        .attr("height", 0)
        .remove()

We also need to help D3 figure out which bars to remove, so we need to add a key function to the data function.

Add a second argument to data, and write d=>d.x0. Since all of our variables are on the same scale, using the x0 value of the bin should make sure we are always talking about the same bar.
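To see what the key function buys us, here is a simplified model of a keyed join in plain JavaScript. The function keyedJoinSketch and the sample bins are made up for this illustration, and it is not how D3 implements the join internally. It only shows how matching on a key sorts items into enter, update, and exit groups.

```javascript
// Simplified model of a keyed data join: compare the keys already on
// screen with the keys in the new data. Illustration only, not D3's code.
function keyedJoinSketch(oldData, newData, key) {
  const oldKeys = new Set(oldData.map((d) => key(d)));
  const newKeys = new Set(newData.map((d) => key(d)));
  return {
    enter: newData.filter((d) => !oldKeys.has(key(d))), // need new rects
    update: newData.filter((d) => oldKeys.has(key(d))), // reuse existing rects
    exit: oldData.filter((d) => !newKeys.has(key(d))),  // rects to remove
  };
}

// Made-up bins, reduced to the fields that matter here.
const binsBefore = [{ x0: 0, count: 4 }, { x0: 5, count: 9 }, { x0: 10, count: 2 }];
const binsAfter = [{ x0: 5, count: 3 }, { x0: 10, count: 7 }, { x0: 15, count: 1 }];

const joined = keyedJoinSketch(binsBefore, binsAfter, (d) => d.x0);
console.log("enter:", joined.enter.map((d) => d.x0));
console.log("update:", joined.update.map((d) => d.x0));
console.log("exit:", joined.exit.map((d) => d.x0));
```

Without a key, the join matches by position in the array, so the first bin in the new data would be paired with the first bar on screen whether or not they cover the same interval.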

Add the x axis

Our usual process for creating an axis is

  • add a new g
  • translate it to the right location
  • call the appropriate axis function on it

Our problem now is that we want to call drawHistogram multiple times. If we follow our old pattern, this will add a new axis into the graph every time we redraw.

So, we are going to follow a different pattern:

  • try to select the existing axis
  • if we can’t find it,
    • add a new g
    • provide it with an id
    • move it to the right location
    • call the appropriate axis function on it

This is what that looks like in code:

let xAxis = g.select("#x-axis");
if (xAxis.empty()) {
  xAxis = g.append("g")
    .attr("id", "x-axis")
    .attr("transform", `translate(0, ${height - margin.bottom})`)
    .call(d3.axisBottom(x));
}

Add the y axis

You will do essentially the same thing for the y axis (with the appropriate changes to make it the y axis, of course).

The difference, however, is that the y axis will change with each of our metrics because we are changing the y scale.

So, move the call that actually populates the axis after the conditional where you create the g and position it. If you add a transition before the call, the axis will dynamically resize.

Add the label

To add the title to the graph, you can start by copying the code from the bar chart. However, since it will change, use the same concept as you used for the y axis to only create and set up the title once, and then set the text of the title separately so that it updates with the variable.

Add some styling

To give everything a consistent look and feel, add this code:

  g.selectAll("text")
  .style("font-size","12px")
  .style("fill", "currentColor")

This will style the title as well as all of the axis labels.

Add the histogram to the narrative

When the user has scrolled to section 1, we would like to see a histogram of song valence. So add a call to draw the histogram:

histogram.call(drawHistogram, data, "valence", "Valence", options)

Now, when the page loads you should see the bar chart. Then as you scroll down, the histogram will fade in and you should see the bars grow into place. But what happens when you scroll back up?

Oops. That is pretty ugly – the two visualizations are overlapping and we can’t see anything. To fix this, in section 0 fade the histogram’s opacity down to 0.

Add the remaining histograms

Now that you have done all of the hard work, the final piece takes very little work for a proportionally large payoff.

For section 2, draw the histogram using pct_sad as the metric, with the title “Percentage of the song with sad lyrics”.

For section 3, draw the histogram for gloom_index with the title “Gloom Index”.

As you scroll through the last three sections, the graph should animate between the three metrics.

Test it out! You should now have a scrolly narrative structure that you can use for your projects.

Final thoughts

This is certainly not the only way to write a narrative visualization, nor is it even the only way to create a scrollytelling interface. My goal here was to demonstrate two basic approaches - the fade between two completely different graphs and the dynamic reconfiguration from one graph to another.

If the amount you want to do in each step scales up, you might contemplate a different design where each step is handled in its own function. You could store all of the functions in an array and just use the current section to look up which function to call.
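A minimal sketch of that array-of-functions idea follows. The names stepHandlers and runStep are hypothetical, and the handlers just return strings here; in a real page each one would draw or update a chart.

```javascript
// Hypothetical step handlers, one per section of the narrative.
const stepHandlers = [
  () => "show gloom bar chart",
  () => "show valence histogram",
  () => "show sad lyrics histogram",
  () => "show gloom histogram",
];

// Look up the handler for the current position and call it, if there is one.
function runStep(position) {
  const handler = stepHandlers[position];
  return handler ? handler() : undefined;
}

console.log(runStep(2));
```

Adding a section to the story then means adding one function to the array rather than another branch to a growing conditional.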

You might also change the granularity on the events and share the offset from the start of the section, using the offset to control the animation.
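As a rough sketch of what "using the offset to control the animation" could look like, the helper below turns a 0 to 1 progress value into an opacity. The function opacityForProgress is made up for this illustration. It assumes you have enabled the progress reporting mentioned earlier and are receiving the value in a progress callback (see the Scrollama docs for how to set that up).

```javascript
// Hypothetical helper: fade in over the first half of a step, then stay
// fully visible for the second half.
function opacityForProgress(progress) {
  const clamped = Math.min(1, Math.max(0, progress)); // guard against stray values
  return Math.min(1, clamped * 2);
}

for (const p of [0, 0.25, 0.5, 1]) {
  console.log(`progress ${p} gives opacity ${opacityForProgress(p)}`);
}
```

With this approach the reader's scroll position drives the animation directly, instead of a timed transition firing once at each step boundary.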

My hope is that you can see this as providing the general overall approach that you can then bend to whatever form works for you rather than a prescription that you need to copy.

For some other examples of scrollytelling, check out the examples on the Scrollama page.

Reflection

Before you submit, make sure to answer the questions in reflection.md.

Submitting your work