What is a RadViz?

RadViz is a non-linear visualization technique that projects data with three or more attributes onto a two-dimensional plane. The attributes are represented by equally spaced points along the perimeter of a circle, and each data instance is positioned inside the circle according to the relative proportion of each attribute in its values (for the math, see the Brunsdon, Fotheringham, and Charlton entry in the bibliography). RadViz is effective at revealing clusters in a high-dimensional dataset.
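The positioning rule can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not necessarily the code used in our tool. The function names `anchorPositions` and `radvizProject` are hypothetical, and the sketch assumes that every value in a row has already been scaled to the range [0, 1].

```javascript
// Place m attribute points at equal angles on the unit circle.
function anchorPositions(m) {
  return Array.from({ length: m }, (_, j) => {
    const angle = (2 * Math.PI * j) / m;
    return [Math.cos(angle), Math.sin(angle)];
  });
}

// Project one row of values (assumed to be scaled to [0, 1]) into the circle.
// Each attribute pulls the point toward its own position on the perimeter,
// weighted by that attribute's share of the row total.
function radvizProject(row) {
  const anchors = anchorPositions(row.length);
  const total = row.reduce((sum, v) => sum + v, 0);
  if (total === 0) return [0, 0]; // no pull from any attribute: stay at the center
  let x = 0;
  let y = 0;
  row.forEach((v, j) => {
    x += (v / total) * anchors[j][0];
    y += (v / total) * anchors[j][1];
  });
  return [x, y];
}
```

A row dominated by a single attribute lands close to that attribute's point on the perimeter, while a row with equal values on every attribute lands at the center of the circle.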

Implementation in D3

After searching around the web a little, we found very few interactive and visually appealing RadViz implementations, so we took on the challenge of building our own using D3. The tool we developed focuses on exploratory data analysis by making the visualization highly customizable. The greatest advantage of our tool over other implementations is that it allows the user to color the data points based on any categorical attribute, which can illuminate patterns that are not otherwise visible.
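Coloring by a categorical attribute comes down to assigning one color per distinct category value. The following is a rough sketch of that idea in plain JavaScript, under stated assumptions: the palette, the function name `categoryColors`, and the `Species` field in the usage lines are all hypothetical and are not taken from our implementation.

```javascript
// Hypothetical palette; any list of distinguishable colors would work.
const PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"];

// Build a lookup from each distinct category value to a color, in order of
// first appearance. Colors repeat if there are more categories than colors.
function categoryColors(rows, attribute) {
  const colors = new Map();
  for (const row of rows) {
    const key = row[attribute];
    if (!colors.has(key)) {
      colors.set(key, PALETTE[colors.size % PALETTE.length]);
    }
  }
  return colors;
}

// Usage with made-up rows: each point is drawn in the color of its category.
const sampleRows = [
  { Species: "setosa" },
  { Species: "versicolor" },
  { Species: "setosa" },
];
const speciesColors = categoryColors(sampleRows, "Species");
const fills = sampleRows.map((row) => speciesColors.get(row.Species));
```

Switching the `attribute` argument is all it takes to recolor the same projection by a different categorical column.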

Here is a link to our RadViz implementation in D3. It is still a work in progress (as is all software), so feel free to contribute by submitting a pull request to our GitHub repo.

Evaluation

As stated earlier, RadViz is a powerful tool for identifying clusters in high-dimensional data. However, it has few uses outside this narrow focus, and it takes quite a bit of time to learn how to use the visualization to find clusters effectively. Another downside is that the visualization gets messy and difficult to interpret if you map more than six or so dimensions or if you have a very large dataset. Additionally, the visualization is not robust to outliers, because each attribute is mapped relative to its own range of values. In other words, if one data point is a large outlier, it can hide clusters in the rest of the data.
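The outlier problem is easy to see with a small numeric example. The sketch below assumes min-max scaling as the relative mapping, which is a common choice but an assumption here. The function name `minMaxNormalize` and the sample values are made up for illustration.

```javascript
// Scale values to [0, 1] relative to the attribute's own minimum and maximum.
// (Min-max scaling is an assumed normalization, chosen for illustration.)
function minMaxNormalize(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  return values.map((v) => (range === 0 ? 0 : (v - min) / range));
}

// Without an outlier, the values spread evenly across [0, 1].
const withoutOutlier = minMaxNormalize([1, 2, 3, 4, 5]);

// With one large outlier, the same five values are squeezed toward 0.
const withOutlier = minMaxNormalize([1, 2, 3, 4, 5, 1000]);
```

In the second case the first five values all fall below 0.005 after scaling, so any structure among them contributes almost nothing to where the points land in the circle.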

Bibliography

  1. Andrews, Christopher. "Multivariate Visualization." Middlebury College, Middlebury. 25 Mar. 2016. CS 465: Information Visualization. Web. 7 Apr. 2016. http://www.cs.middlebury.edu/~candrews/classes/infovis/lectures/lecture%2010%20-%20multivariate.pdf, Multivariate Presentation
  2. Brunsdon, C., A. S. Fotheringham, and M. E. Charlton. "The RADVIZ Approach to Visualization." An Investigation of Methods for Visualising Highly Multivariate Datasets. N.p.: Advisory Group on Computer Graphics, 1998. N. pag. Web. 24 Mar. 2016. http://www.agocg.ac.uk/reports/visual/casestud/brunsdon/radviz.htm, Great page explaining the math behind RadViz nonlinear projection
  3. https://en.wikipedia.org/wiki/Hooke%27s_law, Wikipedia entry on Hooke's law
  4. Hoffman, Patrick, and Georges Grinstein. "Visualizations for High Dimensional Data Mining - Table Visualizations." (2012): n. pag. Web. 6 Apr. 2016. http://web.simmons.edu/~benoit/infovis/MIV-datamining.pdf, Visualization description and normalization info
  5. "Rewrite in D3." VDA-lab- Visual Data Analysis Lab. Visual Data Analysis Group, 24 Feb. 2014. Web. 4 Apr. 2016. http://homes.esat.kuleuven.be/~bioiuser/blog/radviz-rewrite-in-d3/, Other implementation example
  6. "Radviz - Orange Documentation V2.7.8." Orange. University of Ljubljana, n.d. Web. 24 Mar. 2016. http://orange.biolab.si/docs/latest/widgets/rst/visualize/radviz.html, Kick off point
  7. http://www.cs.middlebury.edu/~candrews/classes/infovis/data/census.csv, Presentation Data: US Census
  8. https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv, Edgar Anderson's Iris Data
  9. http://www.cs.middlebury.edu/~bwbrown/cs465/radviz/iris.csv, Presentation Data: local/edited copy of E.A.'s Iris Data
  10. http://www.cs.middlebury.edu/~candrews/classes/infovis/data/green500_top_201511.csv, Presentation Data: Green 500 List

Other RadViz Implementations

Presentation materials