Streamgraph with html5 <canvas>

Welcome to the Streamgraph Demo!

[Interactive canvas: calculated Streamgraph. Use the mouse to show labels and zoom.]

Streamgraph Implementation Overview

This site demonstrates our implementation of Streamgraphs according to Lee Byron's paper, with a focus on showing the visual impact of the different calculation aspects.

You can also have a look at the source code, which is (hopefully) well documented.

Note: Firefox 4 and Internet Explorer 9 do not yet support html5 sliders. For Firefox, a Greasemonkey extension exists that helps you out. For Internet Explorer 9 we know of no workaround, but you can enter a value in the field and click somewhere outside it to update the value.

Note 2: For security reasons, most browsers need this page to be loaded from a real web server rather than from a local file. Otherwise the two function calls send() (on the XMLHttpRequest object) and getImageData() will raise an error.

Achievements

We have implemented graph drawing using the brand-new html5 <canvas> element in conjunction with JavaScript. We decided against using any framework, as we are cool hackers and wanted to take our first steps with these technologies in a bottom-up approach.

Main aspects covered by our implementation are:

  • Draw a Streamgraph using spline-interpolated curves
  • Spline interpolation smoothness is adjustable live
  • Various input data possibilities
    • Input data can be supplied by an XML file
    • TISS statistics can be scraped with a Python script and saved as XML
    • TISS scraping could also be done live with a PHP script (disabled due to TISS problems)
    • Random data can be generated live
  • A Streamgraph baseline can be calculated with different formulas
    • A straight baseline (which results in an ordinary stacked graph)
    • A simple function like the one used in the ThemeRiver application
    • The very efficient wiggle-reducing formula described in the paper
    • The more advanced weighted-wiggle-reducing formula described in the paper (which is not yet completely working)
  • Different layer ordering possibilities
  • Different color schemes
  • An x-axis with named values extracted from the XML data
  • A legend with the names and values of the currently hovered dataset, plus a visual cue for the current x-value
  • Mouse wheel zooming
  • Cross-browser compatibility with Firefox, Chrome and Opera
  • A stunning fire effect, reducing the perceived waiting time (of several microseconds) needed for calculating the Streamgraph
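
The adjustable spline smoothness from the list above can be sketched as follows. This is a minimal sketch, assuming a cardinal spline that is converted to cubic Bézier segments; the demo's own source may well use a different spline, and the names cardinalSegments, drawSpline and smoothness are hypothetical.

```javascript
// Hypothetical helpers (not taken from the demo's source).
// Converts data points into the cubic Bezier segments of a cardinal spline.
// smoothness = 0 yields straight lines, smoothness = 1 a Catmull-Rom spline.
function cardinalSegments(points, smoothness) {
  const segments = [];
  for (let i = 0; i < points.length - 1; i++) {
    // Neighbours are clamped at both ends of the point list.
    const p0 = points[Math.max(i - 1, 0)];
    const p1 = points[i];
    const p2 = points[i + 1];
    const p3 = points[Math.min(i + 2, points.length - 1)];
    // Tangent at a point: smoothness * (next - previous) / 2.
    // A Hermite tangent m becomes a Bezier control point at distance m / 3.
    const k = smoothness / 6;
    segments.push({
      cp1: { x: p1.x + k * (p2.x - p0.x), y: p1.y + k * (p2.y - p0.y) },
      cp2: { x: p2.x - k * (p3.x - p1.x), y: p2.y - k * (p3.y - p1.y) },
      end: p2,
    });
  }
  return segments;
}

// Traces the spline on a canvas 2D context (or any object with the same methods).
function drawSpline(ctx, points, smoothness) {
  if (points.length === 0) return;
  ctx.beginPath();
  ctx.moveTo(points[0].x, points[0].y);
  for (const s of cardinalSegments(points, smoothness)) {
    ctx.bezierCurveTo(s.cp1.x, s.cp1.y, s.cp2.x, s.cp2.y, s.end.x, s.end.y);
  }
  ctx.stroke();
}
```

With such a helper, a live slider only has to pass its current value as smoothness and trigger a redraw.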

Challenges

While developing the application, we had to overcome many small obstacles and some stickier ones.

One of the most challenging obstacles right at the beginning was getting canvas transforms to zoom and translate all screen elements (graph, legend, axis).
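
To illustrate the kind of transform bookkeeping involved, here is a minimal sketch of zooming around the mouse position. It assumes a single uniform scale plus an offset, so that screen = world * scale + offset; the names zoomAt, applyView and the view object are hypothetical and not taken from our source.

```javascript
// Hypothetical view state: { scale, offsetX, offsetY },
// with screen = world * scale + offset on both axes.

// Returns a new view zoomed by `factor` so that the point under the
// mouse (given in screen coordinates) stays where it is.
function zoomAt(view, mouseX, mouseY, factor) {
  return {
    scale: view.scale * factor,
    offsetX: mouseX - (mouseX - view.offsetX) * factor,
    offsetY: mouseY - (mouseY - view.offsetY) * factor,
  };
}

// Applies the view to a canvas 2D context before drawing graph, legend and axis.
function applyView(ctx, view) {
  ctx.setTransform(view.scale, 0, 0, view.scale, view.offsetX, view.offsetY);
}
```

A mouse wheel handler would then call zoomAt with a factor slightly above or below 1, depending on the wheel direction, and redraw.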

Browser compatibility was tricky as well. Even without having to take care of Internet Explorer, the supported browsers still had different issues. For example, the mouse position is not calculated in the same way in all of them. Another challenge was stopping event bubbling, which is supposed to be standardized by the W3C but in practice evidently is not.
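
Both problems can be handled along the following lines. This is a sketch under assumptions, not the code of this demo: getMousePosition and stopEvent are hypothetical names, and the fallback branches rely on the legacy cancelBubble and returnValue properties of older Internet Explorer event objects.

```javascript
// Mouse position in canvas pixels, independent of page scroll and of CSS scaling.
function getMousePosition(canvas, event) {
  const rect = canvas.getBoundingClientRect();
  return {
    x: (event.clientX - rect.left) * (canvas.width / rect.width),
    y: (event.clientY - rect.top) * (canvas.height / rect.height),
  };
}

// Stops bubbling and the default action, with fallbacks for legacy event objects.
function stopEvent(event) {
  if (event.stopPropagation) {
    event.stopPropagation();
  } else {
    event.cancelBubble = true;
  }
  if (event.preventDefault) {
    event.preventDefault();
  } else {
    event.returnValue = false;
  }
}
```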

Open Issues

Not everything worked out as expected, though. We put a lot of effort into implementing the more advanced weighted wiggle-reducing function as well, but we could not find the mathematical problem in our calculation. Another issue is the spline interpolation, which overshoots and results in slightly overlapping layers. The last open issue is that our implementation supports only equidistant x-values as input data.

Streamgraph Theoretical Overview

Stacked Graph

A stacked graph displays a large amount of data in a combined view, in which the user can discern both the sum of a data set at each point and its constituent parts. There are two forms of stacked graphs: one for displaying one or more stationary properties and one for properties that change along the x-coordinate, in most cases expressing how shares vary over time.
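
How the constituent parts add up to the sum can be made concrete with a short sketch. It assumes that every layer is an array of values sampled at the same x-positions; stackLayers and the returned band objects are hypothetical names chosen for this example.

```javascript
// layers[i][x]: value of layer i at x-index x (layer 0 is the bottom layer).
// baseline[x]: lower edge of the whole stack at x-index x.
// Returns one band per layer, with its lower and upper edge at every x-index.
function stackLayers(layers, baseline) {
  let bottom = baseline.slice();
  return layers.map((layer) => {
    const top = layer.map((value, x) => bottom[x] + value);
    const band = { bottom, top };
    bottom = top; // the next layer sits on top of this one
    return band;
  });
}
```

With a baseline of zeros this yields an ordinary stacked graph, and the upper edge of the last band is the sum of all layers.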

What the Heck is a Streamgraph?

A Streamgraph is a variation on the time-varying stacked graph that aims to make it more visually appealing to a mass audience. It differs from a conventional stacked graph in its baseline: the layers are not stacked on the straight x-axis but on a curve, which makes up for visually unpleasant spikes in the data. Beyond that, guidelines exist for choosing the stacking order, the colors and the labeling so that Streamgraphs become more appealing and easier to read. ThemeRiver is an earlier approach to making stacked graphs more appealing; it met with a very positive public response and paved the way for Streamgraphs.

What is It Good For and What are Its Positive Impacts?

Streamgraphs look "non-scientific" and provide an organic and engaging view, which makes them appealing to mass audiences. They are also easy to read, at least with regard to certain characteristics of the underlying data. Overall trends are as discernible as details, even for large amounts of data. A Streamgraph was used to display the listening history of last.fm users, who could connect emotionally with the organic form. Another application was the Box Office Revenue graph made for the New York Times, which showed the interplay between box office hits and Oscar nominations.

Negative Characteristics

The varying baseline of a Streamgraph layout can make the overall graph much harder to read. It may also not suit every kind of data, since it highlights certain statistical properties better than others.

Stacked Graph Design

Baseline

The baseline (the curve that takes the place of the straight x-axis as the lower edge of the stack) needs to be calculated from the derivatives of the constituent functions by numeric integration. If only discrete data points are known, the functions first need to be interpolated from these. The result of these calculations should be a baseline that lets the individual layers flow more smoothly, reducing wiggles and spikes.
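
The baseline calculations can be sketched as follows. This is a minimal sketch, assuming equidistant x-values and one value array per layer; the formulas in the comments are the ThemeRiver, wiggle and weighted-wiggle baselines as we read them from the paper. All function names are hypothetical, and the weighted variant is an illustration with a simple discretisation (backward differences, unit spacing, baseline starting at 0), not the not-yet-working calculation of this demo.

```javascript
// layers[i][x] is the value of layer i (0 = bottom layer) at x-index x.

// ThemeRiver-style baseline: the stack is centred around the x-axis.
//   g0 = -1/2 * sum_i f_i
function baselineThemeRiver(layers) {
  return layers[0].map((_, x) =>
    -0.5 * layers.reduce((sum, layer) => sum + layer[x], 0));
}

// Wiggle-reducing baseline in closed form (n layers, i counted from 1):
//   g0 = -1/(n+1) * sum_i (n - i + 1) * f_i
function baselineWiggle(layers) {
  const n = layers.length;
  return layers[0].map((_, x) =>
    -layers.reduce((sum, layer, i) => sum + (n - i) * layer[x], 0) / (n + 1));
}

// Weighted-wiggle baseline: only the slope g0' has a closed form,
//   g0' = -(1 / sum_i f_i) * sum_i (f_i'/2 + sum_{j<i} f_j') * f_i
// so g0 itself is obtained by summing up the slopes (numeric integration).
function baselineWeightedWiggle(layers) {
  const g0 = [0];
  for (let x = 1; x < layers[0].length; x++) {
    let total = 0;    // sum of all layer values at x
    let weighted = 0; // numerator of the slope formula
    let below = 0;    // summed slopes of the layers below the current one
    for (const layer of layers) {
      const slope = layer[x] - layer[x - 1];
      weighted += (0.5 * slope + below) * layer[x];
      below += slope;
      total += layer[x];
    }
    g0.push(g0[x - 1] + (total > 0 ? -weighted / total : 0));
  }
  return g0;
}
```

A straight baseline is simply an array of zeros of the same length. Each of these arrays gives the lower edge of the bottom layer, and the remaining layers are stacked on top of it.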

Silhouette

The silhouette depends on the calculation of the baseline as well as on the layer ordering. The goal is to minimize spikes in the overall graph. Another goal is to emphasize the onset times of the individual layers, which are placed on the outside of the graph; this goal may change depending on the statistics of the underlying data. The onset time is the earliest time at which a specific constituent function is not 0.

Legibility and Aesthetics

[Interactive canvas: color field for the LastFM picture, with a color picker.]

The paper mentions several design goals for increasing the legibility and aesthetics of the resulting graph.

Color

One of these is color, which mainly serves two purposes. Firstly, colors should be chosen carefully, as the ability to discern layers depends heavily on the colors and on the differences between them. Secondly, color communicates the temporal extent of a layer: time series with a later onset are given a warmer color.
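
The second purpose can be illustrated with a small sketch that maps the onset of a layer to a color. The hue range, the fixed saturation and lightness, and the name onsetColor are assumptions made for this example; the color schemes of this demo are not necessarily built this way.

```javascript
// Hypothetical mapping from onset index to a CSS color string:
// cool colors for early onsets, warm colors for late ones.
function onsetColor(onset, lastOnset) {
  const t = lastOnset > 0 ? onset / lastOnset : 0; // 0 = earliest, 1 = latest
  const hue = Math.round(220 - 200 * t);           // 220 = blue, 20 = orange
  return "hsl(" + hue + ", 70%, 55%)";
}
```

The returned string can be assigned directly to the fillStyle of a canvas 2D context before filling a layer.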

Ordering of Layers

The order of the individual layers can greatly improve the overall look of a graph, as it has a big impact on how the spikes and wiggles of one layer affect its neighbors. For example, a spike in the outermost layer has the least effect on the other layers. The interplay of communicative concerns and aesthetics is the most important issue in choosing an order. In NameVoyager, which shows the most popular baby names over time, the layers are simply sorted alphabetically. Ordering the layers by onset time would lead to a striped look in which the overall shape drifts away vertically. As both methods have their downsides, the layers in Streamgraph are ordered inside-out: the later the onset time of a time series, the further outside it is placed, which gives the best results. This not only puts the focus on the onset time but also leaves the areas undergoing the greatest change on the outside, where they disturb other layers the least.
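
An inside-out ordering can be sketched like this. The sketch assumes that layers are sorted by onset and then placed alternately above and below the center; this simple alternation is our assumption for the example, and onsetIndex and insideOutOrder are hypothetical names.

```javascript
// Index of the first non-zero value; layers that are 0 everywhere sort last.
function onsetIndex(layer) {
  const index = layer.findIndex((value) => value !== 0);
  return index === -1 ? layer.length : index;
}

// Returns the layer indices from bottom to top: the earliest onset ends up
// in the middle, later onsets are placed further outside.
function insideOutOrder(layers) {
  const byOnset = layers
    .map((layer, index) => ({ index, onset: onsetIndex(layer) }))
    .sort((a, b) => a.onset - b.onset);
  const upper = []; // from the center upwards
  const lower = []; // from the center downwards
  byOnset.forEach((entry, position) => {
    (position % 2 === 0 ? upper : lower).push(entry.index);
  });
  return lower.reverse().concat(upper);
}
```

For four layers with onsets 0, 1, 2 and 3 this gives the bottom-to-top order 3, 1, 0, 2.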

Layer Labeling

Depending on the size of the graph, it has to be decided how to label the different layers. If only a few sets of data or categories are used, as in NameVoyager, a simple legend will suffice. For more complex color schemes, a place for the label and a fitting font size need to be found. Brute-force calculations have been used with Streamgraph, but they often take so long that they are useless for generating labels on the fly.
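
To give an idea of what such a brute-force search looks like, here is a minimal sketch. It assumes that the lower and upper edge of a layer are known at every x-index and that the label has a fixed width measured in x-steps; bestLabelSpot is a hypothetical name, and this is not the procedure used for Streamgraph itself.

```javascript
// bottom[x] and top[x]: lower and upper edge of one layer at x-index x.
// labelWidth: width of the label in x-steps.
// Tries every horizontal position and returns the one with the tallest
// rectangle that fits between the two edges, or null if the label is too wide.
function bestLabelSpot(bottom, top, labelWidth) {
  let best = null;
  for (let start = 0; start + labelWidth <= top.length; start++) {
    let floor = -Infinity;  // highest lower edge inside the window
    let ceiling = Infinity; // lowest upper edge inside the window
    for (let x = start; x < start + labelWidth; x++) {
      floor = Math.max(floor, bottom[x]);
      ceiling = Math.min(ceiling, top[x]);
    }
    const height = ceiling - floor;
    if (best === null || height > best.height) {
      best = { start, height };
    }
  }
  return best;
}
```

The height that is found bounds the font size at that position. Since every window is scanned in full, the cost grows with the number of x-values times the label width, for every layer and every candidate font size, which hints at why such searches become slow.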