Dendrogram


A dendrogram is a network structure. It consists of a root node that gives birth to several nodes connected by edges or branches. The last nodes of the hierarchy are called leaves.

This page explains how to build a dendrogram using d3.js to compute the node positions, and React to render the nodes and edges. It starts by describing the required data format, explains how to build a very basic dendrogram, and then shows how to customize it.


The Data

The dataset describes a hierarchy using a recursive structure.

Each item in this structure is called a node, and the lowest nodes of the hierarchy are called leaves. The dataset is an object that has at least 3 properties: name, value and children. children is an array of nodes that share the same structure.

Here is a minimal example of the data structure:

const data = {
  type: 'node',
  name: "boss",
  value: 2300,
  children: [
    { type: 'leaf', name: "Mark", value: 90 },
    { type: 'leaf', name: "Robert", value: 12 },
    { type: 'leaf', name: "Emily", value: 34 },
    // ...
  ],
};
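To make the recursive shape more concrete, here is a small sketch that walks the structure and collects the names of all leaves. The sampleData object and the collectLeafNames helper are hypothetical names used for illustration only, and the sample is limited to three leaves.

```javascript
// Hypothetical sample following the { name, value, children } shape described above.
const sampleData = {
  type: "node",
  name: "boss",
  value: 2300,
  children: [
    { type: "leaf", name: "Mark", value: 90 },
    { type: "leaf", name: "Robert", value: 12 },
    { type: "leaf", name: "Emily", value: 34 },
  ],
};

// Recursively collect the names of all leaves (nodes without children).
function collectLeafNames(node) {
  if (!node.children || node.children.length === 0) {
    return [node.name];
  }
  return node.children.flatMap((child) => collectLeafNames(child));
}

const leafNames = collectLeafNames(sampleData);
```

Any function that processes this kind of data tends to follow the same pattern: handle the current node, then recurse into its children.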

Note: if your data is not formatted this way at all, don't fret! In the next section I explain how to deal with other formats. ⬇️

The hierarchy format or "root node"

A dendrogram is a hierarchical layout. D3.js has a lot of utility functions to deal with this kind of hierarchical data. To use those functions, we first need to create a "root node" or "hierarchy".

But what the heck is this? 🤔

A "root node" or "hierarchy" is a JavaScript object. It has almost the same shape as the input data described above, but with some additional properties and methods that will make our life easier. This data structure is typed as a HierarchyNode.

→ properties of a root node

This "root node" is a recursive structure of nodes as described in the data section above, but each node has these new properties:

  • data: associated data
  • depth: 0 for the root node, and increasing by one for each descendant.
  • height: 0 for leaf nodes, and the greatest distance from any descendant leaf otherwise.
  • children: an array of child nodes, if any; undefined for leaf nodes.
  • value: the summed value of the node and its descendants. It is optional and only available once node.sum() or node.count() has been called.

On top of that, each node also has associated methods like node.descendants() or node.links() that we will describe later. See the complete list in the d3-hierarchy doc.
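To get a feeling for what depth and height mean, here is a plain JavaScript sketch that computes them by hand. This is not how d3 is implemented: buildNode, listDescendants and rawTree are hypothetical names, and the traversal order of listDescendants may differ from the one used by node.descendants().

```javascript
// Illustration only, NOT d3's implementation.
// Wrap raw data into nodes carrying data, parent, depth, height and children.
function buildNode(data, parent = null) {
  const node = { data, parent, depth: parent ? parent.depth + 1 : 0, height: 0 };
  if (data.children && data.children.length > 0) {
    node.children = data.children.map((child) => buildNode(child, node));
    // height = greatest distance to a descendant leaf
    node.height = 1 + Math.max(...node.children.map((child) => child.height));
  }
  return node;
}

// Flat list made of a node and all its descendants.
function listDescendants(node) {
  const children = node.children || [];
  return [node, ...children.flatMap((child) => listDescendants(child))];
}

// Hypothetical input with 2 levels below the root.
const rawTree = {
  name: "boss",
  children: [
    { name: "Mark" },
    { name: "Team", children: [{ name: "Emily" }] },
  ],
};

const rootNode = buildNode(rawTree);
const summary = listDescendants(rootNode).map(
  (n) => `${n.data.name}:${n.depth}:${n.height}`
);
```

Each entry of summary reads name:depth:height, so the root is at depth 0 and the leaves have a height of 0.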

→ how to build a root node

If you're dealing with the format described in the previous section, you just have to pass it to the d3 hierarchy function:

const hierarchy = useMemo(() => {
  return d3.hierarchy(data);
}, [data]);

Very convenient. If you have a different input, here is how to proceed:

My input is a list of connections in .json format

Let's say you have tabular data in JSON format. It's an array where each item represents a node. For each node, you have a name property and a parent property that provides the parent name:

export const dataTabular = [
  { "name": "Eve", "parent": "" },
  { "name": "Cain", "parent": "Eve" },
  { "name": "Seth", "parent": "Eve" },
  // ...
];

In this case, you have to use the stratify function to create the hierarchy. This is what the syntax looks like:

const hierarchyGenerator = d3
  .stratify()
  .id((node) => node.name)
  .parentId((node) => node.parent);

const hierarchy = hierarchyGenerator(dataTabular);

And that's it. You have a hierarchy object and can follow the rest of this tutorial.
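If you wonder what happens conceptually when a flat list is turned into a tree, here is a hand-rolled sketch. It is only an illustration of the idea and not d3's stratify implementation: buildTreeFromRows is a hypothetical helper, it assumes unique names, and it assumes the root is the row with an empty parent.

```javascript
// Illustration only, NOT d3's implementation (no cycle or duplicate checks).
const rows = [
  { name: "Eve", parent: "" },
  { name: "Cain", parent: "Eve" },
  { name: "Seth", parent: "Eve" },
];

function buildTreeFromRows(inputRows) {
  // Index every row by name so that parents can be looked up quickly.
  const nodesByName = new Map(
    inputRows.map((row) => [row.name, { name: row.name, children: [] }])
  );
  let root = null;
  for (const row of inputRows) {
    const node = nodesByName.get(row.name);
    if (row.parent === "") {
      root = node; // the row with an empty parent is the root
    } else {
      const parentNode = nodesByName.get(row.parent);
      if (!parentNode) throw new Error("Unknown parent: " + row.parent);
      parentNode.children.push(node);
    }
  }
  return root;
}

const tree = buildTreeFromRows(rows);
```

The result is a nested object with name and children properties, close to the recursive format described in the data section.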

My input is a list of connections in .csv format

In this case, you can use the csvParse() function of d3 to get a JavaScript array and use the stratify function as shown in the previous section.

const dataTabular = d3.csvParse(text);
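To show the kind of text and output involved, here is a naive parser written for illustration. It is not a replacement for d3.csvParse: parseSimpleCsv is a hypothetical helper that assumes a header line, no quoted fields and no commas inside values. The csvText content is made up for the example.

```javascript
// Hypothetical CSV content: a header line, then one node per line.
const csvText = "name,parent\nEve,\nCain,Eve\nSeth,Eve";

// Naive parser for illustration only. It does not handle quoted fields,
// so a real CSV parser should be preferred in actual code.
function parseSimpleCsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const columns = headerLine.split(",");
  return lines.map((line) => {
    const cells = line.split(",");
    // Build one object per line, using the header names as keys.
    return Object.fromEntries(
      columns.map((column, i) => [column, cells[i] ?? ""])
    );
  });
}

const parsedRows = parseSimpleCsv(csvText);
```

Each parsed row is an object with a name and a parent property, which is the tabular shape expected by the stratify step.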

The cluster() function

We now have a hierarchy object that is a convenient data structure. From this, we need to compute the position of each node in our 2D space.

This is made possible thanks to the cluster() function of d3.js. You can check its official documentation.

→ calling d3.cluster()

d3.cluster() is a function that returns a layout generator. It is thus a function that returns a function. There is not much to provide to it, except the width and height of the figure.

// Create a dendrogram generator = a function that computes the position of the nodes in a hierarchy
const dendrogramGenerator = d3
  .cluster()
  .size([width, height]);

The generator we have now (dendrogramGenerator) expects one input: the hierarchy object described in the previous section.

// use the generator on our hierarchy
const dendrogram = dendrogramGenerator(hierarchy);

→ d3.cluster() output

The output is almost the same as the initial hierarchy object, but each node now has 2 additional properties, x and y, which are the coordinates we need to build the dendrogram!
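To build an intuition of how such coordinates can be derived, here is a simplified sketch of a cluster-style layout. It is not d3's algorithm (it ignores the separation rules of d3.cluster, among other things) and simpleClusterLayout is a hypothetical helper. It assumes that leaves are spread evenly, that parents sit at the mean position of their children, and that all leaves are aligned at the same level.

```javascript
// Simplified illustration of a cluster-style layout, NOT d3's algorithm.
function simpleClusterLayout(root, width, height) {
  let leafCount = 0;

  // First pass (children before parents): give each leaf a slot index,
  // place each parent at the mean slot of its children, record node levels.
  function place(node) {
    if (!node.children || node.children.length === 0) {
      node.slot = leafCount++;
      node.level = 0;
      return;
    }
    node.children.forEach((child) => place(child));
    const slotSum = node.children.reduce((sum, c) => sum + c.slot, 0);
    node.slot = slotSum / node.children.length;
    node.level = 1 + Math.max(...node.children.map((c) => c.level));
  }
  place(root);

  // Second pass: scale slots to the width and levels to the height.
  function scale(node) {
    node.x = ((node.slot + 0.5) / leafCount) * width;
    node.y = root.level === 0 ? 0 : (1 - node.level / root.level) * height;
    (node.children || []).forEach((child) => scale(child));
  }
  scale(root);
  return root;
}

// Hypothetical tiny tree: one root and two leaves, in a 200 x 100 figure.
const layoutRoot = simpleClusterLayout(
  { name: "boss", children: [{ name: "Mark" }, { name: "Robert" }] },
  200,
  100
);
```

With this sketch, the root ends up centered at the top and the two leaves share the bottom line, which is the general look of a dendrogram.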

Most basic dendrogram

We have a list of nodes in the dendrogram object. For each node, we know its position.

We just need to loop through all those nodes to build circles and lines to make the dendrogram.

Fortunately, the dendrogram object has a descendants() method that lists all nodes in a flat array. It is then possible to use a map() to loop through nodes. So for instance drawing edges looks like:

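const allEdges = dendrogram.descendants().map((node) => {
  // The root node has no parent, so there is no edge to draw
  if (!node.parent) {
    return null;
  }
  // Nodes created by d3.hierarchy() have no id property,
  // so the name is used as key (assuming names are unique)
  return (
    <line
      key={"line" + node.data.name}
      fill="none"
      stroke="grey"
      x1={node.x}
      x2={node.parent.x}
      y1={node.y}
      y2={node.parent.y}
    />
  );
});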

And the same idea goes for nodes and circles. That makes our first dendrogram!
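For the circles, a minimal sketch could look like the following. It only computes the attributes of each circle in plain JavaScript, so it can run outside of React. buildCircleProps and positionedNodes are hypothetical names, and the radius and fill values are arbitrary choices.

```javascript
// Hypothetical positioned nodes, shaped like the items returned by descendants().
const positionedNodes = [
  { data: { name: "boss" }, x: 100, y: 0 },
  { data: { name: "Mark" }, x: 50, y: 100 },
  { data: { name: "Robert" }, x: 150, y: 100 },
];

// Build the attributes of one SVG circle per node.
// In a React component, each item p could be rendered as:
// <circle key={p.key} cx={p.cx} cy={p.cy} r={p.r} fill={p.fill} />
function buildCircleProps(nodes, radius = 5) {
  return nodes.map((node) => ({
    key: "circle" + node.data.name, // assumes node names are unique
    cx: node.x,
    cy: node.y,
    r: radius,
    fill: "grey",
  }));
}

const circleProps = buildCircleProps(positionedNodes);
```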

The most basic dendrogram made with React and d3.js.

Horizontal dendrogram

You can swap the x and y coordinates to make the dendrogram horizontal instead of vertical.

You can also create smooth edges thanks to the d3.linkHorizontal() function. The function is described in its official documentation. Basically, you need to provide an object with a source and a target property. The coordinates of those properties will be used to create the d attribute of an svg path element.

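// Link generator that draws smooth horizontal curves
const horizontalLinkGenerator = d3.linkHorizontal();

// x and y are swapped since the dendrogram is horizontal.
// The name is used as key (assuming names are unique).
<path
  key={node.data.name}
  fill="none"
  stroke="grey"
  d={horizontalLinkGenerator({
    source: [node.parent.y, node.parent.x],
    target: [node.y, node.x],
  })}
/>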

Horizontal dendrogram with smooth edges made with react and d3.js.
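To see what kind of d attribute is behind such a smooth edge, here is a hand-written sketch. horizontalLinkPath is a hypothetical helper, not part of d3. It assumes a cubic Bézier curve whose two control points sit halfway between the source and the target on the horizontal axis.

```javascript
// Illustration only: build the d attribute of a smooth horizontal edge.
// source and target are [x, y] arrays, like the ones passed to the link generator.
function horizontalLinkPath(source, target) {
  const [x0, y0] = source;
  const [x1, y1] = target;
  // Both control points share the same x, halfway between source and target.
  const midX = (x0 + x1) / 2;
  return `M${x0},${y0}C${midX},${y0},${midX},${y1},${x1},${y1}`;
}

const pathD = horizontalLinkPath([0, 50], [100, 20]);
```

The curve leaves the source horizontally and reaches the target horizontally, which gives the S shape visible on the chart.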

Radial dendrogram

The radial dendrogram is a bit trickier to achieve.

→ polar coordinates

We are dealing with polar coordinates here. As a result, the size passed to the cluster() layout must be updated.

  • The first item is 360. It defines the angle (in degrees) to follow to reach a node. 0 is on top.
  • The second item is the radius of the figure. It provides the distance to the center of a node in pixels.

const dendrogramGenerator = d3
  .cluster()
  .size([360, radius]);
const dendrogram = dendrogramGenerator(hierarchy);

Since x and y are now describing an angle and a distance to the center, we can position a node using the following transform property:

transform={"rotate(" + (node.x - 90) + ")translate(" + node.y + ")"}
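If you prefer to reason in plain x and y positions, the same transform can be sketched as a conversion from polar to cartesian coordinates. polarToCartesian is a hypothetical helper. It assumes the angle is in degrees with 0 on top, and that the y axis points downward as in SVG.

```javascript
// Convert an angle (degrees, 0 on top, clockwise) and a distance to the center
// into x and y offsets from the center of the figure (SVG axes: y points down).
function polarToCartesian(angleDeg, distance) {
  // Same -90 shift as in the rotate() call: angle 0 must point to the top.
  const angleRad = ((angleDeg - 90) * Math.PI) / 180;
  return {
    x: distance * Math.cos(angleRad),
    y: distance * Math.sin(angleRad),
  };
}

const topPoint = polarToCartesian(0, 100); // straight above the center
const rightPoint = polarToCartesian(90, 100); // on the right of the center
```

The result is relative to the center of the figure, so the whole chart still needs to be translated to the middle of the SVG area.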

→ Smooth edges with linkRadial

Edges are not horizontal anymore, so linkHorizontal won't be helpful this time. Instead, the d3.linkRadial function does the job based on an angle and a distance.
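One thing to keep in mind: the angle accessor of d3.linkRadial works in radians, while the layout above was sized in degrees. Here is a small sketch of the conversion. toRadialLink and degreesToRadians are hypothetical helpers, and the angle and radius property names are arbitrary choices made for this example.

```javascript
// Convert the angle of a layout node from degrees to radians.
const degreesToRadians = (degrees) => (degrees * Math.PI) / 180;

// Hypothetical helper: turn a child node into a { source, target } object that
// a radial link generator could consume, for instance one configured as
// d3.linkRadial().angle((d) => d.angle).radius((d) => d.radius)
function toRadialLink(node) {
  return {
    source: { angle: degreesToRadians(node.parent.x), radius: node.parent.y },
    target: { angle: degreesToRadians(node.x), radius: node.y },
  };
}

// Hypothetical node at 180 degrees, 100px away from a parent located at the center.
const exampleLink = toRadialLink({ x: 180, y: 100, parent: { x: 90, y: 0 } });
```

Another option would be to size the layout in radians directly, but then the rotate() transform used for the nodes would need the opposite conversion.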

→ Smart labels

Please make sure your labels are properly oriented. It always gives a bit of a headache to pivot them correctly, and to control the anchoring appropriately. I talked about it extensively in the circular barplot section so please take a look for this matter.
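As a starting point, here is a minimal sketch of one common approach: labels located on the left half of the circle are flipped by 180 degrees and anchored at their end, so that they never appear upside down. getLabelOrientation is a hypothetical helper. It assumes the node is positioned with the rotate and translate transform shown above.

```javascript
// Given the angle of a node in degrees (0 on top, clockwise), return the extra
// rotation to apply to its label and the text-anchor to use.
function getLabelOrientation(angleDeg) {
  // Past 180 degrees the node is on the left half: the text would be upside down.
  const isOnLeftSide = angleDeg > 180;
  return {
    rotation: isOnLeftSide ? 180 : 0, // extra rotation applied to the text itself
    textAnchor: isOnLeftSide ? "end" : "start",
  };
}

const rightLabel = getLabelOrientation(45);
const leftLabel = getLabelOrientation(270);
```

The rotation can then be appended to the text transform, for instance with a "rotate(180)" string, and textAnchor goes to the textAnchor prop of the text element.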


A minimalist radial dendrogram built using d3 and react.

Note: the first-level edges are straight lines. IMO it does not make sense to use linkRadial for the first level.

Coming next

There is much more that needs to be added to this tutorial.

Using canvas for rendering is often a requirement when the number of nodes gets big. Interactivity is often necessary, for hover effects or to collapse a part of the tree. It is also possible to map the node circle size to a numeric variable.

This will come soon! I have a newsletter called the dataviz universe where I share my latest updates.


Contact

👋 Hey, I'm Yan and I'm currently working on this project!

Feedback is welcome ❤️. You can file an issue on GitHub, drop me a message on Twitter, or even send me an email pasting yan.holtz.data with gmail.com. You can also subscribe to the newsletter to know when I publish more content!