Data Visualization: Best Practices

Release date: February 24, 2023

Introduction

When used properly, charts (including figures and diagrams) can simplify the presentation of information and the communication of clear and precise messages. However, with the wide range of options available, creating effective charts can be complex. This reference tool is intended to provide a basic guide to creating effective charts that take advantage of the options available.

Preparation

Before creating a chart, it is important to answer a few key questions.

Chart components

The different types of charts have common components, for which there are best practices.

Figure 6

Description for Figure 6

The image is a sample bar chart with labels and arrows pointing to different areas of the chart. The labels are pointing out the common components of charts, including Title, Axis, Tick marks, Grid lines, Legend, Notes, Sources, Colours.

1. Title

The title of a chart usually appears at the top of the chart and is part of it. The different axes of the chart can also have their own titles.

2. Axis

Some types of charts use axes to present the data.

3. Tick marks

Tick marks provide visual cues for easy reading of the data.

4. Grid lines

Grid lines appear on certain types of charts, in order to facilitate the reading and comparison of values.

5. Legend

A legend can be used to label the different variables or categories presented in a chart. It can be particularly useful when there are multiple items.

6. Notes

Notes can be used to provide details about the chart, such as methodological considerations, data limitations, or a description of abbreviations used.

7. Sources

A chart should always include the sources of the data used.

8. Colours

Colours can be used to make data easier to understand, to emphasize specific elements or to communicate certain messages quickly.

9. Other considerations

Types of charts

There are several types of charts, each with advantages and disadvantages, depending on the context and the nature of the data. The selection of the right type of chart will be influenced by different factors such as the type of data, the messages to be communicated and the target audience.

1. Pie charts

A pie chart shows the percentage distribution of a given variable. Each segment represents a category and its size is proportional to its weight in the total.

Figure 15 : Description of this image is directly below

Description for Figure 15

This is an image of a 3 category pie chart where the three categories add up to 100%.
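The proportionality rule above can be sketched in a few lines of Python. This is only an illustration: the function name `pie_segments` and the sample data are hypothetical, and the sketch computes each segment's percentage and angle rather than drawing anything.

```python
def pie_segments(values):
    """Return {category: (percent, angle_in_degrees)} for a pie chart."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("a pie chart needs a positive total")
    # Each segment's size is proportional to its weight in the total.
    return {
        name: (100 * value / total, 360 * value / total)
        for name, value in values.items()
    }


# Hypothetical data: three categories that add up to 100%.
segments = pie_segments({"A": 50, "B": 30, "C": 20})
for name, (percent, angle) in segments.items():
    print(f"{name}: {percent:.1f}% of the total, {angle:.1f} degrees")
```

Because every segment is a share of the same total, the percentages always sum to 100 and the angles to 360.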

Pie chart family

A. Bar of pie charts

Figure 19 : Description of this image is directly below

Description for Figure 19

This is an image of a 4 category pie chart with “Other” category being represented by a 2 category stacked bar chart to the right of the pie chart.

A bar of pie charts can be used when there are more than six categories, or when there are several small categories, which are difficult to illustrate clearly in a regular pie chart. In these cases, a new category called 'Other', the amount of which is equal to the sum of the smaller categories, is inserted into the main chart. Stacked bars show these categories next to the pie chart.
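As a minimal sketch of the grouping step described above, the following Python function keeps the largest categories in the pie and sums the rest into 'Other'. The function name, the `keep` parameter and the sample data are assumptions made for illustration; the guide does not prescribe a specific cut-off rule.

```python
def split_for_bar_of_pie(values, keep):
    """Keep the `keep` largest categories in the pie; group the rest as 'Other'."""
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    main = dict(ranked[:keep])    # segments shown in the pie
    detail = dict(ranked[keep:])  # smaller categories shown in the stacked bar
    if detail:
        # 'Other' is equal to the sum of the smaller categories.
        main["Other"] = sum(detail.values())
    return main, detail


# Hypothetical data: two small categories are moved to the bar.
main, detail = split_for_bar_of_pie(
    {"A": 40, "B": 30, "C": 20, "D": 6, "E": 4}, keep=3
)
print(main)
print(detail)
```

The pie total is unchanged by the grouping, so the 'Other' segment and the stacked bar describe the same amount at two levels of detail.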

B. Donut charts

Figure 20 : Description of this image is directly below

Description for Figure 20

This is an image of a 4 category donut chart where each category is represented not just with a different colour, but also by a different logo.

A donut chart is a pie chart with a hole in the centre. The hole makes it more difficult to estimate the relative size of categories, but can be used to present relevant information, such as a logo.

2. Bar charts

A bar chart uses bars to represent the different categories. It can be vertical or horizontal and has two axes. The names of the different categories are shown on one axis or with labels on the bars. The value of the data is shown on the other axis: this is called the scale.

Figure 21 : Description of this image is directly below

Description for Figure 21

This is an image of a 5 category, horizontal bar chart, where the titles are on the y axis and percentages on the x axis.

Bar chart family

A. Grouped bar charts

It is possible to present two or more series of data in a grouped bar chart. However, the more series there are, the more difficult it is to focus on one at a time.

Figure 25 : Description of this image is directly below

Description for Figure 25

This is an image of a grouped bar chart that displays data for 4 categories over 3 years. Each year is represented as a grouping of the same 4 categories.

B. Histograms

Histograms are used to summarize the distribution of a continuous variable measured on an interval scale. In a histogram, the bars touch each other, with no space between them.

Figure 26 : Description of this image is directly below

Description for Figure 26

This is an image of a 7 category histogram.
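The bar heights of a histogram come from counting how many values fall in each interval. The sketch below assumes equal-width intervals and treats the upper edge of the last interval as inclusive; both choices, along with the function name `histogram_counts` and the sample data, are assumptions made for illustration.

```python
def histogram_counts(data, start, width, n_bins):
    """Count values in n_bins adjacent intervals of equal width."""
    counts = [0] * n_bins
    end = start + width * n_bins
    for x in data:
        if not start <= x <= end:
            raise ValueError(f"{x} is outside the range [{start}, {end}]")
        # Intervals are [lower, upper), except the last one, which
        # also includes its upper edge.
        index = min(int((x - start) // width), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical continuous measurements between 0 and 6.
counts = histogram_counts(
    [0.5, 1.0, 1.5, 2.2, 4.0, 5.5, 6.0], start=0, width=2, n_bins=3
)
print(counts)
```

Since the intervals are adjacent and cover the whole range, the bars drawn from these counts sit side by side with no gaps.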

C. Box plots

Box plots are used to illustrate the distribution of different categories of a variable. Each bar starts at the minimum value and ends at the maximum value of the category. There is usually a thick line inside each bar that shows the centre of the distribution, usually the median.

Figure 27 : Description of this image is directly below

Description for Figure 27

This is an image of a 2 category box plot.

D. Box and whisker plots

A box and whisker plot is one of the most effective charts for visualizing the distribution of a continuous variable. It displays the minimum, first quartile, median, third quartile and maximum value of each category of a variable.

Figure 28 : Description of this image is directly below

Description for Figure 28

This is an image of a 2 category box and whisker plot.
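The five values shown by a box and whisker plot can be computed with Python's standard `statistics` module. This is a sketch under one assumption worth stating: several conventions exist for computing quartiles, and the "inclusive" method is used here only as an example. The function name `five_number_summary` and the data are hypothetical.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ordered = sorted(data)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical values for one category of a variable.
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
print(summary)
```

The box spans the first to the third quartile, the line inside it marks the median, and the whiskers extend to the minimum and maximum.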

E. Stacked bar charts

Stacked bar charts are used to illustrate the total values of different categories. Additionally, each bar is broken down to show subcomponents in each category. Since the baseline of subcategories varies between bars, only the first subcomponent can be visualized efficiently.

Figure 29 : Description of this image is directly below

Description for Figure 29

This is an image of a 4 category stacked bar chart that displays data for 2 years.
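The shifting baseline mentioned above becomes visible when the start and end of each segment are computed. The following is a minimal sketch; the function name `stack_segments` and the data are hypothetical.

```python
def stack_segments(bars):
    """Return {bar: [(subcomponent, bottom, top), ...]} for a stacked bar chart."""
    layout = {}
    for bar, parts in bars.items():
        bottom = 0
        segments = []
        for name, value in parts.items():
            # Each subcomponent starts where the previous one ended.
            segments.append((name, bottom, bottom + value))
            bottom += value
        layout[bar] = segments
    return layout


# Hypothetical data: two subcomponents over two years.
layout = stack_segments({
    "2021": {"A": 10, "B": 30},
    "2022": {"A": 25, "B": 25},
})
for bar, segments in layout.items():
    print(bar, segments)
```

In this example subcomponent A starts at 0 in both bars, while B starts at 10 in one bar and at 25 in the other, which is why B is harder to compare by eye.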

F. 100% stacked bar charts

100% stacked bar charts are used to illustrate the ratio of subcategories. They are similar to stacked bar charts, but show the relative value of each category rather than the absolute value.

Figure 30 : Description of this image is directly below

Description for Figure 30

This is an image of a 4 category 100% stacked bar chart that displays data for 2 years.
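Converting absolute values to the relative values used by a 100% stacked bar chart is a simple rescaling within each bar. The sketch below uses a hypothetical function name, `to_percent_shares`, and made-up data.

```python
def to_percent_shares(bars):
    """Convert each bar's absolute values to percentages of the bar total."""
    shares = {}
    for bar, parts in bars.items():
        total = sum(parts.values())
        if total <= 0:
            raise ValueError(f"bar {bar!r} needs a positive total")
        shares[bar] = {
            name: 100 * value / total for name, value in parts.items()
        }
    return shares


# Hypothetical data: absolute values for two years.
shares = to_percent_shares({
    "2021": {"A": 10, "B": 30},
    "2022": {"A": 25, "B": 25},
})
print(shares)
```

Every bar now has the same height of 100%, so the chart shows how the composition changes but no longer shows that the totals differ (40 versus 50 in this example).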

G. Waterfall charts

Waterfall charts pull apart the pieces of a stacked bar chart and show each subcomponent separately. The first bar starts from its natural base value and the rest of the bars start at the value of the previous bar and can have a positive or a negative value. 

Figure 31 : Description of this image is directly below

Description for Figure 31

This is an image of a 4 category waterfall chart.
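The placement rule for waterfall bars can be sketched as a running total. The function name `waterfall_bars`, the labels and the amounts below are hypothetical, and a base value of zero is assumed.

```python
def waterfall_bars(changes, base=0):
    """Return (label, start, end) for each bar of a waterfall chart."""
    bars = []
    level = base
    for label, change in changes:
        # Each bar starts at the value where the previous bar ended;
        # a negative change produces a bar that goes down.
        bars.append((label, level, level + change))
        level += change
    return bars


# Hypothetical subcomponents with positive and negative values.
bars = waterfall_bars(
    [("Revenue", 100), ("Costs", -40), ("Tax", -10), ("Other", 5)]
)
for label, start, end in bars:
    print(f"{label}: from {start} to {end}")
```

The end of the last bar (55 here) is the net result of all the subcomponents.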

3. Line charts

Unlike bar charts, which emphasize individual values, line charts emphasize continuity and evolution from point to point. They are commonly used to show changes and trends over time.

Figure 32 : Description of this image is directly below

Description for Figure 32

This is an image of a line chart that displays data over 4 years.

Line chart family

A. Grouped line charts

It is possible to present two or more series of data in a grouped line chart. However, the more series there are, the more difficult it is to focus on one at a time.

Figure 37

Description for Figure 37

This is an image of a 3 category line chart that displays data over 10 years. Each line is a different colour in order to differentiate the measured items from one another.

B. Slope graphs

Slope graphs illustrate the relative increase or decrease in a set of variables between two data points. They provide a clear visual ordering among variables and can be used to visualize a ranking.

Figure 38 : Description of this image is directly below

Description for Figure 38

This is an image of a slope graph that displays data for three categories over 2 years.

4. Point charts

Point charts are typically used to illustrate the trend or pattern of frequency distribution of variables. They usually have an additional element, such as a regression line, which shows the estimated slope of a model.
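As an illustration of how such a line can be obtained, the sketch below fits a straight line by ordinary least squares. The guide does not say which model is used, so the choice of least squares, the function name `fit_line` and the data points are all assumptions.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs")
    x_mean, y_mean = fmean(xs), fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

Drawing the line from the smallest to the largest x value over the points gives the additional element described above.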

Point chart family

A. Lollipop graph

A lollipop graph is similar to a bar chart, but it uses a line topped with a point, instead of a bar, to show the value of each variable.

Figure 41 : Description of this image is directly below

Description for Figure 41

This is an image of a lollipop graph displaying 12 lollipops total, one for each month of the year.

B. Strip plots

Strip plots display the value of each point in a data set. They are useful for visualizing the precise value of each element in a small data set.

Figure 42 : Description of this image is directly below

Description for Figure 42

This is an image of a strip plot.

C. Bubble plots

With bubble plots, it is possible to illustrate a third element on the same chart, using bubbles that vary in size depending on the value.

Figure 43 : Description of this image is directly below

Description for Figure 43

This is an image of 7 bubbles displayed in a bubble plot. The bubbles are of varying sizes which represent the value of a third variable.
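One way to size the bubbles is sketched below. It assumes that the area of each bubble, rather than its radius, should be proportional to the value; this is a common convention and not something the guide specifies. The function name `bubble_radii` and the data are hypothetical.

```python
import math


def bubble_radii(values, max_radius):
    """Return radii so that each bubble's area is proportional to its value."""
    if min(values) < 0:
        raise ValueError("bubble sizes cannot represent negative values")
    largest = max(values)
    if largest == 0:
        raise ValueError("at least one value must be positive")
    # Area grows with the square of the radius, so the radius is scaled
    # by the square root of the value's share of the largest value.
    return [max_radius * math.sqrt(value / largest) for value in values]


# Hypothetical values of the third variable.
radii = bubble_radii([25, 100], max_radius=10)
print(radii)
```

With this scaling, a value four times larger gives a bubble with four times the area and twice the radius.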

5. Maps

A map is a geospatial design that displays information on geographical locations.

Figure 44 : Description of this image is directly below

Description for Figure 44

This is an image of a map of Canada where different shades of colour are assigned to defined regions.

A choropleth map uses colours to convey information: different shades of colour are assigned to defined regions such as countries, provinces, and cities.

Map family

A. Tree maps

This type of map uses rectangles whose sizes are proportional to the relative size of each category.

Figure 45 : Description of this image is directly below

Description for Figure 45

This is an image of a 5 category tree map displaying rectangles proportional to the relative size of each of the 5 categories.
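The proportionality between rectangle size and category size can be sketched with a very simple layout that cuts one large rectangle into vertical strips. Charting tools generally use more elaborate layouts, so this is only an illustration; the function name `slice_layout` and the data are hypothetical.

```python
def slice_layout(values, width, height):
    """Split a width x height area into strips proportional to each value.

    Returns a list of (category, x, y, strip_width, strip_height).
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("a tree map needs a positive total")
    rectangles = []
    x = 0.0
    for name, value in values.items():
        # All strips share the same height, so area is proportional to width.
        strip_width = width * value / total
        rectangles.append((name, x, 0.0, strip_width, height))
        x += strip_width
    return rectangles


# Hypothetical data: three categories in a 100 x 50 area.
rectangles = slice_layout({"A": 50, "B": 30, "C": 20}, width=100, height=50)
for rectangle in rectangles:
    print(rectangle)
```

The areas of the strips (2500, 1500 and 1000 here) are in the same ratio as the category values.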

B. Tile grid maps

In a traditional geographical map, the size of each area affects how we process the information. In tile grid maps, elements of the same size and shape are used, so the audience can process the information without the size of each area influencing their judgement.

Figure 46 : Description of this image is directly below

Description for Figure 46

This is an image of a tile grid map.

Glossary

Base value

The base value (also called "baseline") is the natural starting point of a variable. Usually the base value of variables is zero, but there are cases where the logical base value is different (e.g. the price index has a base value of 1).

Uncertainty

In statistics, uncertainty refers to the fact that estimates based on a sample or projections may not reflect the true value.

Continuous variable

A continuous variable can take all possible values in a predefined range, as opposed to discrete variables, which can only take certain values in a range, usually integers.

