Interviewing data: exploratory graphical analysis

Introducing Tableau Public

In this week’s class we will work with Tableau Public, which allows you to create a wide variety of interactive charts, maps and tables and organize them into dashboards and stories that can be saved to the cloud and embedded on the web.

The free Public version of the software requires you to save your visualizations to the open web. If you have sensitive data that needs to be kept within your organization, you will need a license for the Desktop version of the software. Members of Investigative Reporters and Editors can obtain a free license.

Tableau was developed for exploratory graphical data analysis, so it is a good tool for exploring a new dataset — filtering, sorting and summarizing/aggregating the data in different ways while experimenting with various chart types.

Although Tableau was not designed as a publication tool, the ability to embed finished dashboards and stories has also allowed newsrooms and individual jorunalists lacking JavaScript coding expertise to create interactive online graphics.

The data we will use today

Download the data for this session from here, unzip the folder and place it on your desktop. It contains the following file:

Visualize the data on neonatal mortality

Connect to the data

Launch Tableau Public, and you should see the following screen:

Under the Connect heading at top left, select Text File, navigate to the file nations.csv and Open. At this point, you can view the data, which will be labeled as follows:

You can edit fields to give them the correct data type if there are any problems:

Once the data has loaded, click Sheet 1 at bottom left and you should see a screen like this:

Dimensions and measures: categorical and continuous

The fields should appear in the Data panel at left. Notice that Tableau has divided the fields into Dimensions and Measures. These broadly correspond to categorical and continuous variables. Dimensions are fields containing text or dates, while measures contain numbers.

If any field appears in the wrong place, click the small downward-pointing triangle that appears when it is highlighted and select Convert to Dimension or Convert to Measure as required.

Shelves and Show Me

Notice that the main panel contains a series of “shelves,” called Pages, Columns, Rows, Filters and so on. Tableau charts and maps are made by dragging and dropping fields from the data into these shelves.

Over to the right you should see the Show Me panel, which will highlight chart types you can make from the data currently loaded into the Columns and Rows shelves. It is your go-to resource when experimenting with different visualization possibilities. You can open and close this panel by clicking on its title bar.

Columns and rows: X and Y axes

The starting point for creating any chart or map in Tableau is to place fields into Columns and Rows, which for most charts correspond to the X and Y axes, respectively. When making maps, longitude goes in Columns and latitude in Rows. If you display the data as a table, then these labels are self-explanatory.

Some questions to ask this data

Create new calculated variables

The data contains fields on birth and neonatal death rates, but not the total numbers of births and deaths, which must be calculated. From the top menu, select Analysis>Create Calculated Field. Fill in the dialog box as follows (just start typing a field name to select it for use in a formula):

Notice that calculated fields appear in the Data panel preceded by an = symbol.

Now create a second calculated field giving the total number of neonatal deaths:

In the second formula, we have rounded the number of neonatal deaths to the nearest thousand using -3 (-2 would round to the nearest hundred, -1 to the nearest ten, 1 to one decimal place, 2 to two decimal places, and so on.)

Here we have simply run some simple arithmetic, but it’s possible to use a wide variety of functions to manipulate data in Tableau in many ways. To see all of the available functions, click on the little gray triangle at the right of the dialog boxes above.

Understand that Tableau’s default behavior is to summarize/aggregate data

As we work through today’s exercise, notice that Tableau routinely summarizes or aggregates measures that are dropped into Columns and Rows, calculating a SUM or AVG (average or mean), for example.

This behavior can be turned off by selecting Analysis from the top menu and unchecking Aggregate Measures. However, I do not recommend doing this, as it will disable some Tableau functions. Instead, if you don’t want to summarize all of the data, drop categorical variables into the Detail shelf so that any summary statistic will be calculated at the correct level for your analysis. If necessary, you can set the aggregation so it is being performed on a single data point, and therefore has no effect.

Make a series of treemaps showing neonatal deaths over time

A treemap allows us to directly compare the neonatal deaths in each country, nested by region.

Drag Country and Region onto Columns and Neonatal deaths onto Rows. Then open Show Me and select the treemap option. The initial chart should look like this:

Look at the Marks shelf and see that the size and color of the rectangles reflect the SUM of Neonatal deaths for each country, while each rectangle is labeled wuth Region and Country:

Now drag Region to Color to remove it from the Label and color the rectangles by region, using Tableau’s default qualitative color scheme for categorical data:

For a more subtle color scheme, click on Color, select Edit Colors... and at the dialog box select the Tableau Classic Medium qualitative color scheme, then click Assign Palette and OK.

(Tableau’s qualitative color schemes are well designed, so there is no need to adopt a ColorBrewer scheme. However, it is possible to edit colors individually as you wish.)

Click on Color and set transparency to 75%. (For your assignment you will create a chart with overlapping circles, which will benefit from using some transparency to allow all circles to be seen. So we are setting transparency now for consistency.)

The treemap should now look like this:

Tableau has by default aggregated Neonatal deaths using the SUM function, so what we are seeing is the number for each country added up across the years.

To see one year at a time, we need to filter by year. If you drag the existing Year variable to the Filters shelf, you will get the option to filter by a range of numbers, which isn’t what we need:

Instead, we need to be able check individual years, and draw a treemap for each one. To do that, select Year in the Dimensions panel and Duplicate.

Select the new variable and Convert to Discrete and then Rename it Year (discrete). Now drag this new variable to Filters, select 2014, and click OK:

The treemap now displays the data for 2014:

That’s good for a snapshot of the data, but with a little tinkering, we can adapt this visualization to show change in the number of neonatal deaths over time at the national, regional and global levels.

Select Year (discrete) in the Filters shelf and Filter ... to edit the filter. Select all the years with even numbers and click OK:

Now drag Year (discrete) onto Rows and the chart should look like this:

The formatting needs work, but notice that we now have a bar chart made out of treemaps.

Extend the chart area to the right by changing from Standard to Entire View on the dropdown menu in the top ribbon:

I find it more intuitive to have the most recent year at the top, so select Year (discrete) in the Rows shelf, select Sort and fill in the dialog box so that the years are sorted in Descending order:

The chart should now look like this:

We will create a map to serve as a legend for the regions, so click on the title bar for the color legend and select Hide Card to remove it from the visualization.

To remove some clutter from the chart, select Format>Borders from the top menu, and under Sheet>Row Divider, set Pane to None. Then close the Format Borders panel.

Right-click on the Sheet 1 title for the chart and select Hide Title. Also right-click on Year (discrete) at the top left of the chart and select Hide Field Labels for Rows. Then hover just above the top bar to get a double-arrowed drag symbol and drag upwards to reduce the white space at the top. You may also want to drag the bars a little closer to the year labels.

The labels will only appear in the larger rectangles. Rather than removing them entirely, let’s just leave a label for India in 2014, to make it clear that this is the country with by far the largest number of neonatal deaths. Click on Label in the Marks shelf, and switch from All to Selected under Marks to Label. Then right-click on the rectangle for India in 2014, and select Mark Label>Always Show. The chart should now look like this:

Hover over one of the rectangles, and notice the tooltip that appears. By default, all the fields we have used to make the visualization appear in the tooltip. (If you need any more, just drag those fields onto Tooltip.) Click on Tooltip and edit as follows. (Unchecking Include command buttons disables some interactivity, giving a plain tooltip):

Save to the web

Right-click on Sheet 1 at bottom left and Rename Sheet to Treemap bar chart. Then select File>Save to Tableau Public... from the top menu. At the logon dialog box enter your Tableau Public account details, give the Workbook a suitable name and click Save. When the save is complete, a view of the visualization on Tableau’s servers will open in your default browser.

Make a map to use as a color legend

Select Worksheet>New Worksheet from the top menu, and double-click on Country. Tableau recognizes the names of countries and states/provinces; for the U.S., it also recognizes counties. Its default map-making behavior is to put a circle at the geographic center, or centroid, of each area, which can be scaled and colored to reflect values from the data:

However, we need each country to be filled with color by region. Using Show Me, switch to the filled maps option, and each nation should fill with color. Drag Region to Color and see how the same color scheme we used previously carries over to the map. Click on Color, set the transparency to 75% to match the bubble chart and remove the borders. Also click on Tooltip and uncheck Show tooltip so that no tooltip appears on the legend.

We will use this map as a color legend, so its separate color legend is unnecessary. Click the color legend’s title bar and select Hide Card to remove it from the visualization. Also remove the Sheet 2 title as before.

Center the map in the view by clicking on it, holding and panning, just as you would on Google Maps. It should now look something like this:

Rename the worksheet Map legend and save to the web again.

Make a line chart showing neonatal mortality rate by country over time

To address our second question, and explore the neonatal death rate over time by country, we can use a line chart.

First, rename Neonat Mortal as Neonatal death rate (per 1,000 births). Then, open a new worksheet, drag this variable to Rows and Year to Columns. The chart should now look like this:

Tableau has aggregated the data by adding up the rates for each country in every year, which makes no sense here. So drag Country to Detail in the Marks shelf to draw one line per country:

Drag region to Color and set the transparency to 75%.

Now right-click on the X axis, select Edit Axis, edit the dialog box as follows and click OK:

Right-click on the X axis again, select Format, change Alignment to Up and use the dropdown menu set the Font to bold. Also remove the Sheet 3 title.

The chart should now look like this:

We can also highlight the countries with the highest total number of neonatal deaths by dragging Neonatal deaths to Size. The chart should now look like this:

This line chart shows that the trend in most countries has been to reduce neonatal deaths, while some countries have had more complex trajectories. But to make comparisons between individual countries, it will be necessary to add controls to filter the chart.

Tableau’s default behavior when data is filtered is to redraw charts to reflect the values in the filtered data. So if we want the Y axis and the line thicknesses to stay the same when the chart is filtered, we need to freeze them.

To freeze the line thicknesses, hover over the title bar for the line thickness legend, select Edit Sizes... and fill in the dialog box as follows:

Now remove this legend from the visualization, together with the color legend. We can later add an annotation to our dashboard to explain the line thickness.

To freeze the Y axis, right-click on it, select Edit Axis..., make it Fixed and click OK:

Right-click on the Y axis again, select Format... and increase the font size to 10pt to make it easier to read.

Now drag Country to Filters, make sure All are checked, and at the dialog box, click OK:

Now we need to add a filter control to select countries to compare. On Country in the Filters shelf, select Show Filter. A default filter control, with a checkbox for each nation, will appear to the right of the chart:

This isn’t the best filter control for this visualization. To change it, click on the title bar for the filter, note the range of filter controls available, and select Multiple Values (Custom List). This allows users to select individual countries by starting to type their names. Then select Edit Title... and add some text explaining how the filter works:

Take some time to explore how this filter works.

Rename Income to Income group. Then add Region and Income group to Filters, making sure that All options are checked for each. Select Show Filter for both of these filters, and select Single Value Dropdown for the control. Reset both of these filters to All, and the chart should now look like this:

Notice that the Income group filter lists the options in alphabetical order, rather than income order, which would make more sense. To fix this, right-click on Income group in the data panel and select Default Properties>Sort. At the dialog box below, select Manual sort, edit the order as follows and click OK:

The chart should now look like this:

Finally, click on Tooltip and edit as follows:

Rename the sheet Line chart and save to the web.

Make a dashboard combining both charts

From the top menu, select Dashboard>New Dashboard. Set its Size to Automatic, so that the dashboard will fill to the size of any screen on which it is displayed:

To make a dashboard, drag charts, and other elements from the left-hand panel to the dashboard area. Notice that Tableau allows you to add items including: horizontal and vertical containers, text boxes, images (useful for adding a publication’s logo), embedded web pages and blank space. These can be added Tiled, which means they cannot overlap, or Floating, which allows one element to be placed over another.

Drag Treemap bar chart from the panel at left to the main panel. The default title, from the worksheet name, isn’t very informative, so right-click on that, select Edit Title ... and change to Total deaths.

Now add Line Chart to the right of the dashboard (the gray area will show where it will appear) and edit its title to Death rates. Also add a note to explain that line widths are proportional to the total number of deaths. The dashboard should now look like this:

Notice that the Country, Region and Income group filters control only the line chart. To make them control the treemaps, too, click on each filter, open up the dropdown menu form the downward-pointing triangle, and select Apply to Worksheets>Selected Worksheets... and fill in the dialog box as follows:

The filters will now control both charts.

Add Map legend for a color legend at bottom right. (You will probably need to drag the window for the last filter down to push it into position.) Hide the legend’s title then right-click on the map and select Hide View Toolbar to remove the map controls.

We can also allow the highlighting of a country on one chart to be carried across the entire dashboard. Select Dashboard>Actions... from the top menu, and at the first dialog box select Add action>Highlight. Filling the second dialog box as follows will cause each country to be highlighted across the dashboard when it is clicked on just one of the charts:

Click OK on both dialog boxes to apply this action.

Select Dashboard>Show Title from the top menu. Right-click on it, select Edit Title... and change from the default to something more informative:

Now drag a Text box to the bottom of the dashboard and add a footnote giving source information:

The dashboard should now look like this:

Design for different devices

This dashboard works on a large screen, but not on a small phone. To see this, click the Device Preview button at top left and select Phone under Device type. In portrait orientation, this layout does not work at all:

Click the Add Phone Layout at top right, and then click Custom tab under Layout - Phone in the left-hand panel. You can then rearrange and if necessary remove elements for different devices. Here I have removed the line chart and filter controls, and changed the legend to a Floating element so that it sits in the blank space to the top right of the bar chart of treemaps.

Now save to the web once more. Once the dashboard is online, use the Share link at the bottom to obtain an embed code, which can be inserted into the HTML of any web page.

(You can also Download a static view of the graphic as a PNG image or a PDF.)

You can download the workbook for any Tableau visualization by clicking the Download Workbook link. The files (which will have the extension .twbx) will open in Tableau Public.

Having saved a Tableau visualization to the web, you can reopen it by selecting File>Open from Tableau Public... from the top menu.

Another approach to responsive design

As an alternative to using Tableau’s built-in device options, you may wish to create three different dashboards, each with a size appropriate for phones, tablets, and desktops respectively. You can then follow the instructions here to put the embed codes for each of these dashboards into a div with a separate class, and then use @media CSS rules to ensure that only the div with the correct dashboard displays, depending on the size of the device.

If you need to make a fully responsive Tableau visualization and are struggling, contact me for help!

When making responsively designed web pages, make sure to include this line of code between the <head></head> tags of your HTML:

<meta name="viewport" content="width=device-width, initial-scale=1.0">

From dashboards to stories

Tableau also allows you to create stories, which combine successive dashboards into a step-by-step narrative. Select Story>New Story from the top menu. Having already made a dashboard, you should find these simple and intuitive to create. Select New Blank Point to add a new scene to the narrative.

Assignment

Further reading/viewing

Tableau Public training videos

Gallery of Tableau Public visualizations: Again, you can download the workbooks to see how they were put together.

Tableau Public Knowledge Base: Useful resource with the answers to many queries about how to use the software.