Hands On Training: Intro To Tableau: About This Data
The Excel file contains geographic data (city and country), time data (date of observation), product information (name, quantity) and
price (standardized quantities, prices standardized to USD).
Agenda
1) Format of the Training
2) Connect to Data & Data Cleanup
3) Overview of Tableau Basics and Concepts
4) Mapping the Data
5) Analysis
6) (Optional) LOD Expressions Exercise
7) Story Points
8) Dashboards in Story Points
Connect to Data
Connect to an Excel file and join two tables
Overview:
Open the file “Crowd Sourced Grocery Prices” and clean up the data
Detailed Steps:
1) Open Tableau Desktop 9.0 (Tableau 9.0 Beta if applicable)
2) On the Connect Pane, click Excel
3) Navigate to the file on your machine (desktop) and open it
4) Click and drag out the sheet named Observations
5) View the data in the preview pane below
6) Find the data field called Location. Split this field into City and Country (we’ll want to analyze these separately)
a. Click the drop down menu next to Location and select Split
b. Change the Geographic Role for Location – Split 1 to be City by clicking the =Abc Icon
c. Change the Geographic Role for Location – Split 2 to be Country by clicking the =Abc Icon
d. Rename the column headers by editing the metadata
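For readers curious about what the Split step does behind the scenes, here is a minimal Python sketch of the same idea. It assumes Location values look like "City, Country" with a comma separator; the sample values and the split_location helper are hypothetical and are not taken from the Excel file.

```python
def split_location(location, separator=","):
    """Split a 'City, Country' string into a (city, country) pair.

    Splits on the first separator only and trims surrounding
    whitespace. If no separator is found, country is empty.
    """
    city, _, country = location.partition(separator)
    return city.strip(), country.strip()


# Hypothetical sample values, for illustration only.
locations = ["Nairobi, Kenya", "Lagos, Nigeria"]
split_rows = [split_location(value) for value in locations]
```

Each pair corresponds to the Location – Split 1 and Location – Split 2 columns that Tableau creates.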
Overview of Tableau Basics and Concepts
Concepts to go over
1) A good way to approach data analysis is to start by asking questions of the data, then planning how to answer
the questions with Tableau. Often the questions will change as you begin working with the data but it’s helpful
to have a guiding question when you start.
3) Tableau creates some fields that can be used in a visualization that do not exist in the original data set.
o If the data set contains geographic fields, such as country or city, Tableau searches an internal database
and generates Latitude and Longitude fields. This enables the geographic data to be plotted on a map
o Number of Records is a simple count of rows in the data set
4) Show Me can be accessed in the upper right corner of the screen. With field(s) selected, Show Me offers
one-click options for chart types
Take the next few minutes and explore the data – ask a question and try to answer it, play
with dragging fields to different places or using Show Me to get a feel for how the software
behaves.
Mapping the Data
What does our data look like globally?
Overview:
I. Where is data coming in from?
a. Plot countries as a map
II. Which country has the most expensive average price?
a. Color by Price, changing the default aggregation to average
III. Make a filled map of countries
Detailed Steps:
1) Clear the sheet (in the ribbon) or create a new sheet (at the bottom)
2) Drag Country from the data window to the canvas
a. If you get anything other than a map, undo and try dropping the field Country into the large, bottom
right rectangle that says “Drop field here”.
i. Alternatively, double click on the field name to bring Country out as a map
b. Note that the generated Latitude and Longitude were automatically plotted on the Rows and Columns
shelves
Make a filled map of countries
If desired, try creating additional maps to answer further questions. What cities are
represented in the data? Are there equal numbers of observations from all locations?
Analysis
Questions for analysis: How much variation is there in product prices across each country?
Overview:
1) If the analysis relates to variation, what visualization type makes the most sense?
Detailed Steps:
How much variation is there in product prices per country?
1) Decide which chart type to use. Here, we’ll use Bar Charts.
2) Holding down the Control key (Command on a Mac) click to select the fields: Product Name, Price
3) With those two fields selected, click on the Show Me tab
4) Select the Horizontal Bar Chart and click the Show Me tab again to close it
5) Double click on the “SUM(Price)” pill on the Columns shelf
a. Change the aggregation from a SUM to a MEDIAN by replacing SUM with MEDIAN
b. Hit Enter
6) Hover over the words “Median Price” on the X axis until the Sort icon appears. Click the icon to sort
7) Drag Country from the Rows shelf to the Columns shelf in front of Price.
8) Drag a new copy of Country from Dimensions to the Color shelf
9) Right click on the sheet tab and rename the sheet “Price Variation by Country”
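To see why switching the aggregation from SUM to MEDIAN matters, here is a minimal Python sketch of the same grouping. The observations list and the aggregate helper are hypothetical; the prices are invented for illustration and do not come from the training file.

```python
from collections import defaultdict
from statistics import median

# Hypothetical observations: (country, product, price in USD).
observations = [
    ("Kenya", "Rice", 1.0),
    ("Kenya", "Rice", 3.0),
    ("Kenya", "Rice", 20.0),
    ("Nigeria", "Rice", 2.0),
    ("Nigeria", "Rice", 4.0),
]


def aggregate(rows, func):
    """Group prices by (country, product) and apply func to each group."""
    groups = defaultdict(list)
    for country, product, price in rows:
        groups[(country, product)].append(price)
    return {key: func(prices) for key, prices in groups.items()}


sum_price = aggregate(observations, sum)        # grows with the number of rows
median_price = aggregate(observations, median)  # typical price per group
```

In this sample, SUM makes Kenya look four times as expensive as Nigeria only because it has more rows and one extreme value, while the median is the same for both.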
Analysis:
Here, we can get a sense of overall prices across countries. Nigeria has more tall bars than other countries,
indicating that it has higher prices. What other patterns do you see? What insight do you get?
1) Click to the Analytics pane
2) Drag Average Line and drop on Pane to add an average reference line per pane
3) Click on a reference line and select Edit
4) Choose to label the line with the Value
5) Click OK
6) Play around with this view. Try selecting multiple bars. What happens to the Average line?
7) Right click on the sheet tab and rename the sheet “Price Variation by Country”
If time remains, explore other analyses of the data. Can you create a boxplot of the data to see
what products are outliers across countries? What quantities were looked at for each product?
How might you best compare prices for a single product across countries?
Optional LOD Expression Exercise
Which City has the greatest maximum price? Which Country has the highest average of the
maximum price?
Overview:
1) Brief explanation of creating a Calculated Field
2) Creating and using an LOD Expression
For example, in this particular dataset, City is more granular than Country. If we start with a view of only Country and
MAX(Price), each row of data (price of one item in one city) will be grouped by Country and then the maximum price is
shown on the view.
Note: the default aggregation of SUM has been changed to a MAX, similarly to how the SUM(Price) was changed to
AVG(Price) earlier in the training.
If we place City next to Country on the Rows Shelf, the length of the bars becomes the maximum price within each
City/Country combination (i.e. the data is more granular than the previous view).
Let’s say we’d like to take the maximum price per city and average that for each country. We want to use the
information about each city’s maximum price without having to make our view granular to the level of city.
LOD Expressions allow you to explicitly define the level of aggregation on the view—we’ll explicitly define the level of
detail to be City. The syntax for using an LOD Expression is as follows:
The value resulting from this LOD Expression (on the right in the image below) gives us the average of the maximum
Price for each city. It is computed at the level of Country, because of how we built the view, while also INCLUDING
the level of City, as set in the LOD Expression. On the other hand, if we had just placed AVG(Price) on the view (on
the left in the image below), the value would be the average Price for each country, since that is the granularity of
the view.
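The difference between the two values can be sketched outside Tableau. The Python below is a rough illustration with hypothetical rows and helper names; it mimics what an expression along the lines of `AVG({INCLUDE [City] : MAX([Price])})` would presumably compute in a Country-level view, and is not the exact expression used in the training.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (country, city, price in USD).
rows = [
    ("Kenya", "Nairobi", 2.0),
    ("Kenya", "Nairobi", 10.0),
    ("Kenya", "Mombasa", 4.0),
    ("Nigeria", "Lagos", 6.0),
    ("Nigeria", "Lagos", 8.0),
]


def avg_price_by_country(rows):
    """Plain AVG(Price) at the Country level of detail."""
    prices = defaultdict(list)
    for country, _city, price in rows:
        prices[country].append(price)
    return {country: mean(values) for country, values in prices.items()}


def avg_of_city_max_by_country(rows):
    """Take MAX(Price) per city first, then average those maxima per country."""
    city_max = {}
    for country, city, price in rows:
        key = (country, city)
        city_max[key] = max(price, city_max.get(key, price))
    maxima = defaultdict(list)
    for (country, _city), value in city_max.items():
        maxima[country].append(value)
    return {country: mean(values) for country, values in maxima.items()}


plain_avg = avg_price_by_country(rows)
lod_avg = avg_of_city_max_by_country(rows)
```

With these sample rows, Kenya's plain average is about 5.33, while the average of its city maxima (10 for Nairobi, 4 for Mombasa) is 7.0.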
Follow along with the remainder of the training to see how to use this LOD Expression in the view to answer our initial
question.
Detailed Steps:
3) Create a Calculated Field
o Right click in the Data Pane to Create a Calculated Field. If you click on a field instead of white space,
your menu may look slightly different. If you don’t see Create calculated field, look for a Create menu,
expand that, then click Calculated field.
4) The Calculation Editor – for a more thorough discussion of Calculations, refer to the Online Help.
o Calculation name – orange – give your calculation a name
o Calculation editor canvas – green – areas where calculations can be built. These calculations may
include, but are not limited to using fields, functions, and other operators.
o Functions – blue – using functions, you may compute values such as a sum, average, and minimum, to
just name a few. You can use the menu to drill down into specific types of functions (date, string,
number, etc.).
o Function description – purple – additional information on using the function
The remainder of this LOD Expression exercise goes a bit beyond true introductory topics, which is why it is optional,
but it’s a game-changing 9.0 feature and you’ll want to have at least a basic knowledge of it.
From our previous view, we know that there is a large variation in the prices of each Product Name. Because of this, the
current view may be somewhat misleading as it only accounts for the variation in prices among the cities. Let’s filter the
data to show only one Product Name at a time.
10) Right click on Product Name anywhere you see it and select “Show Quick Filter”
o Click on the caret in the upper right corner of the filter to bring up the menu and select “Single Value
(Dropdown)”
11) Click through a few different Product Names. Does Kenya have the highest average of maximum price for each
Product Name?
If you’d like to dive deeper into LOD Expressions, please stay tuned for the LOD Expressions whitepaper.
Explore a few other views using the LOD Expression we created. What other insights can you
find?
Story
Use Story Points to present your findings on the data set
Overview:
Create a Story using the visualizations created above to tell the story of the data. Stories can have captions, floating
descriptions, and are fully interactive. Filter selections can be saved (updated) or duplicated as new points.
Detailed Steps:
I. Create a story and add a point with description
10) In the ribbon, use the drop down to change the fit from “Normal” to “Entire View”
a. If you want to maximize screen real estate, you can remove the color legend. Click on the caret on the
Countries color legend and select “Hide Card”
11) Click back on the story to verify there are no longer scroll bars on the story
a. Note: most changes to a visualization must be made on the underlying sheet, not in the Story
Rename the story to be “Crowd Sourced Grocery Prices”. Now let’s go back and make a couple more visualizations to
put into our Story.
Dashboards and Stories
Questions for analysis: Are there price fluctuations or have prices held fairly steady?
Overview:
1) If the analysis relates to variation, what visualization type makes the most sense?
2) Note: instead of choosing a specific country or product, allow the end-user to choose by using quick filters
Detailed Steps:
Are there price fluctuations or have prices held fairly steady?
3) Create a new sheet
4) Drag Obs Date to the Columns shelf
5) Right click on the pill and select the option “Week Number”
6) Drag Price to the Rows shelf
7) Double click on the “SUM(Price)” pill on the Rows shelf
a. Change the aggregation from a SUM to an Average by replacing SUM with AVG
b. Hit Enter
8) Drag Country to Color
9) Right click on Country anywhere you see it and select “Show Quick Filter”
a. If desired, click on the caret in the upper right corner of the filter to bring up the menu and select
“Multiple Value (Dropdown)”
10) Drag Product Name to Detail
11) Right click on Product Name anywhere you see it and select “Show Quick Filter”
a. Click on the caret in the upper right corner of the filter to bring up the menu and select “Single Value
(Dropdown)”
12) Right click on the sheet tab and rename the sheet “Timeline of Price Fluctuation”
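The timeline built above groups observations by week and averages the price per country. Here is a minimal Python sketch of that grouping, with hypothetical rows and a hypothetical helper name. It uses ISO week numbers, which is an assumption; Tableau's Week Number may follow a different week-start convention depending on the data source settings.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical observations: (country, observation date, price in USD).
rows = [
    ("Kenya", date(2015, 1, 5), 2.0),
    ("Kenya", date(2015, 1, 7), 4.0),
    ("Kenya", date(2015, 1, 12), 5.0),
    ("Nigeria", date(2015, 1, 6), 3.0),
]


def avg_price_by_week(rows):
    """Average price per (country, ISO week number)."""
    groups = defaultdict(list)
    for country, obs_date, price in rows:
        week_number = obs_date.isocalendar()[1]  # ISO 8601 week of the year
        groups[(country, week_number)].append(price)
    return {key: mean(prices) for key, prices in groups.items()}


weekly_avg = avg_price_by_week(rows)
```

Each (country, week) key corresponds to one point on a country's colored line in the timeline.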
Analysis:
Clicking through the Product Name filter, what patterns emerge? Which products have more or less price variation?
Which products are the most stable, within or across countries? Are there any countries that deviate from the
overall pattern?
Now let’s build a dashboard to bring two elements together
Overview:
13) Make a dashboard with a timeline and map
Detailed Steps:
Back on the Story
IV. Add another point and finish the story
15) Double click Price Fluctuation to bring it out to the story
16) Click in the navigator box to add a caption
a. “Explore the dashboard to see how prices changed over time”
17) Click and drag out the Description to add a caption
a. “Click on a country to see just the records for that country”
b. “Click on a product (or “All”) to change the timeline”
If you made any other sheets, continue adding new points. Experiment with
updating points or creating new points with filter or highlighted selections.