Preparing Excel Files For Analysis - Tableau Software
Preparing Excel Files for Analysis
Article Note: This article is no longer actively maintained by Tableau. We continue to make it available because the
information is still valuable, but some steps may vary due to product changes.
The first step in exploring your data with Tableau is examining how the data is presented.
When an Excel data source (other than a cube) is already formatted as a cross-tabulation
or is otherwise aggregated, options for viewing, aggregating, and grouping in Tableau are
limited. Tableau cannot see underlying data points that have already been summarized
into a higher level group or order. To take advantage of Tableau's full functionality, you
need to normalize the data - that is, format it as raw data - before connecting to it from
Tableau.
For example, consider the two workbooks shown below. The first one is a formatted report
with repeated headers, empty rows, grand totals, and so on. When you open the workbook
in Tableau, your data should instead be in a raw data table like the second one.
Incorrect - Formatted Report
Below are some tips for turning your formatted reports into a raw data table that is ready
for analysis in Tableau.
Remove or exclude introductory and other unnecessary text
The first row in the entire file must contain your field headers (or column names). Many
reports delivered as Excel workbooks have a block of introductory text at the top. This text
may be titles, color legends, descriptions, and so on. Remove all this information before
opening the data with Tableau.
Remove unnecessary information at the top of the file.
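If the report is regenerated often, the introductory block can also be stripped programmatically before the file is saved for Tableau. The following is a minimal sketch, not a Tableau feature: it assumes the sheet has already been read into a list of rows, and the function name `drop_intro_rows` and the sample report are hypothetical.

```python
def drop_intro_rows(rows, expected_headers):
    """Return the rows starting at the header row, discarding everything above it."""
    wanted = [h.strip().lower() for h in expected_headers]
    for index, row in enumerate(rows):
        # Ignore empty cells so a title that sits in one cell never matches.
        cells = [str(cell).strip().lower() for cell in row if cell is not None]
        if cells == wanted:
            return rows[index:]
    raise ValueError("header row not found")


# Hypothetical report: two lines of introductory text, then the real table.
report = [
    ["Student Grades Report", None, None],
    ["Legend: scores are out of 100", None, None],
    ["ID", "School", "Math"],
    [1, "West", 90],
    [2, "South", 50],
]

table = drop_intro_rows(report, ["ID", "School", "Math"])
print(table[0])  # ['ID', 'School', 'Math']
```

Matching on the expected header names, rather than skipping a fixed number of rows, keeps the sketch working when the length of the introductory text changes.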
If you don't want to remove introductory text, you can alternatively create a Named
Range that contains just the data. When opening Excel workbooks in Tableau, you can
connect to an entire sheet or a named range within a sheet.
In Excel:
Step 1
Select the data.
Step 2
On the Formulas tab, in the Defined Names section, select Define Name.
Step 3
In the New Name dialog box, in the Name text box, Excel offers a name based on the
content of the top-left cell of the selected data range. Keep this name.
Step 4
In the Scope list, select Sheet1.
Step 5
When finished, click OK.
In Tableau:
Your named range is offered as a table when you connect to the Excel workbook.
Make sure each row contains only one piece of data
This example shows an Excel table that lists students and their grades in three subjects. In
a crosstab layout, you have a column for each subject. In this table, each row contains
three pieces of data: the student's grade in Math, grade in English, and grade in Science.
ID  Gender  School   Math  English  Science
1           West     90    80       70
2           South    50    50       50
3           Central  90    80       90
4           Central  50    80       80
5           West     100   90       100
6   F       West     80    80       60
7           South    50    80       100
8           Central  80    50       100
9   M       South    70    80       80
Replace the columns Math, English, and Science with a single column: Subject. Now the
table contains three rows for each student, but each row contains only one grade.
1           West     Math     90
1           West     English  80
1           West     Science  70
2           South    Math     50
2           South    English  50
2           South    Science  50
3           Central  Math     90
3           Central  English  80
3           Central  Science  90
4           Central  Math     50
4           Central  English  80
4           Central  Science  80
5           West     Math     100
5           West     English  90
5           West     Science  100
6   F       West     Math     80
6   F       West     English  80
6   F       West     Science  60
7           South    Math     50
7           South    English  80
7           South    Science  100
8           Central  Math     80
8           Central  English  50
8           Central  Science  100
9   M       South    Math     70
9   M       South    English  80
9   M       South    Science  80
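The same crosstab-to-rows conversion can be sketched in code. This is an illustrative example only, assuming each row of the sheet has been read into a dictionary keyed by column name; the helper name `unpivot` and the value column name "Grade" are assumptions, and the Gender column is left out for brevity.

```python
def unpivot(rows, id_columns, value_columns, var_name, value_name):
    """Turn one wide row into one narrow row per value column."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            # Copy the identifying columns, then add one subject/grade pair.
            record = {key: row[key] for key in id_columns}
            record[var_name] = column
            record[value_name] = row[column]
            long_rows.append(record)
    return long_rows


# Two students taken from the table above.
crosstab = [
    {"ID": 1, "School": "West", "Math": 90, "English": 80, "Science": 70},
    {"ID": 9, "School": "South", "Math": 70, "English": 80, "Science": 80},
]

long_rows = unpivot(
    crosstab,
    id_columns=["ID", "School"],
    value_columns=["Math", "English", "Science"],
    var_name="Subject",
    value_name="Grade",
)

for record in long_rows:
    print(record)
```

Each input row yields three output rows, one per subject, so two students produce six records.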
Limit headers to a single row
Not only should the first row contain your field headers, but also this should be the only
row of headers. If you have headers that include some type of "categorical" breakdown
above them, create a new column that contains the category.
In this example, East is removed as a categorical header, and a new column, Region, is
added to the table.
Fill blank cells
If you have created a new column for categories, make sure to fill the blank cells so that
the information is repeated for each row of data, not just the first occurrence. While this
seems redundant, it is important that each record (or row) has data across all the
columns.
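Filling a category column downward can be sketched as follows. This is a minimal illustration, assuming rows are dictionaries keyed by column name; the helper name `fill_down` and the sample sales rows are hypothetical.

```python
def fill_down(rows, column):
    """Repeat the last non-blank value of `column` in every blank cell below it."""
    filled = []
    last_seen = None
    for row in rows:
        value = row.get(column)
        if value is None or str(value).strip() == "":
            value = last_seen  # blank cell: reuse the value from above
        else:
            last_seen = value  # non-blank cell: remember it for later rows
        filled.append({**row, column: value})
    return filled


# Hypothetical rows where the category appears only on its first occurrence.
rows = [
    {"Region": "East", "Product": "A", "Sales": 10},
    {"Region": None, "Product": "B", "Sales": 20},
    {"Region": "West", "Product": "A", "Sales": 30},
    {"Region": "", "Product": "B", "Sales": 40},
]

filled = fill_down(rows, "Region")
print([row["Region"] for row in filled])  # ['East', 'East', 'West', 'West']
```

The function returns new dictionaries rather than modifying the input, so the original rows remain available for comparison.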
Clean up aggregated and descriptive data
Make sure to remove the rows that do not contain raw data records. For example, an Excel
report may have rows that contain descriptive information and Grand Totals rows. You
can easily add totals in Tableau and do not need to calculate them in your data source.
Delete blank rows and duplicate headers
Remove any blank rows and rows that contain duplicate headers.
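The row cleanup described in this section and the previous one can be sketched together. The example below is only an illustration: it assumes the sheet is a list of rows with the header first, and that total rows are labeled in the first column. The function name `keep_raw_records` and the sample data are hypothetical.

```python
def keep_raw_records(rows, total_labels=("total", "grand total")):
    """Keep the header and raw records; drop blank, repeated-header, and total rows."""
    header = rows[0]
    header_cells = [str(cell).strip() for cell in header]
    cleaned = [header]
    for row in rows[1:]:
        cells = ["" if cell is None else str(cell).strip() for cell in row]
        if not any(cells):
            continue  # blank row
        if cells == header_cells:
            continue  # duplicate header row
        if cells[0].lower() in total_labels:
            continue  # aggregated row (assumed to be labeled in the first column)
        cleaned.append(row)
    return cleaned


# Hypothetical formatted report with a blank row, a repeated header, and a total.
report = [
    ["School", "Subject", "Grade"],
    ["West", "Math", 90],
    [None, None, None],
    ["School", "Subject", "Grade"],
    ["South", "Math", 50],
    ["Grand Total", None, 140],
]

cleaned = keep_raw_records(report)
print(len(cleaned))  # 3
```

Reports that label totals differently would need a different test for the aggregated rows; the label list here is a placeholder.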
Add missing headers
If any column does not have a title, make sure to add one. Be descriptive when writing
your column headers.
Reshape the entire structure of your Excel data using Tableau's Excel Reshaper Plugin
Even if you have followed all the suggestions shown above, you may still have data in a
format that is not ideal for Tableau, purely from an analytic perspective. For example, you
may still have a column for each month of business data, which Tableau interprets as
separate columns, making month-to-month comparisons difficult.
You can use the Tableau plug-in for Excel to reshape your data. You still need to follow
any/all of the steps in this article. Download the plug-in from the Installing the Tableau
Add-in for Reshaping Data in Excel knowledge base article.
2003-2016 Tableau Software. All rights reserved.