PowerQuery PowerPivot DAX
2 Power Query
• Types of data connectors, query editing tools, loading options, etc.
6 Final Project
• VanArsdel sales data (2000-2010)
VERSIONS & COMPATIBILITY
For a full, current list of compatible versions, visit support.office.com (or Google “Where is Power Pivot?”):
https://support.office.com/en-us/article/Where-is-Power-Pivot-aa64e217-4b6e-410b-8337-20b87e1c2a4b (or use: bit.ly/2yd80rd)
Other considerations:
• Power Pivot works best with 64-bit Excel, which can access more processing power and memory (not critical)
• Note: make sure you’re running a 64-bit operating system and that you’ve updated Office to the 64-bit version
• Power Pivot menus, features and tools have evolved over time; what you see on your screen may differ from
what you see on mine, but the fundamental skills and concepts covered are universally applicable
• Even if you have a compatible version of Excel, you may need to enable the Power Pivot or Power Query
plug-ins to access the tools in this course (File > Options > Add-Ins > Manage: COM Add-Ins)
GETTING TO KNOW THE FOODMART DATABASE
• Throughout the course, we’ll be using sample data from a fictitious supermarket chain
called “FoodMart”*
• In addition to daily transactional records from 1997-1998, our data set includes
information about products, customers, stores, and regions
• All files are available for download in the course resources section of your course
dashboard (Course Dashboard > Course Content > All Resources)
• Transactions: transaction_date, stock_date, product_id, customer_id, store_id, quantity
• Returns: return_date, product_id, store_id, quantity
• Customer Lookup: customer_id, customer_acct_num, first_name, last_name, customer_address, etc.
• Calendar Lookup: date, month_num, quarter, year, weekday_num, etc.
• Product Lookup: product_id, product_brand, product_name, product_sku, product_retail_price, etc.
• Store Lookup: store_id, region_id, store_type, store_name, store_street_address, etc.
• Region Lookup: region_id, sales_district, sales_region
*This data is provided by Microsoft for informational purposes only as an aid to illustrate a concept. These samples are provided “as is” without warranty of any kind. The example companies, organizations, products, domain names,
e-mail addresses, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, person, place, or event is intended or should be inferred.
SETTING EXPECTATIONS
2 This course is designed to get you up & running with Excel’s BI tools
• The goal is to provide a solid foundational understanding of Power Query, Power Pivot and DAX; we may
simplify some concepts to make them easier to grasp, and will not cover some of the more advanced tools
These are Excel’s Business Intelligence tools, all of which are available directly in Excel
(provided you have a compatible version); no additional software is required!
RAW DATA: Flat files (csv, txt), Excel tables, databases (SQL, Azure), folders, streaming sources, web data, etc.
POWER QUERY (aka “Get & Transform”): Connect to sources, import data, and apply shaping and transformation tools (ETL)
DATA MODEL: Create table relationships, add calculated columns, define hierarchies and perspectives, etc.
POWER PIVOT & DAX: Explore and analyze the entire data model, and create powerful measures using Data Analysis Expressions (DAX)
“THE BEST THING TO HAPPEN TO EXCEL IN 20 YEARS”
Use Power Query and Power Pivot when you want to…
Data connector types: From File, From Database, From Azure, From Online Services, From Other Sources
THE QUERY EDITOR
[Screenshot of the Query Editor, with callouts: Query Editing Tools; Formula Bar (this is “M” code); Name your table!; Data Preview; Applied Steps]
Access the Query Editor by creating a new query and choosing the “Edit” option, or by launching
the Workbook Queries pane (Data > Show Queries) and right-clicking an existing query to edit
QUERY EDITOR TOOLS
The HOME tab includes general settings and common table transformation tools
The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc.)
The ADD COLUMN tools create new columns based on conditional rules, text operations, calculations, dates, etc.
DATA LOADING OPTIONS
When you load data from Power Query, you have several options:
• Table
• Stores the data in a new or existing worksheet
• Requires relatively small data sets (<1 million rows)
• Connection Only
• Saves the data connection settings and applied steps
• Data does not load to a worksheet
Date & Time tools are relatively straightforward, and include the following options:
• Age: Difference between the current time and the date in each row
• Date Only: Removes the time component of a date/time field
• Year/Month/Quarter/Week/Day: Extracts individual components from a date field
(Time-specific options include Hour, Minute, Second, etc.)
• Earliest/Latest: Evaluates the earliest or latest date from a column as a single value (can
only be accessed from the “Transform” menu)
Note: You will almost always want to perform these operations from the “Add Column” menu to
build out new fields, rather than transforming an individual date/time column
PRO TIP:
Load up a table containing a single date column and use Date tools to build out an entire calendar table
CREATING A BASIC CALENDAR TABLE
1) Create a new, blank query (Data > New Query > From Other Sources > Blank Query)
2) In the formula bar, generate a starting date by entering a “literal” (1/1/2013 shown below):
3) Click the fX icon to add a new custom step, and enter the following formula exactly as shown:
4) Convert the resulting list into a Table (List Tools > To Table) and format the column as a Date
5) Add calculated Date columns (Year, Month, Week, etc.) as necessary using the Add Column tools
ADDING AN INDEX COLUMN
Note that we lose any field not specified in the Group By settings
PIVOTING & UNPIVOTING
PRO TIP:
Use the “From Folder” query option to automatically append all files from within the same folder
POWER QUERY BEST PRACTICES
Give your queries clear and intuitive names, before loading the data
• Define names immediately; updating query & table names later can be a headache,
especially if you’ve already referenced them in calculated measures
• Don’t use spaces in table names (otherwise you have to surround them with single quotes)
When working with large tables, only load the data you need
• Don’t include hourly data when you only need daily, or product-level transactions when
you only care about store-level performance; extra data will only slow you down
DATA MODELING 101
MEET EXCEL’S DATA MODEL
The Data Model provides simple and intuitive tools for building
relational databases directly in Excel. With the data model you can:
• Manage massive datasets that can’t fit into worksheets
• Create table relationships to blend data across multiple sources
• Define custom hierarchies and perspectives
The Data Model opens in a separate Excel window, where you can view
your data tables, calculate new measures, and define table relationships
Note: Closing the Data Model window does NOT close your Excel workbook
DATA VIEW VS. DIAGRAM VIEW
[Screenshots: Data View and Diagram View]
Normalization is the process of organizing the tables and columns in a relational database to reduce
redundancy and preserve data integrity. It is commonly used to:
• Eliminate redundant data to decrease table sizes and improve processing speed & efficiency
• Minimize errors and anomalies from data modifications (inserting, updating or deleting records)
• Simplify queries and structure the database for meaningful analysis
In a normalized database, each table should serve a distinct and specific purpose (e.g. product information, calendar
fields, transaction records, customer attributes, etc.)
This Calendar Lookup table provides additional attributes about each date (month, year, weekday, quarter, etc.)
This Product Lookup table provides additional attributes about each product (brand, product name, sku, price, etc.)
[Figure labels: Original Fact Table fields; Attributes from Calendar Lookup table; Attributes from Product Lookup table]
Option 1: Click and drag relationships in Diagram View
Option 2: Use “Create Relationship” in the Design tab
Tip: Always drag relationships from the Data table to the Lookup tables
*Note: In Excel 2010/2013 the diagram view looks a bit different, and arrows point in the opposite direction by default
CONNECTING LOOKUPS TO LOOKUPS
PRO TIP:
Models with multiple related lookup tables
are called “snowflake” schemas
Models with a single table for each lookup
or dimension are called “star” schemas
To make a connection active or inactive, double-click the connection and check the box, or
right-click the relationship line itself (Note: must deactivate one before activating another!)
RELATIONSHIP CARDINALITY
In this case we’re joining the Calendar_Lookup table to the FoodMart_Transactions data table
using the date column as our key
There is only one instance of each date in the lookup table (noted by the “1”), but many instances of
each date in the data table (noted by the asterisk “*”), since multiple transactions occur each day
BAD CARDINALITY: MANY-TO-MANY
• If we try to connect these tables using the product_id field, we’ll have a many-to-many relationship
since there are multiple instances of each ID in both tables
• Even if we could create this relationship in Power Pivot, how would you know which product was
actually sold on each date – Cream Soda or Diet Cream Soda?
BAD CARDINALITY: ONE-TO-ONE
• In this case, connecting the tables above using the product_id field creates a one-to-one
relationship, since each ID only appears once in each table
• Unlike many-to-many, there is nothing illegal about this relationship; it’s just inefficient
FILTER DIRECTION IS IMPORTANT (CONT.)
PRO TIP:
Always hide the foreign key columns in your data tables to prevent users from accidentally filtering on them!
DEFINING HIERARCHIES
Hierarchies are groups of nested columns that reflect multiple levels of granularity
• For example, a “Geography” hierarchy might include Country, State, and City columns
• Each hierarchy is treated as a single item in PivotTables and PivotCharts, allowing users to “drill up”
and “drill down” through different levels of the hierarchy in a meaningful way
Option #1: From the Data Model
Option #2: From the Insert > PivotTable dialog box
“NORMAL” PIVOTS VS. “POWER” PIVOTS
“Normal” Pivots:
• Restricted to the data capacity of a single Excel worksheet (1,048,576 rows)
• Limited to relatively basic calculated fields, using a sub-set of Excel functions
“Power” Pivots:
• Virtually unlimited data capacity as tables are compressed outside of normal worksheets
• Performs complex calculations using Data Analysis Expressions (DAX)
NOTE: It’s not the PivotTable itself that’s different; it’s the data behind it
*Note: Depending on the version of Excel you’re using, you might see these referred to as either “Measures” (Excel 2010, 2016) or “Calculated Fields” (Excel 2013)
DATA ANALYSIS EXPRESSIONS (DAX)
2) Adding Measures
PRO TIP:
Calculated columns are typically placed in the Filters, Slicers, Rows or Columns areas of a pivot
CREATING CALCULATED COLUMNS
Measures are DAX formulas used to generate dynamic values within a PivotTable
• Like calculated columns, measures reference entire tables or columns (no A1-style or “grid” references)
• Unlike calculated columns, measures don’t actually live in the table; they get placed in the values area of a PivotTable and dynamically calculated in each individual cell
• Measures are evaluated based on the filter context of each cell, which is determined by the PivotTable layout (filters, slicers, rows and columns)
HEY THIS IS IMPORTANT!
As a rule of thumb, use measures (vs. calculated columns) when a single row can’t give you the answer (i.e. requires aggregation)
Measures can ONLY be placed in the values area of a PivotTable
PRO TIP:
Use measures to create values that users can explore with a pivot (Power Pivot version of a “Calculated Field”)
CREATING IMPLICIT MEASURES
PRO TIP:
AutoSum is a nice way to get comfortable with basic DAX and quickly add measures;
just don’t rely on it when things start to get more complicated!
CREATING EXPLICIT MEASURES (POWER PIVOT)
The Formula pane contains the actual DAX code, as well as options to browse the formula library or check syntax
Each measure is assigned to a table and given a measure name (as well as an optional description)
Measures are calculated based on filter context, which is the set of filters (or “coordinates”)
determined by the PivotTable layout (filters, slicers, row labels and column labels)
This cell does NOT add up the values above it (it’s an island, remember?)
• Total rows represent a lack of filters; since this cell does not have a customer_city coordinate, it
evaluates the Total Quantity measure across the entire, unfiltered Customer_Lookup table
FILTER CONTEXT EXAMPLES
Cell coordinates:
• Calendar_Lookup[Year] = 1997
• Customer_Lookup[customer_country] = “USA”
• Customer_Lookup[customer_city] = “Altadena”
Cell coordinates:
• Calendar_Lookup[Year] = 1998
• Calendar_Lookup[Quarter] = 1
• Customer_Lookup[customer_country] = “USA”
Cell coordinates:
• Store_Lookup[store_country] = “Canada”
• Product_Lookup[product_brand] = “Amigo”
Cell coordinates:
• Customer_Lookup[customer_country] = “USA”
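To make the idea of “coordinates” concrete, here is a rough sketch of the first cell above (1997, USA, Altadena) written out by hand as explicit filters. It assumes a [Total Quantity] measure already exists in the model. The measure name in the comment is made up for illustration, and CALCULATE itself is covered later in the course.

```dax
-- Hypothetical measure: Quantity (Altadena, 1997)
-- Spells out one PivotTable cell's filter context as explicit CALCULATE filters
=CALCULATE(
    [Total Quantity],
    Calendar_Lookup[Year] = 1997,
    Customer_Lookup[customer_country] = "USA",
    Customer_Lookup[customer_city] = "Altadena"
)
```

In a real PivotTable you would not hard-code these filters; the layout supplies a different set of coordinates to every cell automatically.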
STEP-BY-STEP MEASURE CALCULATION
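[Diagram: the filter Store_Lookup[store_country] = “USA” is applied to the Store_Lookup table, then flows down the one-to-many (1 to *) relationships to the FoodMart_Transactions and FoodMart Returns tables, keeping only USA rows. Result: Sum of Transactions[quantity] when store_country = “USA” = 555,899]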
RECAP: CALCULATED COLUMNS VS. MEASURES
*Note: Calculated columns CAN be placed in the values area of a pivot, but you can (and should) use a measure instead
POWER PIVOT BEST PRACTICES
• & : Concatenates two values to produce one text string. Example: [City] & " " & [State]
• && : Creates an AND condition between two logical expressions. Example: ([State]="MA") && ([Quantity]>10)
• || (double pipe) : Creates an OR condition between two logical expressions. Example: ([State]="MA") || ([State]="CT")
• IN : Creates a logical OR condition based on a given list (using curly brackets). Example: 'Store Lookup'[State] IN { "MA", "CT", "NY" }
*Head to www.msdn.microsoft.com for more information about DAX syntax, operators, troubleshooting, etc.
COMMON FUNCTION CATEGORIES
Common examples, by category:
• Math & Stats functions: SUM, AVERAGE, MAX/MIN, DIVIDE, COUNT/COUNTA, COUNTROWS, DISTINCTCOUNT
• Iterator functions: SUMX, AVERAGEX, MAXX/MINX, RANKX, COUNTX
• Logical functions: IF, IFERROR, AND, OR, NOT, SWITCH, TRUE, FALSE
• Text functions: CONCATENATE, FORMAT, LEFT/MID/RIGHT, UPPER/LOWER, PROPER, LEN, SEARCH/FIND, REPLACE, REPT, SUBSTITUTE, TRIM, UNICHAR
• Filter functions: CALCULATE, FILTER, ALL, ALLEXCEPT, RELATED, RELATEDTABLE, DISTINCT, VALUES, EARLIER/EARLIEST, HASONEVALUE, HASONEFILTER, ISFILTERED, USERELATIONSHIP
• Date & Time functions: DATEDIFF, YEARFRAC, YEAR/MONTH/DAY, HOUR/MINUTE/SECOND, TODAY/NOW, WEEKDAY/WEEKNUM
• Time Intelligence functions: DATESYTD, DATESQTD, DATESMTD, DATEADD, DATESINPERIOD
*Note: This is NOT a comprehensive list (does not include trigonometry functions, parent/child functions, information functions, or other less common functions)
BASIC MATH & STATS FUNCTIONS
• Sum of quantity from the Transactions table
• Average of product_retail_price
• Quantity Returned divided by Total Quantity
PRO TIP:
Even though it might seem unnecessary, creating measures for even simple calculations (like the sum of a column)
allows you to use those measures within other calculations, anywhere in the workbook
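The three calculations captioned above might be written along these lines. This is a sketch, not the course's exact formulas: it assumes the FoodMart column names Transactions[quantity] and Product_Lookup[product_retail_price], and that [Quantity Returned] and [Total Quantity] already exist as measures. The measure names in the comments are illustrative.

```dax
-- Total Quantity: sum of quantity from the Transactions table
=SUM(Transactions[quantity])

-- Avg Retail Price: average of product_retail_price
=AVERAGE(Product_Lookup[product_retail_price])

-- Return Rate: Quantity Returned divided by Total Quantity
-- (DIVIDE returns a blank instead of an error when the denominator is zero)
=DIVIDE([Quantity Returned], [Total Quantity])
```

Note how the third formula reuses two existing measures, which is exactly the benefit the tip above describes.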
COUNT, COUNTA, DISTINCTCOUNT & COUNTROWS
• Count of all rows in the Transactions table
• Count of non-empty cells in the recyclable column
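As a hedged sketch, the counts described above could look like this. The location of the recyclable column (Product_Lookup) is an assumption, and the DISTINCTCOUNT line is an extra illustration that does not come from the slides.

```dax
-- Count of all rows in the Transactions table
=COUNTROWS(Transactions)

-- Count of non-empty cells in the recyclable column
-- (assumes the column lives in Product_Lookup)
=COUNTA(Product_Lookup[recyclable])

-- Illustration only: number of distinct customers with at least one transaction
=DISTINCTCOUNT(Transactions[customer_id])
```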
SWITCH() Evaluates an expression against a list of values and returns one of multiple possible result expressions
=SWITCH(<expression>, <value>, <result>[, <value>, <result>]…[, <else>])
• <expression>: Any DAX expression that returns a single scalar value, evaluated multiple times (for each row/context)
Examples: Calendar_Lookup[month_num]; Product_Lookup[product_brand]
• <value>, <result>: List of values produced by the expression, each paired with a result to return for rows/cases that match
Examples:
=SWITCH(Calendar_Lookup[month_num],
1, "January",
2, "February",
etc…
• <else>: Value returned if the expression doesn’t match any value argument
PRO TIP:
Use the SWITCH(TRUE(), …) combo to generate results based on Boolean (True/False) expressions (instead of those pesky nested IF statements!)
=SWITCH(TRUE(),
[retail_price]<5, "Low Price",
AND([retail_price]>=5, [retail_price]<20), "Med Price",
AND([retail_price]>=20, [retail_price]<50), "High Price",
"Premium Price")
SWITCH & SWITCH(TRUE) (EXAMPLES)
Switch quarter 1 with “Q1”, quarter 2 with “Q2”, quarter 3 = “Q3”, else “Q4”
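One way to write the quarter example above, as a sketch: it assumes a calculated column in Calendar_Lookup and a numeric quarter column holding the values 1 to 4.

```dax
-- Calculated column in Calendar_Lookup: turn the quarter number into a label
-- The final argument is the "else" result, returned when no value matches
=SWITCH(
    Calendar_Lookup[quarter],
    1, "Q1",
    2, "Q2",
    3, "Q3",
    "Q4"
)
```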
Extract characters from the left of the customer_address column, up to the space
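The text extraction described above can be sketched with LEFT and SEARCH. This assumes a calculated column in Customer_Lookup, and that “up to the space” means everything before the first space (the space itself is dropped).

```dax
-- Calculated column in Customer_Lookup
-- SEARCH returns the position of the first space; its 4th argument is the value
-- to return when no space is found, so LEFT then keeps the whole address
=LEFT(
    Customer_Lookup[customer_address],
    SEARCH(
        " ",
        Customer_Lookup[customer_address],
        1,
        LEN(Customer_Lookup[customer_address]) + 1
    ) - 1
)
```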
CALCULATE
PRO TIP:
CALCULATE works just like SUMIF or COUNTIF, except it can evaluate measures based on ANY sort of
calculation (not just a sum, count, etc); it may help to think of it like “CALCULATEIF”
CALCULATE (EXAMPLE)
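[Diagram: the PivotTable cell’s own filter context is Store_Lookup[store_country] = “MEXICO”, but CALCULATE overrides it with store_country = “USA”. The modified filter is applied to the Store_Lookup table and flows down the one-to-many (1 to *) relationships to the Transactions and FoodMart Returns tables. Result: Total Transactions where store_country = “USA” = 180,823]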
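A minimal sketch of the measure behind this example, assuming [Total Transactions] is already defined (the measure name in the comment is hypothetical):

```dax
-- Hypothetical measure: USA Transactions
-- The filter argument replaces any existing filter on store_country
=CALCULATE(
    [Total Transactions],
    Store_Lookup[store_country] = "USA"
)
```

Because the filter argument overrides the existing store_country filter, this measure returns the USA figure in every cell, even in a row labeled with another country.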
FILTER
=FILTER(<table>, <filter expression>)
• <table>: Table to be filtered
Examples: Store_Lookup; Product_Lookup
• <filter expression>: Boolean expression to be evaluated for each row of the table
Examples: Store_Lookup[store_country]="USA"; Calendar[Year]=1998; [retail_price]>AVERAGE([retail_price])
Since FILTER returns a table (as opposed to a scalar), it’s almost always used as an input to other functions, like enabling more complex filtering options within a CALCULATE function (or passing a filtered table to an iterator like SUMX)
PRO TIP:
Since FILTER iterates through each row in a table, it can be slow and processor-intensive; never use FILTER
when a normal CALCULATE function will accomplish the same thing!
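To illustrate the tip, here are two hedged sketches: one where a plain CALCULATE filter is enough, and one where the condition compares a column against an aggregate, which a simple filter argument cannot express. Both assume existing [Total Revenue] and [Total Transactions] measures; the measure names in the comments are hypothetical.

```dax
-- Hypothetical measure: USA Revenue
-- Simple column = value condition, so no FILTER is needed
=CALCULATE(
    [Total Revenue],
    Store_Lookup[store_country] = "USA"
)

-- Hypothetical measure: Above-Average Price Transactions
-- Each product's price is compared with the average price in the current filter context
=CALCULATE(
    [Total Transactions],
    FILTER(
        Product_Lookup,
        Product_Lookup[product_retail_price]
            > AVERAGE(Product_Lookup[product_retail_price])
    )
)
```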
PRO TIP: FILTERING WITH DISCONNECTED SLICERS (PART 1)
STEP 1: Create an Excel table containing a list of values to use as thresholds or parameters:
STEP 3: Make sure that your table loaded, and is NOT connected to any other table in the model:
• Calculate Total Transactions only for cases where the product price is below a selected threshold
• Calculate Total Revenue, but only for USA stores
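The threshold pattern could be sketched as two measures. The disconnected table Price_Threshold and its column threshold are hypothetical names, as are the measure names in the comments; [Total Transactions] is assumed to exist already.

```dax
-- Selected Threshold: the value picked in the slicer
-- (falls back to the largest listed value when none or several are selected)
=MAX(Price_Threshold[threshold])

-- Transactions Under Threshold: keep only products priced below the selection
=CALCULATE(
    [Total Transactions],
    FILTER(
        Product_Lookup,
        Product_Lookup[product_retail_price] < [Selected Threshold]
    )
)
```

Because Price_Threshold has no relationship to any other table, the slicer filters nothing on its own; it only feeds a value into the second measure.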
ALL
ALL() Returns all rows in a table, or all values in a column, ignoring any filters that have been applied
PRO TIP:
ALL is like the opposite of FILTER; instead of adding filter context, ALL removes filter context. This is often used when
you need unfiltered values that won’t be skewed by the PivotTable layout (e.g. Category sales as % of Total)
ALL (EXAMPLE)
• In this example, we use ALL to calculate total transactions across all rows in the
Transactions table, ignoring any filter context from the PivotTable
• By dividing the original [Total Transactions] measure (which responds to PivotTable filter context as
expected) by the new [All Transactions] measure, we can correctly calculate the percentage of the
total no matter how the PivotTable is filtered
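The two measures described above might look like this. It is a sketch that assumes [Total Transactions] is already defined; the name of the percentage measure is illustrative.

```dax
-- All Transactions: ignores any PivotTable filter context on the Transactions table
=CALCULATE(
    [Total Transactions],
    ALL(Transactions)
)

-- % of All Transactions: filtered value divided by the unfiltered total
=DIVIDE([Total Transactions], [All Transactions])
```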
RELATED
RELATED() Returns related values in each row of a table using relationships with other tables
=RELATED(<column>)
• <column>: The column that contains the values you want to retrieve
Examples: Product_Lookup[product_brand]; Store_Lookup[store_country]
HEY THIS IS IMPORTANT!
RELATED works almost exactly like a VLOOKUP function: it uses the relationship between tables (defined by primary and foreign keys) to pull values from one table into a new column of another. Since this function requires row context, it can only be used as a calculated column or as part of an iterator function that cycles through all rows in a table (FILTER, SUMX, MAXX, etc.)
PRO TIP:
Avoid using RELATED to create redundant calculated columns unless you absolutely need them, since those
extra columns increase file size; instead, use RELATED within a measure, inside an iterator function like FILTER or SUMX
RELATED (EXAMPLES)
Retrieve the retail price from the Product_Lookup table and append it to the Transactions table
Multiply the quantity in each row of the Transactions table with the
related retail price from the Product_Lookup table, and sum the results
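Both examples above can be sketched as follows, assuming the FoodMart names Transactions[quantity] and Product_Lookup[product_retail_price] and an existing relationship on product_id. The measure name in the comment is illustrative.

```dax
-- Calculated column in Transactions: retail price pulled from Product_Lookup
=RELATED(Product_Lookup[product_retail_price])

-- Measure (Total Revenue): the same lookup done row by row inside SUMX,
-- so no extra column has to be stored in the Transactions table
=SUMX(
    Transactions,
    Transactions[quantity] * RELATED(Product_Lookup[product_retail_price])
)
```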
ITERATOR (“X”) FUNCTIONS
Iterator (or “X”) functions allow you to loop through the same calculation or expression on
each row of a table, and then apply some sort of aggregation to the results (SUM, MAX, etc.)
=SUMX(<table>, <expression>)
• Function name (SUMX here): Aggregation to apply to calculated rows*
Examples: SUMX, COUNTX, AVERAGEX, RANKX, MAXX/MINX
• <table>: Table in which the expression will be evaluated
Examples: Transactions; FILTER(Transactions, RELATED(Store_Lookup[country])="USA")
• <expression>: Expression to be evaluated for each row of the given table
Examples: [Total Transactions]; Transactions[price] * Transactions[quantity]
PRO TIP:
Imagine the function adding a temporary new column to the table, calculating the value in each row
(based on the expression) and then applying the aggregation to that new column (like SUMPRODUCT)
*In this example we’re looking at SUMX, but all “X” functions follow a similar syntax
ITERATOR (“X”) FUNCTIONS (EXAMPLES)
• Multiply quantity and retail price for each row in the Transactions table, and sum the results
• Calculate the rank of each product brand, based on total revenue
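The ranking example might be sketched like this, assuming a [Total Revenue] measure already exists (for instance, quantity times retail price summed with SUMX) and that product_brand sits on the PivotTable rows. The measure name in the comment is hypothetical.

```dax
-- Hypothetical measure: Brand Rank
-- ALL(...) makes RANKX compare against every brand, not just the one in the current row
-- With no order argument, the highest revenue gets rank 1
=RANKX(
    ALL(Product_Lookup[product_brand]),
    [Total Revenue]
)
```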
BASIC DATE & TIME FUNCTIONS
Calculate the end date of the month, for each row in the Calendar_Lookup table
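A short sketch of this calculation using EOMONTH, assuming the date column in Calendar_Lookup is named date:

```dax
-- Calculated column in Calendar_Lookup: last day of the month containing [date]
-- The second argument is the number of months to shift (0 = same month)
=EOMONTH(Calendar_Lookup[date], 0)
```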
TIME INTELLIGENCE FORMULAS
Time Intelligence functions allow you to easily calculate common time comparisons:
PRO TIP:
To calculate a moving average, use the running total calculation above and divide by the # of intervals!
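Some typical patterns are sketched below. They assume a [Total Revenue] measure and a Calendar_Lookup table with one row per day and no gaps; the 10-day window and the measure names in the comments are illustrative choices, not part of the course files.

```dax
-- Revenue YTD: year-to-date total
=CALCULATE([Total Revenue], DATESYTD(Calendar_Lookup[date]))

-- Revenue Previous Month: same dates shifted back one month
=CALCULATE([Total Revenue], DATEADD(Calendar_Lookup[date], -1, MONTH))

-- 10-Day Running Total: the 10 days ending on the latest date in the current filter context
=CALCULATE(
    [Total Revenue],
    DATESINPERIOD(Calendar_Lookup[date], MAX(Calendar_Lookup[date]), -10, DAY)
)

-- 10-Day Moving Average: running total divided by the number of intervals
=DIVIDE([10-Day Running Total], 10)
```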
SPEED & PERFORMANCE CONSIDERATIONS
Write measures for even the simplest calculations (e.g. Sum of Sales)
• Once you create a measure it can be used anywhere in the workbook and as an
input to other, more complex calculations
There are several options for building visuals and reports from a data model:
Available within Excel:
2 Spreadsheet-based dashboards built with CUBE functions
• Use CUBE functions to pull values from the data model for custom Excel reports (no pivots)
Standalone product (desktop + online):
4 Microsoft PowerBI
• Brand new (free!) self-service BI product for loading, shaping, modeling, and visualizing data
SNEAK PEEK: POWERBI
Looking to become an absolute Excel ROCK STAR? Try the full stack:
• Microsoft Excel – Data Analysis with Excel PivotTables
• Microsoft Excel – Advanced Excel Formulas & Functions
• Microsoft Excel – Data Viz with Excel Charts & Graphs
• Microsoft PowerBI Essentials (COMING SOON!)
Ratings and reviews mean the world to me, so please share feedback!
• Feel free to post to the Q&A section or message me directly if you need any support, or if there’s
anything I can do to improve your course experience!