Thursday, August 9, 2012

A Simple Tutorial on PowerPivot (2)

3. Enriching the Data

After importing data, the next natural thing to do is to integrate and enrich the data. There are following ways to enrich data.

First, it is possible to change the actual data types for imported columns. As illustrated below, after choosing a column in a PowerPivot table, one can choose the “Data Type” option in the ribbon to convert the data types.


Compared to changing data type of columns, a second approach is to just change the way data looks like in the table. That is done by re-“formatting” the data which only changes the “view” of data without converting the underlying data type. Formatting changes in PowerPivot are applied to an entire column. This is illustrated below. 



One of the key features of PowerPivot is to create relationships among tables. As illustrated below, it is possible to set up relationships after the PowerPivot tables are imported. The “Create Relationship” option at the “Design” ribbon in PowerPivot window allows one to set up relationship between tables. Due to the limitation of the current version, relationships can only be defined on single columns (as shown below). This means that if you want to establish two distinct relationships between two tables, you must import another instance of the table. It is a bit similar to using SQL to define join statements that use multiple columns in the same table. So users cannot expect to define complicated relationships in PowerPivot. The “Manage Relationship” button shows all the defined relationships of all the imported PowerPivot tables. 



A very existing feature in PowerPivot is that one can define ”measures” (like in Cubes) using the DAX formulas. This includes calculated columns (which are materialized into the table) and measures explicitly defined (which are kept virtual until called). Below is an example of a defined calculated column in a PowerPivot table. The DAX formula in the Fx textbox seems very similar to Excel. But the main difference is that this definition covers a whole column “TotalSales” instead of a single cell in Excel. 




A little bit more about DAX. DAX in fact has a total of 135 functions to help with data analysis. About 80 functions are supported by Excel. DAX also provides functions that operate on tables.The DAX functions can be categorized into eight categories, i.e., Date and Time, Information, Logical, Math and Trigonometric, Statistical, Text, Filter, Time Intelligence. Similar to math and trigonometric functions , only a subset of Excel’s statistical functions is supported by DAX in the current release of PowerPivot. When typing DAX formulas, PowerPivot provides the “autocomplete” feature. One needs to press the Tab key to use this feature.


Like Excel, PowerPivot window provides the ”Sorting” and ”Filtering” options at the “Sort and Filter” Section, and the “Freeze” option at the “View” Section for users to refine the data. This is illustrated below. The column width can also be adjusted in a similar way to Excel.


Another useful feature for improving the view is to use ”hide” and ”unhide” options. As illustrated below, it is possible to hide columns from PowerPivot or PivotTable or both.






After data is loaded to PowerPivot, the “Refresh” button can be used to retrieve the latest source data to PowerPivot. As illustrated below, it is possible to refresh the current table or refresh all tables in PowerPivot. When clicking the “refresh” button, what happens behind the scene is that PowerPivot sends two commands to the VertiPaq engine when you click Refresh:
  • The first command is to “Alter”, i.e., to make any metadata changes you have made in the PowerPivot Window.
  • The Second command is to “Process”, i.e., to refresh the data from external data sources and any calculations that depend on the refreshed data.






No comments: