What is DAX language in data analysis?
![]() |
DAX |
When it comes to designing data models in Power BI, one tool you'll need to use very often is the DAX formula language. We'll now take a closer look at what this language is and how it can help us model data.
What is DAX?
We start off with some of the salient points of the DAX or data analysis expression language. First of all, this is not something which is unique to Power BI. This is available in fact for various Microsoft data analysis tools. Beyond Power BI, it is also available for Microsoft Analysis Services as well as Power Pivot. So if you get familiar with DAX, you can take your skills beyond just Power BI. The purpose of the DAX language if for us to be able to derive data from the existing raw information. So we can use DAX in order to create calculated columns as well as measures. The building blocks of the DAX language are functions as well as operators. In many ways, DAX is quite similar to other programming languages where you have functions with arguments, return values, and where it's also possible to make nested function calls.
Filter context & measures in DAX:
An important consideration when writing DAX formulas is that, these can operate in something called a filter context. A filter context is one which is only applicable to specific rows in a table, and operations can be performed using that filtered data. This is something which will become much clearer in the demos of this learning path. And we will spend a lot of time on this topic since it is important for the DA-100 certification.
The most common use of the DAX language will probably be to create measures. So, measures are essentially values which are computed over a large portion of a dataset, if not all of it. For example, what is the aggregate of all expenses for each department in a large company, or what are the expenses for each department in the region of Europe? In many ways, measures are similar to calculated columns, in that they are typically derived from raw data. However, unlike calculated columns, they don't need to operate over rows on an individual basis and can simply aggregate the contents of certain columns. In fact, measures can often be used in order to perform a row by row calculation, followed by an aggregation.
For example, you could multiply each of the transaction amounts for sales data by an exchange rate and then sum them up to get the total sales in a specific currency. Now, measures need not operate on the entirety of the dataset, so it is possible for us to apply filters when doing such aggregations. For example, we can only consider the sales data from Southeast Asia or for customers in the 18 to 24 age group. So now that you have some idea of what measures are. Well, let's switch to another important topic in the context of Power BI, not just from the point of view of the DA-100 certification, but also when you need to use Power BI for advanced operations. And this has to do with how context is defined for various DAX operations. Well for this, consider that during data analysis, values such as total sales are usually aggregated where you may not care whether a specific customer bought a product for $25 at a specific store. But you may be interested in knowing that a specific store recorded a total of $30,000 worth of transactions in a month. Now, such aggregations can be performed on various keys.
So, you could aggregate sales based on a restaurant on a specific category of restaurants such as drive-throughs or maybe even restaurants located in a specific city. And given that such sales can be aggregated across different dimensions, the specific value of the aggregated sales will be different for a different set of dimensions. For instance, the numbers for fast-food restaurants sales in the city of London in the month of February will likely be very different from the sales recorded in all drive-through restaurants for the whole year. In other words, the value of aggregated sales depends on the context. When it comes to Power BI, such context can be determined by various filters. These filters could be set within a DAX formula explicitly or may even be set by a user interaction in a visualization. For example, when a user clicks on a particular bar in a bar chart. An example of a filter is one which only considers sales from fast-food restaurants, so this could be based on restaurant types, but other dimensions, including date dimensions, such as month, quarter, and year, could also be part of the filter.
Now the reason we are focusing on the filter context is because when it comes to DAX formulas, this can play a significant role. For example, there are certain DAX functions which are meant to operate in a filter context. There is one called CALCULATE which will first filter the data based on the filters specified, following which it can perform operations on that data which includes aggregations. While understanding the DAX language can be rather simple if you have any programming background, understanding the filter context may be the most difficult part. So, we will spend a lot of time on it in the demos of this path.
DAX functions and their categories:
Now, let's take a quick glance at the different types of DAX functions. This will help us understand what this language is capable of, and also give us a glimpse into what is possible with Power BI. So first of all, there are a variety of aggregation functions which could take in a variety of arguments and then returns a scalar value. Examples of these include SUM which performs a simple sum operation, MAX which returns a maximum value, and AVERAGEX which evaluates an expression and then returns an average.
And then there are a variety of filter functions. These are functions which allow us to work on filtered data. Such as CALCULATE, FILTER, and LOOKUPVALUE. But also functions which can be set to ignore any existing filters, such as ALL. Beyond that, there are the regular Date, Time, and Math functions which should be familiar to you, if you have any prior experience with Excel formulas. So the TODAY function returns the current date. ABS returns an absolute value. And numbers can be rounded to specific decimal units or to integers using the ROUND function. And then there are functions which can be categorized as information functions. For example, ISBLANK tells us whether there is a blank in the cell of a table, CONTAINS can be used to perform a substring search, and ISNUMBER can be used to check for numeric values. Usually, when using DAX, you will end up using a combination of these functions in order to create a single DAX formula.
Bottom line:
The DAX language is way essential for building powerful data models in Power BI where filter context, measures and important key functions allows us to create advanced calculations, and to optimize reports and to enhance data insights as well.
Comments
Post a Comment