Numeric calculations in Data Analysis Expressions with Power BI | Community Webinars

>>Hello, everybody. Welcome to my webinar about
numeric calculations in DAX. So this is a slight introduction
to dynamic analysis expressions. My name is Markus Ehrenmüller-Jensen. Welcome and thanks for joining the webinar or
watching the recording. So DAX, the Data
Analysis Expressions, sometimes I wrote Data
Analytic Expression, I’m not sure when I got
confused about the naming, but there was quite a couple [inaudible] in via LinkedIn and pointed out that there
was a typo in my slide, and in the first slide, so I changed it to Data
Analysis Expressions. That’s the official name of DAX. So DAX was born with
PowerPivot for Excel. I’m not sure who of you has
worked with PowerPivot for Excel. This was an add-in, or still
an add-in available for Excel, and as this was the first
version of what we now know as Power BI or SQL Server
Analysis Services Tabular, which is used for measures
and calculated columns. It’s like similar to what
we’re used to from Excel. So the syntax is still
actually derived from Excel. About 80 functions are identically in Excel formula language in the DAX, Excel formula language is
function-based so you call the function percentage and
then you put in the parameters, and another faction could
be also in parameters. So you can have several
functions calling each other. But the thing with DAX is that the
syntax is really easy to learn. That’s something you
will really first checkout but the semantic
is very complex. It is a good thing because it
allows for powerful computations, but it’s a bad thing because it takes some time to really understand
what’s going up there. So for this talk, I chose three calculations and I want to make sure
that everybody understands the calculations because I want to concentrate on DAX and
not on the calculation. The first calculation I will use is sales amount and if we have the
price and the quantity given, we calculate the sales amount
as price times quantity. Next calculation we
will do is the margin. The margin is that what we get from the sales amount if we
subtract the total cost. So sales amount minus the total
cost gives us the margin. The third one we’re talking about
is the margin in percentage. That’s the margin percentage
of the sales amount. So that’s the margin divided by the sales amount and then you
can multiply it with 100 to get the percentage or
use the modeling top in Power BI to make a
percentage on all of that. So please be sure that you understand those calculations because I want to concentrate on the DAX and
not on the calculations. If you are watching
this like a recording, please make a hardcopy screenshot
of this so you have it at hand when we talk about the extra
calculations then in DAX, I want to start with the
calculated columns because I think they’re the most
easy to understand because they look like or they perform like a
calculation in Excel, we have like sales and can be “copy” and “paste” the
same formula to the sales and then these calculations for
all my columns or for all my rows. So when we use or refer
to calculated columns, we should use this syntax
that we always state the table name and
then the column name. For measures, we will omit the table as we will
see in a few minutes, because it’s really
important then to refer to those objects that
are pretty sure, is it a column or is it a measure? So for columns, we always
write the table name. That’s not the same tactical thing, but it makes it easier to understand what’s
happening in the calculation, and despite that it’s named calculated column, it’s
calculated row-by-row. So the formula is put for one column, and this formula is then repeated for all the rows within the table but report to
the calculated column in. The result of the Calculated Column is also persisted in the model. So if I write this calculation and then we
get another column there, then this column will take
up space in the model. So the Power BI file will get bigger and every time we refresh the models, we press the “Refresh” button then older data is loaded and
after the data is loaded, then the calculated
columns are re-calculated. Again, persisted, stored
in our Power BI file. The datatype is automatically determined by the formula and
you can change it, of course. In this 60 minutes, we will concentrate on
numeric calculated columns. I will not talk about
fancy text functionalities or time intelligent stuff, just simple calculated columns. That the image that we really understand what’s happening
in the calculated column. There is one spatiality with
the calculated columns. If you connect to like a Tabular
model or to Power BI via Excel, this is possible, checkout, there’s a lot of stuff,
all in the internet, which really explain to
you how you can connect an Excel spreadsheet to and open
PowerBI file and if you do this, you will find out that if
there’s a calculated column, even if it’s a numeric
calculated column, you can put it in the
filter columns and rows, but you cannot put it in the values. This is just not allowed. Power BI allows us to
create pivot tables or payables and matrixes, and put the calculated
column in there because behind the scenes it’s
creating an implicit measure. This is a good thing
because then we can put stuff where we want to put it, like we calculate the column into
the values or region of a table, but it’s hard to explain
because it’s not visible. We cannot see this measure anywhere. So it’s hard for me to explain it to you where it
is but it is there. So just trust me, it is there, but it can be the source
of some confusion because we’re all going through the column and suddenly
it behind the scenes, there’s also a measure available. How is this measure calculated? It’s a very simple aggregation
like a sum or mean or whatever and the controlled is by the column property
default summarization. So the default summarization
isn’t a modeling part of Power BI and there you have
the free choice of sum, average, minimum, maximum
count or count (Distinct). You can also choose to not
summarize the thinking, then no implicit measure
is created for you. This is not only
possible to select and change in the modeling tab for the calculated column for
the whole of the modeling, you can always change it per visual. So in every visual, you also have the possibility to change it to
sum, average, minimum, maximum. Funny enough, like
the UI have to count distinct here and in the
modeling it’s installed here, so the exchanged here and we have three more possibilities if you’re changing the summarization
for a field in the visual, which can also be standard
deviation, variance, and median. What Power BI does is it checks out this summarization and then creates an implicit
measure which is hidden. We can’t see it which takes just
the expression of the column, put it in parenthesis, and then put, for example, a sum or average or min
or a max before it. That’s how it is created. So let’s start with our
first, calculate the column. Very simple, sales amount. I hope you remember
those three formulas. The sales amount is
calculated if the price and the quantity given by multiplying the price
with the sales quantity. If we do this, then for example, for the sample table, we use the sample, that’s four lines, four entries in this table, and we have the price of 10
and the quantity of three, and this gives us 30-40 sales amount, 20 times 1 is 20, 30 times 4 is 120, 100 times 5 is 500. Then we also have this total 670. The question is, how
is this calculated? This is calculated by
the implicit measure. If the implicit measure or the default summarization for
our sales amount say sum, then this implicit measure
will sum up those results. So 30 plus 20 plus 120
plus 500 gives us 670. We also can see this
in a life example. So I have copy and pasted the formula here so we
can easily read it. This time you have this table here, 10 times 3 is 30 and so on. So we can see the
calculation is done right. If you only put the sales
amount into table part by date, then the default summarization
does not only work for the total, but also sums up two rows. We have two rows with the same date, 1st of October, you have only
one row, 1st of October, here because Power BI
automatically groups the data and the summing up the
30 and the 20 gives us 50 here, and still you have 670 here. So this is working really easy. As is told you, tax struggles then tax
is really easy to read. So sales margin, the next
easy one, sales margin. We do not multiply, but we minus, we subtract the total costs
from the sales amount and if I give you another
model is the same example. A few rows, we have sales amount of 50 and the total cost of 40
gives us a margin of 10, 120 minus 75 gives us 45, 500 minus 350, 150. Then for the total 205, again, the implicit measure comes into place as the default
summarization is sum we get to 10 plus 45
plus 150 equals 205. This is not only on the slides, but we can see this
in the demo as well. So we have the margin here, 20 minus 15 is 5, and
so on and so forth. Then we have 205 here
for the total of the table for the margin
column. Easier step. So that’s of the third example, the margin in percentage. So the margin percentage
is the margin divided by the sales amount
in the same table again. So if you take a look
on our sample table, we have four rows again. So 5 divided by 20 is 25 percent, 5 divided by 30 is 16, 45 divided by 120 is 37, 150 divided by 500 is 30. So we have between 16
and 30 percent margin. That means if we sale a
good of the price of 100, then for example, €30
of 30 or whatever, it has a distant margin in
percentage, 30 percentage here. But then we have this 109 here. So 109 is a margin in percentage
is just not possible. There is a bigger margin
than the sales amount because you give me €100 and then I have to pay like 70 for the goods I deliver to
you and 30 is my margin then. But if you give me €100, there’s no way that
€109 could stay at me. So it’s not possible that we have a margin bigger
than 100 percent. So why is this? Because we have this
implicit measure. Type is a measure,
currently says sum. So 25 plus 16 plus 37 plus
30 gives us this 100 time. Of course, this is rubbish. This is wrong. This calculation does
not really work. If you see this in our examples
are since in a sales percent. Say I’m margin in percent, then we have the
calculations again here. But as soon this is
something summed up like the 1st of October, two entries here. This is not really working out. Maybe it’s correct, maybe not. I changed my thing here. So my bad. The default should be the sum here, then we see the 25 plus
the 16 is summed up to 41, and this is wrong if
you see it 109 here. So one way to fix it, this is not really to fix it, this is only to hide
that we have a problem, is to change the default
summarization from sum to average. That’s some people suggesting
this, and deprecate like, decent numbers could be right
or it could not be right. The question is, is it okay to have
or calculate the average here? Or what is the right sum? So the right value for the
average of margin for total would be the sum of the margins divided
by the sum of the sales amount. So this would be, can we calculate this. If I calculate this five here; 5 plus 5 plus 45 plus
150 gives us 205. Then we have 20 plus 30
plus 120 plus 500, 670. Divides the 205 by the 670, correct number would be 30
percent or 30.5 percent, what we get with the average is 27. So some would say
this is close enough, but I don’t think this is right because it depends on
the actual numbers, or if this is more close or not
close and anyway, it’s wrong. We should not calculate it this way. I don’t see any solution in
calculated columns here. So let’s move on to measures. So what’s a measure?
What is a measure different from a calculated column? Firstly convention, if
we refer a measure, we should omit the table name for it. So we can really see in the formula that now we
have a room to measure. As I had promised, we’ll see this
later through up to 60 minutes, the reason why I suggest this, and many people suggest this. I think of a measure like, this is an element or a formula
which is detached from any table. So to calculate the column lives in the table and just
calculate this row per row. The measure looks at the whole model, the all the tables like
from an outset perspective. This is how I imagine
being a measure is. There’s no row context. There’s only a filter
context because of course, if I see the table that never
see like the unfiltered table. The table is always filtered, we have slicers or we have cross-field
those between the visuals, that would be a page
or report filter. Of course in the pivot table we have rows and columns and those are filtering the measures of
this just one formula, but depending on the
context where I put it, like inside of a pivot table or different filters on
columns and rows, the same formula will come
back with different results. It is because the model always looks different for every position
where the measure is.>>The calculation is
definitely not done row-by-row because there
is no concept for row, no row concept or no row context in the measure except you will see
later that we can force to do so. What’s also really important
as a differentiation is that the result of the measure
is not saved in the model. So if we built a lot of
measures, the model, the size of the Power BI
file will not increase, it will stay the same always. I mean, it’s a couple
of characters for the formula which of course
are stored in the model, but there is no space
used for the results. Also, the refreshing of them, “Power BI” file is not
influenced by measures. So to refresh we recalculate
in measures that the refresh. If we look this lower
the file will not grow. The measures be if the same, if I connect to the model
with Excel or Power BI. So Excel allows me
to put measures into the Great zero part
portion of pivot table. So let’s take a look on
the margin percentage, which was calculated wrong
as a calculated column. So what do we have here? Margin percent used to sales
margin divided by sales amount. But as I told you, a measure does not have
any row context to cease the full table or the filled
that table to the sales table. So there could be potentially will only one group at many
rows of the sales table, looks from the outside
onto the model. So we always have to state how those potentially many
rows are aggregated. So in this case, I sum it up. So I asked tax to calculate the sum of margin and divide
it by the sum of sales amount. When we look at this example, we have one row here, margin of 10 amount
sales amount of 50. So 10 divided by 50 is 20 percent, 45 divided by 120 is 37.5 percent, and 150 divided by 500 is 30 percent. Then we get the 30.60
percentage because the measure never calculated
like any downwards portion, but now, it takes the
sum of the margin. So the total of the margin and
the total of the sales amount, and then divide it each other. So 205 divided by 670 as a just performed in the
calculator ends up as 30.60. So this is working fine according
to our demonstration here. So the margin in
percentage as a measure. So take this in blue, everyone can see is a measure, and I omitted the table up there, so everybody can see
this is a measure. So this measures run now working in all circumstances because
it’s summing up the elements, and that’s the right way
of calculating percentage. I never thought a correct calculation of calculating percentages and then summing them up
or averaging them up. This is most probably wrong way. We always have to sum up first to elements like the margin
and the sales amount, and then perform the division. So what about the margin? Could the margin also in a measure and sale of a
calculated column? Yes, we can. Again, you have to put some
aggregation there because the measure cease to fill the table. Sales table and [inaudible]
can consist of many rows, and then we have to
sum of sales amount minus the sum of total cost. So we have 50 minus 40 is 10, 120 minus 75 is 45, 500 minus 350 is 150, and then again, they measure
never calculates downwards, but always calculates horizontally. So 670 minus 465 equals 205. This also can lead to some
confusion because people tend to like calculate the total
downwards process. So I’ve seen reports were people complaining that the sum is
not correct because it’s not correcting the way I
cannot calculate back debt. Then you have to find another trick, which we’ll see in a minute. This is this one, sales amount. So the sales amount is the
price times the quantity. Again, we had to put sum on it. We cannot write a calculated. We cannot find the
measure and referring columns without stating the aggregate functions,
is just not possible. So if you have the
first example here, 30 times 4, 120, looks okay, 30 times 4, 120 again, 100 times 5, 500. But then we have this 2,080. So we sold 120, 125, 500. If this is correct, it’s not. But even if this is correct, we should have like 740. But look 2,080, why do we have 2,080? Because measure is not calculating downwards motion, but
calculated horizontally. So 160 times 13 equals 2,080. I mean, this is correct from
the technical perspective, 160 times 13 is 2,080, but it’s not what we expect
for the sales amount. Because what we need is
sum of the sale amount, the sum of the multiplications, not the multiplications of the sum. So just like the operators are
not in the correct V here. We can also see here, the margin. If not seeing the margin, but the
sales amount calculated wrong, gives us these numbers because
it’s not calculate in this way, but is calculated by
the sum and I mean, if you think about it, the sum of a price
doesn’t give any meaning. I mean, doesn’t make any
sense to sum up a price. We could click an average. But again, this would be then close but not the correct calculation. So this is wrong, no to this. Now we have the problem is solved some things with the
calculated column, and to solve the other
things with the measures. So it’s just a way to
distinguish when should I use calculated columns and
when should they use measures. We can of course solve
with the measure. So what we need is that we have
this sales table, like this row. What we need is that, we need to take the calculation price times quantity for every row in the sales table. So for older roles here, you have to calculate
price times quantity. So we get like 10 times 3 is 30, 20 times 1 is 20, 30 times 4 is 120, and 100 times 5 is 500. But then, we don’t need a calculation like the sum of the price times the
sum of the quantity, but now we need this in a downwards projection and
a downwards calculation. We can force this by stating SUMX. So SUMX, it’s like the sum function. It’s summing up the results
of this calculation. So SUMX has two parameters
versus the table, which delivers the row contexts to be calculating something
row-by-row for the sales table. In this case, you stayed
here that we calculate the price times the quantity
for all the rows in this table. But at the end, if we
have aggregated values, we do not calculated with
the aggregate drills, but you just aggregate
the singular values, what we have seen with the calculated column and
the implicit measure. This is what we can do with
this aggregator function. So X stands for aggregator. I’m not sure where
does X is coming from or why is this X [inaudible]
A for aggregator. This function is not
only really but for some it is also available
for an average X-min, X-max, X is just a more, I
can’t remember currently. So this is how we get the correct calculation
into this sales amount. So we have like on single
rows in the table, this is calculated row-by-row. We’re delivered a row
context now to the measure. But as soon as this is
done for all the rows, which are visible in the
current filter context. As soon as this is done, then
it’s aggregated with a SUMX.>>Let’s rethink what we have seen. So we have seen the sales amount, and we have seen two versions
of calculated sales amount. One was the sum of the
calculations price times quantity, and the other one was sum price, times the sum of the quantity. The question is, are both correct? There’s only one correct. Which one is correct?
Hopefully, you remember this. So we can only calculate
the price times the quantity and then
sum the results up. It’s not possible to
sum up the price, and sum up the quantity,
and then multiply. This will not lead to the
correct sales amount. We can calculate the
first one in measures, as we have seen with the SUMX
functionalities on function, and we, of course, can calculate this in
the calculated columns. The second one is only
possible in the measure, but visual part of it. So that’s one, one, four measures, and
calculate the columns. The margin. Can the margin be
calculated as the sum over the difference of sales
amount and total cost, and, or can it calculated as the sum of sales amount
minus the sum of total cost, so the difference of the sums. This is possible in both
ways, and you can calculate. The first one is measuring
calculated column and the second one is only
possible in measures. So there we have the free choice of what to
use or how to calculate it. The margin in percentage. Can be calculated margin
in percentage as the sum of the margin divided by
the sum of sales amount, and, or can be calculated as the sum of margin
divided by sales amount. So we’re summing up the
results of the division. Count one again is wrong. The first one can be
calculated as a measure, the second one could be calculated
as measure or calculate column. So so far, we can calculate
everything but the measure, and sometimes you have problems, it’s difficult to calculate column. This leads me to the
first take-aways. We’re not finished with this talk, but I want to like if it’s
a little break in between, think about what we have seen. So the takeaway is that, we should always take care about our numeric columns and think
about the default summarization. So we move the default
summarization for columns which should not be aggregated. Typically, the price as we have seen, or for the date dimension, the calendar year, the month number. If you take the calendar
and just drag it into the report pane of report canvas of Power BI and
you do not see a list of years, but you get three trillions, then probably your
default summarization is set wrong for the calendar year. It should be set to don’t summarize. For all the other columns
where we need the aggregation, please create a measure for those with the correct
aggregation course. It could be multiple, so you have two or
three or four issues for one calculated column
showing the sum or the average, or the smallest or the biggest value, and then hide the original columns. Nobody uses the original column, because then we always have this default summarization and then sometimes the
calculation could be wrong. My recommendation is not to
create calculated columns at all. This is really harsh, but I think this is right. As I stated here, don’t create
numeric calculated columns at all. If you need them, if
you need columns, and there are certain situations
where you need a column, for example, if you want to filter, you cannot filter on a measure, then you have to have a column
or some visuals ask you for calculated columns and do not allow you to use
measures in them. Then, of course, you have to create
the calculation as a column. But please don’t do it
in tax in the model, but use Power Query, which comes with another
language, unfortunately, so you have to learn
that language M to calculate stuff there in Power Query, or even better as early as possible, like in the data source. So somebody who delivers
you a CSV or Excel file, it should be calculated there or if you are lucky and can connect to a data warehouse or to a
cube or tabular database, then they should make sure
to calculate it there. Because there’s a high probability that you’re not the only one
who needs the calculation, and if you do it in your
Power BI file index on M, that’s only available for you, and all the others have to
replicate the follower. So it’s better to have
it in the data source, that’s the one reason. Why you should put it into Power
Query and not as into docs, into the model is that because
if you put it in Power Query, there’s a higher potential that the compression algorithm
will be better. So your Power BI file will be
smaller and the smaller the file is, you should expect then that the
performance of the queries, and the performance of
the reports are better. So let’s proceed to the next level
in numeric calculations index. So what I want to look on is, it can be used in iterate like SUMX, which we just used in
the measure to get a grow context into a measure. Can we do this in a
calculated column? What if you’re using a measure inside a calculated
column, is this possible? Let’s take a look. So we have
the sales amount here again. The sales amount, and I calculated
this with SUMX, instead of SUM. So previously, we only set up SUM. Previously we only had sales price
times sales quantity, and now, I put into a calculated column the iterator function SUMX to
iterate over the sales table. This table where we have the
price and quantity in it, and then calculate the
price times the quantity. So the question is, what
will be the outcome? Would be this the sales amount
we would expect for every row, we get the sales amount for this row, or what would be the outcome? So think a little about that. If you’re watch a
recording, just pause the recording and think about it before you start pressing play again. So if we have this sales amount SUMX, then we get the result of 670, because SUMX is calculated. So the calculated column is
executed for all the row. So for the first row, it takes, of course, 10 times 3. But it then continues, because for the first row, it will then iterate
over the whole table. So for all the rows, it will iterate over the whole table. As we can remember, the sum of the sales
amount for the whole table is 670, because for the first row, it will take 10 times 3, then it will also
calculate the 20 times 1, 30 times 4, 100 times 5. Then this, the execution
of this expression, and then the SUMX will make
sure that all those values, those four values, are summed up, the iterator sums up to 670. This is done for all the rows. The next row, it also iterating
over the whole table. For the third row, if it’s iterating
over the whole table into 4, 4, it will do the same. So for all the rows, it’s
iterating over the whole table, so we get the 670. This can be a good thing. Maybe you’ll need this in one column, but probably not, because we use calculated columns
to filter on it. But the filter table fields are
very same for all the columns. If this comes as a surprise to you, like we get 670 at all the rows, then it’s because you
were not aware that the calculated column
has a row context. But this row context, so the row context is like
I am sitting in this row, this row context does not filter the sales table which
is used in the SUMX. So the row context is
not transferred into a filtering or transitioned into
a filtering of the sales table. So for this row, the sales table looks like all
the rows of the sales table, and that’s why we get the 670
on all the rows as the result. The name of file, I
think this is this one, where we see SUMX. Here we go. So we get the SUMX of sales, price times quantity, and then we get the 670 on all
the rows, of course. The total will be, not very useful, I guess. It’s summing up 670, four times, because there’s this default
summarization indices amount column, and this is set to SUM. One way would to click it and to
ask it not to summarize that way. At least get rid of this
wrong summarization here. Next question is, if you’re referring from one column, another column. So this is a really simple formula. I have the sales amount C for column. Then I just put in here the name of another column of
the sales amount calculation. What will be the result? I think the result is
no surprise to anybody. It just copies the value. So 20 is copied over and all
the values are copied over. So of course this is written altitude refer one column and another column. It’s not showing the sales amount, the total of the whole table, but as we are just
referring another column, the row context is coming in. So for this value, the row context makes sure that
I only see this sales amount. So sales amount C always one row of sales amount which is 20.
[inaudible] can see this here. So we have like just
copied the value, the original formula over. Because when one column is
referring the other one. Then there’s row by
row context is done, the calculation is
done row by row and therefore, the new calculated column only as the value of the old column. What if we have a measure inside the calculation of the
calculated columns? So sales. Sales amount has a very
simple formula, sales amount. Sales amount, I just
omitted a table name. So this should give you a hint. This is not the column sales amount, but this is the measure sales amount. The question here, and again, if you’re watching the recording, you can pause this at this moment. The question is does this verge? Do we get an error? Or if
you don’t get an error, what will be the result? Just to remind you for the sales amount measure
we use the SUMX formula. So SUMX or sales, sales price times sales quantity. We will use this formula SUMX inside a calculated
column. What happens? We got the total of the sales table. So this was this one then
we had the SUMX here. We got 670 here. So when we have the SUMX in the
calculated column, we got 670. So one way to export we could expect is that we get the 670 also here, because we have to SUMX here. Funny enough, we get
the right values. So for this Sales Amount M, which is preferring the measure, we get the right values, 30, 20, 120, 500 not 670 at all the places. Why is this so? Because if we have a row context and this
row contexts, we have a measure. Then that transitions the row
context into a filter context. So we have a measure inside of
a row contexts into stable and this row context is then transitioned at for the sake
of the calculation here, the sales table is only one row. So when this is calculated
for the first row, the sales table is only
consisting of one row. This only happens if you’re using a measure inside a calculation
for a row context. So if you use a measure inside
of a calculated column. A calculated column is
inside of row-context. Then for the sake of the measure, the row context is transitioned
into a filter context. This can be a surprise. This is the reason why it’s really important that we see
the formula if we referring a measure or if you’re
referring a calculated column. Because for a column, this filter transition is not
done for a measure it is done. This is a very important difference. Difference is really important
that we know when we have a column or when we have a measure at hand
inside of a formula. So let this sink. If we go back here, you
can see this is also working in the demo here. So we have this sales amount
M, calculated column, calculated column Sales
Amount M. This consists of the sales amount measure. Sales amount measure has this iterated to calculate the price times the quantity for all the roles available in the current filter
context and then summing up. As these measure is used inside the calculated column
and for measures, we have this rule that the row
context is transferred into the filter context for the sake
of this row of this one element. The filter context from this row is, the row context of these
rows transitioned into a filter context for
this measure and so this measure sees only
one row, 10 in three. This is what we get here, 30. For this row, the
filter context afford the measure is again only 20
to 120 and so on and so forth. What we can do is we can also
use or simulate this effect of transition row context into a filter context by
making the following. So again, have a sales
amount, a calculated column. I used to SUMX sales
price times quantity. So the same formula
we have seen now in the measure then inside
of the calculated column, and then the measure inside
of the calculated column. Now I put this little
formula inside Calculate. So Calculate is another
function available in packs. I’ve first see Calculate effort name, I know English, I mean calculate
means to calculate something. What could be the danger? What could be special
thing with Calculate? It and this was a big mistake. So don’t be me when learning that. Don’t assume that you know what the function does because
you know English, you assume what, what
the name stands for. Calculate especially is
a very complex function. It can be very cool things, but we have to be
aware of what it does. Calculate does the following. It transitions the row context
into air filter context. So we don’t need a measure. So we don’t need to put
the SUMX into the measure. We can also use Calculate of SUMX of sales price times quantity
inside of a calculated column. Then it will get the right result, like the calculation is done only row by row with the
right results showing up. Because calculating this is one of different functionalities
this Calculate does and performs for us is transition the row context
into a filter context. So what we see here, the
sales amount calculation. So we have this calculate of SUMX over the sales table,
price times quantity. Again, [inaudible] rights or
direct radius, the calculation, the sales amount for each row, which we would expect. Also the total is working up, because this can be summed up then
with the defaults [inaudible]. So the second take-away
from this talk before we come in to the real hard stuff is that calculated columns can
reference other columns and this is then done row by rows or when one column is
referring another column. Of course, I only see the value of
the column for the current row, but this row context does not filter a table.
There’s a question, Deepak?>>Yes Mark. As I was
about to interrupt you.>>Yes.>>So does the text allows a parameter to be passed
in for a calculations?>>Does text allow to put a
parameter in for the calculation? Can the person elaborate in more
detail what he or she means?>>Yes, it’s from an
anonymous person. We need to wait for it because it’s been a single sentence
question like this, does the text allow a parameter to be passed in for a
calculation? All right.>>I only have this formulas and I could refer to the
formula under this name. This could be a calculated
column or a measure, but there’s no way that
I have a parameter like here which then puts
into the calculation. If this is meant, no,
this is not possible, but I’m not really sure if I
understood the question correctly. So the person may reach out to me. At the very last slide, there’s my e-mail address
and my Twitter handle. Just reach out to me and I will make sure that
you get the right answer.>>Okay. So anyway, I’ve
replied back saying that, “Can you brief the
question furthermore?” So let’s wait for that.>>Yeah, thank you very much, Deepak.>>Yeah.>>Sticking here, take-aways. The row context does
not filter the table. That’s what we have
seen when we used to Sum X within a calculated
column because Sum X has this first parameter
at table and this table is not filtered by the columns
rule, so this multiply. If we want to filter transitions from the row
context, the filter context, we can use Calculate or we can
put the formula into a measure. For the measure also, this transition
is happening because, again, implicitly, text wraps
calculate around the measure. We don’t see this. Again, this is very nice that
a box is doing this for us, but if you’re not aware of that, you are confused that
the same formula put directly and to calculate
the column behaves differently as if we do put the very same formula
in the measure and then using the measure in
the calculated column. This can be confusing, but it’s the way it is
and the reason is because the automation does automatically
put a calculate around it. Now, we’re coming back to
why best practice is to always put the table name in
front of the Column and omit the table name in the
Measure is because it’s really important that
we see in the formula is this element referred in the formula is that a
column or is that a measure. So one thing is like here that the context transition is
happening for the measure, and the other thing is
that as I mentioned, a measure does like it’s
detached of any table. So it looks at the whole model, but it’s not really part of
a table calculation wise. Of course, we have to
put a measure into a table because otherwise
we cannot refer it. So we need the measures to be part of one of the tables here
because it has to show up somewhere in the field list
that the field list is always first the table and then the field. But what I do since a year now maybe, I put all the measures into a separated measure table.
So it also makes here. It’s not part of any
of the other tables here and of course, we
could move it around. So you can move the margin again, back to the Sales table because people expect
that the calculation for the margin is in the Sales
folder here in the field list. Then I can do this, but only if I never used the table name to
refer to measures because if in formulas I refer to measures with like I just call measures
and then the measure mean, and then I moved the measure
from here to another table, then of course, I break my formulas. So this is the second
reason why you should not refer to measures with
the tabling before. So let’s bring it to the
third and last level. We can calculate the filter context or we can manipulate the
filter context with calculate. We’ve just seen this and we
can play around with this. So I have a measure which
has the main product A and this measure shows me the sales amount and this
is read into calculate. If you use this, we get of course, a syntax in here because
there’s a comma at the end, but other than that, it would not change anything as long as we don’t have a row context around here because
there is no row context. So this would be the very same with calculate
or without calculate, but calculate is
capable of doing more. We can put a filter as a
second parameter to it. The first parameter is in
expression and then we can have as many parameters as we want and all those other parameters from
the second one on are filters. So we can write a filter here. This equals states that
I only want to see product descriptions which are A. So I have this table here, products and similar for products. I have different sales amount
for those products and there’s more sales amount here and
calculate sales amount, make sure that the
filter for this measure, for the sake of this calculation for Sales Amount is always set to A. So I end up with the column
product A where I will see the sales amount of product A. Like this, it’s not really
useful but of course, we could do here and then we
could calculate percentages. So how much makes B
in percentage of A, or how many times is C of A? This can be calculated by having this intermediate calculation and then calculating the
percentage or the factors. This filter is not only where
it depend the only product, on different products, but this filter applies also for
the total as you can see it. Again, this can be confusing for some people because
people tend to see those numbers and then to sum it up implicitly as expected
to the very end. We have 4 times 30 so
they would expect 120, but this is not how [inaudible]
is calculating stuff. It’s calculating a
separate calculation for this row and on this row, there is no product,
but we are setting, we are asking for certain
product description to A. So what we get is 30 because this is the sales amount of product A. So calculate can also
filter an expression, it doesn’t only transition to a
low context into filter context, but it can also change
the filter context. To do the very same calculation, calculate sales amount product
product description A equals to A, and you could add
filter as a function. So filter is a Table function, but it can be used
also as a filter and to calculate as a second parameter. So what I am stating here
is that I want to filter on product description A for
All product descriptions. So I don’t care if there’s already a filter on the product
description table like here, it would be filter on the product A, of course, I end up with 30, but here the filter would be on B, and they all make sure
that this filters applied not on all
available products, this would only be P, but all. So it’s like deleting, it’s removing any filter that
are on the product description. So theoretically, this filter is
put on all available products, and then it’s as a
product description on A, so we end up again with 30. So what we see here, this filter expression here
in the calculate statement, this is, again, was what
DAX is doing for us. So if we’re writing a
calculation like this, calculate and put
just a filter in it, what it does behind the scenes, it’s expanding it to filter all on this column and then
puts the filter on it. As we’re ending up that you’ll see only the sales amount of
product E on all the rows. So filter all or having just a filter on the
calculator does the same thing. It’s just Syntax Sugar. If you only want to
have the product E, then we do not have
to remove to filter. So if I remove the All on
the product description, so it just stayed product here, so now the filter is applied
on the product table, and the product table
could be filtered, and in my case it’s filtered. So here it is filtered
on product B and the end up on this nuclear collision
that I don’t see anything, because the filtered
on the product is B, and I only have one row on
the table B in this row. So when the calculation is done here, the product table
looks like it only has one row consisting of the product B, and then as a state different
product only one with a product descriptions
E. As it is not E but B, I end up with not seeing
any sales amounts. So if I have different filters in my filter
context and add more filters, they are calculated like
connected with and. So E and A ends up, it’s not possible if you
don’t have any products which have the description A and B, so we end up not
seeing anything here. So this is the way how we can
get to the sales amount of product A only when we are at product A and all the
others don’t see anything. Again, this looks like
it’s summing up correctly, but it’s not summing up. As you remember, it’s
calculating the same thing for the total and here
just sets the filter on A. So on the total level we will have, the product table would
look like A, B, C, and D and consists of four
[inaudible] and there are four rows on this total row, and they’re just asking
for “Give me only E”. Without the All filter, all the filters are combined
with the logical and operator. If we want to see All the products, we can take the short
version of the calculate, we only put All product
description and the sales amount is calculated
for All the products. So we end up with a column
here in this calculation, which also 670 in all the roles. Again, this might be not
very useful as it is, but then we can
calculate, for example, 30 divided by 670 and we
see the percentage of what is the percentage sales amount compared to the total for product A, B, C, and D. This is how we can do this by having the All
directly and to calculate. So All removes all the filters
on every label on the table, or in this case, the
column of this table. This can be useful, but sometimes we have filters outside and when we’re not using the
All, but the Allselected. Allselected make sure that if there’s a filter on the outside
of the actual crucial, I mean, this is a little bit more
complicated behind the scenes, but this is how we perceive it. If there’s a filter on the outside, the Allselected will only show the
sum for the selected products. While the All without the
selected will show us the 670. What’s the correct thing? It depends on your report cases, but we are capable of
doing both calculations. So we can ignore the filters
on the outset that use All, if you want to have the filters
on the outset just in place, then we use the Allselected for this second parameter
on the Calculate here. Allselected removes the filters
only from within the Visual and keeps the filters from outside
of the visual, so to say. Again, take-aways the
calculate filter overwrites existing filters on the same column, which ends up as filter
All of the column, and then adding up onto
filter was removed first all the other filters and just sets the filter on what we are asking for. Because All removes the
filter from a table or a column and also it removes it, but keeps the filter from existing filter from the
outset of the calculation. So time to wrap up,
five minutes to go, so I have time for some questions, if there are some or should we
take the questions right now? We’ve got other more questions?>>Yes, no Markus. We don’t have any
further questions on it.>>Very good. Then I will wrap up it, and give you a break,
three or four minutes. So DAX is easy from the syntax side. So I think almost everybody can
read formal syntax to understand, to get a glimpse to
what it’s doing bound. It can be rather complex because the meaning of the
calls to the formulas, like there’s some, there’s
some Xs calculate. They all sound the
same in plain English, but they do totally different things, it can be really complex, which is a good thing. At the end, we have seen that we can do rather complex calculations, we can fulfill many, many requests because the
semantic is so powerful, but it can be confusing
in the beginning. Then the second takeaway, don’t create any
calculate the columns. If you need something as a column, create it in Power Query
and the M language, or make sure that the data source
already delivers what you need, and create a state measures for all your calculations with
the correct aggregation. Hide all the original columns, so nobody is confused that there’s
two sales amounts in the model. Always if you’re
referencing the columns, please put in the table
name and omit it if you are referring to measures, because we really need to distinguish between tables and
columns because in one case, the rule contexts is not
transferred to the other. It is transferred and it could be very confusing if
you’re not sure if you are using a facing table columns,
columns or measures. With that said, thank you very
much for attending this webcast. Thank you very much for
watching the recording. If there are any questions, please feel free to reach
out to me on Twitter, or via LinkedIn, my e-mail
address is also here, and I will make sure
that you get the answer. I will post this [inaudible]
and the Power BI file, the sample file I just used
on Twitter in the next hour, I would say, and also, on LinkedIn. So feel free to download this thing or to contact me so I
can it towards you. So I think I will give
everybody two minute speech, if there are no further
questions, Deepak.>>No more questions on it, Markus.>>No more questions?>>Let me check one final time. There was a question from anonymous, like if remove filters
functions the same as All?>>Remove filters. It’s a remove filters
function? I’m not sure. I have to look this up. All removes the filters from
the table or the column, I have to look it up if there’s
a Remove filter function.>>All right.>>I’m not sure, I would check this out and I will make sure that I post an answer to this
question on Twitter next hour.>>All right, perfect.
So we don’t have any furthermore
questions on it, Markus.>>Thank you very much.>>Thank you.>>Please check out
the Power BI MVP book. I’m one of the contributors. We don’t earn any money there. It’s going to charity. It’s a book with 31 chapters,
21 different chapters, so you will have fun for one month reading one
chapter each evening. With that said, thank you very much. Have a nice day.

Leave a Reply

Your email address will not be published. Required fields are marked *