Context Propagation details and other surprises while doing a ranking calculation

This post is also available in: Español

During my PowerBI courses, one of the topics that presents the most difficulties and surprises for the participants is to understand what the evaluation context is, what is the row context, the filter context, and above all, the propagation of contexts.

Microsoft, as it usually does for its popular products, hides internal complexities to “facilitate” the use of its product. And that’s fine to solve the simplest cases. But for real cases, which are never so simple, if you don’t understand in depth what innocent functions like CALCULATE do, you get unpleasant surprises 🙁

As an example, today I am going to do a simple ranking calculation, staring with a table with 500 rows, and two columns: i d and val

id will have values between 0 and 9.

val will have values between 1 and 35.

The following M script may produce the data requeired to follow this example. The name of this query, and the corresponding table is ‘datos’
I will give a link to unload the complete .pbix file later.

let
Origen = {1..500},
p_01 = Table.FromList(Origen, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
p_02 = Table.DuplicateColumn(p_01″, “Column1”, “Column1 – Copia”),
p_03 = Table.TransformColumnTypes(p_02″,{{“Column1 – Copia”, Int64.Type}}),
p_04 = Table.TransformColumns(p_03″,{{“Column1 – Copia”, each Number.Mod(, 2), Int64.Type}}), p_05 = Table.DuplicateColumn(p_04″, “Column1”, “Column1 – Copia.1″), p_06 = Table.TransformColumns(p_05”,{{“Column1 – Copia.1”, each Number.Mod(, 10), type number}}),
p_07 = Table.RenameColumns(p_06″,{{“Column1”, “i”}, {“Column1 – Copia”, “p”}, {“Column1 – Copia.1”, “id”}}),
p_08 = Table.ReorderColumns(p_07″,{“id”, “i”, “p”}),
p_09 = Table.AddColumn(p_08″, “Personalizado”, each [p] + Number.Sqrt([i]) + ([p]*[id])),
p_10 = Table.SelectColumns(p_09″,{“id”, “Personalizado”}),
p_11 = Table.RenameColumns(p_10″,{{“Personalizado”, “val”}}),
p_12 = Table.TransformColumnTypes(p_11″,{{“val”, type number}, {“id”, Int64.Type}}),
final = Table.TransformColumnTypes(p_12″,{{“val”, Int64.Type}})

in
final

In PowerBI your data will look as this:

Now we can build a measure to calculate the frequency of the val column, disregarding id

Freq = COUNTROWS(misDatos) // misDatos is the table name in the model

Available fields now are:


As a best practice, my measures are always under a false table, called _medidas

Fields of misDatos have the default aggregation set to none

We may look at current state exposing val and Freq in a visualization table

dimensión: val, medida Freq

I can verify that the total of Freq is 500, as expected, because we have 500 rows in our model table misDatos 🙂

Now we are going to dive into the rankling calculation, using simple functions CALCULATE, VALUES, FILTER, ALL

I start building my measure Rk, using variables: I set variable miFreq to the Freq value for each row of this visual table, which is trivial.

Rk = var miFreq = [Freq]
var r = miFreq
return r

Rk is evaluates once for each row, that is, in the filter context defined by the value of ‘val’. Rk and Freq have same value, as expected.
To get a ranking, we say that the first place is for the value of ‘val’ whose row has the highest Freq, that is, if I count how many values greater than Freq there are, the result is 0. If I generalize this criterion, I may say that the ranking “FOR EACH val VALUE” is the number that results from “counting values or rows” of where the Freq is greater than that of its own value. As we usually make rankings from value 1, and not from 0, we will have to add 1 to this definition.

It is clear that we must take into our calculation foer each row, data that is out of the current row of the visual table.
La columna visual la podemos calcular con VALUES
Let’s go ahead:

Rk = var miFreq = [Freq]
var r = 1 + CALCULATE(
COUNTROWS(
VALUES(misDatos[val])
),
FILTER(
ALL(misDatos[val]),
[Freq] > miFreq
)
)
return r

In the column marked in red we see the ranking. For those rows with the same frequency (draws), they assigned with the same result number, and the next frequency value, jumps positions accordingly. For example for ‘val’ 20 and 22, they both have frequency 34, and share position 2. The next position in the ranking will be with 4th position, for the value 18 frequently 32.

It is very important to notice that we have used the VALUES to generate a vector (one column table) which contains each differente value of ‘val’. We could have used the DISTINCT function, with the same result, since we are using data from only one table. This is true even if you have blank vaues You can deep dive into this topic here.

But if we try to built this temporary table with SUMMARIZE, or SUMMARIZECOLUMNS, then COUNTROWS will fail: it requires a table definition, and not a table expression. In plain words, it does not accept a function which may contain an aggregation, only functions returning raw data from the model table.

Another important detail is that, for counting rows, I define the context for te iteration as ALL(field) and not ALL(Table)
When the meaasure Freq takes place, there will be a context propagation from the visual row context to a filter context

ALL(Tabla) would define a filter mask for both columns, id and val which is not what we want.

If you would like to comment on this article, please write me at roberto@mirelman.com, and I will add your comments here.

Leave a Comment