Creating a “Correlation Matrix” in Power BI

Erik Hamoen
3 min readAug 1, 2023
Photo by Casey Lovegrove on Unsplash

Recently I came across this question asked at the Power BI (Fabric) community, because I was looking for something like this myself. I had no idea what these matrices were called, but I think the correlation matrix will be in the best direction. With this example, I decided to try it myself.

Getting the data

First, I load some one the data in, using “Enter data”, and then open the Power Query editor:

The first thing to do is unpivot the columns, other than Patient. You can do this by right-clicking the patient column and selecting “Unpivot Other Columns”. You can also do this in the Transform menu. Last, I rename the Attribute column to Condition.

After that, I make a reference to the unpivot table. Remove all the columns except for Condition and then remove the duplicate rows and call this table Condition. Then duplicate the query, and call it Condition 2.

After loading the data, make sure there are no relationships between the Condition tables and the unpivoted table:

Now it’s time to create a measure and a visual to display the correlation between the different conditions.

First, create a matrix with the Condition from Condition1 on the rows, and Condition from Condition 2 on the columns. You’ll get an error message because these two tables don’t have a relationship. So, let’s create a new measure!

The Dax

# Correlation = 
VAR TableCondition1 =
SELECTCOLUMNS(
Filter(
'Table unpivot'
,'Table unpivot'[Condition] = SELECTEDVALUE('Condition 1'[Condition])
&& 'Table unpivot'[Value] = 1
)
,"Patient" , 'Table unpivot'[Patient] )

VAR TableCondition2 =
FILTER(
'Table unpivot'
,'Table unpivot'[Patient] IN TableCondition1
&& 'Table unpivot'[Condition] = SELECTEDVALUE('Condition 2'[Condition])
&& 'Table unpivot'[Value] = 1
)

RETURN COUNTROWS ( TableCondition2 )

In the first variable, TableCondition1, I create a virtual table per condition from the table Condition 1, where this condition is one. So if we look at the first image, this is a table with patients 1,2, 5, 6, 8, & 9 for Diabetes. From this virtual table, I only want the patients, that’s why I use the selected columns.

In TableCondition2, I create a new virtual table but with only the patients from TableCondition1 and then check which of the Conditions from Condition 2 have the value 1. For Heart Failure, that will be Patient 1, 5, 8 & 9 . In the end, we count those rows, which will make it 4. After putting this measure in the matrix:

And there you go, a very simple solution that might give the insights you need!

Take care.

Originally published at http://sidequests.blog on August 1, 2023.

--

--