Map database schema in Power BI
I've come across a video on youtube that describes How to Easily Map Your Database Schema in Power BI using the AdventureWorks database from Microsoft. Now I'm trying to replicate that example using another database. The problem is that many of my columns have got similar content, but different column names with prefixes such as
fk_ depending on which tables they are located in. And that causes the following query to fail:
SELECT c.TABLE_NAME ,c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c INNER JOIN (SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME HAVING COUNT(*) > 1 ) dupes ON dupes.COLUMN_NAME = c.COLUMN_NAME
Does anyone know if it's possible to fuzzy match column names or taking different prefixes into account to make this work? The very same question has been asked directly to the youtube OP. It can also be found on reddit.com, but the question remains unanswered.
I'm trying to wrap my head around some more advanced Power BI features and at the same time learn some much needed SQL, and I thought this would be a cool place to start, so any help is much appreciated!
1 Answers Map database schema in Power BI
If you want to show relationships between tables then using common column names between two tables is not best idea.
CREATE TABLE tab(id INT PRIMARY KEY, name INT); CREATE TABLE tab2(id2 INT PRIMARY KEY, name INT); -- completely unrelated tables SELECT c.TABLE_NAME ,c.COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS c INNER JOIN (SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS GROUP BY COLUMN_NAME HAVING COUNT(*) > 1 ) dupes ON dupes.COLUMN_NAME = c.COLUMN_NAME +-------------+-------------+ | TABLE_NAME | COLUMN_NAME | +-------------+-------------+ | tab | name | | tab2 | name | +-------------+-------------+
I would propose to use proper metadata views i.e. sys.foreign_key_columns:
SELECT [table] = tab1.name, [column] = col1.name, [referenced_table] = tab2.name, [referenced_column] = col2.name FROM sys.foreign_key_columns fkc JOIN sys.objects obj ON obj.object_id = fkc.constraint_object_id JOIN sys.tables tab1 ON tab1.object_id = fkc.parent_object_id JOIN sys.schemas sch ON tab1.schema_id = sch.schema_id JOIN sys.columns col1 ON col1.column_id = parent_column_id AND col1.object_id = tab1.object_id JOIN sys.tables tab2 ON tab2.object_id = fkc.referenced_object_id JOIN sys.columns col2 ON col2.column_id = referenced_column_id AND col2.object_id = tab2.object_id;
Then you need to choose appropriate visualisation method in PowerBI.