
fabric_cat_tools's Introduction

fabric_cat_tools

This is a Python library intended to be used in Microsoft Fabric notebooks. It was originally intended to contain functions for migrating semantic models to Direct Lake mode. However, it quickly became apparent that such a library could support many other useful activities in the realm of semantic models, reports, lakehouses and really anything Fabric-related. As such, this library contains a variety of functions, ranging from running Vertipaq Analyzer or the Best Practice Analyzer against a semantic model, to checking whether any lakehouse tables hit the Direct Lake guardrails, and more.

Instructions for migrating import/DirectQuery semantic models to Direct Lake mode can be found here.

If you encounter any issues, please raise a bug.

If you have ideas for new features/functions, please request a feature.

Install the .whl file in a Fabric notebook

%pip install "https://raw.githubusercontent.com/m-kovalsky/fabric_cat_tools/main/fabric_cat_tools-0.3.3-py3-none-any.whl"

Once installed, run this code to import the library into your notebook

import fabric_cat_tools as fct

Load fabric_cat_tools into a custom Fabric environment

An even better way to ensure the fabric_cat_tools library is available in your workspace/notebooks is to load it as a library in a custom Fabric environment. If you do this, you will not have to run the above '%pip install' code every time in your notebook. Please follow the steps below.

Create a custom environment

  1. Navigate to your Fabric workspace
  2. Click 'New' -> More options
  3. Within 'Data Science', click 'Environment'
  4. Name your environment, click 'Create'

Add fabric_cat_tools as a library to the environment

  1. Download the latest fabric_cat_tools library
  2. Within 'Custom Libraries', click 'upload'
  3. Upload the .whl file which was downloaded in step 1
  4. Click 'Save' at the top right of the screen
  5. Click 'Publish' at the top right of the screen
  6. Click 'Publish All'

Update your notebook to use the new environment (must wait for the environment to finish publishing)

  1. Navigate to your Notebook
  2. Select your newly created environment within the 'Environment' drop down in the navigation bar at the top of the notebook

Function Categories

Semantic Model

Report

Model Optimization

Direct Lake Migration

Direct Lake

Lakehouse

Add/remove objects from a semantic model

Misc

Helper Functions

Functions

add_data_column

Adds a data column to a semantic model.

import fabric_cat_tools as fct
fct.add_data_column(
        dataset = 'AdventureWorks'
        ,table_name = 'Internet Sales'
        ,column_name = 'SalesAmount'
        ,source_column = 'SalesAmount'
        ,data_type =  'Int64'
        #,format_string = ''
        #,display_folder = ''
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

table_name str

Required; Name of the table in which the column will reside.

column_name str

Required; Name of the column.

source_column str

Required; Name of the column in the source system.

data_type str

Required; Data type of the column. Options: 'Int64', 'String', 'Double', 'Decimal', 'DateTime', 'Boolean'.

format_string str

Optional; Format string of the column.

description str

Optional; Description of the column.

display_folder str

Optional; Display folder of the column.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_field_parameter

Adds a field parameter to a semantic model.

import fabric_cat_tools as fct
fct.add_field_parameter(
            dataset = 'AdventureWorks'
            ,table_name = 'Parameter'
            ,objects = ["[Sales Amount]", "[Order Qty]", "'Internet Sales'[Color]"]
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

table_name str

Required; Name of the field parameter table.

objects list of str

Required; List of columns/measures to be included in the field parameter. Columns are fully qualified 'TableName'[ColumnName] and measures are in square brackets [MeasureName].

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_hierarchy

Adds a hierarchy to a semantic model.

import fabric_cat_tools as fct
fct.add_hierarchy(
            dataset = 'AdventureWorks'
            ,table_name = 'Geography'
            ,hierarchy_name = 'Geography Hierarchy'
            ,levels = ['Continent', 'Country', 'City']
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

table_name str

Required; Name of the table in which the hierarchy will reside.

hierarchy_name str

Required; Name of the hierarchy.

levels list of str

Required; List of columns to be included as levels in the hierarchy.

workspace_name str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_measure

Adds a measure to a semantic model.

import fabric_cat_tools as fct
fct.add_measure(
        dataset = 'AdventureWorks'
        ,table_name = 'Internet Sales'
        ,measure_name = 'Sales Amount'
        ,expression =  "SUM( 'Internet Sales'[SalesAmount] )"
        #,display_folder = ''
        #,format_string = ''
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

table_name str

Required; Name of the table in which the measure will reside.

measure_name str

Required; Name of the measure.

expression str

Required; DAX expression for the measure.

display_folder str

Optional; Display folder for the measure.

format_string str

Optional; Format string for the measure.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_relationship

Adds a relationship to a semantic model.

import fabric_cat_tools as fct
fct.add_relationship(
            dataset = 'AdventureWorks'
            ,from_table = 'Internet Sales'
            ,from_column = 'ProductKey'
            ,to_table = 'Product'
            ,to_column = 'ProductKey'
            ,from_cardinality = 'Many'
            ,to_cardinality = 'One'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

from_table str

Required; Name of the table on the 'from' side of the relationship.

to_table str

Required; Name of the table on the 'to' side of the relationship.

from_column str

Required; Name of the column on the 'from' side of the relationship.

to_column str

Required; Name of the column on the 'to' side of the relationship.

from_cardinality str

Required; Cardinality on the 'from' side of the relationship. Options: ('Many', 'One', 'None').

to_cardinality str

Required; Cardinality on the 'to' side of the relationship. Options: ('Many', 'One', 'None').

cross_filtering_behavior str

Optional; Setting for the cross filtering behavior of the relationship. Options: ('Automatic', 'OneDirection', 'BothDirections'). Default value: 'Automatic'.

security_filtering_behavior str

Optional; Setting for the security filtering behavior of the relationship. Options: ('None', 'OneDirection', 'BothDirections'). Default value: 'OneDirection'.

is_active bool

Optional; Setting for whether the relationship is active or not. Default value: True.

rely_on_referential_integrity bool

Optional; Setting for whether the relationship relies on referential integrity. Default value: True.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_role

Adds a role to a semantic model.

import fabric_cat_tools as fct
fct.add_role(
            dataset = 'AdventureWorks'
            ,role_name = 'Reader'
            ,role_description = 'This role is for...'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

role_name str

Required; Name of the role.

role_description str

Optional; Description of the role.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


add_rls

Adds row-level security to a table within a role in a semantic model.

import fabric_cat_tools as fct
fct.add_rls(
            dataset = 'AdventureWorks'
            ,role_name = 'Reader'
            ,table_name = 'UserGeography'
            ,filter_expression = "'UserGeography'[UserEmail] = USERPRINCIPALNAME()"
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

role_name str

Required; Name of the role to which row-level security will be applied.

table_name str

Required; Name of the table to which row-level security will be applied.

filter_expression str

Required; DAX expression for the row-level security filter.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


cancel_dataset_refresh

Cancels the refresh of a semantic model which was executed via the Enhanced Refresh API.

import fabric_cat_tools as fct
fct.cancel_dataset_refresh(
            dataset = 'MyReport'
            #,request_id = None
            #,workspace = None
            )

Parameters

dataset str

Required; Name of the semantic model.

request_id str

Optional; The request id of a semantic model refresh. Defaults to finding the latest active refresh of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


check_fallback_reason

Shows the reason a table in a Direct Lake semantic model would fall back to DirectQuery.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.check_fallback_reason(
            dataset = 'AdventureWorks'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

Pandas dataframe showing the tables in the semantic model and their fallback reason.


clear_cache

Clears the cache of a semantic model.

import fabric_cat_tools as fct
fct.clear_cache(
            dataset = 'AdventureWorks'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


clone_report

Makes a clone of a Power BI report.

import fabric_cat_tools as fct
fct.clone_report(
            report = 'MyReport'
            ,cloned_report = 'MyNewReport'
            #,workspace = None
            #,target_workspace = None
            #,target_dataset = None
            )

Parameters

report str

Required; Name of the report to be cloned.

cloned_report str

Required; Name of the new report.

workspace str

Optional; The workspace where the original report resides.

target_workspace str

Optional; The workspace where the new report will reside. Defaults to using the workspace in which the original report resides.

target_dataset str

Optional; The semantic model to which the new report will be connected. Defaults to using the semantic model used by the original report.

Returns

A printout stating the success/failure of the operation.


control_fallback

Sets the DirectLakeBehavior property for a semantic model.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.control_fallback(
            dataset = 'AdventureWorks'
            ,direct_lake_behavior = 'DirectLakeOnly'            
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

direct_lake_behavior str

Required; Setting for Direct Lake Behavior. Options: ('Automatic', 'DirectLakeOnly', 'DirectQueryOnly').

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


create_blank_semantic_model

Creates a new blank semantic model (no tables/columns etc.).

import fabric_cat_tools as fct
fct.create_blank_semantic_model(
            dataset = 'AdventureWorks'
            #,workspace = None
            )

Parameters

dataset str

Required; Name of the semantic model.

compatibility_level int

Optional; Setting for the compatibility level of the semantic model. Default value: 1605.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


create_pqt_file

Dynamically generates a Power Query Template file based on the semantic model. The .pqt file is saved within the Files section of your lakehouse.

import fabric_cat_tools as fct
fct.create_pqt_file(
            dataset = 'AdventureWorks'
            #,file_name = 'PowerQueryTemplate'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

file_name str

Optional; Name of the Power Query Template (.pqt) file to be created.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


create_report_from_reportjson

Creates a report based on a report.json file (and an optional themes.json file).

import fabric_cat_tools as fct
fct.create_report_from_reportjson(
            report = 'MyReport'
            ,dataset = 'AdventureWorks'
            ,report_json = ''
            #,theme_json = ''
            #,workspace = ''
            )
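
As a hedged sketch (assuming the value returned by get_report_json, documented further below, can be passed directly as report_json), an existing report's definition can be used to create a new report:

import fabric_cat_tools as fct

# Hedged sketch: read the report.json of an existing report and create a new
# report from it, connected to the same semantic model. Names are placeholders.
rpt_json = fct.get_report_json(
            report = 'MyReport'
            #,workspace = None
            )
fct.create_report_from_reportjson(
            report = 'MyReportCopy'
            ,dataset = 'AdventureWorks'
            ,report_json = rpt_json
            #,workspace = ''
            )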

Parameters

report str

Required; Name of the report.

dataset str

Required; Name of the semantic model to connect to the report.

report_json Dict or str

Required; The report.json file to be used to create the report.

theme_json Dict or str

Optional; The theme.json file to be used for the theme of the report.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


create_semantic_model_from_bim

Creates a new semantic model based on a Model.bim file.

import fabric_cat_tools as fct
fct.create_semantic_model_from_bim(
            dataset = 'AdventureWorks'
            ,bim_file = ''
            #,workspace = ''
            )
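
As a hedged sketch (assuming the value returned by get_semantic_model_bim, documented further below, can be passed directly as bim_file), an existing model can be copied to a new semantic model:

import fabric_cat_tools as fct

# Hedged sketch: extract the Model.bim of an existing semantic model and use it
# to create a new semantic model. Names are placeholders.
bim = fct.get_semantic_model_bim(
            dataset = 'AdventureWorks'
            #,workspace = None
            )
fct.create_semantic_model_from_bim(
            dataset = 'AdventureWorksCopy'
            ,bim_file = bim
            #,workspace = ''
            )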

Parameters

dataset str

Required; Name of the semantic model.

bim_file Dict or str

Required; The model.bim file to be used to create the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


create_shortcut_onelake

Creates a shortcut to a delta table in OneLake.

import fabric_cat_tools as fct
fct.create_shortcut_onelake(
            table_name = 'DimCalendar'
            ,source_lakehouse = 'Lakehouse1'
            ,source_workspace = 'Workspace1'
            ,destination_lakehouse = 'Lakehouse2'
            #,destination_workspace = ''
            ,shortcut_name = 'Calendar'
            )

Parameters

table_name str

Required; The table name for which a shortcut will be created.

source_lakehouse str

Required; The lakehouse in which the table resides.

source_workspace str

Required; The workspace where the source lakehouse resides.

destination_lakehouse str

Required; The lakehouse where the shortcut will be created.

destination_workspace str

Optional; The workspace in which the shortcut will be created. Defaults to the 'source_workspace' parameter value.

shortcut_name str

Optional; The name of the shortcut 'table' to be created. This defaults to the 'table_name' parameter value.

Returns

A printout stating the success/failure of the operation.


create_warehouse

Creates a warehouse in Fabric.

import fabric_cat_tools as fct
fct.create_warehouse(
            warehouse = 'MyWarehouse'
            ,workspace = None
            )

Parameters

warehouse str

Required; Name of the warehouse.

description str

Optional; Description of the warehouse.

workspace str

Optional; The workspace where the warehouse will reside.

Returns

A printout stating the success/failure of the operation.


delete_shortcut

Deletes a OneLake shortcut.

import fabric_cat_tools as fct
fct.delete_shortcut(
            shortcut_name = 'DimCalendar'
            ,lakehouse = 'Lakehouse1'
            ,workspace = 'Workspace1'
            )

Parameters

shortcut_name str

Required; The name of the OneLake shortcut to delete.

lakehouse str

Optional; The lakehouse in which the shortcut resides.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


direct_lake_schema_compare

Checks that all the tables in a Direct Lake semantic model map to tables in their corresponding lakehouse and that the columns in each table exist.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.direct_lake_schema_compare(
            dataset = 'AdventureWorks'
            ,workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

lakehouse str

Optional; The lakehouse used by the Direct Lake semantic model.

lakehouse_workspace str

Optional; The workspace in which the lakehouse resides.

Returns

Shows tables/columns which exist in the semantic model but do not exist in the corresponding lakehouse.


direct_lake_schema_sync

Shows/adds columns which exist in the lakehouse but do not exist in the semantic model (only for tables in the semantic model).

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.direct_lake_schema_sync(
     dataset = 'AdvWorks'
    ,add_to_model = True
    #,workspace = ''
    #,lakehouse = ''
    #,lakehouse_workspace = ''
    )

Parameters

dataset str

Required; Name of the semantic model.

add_to_model bool

Optional; Adds columns which exist in the lakehouse but do not exist in the semantic model. No new tables are added. Default value: False.

workspace str

Optional; The workspace where the semantic model resides.

lakehouse str

Optional; The lakehouse used by the Direct Lake semantic model.

lakehouse_workspace str

Optional; The workspace in which the lakehouse resides.

Returns

A list of columns which exist in the lakehouse but not in the Direct Lake semantic model. If 'add_to_model' is set to True, a printout stating the success/failure of the operation is returned.


export_model_to_onelake

Exports a semantic model's tables to delta tables in the lakehouse. Creates shortcuts to the tables if a lakehouse is specified.

Important

This function requires:

  • XMLA read/write to be enabled on the Fabric capacity.
  • OneLake Integration feature to be enabled within the semantic model settings.

import fabric_cat_tools as fct
fct.export_model_to_onelake(
            dataset = 'AdventureWorks'
            ,workspace = None
            ,destination_lakehouse = 'Lakehouse2'            
            ,destination_workspace = 'Workspace2'
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

destination_lakehouse str

Optional; The lakehouse where shortcuts will be created to access the delta tables created by the export. If the lakehouse specified does not exist, one will be created with that name. If no lakehouse is specified, shortcuts will not be created.

destination_workspace str

Optional; The workspace in which the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


export_report

Exports a Power BI report to a file in your lakehouse.

import fabric_cat_tools as fct
fct.export_report(
            report = 'AdventureWorks'
            ,export_format = 'PDF'
            #,file_name = None
            #,bookmark_name = None
            #,page_name = None
            #,visual_name = None
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.export_report(
            report = 'AdventureWorks'
            ,export_format = 'PDF'
            #,file_name = 'Exports\MyReport'
            #,bookmark_name = None
            #,page_name = 'ReportSection293847182375'
            #,visual_name = None
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.export_report(
            report = 'AdventureWorks'
            ,export_format = 'PDF'
            #,page_name = 'ReportSection293847182375'
            #,report_filter = "'Product Category'[Color] in ('Blue', 'Orange') and 'Calendar'[CalendarYear] <= 2020"
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.export_report(
            report = 'AdventureWorks'
            ,export_format = 'PDF'
            #,page_name = ['ReportSection293847182375', 'ReportSection4818372483347']
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.export_report(
            report = 'AdventureWorks'
            ,export_format = 'PDF'
            #,page_name = ['ReportSection293847182375', 'ReportSection4818372483347']
            #,visual_name = ['d84793724739', 'v834729234723847']
            #,workspace = None
            )

Parameters

report str

Required; Name of the report.

export_format str

Required; The format in which to export the report. See this link for valid formats: https://learn.microsoft.com/rest/api/power-bi/reports/export-to-file-in-group#fileformat. For image formats, enter the file extension in this parameter, not 'IMAGE'.

file_name str

Optional; The name of the file to be saved within the lakehouse. Do not include the file extension. Defaults to the 'report' parameter value.

bookmark_name str

Optional; The name (GUID) of a bookmark within the report.

page_name str or list of str

Optional; The name (GUID) of the report page.

visual_name str or list of str

Optional; The name (GUID) of a visual. If you specify this parameter you must also specify the page_name parameter.

report_filter str

Optional; A report filter to be applied when exporting the report. Syntax is user-friendly. See above for examples.

workspace str

Optional; The workspace where the report resides.

Returns

A printout stating the success/failure of the operation.


generate_embedded_filter

Converts a filter condition into embedded filter syntax.

import fabric_cat_tools as fct
fct.generate_embedded_filter(
            filter = "'Product'[Product Category] = 'Bikes' and 'Geography'[Country Code] in (3, 6, 10)"       
            )
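
As a hedged usage sketch (the report URL below is only a placeholder; this library does not define the URL format), the returned string could be appended to a Power BI report URL as a URL filter:

import fabric_cat_tools as fct

# Hypothetical usage: capture the converted filter and append it to a report URL.
# The URL below is a placeholder; <groupId> and <reportId> are not real values.
flt = fct.generate_embedded_filter(
            filter = "'Product'[Product Category] = 'Bikes' and 'Geography'[Country Code] in (3, 6, 10)"
            )
report_url = "https://app.powerbi.com/groups/<groupId>/reports/<reportId>/ReportSection?filter=" + flt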

Parameters

filter str

Required; The filter condition to be converted into embedded filter syntax.

Returns

A string containing the filter converted into embedded filter syntax.


get_direct_lake_guardrails

Shows the guardrails for when Direct Lake semantic models will fall back to DirectQuery, based on Microsoft's online documentation.

import fabric_cat_tools as fct
fct.get_direct_lake_guardrails()

Parameters

None

Returns

A table showing the Direct Lake guardrails by SKU.


get_directlake_guardrails_for_sku

Shows the guardrails for Direct Lake based on the SKU used by your workspace's capacity.

Use the result of the 'get_sku_size' function as an input for this function's sku_size parameter, as shown in the example below.

import fabric_cat_tools as fct
fct.get_directlake_guardrails_for_sku(
            sku_size = ''
            )
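
For example, chaining the two functions as described above:

import fabric_cat_tools as fct

# Feed the SKU size returned by get_sku_size into get_directlake_guardrails_for_sku.
sku = fct.get_sku_size(
            workspace = ''
            )
fct.get_directlake_guardrails_for_sku(
            sku_size = sku
            )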

Parameters

sku_size str

Required; SKU size of a workspace/capacity.

Returns

A table showing the Direct Lake guardrails for the given SKU.


get_direct_lake_lakehouse

Identifies the lakehouse used by a Direct Lake semantic model.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.get_direct_lake_lakehouse(
            dataset = 'AdventureWorks'
            #,workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''            
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

lakehouse str

Optional; Name of the lakehouse used by the semantic model.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.


get_direct_lake_sql_endpoint

Identifies the SQL endpoint used by a Direct Lake semantic model.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.get_direct_lake_sql_endpoint(
            dataset = 'AdventureWorks'
            #,workspace = ''       
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A string containing the SQL Endpoint ID for a Direct Lake semantic model.


get_lakehouse_columns

Shows the tables and columns of a lakehouse and their respective properties.

import fabric_cat_tools as fct
fct.get_lakehouse_columns(
            lakehouse = 'AdventureWorks'
            #,workspace = '' 
            )

Parameters

lakehouse str

Optional; The lakehouse name.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A pandas dataframe showing the tables/columns within a lakehouse and their properties.


get_lakehouse_tables

Shows the tables of a lakehouse and their respective properties. Option to include additional properties relevant to Direct Lake guardrails.

import fabric_cat_tools as fct
fct.get_lakehouse_tables(
        lakehouse = 'MyLakehouse'
        #,workspace = ''
        ,extended = True
        ,count_rows = True)
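
A hedged variant using the optional 'export' parameter documented below to also write the results to a delta table in the attached lakehouse:

import fabric_cat_tools as fct
fct.get_lakehouse_tables(
        lakehouse = 'MyLakehouse'
        #,workspace = ''
        ,extended = True
        ,count_rows = True
        ,export = True)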

Parameters

lakehouse str

Optional; The lakehouse name.

workspace str

Optional; The workspace where the lakehouse resides.

extended bool

Optional; Adds the following additional table properties ['Files', 'Row Groups', 'Table Size', 'Parquet File Guardrail', 'Row Group Guardrail', 'Row Count Guardrail']. Also indicates the SKU for the workspace and whether guardrails are hit. Default value: False.

count_rows bool

Optional; Adds an additional column showing the row count of each table. Default value: False.

export bool

Optional; If specified as True, the resulting dataframe will be exported to a delta table in your lakehouse.

Returns

A pandas dataframe showing the delta tables within a lakehouse and their properties.


get_measure_dependencies

Shows all dependencies for all measures in a semantic model

import fabric_cat_tools as fct
fct.get_measure_dependencies(
            dataset = 'AdventureWorks'
            #,workspace = None
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A pandas dataframe showing all dependencies for all measures in the semantic model.


get_object_level_security

Shows a list of columns used in object level security.

import fabric_cat_tools as fct
fct.get_object_level_security(
        dataset = 'AdventureWorks'
        ,workspace = '')

Parameters

dataset str

Optional; The semantic model name.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A pandas dataframe showing the columns used in object level security within a semantic model.


get_report_json

Gets the report.json file content of a Power BI report.

import fabric_cat_tools as fct
fct.get_report_json(
            report = 'MyReport'
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.get_report_json(
            report = 'MyReport'
            #,workspace = None
            ,save_to_file_name = 'MyFileName'
            )

Parameters

report str

Required; Name of the report.

workspace str

Optional; The workspace where the report resides.

save_to_file_name str

Optional; Specifying this parameter will save the report.json file to your lakehouse with the file name of this parameter.

Returns

The report.json file for a given Power BI report.


get_semantic_model_bim

Extracts the Model.bim file for a given semantic model.

import fabric_cat_tools as fct
fct.get_semantic_model_bim(
            dataset = 'AdventureWorks'
            #,workspace = None
            )
import fabric_cat_tools as fct
fct.get_semantic_model_bim(
            dataset = 'AdventureWorks'
            #,workspace = None
            ,save_to_file_name = 'MyFileName'
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

save_to_file_name str

Optional; Specifying this parameter will save the model.bim file to your lakehouse with the file name of this parameter.

Returns

The model.bim file for a given semantic model.


get_shared_expression

Dynamically generates the M expression used by a Direct Lake model for a given lakehouse.

import fabric_cat_tools as fct
fct.get_shared_expression(
            lakehouse = ''
            #,workspace = '' 
            )

Parameters

lakehouse str

Optional; The lakehouse name.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A string showing the expression which can be used to connect a Direct Lake semantic model to its SQL Endpoint.


get_sku_size

Shows the SKU size for a workspace.

import fabric_cat_tools as fct
fct.get_sku_size(
            workspace = '' 
            )

Parameters

workspace str

Optional; The workspace where the semantic model resides.

Returns

A string containing the SKU size for a workspace.


import_vertipaq_analyzer

Imports and visualizes the vertipaq analyzer info from a saved .zip file in your lakehouse.

import fabric_cat_tools as fct
fct.import_vertipaq_analyzer(
          folder_path = '/lakehouse/default/Files/VertipaqAnalyzer'
          ,file_name = 'Workspace Name-DatasetName.zip'
          )

Parameters

folder_path str

Required; Folder within your lakehouse in which the .zip file containing the vertipaq analyzer info has been saved.

file_name str

Required; File name of the file which contains the vertipaq analyzer info.


launch_report

Shows a Power BI report within a Fabric notebook.

import fabric_cat_tools as fct
fct.launch_report(
          report = 'MyReport'
          #,workspace = None
          )

Parameters

report str

Required; The name of the report.

workspace str

Optional; The name of the workspace in which the report resides.


list_dashboards

Shows the dashboards within the workspace.

import fabric_cat_tools as fct
fct.list_dashboards(
            #workspace = '' 
            )

Parameters

workspace str

Optional; The workspace name.

Returns

A pandas dataframe showing the dashboards which exist in the workspace.


list_dataflow_storage_accounts

Shows the dataflow storage accounts.

import fabric_cat_tools as fct
fct.list_dataflow_storage_accounts()

Parameters

None

Returns

A pandas dataframe showing the accessible dataflow storage accounts.


list_direct_lake_model_calc_tables

Shows the calculated tables and their respective DAX expressions for a Direct Lake model (which has been migrated from import/DirectQuery).

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.list_direct_lake_model_calc_tables(
            dataset = 'AdventureWorks'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A pandas dataframe showing the calculated tables which were migrated to Direct Lake and whose DAX expressions are stored as model annotations.


list_lakehouses

Shows the properties associated with lakehouses in a workspace.

import fabric_cat_tools as fct
fct.list_lakehouses(
            workspace = None
            )

Parameters

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A pandas dataframe showing the properties of all lakehouses in a workspace.


list_shortcuts

Shows the shortcuts within a lakehouse.

import fabric_cat_tools as fct
fct.list_shortcuts(
            lakehouse = 'MyLakehouse'
            #,workspace = '' 
            )

Parameters

lakehouse str

Optional; Name of the lakehouse.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A pandas dataframe showing the shortcuts which exist in a given lakehouse and their properties.


measure_dependency_tree

Shows a measure dependency tree of all dependent objects for a measure in a semantic model.

import fabric_cat_tools as fct
fct.measure_dependency_tree(
            dataset = 'AdventureWorks'
            ,measure_name = 'Sales Amount'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

measure_name str

Required; Name of the measure to use for building a dependency tree.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A tree view showing the dependencies for a given measure within the semantic model.


migrate_calc_tables_to_lakehouse

Creates delta tables in your lakehouse based on the DAX expression of a calculated table in an import/DirectQuery semantic model. The DAX expression encapsulating the calculated table logic is stored in the new Direct Lake semantic model as model annotations.

Note

This function is specifically relevant for import/DirectQuery migration to Direct Lake

import fabric_cat_tools as fct
fct.migrate_calc_tables_to_lakehouse(
            dataset = 'AdventureWorks'
            ,new_dataset = 'AdventureWorksDL'
            #,workspace = ''
            #,new_dataset_workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

new_dataset str

Required; Name of the Direct Lake semantic model.

workspace str

Optional; The workspace where the semantic model resides.

new_dataset_workspace str

Optional; The workspace to be used by the Direct Lake semantic model.

lakehouse str

Optional; The lakehouse to be used by the Direct Lake semantic model.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


migrate_calc_tables_to_semantic_model

Creates new tables in the Direct Lake semantic model based on the lakehouse tables created using the 'migrate_calc_tables_to_lakehouse' function.

Note

This function is specifically relevant for import/DirectQuery migration to Direct Lake

import fabric_cat_tools as fct
fct.migrate_calc_tables_to_semantic_model(
            dataset = 'AdventureWorks'
            ,new_dataset = 'AdventureWorksDL'
            #,workspace = ''
            #,new_dataset_workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

new_dataset str

Required; Name of the Direct Lake semantic model.

workspace str

Optional; The workspace where the semantic model resides.

new_dataset_workspace str

Optional; The workspace to be used by the Direct Lake semantic model.

lakehouse str

Optional; The lakehouse to be used by the Direct Lake semantic model.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


migrate_field_parameters

Migrates field parameters from one semantic model to another.

Note

This function is specifically relevant for import/DirectQuery migration to Direct Lake

import fabric_cat_tools as fct
fct.migrate_field_parameters(
            dataset = 'AdventureWorks'
            ,new_dataset = ''
            #,workspace = ''
            #,new_dataset_workspace = ''
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

new_dataset str

Required; Name of the Direct Lake semantic model.

workspace str

Optional; The workspace where the semantic model resides.

new_dataset_workspace str

Optional; The workspace to be used by the Direct Lake semantic model.

Returns

A printout stating the success/failure of the operation.


migrate_model_objects_to_semantic_model

Adds the rest of the model objects (besides tables/columns) and their properties to a Direct Lake semantic model based on an import/DirectQuery semantic model.

Note

This function is specifically relevant for import/DirectQuery migration to Direct Lake

import fabric_cat_tools as fct
fct.migrate_model_objects_to_semantic_model(
            dataset = 'AdventureWorks'
            ,new_dataset = ''
            #,workspace = ''
            #,new_dataset_workspace = ''
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

new_dataset str

Required; Name of the Direct Lake semantic model.

workspace str

Optional; The workspace where the semantic model resides.

new_dataset_workspace str

Optional; The workspace to be used by the Direct Lake semantic model.

Returns

A printout stating the success/failure of the operation.


migrate_tables_columns_to_semantic_model

Adds tables/columns to the new Direct Lake semantic model based on an import/DirectQuery semantic model.

Note

This function is specifically relevant for import/DirectQuery migration to Direct Lake

import fabric_cat_tools as fct
fct.migrate_tables_columns_to_semantic_model(
            dataset = 'AdventureWorks'
            ,new_dataset = 'AdventureWorksDL'
            #,workspace = ''
            #,new_dataset_workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''
            )

Parameters

dataset str

Required; Name of the import/DirectQuery semantic model.

new_dataset str

Required; Name of the Direct Lake semantic model.

workspace str

Optional; The workspace where the semantic model resides.

new_dataset_workspace str

Optional; The workspace to be used by the Direct Lake semantic model.

lakehouse str

Optional; The lakehouse to be used by the Direct Lake semantic model.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


model_bpa_rules

Shows the default Best Practice Rules for semantic models, as used by the run_model_bpa function.

import fabric_cat_tools as fct
fct.model_bpa_rules()

Returns

A pandas dataframe showing the default semantic model best practice rules.


optimize_lakehouse_tables

Runs the OPTIMIZE function over the specified lakehouse tables.

import fabric_cat_tools as fct
fct.optimize_lakehouse_tables(
            tables = ['Sales', 'Calendar']
            #,lakehouse = None
            #,workspace = None
        )
import fabric_cat_tools as fct
fct.optimize_lakehouse_tables(
            tables = None
            #,lakehouse = 'MyLakehouse'
            #,workspace = 'MyNewWorkspace'
        )

Parameters

tables str or list of str

Required; Name(s) of the lakehouse delta table(s) to optimize. If 'None' is entered, all of the delta tables in the lakehouse will be queued to be optimized.

lakehouse str

Optional; Name of the lakehouse.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


refresh_calc_tables

Recreates the delta tables in the lakehouse based on the DAX expressions stored as model annotations in the Direct Lake semantic model.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.refresh_calc_tables(
            dataset = 'AdventureWorks'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


refresh_semantic_model

Performs a refresh on a semantic model.

import fabric_cat_tools as fct
fct.refresh_semantic_model(
            dataset = 'AdventureWorks'
            ,refresh_type = 'full'
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

refresh_type str

Optional; Type of processing to perform. Options: ('full', 'automatic', 'dataOnly', 'calculate', 'clearValues', 'defragment'). Default value: 'full'.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


remove_column

Removes a column (or multiple columns) in a semantic model.

import fabric_cat_tools as fct
fct.remove_column(
            dataset = 'AdventureWorks'
            ,table_name = ['Internet Sales', 'Geography']
            ,column_name = ['SalesAmount', 'GeographyKey']
            #,workspace = None
            )

Parameters

dataset str

Required; Name of the semantic model.

table_name str or list of str

Required; Name of the column's table(s).

column_name str or list of str

Required; Name of the column(s).

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


remove_measure

Removes a measure (or multiple measures) in a semantic model.

import fabric_cat_tools as fct
fct.remove_measure(
            dataset = 'AdventureWorks'
            ,measure_name = ['Sales Amount', 'Order Quantity']
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

measure_name str or list of str

Required; Name of the measure(s).

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


remove_table

Removes a table (or multiple tables) in a semantic model.

import fabric_cat_tools as fct
fct.remove_table(
            dataset = 'AdventureWorks'
            ,table_name = ['Internet Sales', 'Geography']
            #,workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model.

table_name str or list of str

Required; Name of the table(s).

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


report_rebind

Rebinds a report to a semantic model.

import fabric_cat_tools as fct
fct.report_rebind(
            report = ''
            ,dataset = ''
            #,report_workspace = ''
            #,dataset_workspace = ''
            )

Parameters

report str

Required; Name of the report.

dataset str

Required; Name of the semantic model to rebind to the report.

report_workspace str

Optional; The workspace where the report resides.

dataset_workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


report_rebind_all

Rebinds all reports in a workspace which are bound to a specific semantic model to a new semantic model.

import fabric_cat_tools as fct
fct.report_rebind_all(
            dataset = ''
            ,new_dataset = ''
            #,dataset_workspace = '' 
            #,new_dataset_workspace = '' 
            #,report_workspace = '' 
            )

Parameters

dataset str

Required; Name of the semantic model currently bound to the reports.

new_dataset str

Required; Name of the semantic model to rebind to the reports.

dataset_workspace str

Optional; The workspace where the original semantic model resides.

new_dataset_workspace str

Optional; The workspace where the new semantic model resides.

report_workspace str

Optional; The workspace where the reports reside.

Returns

A printout stating the success/failure of the operation.


resolve_lakehouse_name

Returns the name of the lakehouse for a given lakehouse Id.

import fabric_cat_tools as fct
fct.resolve_lakehouse_name(
        lakehouse_id = ''
        #,workspace = '' 
        )

Parameters

lakehouse_id UUID

Required; UUID object representing a lakehouse.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A string containing the lakehouse name.


resolve_lakehouse_id

Returns the ID of a given lakehouse.

import fabric_cat_tools as fct
fct.resolve_lakehouse_id(
        lakehouse = 'MyLakehouse'
        #,workspace = '' 
        )

Parameters

lakehouse str

Required; Name of the lakehouse.

workspace str

Optional; The workspace where the lakehouse resides.

Returns

A string containing the lakehouse ID.


resolve_dataset_id

Returns the ID of a given semantic model.

import fabric_cat_tools as fct
fct.resolve_dataset_id(
        dataset = 'MyReport'
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A string containing the semantic model ID.


resolve_dataset_name

Returns the name of the semantic model for a given semantic model ID.

import fabric_cat_tools as fct
fct.resolve_dataset_name(
        dataset_id = ''
        #,workspace = '' 
        )

Parameters

dataset_id UUID

Required; UUID object representing a semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A string containing the semantic model name.


resolve_report_id

Returns the ID of a given report.

import fabric_cat_tools as fct
fct.resolve_report_id(
        report = 'MyReport'
        #,workspace = '' 
        )

Parameters

report str

Required; Name of the report.

workspace str

Optional; The workspace where the report resides.

Returns

A string containing the report ID.


resolve_report_name

Returns the name of the report for a given report ID.

import fabric_cat_tools as fct
fct.resolve_report_name(
        report_id = ''
        #,workspace = '' 
        )

Parameters

report_id UUID

Required; UUID object representing a report.

workspace str

Optional; The workspace where the report resides.

Returns

A string containing the report name.


run_dax

Runs a DAX query against a semantic model.

import fabric_cat_tools as fct
fct.run_dax(
            dataset = 'AdventureWorks'
            ,dax_query = "EVALUATE 'Internet Sales'"
            #,user_name = None
            #,workspace = ''          
            )

Parameters

dataset str

Required; Name of the semantic model.

dax_query str

Required; The DAX query to be executed.

user_name str

Optional; The user name to use when executing the DAX query.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A pandas dataframe with the results of the DAX query.


run_model_bpa

Runs the Best Practice Rules against a semantic model.

import fabric_cat_tools as fct
fct.run_model_bpa(
        dataset = 'AdventureWorks'
        #,workspace = ''
        )
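
A hedged variant using the optional parameters documented below to return the results as a dataframe and export them to a delta table in the lakehouse:

import fabric_cat_tools as fct
bpa_df = fct.run_model_bpa(
        dataset = 'AdventureWorks'
        ,return_dataframe = True
        ,export = True
        #,workspace = ''
        )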

Parameters

dataset str

Required; Name of the semantic model.

rules_dataframe

Optional; A pandas dataframe including rules to be analyzed.

workspace str

Optional; The workspace where the semantic model resides.

return_dataframe bool

Optional; Returns a pandas dataframe instead of the visualization.

export bool

Optional; Exports the results to a delta table in the lakehouse.

Returns

A visualization showing objects which violate each Best Practice Rule by rule category.


show_unsupported_direct_lake_objects

Returns a list of a semantic model's objects which are not supported by Direct Lake based on official documentation.

import fabric_cat_tools as fct
fct.show_unsupported_direct_lake_objects(
        dataset = 'AdventureWorks'
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A list of objects (tables/columns/relationships) within the semantic model which are currently not supported by Direct Lake mode.


update_direct_lake_model_lakehouse_connection

Remaps a Direct Lake semantic model's SQL Endpoint connection to a new lakehouse.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.update_direct_lake_model_lakehouse_connection(
            dataset = ''
            #,lakehouse = ''
            #,workspace = ''
            )

Parameters

dataset str

Required; Name of the semantic model.

lakehouse str

Optional; Name of the lakehouse.

workspace str

Optional; The workspace where the semantic model resides.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


update_direct_lake_partition_entity

Remaps a table (or tables) in a Direct Lake semantic model to a table in a lakehouse.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.update_direct_lake_partition_entity(
            dataset = 'AdventureWorks'
            ,table_name = 'Internet Sales'
            ,entity_name = 'FACT_InternetSales'
            #,workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''            
            )
import fabric_cat_tools as fct
fct.update_direct_lake_partition_entity(
            dataset = 'AdventureWorks'
            ,table_name = ['Internet Sales', 'Geography']
            ,entity_name = ['FACT_InternetSales', 'DimGeography']
            #,workspace = ''
            #,lakehouse = ''
            #,lakehouse_workspace = ''            
            )

Parameters

dataset str

Required; Name of the semantic model.

table_name str or list of str

Required; Name of the table in the semantic model.

entity_name str or list of str

Required; Name of the lakehouse table to be mapped to the semantic model table.

workspace str

Optional; The workspace where the semantic model resides.

lakehouse str

Optional; Name of the lakehouse.

lakehouse_workspace str

Optional; The workspace where the lakehouse resides.

Returns

A printout stating the success/failure of the operation.


update_item

Updates the name and/or description of a Fabric item.

import fabric_cat_tools as fct
fct.update_item(
            item_type = 'Lakehouse'
            ,current_name = 'MyLakehouse'
            ,new_name = 'MyNewLakehouse'
            #,description = 'This is my new lakehouse'
            #,workspace = None
            )

Parameters

item_type str

Required; Type of item to update. Valid options: 'DataPipeline', 'Eventstream', 'KQLDatabase', 'KQLQueryset', 'Lakehouse', 'MLExperiment', 'MLModel', 'Notebook', 'Warehouse'.

current_name str

Required; Current name of the item.

new_name str

Required; New name of the item.

description str

Optional; New description of the item.

workspace str

Optional; The workspace where the item resides.

Returns

A printout stating the success/failure of the operation.


vertipaq_analyzer

Extracts the vertipaq analyzer statistics from a semantic model.

import fabric_cat_tools as fct
fct.vertipaq_analyzer(
        dataset = 'AdventureWorks'
        #,workspace = ''
        ,export = None
        )
import fabric_cat_tools as fct
fct.vertipaq_analyzer(
        dataset = 'AdventureWorks'
        #,workspace = ''
        ,export = 'zip'
        )
import fabric_cat_tools as fct
fct.vertipaq_analyzer(
        dataset = 'AdventureWorks'
        #,workspace = ''
        ,export = 'table'
        )
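
When export = 'zip' is used, the saved file can later be re-loaded with the import_vertipaq_analyzer function documented above. The folder and file name below are assumptions inferred from that function's example:

import fabric_cat_tools as fct

# Hedged sketch: export the Vertipaq Analyzer results to a zip file, then import them later.
fct.vertipaq_analyzer(
        dataset = 'AdventureWorks'
        ,export = 'zip'
        )
fct.import_vertipaq_analyzer(
          folder_path = '/lakehouse/default/Files/VertipaqAnalyzer'  # assumed export folder
          ,file_name = 'MyWorkspace-AdventureWorks.zip'              # assumed '<workspace>-<dataset>.zip' naming
          )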

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

export str

Optional; Specifying 'zip' will export the results to a zip file in your lakehouse (which can be imported using the import_vertipaq_analyzer function). Specifying 'table' will export the results to delta tables (appended) in your lakehouse. Default value: None.

lakehouse_workspace str

Optional; The workspace in which the lakehouse used by a Direct Lake semantic model resides.

read_stats_from_data bool

Optional; Setting this parameter to True causes the function to retrieve column cardinality and missing rows using DAX (Direct Lake semantic models achieve this using a Spark query to the lakehouse).

Returns

A visualization of the Vertipaq Analyzer statistics.


warm_direct_lake_cache_perspective

Warms the cache of a Direct Lake semantic model by running a simple DAX query against the columns in a perspective.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.warm_direct_lake_cache_perspective(
        dataset = 'AdventureWorks'
        ,perspective = 'WarmCache'
        ,add_dependencies = True
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

perspective str

Required; Name of the perspective which contains objects to be used for warming the cache.

add_dependencies bool

Optional; Includes object dependencies in the cache warming process.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.


warm_direct_lake_cache_isresident

Performs a refresh on the semantic model and puts the columns which were in memory prior to the refresh back into memory.

Note

This function is only relevant to semantic models in Direct Lake mode.

import fabric_cat_tools as fct
fct.warm_direct_lake_cache_isresident(
        dataset = 'AdventureWorks'
        #,workspace = '' 
        )

Parameters

dataset str

Required; Name of the semantic model.

workspace str

Optional; The workspace where the semantic model resides.

Returns

A printout stating the success/failure of the operation.



Direct Lake migration

The following process automates the migration of an import/DirectQuery model to a new Direct Lake model. The first step is specifically applicable to models which use Power Query to perform data transformations. If your model does not use Power Query, you must migrate the base tables used in your semantic model to a Fabric lakehouse.

Check out Nikola Ilic's terrific blog post on this topic!

Check out my blog post on this topic!

Prerequisites

  • Make sure you enable XMLA Read/Write for your capacity
  • Make sure you have a lakehouse in a Fabric workspace
  • Enable the following setting: Workspace -> Workspace Settings -> General -> Data model settings -> Users can edit data models in the Power BI service

Instructions

  1. Download this notebook. Use version 0.2.1 or higher only.
  2. Make sure you are in the 'Data Engineering' persona. Click the icon at the bottom left corner of your Workspace screen and select 'Data Engineering'
  3. In your workspace, select 'New -> Import notebook' and import the notebook from step 1.
  4. Add your lakehouse to your Fabric notebook
  5. Follow the instructions within the notebook.

The migration process

Note

The first 4 steps are only necessary if you have logic in Power Query. Otherwise, you will need to migrate your semantic model source tables to lakehouse tables.

  1. The first step of the notebook creates a Power Query Template (.pqt) file which eases the migration of Power Query logic to Dataflows Gen2.
  2. After the .pqt file is created, sync files from your OneLake file explorer
  3. Navigate to your lakehouse (this is critical!). From your lakehouse, create a new Dataflows Gen2, and import the Power Query Template file. Doing this step from your lakehouse will automatically set the destination for all tables to this lakehouse (instead of having to manually map each one).
  4. Publish the Dataflow Gen2 and wait for it to finish creating the delta lake tables in your lakehouse.
  5. Back in the notebook, the next step will create your new Direct Lake semantic model with the name of your choice, taking all the relevant properties from the original semantic model and refreshing/framing your new semantic model.

Note

As of version 0.2.1, calculated tables are also migrated to Direct Lake (as data tables with their DAX expression stored as model annotations in the new semantic model). Additionally, Field Parameters are migrated as they were in the original semantic model (as a calculated table).

  6. Finally, you can easily rebind all reports which use the import/DirectQuery semantic model to the new Direct Lake semantic model in one click.

Completing these steps will do the following:

  • Offload your Power Query logic to Dataflows Gen2 inside of Fabric (where it can be maintained and development can continue).
  • Dataflows Gen2 will create delta tables in your Fabric lakehouse. These tables can then be used for your Direct Lake model.
  • Create a new semantic model in Direct Lake mode containing all the standard tables and columns, calculation groups, measures, relationships, hierarchies, roles, row level security, perspectives, and translations from your original semantic model.
  • Viable calculated tables are migrated to the new semantic model as data tables. Delta tables are dynamically generated in the lakehouse to support the Direct Lake model. The calculated table DAX logic is stored as model annotations in the new semantic model.
  • Field parameters are migrated to the new semantic model as they were in the original semantic model (as calculated tables). Any calculated columns used in field parameters are automatically removed in the new semantic model's field parameter(s).
  • Non-supported objects are not transferred (e.g. calculated columns, relationships using columns with unsupported data types, etc.).
  • Reports that use your original semantic model will be rebound to your new semantic model.
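
For orientation, here is a minimal sketch of how the core migration steps map onto fabric_cat_tools calls. This is not the migration notebook itself; the function names and parameters (create_blank_semantic_model, migrate_tables_columns_to_semantic_model, migrate_model_objects_to_semantic_model, lakehouseName) are taken from examples elsewhere on this page, and the model/lakehouse names are placeholders.

import fabric_cat_tools as fct
import time

# Placeholder names for illustration only.
sourceDatasetName = 'AdventureWorks'     # existing import/DirectQuery semantic model
newDatasetName = 'AdventureWorks DL'     # new Direct Lake semantic model
lakehouseName = 'MyLakehouse'            # lakehouse holding the delta tables

# Create an empty semantic model to migrate into.
fct.create_blank_semantic_model(newDatasetName)

time.sleep(5)

# Re-create the tables/columns against the lakehouse, then migrate the model objects
# (measures, relationships, hierarchies, roles, row level security, etc.).
fct.migrate_tables_columns_to_semantic_model(sourceDatasetName, newDatasetName, lakehouseName = lakehouseName)
fct.migrate_model_objects_to_semantic_model(sourceDatasetName, newDatasetName)

# Rebinding reports from the old model to the new one is handled by the notebook's final step.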


fabric_cat_tools's Issues

Try to use fabric_cat_tools

Hello!
First of all, thank you for these fantastic tools.
I tried to use the fabric_cat_tools method run_model_bpa, but I get this error:
KeyError: "None of [Index(['Hierarchy Name'], dtype='object')] are in the [columns]"

I don't understand why this happened.
Versions:
Runtime 1.2
Spark 3.4
Delta 2.4

I attached screenshots of the code (images not reproduced here).

Thank you!
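
The attached screenshots are not reproduced above; a minimal call that exercises this code path would look roughly like the following (the dataset name is a placeholder, and the workspace argument is optional):

import fabric_cat_tools as fct

fct.run_model_bpa(
        dataset = 'AdventureWorks'  # placeholder; use the name of your semantic model
        #,workspace = ''
        )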

run_model_bpa - KeyError: 'To Cardinality'

Hi there, thanks for your awesome tools.

I was trying to use "run_model_bpa", and when I ran it I got the error below. This is connecting to a Direct Lake semantic model that only has a single table. I think it is failing because there are no relationships to traverse (I did try this on a semantic model which does have relationships, and it works as expected).

`---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/indexes/base.py:3653, in Index.get_loc(self, key)
3652 try:
-> 3653 return self._engine.get_loc(casted_key)
3654 except KeyError as err:

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/_libs/index.pyx:147, in pandas._libs.index.IndexEngine.get_loc()

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/_libs/index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:7080, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:7088, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'To Cardinality'

The above exception was the direct cause of the following exception:

KeyError Traceback (most recent call last)
Cell In[25], line 2
1 import fabric_cat_tools as fct
----> 2 fct.run_model_bpa(
3 dataset = 'OneLake Storage Semantic Model'
4 # ,workspace = 'NYC-Taxi'
5 ,return_dataframe = True
6 )

File /nfs4/pyenv-ecefaf67-281f-4ba1-9ddb-c957bfa6b98a/lib/python3.10/site-packages/fabric_cat_tools/ModelBPA.py:394, in run_model_bpa(dataset, rules_dataframe, workspace, **kwargs)
391 dfP['Has Date Table'] = hasDateTable
393 # Set dims to dual mode
--> 394 dfR_one = dfR[dfR['To Cardinality'] == 'One']
395 dfTP = dfP_imp.groupby('Table Name')['Partition Name'].count().reset_index()
396 dfTP.rename(columns={'Partition Name': 'Import Partitions'}, inplace=True)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/frame.py:3761, in DataFrame.getitem(self, key)
3759 if self.columns.nlevels > 1:
3760 return self._getitem_multilevel(key)
-> 3761 indexer = self.columns.get_loc(key)
3762 if is_integer(indexer):
3763 indexer = [indexer]

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/core/indexes/base.py:3655, in Index.get_loc(self, key)
3653 return self._engine.get_loc(casted_key)
3654 except KeyError as err:
-> 3655 raise KeyError(key) from err
3656 except TypeError:
3657 # If we have a listlike key, _check_indexing_error will raise
3658 # InvalidIndexError. Otherwise we fall through and re-raise
3659 # the TypeError.
3660 self._check_indexing_error(key)

KeyError: 'To Cardinality'`
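
The failure occurs at the dfR['To Cardinality'] lookup: with no relationships in the model, the relationships dataframe has no such column. A possible guard (a sketch only, not the library's actual fix; dfR here is a hypothetical stand-in for the dataframe run_model_bpa builds internally) looks like this:

import pandas as pd

# Hypothetical stand-in for the relationships dataframe built inside run_model_bpa;
# for a single-table model with no relationships it can come back empty.
dfR = pd.DataFrame()

# Guard the cardinality filter so an empty dataframe does not raise KeyError.
if 'To Cardinality' in dfR.columns:
    dfR_one = dfR[dfR['To Cardinality'] == 'One']
else:
    dfR_one = dfR  # no relationships, nothing to filter

print(len(dfR_one))  # 0 when there are no relationships to traverse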

Unable to migrate calculated tables into Direct Lake mode from an import mode semantic model

Describe the bug

I'm trying to migrate an existing import mode semantic model into Direct Lake mode. We have one calculated table which is created using the CALENDAR() DAX function. During the migration we are able to migrate the calculated table into the lakehouse, but fail to load the same calendar table into the Direct Lake semantic model. I'm noting down the reference links I followed below, along with screenshots. Could you please investigate the issue and provide a solution?

I have followed all the steps mentioned in the links below:

https://github.com/m-kovalsky/fabric_cat_tools/blob/main/Model%20Optimization.ipynb
https://data-mozart.com/migrate-existing-power-bi-semantic-models-to-direct-lake-a-step-by-step-guide/

![Calculated table error](https://github.com/m-kovalsky/fabric_cat_tools/assets/168816400/60c48825-35f3-42c5-b4d6-eb95b9577670)

Introduce a setting, similar to DAX Studio, to not query Direct Lake tables for column statistics but just report the memory footprint as is.

Is your feature request related to a problem? Please describe.
When you run Vertipaq Analyzer, DAX Studio supports a setting called "read statistics from data" in the options which prevents reading actual data from DirectQuery or Direct Lake models.
As noted in the function signature, it is necessary not to query the model, otherwise all columns will be loaded into memory. Querying with Spark can also introduce a very long runtime / cost if those queries are run against the lakehouse. I need to wait 20+ minutes, while I am actually only interested in the current memory usage of the model (and not so much in columns that are not loaded).

Describe the solution you'd like
Please introduce a new parameter or a generic configuration option to disable querying Vertipaq Analyzer column statistics via Spark.

Describe alternatives you've considered
An alternative would be simply not to run those queries, as DAX Studio supports.

Additional context
See problem description

fct.get_lakehouse_tables - HTTPError: HTTP Error 404: Not Found

Hi there,

I am using version 0.3.0, and when I try to run the following notebook function:

import fabric_cat_tools as fct

df_tableSizes = fct.get_lakehouse_tables(
        extended = True)

I then get the error below; this was working on Thursday.

`HTTPError Traceback (most recent call last)
Cell In[10], line 10
4 import fabric_cat_tools as fct
6 # Get Table Details
7 # Reference: https://github.com/m-kovalsky/fabric_cat_tools?tab=readme-ov-file#get_lakehouse_tables
8
9 # Get the Lakehouse Table sizes into Pandas Dataframe
---> 10 df_tableSizes = fct.get_lakehouse_tables(
11 extended = True)
13 # ## Remove Invalid Characters from Column Names
14 df_tableSizes.columns = df_tableSizes.columns.str.replace(r'(.[)|(].)', '', regex=True)

File /nfs4/pyenv-e2048f8d-14e9-4483-9558-0f3a2c89bfda/lib/python3.10/site-packages/fabric_cat_tools/GetLakehouseTables.py:92, in get_lakehouse_tables(lakehouse, workspace, extended, count_rows)
90 else:
91 sku_value = get_sku_size(workspace)
---> 92 guardrail = get_directlake_guardrails_for_sku(sku_value)
94 spark = SparkSession.builder.getOrCreate()
96 intColumns = ['Files', 'Row Groups', 'Table Size']

File /nfs4/pyenv-e2048f8d-14e9-4483-9558-0f3a2c89bfda/lib/python3.10/site-packages/fabric_cat_tools/Guardrails.py:71, in get_directlake_guardrails_for_sku(sku_size)
56 def get_directlake_guardrails_for_sku(sku_size):
58 """
59
60 This function obtains guardrails for a given SKU size.
(...)
68 This function returns a pandas dataframe showing the guardrails for the SKU size.
69 """
---> 71 df = get_direct_lake_guardrails()
72 filtered_df = df[df['Fabric/Power BI SKUs'] == sku_size]
74 return filtered_df

File /nfs4/pyenv-e2048f8d-14e9-4483-9558-0f3a2c89bfda/lib/python3.10/site-packages/fabric_cat_tools/Guardrails.py:22, in get_direct_lake_guardrails()
7 """
8
9 This function shows the Direct Lake guardrails based on Microsoft documentation.
(...)
17 This function returns a pandas dataframe showing the guardrails by SKU.
18 """
20 url = 'https://learn.microsoft.com/power-bi/enterprise/directlake-overview'
---> 22 tables = pd.read_html(url)
23 df = tables[0]
24 df['Fabric/Power BI SKUs'] = df['Fabric/Power BI SKUs'].str.split('/')

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/util/_decorators.py:331, in deprecate_nonkeyword_arguments..decorate..wrapper(*args, **kwargs)
325 if len(args) > num_allow_args:
326 warnings.warn(
327 msg.format(arguments=_format_argument_list(allow_args)),
328 FutureWarning,
329 stacklevel=find_stack_level(),
330 )
--> 331 return func(*args, **kwargs)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/html.py:1205, in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, thousands, encoding, decimal, converters, na_values, keep_default_na, displayed_only, extract_links)
1201 validate_header_arg(header)
1203 io = stringify_path(io)
-> 1205 return _parse(
1206 flavor=flavor,
1207 io=io,
1208 match=match,
1209 header=header,
1210 index_col=index_col,
1211 skiprows=skiprows,
1212 parse_dates=parse_dates,
1213 thousands=thousands,
1214 attrs=attrs,
1215 encoding=encoding,
1216 decimal=decimal,
1217 converters=converters,
1218 na_values=na_values,
1219 keep_default_na=keep_default_na,
1220 displayed_only=displayed_only,
1221 extract_links=extract_links,
1222 )

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/html.py:986, in _parse(flavor, io, match, attrs, encoding, displayed_only, extract_links, **kwargs)
983 p = parser(io, compiled_match, attrs, encoding, displayed_only, extract_links)
985 try:
--> 986 tables = p.parse_tables()
987 except ValueError as caught:
988 # if io is an io-like object, check if it's seekable
989 # and try to rewind it before trying the next parser
990 if hasattr(io, "seekable") and io.seekable():

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/html.py:262, in _HtmlFrameParser.parse_tables(self)
254 def parse_tables(self):
255 """
256 Parse and return all tables from the DOM.
257
(...)
260 list of parsed (header, body, footer) tuples from tables.
261 """
--> 262 tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
263 return (self._parse_thead_tbody_tfoot(table) for table in tables)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/html.py:821, in _LxmlFrameParser._build_doc(self)
819 pass
820 else:
--> 821 raise e
822 else:
823 if not hasattr(r, "text_content"):

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/html.py:802, in _LxmlFrameParser._build_doc(self)
800 try:
801 if is_url(self.io):
--> 802 with urlopen(self.io) as f:
803 r = parse(f, parser=parser)
804 else:
805 # try to parse the input in the simplest way

File ~/cluster-env/trident_env/lib/python3.10/site-packages/pandas/io/common.py:265, in urlopen(*args, **kwargs)
259 """
260 Lazy-import wrapper for stdlib urlopen, as that imports a big chunk of
261 the stdlib.
262 """
263 import urllib.request
--> 265 return urllib.request.urlopen(*args, **kwargs)

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:216, in urlopen(url, data, timeout, cafile, capath, cadefault, context)
214 else:
215 opener = _opener
--> 216 return opener.open(url, data, timeout)

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:525, in OpenerDirector.open(self, fullurl, data, timeout)
523 for processor in self.process_response.get(protocol, []):
524 meth = getattr(processor, meth_name)
--> 525 response = meth(req, response)
527 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:634, in HTTPErrorProcessor.http_response(self, request, response)
631 # According to RFC 2616, "2xx" code indicates that the client's
632 # request was successfully received, understood, and accepted.
633 if not (200 <= code < 300):
--> 634 response = self.parent.error(
635 'http', request, response, code, msg, hdrs)
637 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:557, in OpenerDirector.error(self, proto, *args)
555 http_err = 0
556 args = (dict, proto, meth_name) + args
--> 557 result = self._call_chain(*args)
558 if result:
559 return result

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
494 for handler in handlers:
495 func = getattr(handler, meth_name)
--> 496 result = func(*args)
497 if result is not None:
498 return result

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:749, in HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)
746 fp.read()
747 fp.close()
--> 749 return self.parent.open(new, timeout=req.timeout)

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:525, in OpenerDirector.open(self, fullurl, data, timeout)
523 for processor in self.process_response.get(protocol, []):
524 meth = getattr(processor, meth_name)
--> 525 response = meth(req, response)
527 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:634, in HTTPErrorProcessor.http_response(self, request, response)
631 # According to RFC 2616, "2xx" code indicates that the client's
632 # request was successfully received, understood, and accepted.
633 if not (200 <= code < 300):
--> 634 response = self.parent.error(
635 'http', request, response, code, msg, hdrs)
637 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:557, in OpenerDirector.error(self, proto, *args)
555 http_err = 0
556 args = (dict, proto, meth_name) + args
--> 557 result = self._call_chain(*args)
558 if result:
559 return result

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
494 for handler in handlers:
495 func = getattr(handler, meth_name)
--> 496 result = func(*args)
497 if result is not None:
498 return result

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:749, in HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)
746 fp.read()
747 fp.close()
--> 749 return self.parent.open(new, timeout=req.timeout)

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:525, in OpenerDirector.open(self, fullurl, data, timeout)
523 for processor in self.process_response.get(protocol, []):
524 meth = getattr(processor, meth_name)
--> 525 response = meth(req, response)
527 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:634, in HTTPErrorProcessor.http_response(self, request, response)
631 # According to RFC 2616, "2xx" code indicates that the client's
632 # request was successfully received, understood, and accepted.
633 if not (200 <= code < 300):
--> 634 response = self.parent.error(
635 'http', request, response, code, msg, hdrs)
637 return response

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:563, in OpenerDirector.error(self, proto, *args)
561 if http_err:
562 args = (dict, 'default', 'http_error_default') + orig_args
--> 563 return self._call_chain(*args)

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:496, in OpenerDirector._call_chain(self, chain, kind, meth_name, *args)
494 for handler in handlers:
495 func = getattr(handler, meth_name)
--> 496 result = func(*args)
497 if result is not None:
498 return result

File ~/cluster-env/trident_env/lib/python3.10/urllib/request.py:643, in HTTPDefaultErrorHandler.http_error_default(self, req, fp, code, msg, hdrs)
642 def http_error_default(self, req, fp, code, msg, hdrs):
--> 643 raise HTTPError(req.full_url, code, msg, hdrs, fp)

HTTPError: HTTP Error 404: Not Found`
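
From the traceback, get_direct_lake_guardrails scrapes the guardrail table from the Microsoft docs page with pandas.read_html, so the 404 points at the documentation URL rather than the lakehouse itself. Running the same call in isolation (URL copied from the traceback above) is a quick way to confirm that:

import pandas as pd

# URL copied from the traceback above; if this raises HTTP Error 404, the docs page
# has moved or redirects somewhere that no longer serves the table, and the library's
# scrape fails in the same way.
url = 'https://learn.microsoft.com/power-bi/enterprise/directlake-overview'
tables = pd.read_html(url)
print(f"{len(tables)} table(s) found")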

migrate_tables_columns_to_semantic_model - source- and target semantic models in different workspaces

Hi m-kovalsky,

First of all, thanks for this great tool.

I have a small issue with migrating tables and columns to another semantic model.

It all works fine if the source and target semantic models reside in the same workspace,
but I cannot seem to get it to work if my target semantic model is in a different workspace.

What I actually want to achieve is building new semantic models in a dev workspace and, once I am happy with them, pushing them to a production workspace.
Your tool indicates this could be possible.

Here is what I'm doing (copied from a notebook cell):
import fabric_cat_tools as fct
import time

newDatasetName = 'targetSM'  # Enter the new Direct Lake semantic model name

# I can create the new dataset in any workspace I want.
fct.create_blank_semantic_model(newDatasetName, workspaceName='DevWS')

time.sleep(5)

# Re-creates all tables and columns of the source semantic model (param 1) in the
# target semantic model (param 2). lakehouseName and lakehouseWorkspaceName define
# the workspace and lakehouse the data should come from (switching the data source
# works fine).
#
# PROBLEM: it does not seem to work if the source and target semantic models do not
# reside in the same workspace!
fct.migrate_tables_columns_to_semantic_model('sourceSM', 'targetSM', workspaceName='DevWS', lakehouseName='LH_Gold', lakehouseWorkspaceName='ProdWS')

fct.migrate_model_objects_to_semantic_model('sourceSM', 'targetSM', workspaceName='DevWS')

I am not sure whether this is an issue or a missing feature.

Thanks a million for any feedback.

Best Regards from Germany,
Marc
