While Azure Purview provides an out-of-the-box user experience with Purview Studio, not all tasks are suited to the point-and-click nature of a graphical interface.
For example:
- Triggering a scan to run as part of an automated process.
- Monitoring for metadata changes in real-time.
- Building your own custom user experience.
Azure Purview provides several mechanisms for interacting with the underlying platform in an automated, programmatic fashion. These span both the control plane (i.e. Azure Resource Manager) and Azure Purview's multiple data planes (e.g. catalog, scanning, administration, and more).
This article summarizes the available options and offers guidance on when to use each.
Tool Type | Tool | Scenario | Management | Catalog | Scanning |
---|---|---|---|---|---|
Resource Management | | Infrastructure as Code | ✓ | | |
Command Line | | Interactive | ✓ | | |
API | | On-Demand | ✓ | ✓ | ✓ |
Streaming (Atlas Kafka) | | Real-Time | | ✓ | |
Streaming (Diagnostic Logs) | | Real-Time | | | ✓ |
SDK | | Custom Development | ✓ | ✓ | ✓ |
Azure Resource Manager (ARM) is Azure's deployment and management service. It provides a management layer that lets you create, update, and delete resources (e.g. an Azure Purview account) in your Azure subscription.
While the management service can be invoked in several ways (e.g. CLI, PowerShell, SDK), a common pattern is to build and deploy templates so that resources can be deployed consistently and repeatedly (i.e. Infrastructure as Code).
To implement infrastructure as code for Azure resources, including Azure Purview, we can build ARM templates using JSON or Bicep, or use open-source alternatives such as Terraform.
When to use?
- When Azure Purview must be deployed repeatedly, since templates ensure resources are deployed in a consistent manner.
- When coupled with deployment scripts, templated solutions can traverse the control and data planes, helping facilitate the deployment of end-to-end solutions (e.g. provision an Azure Purview account, register sources, trigger scans).
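As a rough illustration of the template approach, the sketch below assembles a minimal ARM template for an Azure Purview account and serializes it to JSON. ARM templates are usually authored directly in JSON or Bicep; Python is used here only to show the shape of the document. The function name `build_purview_template`, the parameter names, and the API version `2021-07-01` are assumptions for illustration, and a real deployment may need further properties (for example a SKU or network settings).

```python
import json


def build_purview_template(api_version="2021-07-01"):
    """Return a minimal ARM template (as a dict) for one Purview account.

    The API version and the property set are illustrative assumptions;
    check the current Microsoft.Purview/accounts schema before deploying.
    """
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # Hypothetical parameter names chosen for this sketch.
            "accountName": {"type": "string"},
            "location": {
                "type": "string",
                "defaultValue": "[resourceGroup().location]",
            },
        },
        "resources": [
            {
                "type": "Microsoft.Purview/accounts",
                "apiVersion": api_version,
                "name": "[parameters('accountName')]",
                "location": "[parameters('location')]",
                "identity": {"type": "SystemAssigned"},
                "properties": {},
            }
        ],
    }


# Serialize the template so it can be saved to a file and deployed
# with whichever deployment tool is in use.
template_json = json.dumps(build_purview_template(), indent=2)
```

Because the template is parameterized on the account name and location, the same file can be deployed to several environments with consistent results.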
Azure CLI and Azure PowerShell are command-line tools that enable you to manage Azure resources such as Azure Purview.
Note: Only a subset of Azure Purview control plane operations (e.g. account management) is currently available via the command line. For an up-to-date list of available commands, check the documentation (Azure CLI | Azure PowerShell).
- Azure CLI - A cross-platform tool that allows the execution of commands through a terminal using interactive command-line prompts or a script. Azure CLI has a purview extension that allows for the management of Azure Purview accounts (e.g. `az purview account`).
- Azure PowerShell - A cross-platform task automation program, consisting of a set of cmdlets for managing Azure resources. Azure PowerShell has a module called Az.Purview that allows for the management of Azure Purview accounts (e.g. `Get-AzPurviewAccount`).
When to use?
- Best suited for ad-hoc tasks and quick exploratory operations.
REST APIs are service endpoints that support sets of HTTP methods (e.g. `POST`, `GET`, `PUT`, `DELETE`), which perform create, read, update, or delete (CRUD) operations on the service's resources. Azure Purview exposes a large portion of its platform via multiple service endpoints.
When to use?
- When the required operations are not available via Azure CLI, Azure PowerShell, or native client libraries.
- Custom application development or process automation.
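To make the REST option more concrete, here is a minimal sketch that builds (but does not send) a catalog search request using only the Python standard library. The endpoint path, the `api-version` value, and the request body fields are assumptions based on the catalog search API and may differ for your account or API version. The helper name `build_search_request` is hypothetical, and obtaining the Azure AD access token is out of scope here.

```python
import json
import urllib.request


def build_search_request(account_name, access_token, keywords, limit=10):
    """Build a POST request for a catalog search query.

    Assumptions: the endpoint layout and api-version below are
    illustrative; confirm them against the REST API reference.
    """
    url = (
        f"https://{account_name}.purview.azure.com"
        "/catalog/api/search/query?api-version=2021-05-01-preview"
    )
    body = json.dumps({"keywords": keywords, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Example: prepare a request. Sending it would be a call such as
# urllib.request.urlopen(request), which needs a valid token.
request = build_search_request("contoso-purview", "<access-token>", "sales")
```

Separating request construction from sending keeps the call easy to inspect and test before it is wired into an automated process.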
Each Azure Purview account comes with a fully managed Event Hub, accessible via the Atlas Kafka endpoint, which can be found in the Azure Portal under Azure Purview Account > Properties. This allows you to monitor and react to Azure Purview events (i.e. consume) and to notify Azure Purview of events when they occur (i.e. publish).
- Consume Events - Azure Purview will send notifications about metadata changes to Kafka topic ATLAS_ENTITIES. Applications interested in metadata changes can monitor for these notifications. Supported operations include: `ENTITY_CREATE`, `ENTITY_UPDATE`, `ENTITY_DELETE`, `CLASSIFICATION_ADD`, `CLASSIFICATION_UPDATE`, `CLASSIFICATION_DELETE`.
- Publish Events - Azure Purview can be notified of metadata changes via notifications to Kafka topic ATLAS_HOOK. Supported operations include: `ENTITY_CREATE_V2`, `ENTITY_PARTIAL_UPDATE_V2`, `ENTITY_FULL_UPDATE_V2`, `ENTITY_DELETE_V2`.
When to use?
- Applications or processes that need to publish or consume catalog events (i.e. Apache Atlas) in real-time.
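A consumer of the ATLAS_ENTITIES topic typically branches on the operation type of each notification. The sketch below shows one way to do that for a single message value, leaving the Kafka client out entirely. The payload layout (a `message` object carrying `operationType` and `entity`) is an assumption based on the Apache Atlas notification format, and the function name `summarize_notification` is hypothetical.

```python
import json

# Operation names as listed for the ATLAS_ENTITIES topic.
ENTITY_OPERATIONS = {"ENTITY_CREATE", "ENTITY_UPDATE", "ENTITY_DELETE"}
CLASSIFICATION_OPERATIONS = {
    "CLASSIFICATION_ADD",
    "CLASSIFICATION_UPDATE",
    "CLASSIFICATION_DELETE",
}


def summarize_notification(raw_value):
    """Parse one notification value (str or bytes) into a small summary.

    Assumption: the envelope has a "message" object with "operationType"
    and "entity" keys; missing keys yield None rather than an error.
    """
    envelope = json.loads(raw_value)
    message = envelope.get("message", {})
    operation = message.get("operationType")
    entity = message.get("entity", {})

    if operation in ENTITY_OPERATIONS:
        kind = "entity"
    elif operation in CLASSIFICATION_OPERATIONS:
        kind = "classification"
    else:
        kind = "unknown"

    return {
        "kind": kind,
        "operation": operation,
        "type_name": entity.get("typeName"),
        "guid": entity.get("guid"),
    }


# Hand-written sample value standing in for a message read from the topic.
sample_value = json.dumps(
    {
        "message": {
            "operationType": "CLASSIFICATION_ADD",
            "entity": {"typeName": "azure_sql_table", "guid": "example-guid"},
        }
    }
).encode("utf-8")
summary = summarize_notification(sample_value)
```

In a real consumer, this function would be called on each record value returned by the Kafka client of your choice, with the summary driving whatever downstream action is needed.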
Similar to other Azure Services, Azure Purview can send platform logs and metrics via "Diagnostic settings" to one or more destinations (e.g. Log Analytics Workspace, Storage Account, or Event Hub). Available metrics include `Data Map Capacity Units`, `Data Map Storage Size`, `Scan Canceled`, `Scan Completed`, `Scan Failed`, and `Scan Time Taken`.
Once configured, the platform automatically sends these events to the destination as a JSON payload as they occur within the Purview account. From there, subscribing applications can consume and act on these events at scale to orchestrate downstream logic.
When to use?
- Applications or processes that need to consume diagnostic events (e.g. Scan Failed) in real-time.
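As an example of acting on these payloads, the sketch below filters a diagnostic JSON payload down to failed scan records. It assumes the payload wraps events in a `records` array and that scan events carry a `category` of `ScanStatusLogEvent` with a top-level `resultType`; these field names and values are assumptions to verify against the events your account actually emits. The function name `failed_scan_records` is hypothetical.

```python
import json


def failed_scan_records(payload, category="ScanStatusLogEvent", result="Failed"):
    """Return the records in a diagnostic payload that match a failed scan.

    Assumptions: events arrive under a top-level "records" list, and each
    record exposes "category" and "resultType" fields.
    """
    body = json.loads(payload)
    records = body.get("records", [])
    return [
        record
        for record in records
        if record.get("category") == category
        and record.get("resultType") == result
    ]


# Hand-written sample payload with one completed and one failed scan.
sample_payload = json.dumps(
    {
        "records": [
            {
                "category": "ScanStatusLogEvent",
                "resultType": "Completed",
                "properties": {"scanName": "scan-a"},
            },
            {
                "category": "ScanStatusLogEvent",
                "resultType": "Failed",
                "properties": {"scanName": "scan-b"},
            },
        ]
    }
)
failures = failed_scan_records(sample_payload)
```

Each returned record could then be passed to downstream logic, such as raising an alert or re-triggering the scan.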
Microsoft provides Azure SDKs to programmatically manage and interact with Azure services. Azure Purview client libraries are available in several languages (.NET, Java, JavaScript, and Python), designed to be consistent, approachable, and idiomatic.
When to use?
- Recommended over the REST API, since the native client libraries (where available) follow the conventions of the target language and so feel natural to the developer.
Azure SDK for .NET
- Docs | Nuget Azure.Analytics.Purview.Account
- Docs | Nuget Azure.Analytics.Purview.Administration
- Docs | Nuget Azure.Analytics.Purview.Catalog
- Docs | Nuget Azure.Analytics.Purview.Scanning
- Docs | Nuget Microsoft.Azure.Management.Purview
Azure SDK for Java
- Docs | Maven com.azure.analytics.purview.account
- Docs | Maven com.azure.analytics.purview.administration
- Docs | Maven com.azure.analytics.purview.catalog
- Docs | Maven com.azure.analytics.purview.scanning
- Docs | Maven com.azure.resourcemanager.purview
Azure SDK for JavaScript
- Docs | npm @azure-rest/purview-account
- Docs | npm @azure-rest/purview-administration
- Docs | npm @azure-rest/purview-catalog
- Docs | npm @azure-rest/purview-scanning
- Docs | npm @azure/arm-purview
Azure SDK for Python