Databricks

Securely connect your Connecty Data Environment to your Databricks database.

As a best practice, create a dedicated Databricks service principal scoped only to your chosen catalog (Hive or Unity Catalog) and use it to connect to your workspace host. This isolates your integration credentials and simplifies permission management, ensuring a seamless no-code connection.

Prerequisites

Host
M2M OAuth credentials or Personal Access Token
(Conditional) Hive catalog:
- SQL Warehouse HTTP Path or Cluster ID
(Conditional) Unity Catalog:
- SQL Warehouse HTTP Path

Host

Your Host is the domain portion of your Databricks workspace URL.

When you log into Databricks, your browser URL looks like:

https://<workspace-name>.cloud.databricks.com/?o=1234567890123456

Use only <workspace-name>.cloud.databricks.com (no protocol, no path, no query string).

✅ mycompany.cloud.databricks.com
❌ https://mycompany.cloud.databricks.com
❌ mycompany.cloud.databricks.com/?o=1234567890123456
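
If you are copying the value from a browser address bar, the trimming rule above can be automated. The following is a minimal sketch using only the Python standard library; the function name normalize_host is hypothetical and not part of Connecty or Databricks.

```python
from urllib.parse import urlsplit


def normalize_host(value: str) -> str:
    """Reduce a Databricks workspace URL to its bare host name."""
    value = value.strip()
    # urlsplit only fills in .hostname when a scheme separator is present,
    # so add one for inputs such as "mycompany.cloud.databricks.com/?o=1".
    if "://" not in value:
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError("no host found in input")
    return host


# Protocol, path, and query string are all dropped.
print(normalize_host("https://mycompany.cloud.databricks.com/?o=1234567890123456"))
# prints: mycompany.cloud.databricks.com
```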

M2M OAuth credentials

Databricks recommends using service principals for machine-to-machine access.

How to create a new service principal and generate credentials:

Click your user avatar (top-right) → Settings → Identity and access → Service Principals.
Click Add service principal.
Select an existing service principal and assign it to the workspace, or create a new one.
Assign the appropriate permissions to the service principal, including access to the required catalog(s).
On the page for the newly created service principal, open the Secrets tab and click Generate secret.
Copy the generated client_id and secret, and store them securely.
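
To check the generated credentials outside Connecty, you can request a token yourself with the OAuth client-credentials flow. The sketch below only builds the request and does not send it. It assumes the workspace-level token endpoint /oidc/v1/token and the all-apis scope, as described in Databricks' OAuth M2M documentation; verify both against the documentation for your cloud. The function name build_token_request is hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(host: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # Assumption: workspace-level token endpoint and the "all-apis" scope.
    url = f"https://{host}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The client_id and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the returned object to urllib.request.urlopen would send the request. A successful response is expected to be a JSON document containing an access token, but treat the exact response shape as something to confirm in the Databricks documentation.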

Personal Access Token

Personal Access Tokens are currently a legacy authentication method in Databricks. Connecty will support them for as long as Databricks does, but for new integrations, service principals with M2M OAuth are preferred.

How to generate a token:

Click your user avatar (top-right) → User Settings → Access Tokens.
Click Generate New Token, give it a name, and set an expiration.
Copy the token value — you won’t see it again.

⚠️ Warning:

Treat this token like a password. Don’t check it into source control.
Store it in environment variables or a secure vault.
✅ dapiXXXXXXXXXXXXXXXXXXXX - expected format of Databricks access token.
❌ (leaving blank) — connection will fail.
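
If you keep the token in an environment variable, a small check before pasting it into Connecty can catch the two failure cases above. This is a minimal sketch: the variable name DATABRICKS_TOKEN and the function name are illustrative choices, and the dapi prefix check is an assumption based only on the expected format shown above.

```python
import os


def load_databricks_token(env_var: str = "DATABRICKS_TOKEN") -> str:
    """Read the token from the environment and sanity-check its shape."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise ValueError(f"{env_var} is empty; the connection would fail")
    # Assumption: tokens start with "dapi", as in the expected format above.
    if not token.startswith("dapi"):
        raise ValueError(f"{env_var} does not look like a Databricks access token")
    return token
```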

Catalog Selection (Conditional Fields)

Databricks supports two catalog types. Fill only the fields required for your chosen catalog.

Hive Catalog

Use Hive catalog when connecting to classic clusters or SQL Warehouses without Unity Catalog.

If you’re using a SQL Warehouse, supply the SQL Warehouse HTTP Path.

Tip:
1. In Databricks UI, go to SQL → SQL Warehouses.
2. Click your warehouse → Connection Details → copy the JDBC/ODBC HTTP Path.

✅ /sql/1.0/warehouses/abcdef1234567890 - expected HTTP path format.
❌ (cluster HTTP path or cluster ID)

If you’re using a standard (compute) cluster, supply the Cluster ID instead.

Tip:
1. In Databricks UI, go to Compute → Clusters.
2. Click your cluster name → copy the Cluster ID from the URL or details.

✅ 1234-567890-abcd123 - expected cluster ID.
❌ /sql/1.0/warehouses/abcdef1234567890 - do not use HTTP path as cluster ID.

⚠️ Warning:

Provide either SQL Warehouse HTTP Path or Cluster ID — not both.
If both are filled, the connection may default to the wrong endpoint.
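
Because the two values are easy to paste into the wrong field, it can help to check a value's shape first. The sketch below classifies a string as an HTTP path or a cluster ID. Both patterns are assumptions inferred from the example values in this guide, not an official Databricks format specification, and the function name is hypothetical.

```python
import re

# Assumption: patterns inferred from the example values in this guide.
HTTP_PATH_RE = re.compile(r"^/sql/1\.0/warehouses/[0-9a-z]+$")
CLUSTER_ID_RE = re.compile(r"^\d{4}-\d{6}-[0-9a-z]+$")


def classify_endpoint(value: str) -> str:
    """Return "http_path", "cluster_id", or "unknown" for a field value."""
    value = value.strip()
    if HTTP_PATH_RE.match(value):
        return "http_path"
    if CLUSTER_ID_RE.match(value):
        return "cluster_id"
    return "unknown"
```

A value classified as "unknown" is not necessarily wrong; it only means it does not match the example formats shown here.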

Unity Catalog

Use Unity Catalog when your data lives under the Databricks Unity Catalog model.

SQL Warehouse HTTP Path is required.

In SQL → SQL Warehouses, select a warehouse that’s enabled for Unity Catalog, then copy its HTTP Path as above.

❌ Do not enter a Cluster ID when using Unity Catalog.

Putting It All Together

Below is an example configuration for each scenario:

Example: Hive Catalog + SQL Warehouse

host: mycompany.cloud.databricks.com
token: dapiXXXXXXXXXXXXXXXXXXXX (or client_id and secret for M2M OAuth)
sql_warehouse_http_path: /sql/1.0/warehouses/abcdef1234567890
cluster_id: # leave blank

Example: Hive Catalog + Standard Cluster

host: mycompany.cloud.databricks.com
token: dapiXXXXXXXXXXXXXXXXXXXX (or client_id and secret for M2M OAuth)
sql_warehouse_http_path: # leave blank
cluster_id: 1234-567890-abcd123

Example: Unity Catalog

host: mycompany.cloud.databricks.com
token: dapiXXXXXXXXXXXXXXXXXXXX (or client_id and secret for M2M OAuth)
sql_warehouse_http_path: /sql/1.0/warehouses/uvwxyz9876543210
cluster_id: # leave blank
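
The field rules for the three scenarios can be summarised as a small validator. This is a sketch under stated assumptions: the DatabricksConnection class, the "hive" and "unity" catalog labels, and the function name are hypothetical and only mirror the field names used in the examples above. It is not Connecty's own validation logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatabricksConnection:
    catalog: str  # hypothetical labels: "hive" or "unity"
    host: str
    sql_warehouse_http_path: str = ""
    cluster_id: str = ""


def config_problems(cfg: DatabricksConnection) -> List[str]:
    """Return a list of rule violations; an empty list means the config is consistent."""
    problems = []
    has_path = bool(cfg.sql_warehouse_http_path.strip())
    has_cluster = bool(cfg.cluster_id.strip())
    if not cfg.host.strip():
        problems.append("host is required")
    if cfg.catalog == "unity":
        # Unity Catalog: HTTP path required, cluster ID must stay blank.
        if not has_path:
            problems.append("Unity Catalog requires sql_warehouse_http_path")
        if has_cluster:
            problems.append("cluster_id must be left blank for Unity Catalog")
    elif cfg.catalog == "hive":
        # Hive: exactly one of the two fields, never both and never neither.
        if has_path == has_cluster:
            problems.append(
                "Hive catalog needs exactly one of sql_warehouse_http_path or cluster_id"
            )
    else:
        problems.append(f"unknown catalog type: {cfg.catalog!r}")
    return problems
```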

Query History support for Day Zero Semantic Layer

Connecty generally uses SQL query history to provide a high-quality semantic layer. For Databricks connections, Connecty synchronises query history based on SQL queries executed via Databricks SQL Warehouses. Fetching and parsing SQL queries executed inside standard Databricks jobs is currently not supported.

Unity Catalog Sync

Connecty supports synchronising its bespoke semantic layer back into Databricks Unity Catalog as metric views through the Context Engine Export process. This section focuses on the configuration and permissions required to enable the export. For more details about the export process itself, refer to Databricks Unity Catalog Sync.

Permissions

The Context Engine Export process writes Connecty semantic layer entities into Unity Catalog as metric views. To enable this, ensure that the configured service principal has write permissions on the target scope.

Configuration

Connecty exports semantic layer entities into a specified Databricks schema (catalog.schema). Before configuring the export, it is recommended to:

Verify that the service principal has write permissions on the selected schema.
Ensure that the selected schema is empty (optional but recommended).

Selecting Data Workspaces

A single Data Connection in Connecty may be used by multiple Data Workspaces. In the Export Configuration, you can select which Data Workspaces should be included in the export. For each Data Workspace, you can choose one of the following export classes:

All entities - all semantic layer entities will be exported.
Verified only - only verified semantic layer entities from that Data Workspace will be exported.

Handling Naming Conflicts

Because multiple Data Workspaces can export into the same target schema, naming conflicts may occur (for example, two Workspaces may have a subject named Products). Connecty provides two strategies to avoid these conflicts:

Prefix object names - the Data Workspace identifier is prepended to each exported metric view name, which takes the form catalog.schema.<dw_id>_<subject_name>.
Create a per-workspace schema - Connecty creates a new schema for each Data Workspace selected for the export, and exported metric view names take the form catalog.schema_<dw_id>.<subject_name>. This strategy requires that the configured service principal has permission to create new schemas in the given catalog.
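
To see how the two strategies keep names apart, the sketch below builds the fully qualified metric view name for each one, following the two name forms given above. The function name, the strategy labels "prefix" and "per_workspace_schema", and the sample identifiers are hypothetical.

```python
def metric_view_name(catalog: str, schema: str, dw_id: str,
                     subject_name: str, strategy: str) -> str:
    """Build the exported metric view name for the chosen conflict strategy."""
    if strategy == "prefix":
        # catalog.schema.<dw_id>_<subject_name>
        return f"{catalog}.{schema}.{dw_id}_{subject_name}"
    if strategy == "per_workspace_schema":
        # catalog.schema_<dw_id>.<subject_name>
        return f"{catalog}.{schema}_{dw_id}.{subject_name}"
    raise ValueError(f"unknown strategy: {strategy!r}")


# Two workspaces that both define a "products" subject no longer collide.
for dw in ("dw1", "dw2"):
    print(metric_view_name("main", "semantic", dw, "products", "prefix"))
    print(metric_view_name("main", "semantic", dw, "products", "per_workspace_schema"))
```

With either strategy, the Data Workspace identifier appears somewhere in the qualified name, so identically named subjects from different workspaces resolve to different objects.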

Last updated 1 month ago
