Databricks

Learn how to set up a Databricks source connection

GrowthLoop Audience Platform connects directly to Databricks so you can leverage the customer data in your data lake to create audiences with just a few clicks.

Establishing the connection to Databricks involves three key steps, which we will walk through together.

  1. Creating an OAuth Service Principal or a Personal Access Token (PAT)
  2. Finding your connection metadata
  3. Providing your connection metadata and personal access token to the GrowthLoop app

Option 1 (Recommended) - Creating an OAuth Service Principal

Navigate to the User Management tab of your Databricks account console.

Navigate to the Service Principals tab.

Click the "Add a Service Principal" button on the right.

Give it a name and click the "Create" button.

Click on the service principal to open the Service Principal details page.

Click "Generate Secret" at the bottom.

Copy the Client ID and Client Secret pair before closing the modal.

Navigate to your Databricks workspace before proceeding to the section on "Finding Your Connection Metadata".
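If you'd like to confirm that the Client ID and Client Secret pair works before moving on, the sketch below builds an OAuth client-credentials token request using only the Python standard library. It assumes the workspace token endpoint is `/oidc/v1/token` and the scope is `all-apis`, which is our understanding of the Databricks machine-to-machine OAuth flow; please confirm both against the Databricks documentation. The hostname and credentials are placeholders, and `build_token_request` is a hypothetical helper name, not part of GrowthLoop or Databricks.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_host: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    # HTTP Basic auth carries the Client ID and Client Secret pair.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    # Assumed form fields for the machine-to-machine flow.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        url=f"https://{workspace_host}/oidc/v1/token",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_token_request(
    "example.cloud.databricks.com", "my-client-id", "my-client-secret"
)
print(request.get_method(), request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(request)` and read the JSON response. An error response at this point usually means the pair was copied incorrectly or the service principal does not yet have access to the workspace.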


Option 2 - Creating a Personal Access Token

Navigate to the Admin Settings page in the top right of your Databricks console.

Navigate to the Developer tab and click Manage next to Access tokens.

Click the Generate button to create a new token, then copy it into a file, as we'll need it later on.

Finding Your Connection Metadata

Navigate to SQL Warehouses in your workspace and select the warehouse you'd like to connect to.

Navigate to the Connection details tab and copy both of the following values, which we'll use in the next step:

  • Server hostname
  • HTTP path
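Copy-and-paste slips in these two values are a common cause of failed connections, so a quick sanity check can help. The sketch below is a minimal example that assumes the Server hostname is a bare host (no `https://` prefix) and that a SQL warehouse HTTP path looks like `/sql/1.0/warehouses/<warehouse-id>`. `check_connection_details` is a hypothetical helper name, and the pattern is an assumption you may need to adjust for your workspace.

```python
import re

# Assumed shape of a SQL warehouse HTTP path; adjust if yours differs.
HTTP_PATH_PATTERN = re.compile(r"^/sql/1\.0/warehouses/[0-9a-zA-Z]+$")


def check_connection_details(server_hostname: str, http_path: str) -> list[str]:
    """Return a list of problems found in the copied values (empty if none)."""
    problems = []
    hostname = server_hostname.strip()
    # The hostname field expects just the host, not a full URL.
    if "://" in hostname or "/" in hostname:
        problems.append(
            "Server hostname should be a bare host, without https:// or a path"
        )
    if not HTTP_PATH_PATTERN.match(http_path.strip()):
        problems.append(
            "HTTP path does not look like /sql/1.0/warehouses/<warehouse-id>"
        )
    return problems


# Placeholder values for illustration only.
print(check_connection_details("example.cloud.databricks.com", "/sql/1.0/warehouses/abc123"))
```

An empty list means both values have the expected shape; it does not prove that the warehouse exists or is reachable.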

If your Databricks warehouse is not running, click the Start button in the top right.

Providing your connection metadata and personal access token to the GrowthLoop app

Once you log in to your GrowthLoop account, provide the industry you belong to and agree to the terms & conditions to move ahead.

Pick Databricks as your data warehouse.

Provide the following details:

  • Name -- Anything you'd like to call this source connection
  • Description -- Details on the source connection
  • Catalog -- The name of the catalog in Databricks that you want to connect to by default
  • Hostname -- The Server hostname copied in the previous step
  • HTTP Path -- The HTTP path copied in the previous step
  • Client ID -- The Client ID copied in Option 1 (OAuth only)
  • Client Secret -- The Client Secret copied in Option 1 (OAuth only)
  • Access Token -- The personal access token copied in Option 2 (PAT only)
  • Dataset for snapshots -- The desired schema for storing application state ("flywheel_system" recommended)
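If you want to gather these values in one place before filling in the form, a small checklist structure such as the one below can help. This is only a sketch for your own notes, not a GrowthLoop API: the class name, the field names, and the rule that exactly one authentication option should be filled in are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabricksSourceSettings:
    """Hypothetical checklist of the values entered in the form."""

    name: str
    catalog: str
    hostname: str
    http_path: str
    client_id: Optional[str] = None       # Option 1 (OAuth)
    client_secret: Optional[str] = None   # Option 1 (OAuth)
    access_token: Optional[str] = None    # Option 2 (PAT)
    description: str = ""
    snapshot_dataset: str = "flywheel_system"  # recommended value from this guide

    def auth_method(self) -> str:
        """Return "oauth" or "pat"; assumes exactly one option is filled in."""
        has_oauth = bool(self.client_id and self.client_secret)
        has_pat = bool(self.access_token)
        if has_oauth == has_pat:
            raise ValueError(
                "Provide either Client ID + Client Secret or an Access Token"
            )
        return "oauth" if has_oauth else "pat"


# Placeholder values for illustration only.
settings = DatabricksSourceSettings(
    name="Databricks production",
    catalog="main",
    hostname="example.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    client_id="my-client-id",
    client_secret="my-client-secret",
)
print(settings.auth_method())
```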

Note -- If your connection fails on the first try (after a few minutes), give it one more try before you contact our team. This usually happens because the first attempt has to wake up your Databricks warehouse.
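If you test the connection from your own scripts, the same "try once more" advice can be expressed as a small retry wrapper. The sketch below assumes you supply your own `connect` callable (for example, one that opens a connection with your preferred Databricks client). `connect_with_retry` and its default of two attempts with a 30-second wait are illustrative choices, not values used by GrowthLoop.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T], attempts: int = 2, wait_seconds: float = 30.0
) -> T:
    """Call connect(), retrying if it raises, e.g. while the warehouse wakes up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("connect was never called")
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except Exception as error:
            last_error = error
            # Pause before the next attempt, but not after the final one.
            if attempt < attempts:
                time.sleep(wait_seconds)
    raise last_error
```

If the final attempt also fails, the wrapper re-raises the last error so that the underlying message is available when you contact support.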

You're now ready to build your first dataset in GrowthLoop.

Choose your Dataset and Table from the drop-downs, give the Table an alias and description, and define its unique key.

You can choose which fields from your data you'd like to always be available when building any future audiences from this dataset (e.g., total_credit_amount).

In the next step, provide the retention as well as the catalog in which to store your audience snapshots.

You're now all set to build your first audience.