Databricks
This guide provides detailed instructions for setting up and using Data Connect with Databricks.
The Data Connect Databricks integration allows you to sync Contentsquare data to Databricks to leverage behavioral data in other tools.
Prerequisites
Before setting up the Databricks integration, ensure you have:
- Admin or Architect privileges in Heap
- Access to an AWS-hosted Databricks account that uses the Unity Catalog
Configure Data Connect
Setup Process
To get started, navigate to Integrations > Directory, search for Databricks, then select it where it appears.
You’ll be prompted to provide the following information:
- Hostname: The ID of your Databricks account, which you can find in the account URL.
- Path: The path of the warehouse you are connecting via this integration.
- Catalog: The catalog that this data should sync to; if left blank, this integration will create a new catalog.
- Schema (optional): The schema that this data should sync to; if left blank, this integration will create a new schema.
- Token: This is required to allow Data Connect to write to the schema. The token must be a Personal Access Token (PAT) rather than an OAuth Token.
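Before filling in the form, it can help to sanity-check the values you have gathered. The following is a minimal sketch, not part of Data Connect: the `DatabricksConnection` class and `check_connection` function are hypothetical names introduced here, and the check that a Personal Access Token starts with `dapi` is an assumption based on the usual Databricks token format, so treat a failure as a hint rather than a verdict.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatabricksConnection:
    """Values for the setup form (hypothetical container, not a Data Connect API)."""
    hostname: str
    path: str
    token: str
    catalog: Optional[str] = None  # left blank: the integration creates a new catalog
    schema: Optional[str] = None   # left blank: the integration creates a new schema


def check_connection(cfg: DatabricksConnection) -> List[str]:
    """Return a list of problems found; an empty list means nothing looked wrong."""
    problems = []
    if not cfg.hostname.strip():
        problems.append("Hostname is required")
    if not cfg.path.strip():
        problems.append("Path is required")
    if not cfg.token.strip():
        problems.append("Token is required")
    elif not cfg.token.startswith("dapi"):
        # Assumption: Databricks PATs conventionally begin with "dapi";
        # an OAuth token would not match this prefix.
        problems.append("Token does not look like a Personal Access Token")
    return problems


if __name__ == "__main__":
    # Placeholder values only; substitute your own.
    cfg = DatabricksConnection(hostname="example-host", path="/example/path", token="dapi-example")
    for problem in check_connection(cfg):
        print(problem)
```

Catalog and Schema are left unchecked on purpose, since the integration creates them when they are blank.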
Once all those fields are populated, click the Connect button.
That’s it!
Once setup is complete, the first sync should appear within 48 hours and includes the following built-in tables:
- Pageviews
- Sessions
- Users
- user_migrations
You can create an all_events view in Databricks by setting up a query like this one:
    SELECT event_id, time, user_id, session_id, 'test_event_table' AS event_table_name
    FROM `TEST_DB`.`TEST_SCHEMA`.`TEST_EVENT_TABLE`
    UNION
    SELECT event_id, time, user_id, session_id, 'click_event_table' AS event_table_name
    FROM `SCHEMA`.`CLICK_EVENT_TABLE`
    UNION
    SELECT event_id, time, user_id, session_id, 'pageview_event_table' AS event_table_name
    FROM `SCHEMA`.`PAGEVIEW_EVENT_TABLE`
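When there are many event tables, writing the union by hand gets tedious. The sketch below generates the statement from a list of table names. It is one possible approach, not an official tool: `quote_ident` and `all_events_view_sql` are hypothetical helpers, the table names are placeholders, and the code assumes Databricks SQL's backtick-quoted identifiers and its `CREATE OR REPLACE VIEW` statement.

```python
from typing import Iterable


def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def all_events_view_sql(view_name: str, event_tables: Iterable[str]) -> str:
    """Build a CREATE OR REPLACE VIEW statement that unions the given event tables.

    Each entry in event_tables is a dot-separated name such as "schema.table".
    """
    selects = []
    for table in event_tables:
        parts = table.split(".")
        qualified = ".".join(quote_ident(p) for p in parts)
        # Label each row with the lowercase table name it came from.
        label = parts[-1].lower().replace("'", "''")
        selects.append(
            "SELECT event_id, time, user_id, session_id, "
            f"'{label}' AS event_table_name\nFROM {qualified}"
        )
    if not selects:
        raise ValueError("at least one event table is required")
    body = "\nUNION\n".join(selects)
    view = ".".join(quote_ident(p) for p in view_name.split("."))
    return f"CREATE OR REPLACE VIEW {view} AS\n{body}"


if __name__ == "__main__":
    # Placeholder names; replace with the schema and tables from your own sync.
    print(all_events_view_sql(
        "test_schema.all_events",
        ["test_schema.click_event_table", "test_schema.pageview_event_table"],
    ))
```

The generated statement keeps `UNION`, matching the query above. You can paste the printed output into a Databricks SQL editor and review it before running it.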
Limitations
- The All Events table is not synced to Databricks. As a workaround, you can create your own all_events view.
- Defined properties syncing is not supported during beta.
- Segments syncing is not supported during beta.