BigQuery

This guide provides detailed instructions for setting up and using Data Connect with Google BigQuery.

Google BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. The Data Connect integration with BigQuery allows you to automatically sync your Heap data to BigQuery for advanced analysis and joining with other business data.

Before setting up the BigQuery integration, ensure you:

  1. Have a Google Cloud Platform (GCP) project.
  2. Enable billing in the GCP project.
  3. Enable the BigQuery API.
  4. Know the region you want to use (see Supported Regions).
  5. (Optional) Decide on a name for your dataset (default is project_environment).

These prerequisites are also outlined in GCP’s quick-start guide.

In the GCP dashboard for your selected project, go to the IAM & admin settings and click + Add.

In the subsequent view, add heap-sql@heap-204122.iam.gserviceaccount.com as a BigQuery User and save the new permission.
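The same grant can be expressed as an IAM policy binding. The following is a minimal sketch, assuming the policy is available as a JSON-style dictionary (the shape that `gcloud projects get-iam-policy --format=json` returns). The helper `add_binding` and the sample policy are hypothetical and only illustrate the shape of the change; `roles/bigquery.user` is the role ID behind the BigQuery User label.

```python
import json

# Service account named in this guide.
SERVICE_ACCOUNT = "heap-sql@heap-204122.iam.gserviceaccount.com"
# Predefined IAM role ID that the console labels "BigQuery User".
BIGQUERY_USER_ROLE = "roles/bigquery.user"


def add_binding(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy dict with `member` added to `role`.

    Hypothetical helper: it only edits the dictionary and never calls GCP.
    """
    # Copy each binding so the caller's policy is left untouched.
    bindings = [
        dict(b, members=list(b.get("members", [])))
        for b in policy.get("bindings", [])
    ]
    for binding in bindings:
        if binding.get("role") == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            break
    else:
        # No binding for this role yet, so create one.
        bindings.append({"role": role, "members": [member]})
    return {**policy, "bindings": bindings}


if __name__ == "__main__":
    # Illustrative starting policy, not taken from a real project.
    existing = {
        "bindings": [
            {"role": "roles/owner", "members": ["user:admin@example.com"]}
        ]
    }
    updated = add_binding(
        existing, BIGQUERY_USER_ROLE, f"serviceAccount:{SERVICE_ACCOUNT}"
    )
    print(json.dumps(updated, indent=2))
```

The `serviceAccount:` prefix is how IAM identifies a service account as a policy member. Adding the account through the console, as described above, produces an equivalent binding.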

We prefer to be added as a BigQuery User, as described in the steps above. At minimum, the service account must be assigned a dataEditor role and must have the following permissions:

Project Permissions

  • bigquery.datasets.get
  • bigquery.jobs.create

Dataset Permissions

  • bigquery.routines.create
  • bigquery.routines.delete
  • bigquery.routines.get
  • bigquery.routines.list
  • bigquery.routines.update
  • bigquery.tables.create
  • bigquery.tables.delete
  • bigquery.tables.get
  • bigquery.tables.getData
  • bigquery.tables.list
  • bigquery.tables.update
  • bigquery.tables.updateData
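If you create a custom role instead, the permissions listed above can be collected into a role definition. The following is a minimal sketch, assuming a single role that combines both lists; that combination is an assumption, and your setup may grant them separately. The function name, role title, and description are hypothetical. The output uses the `title`, `description`, `stage`, and `includedPermissions` fields that `gcloud iam roles create` accepts through its `--file` flag.

```python
import json

# Permissions copied from the "Project Permissions" list above.
PROJECT_PERMISSIONS = [
    "bigquery.datasets.get",
    "bigquery.jobs.create",
]

# Permissions copied from the "Dataset Permissions" list above.
DATASET_PERMISSIONS = [
    "bigquery.routines.create",
    "bigquery.routines.delete",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.routines.update",
    "bigquery.tables.create",
    "bigquery.tables.delete",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.tables.update",
    "bigquery.tables.updateData",
]


def build_role_definition(title: str, description: str) -> dict:
    """Build a custom role definition containing every listed permission.

    Assumption: one role holds both the project and dataset permissions.
    """
    return {
        "title": title,
        "description": description,
        "stage": "GA",
        "includedPermissions": sorted(
            set(PROJECT_PERMISSIONS + DATASET_PERMISSIONS)
        ),
    }


if __name__ == "__main__":
    # The title and description are placeholders chosen for illustration.
    role = build_role_definition(
        "Data Connect Sync",
        "Minimum permissions for the Data Connect BigQuery sync",
    )
    print(json.dumps(role, indent=2))
```

The printed JSON can be saved to a file and used as the role definition when you create the custom role.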

See BigQuery’s access control doc to learn more about the roles available in BigQuery. To grant these permissions individually, see this StackOverflow response for steps to create a custom IAM role for Contentsquare.

Provide Contentsquare Your Project Details

Once the GCP project is configured, you’ll need to enter the following information on the BigQuery configuration page in Contentsquare:

  • Your Project ID, which you can find in the Project info section of your GCP project dashboard (make sure you’re in the correct project). For example, our project ID is heap-204419.
  • Your region: us, eu, europe-west1, europe-west2, us-central1, us-west1, us-west2, australia-southeast1, europe-west6, or us-east4.
  • The dataset name override if you don’t want the default. The default dataset name is project_environment. For example, the Main Production environment will default to a dataset name of main_production.
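The three values above can be sanity-checked before you enter them. The following is a minimal sketch in which `default_dataset_name` and `validate_config` are hypothetical helpers and not part of any Contentsquare API. The normalization rule (lowercase, with runs of other characters replaced by an underscore) is an assumption inferred from the main_production example, and the dataset-name pattern reflects BigQuery's documented naming rule of letters, numbers, and underscores.

```python
import re
from typing import List, Optional

# Regions listed in this guide.
SUPPORTED_REGIONS = frozenset({
    "us", "eu", "europe-west1", "europe-west2", "us-central1",
    "us-west1", "us-west2", "australia-southeast1", "europe-west6",
    "us-east4",
})


def default_dataset_name(project: str, environment: str) -> str:
    """Guess the default project_environment dataset name.

    Assumption: the name is lowercased and any run of characters other
    than letters, digits, or underscores becomes a single underscore.
    """
    raw = f"{project}_{environment}".lower()
    return re.sub(r"[^a-z0-9_]+", "_", raw).strip("_")


def validate_config(
    project_id: str, region: str, dataset_name: Optional[str] = None
) -> List[str]:
    """Return a list of problems; an empty list means the values look fine."""
    problems = []
    if not project_id:
        problems.append("project ID is empty")
    if region not in SUPPORTED_REGIONS:
        problems.append(f"unsupported region: {region!r}")
    # A dataset name override is optional, so only check it when given.
    if dataset_name is not None and not re.fullmatch(
        r"[A-Za-z0-9_]{1,1024}", dataset_name
    ):
        problems.append(f"invalid dataset name: {dataset_name!r}")
    return problems


if __name__ == "__main__":
    print(default_dataset_name("Main", "Production"))
    print(validate_config("heap-204419", "us"))
    print(validate_config("heap-204419", "asia-east1", "my-dataset"))
```

A check like this only catches typos locally. It does not confirm that the project exists or that the service account has access.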

That’s it! The initial sync should complete within 24 hours. If you don’t see data after 24 hours, contact our support team.

To learn how the data is structured after a sync, see our docs on data syncing.

  1. Log in to Contentsquare.
  2. Navigate to Analysis setup > Data Connect.
  3. Select Connect next to BigQuery.
  4. Enter your BigQuery hostname and select Next.
  5. Supply the required information:
    • GCP Project ID
    • Dataset name
    • Region
  6. Select Connect.