BigQuery

Google BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. The Data Connect integration with BigQuery allows you to automatically sync your Contentsquare data to BigQuery for advanced analysis and joining with other business data.

Before setting up the BigQuery integration, ensure you:

  1. Have a Google Cloud Platform (GCP) project.
  2. Enable billing in the GCP project.
  3. Enable the BigQuery API.
  4. Know which region you want to use: us, eu, europe-west1, europe-west2, us-central1, us-west1, us-west2, australia-southeast1, europe-west6, or us-east4.
  5. (Optional) Decide on a name for your dataset (default is project_environment).

These prerequisites are also outlined in the Google Cloud Platform quick-start guide.
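The API prerequisite can be handled from the gcloud CLI. The sketch below only prints the commands so you can review them before running; `my-gcp-project` is a placeholder for your own project ID.

```shell
# Placeholder for your own GCP project ID.
PROJECT_ID="my-gcp-project"

# Enable the BigQuery API, then confirm it shows up as enabled.
echo "gcloud services enable bigquery.googleapis.com --project=${PROJECT_ID}"
echo "gcloud services list --enabled --project=${PROJECT_ID} --filter=bigquery"
```

Billing must already be enabled on the project for the API to be usable.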

Within the GCP dashboard for your selected project, visit IAM & admin settings and click + Add.

In the subsequent view, add heap-sql@heap-204122.iam.gserviceaccount.com as a BigQuery User and save the new permission. At minimum, this account needs to be assigned a dataEditor role and requires the following permissions:

Project Permissions

  • bigquery.datasets.get
  • bigquery.jobs.create

Dataset Permissions

  • bigquery.routines.create
  • bigquery.routines.delete
  • bigquery.routines.get
  • bigquery.routines.list
  • bigquery.routines.update
  • bigquery.tables.create
  • bigquery.tables.delete
  • bigquery.tables.get
  • bigquery.tables.getData
  • bigquery.tables.list
  • bigquery.tables.update
  • bigquery.tables.updateData

See BigQuery’s access control documentation to learn more about the different roles in BigQuery, and see this Stack Overflow answer for steps to create a custom IAM role for Contentsquare that grants these individual permissions.
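A custom role restricted to exactly the permissions listed above can be created with `gcloud iam roles create`. The sketch below only prints the command for review rather than running it; the role ID `contentsquareDataConnect`, its title, and the project ID are hypothetical, while the permission list is the one documented above.

```shell
# Hypothetical project ID; substitute your own.
PROJECT_ID="my-gcp-project"

# The project- and dataset-level permissions listed above, joined for
# gcloud's --permissions flag.
PERMISSIONS="bigquery.datasets.get,bigquery.jobs.create,\
bigquery.routines.create,bigquery.routines.delete,bigquery.routines.get,\
bigquery.routines.list,bigquery.routines.update,bigquery.tables.create,\
bigquery.tables.delete,bigquery.tables.get,bigquery.tables.getData,\
bigquery.tables.list,bigquery.tables.update,bigquery.tables.updateData"

# Print the command rather than executing it, so it can be reviewed first.
echo "gcloud iam roles create contentsquareDataConnect \
  --project=${PROJECT_ID} \
  --title='Contentsquare Data Connect' \
  --permissions=${PERMISSIONS}"
```

Once the role exists, grant it to heap-sql@heap-204122.iam.gserviceaccount.com in place of (or alongside) the built-in roles.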

If you prefer, you can use your own GCP service account and have Data Connect write to a bucket in your Cloud Storage. Contentsquare will authenticate as your service account to deliver files into your bucket and, if configured, load them into BigQuery. In this case, provide the Contentsquare team with the:

  • Email associated with your GCP service account
  • Private key for this service account
  • Bucket name
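The email and private key both live in the JSON key file that GCP generates for a service account. As a sketch, the snippet below parses a fake key with the same shape as a real one (a real file has several more fields) and pulls out the two values to share; the project and account names are hypothetical.

```python
import json

# A minimal, fake service-account key mimicking the JSON file that
# `gcloud iam service-accounts keys create` downloads. Only the fields
# relevant here are shown; the names and key material are placeholders.
fake_key = json.dumps({
    "type": "service_account",
    "project_id": "my-gcp-project",
    "client_email": "data-connect@my-gcp-project.iam.gserviceaccount.com",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
})

key = json.loads(fake_key)
print(key["client_email"])                     # the "email" to share
print(key["private_key"].splitlines()[0])      # the PEM header line of the key
```

Share the full `private_key` value through a secure channel, never in plain email or chat.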

If you have a Virtual Private Cloud:

  • Create a GCS staging bucket that allows ingress from outside the perimeter, and share the bucket name with Contentsquare.
  • Set up a perimeter bridge that includes this staging bucket and the target BigQuery dataset.
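With VPC Service Controls, a perimeter bridge is created through `gcloud access-context-manager`. The sketch below only prints a candidate command; the policy ID, bridge name, and project numbers are all placeholders you would replace with the projects hosting the staging bucket and the BigQuery dataset.

```shell
# All values below are hypothetical placeholders.
POLICY="123456789"          # your Access Context Manager policy ID
STAGING_PROJECT="111111"    # project number hosting the GCS staging bucket
BQ_PROJECT="222222"         # project number hosting the BigQuery dataset

# Print the command rather than executing it, so it can be reviewed first.
echo "gcloud access-context-manager perimeters create data_connect_bridge \
  --policy=${POLICY} \
  --title='Data Connect bridge' \
  --perimeter-type=bridge \
  --resources=projects/${STAGING_PROJECT},projects/${BQ_PROJECT}"
```

Verify the exact flags against your gcloud version's reference before running, since Access Context Manager commands have changed across releases.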
To finish the setup, connect BigQuery from within Contentsquare:

  1. Log in to Contentsquare.

  2. Navigate to Analysis setup > Data Connect.

  3. Select Connect next to BigQuery.

  4. Enter your BigQuery hostname and select Next.

  5. Supply the required information:

    • Your Project ID, which you can find in the Project info section of your GCP project dashboard (make sure you’re in the correct project).
    • Your region: us, eu, europe-west1, europe-west2, us-central1, us-west1, us-west2, australia-southeast1, europe-west6, or us-east4.
    • The dataset name override if you don’t want the default. The default dataset name is project_environment. For example, the Main Production environment will default to a dataset name of main_production.
  6. Select Connect.
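The default `project_environment` naming convention described above can be sketched as a small helper. This is an illustration of the documented behavior, not Contentsquare's actual implementation; the slug rules (lowercasing, collapsing non-alphanumerics to underscores) are assumptions beyond the one documented example.

```python
import re

def default_dataset_name(project: str, environment: str) -> str:
    """Sketch of the documented project_environment default: lowercase
    both parts, replace runs of non-alphanumerics with underscores,
    and join them with an underscore."""
    def slug(s: str) -> str:
        return re.sub(r"[^a-z0-9]+", "_", s.lower()).strip("_")
    return f"{slug(project)}_{slug(environment)}"

print(default_dataset_name("Main", "Production"))  # main_production
```

If this default collides with an existing dataset or violates your naming scheme, supply an override in step 5 instead.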