---
title: BigQuery - Data Connect
description: Instructions for setting up and using Data Connect with Google BigQuery
lastUpdated: 09 March 2026
source_url:
  html: https://docs.contentsquare.com/en/connect/bigquery/
  md: https://docs.contentsquare.com/en/connect/bigquery/index.md
---

Google BigQuery is a fully managed, serverless data warehouse that enables scalable analysis over petabytes of data. The Data Connect integration with BigQuery allows you to automatically sync your Contentsquare data to BigQuery for advanced analysis and joining with other business data.

Tip

As part of this integration, Contentsquare creates the dataset in BigQuery for you during the initial sync. You do not need to create it manually.

## Prerequisites

Before setting up the BigQuery integration, ensure you:

1. [Have a Google Cloud Platform (GCP) project ↗](https://support.google.com/cloud/answer/6251787?hl=en\&ref_topic=6158848).
2. [Enable billing in the GCP project ↗](https://cloud.google.com/billing/docs/how-to/modify-project).
3. [Enable the BigQuery API ↗](https://console.cloud.google.com/flows/enableapi?apiid=bigquery).
4. Know the region you want to use among `us`, `eu`, `europe-west1`, `europe-west2`, `europe-west4`, `europe-west6`, `us-central1`, `us-west1`, `us-west2`, `us-east4`, or `australia-southeast1`.
5. (Optional) Decide on a name for your dataset (default is `project_environment`).

These prerequisites are also outlined in the Google Cloud Platform [quick-start guide ↗](https://cloud.google.com/bigquery/docs/quickstarts/quickstart-web-ui).

## BigQuery setup

Within the [GCP dashboard ↗](https://console.cloud.google.com/home/dashboard) for your selected project, visit `IAM & admin` settings and click **+ Add**.

In the subsequent view, add `heap-sql@heap-204122.iam.gserviceaccount.com` with the `BigQuery User` role and save the new permission.
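The same grant can also be made from the command line. A minimal sketch using the gcloud CLI (`your-project-id` is a placeholder for your own GCP project ID):

```shell
# Grant Contentsquare's service account the BigQuery User role
# (replace your-project-id with your GCP project ID)
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:heap-sql@heap-204122.iam.gserviceaccount.com" \
  --role="roles/bigquery.user"
```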

We recommend adding Contentsquare as a `BigQuery User` per the steps above. At a minimum, the service account must be assigned the `BigQuery Data Editor` (`dataEditor`) role and granted the following [permissions ↗](https://cloud.google.com/bigquery/docs/access-control):

**Project Permissions**

* `bigquery.datasets.get`
* `bigquery.jobs.create`

**Dataset Permissions**

* `bigquery.routines.create`
* `bigquery.routines.delete`
* `bigquery.routines.get`
* `bigquery.routines.list`
* `bigquery.routines.update`
* `bigquery.tables.create`
* `bigquery.tables.delete`
* `bigquery.tables.get`
* `bigquery.tables.getData`
* `bigquery.tables.list`
* `bigquery.tables.update`
* `bigquery.tables.updateData`

Alternatively, grant only the individual permissions above by creating a [custom IAM role ↗](https://console.cloud.google.com/projectselector2/iam-admin/iam) for Contentsquare.
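If you prefer the custom-role route, the role can be created from the command line. A sketch with the gcloud CLI (the role ID `contentsquareDataConnect` and `your-project-id` are placeholders):

```shell
# Create a custom role containing exactly the permissions listed above
gcloud iam roles create contentsquareDataConnect \
  --project=your-project-id \
  --title="Contentsquare Data Connect" \
  --permissions=bigquery.datasets.get,bigquery.jobs.create,bigquery.routines.create,bigquery.routines.delete,bigquery.routines.get,bigquery.routines.list,bigquery.routines.update,bigquery.tables.create,bigquery.tables.delete,bigquery.tables.get,bigquery.tables.getData,bigquery.tables.list,bigquery.tables.update,bigquery.tables.updateData
```

Note that `bigquery.datasets.get` and `bigquery.jobs.create` must be granted at the project level, while the remaining permissions apply at the dataset level.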

### External service account

If required, you can use your own GCP service account and have Data Connect write to a bucket in your Cloud Storage. Contentsquare authenticates as your service account to deliver files into your bucket and, if configured, load them into BigQuery. In this case, provide the Contentsquare team with the:

* Email associated with your GCP service account
* Private key for this service account
* Bucket name

### Virtual Private Cloud (VPC)

If you have a Virtual Private Cloud:

* Create a GCS staging bucket that allows ingress from outside the perimeter and share the bucket name with Contentsquare
* Set up a perimeter bridge that includes this staging bucket and the target BigQuery dataset

## Configure Data Connect

Admin only

Configuration of this destination is only available to admin users in Contentsquare.

1. Log in to Contentsquare.

2. Navigate to **Analysis setup** > **Data Connect**.

3. Select **Connect** next to BigQuery.

4. Enter your BigQuery hostname and select **Next**.

5. Supply the required information:

   * Your **Project ID**, which you can find in the `Project info` section of your GCP project dashboard (make sure you're in the correct project).
   * Your **region**: `us`, `eu`, `europe-west1`, `europe-west2`, `europe-west4`, `europe-west6`, `us-central1`, `us-west1`, `us-west2`, `us-east4`, or `australia-southeast1`.
   * The **dataset name override** if you don't want the default. The default dataset name is `project_environment`. For example, the Main Production environment will default to a dataset name of *main\_production*.

6. Select **Connect**.

## Advanced features

### Partitioning

Data Connect automatically partitions BigQuery tables by date to optimize query performance and reduce costs. Understanding how partitioning works helps you write more efficient queries.

**How Data Connect Partitions Tables:**

* Event tables are partitioned by the `time` column
* Partitions are created daily based on event timestamps
* Each partition contains all events for that specific date

**Benefits of Partitioning:**

* **Reduced query costs**: Queries that filter by date only scan relevant partitions
* **Improved performance**: Less data to scan means faster query execution
* **Automatic pruning**: BigQuery automatically eliminates unnecessary partitions

**Writing Partition-Aware Queries:**

```sql
-- Good: Uses partition filter on time column
SELECT user_id, event_id, time
FROM `your-project.main_production.page_viewed`
WHERE DATE(time) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)


-- Bad: Scans all partitions (expensive)
SELECT user_id, event_id, time
FROM `your-project.main_production.page_viewed`
WHERE user_id = 12345
```

**Partition Pruning Tips:**

* Always include a `WHERE` clause on the `time` column when possible
* Use `DATE(time)` for date-based filters
* Combine date filters with other conditions for optimal performance
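Putting these tips together, a date filter can be combined with row-level conditions so BigQuery prunes old partitions before applying the remaining filter (table and column names follow the examples above):

```sql
-- Prunes to the last 7 days of partitions, then filters by user
SELECT user_id, event_id, time
FROM `your-project.main_production.page_viewed`
WHERE DATE(time) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
  AND user_id = 12345
```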

**Check Partition Details:**

```sql
-- View partition information
SELECT
  partition_id,
  total_rows,
  total_logical_bytes
FROM `your-project.main_production.INFORMATION_SCHEMA.PARTITIONS`
WHERE table_name = 'page_viewed'
ORDER BY partition_id DESC
LIMIT 10
```

### External service account setup

In some cases, you may need to use your own Google Cloud Platform (GCP) service account for Data Connect instead of Contentsquare's default service account.

**When to Use a Custom Service Account:**

* Your organization requires all data access to use internal service accounts
* You need granular control over permissions and access logs
* You have VPC Service Controls that restrict external service accounts
* You want to route data through your own Cloud Storage bucket

**Step-by-Step Configuration:**

1. **Create a Service Account** in your GCP project:

   ```shell
   gcloud iam service-accounts create contentsquare-connect \
     --display-name="Contentsquare Data Connect" \
     --project=your-project-id
   ```

2. **Grant Required Permissions** to the service account:

   ```shell
   # BigQuery permissions
   gcloud projects add-iam-policy-binding your-project-id \
     --member="serviceAccount:contentsquare-connect@your-project-id.iam.gserviceaccount.com" \
     --role="roles/bigquery.dataEditor"


   gcloud projects add-iam-policy-binding your-project-id \
     --member="serviceAccount:contentsquare-connect@your-project-id.iam.gserviceaccount.com" \
     --role="roles/bigquery.jobUser"
   ```

3. **Create a Cloud Storage Staging Bucket** (if needed):

   ```shell
   gsutil mb -p your-project-id -l us gs://your-csq-staging-bucket


   # Grant service account access
   gsutil iam ch serviceAccount:contentsquare-connect@your-project-id.iam.gserviceaccount.com:objectAdmin \
     gs://your-csq-staging-bucket
   ```

4. **Generate Service Account Key**:

   ```shell
   gcloud iam service-accounts keys create contentsquare-key.json \
     --iam-account=contentsquare-connect@your-project-id.iam.gserviceaccount.com
   ```

5. **Provide Credentials to Contentsquare**:

   * Service account email: `contentsquare-connect@your-project-id.iam.gserviceaccount.com`
   * Private key (from `contentsquare-key.json`)
   * Bucket name (if using staging bucket)

**Security Considerations:**

* **Rotate keys regularly**: Generate new service account keys periodically
* **Limit permissions**: Only grant the minimum required permissions
* **Monitor access**: Enable audit logging for the service account
* **Secure key storage**: Never commit service account keys to version control
* **Use IAM conditions**: Apply conditional access policies where appropriate
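For key rotation, the gcloud CLI can list and delete keys on the service account. A sketch, following the service account name from the setup steps above (`KEY_ID` is a placeholder taken from the list output):

```shell
# List existing keys to identify the ones due for rotation
gcloud iam service-accounts keys list \
  --iam-account=contentsquare-connect@your-project-id.iam.gserviceaccount.com

# After a new key is in use, delete the old one by its KEY_ID
gcloud iam service-accounts keys delete KEY_ID \
  --iam-account=contentsquare-connect@your-project-id.iam.gserviceaccount.com
```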

**Required Permissions:**

The service account needs these specific BigQuery permissions:

* `bigquery.datasets.get`
* `bigquery.jobs.create`
* `bigquery.routines.*` (`create`, `delete`, `get`, `list`, `update`)
* `bigquery.tables.*` (`create`, `delete`, `get`, `getData`, `list`, `update`, `updateData`)

If using a staging bucket, also grant:

* `storage.objects.create`
* `storage.objects.delete`
* `storage.objects.get`
* `storage.objects.list`

### VPC Service Controls

VPC Service Controls create security perimeters around GCP resources to prevent data exfiltration and unauthorized access.

If your organization uses VPC Service Controls to secure GCP resources, you need to implement the following steps for Data Connect to work:

1. Create a GCS Staging Bucket outside the perimeter:

   ```shell
   # Create bucket in the same region as your BigQuery dataset
   gsutil mb -p your-project-id -l us gs://csq-staging-bucket


   # Grant Contentsquare's service account access
   gsutil iam ch serviceAccount:heap-sql@heap-204122.iam.gserviceaccount.com:objectAdmin \
     gs://csq-staging-bucket
   ```

   The staging bucket must be accessible from both Contentsquare (outside your perimeter) and your BigQuery dataset (inside your perimeter).

2. Create a perimeter bridge to allow data flow from the staging bucket to BigQuery:

   ```shell
   # Get your perimeter name
   gcloud access-context-manager perimeters list --policy=YOUR_POLICY_ID


   # Create bridge configuration
   gcloud access-context-manager perimeters update YOUR_PERIMETER_NAME \
     --set-egress-policies=egress-policy.yaml \
     --policy=YOUR_POLICY_ID
   ```

   Example `egress-policy.yaml`:

   ```yaml
   - egressFrom:
       identities:
       - serviceAccount:heap-sql@heap-204122.iam.gserviceaccount.com
     egressTo:
       resources:
       - projects/YOUR_PROJECT_NUMBER
       operations:
       - serviceName: storage.googleapis.com
         methodSelectors:
         - method: "*"
   ```

3. During Data Connect setup, provide Contentsquare with:

   * Project ID
   * BigQuery dataset name
   * Staging bucket name: `gs://csq-staging-bucket`
   * Region
   * VPC perimeter name (for support reference)

4. Once the setup is complete, verify data is flowing correctly:

   ```sql
   -- Check recent sync activity
   SELECT table_id, TIMESTAMP_MILLIS(creation_time) AS created
   FROM `your-project.main_production.__TABLES__`
   ORDER BY creation_time DESC
   LIMIT 10
   ```

If syncs fail with VPC-related errors, follow these steps to troubleshoot:

1. Verify the staging bucket is **not** inside the VPC perimeter
2. Check that the perimeter bridge includes the staging bucket
3. Confirm egress policies allow Contentsquare's service account
4. Review VPC Service Controls audit logs for denied requests
5. Ensure the staging bucket and BigQuery dataset are in the same region

Common error messages include:

* `"Request violates VPC Service Controls"`: Perimeter bridge not configured correctly
* `"Bucket is inside VPC perimeter"`: Staging bucket must be outside the perimeter
* `"Permission denied"`: Service account lacks necessary IAM permissions
