Amazon S3
Amazon S3 (Simple Storage Service) is an object storage service that offers industry-leading scalability, data availability, security, and performance. The Data Connect integration with S3 allows you to export your Contentsquare data to S3 for flexible storage and analysis options.
Unlike the other warehouse integrations (BigQuery, Redshift, Snowflake), the S3 integration provides raw data files that you can process with your preferred analytics tools, such as Athena, EMR, or third-party data processing services.
Prerequisites
Before setting up the S3 integration, ensure you have:
- An AWS account with S3 access
- An S3 bucket to store Contentsquare data
- AWS credentials with appropriate permissions for the S3 bucket
S3 setup
- Create an S3 bucket to store Contentsquare data (if you don’t already have one).
- Create an IAM user or role with appropriate permissions. The IAM policy should include:
  - s3:PutObject
  - s3:GetObject
  - s3:ListBucket
  - s3:DeleteObject (optional, for cleanup)
- Generate AWS access keys for the IAM user (if using user-based authentication).
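As a rough illustration of the permissions listed above, the following Python sketch builds an IAM policy document as a plain dictionary. The function name `build_iam_policy` and the bucket name `csq-rs3-example` are placeholders invented for this example. This is not the policy that Contentsquare displays during setup, which should take precedence. The split into two statements reflects that `s3:ListBucket` applies to the bucket itself, while the object actions apply to the objects inside it.

```python
import json


def build_iam_policy(bucket: str, allow_delete: bool = False) -> dict:
    """Return an IAM policy document covering the permissions listed above.

    `bucket` is the bucket name only (no "s3://" prefix).
    """
    object_actions = ["s3:PutObject", "s3:GetObject"]
    if allow_delete:
        # s3:DeleteObject is optional and only needed for cleanup.
        object_actions.append("s3:DeleteObject")

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level action: the resource is the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions: the resource is every object in the bucket.
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "csq-rs3-example" is a placeholder bucket name, not a real one.
    print(json.dumps(build_iam_policy("csq-rs3-example", allow_delete=True), indent=2))
```

The printed JSON can be pasted into the IAM console as a starting point and then narrowed, for example by restricting the object resource to the prefix you plan to use.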
Configure Data Connect
- Log in to Contentsquare.
- Navigate to Analysis setup > Data Connect.
- Create the S3 bucket csq-rs3-<bucket_name> to sync Data Connect data with.
- Select Next.
- Add the displayed policy to your CSQ bucket on S3.
- Input your S3 credentials to connect to your bucket.
- Select Connect.
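The bucket name in the steps above follows the pattern csq-rs3-<bucket_name>. As a small sketch, the helper below composes such a name and checks it against a subset of the general S3 bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no consecutive dots). The function name `data_connect_bucket_name` is hypothetical, and any additional constraints Contentsquare may apply are not covered here.

```python
import re

BUCKET_PREFIX = "csq-rs3-"

# Subset of the S3 naming rules: 3-63 characters, lowercase letters, digits,
# dots and hyphens, beginning and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def data_connect_bucket_name(suffix: str) -> str:
    """Return 'csq-rs3-<suffix>' or raise ValueError if the result is not a valid name."""
    name = f"{BUCKET_PREFIX}{suffix}"
    if _BUCKET_NAME.fullmatch(name) is None or ".." in name:
        raise ValueError(f"not a valid S3 bucket name: {name!r}")
    return name


if __name__ == "__main__":
    # "acme-prod" is a placeholder suffix.
    print(data_connect_bucket_name("acme-prod"))
```

Validating the name locally avoids a failed bucket creation call, but S3 bucket names are also globally unique, which only the create request itself can confirm.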
Once setup is complete, you’ll see a sync within 24 hours that includes the built-in tables, such as users, sessions, and pageviews.
Data Structure in S3
When Data Connect syncs data to S3, it creates the following structure:
s3://your-bucket/[optional-prefix]/
├── users/
│   ├── date=YYYY-MM-DD/
│   │   ├── part-00000.[format].[compression]
│   │   ├── part-00001.[format].[compression]
│   │   └── ...
├── sessions/
│   ├── date=YYYY-MM-DD/
│   │   ├── part-00000.[format].[compression]
│   │   ├── part-00001.[format].[compression]
│   │   └── ...
├── pageviews/
│   ├── date=YYYY-MM-DD/
│   │   └── ...
├── [custom_event_name]/
│   ├── date=YYYY-MM-DD/
│   │   └── ...
└── ...
The data is organized by:
- Table name (users, sessions, pageviews, custom events)
- Date partition (based on sync date)
- Part files (data is split into multiple files)
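To make the layout concrete, here is a minimal Python sketch that splits an object key into its table name, date partition, and part file. The names `ExportKey` and `parse_export_key` are invented for this example, and the `.csv.gz` extension in the usage line is only a stand-in for the `[format].[compression]` placeholders, since the actual file format depends on your configuration.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class ExportKey(NamedTuple):
    table: str       # e.g. "users", "sessions", "pageviews", or a custom event name
    sync_date: date  # value of the date=YYYY-MM-DD partition
    filename: str    # part file name, e.g. "part-00000.<format>.<compression>"


# <table>/date=YYYY-MM-DD/<part file>, relative to the optional prefix.
_KEY_PATTERN = re.compile(
    r"(?P<table>[^/]+)/date=(?P<date>\d{4}-\d{2}-\d{2})/(?P<file>[^/]+)"
)


def parse_export_key(key: str, prefix: str = "") -> Optional[ExportKey]:
    """Parse an object key (without 's3://bucket/'); return None if it does not match."""
    prefix = prefix.strip("/")
    if prefix:
        if not key.startswith(prefix + "/"):
            return None
        key = key[len(prefix) + 1:]

    match = _KEY_PATTERN.fullmatch(key)
    if match is None:
        return None
    try:
        sync_date = date.fromisoformat(match["date"])
    except ValueError:
        # Looks like a date but is not a real calendar day (e.g. 2024-13-40).
        return None
    return ExportKey(match["table"], sync_date, match["file"])


if __name__ == "__main__":
    # Placeholder key; the prefix "exports" and the extension are assumptions.
    print(parse_export_key("exports/sessions/date=2024-05-01/part-00000.csv.gz", "exports"))
```

A parser like this can be applied to the keys returned by any S3 listing call to group part files by table and sync date before loading them into an analytics tool.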