Data Schema

Data Connect creates the following tables in your data warehouse, each representing a different aspect of user behavior data. Understanding this schema is essential for effective analysis and joining with other data sources.

Data Connect adopts a hierarchical schema. By default, Data Connect will sync the following tables:

  • One users table (for all user information)
  • One sessions table (for all session information)
  • One pageviews table (for all pageview information)

The users table contains a row for each unique user that has visited your site.

Column NameTypeDescription
user_idBIGINTUnique ID of user, randomly generated by Contentsquare (CSQ).
identityTEXTUser's username or other unique token, passed identify. Must be unique.
handleTEXTUser's username or other unique token, passed identify.
emailTEXTUser's email address, passed in via addUserProperties API.
joindateTIMESTAMPTimestamp without time zone of when the user was first seen. In S3, this is in UNIX timestamp format.
last_modifiedTIMESTAMPTimestamp without time zone of when the user's data was last modified. In S3, this is in UNIX timestamp format.
user propertiesTEXTThere will be one column for every unique user property you've sent via the addUserProperties API (name, gender, account status, etc). The column type is automatically inferred from the underlying property values.

The sessions table represents individual user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated by CSQ.
user_idBIGINTID of associated user, randomly generated by CSQ.
session_idBIGINTID of associated session, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when session started.
libraryTEXTVersion of Contentsquare library which began the session. Can be one of "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
Custom variablesTEXTOne column is created for every unique custom variable key you've attached. The column type is automatically inferred from the underlying property values.
session_replay_linkTEXTContentsquare Platform url link of the session replay.
session_durationTEXTTotal duration of the session.
session_languageTEXTLanguage displayed during the session.
session_number_of_viewsTEXTThe number of page views in the session.
frustration_scoreTEXTMeasure of user frustration during the session.
looping_IndexTEXTThe proportion of same urls visited during the session (meaning the user is doing back & forth).
page_consumptionTEXTProportion of pages where the user spent less than 2s and did less than 2 clicks.
filtered_APVTEXTNumber of views in which APVs (Artificial Pageviews) are not counted / a boolean determining whether a given page view is an Artificial Page view.

The pageviews table contains detailed information about each pageview.

Column NameTypeDescription
event_idBIGINTID of associated pageview event.
user_idBIGINTUnique ID of associated user, randomly generated by CSQ.
session_idBIGINTUnique ID of associated session, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the pageview occurred.
libraryTEXTVersion of Contentsquare library which started the session. Can be one of "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and started the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
pathTEXTThe path of the pageview.
queryTEXTThe query parameters associated with the pageview.
hashTEXTThe hash parameters associated with the pageview.
titleTEXTTitle of the current page.
Custom variablesTEXTOne column is created for every unique custom variable key you've attached. The column type is automatically inferred from the underlying property values.
Dynamic variablesTEXTOne column is created for every unique dynamic variable key you've attached. The column type is automatically inferred from the underlying property values.
view_numberTEXTThe pageview number in the session.
is_firstTEXTIf the page is the entry page then value is 1 otherwise value is 0.
is_lastTEXTIf the page is the exit page then value is 1, otherwise value is 0.
scroll_rateTEXTThe portion of the page which was displayed (in percentage points).
view_duration_msecTEXTDuration between the page view event time (view_time) and the timestamp of the last interaction event (in milliseconds).
first_input_delayTEXTDelay (in milliseconds) between the first user interaction and when the web browser can handle it (part of Core Web Vitals).
interaction_to_next_paintTEXTLongest visual delay (in milliseconds) experienced by a user when interacting with the page (clicks or keystrokes).
time_to_first_byteTEXTDelay (in milliseconds) between the request for the web page and when the web browser received the first piece of data.
largest_contentful_paintTEXTDelay (in milliseconds) to render the largest piece of content (part of Core Web Vitals).
cumulative_layout_shiftTEXTScore about the layout stability of the page rendering (part of Core Web Vitals).
dom_interactive_after_msecTEXTThe time in milliseconds required before the user is able to interact with the page (although all resources may not be fully loaded yet).
fully_loadedTEXTThe time in milliseconds required to fully load all the resources of the page, from initiation of the pageview to load completion in the browser.
first_contentful_paintTEXTDelay (in milliseconds) to render the first piece of content.
start_renderTEXTTimestamp sent by tag when the rendering of the page started.
window_heightTEXTHeight of the window in which the page is opened.
window_widthTEXTWidth of the window in which the page is opened.
session_replay_linkTEXTContentsquare Platform url link of the session replay.

Contentsquare will create one table for every labeled event you've created within Contentsquare and synced downstream. See how to create, manage, and sync Events.

Column NameTypeDescription
event_idBIGINTID of the associated event, randomly generated by CSQ.
user_idBIGINTUnique ID of the associated user, randomly generated by CSQ.
session_idBIGINTUnique ID of the associated session, randomly generated by CSQ.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
typeTEXTFor web autocaptured events, can be any of view page, click, submit, change, with push state events registered as view page events. For custom events, this will be the custom event name.
libraryTEXTVersion of Contentsquare library on which event occurred. Can be one of "web", "iOS", or "server".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and started the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
landing_page_queryTEXTThe query parameters of the first page of the user’s session.
landing_page_hashTEXTThe hash route of the first page of the user’s session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
domainTEXTDomain including subdomain, e.g. blog.company.io.
pathTEXTPortion of the current URL following your domain, e.g. /docs for company.io/docs.
hashTEXTPortion of the current URL following the hash sign, e.g. #install for company.io/docs#install.
queryTEXTQuery params of the page's current URL, e.g. ?utm_id=1234 for company.io?utm_id=1234.
titleTEXTTitle of the current page.
hrefTEXThref property of link (used for clicks on anchor tags).
target_textTEXTButton text of the event target.

Several columns across tables track time in different ways. Understanding their differences is important for writing accurate queries.

ColumnTableMeaning
timesessionsTimestamp when the session started. Use for session-level date filters.
timeeventsTimestamp when the event occurred.
session_timeeventsSeconds elapsed since the session started. Use for analyzing event timing within a session (for example, WHERE session_time <= 60 to find events in the first minute).
joindateusersTimestamp when the user was first seen. Use for cohort analysis and user lifetime calculations.
-- Find users and their first action within 30 seconds of session start
SELECT u.user_id, u.joindate, e.event_table_name, e.session_time
FROM users u
JOIN all_events e ON u.user_id = e.user_id
WHERE e.session_time <= 30
ORDER BY u.joindate DESC;
Experience Monitoring

The api_errors table contains information about API errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
api_error_durationTEXTTime taken to resolve API error.
api_error_endpointTEXTURL where API errors are reported.
api_error_methodTEXTStandardized communication of API error responses.
api_error_status_codeTEXTNumerical code identifying API error.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTAPI error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
Experience Monitoring

The js_errors table contains information about JavaScript errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
error_messageTEXTDescription of the encountered error.
error_line_numberTEXTLine in the code where the error occurred.
js_error_file_nameTEXTFile name containing the JavaScript error.
js_error_column_numberTEXTColumn in the file where the error occurred.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTJS error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
Experience Monitoring

The custom_errors table contains information about custom errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
error_messageTEXTDescription of the encountered error.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTCustom Error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.

Data Connect creates one table per frustration signal type, each prefixed with frustration_events_. All frustration tables share a common set of columns, with a few table-specific columns listed under each table.

Common columns (all frustration_events_* tables)

Section titled Common columns (all frustration_events_* tables)
Column NameTypeDescription
event_idBIGINTID of the associated event, randomly generated by Contentsquare.
user_idBIGINTUnique ID of the associated user, randomly generated by Contentsquare.
session_idBIGINTUnique ID of the associated session, randomly generated by Contentsquare.
pageview_idBIGINTUnique ID of the associated page, randomly generated by Contentsquare.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTAlways frustration.
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
landing_page_queryTEXTQuery parameters of the landing page URL.
landing_page_hashTEXTHash fragment of the landing page URL.
typeTEXTFrustration signal type (e.g. Rage Click, Multiple Button Interactions).
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
domainTEXTDomain including subdomain, e.g. blog.company.io.
pathTEXTPortion of the current URL following your domain, e.g. /docs for company.io/docs.
hashTEXTPortion of the current URL following the hash sign, e.g. #install for company.io/docs#install.
queryTEXTQuery params of the page's current URL, e.g. ?utm_id=1234 for company.io?utm_id=1234.
titleTEXTTitle of the current page.
hrefTEXThref attribute of the link element, if the frustration event occurred on an anchor tag.
target_textTEXTThe text displayed on the element where the event occurred.
frustration_scoreTEXTFrustration score of the pageview.

A rage click occurs when a user clicks on the same element multiple times in quick succession, indicating frustration.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the event happened.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTNumber of clicks recorded.

frustration_events_multiple_button_interactions

Section titled frustration_events_multiple_button_interactions

A multiple button interaction occurs when a user clicks the same button multiple times, suggesting the button may not be responding as expected.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the event happened.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTNumber of clicks recorded.

frustration_events_multiple_field_interactions

Section titled frustration_events_multiple_field_interactions

A multiple field interaction occurs when a user interacts with the same form field multiple times, which may indicate confusion or an error in the form.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the event happened.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTNumber of interactions recorded.

frustration_events_multiple_use_target

Section titled frustration_events_multiple_use_target

A multiple use target event occurs when a user clicks the same element multiple times across the page.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the event happened.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTNumber of clicks recorded.
countINTNumber of times the target was used.

frustration_events_excessive_hover

Section titled frustration_events_excessive_hover

An excessive hover event occurs when a user hovers over an element for an unusually long time, which may indicate hesitation or confusion.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the event happened.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTDuration of the hover in milliseconds.
countINTNumber of hover events recorded.

frustration_events_loading_time

Section titled frustration_events_loading_time

A loading time event is recorded when a page or resource takes longer than expected to load, which can negatively impact user experience.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component associated with the loading event.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTLoading time duration in milliseconds.
countINTNumber of loading time events recorded.

frustration_events_low_activity

Section titled frustration_events_low_activity

A low activity event is recorded when a user shows very little interaction with a page (fewer than 2 clicks and less than 2 seconds of engagement), suggesting disengagement.

Column NameTypeDescription
target_pathTEXTTarget path of the HTML component where the low activity was detected.
relative_timeINTTime offset in milliseconds from the start of the pageview.
valueINTActivity metric value.
countINTNumber of low activity events recorded.

The frustration_scores view is automatically created when a customer syncs at least one frustration table. It aggregates frustration signals at the pageview level by joining the pageviews table with all active frustration_events_* tables. Each row corresponds to one pageview and includes all pageview columns plus one count column per synced frustration table.

The count columns are dynamically named number_of_<table_name> based on which frustration tables are enabled. For example:

Column NameTypeDescription
number_of_frustration_events_rage_clickINTNumber of rage click events for this pageview. NULL if none occurred.
number_of_frustration_events_excessive_hoverINTNumber of excessive hover events for this pageview. NULL if none occurred.
number_of_frustration_events_loading_timeINTNumber of loading time events for this pageview. NULL if none occurred.
number_of_frustration_events_multiple_button_interactionsINTNumber of multiple button interaction events. NULL if none occurred.
number_of_frustration_events_multiple_field_interactionsINTNumber of multiple field interaction events. NULL if none occurred.
number_of_frustration_events_multiple_use_targetINTNumber of multiple use target events for this pageview. NULL if none occurred.
number_of_frustration_events_low_activityINTNumber of low activity events for this pageview. NULL if none occurred.

Only columns for frustration tables actively synced by the customer will appear in the view.

The ecommerce_transactions table contains information about e-commerce transactions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
timeTIMESTAMPTimestamp without time zone of when the pageview occurred.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
typeTEXTE-commerce transaction.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
transaction_amountTEXTThe total monetary value of a purchase or payment made by a user during a session.
transaction_currencyTEXTThe currency in which the transaction amount was processed.

Table Does Not Sync For S3. The all_events table does not sync for S3. You can create an all_events view of all synced tables after the data synced to S3 has gone through ETL processing.

Column NameTypeDescription
event_idBIGINTID of associated event, randomly generated by Contentsquare.
user_idBIGINTUnique ID of associated user, randomly generated by Contentsquare.
session_idBIGINTUnique ID of associated session, randomly generated by Contentsquare.
timeTIMESTAMPTimestamp without time zone of when event happened.
event_table_nameTEXTName of the table where similar events can be found.

The all_events table is a UNION of all individual event tables. Because a single event can qualify for multiple event definitions, the same event_id may appear multiple times in this table with different event_table_name values.

For data warehouses (Redshift, BigQuery, Snowflake):

  • The composite primary key for all_events should be: (user_id, event_id, event_table_name)
  • Individual event tables can use: (user_id, event_id)

For S3:

  • You must handle deduplication in your ETL process after identity resolution
  • Use the same composite key pattern when creating views

See Troubleshooting for more information about duplicate event IDs.

Contentsquare creates a table called _sync_history that contains a record for every attempted sync. This table is helpful for monitoring sync performance and scheduling dependent jobs. All timestamps are in UTC for Redshift and BigQuery and in local time for Snowflake.

Column NameTypeDescription
statusSTRINGTakes value running/succeeded/failed based on the outcome of the sync attempt.
start_timeTIMESTAMPStart time of the sync attempt.
finish_timeTIMESTAMPFinish time of the sync attempt.
errorSTRINGError message if sync fails.
next_scheduled_sync_atTIMESTAMPTime of the next scheduled sync (estimated at the time of sync based on sync frequency settings).

Contentsquare creates a table called _sync_info that contains metadata around the sync process and last updated times.

Column NameTypeDescription
event_table_nameTEXTThe event name.
sync_startedTIMESTAMPA timestamp for when that table began syncing.
sync_endedTIMESTAMPA timestamp for when that table completed syncing.
synced_to_timeTIMESTAMPA timestamp that reflects the most recent occurrence of the event.
inserted_row_countBIGINTThe number of rows inserted during the most recent sync.

When a user is identified, a migration occurs to aggregate the data under the new identity. These migrations are recorded in this table. Time is a column for Redshift and S3 destinations only. Snowflake and BigQuery do not have a column for time.

Column NameTypeDescription
from_user_idBIGINTThe migrating user's ID.
to_user_idBIGINTThe destination user's ID.
timeTIMESTAMPA timestamp for when the migration occurred.

This table contains metadata about the names of events, which allows you to build automated jobs that use this metadata. This table is re-created on each sync to stay updated with the latest event metadata.

Column NameTypeDescription
table_nameTEXTFor each event, this row contains the sanitized name of the table containing the event's data synced to your warehouse.
ui_nameTEXTFor each event, this row contains the name of the event in Contentsquare.
last_updated_atTIMESTAMPTime event label was last updated.
last_updated_byTEXTUser who last updated event label.
notesTEXTNotes attached to the event.
sourceTEXTData capture source (ex: web, ios, android).
verified_atTIMESTAMPTime event was verified.
verified_byTEXTUser who verified the event.

The event_metadata table shows up differently for S3. You can query it as follows:

{
"ui_name": "Stripe charge",
"table_name": "stripe_charge",
"source": "stripe",
"last_updated_at": "1609459200000",
"last_updated_by": "sales@company.com",
"notes": "This is a note",
"verified_at": "1640995200000",
"verified_by": "admin@company.com"
}