Data Schema

Data Connect creates the following tables in your data warehouse, each representing a different aspect of user behavior data. Understanding this schema is essential for effective analysis and joining with other data sources.

Data Connect adopts a hierarchical schema. By default, Data Connect will sync the following tables:

  • One users table (for all user information)
  • One sessions table (for all session information)
  • One pageviews table (for all pageview information)

The users table contains a row for each unique user that has visited your site.

Column NameTypeDescription
user_idBIGINTUnique ID of user, randomly generated by Contentsquare (CSQ).
identityTEXTUser's username or other unique token, passed identify. Must be unique.
handleTEXTUser's username or other unique token, passed identify.
emailTEXTUser's email address, passed in via addUserProperties API.
joindateTIMESTAMPTimestamp without time zone of when the user was first seen. In S3, this is in UNIX timestamp format.
last_modifiedTIMESTAMPTimestamp without time zone of when the user's data was last modified. In S3, this is in UNIX timestamp format.
user propertiesTEXTThere will be one column for every unique user property you've sent via the addUserProperties API (name, gender, account status, etc). The column type is automatically inferred from the underlying property values.

The sessions table represents individual user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated by CSQ.
user_idBIGINTID of associated user, randomly generated by CSQ.
session_idBIGINTID of associated session, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when session started.
libraryTEXTVersion of Contentsquare library which began the session. Can be one of "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
Custom variablesTEXTOne column is created for every unique custom variable key you've attached. The column type is automatically inferred from the underlying property values.
session_replay_linkTEXTContentsquare Platform url link of the session replay.
session_durationTEXTTotal duration of the session.
session_languageTEXTLanguage displayed during the session.
session_number_of_viewsTEXTThe number of page views in the session.
frustration_scoreTEXTMeasure of user frustration during the session.
looping_IndexTEXTThe proportion of same urls visited during the session (meaning the user is doing back & forth).
page_consumptionTEXTProportion of pages where the user spent less than 2s and did less than 2 clicks.
filtered_APVTEXTNumber of views in which APVs (Artificial Pageviews) are not counted / a boolean determining whether a given page view is an Artificial Page view.

The pageviews table contains detailed information about each pageview.

Column NameTypeDescription
event_idBIGINTID of associated pageview event.
user_idBIGINTUnique ID of associated user, randomly generated by CSQ.
session_idBIGINTUnique ID of associated session, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the pageview occurred.
libraryTEXTVersion of Contentsquare library which started the session. Can be one of "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and started the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
pathTEXTThe path of the pageview.
queryTEXTThe query parameters associated with the pageview.
hashTEXTThe hash parameters associated with the pageview.
titleTEXTTitle of the current page.
Custom variablesTEXTOne column is created for every unique custom variable key you've attached. The column type is automatically inferred from the underlying property values.
Dynamic variablesTEXTOne column is created for every unique dynamic variable key you've attached. The column type is automatically inferred from the underlying property values.
view_numberTEXTThe pageview number in the session.
is_firstTEXTIf the page is the entry page then value is 1 otherwise value is 0.
is_lastTEXTIf the page is the exit page then value is 1, otherwise value is 0.
scroll_rateTEXTThe portion of the page which was displayed (in percentage points).
view_duration_msecTEXTDuration between the page view event time (view_time) and the timestamp of the last interaction event (in milliseconds).
first_input_delayTEXTDelay (in milliseconds) between the first user interaction and when the web browser can handle it (part of Core Web Vitals).
interaction_to_next_paintTEXTLongest visual delay (in milliseconds) experienced by a user when interacting with the page (clicks or keystrokes).
time_to_first_byteTEXTDelay (in milliseconds) between the request for the web page and when the web browser received the first piece of data.
largest_contentful_paintTEXTDelay (in milliseconds) to render the largest piece of content (part of Core Web Vitals).
cumulative_layout_shiftTEXTScore about the layout stability of the page rendering (part of Core Web Vitals).
dom_interactive_after_msecTEXTThe time in milliseconds required before the user is able to interact with the page (although all resources may not be fully loaded yet).
fully_loadedTEXTThe time in milliseconds required to fully load all the resources of the page, from initiation of the pageview to load completion in the browser.
first_contentful_paintTEXTDelay (in milliseconds) to render the first piece of content.
start_renderTEXTTimestamp sent by tag when the rendering of the page started.
window_heightTEXTHeight of the window in which the page is opened.
window_widthTEXTWidth of the window in which the page is opened.
session_replay_linkTEXTContentsquare Platform url link of the session replay.

Contentsquare will create one table for every labeled event you've created within Contentsquare and synced downstream. See how to create, manage, and sync Events.

Column NameTypeDescription
event_idBIGINTID of the associated event, randomly generated by CSQ.
user_idBIGINTUnique ID of the associated user, randomly generated by CSQ.
session_idBIGINTUnique ID of the associated session, randomly generated by CSQ.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
typeTEXTFor web autocaptured events, can be any of view page, click, submit, change, with push state events registered as view page events. For custom events, this will be the custom event name.
libraryTEXTVersion of Contentsquare library on which event occurred. Can be one of "web", "iOS", or "server".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and started the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
landing_pageTEXTURL of the first pageview of the session.
landing_page_queryTEXTThe query parameters of the first page of the user’s session.
landing_page_hashTEXTThe hash route of the first page of the user’s session.
browserTEXTUser's browser.
utm_sourceTEXTGA-based utm_source tag associated with the session's initial pageview.
utm_campaignTEXTGA-based utm_campaign tag associated with the session's initial pageview.
utm_mediumTEXTGA-based utm_medium tag associated with the session's initial pageview.
utm_termTEXTGA-based utm_term tag associated with the session's initial pageview.
utm_contentTEXTGA-based utm_content tag associated with the session's initial pageview.
domainTEXTDomain including subdomain, e.g. blog.company.io.
pathTEXTPortion of the current URL following your domain, e.g. /docs for company.io/docs.
hashTEXTPortion of the current URL following the hash sign, e.g. #install for company.io/docs#install.
queryTEXTQuery params of the page's current URL, e.g. ?utm_id=1234 for company.io?utm_id=1234.
titleTEXTTitle of the current page.
hrefTEXThref property of link (used for clicks on anchor tags).
target_textTEXTButton text of the event target.

Several columns across tables track time in different ways. Understanding their differences is important for writing accurate queries.

ColumnTableMeaning
timesessionsTimestamp when the session started. Use for session-level date filters.
timeeventsTimestamp when the event occurred.
session_timeeventsSeconds elapsed since the session started. Use for analyzing event timing within a session (for example, WHERE session_time <= 60 to find events in the first minute).
joindateusersTimestamp when the user was first seen. Use for cohort analysis and user lifetime calculations.
-- Find users and their first action within 30 seconds of session start
SELECT u.user_id, u.joindate, e.event_table_name, e.session_time
FROM users u
JOIN all_events e ON u.user_id = e.user_id
WHERE e.session_time <= 30
ORDER BY u.joindate DESC;
Experience Monitoring

The api_errors table contains information about API errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
api_error_durationTEXTTime taken to resolve API error.
api_error_endpointTEXTURL where API errors are reported.
api_error_methodTEXTStandardized communication of API error responses.
api_error_status_codeTEXTNumerical code identifying API error.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTAPI error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
Experience Monitoring

The js_errors table contains information about JavaScript errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
error_messageTEXTDescription of the encountered error.
error_line_numberTEXTLine in the code where the error occurred.
js_error_file_nameTEXTFile name containing the JavaScript error.
js_error_column_numberTEXTColumn in the file where the error occurred.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTJS error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
Experience Monitoring

The custom_errors table contains information about custom errors detected during user sessions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
pageview_idBIGINTUnique ID of the associated page, randomly generated by CSQ.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
error_messageTEXTDescription of the encountered error.
errors_after_clicksTEXTIssue occurring post-user interaction (boolean).
error_group_idTEXTIdentifier for categorizing related errors.
error_sourceTEXTOrigin point of the reported error.
typeTEXTCustom Error.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.

The following data schema table will be the same for every frustration factor (rage click, excessive hovering, multiple use element, multiple field interactions, multiple button interactions, loading time high and low activity).

Column NameTypeDescription
event_idBIGINTID of the associated event, randomly generated by Contentsquare.
user_idBIGINTUnique ID of the associated user, randomly generated by Contentsquare.
session_idBIGINTUnique ID of the associated session, randomly generated by Contentsquare.
pageview_idBIGINTUnique ID of the associated page, randomly generated by Contentsquare.
timeTIMESTAMPTimestamp without time zone of when the event happened.
libraryTEXTFrustration
platformTEXTUser's operating system.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
landing_pageTEXTURL of the first pageview of the session.
typeTEXTrage click
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
domainTEXTDomain including subdomain, e.g. blog.company.io.
pathTEXTPortion of the current URL following your domain, e.g. /docs for company.io/docs.
hashTEXTPortion of the current URL following the hash sign, e.g. #install for company.io/docs#install.
queryTEXTQuery params of the page's current URL, e.g. ?utm_id=1234 for company.io?utm_id=1234.
target_textTEXTThe text displayed on the button clicked.
frustration_scoreTEXTFrustration score of the pageview.
target_pathTEXTTarget path of the HTML component where the event happens.
relative_timeINTThe point in time when the event happens.
valueINTThe number of clicks.

The ecommerce_transactions table contains information about e-commerce transactions.

Column NameTypeDescription
event_idBIGINTID of associated session event, randomly generated.
user_idBIGINTID of associated user, randomly generated.
session_idBIGINTID of associated session, randomly generated.
timeTIMESTAMPTimestamp without time zone of when the pageview occurred.
libraryTEXTVersion of Contentsquare library which began the session. "web" or "iOS".
platformTEXTUser's operating system.
countryTEXTCountry in which user session occurred, based on IP.
regionTEXTRegion in which user session occurred, based on IP.
cityTEXTCity in which user session occurred, based on IP.
referrerTEXTURL that linked to your site and initiated the session. If the user navigated directly to your site, or referral headers were stripped, then this value will appear as NULL downstream and as direct in the UI.
typeTEXTE-commerce transaction.
device_typeTEXTDevice type. Can be one of: Mobile, Desktop, or Tablet.
transaction_amountTEXTThe total monetary value of a purchase or payment made by a user during a session.
transaction_currencyTEXTThe currency in which the transaction amount was processed.

Table Does Not Sync For S3. The all_events table does not sync for S3. You can create an all_events view of all synced tables after the data synced to S3 has gone through ETL processing.

Column NameTypeDescription
event_idBIGINTID of associated event, randomly generated by Contentsquare.
user_idBIGINTUnique ID of associated user, randomly generated by Contentsquare.
session_idBIGINTUnique ID of associated session, randomly generated by Contentsquare.
timeTIMESTAMPTimestamp without time zone of when event happened.
event_table_nameTEXTName of the table where similar events can be found.

The all_events table is a UNION of all individual event tables. Because a single event can qualify for multiple event definitions, the same event_id may appear multiple times in this table with different event_table_name values.

For data warehouses (Redshift, BigQuery, Snowflake):

  • The composite primary key for all_events should be: (user_id, event_id, event_table_name)
  • Individual event tables can use: (user_id, event_id)

For S3:

  • You must handle deduplication in your ETL process after identity resolution
  • Use the same composite key pattern when creating views

See Troubleshooting for more information about duplicate event IDs.

Contentsquare creates a table called _sync_history that contains a record for every attempted sync. This table is helpful for monitoring sync performance and scheduling dependent jobs. All timestamps are in UTC for Redshift and BigQuery and in local time for Snowflake.

Column NameTypeDescription
statusSTRINGTakes value running/succeeded/failed based on the outcome of the sync attempt.
start_timeTIMESTAMPStart time of the sync attempt.
finish_timeTIMESTAMPFinish time of the sync attempt.
errorSTRINGError message if sync fails.
next_scheduled_sync_atTIMESTAMPTime of the next scheduled sync (estimated at the time of sync based on sync frequency settings).

Contentsquare creates a table called _sync_info that contains metadata around the sync process and last updated times.

Column NameTypeDescription
event_table_nameTEXTThe event name.
sync_startedTIMESTAMPA timestamp for when that table began syncing.
sync_endedTIMESTAMPA timestamp for when that table completed syncing.
synced_to_timeTIMESTAMPA timestamp that reflects the most recent occurrence of the event.
inserted_row_countBIGINTThe number of rows inserted during the most recent sync.

When a user is identified, a migration occurs to aggregate the data under the new identity. These migrations are recorded in this table. Time is a column for Redshift and S3 destinations only. Snowflake and BigQuery do not have a column for time.

Column NameTypeDescription
from_user_idBIGINTThe migrating user's ID.
to_user_idBIGINTThe destination user's ID.
timeTIMESTAMPA timestamp for when the migration occurred.

This table contains metadata about the names of events, which allows you to build automated jobs that use this metadata. This table is re-created on each sync to stay updated with the latest event metadata.

Column NameTypeDescription
table_nameTEXTFor each event, this row contains the sanitized name of the table containing the event's data synced to your warehouse.
ui_nameTEXTFor each event, this row contains the name of the event in Contentsquare.
last_updated_atTIMESTAMPTime event label was last updated.
last_updated_byTEXTUser who last updated event label.
notesTEXTNotes attached to the event.
sourceTEXTData capture source (ex: web, ios, android).
verified_atTIMESTAMPTime event was verified.
verified_byTEXTUser who verified the event.

The event_metadata table shows up differently for S3. You can query it as follows:

{
"ui_name": "Stripe charge",
"table_name": "stripe_charge",
"source": "stripe",
"last_updated_at": "1609459200000",
"last_updated_by": "sales@company.com",
"notes": "This is a note",
"verified_at": "1640995200000",
"verified_by": "admin@company.com"
}