Connect Common Queries

This article provides a collection of common queries you can use with Data Connect. You can copy and paste these queries into your BI tool or run them directly on your cluster. Some queries may require you to update property names, event names, or date ranges to fit your specific needs.

SELECT device_type
,plan
,COUNT(session_id)
FROM main_production.sessions AS sessions
INNER JOIN main_production.users AS users ON sessions.user_id = users.user_id
GROUP BY 1, 2
SELECT [Property_to_Group_By], AVG([Numeric_Property])

Creating An Enhanced Users Table

Section titled Creating An Enhanced Users Table

This query provides an example of how to recreate the users table, adding the Initial User Properties (ex: Initial Referrer, Initial Browser, etc.) that are displayed in your Contentsquare account.

BEGIN;
DROP TABLE IF EXISTS enhanced_users;
CREATE TABLE enhanced_users AS
SELECT u.*
,s.time AS date_first_seen
,s.referrer AS initial_referrer
,s.utm_source AS initial_utm_source
,s.utm_medium AS initial_utm_medium
,s.utm_campaign AS initial_utm_campaign
,s.city AS initial_city
,s.region AS initial_region
,s.country AS initial_country
FROM main_production.users u
JOIN (SELECT user_id
,MIN(time) AS earliest_time
FROM main_production.sessions
GROUP BY 1) fs
ON u.user_id = fs.user_id
JOIN main_production.sessions s
ON fs.user_id = s.user_id
AND fs.earliest_time = s.time;
COMMIT;

In this section, you’ll find various queries to measure aggregate metrics for your site, including average time, top events, and more. The value of these queries increases as you segment them by user personas, and the same logic can also be applied to calculate more specific statistics.

The top 10 events shed light on how users are interacting with your app. Looking at this list can generate insights about user behavior patterns.

SELECT event_type
,COUNT(*) AS events
FROM main_production.all_events
WHERE time >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10

You can modify the WHERE statement to adjust the date range. You can include the date in the SELECT statement and GROUP BY to see how the top events change over time.

This query calculates session-to-session retention analysis on a monthly basis. Each user is segmented into a cohort based on the month of their join date and is counted in the total retained on a monthly basis if they have a session during each month.

WITH first_session AS (
SELECT user_id
, DATE_TRUNC('month', MIN(time)) AS cohort_month
FROM main_production.sessions
GROUP BY 1
),
months AS (
SELECT DISTINCT DATE_TRUNC('month', time) AS session_month
FROM main_production.sessions
),
user_sessions AS (
SELECT s.user_id
, DATE_TRUNC('month', s.time) AS session_month
, f.cohort_month
FROM main_production.sessions s
JOIN first_session f ON s.user_id = f.user_id
GROUP BY 1, 2, 3
)
SELECT cohort_month
, session_month
, COUNT(DISTINCT user_id) AS active_users
, MAX(COUNT(DISTINCT user_id)) OVER (PARTITION BY cohort_month) AS total_users
, COUNT(DISTINCT user_id)::FLOAT / MAX(COUNT(DISTINCT user_id)) OVER (PARTITION BY cohort_month) AS retention_rate
FROM user_sessions
GROUP BY 1, 2
ORDER BY 1, 2
WITH data AS (
SELECT time::date AS day, session_id, count(*) AS pageviews
FROM heap_production.pageviews
GROUP BY 1, 2
)
SELECT day, count(CASE WHEN pageviews = 1 THEN 1 ELSE NULL END) / count(*) AS bounce_rate
FROM data
GROUP BY 1
ORDER BY 1

This section covers Data Connect queries that enable you to conduct funnel analysis by measuring all types of funnels. Funnels are fundamental to understanding any flow through your site, from onboarding to filling out forms, to virtually any step users are expected to take within your product.

WITH funnel AS (
SELECT user_id
, MIN(CASE WHEN event_name = 'view_signup_page' THEN time END) AS view_signup
, MIN(CASE WHEN event_name = 'submit_signup_form' THEN time END) AS submit_signup
, MIN(CASE WHEN event_name = 'complete_onboarding' THEN time END) AS complete_onboarding
FROM main_production.all_events
WHERE time >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1
)
SELECT COUNT(DISTINCT user_id) AS total_users
,COUNT(DISTINCT CASE WHEN view_signup IS NOT NULL THEN user_id END) AS viewed_signup
,COUNT(DISTINCT CASE WHEN submit_signup IS NOT NULL THEN user_id END) AS submitted_signup
,COUNT(DISTINCT CASE WHEN complete_onboarding IS NOT NULL THEN user_id END) AS completed_onboarding
FROM funnel
WITH event_times AS (
SELECT user_id
,MIN(CASE WHEN event_name = 'first_event' THEN time END) AS first_event_time
,MIN(CASE WHEN event_name = 'second_event' THEN time END) AS second_event_time
FROM main_production.all_events
GROUP BY 1
)
SELECT AVG(EXTRACT(EPOCH FROM (second_event_time - first_event_time))) / 60 AS avg_minutes_between_events
FROM event_times
WHERE first_event_time IS NOT NULL
AND second_event_time IS NOT NULL
AND second_event_time > first_event_time

This section covers Data Connect queries that enable you to measure attribution to better understand where your users are coming from.

This query returns the top N referrers and the number of sessions they generate.

SELECT referrer
,COUNT(*) AS sessions
FROM main_production.sessions
WHERE referrer IS NOT NULL
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10

This query generates a table that matches Contentsquare’s Users view. You can export and select all the user-level properties and all the first touch properties.

This query starts by calculating the first session for each user and joins the first sessions data with the users table based on the unique user ID. From here, each user-level property and session-level property can be selected.

SELECT u.*
,s.time AS date_first_seen
,s.referrer AS initial_referrer
,s.utm_source AS initial_utm_source
,s.utm_medium AS initial_utm_medium
,s.utm_campaign AS initial_utm_campaign
FROM main_production.users u
JOIN (SELECT user_id
,MIN(time) AS earliest_time
FROM main_production.sessions
GROUP BY 1) fs
ON u.user_id = fs.user_id
JOIN main_production.sessions s
ON fs.user_id = s.user_id
AND fs.earliest_time = s.time

One of the most powerful aspects of Data Connect is the ability to join a complete set of user interactions with other data sources to see revenue data, such as a user’s order history.

This query combines Contentsquare’s attribution data with order history data, allowing you to compare how each source is correlated with a user’s downstream behavior.

WITH first_touch AS (
SELECT all_sessions.user_id
,identity
,first_session.first_time AS first_seen
,utm_source
-- add any user level properties or session level properties here
FROM main_production.sessions AS all_sessions
JOIN (SELECT user_id
,MIN(time) AS first_time
FROM main_production.sessions
GROUP BY 1) AS first_session
ON all_sessions.user_id = first_session.user_id
AND all_sessions.time = first_session.first_time
)
SELECT utm_source
,COUNT(DISTINCT identity) AS users
,SUM(payment_amount) AS revenue
,SUM(payment_amount) / COUNT(DISTINCT identity) AS arpu
FROM first_touch
LEFT JOIN main_production.orders
ON first_touch.identity = main_production.orders.identity
WHERE utm_source IS NOT NULL
GROUP BY 1
ORDER BY 4 DESC

This query calculates the average time spent on each page, helping you identify which content is most engaging for your users.

SELECT path
,AVG(EXTRACT(EPOCH FROM (next_time - time))) AS avg_seconds_on_page
FROM (
SELECT p1.session_id
,p1.user_id
,p1.path
,p1.time
,p2.time AS next_time
,p2.path AS next_path
FROM (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY time) AS rn
FROM main_production.pageviews) p1
JOIN (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY session_id ORDER BY time) AS rn
FROM main_production.pageviews) p2
ON p1.session_id = p2.session_id
AND p1.rn + 1 = p2.rn
-- optional filter to limit paths you're analyzing
-- WHERE p1.path !~ '/app'
GROUP BY p1.session_id,
p1.user_id,
p1.path,
p1.time,
p2.time) a
GROUP BY 1
ORDER BY 2 DESC

The average events per session metric can give you insight into aggregate behavior. This query can be modified to analyze behavior across segments and over time, allowing you to answer questions such as:

  • Does this number change with a product launch or tweaks to the onboarding flow?
  • Do different types of users interact differently?

Changes in the average can prompt investigation into what is really happening within your app.

SELECT COUNT(*) / COUNT(DISTINCT session_id) AS avg_events_per_session
FROM main_production.all_events
WHERE time >= CURRENT_DATE - INTERVAL '30 days'