Static asset scraping

Contentsquare uses two complementary methods to collect static resources for Session Replay and Zoning: Static Resource Scraper and Static Resource Manager (SRM).

The Scraper is the primary and always-active collector, but requires configuration to access your resources. SRM can then be configured to collect any resources the Scraper can’t access.

The Scraper runs automatically and asynchronously downloads publicly accessible resources (CSS, images, fonts) from your website:

  • Runs every 6 hours for frequently accessed pages
  • Falls back to live scraping when resources aren’t cached
  • Requires IP allowlisting or custom header configuration
  • Stores resources securely for 13 months

Limitations:

  • Assets must be accessible from Contentsquare’s IP addresses.
  • Assets behind authentication or VPN are not accessible.
  • Resources that change frequently (cache-busted URLs) may become outdated.

For resources the scraper can’t access, SRM is the backup collection method.

When SRM is needed:

  • Resources behind authentication or login
  • VPN-protected or private network resources
  • Assets requiring specific user cookies or headers
  • Dynamically generated or user-context-dependent content
  • Base64-encoded data URLs

How SRM works:

  • The user’s browser downloads the resource during the user session
  • The tag computes a unique hash and checks if Contentsquare already has it
  • If new, the tag uploads the resource directly to Contentsquare servers
  • Resources are deduplicated across sessions using hash-based identification
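
The hash-then-upload flow above can be sketched as follows. This is an illustration only: the hash algorithm used by the tag is not stated here, so SHA-256 is an assumption, and `knownHashes`, `uploaded`, `hashResource`, and `collectResource` are hypothetical names. The sketch runs in Node.js rather than in the browser to keep it self-contained.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical stand-in for the set of hashes Contentsquare already stores.
const knownHashes = new Set();
// Hypothetical stand-in for the upload endpoint; records what would be sent.
const uploaded = [];

// Hash the resource bytes. SHA-256 is an assumption made for illustration.
function hashResource(bytes) {
  return createHash("sha256").update(bytes).digest("hex");
}

// Upload a resource only if its hash has not been seen before.
// Returns true when an upload happened, false when it was deduplicated.
function collectResource(bytes) {
  const hash = hashResource(bytes);
  if (knownHashes.has(hash)) {
    return false;
  }
  knownHashes.add(hash);
  uploaded.push({ hash, size: bytes.length });
  return true;
}

const css = Buffer.from("body { color: black; }");
console.log(collectResource(css)); // true: first session uploads the resource
console.log(collectResource(css)); // false: later sessions skip the duplicate
```

Because identical bytes always produce the same hash, a stylesheet shared by many sessions is uploaded once, whichever session encounters it first.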

Enabling Static Resource Scraper

To enable the Scraper, allow Contentsquare to access your assets using one of the following:

Allowing IP addresses (recommended)

The Scraper needs explicit permission to access your website through your proxy, firewall, or server configuration. IP allowlisting is the recommended method as it’s simpler to implement and maintain.

Why this is needed: Your firewall or security systems may block automated requests. By allowlisting Contentsquare’s IP addresses, you allow our Scraper to download static resources without triggering security alerts or rate limiting.

Allow ports 80 (HTTP) and 443 (HTTPS) and the following IP addresses to prevent your proxy, firewall, or server configuration from blocking the scraper.

52.18.162.157
20.75.90.236
100.24.76.90
34.192.98.148
20.67.250.109
54.247.44.196
52.51.9.12
35.72.153.38
35.73.99.41
34.192.240.128
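
If you enforce the allowlist in application code rather than at the firewall, a check along these lines is one option. It is a minimal sketch: `CONTENTSQUARE_SCRAPER_IPS` and `isScraperIp` are hypothetical names, and it assumes your server sees the real client address (behind a reverse proxy you would need to read the address your proxy forwards instead).

```javascript
// IP addresses copied from the list above.
const CONTENTSQUARE_SCRAPER_IPS = new Set([
  "52.18.162.157",
  "20.75.90.236",
  "100.24.76.90",
  "34.192.98.148",
  "20.67.250.109",
  "54.247.44.196",
  "52.51.9.12",
  "35.72.153.38",
  "35.73.99.41",
  "34.192.240.128",
]);

// Returns true when the remote address belongs to the Scraper.
// IPv4 clients on dual-stack sockets can appear as "::ffff:a.b.c.d",
// so that prefix is stripped before the lookup.
function isScraperIp(remoteAddress) {
  const ip = remoteAddress.replace(/^::ffff:/i, "");
  return CONTENTSQUARE_SCRAPER_IPS.has(ip);
}

console.log(isScraperIp("52.18.162.157")); // true
console.log(isScraperIp("203.0.113.7")); // false
```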

Using a custom static header (alternative)

If you cannot allowlist by IP, you can use a custom header-based approach.

Security benefit: This method allows you to verify requests are legitimately from Contentsquare by validating the custom header value.

When to use this approach:

  • IP allowlisting is restricted by security policy
  • You need more granular control over scraper access
  • You want to monitor and log scraper requests separately

When you select this option, Contentsquare adds a custom header to the project settings.

{
  "headers": {
    "my-new-header-key": "myKeyValue"
  }
}

You can then validate that scraper requests contain the specific header and value.

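// Value read from the custom header of the incoming request
const receivedHeaderExample = "myKeyValue";
// Value configured for your project
const CONTENTSQUARE_CUSTOM_HEADER = "myKeyValue";
if (receivedHeaderExample === CONTENTSQUARE_CUSTOM_HEADER) {
  // The request carries the expected header: allow it through
}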
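
On a Node.js server, the same check could look like the sketch below. The header name and value are the placeholders from the example above, and `hasValidScraperHeader` is a hypothetical helper, not part of any Contentsquare library. It uses a constant-time comparison, which is a design choice on our side rather than a documented requirement.

```javascript
const { timingSafeEqual } = require("node:crypto");

// Placeholder name and value from the example above; replace them with the
// ones configured for your project.
const HEADER_NAME = "my-new-header-key";
const HEADER_VALUE = "myKeyValue";

// Expects a headers object with lowercase names, as exposed by Node's
// http module on incoming requests (req.headers).
function hasValidScraperHeader(headers) {
  const received = headers[HEADER_NAME];
  if (typeof received !== "string") {
    return false; // header missing or sent more than once
  }
  const a = Buffer.from(received);
  const b = Buffer.from(HEADER_VALUE);
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}

console.log(hasValidScraperHeader({ "my-new-header-key": "myKeyValue" })); // true
console.log(hasValidScraperHeader({ "my-new-header-key": "wrong" })); // false
```

Inside a request handler you would call `hasValidScraperHeader(req.headers)` and apply your normal blocking rules when it returns false.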

To verify scraping is working correctly:

  1. Check that your Session Replays render with correct styling and images
  2. Review server logs for successful requests from Contentsquare IPs

Contact support if replays show missing resources after configuration.

Enabling Static Resource Manager (SRM)

The Static Resource Manager is a complementary method where the Contentsquare tag collects resources directly from the visitor’s browser and uploads them to Contentsquare servers.

SRM handles two types of resources:

  • Data URLs (base64 images): Collected automatically — no implementation required
  • Remote resources: Must be explicitly enabled for each pageview using a tag command

Used for:

  • Assets behind authentication or VPN
  • Assets requiring specific cookies or session data
  • Dynamically generated or user-specific content

Limitations:

  • Missing or restrictive cache policies: If resources don’t have a Cache-Control header that allows browser caching, SRM makes additional requests to the resource server for every session. This also occurs when resources return 404 errors (for example, when the page references assets the server can’t find).
  • CORS restrictions: Resources with incompatible Cross-Origin Resource Sharing (CORS) policies can’t be fetched by the browser, preventing SRM from collecting them.
  • Lazy loading not respected: SRM downloads every resource it finds as soon as it discovers it in the DOM or in the CSS, regardless of lazy loading implementation. This may defeat your lazy loading strategy and cause all resources to load immediately. If your site relies on lazy loading for performance, avoid enabling online resource collection with SRM.
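
To spot resources affected by the cache policy limitation above, you could inspect their Cache-Control values with a rough heuristic like this one. It is a simplified sketch, not how SRM itself decides: `allowsBrowserCaching` is a hypothetical helper, it treats a missing `max-age` as not cacheable (a conservative assumption), and it ignores `Expires` and heuristic freshness.

```javascript
// Heuristic: does this Cache-Control value let a browser reuse the response
// without contacting the server again?
function allowsBrowserCaching(cacheControl) {
  if (!cacheControl) {
    return false; // header missing
  }
  const directives = cacheControl
    .toLowerCase()
    .split(",")
    .map((directive) => directive.trim());
  if (directives.includes("no-store") || directives.includes("no-cache")) {
    return false; // caching forbidden, or revalidation required on every use
  }
  const maxAge = directives.find((directive) => directive.startsWith("max-age="));
  if (maxAge === undefined) {
    return false; // no explicit freshness lifetime (conservative assumption)
  }
  return Number(maxAge.slice("max-age=".length)) > 0;
}

console.log(allowsBrowserCaching("public, max-age=31536000")); // true
console.log(allowsBrowserCaching("no-store")); // false
console.log(allowsBrowserCaching(undefined)); // false
```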

When SRM is enabled for your project, the Contentsquare tag automatically collects base64-encoded data URLs found in the DOM. This ensures that embedded images appear correctly in Session Replay.

No additional implementation is required for data URLs.

Remote resources (authenticated assets)

For remote resources that are behind authentication or not accessible to the Scraper, you must explicitly enable online asset collection for each pageview using a tag command.

Push the following command before the pageview starts:

window._uxa.push(["replay:resourceManager:enableForOnlineResource:nextPageviewOnly"]);

Example with trackPageview:

// Enable SRM for the next pageview
window._uxa.push(["replay:resourceManager:enableForOnlineResource:nextPageviewOnly"]);
// Then track the pageview
window._uxa.push(["trackPageview", location.pathname + location.search + location.hash]);
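
If only some pages need online resource collection, the two commands can be wrapped in a small helper. This is a sketch under assumptions: `trackPageview` and its `collectOnlineResources` option are hypothetical names, and it assumes `_uxa` is a plain command queue that can be created as an empty array before the tag loads.

```javascript
// Hypothetical helper: queue a pageview, enabling online resource
// collection first when the page needs it.
function trackPageview(win, path, { collectOnlineResources = false } = {}) {
  // Create the command queue if the tag has not loaded yet (assumption).
  win._uxa = win._uxa || [];
  if (collectOnlineResources) {
    // Must be pushed before the pageview it applies to.
    win._uxa.push([
      "replay:resourceManager:enableForOnlineResource:nextPageviewOnly",
    ]);
  }
  win._uxa.push(["trackPageview", path]);
}

// In the browser, for a page behind login:
// trackPageview(
//   window,
//   location.pathname + location.search + location.hash,
//   { collectOnlineResources: true }
// );
```

Passing the window object in explicitly keeps the ordering logic easy to exercise outside a browser.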

How to check if SRM is enabled for your project

To check the current state of the Static Resource Manager, use the following command:

window._uxa.push(["replay:resourceManager:getStatus"]);

This returns:

{
  isStarted: true,
  onlineAssets: {
    activated: false,
    enabledOnNextPageview: false,
    enabledForChildrenOnNextStart: false
  }
}

Property | True if
isStarted | SRM is enabled for the project and the browser supports it
onlineAssets.activated | The current pageview has online resources collection enabled
onlineAssets.enabledOnNextPageview | The next pageview will have online resources collection enabled
onlineAssets.enabledForChildrenOnNextStart | New iframes will have online resources collection enabled
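
When debugging, a status object of this shape can be turned into a readable summary. The sketch below assumes only the properties documented above; `describeSrmStatus` and its messages are hypothetical.

```javascript
// Summarize a status object shaped like the one documented above.
function describeSrmStatus(status) {
  if (!status || !status.isStarted) {
    return "SRM is not running (disabled for the project or unsupported browser)";
  }
  const online = status.onlineAssets || {};
  if (online.activated) {
    return "Online resources are collected on the current pageview";
  }
  if (online.enabledOnNextPageview) {
    return "Online resources will be collected on the next pageview";
  }
  return "SRM is running, but online resource collection is not enabled";
}

console.log(
  describeSrmStatus({
    isStarted: true,
    onlineAssets: {
      activated: false,
      enabledOnNextPageview: false,
      enabledForChildrenOnNextStart: false,
    },
  })
); // SRM is running, but online resource collection is not enabled
```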

If resources are not appearing in Replays, check these common issues:

  1. IP not allowlisted: Verify all Contentsquare IPs are allowed on ports 80 and 443
  2. Authentication required: Resources behind login cannot be scraped automatically
    • Solution: Consider using SRM for authenticated resources (contact support)
  3. VPN or private network: Resources on internal networks are not accessible
    • Solution: Use SRM for internal resources (contact support)
  4. Geographical restrictions: Resources with geo-blocking may not be accessible
    • Solution: Configure region-specific proxy settings (contact support)
  5. Rate limiting: Your server may be throttling scraper requests
    • Solution: Allowlist our IPs in your rate limiting configuration