Static asset scraping
Contentsquare uses two complementary methods to collect static resources for Session Replay and Zoning: Static Resource Scraper and Static Resource Manager (SRM).
The Scraper is the primary and always-active collector, but requires configuration to access your resources. SRM can then be configured to collect any resources the Scraper can’t access.
Static Resource Scraper
The Scraper runs automatically, downloading publicly accessible resources (CSS, images, fonts) from your website asynchronously:
- Operates every 6 hours for frequently accessed pages
- Falls back to live scraping when resources aren’t cached
- Requires IP allowlisting or custom header configuration
- Stores resources securely for 13 months
Limitations:
- Assets must be accessible from Contentsquare’s IP addresses.
- Assets behind authentication or VPN are not accessible.
- Resources that change frequently (cache-busted URLs) may become outdated.
Static Resource Manager (SRM)
For resources the scraper can’t access, SRM is the backup collection method.
When SRM is needed:
- Resources behind authentication or login
- VPN-protected or private network resources
- Assets requiring specific user cookies or headers
- Dynamically generated or user-context-dependent content
- Base64-encoded data URLs
How SRM works:
- The user’s browser downloads the resource during the user session
- The tag computes a unique hash and checks if Contentsquare already has it
- If new, the tag uploads the resource directly to Contentsquare servers
- Resources are deduplicated across sessions using hash-based identification
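The hash-then-upload flow in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the hash function used by the tag is not specified here, so SHA-256 is assumed, and `knownHashes`, `uploads`, `hashResource`, and `collectResource` are hypothetical names standing in for Contentsquare's internal storage and upload logic.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical stand-ins for the server-side set of known hashes and the upload queue.
const knownHashes = new Set();
const uploads = [];

// Assumption: SHA-256 over the raw bytes. The real tag may use a different hash.
function hashResource(bytes) {
  return createHash("sha256").update(bytes).digest("hex");
}

// Upload the resource only when its hash has not been seen before.
function collectResource(url, bytes) {
  const hash = hashResource(bytes);
  if (knownHashes.has(hash)) {
    return { hash, uploaded: false }; // already stored: skip the upload
  }
  knownHashes.add(hash);
  uploads.push({ url, hash });
  return { hash, uploaded: true };
}

// The same bytes served from two URLs are stored once.
const css = Buffer.from("body { color: red; }");
const first = collectResource("https://example.com/a.css", css);
const second = collectResource("https://example.com/copy-of-a.css", css);
console.log(first.uploaded, second.uploaded); // true false
```

Keying on the content hash rather than the URL is what lets identical resources be shared across sessions, even when they are served from different addresses.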
Enabling Static Resource Scraper
To enable the Scraper, allow Contentsquare to access your assets using one of the following:
Allowing IP addresses (recommended)
The Scraper needs explicit permission to access your website through your proxy, firewall, or server configuration. IP allowlisting is the recommended method as it’s simpler to implement and maintain.
Why this is needed: Your firewall or security systems may block automated requests. By allowlisting Contentsquare’s IP addresses, you allow our Scraper to download static resources without triggering security alerts or rate limiting.
Allow ports 80 (HTTP) and 443 (HTTPS) and the following IP addresses to prevent your proxy, firewall, or server configuration from blocking the scraper.
- 52.18.162.157
- 20.75.90.236
- 100.24.76.90
- 34.192.98.148
- 20.67.250.109
- 54.247.44.196
- 52.51.9.12
- 35.72.153.38
- 35.73.99.41
- 34.192.240.128

Using a custom static header (alternative)
If you cannot allowlist by IP, you can use a custom header-based approach.
Security benefit: This method allows you to verify requests are legitimately from Contentsquare by validating the custom header value.
When to use this approach:
- IP allowlisting is restricted by security policy
- You need more granular control over scraper access
- You want to monitor and log scraper requests separately
When you select this option, Contentsquare adds a custom header to the project settings.
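{
  "headers": {
    "my-new-header-key": "myKeyValue"
  }
}

You can then validate that scraper requests contain the specific header and value.

// Value received in the custom header of the incoming request
const receivedHeaderExample = "myKeyValue";
// Value configured in your Contentsquare project settings
const CONTENTSQUARE_CUSTOM_HEADER = "myKeyValue";

if (receivedHeaderExample === CONTENTSQUARE_CUSTOM_HEADER) {
  // The request carries the expected header: treat it as coming from the Scraper
}

Validation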
To verify scraping is working correctly:
- Check that your Session Replays render with correct styling and images
- Review server logs for successful requests from Contentsquare IPs
Contact support if replays show missing resources after configuration.
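The server log review can be partly automated. The sketch below counts successful and failed requests coming from Contentsquare IPs. It assumes access logs in Common Log Format, which may not match your server, and it uses a shortened IP set for brevity. `summarizeScraperRequests` and the sample log lines are hypothetical.

```javascript
// Shortened for the example: use the full IP list from the allowlisting section.
const CONTENTSQUARE_IPS = new Set(["52.18.162.157", "20.75.90.236"]);

// Assumption: Common Log Format, for example:
// 52.18.162.157 - - [10/Oct/2024:13:55:36 +0000] "GET /main.css HTTP/1.1" 200 2326
const LOG_LINE = /^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})/;

function summarizeScraperRequests(lines) {
  const summary = { ok: 0, failed: [] };
  for (const line of lines) {
    const match = LOG_LINE.exec(line);
    if (!match) continue; // skip lines in another format
    const [, ip, , path, status] = match;
    if (!CONTENTSQUARE_IPS.has(ip)) continue; // not a scraper request
    if (status.startsWith("2") || status === "304") {
      summary.ok += 1;
    } else {
      summary.failed.push({ path, status: Number(status) });
    }
  }
  return summary;
}

// Made-up log lines for illustration only.
const sampleLog = [
  '52.18.162.157 - - [10/Oct/2024:13:55:36 +0000] "GET /main.css HTTP/1.1" 200 2326',
  '20.75.90.236 - - [10/Oct/2024:13:55:37 +0000] "GET /font.woff2 HTTP/1.1" 403 199',
  '203.0.113.7 - - [10/Oct/2024:13:55:38 +0000] "GET /main.css HTTP/1.1" 200 2326',
];
const result = summarizeScraperRequests(sampleLog);
console.log(result.ok, result.failed);
```

A 403 or 429 from a Contentsquare IP usually suggests that a firewall or rate limiter is still blocking the Scraper for that path.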
Enabling Static Resource Manager (SRM)
The Static Resource Manager is a complementary method where the Contentsquare tag collects resources directly from the visitor’s browser and uploads them to Contentsquare servers.
SRM handles two types of resources:
- Data URLs (base64 images): Collected automatically — no implementation required
- Remote resources: Must be explicitly enabled for each pageview using a tag command
Used for:
- Assets behind authentication or VPN
- Assets requiring specific cookies or session data
- Dynamically generated or user-specific content
Limitations:
- Missing or restrictive cache policies: If resources don’t have a Cache-Control header that allows browser caching, SRM makes additional requests to the resource server for every session. This also occurs when resources return 404 errors (for example, when the page references assets the server can’t find).
- CORS restrictions: Resources with incompatible Cross-Origin Resource Sharing (CORS) policies can’t be fetched by the browser, preventing SRM from collecting them.
- Lazy loading not respected: SRM downloads every resource it finds as soon as it discovers it in the DOM or in the CSS, regardless of lazy loading implementation. This may defeat your lazy loading strategy and cause all resources to load immediately. If your site relies on lazy loading for performance, avoid enabling online resource collection with SRM.
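Before enabling SRM, it can help to check whether your resources are likely to hit the cache and CORS limitations above. The sketch below inspects response headers and returns warnings. The exact rules SRM applies are not documented here, so the checks are assumptions: `no-store`, `no-cache`, and `max-age=0` are treated as preventing caching, and a cross-origin resource is treated as fetchable only when `Access-Control-Allow-Origin` is `*` or the page origin. `auditResourceHeaders` is a hypothetical helper, and header names are expected in lowercase.

```javascript
// Returns a list of warnings for one resource, based on its response headers.
// The rules below are assumptions for illustration, not Contentsquare's actual logic.
function auditResourceHeaders(resourceUrl, pageOrigin, headers) {
  const warnings = [];

  const cacheControl = (headers["cache-control"] || "").toLowerCase();
  if (!cacheControl) {
    warnings.push("missing Cache-Control");
  } else if (/no-store|no-cache|max-age=0(?!\d)/.test(cacheControl)) {
    warnings.push("Cache-Control prevents browser caching");
  }

  // CORS only matters when the resource lives on another origin than the page.
  const crossOrigin = new URL(resourceUrl).origin !== pageOrigin;
  const allowOrigin = headers["access-control-allow-origin"];
  if (crossOrigin && allowOrigin !== "*" && allowOrigin !== pageOrigin) {
    warnings.push("CORS policy may block the fetch");
  }
  return warnings;
}

const page = "https://www.example.com";
const okResult = auditResourceHeaders("https://cdn.example.com/app.css", page, {
  "cache-control": "public, max-age=86400",
  "access-control-allow-origin": "*",
});
const badResult = auditResourceHeaders("https://cdn.example.com/font.woff2", page, {
  "cache-control": "no-store",
});
console.log(okResult, badResult);
```

The headers can be taken from the Network panel of your browser's developer tools or from a HEAD request against each resource.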
Data URLs (base64 images)
When SRM is enabled for your project, the Contentsquare tag automatically collects base64-encoded data URLs found in the DOM. This ensures that embedded images appear correctly in Session Replay.
No additional implementation is required for data URLs.
Remote resources (authenticated assets)
For remote resources that are behind authentication or not accessible to the Scraper, you must explicitly enable online asset collection for each pageview using a tag command.
Push the following command before the pageview starts:
window._uxa.push(["replay:resourceManager:enableForOnlineResource:nextPageviewOnly"]);

Example with trackPageview:

// Enable SRM for the next pageview
window._uxa.push(["replay:resourceManager:enableForOnlineResource:nextPageviewOnly"]);
// Then track the pageview
window._uxa.push(["trackPageview", location.pathname + location.search + location.hash]);

How to check if SRM is enabled for your project
To check the current state of the Static Resource Manager, use the following command:

window._uxa.push(["replay:resourceManager:getStatus"]);

This returns:

{
  isStarted: true,
  onlineAssets: {
    activated: false,
    enabledOnNextPageview: false,
    enabledForChildrenOnNextStart: false
  }
}

| Property | True if |
|---|---|
| isStarted | SRM is enabled for the project and the browser supports it |
| onlineAssets.activated | The current pageview has online resources collection enabled |
| onlineAssets.enabledOnNextPageview | The next pageview will have online resources collection enabled |
| onlineAssets.enabledForChildrenOnNextStart | New iframes will have online resources collection enabled |
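The status object can be used to decide whether the enable command still needs to be pushed. The sketch below is one possible approach, not an official pattern: `shouldEnableOnlineResources` is a hypothetical helper, and `fakeUxa` is a stand-in for `window._uxa` that assumes the status command returns the object shown above once the tag has loaded.

```javascript
// Decide whether the enable command still needs to be pushed for the next pageview.
function shouldEnableOnlineResources(status) {
  if (!status || !status.isStarted) return false; // SRM disabled or unsupported browser
  return !status.onlineAssets.enabledOnNextPageview; // avoid pushing the command twice
}

// Stand-in for window._uxa after the tag has loaded (assumption for this demo).
const fakeUxa = {
  commands: [],
  push(command) {
    this.commands.push(command[0]);
    if (command[0] === "replay:resourceManager:getStatus") {
      return {
        isStarted: true,
        onlineAssets: {
          activated: false,
          enabledOnNextPageview: false,
          enabledForChildrenOnNextStart: false,
        },
      };
    }
    return undefined;
  },
};

const srmStatus = fakeUxa.push(["replay:resourceManager:getStatus"]);
if (shouldEnableOnlineResources(srmStatus)) {
  fakeUxa.push(["replay:resourceManager:enableForOnlineResource:nextPageviewOnly"]);
}
console.log(fakeUxa.commands);
```

In a real page, `fakeUxa` would be replaced by `window._uxa`. Note that `onlineAssets.activated` only describes the current pageview, so the check relies on `enabledOnNextPageview` instead.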
Troubleshooting
If resources are not appearing in replays, check these common issues:
- IP not allowlisted: Verify all Contentsquare IPs are allowed on ports 80 and 443
- Authentication required: Resources behind login cannot be scraped automatically
  - Solution: Consider using SRM for authenticated resources (contact support)
- VPN or private network: Resources on internal networks are not accessible
  - Solution: Use SRM for internal resources (contact support)
- Geographical restrictions: Resources with geo-blocking may not be accessible
  - Solution: Configure region-specific proxy settings (contact support)
- Rate limiting: Your server may be throttling scraper requests
  - Solution: Allowlist our IPs in your rate limiting configuration