Static assets scraping

In the context of Zoning and Session Replays, Contentsquare fetches static assets on your website.

To allow Contentsquare to fetch these assets, select one of the following options:

  1. Allowlist Contentsquare IP addresses
  2. Use a static header to validate requests
  3. Use a dynamic signature header to validate requests

To allowlist Contentsquare IP addresses, allow ports 80 (HTTP) and 443 (HTTPS) and the following IP addresses so that your proxy, firewall, or server configuration does not block the scraper.

52.18.162.157
20.75.90.236
100.24.76.90
34.192.98.148
20.67.250.109
54.247.44.196
52.51.9.12
35.72.153.38
35.73.99.41
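
If the allowlist is enforced in application code rather than at the firewall, the check could look like the sketch below. The names `CONTENTSQUARE_IPS` and `isContentsquareIp` are hypothetical, and the sketch assumes the server sees the scraper's address directly; behind a reverse proxy, the client address would have to come from a trusted forwarding header instead.

```javascript
// Contentsquare scraper IP addresses, copied from the list above.
const CONTENTSQUARE_IPS = new Set([
  '52.18.162.157',
  '20.75.90.236',
  '100.24.76.90',
  '34.192.98.148',
  '20.67.250.109',
  '54.247.44.196',
  '52.51.9.12',
  '35.72.153.38',
  '35.73.99.41',
]);

// Hypothetical helper: returns true when the remote address is allowlisted.
// Node.js can report IPv4 clients as IPv4-mapped IPv6 ('::ffff:1.2.3.4'),
// so that prefix is stripped before the lookup.
function isContentsquareIp(remoteAddress) {
  const ip = String(remoteAddress).replace(/^::ffff:/i, '');
  return CONTENTSQUARE_IPS.has(ip);
}
```

In a Node.js HTTP server, `req.socket.remoteAddress` would be the usual value to pass in.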

When you select the static header option, Contentsquare adds a custom header to the project settings.

{
  "headers": {
    "my-new-header-key": "myKeyValue"
  }
}

You can then validate that scraper requests contain the specific header and value.

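// Value of the custom header as received on the incoming request
const receivedHeaderExample = 'myKeyValue';
// Value configured in the Contentsquare project settings
const CONTENTSQUARE_CUSTOM_HEADER = 'myKeyValue';
if (receivedHeaderExample === CONTENTSQUARE_CUSTOM_HEADER) {
  // Code to validate request
}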
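
The same check can be wrapped in a helper that reads the header from a request's headers object. This is a minimal sketch, assuming a Node.js server (where incoming header names in `req.headers` are lowercased) and the example header name and value shown above; `hasContentsquareHeader` is a hypothetical name, and the constant-time comparison is an optional hardening step rather than a Contentsquare requirement.

```javascript
const crypto = require('crypto');

// Example header name and value from the project settings shown above.
const HEADER_NAME = 'my-new-header-key';
const HEADER_VALUE = 'myKeyValue';

// Hypothetical helper: returns true when the headers object carries
// the expected static header with the expected value.
function hasContentsquareHeader(headers) {
  const received = headers[HEADER_NAME.toLowerCase()];
  if (typeof received !== 'string') return false;
  const a = Buffer.from(received);
  const b = Buffer.from(HEADER_VALUE);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```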

Using a custom dynamic signature header

When you select this option, Contentsquare adds the X-CONTENTSQUARE-SIGNATURE header to incoming requests from the scraper.

The X-CONTENTSQUARE-SIGNATURE header is a string generated in this format:

<TIMESTAMP>-base64(hmac('sha256', <SECRET>, <RESOURCE_DOMAIN>-<TIMESTAMP>))

with:

  • <TIMESTAMP>: the time at which the request was sent, as returned by Date.now() (milliseconds since the Unix epoch),
  • <RESOURCE_DOMAIN>: the complete domain hosting the resource on your website,
  • <SECRET>: the secret shared between you and Contentsquare for the project, generated at project creation.
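
To make the format concrete, the sketch below builds a signature string from its three inputs using Node.js's built-in `crypto` module. The function name `buildSignature` and the sample secret and domain are hypothetical; this illustrates the format described above and is not Contentsquare's actual implementation.

```javascript
const crypto = require('crypto');

// Builds '<TIMESTAMP>-base64(hmac('sha256', <SECRET>, <RESOURCE_DOMAIN>-<TIMESTAMP>))'.
function buildSignature(secret, resourceDomain, timestamp = Date.now()) {
  const digest = crypto
    .createHmac('sha256', secret)               // HMAC-SHA256 keyed with the shared secret
    .update(`${resourceDomain}-${timestamp}`)   // signed data: '<RESOURCE_DOMAIN>-<TIMESTAMP>'
    .digest('base64');
  return `${timestamp}-${digest}`;
}

// Hypothetical inputs, for illustration only.
const exampleSignature = buildSignature('example-secret', 'www.example.com', 1700000000000);
console.log(exampleSignature);
```

Because the timestamp is part of the signed data, a signature is only valid for the exact timestamp it carries.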

With a secret of abcde, the Contentsquare scraper service emitted the request below to contentsquare.com on the 6th of August 2020 at 05:39 am, to fetch the official Contentsquare logo.

curl \
  -H 'accept-language: en-GB,en-US;q=0.9,en;q=0.8,fr;q=0.7' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36' \
  -H 'X-CONTENTSQUARE-SIGNATURE: 1596706743675-BxmHtG6vu4CfFlzpHxc0qYOmR0iMajlIvA2B4404qk4=' \
  -X GET 'https://contentsquare.com/wp-content/themes/kps3-contentsquare/public/assets/images/contentsquare-logo.svg?tv=1.3.0'

You can compute the signature and verify it against the value of the X-CONTENTSQUARE-SIGNATURE header by providing:

  • The timestamp from the incoming request (1596706743675),
  • The resource domain (contentsquare.com),
  • The secret provided by Contentsquare (abcde).

const crypto = require('crypto');

const secret = 'abcde'; // Shared secret provided by Contentsquare
const resourceDomain = 'contentsquare.com'; // Domain hosting the requested resource
const receivedHeaderExample = '1596706743675-BxmHtG6vu4CfFlzpHxc0qYOmR0iMajlIvA2B4404qk4=';

// Extract the timestamp and digest from the received header
const [timestamp, receivedDigest] = receivedHeaderExample.split('-');

// Extra security step to make sure the timestamp is not too old
// (the 2020 example timestamp above always fails this check when run today)
const currentTimestamp = Date.now();
if (currentTimestamp - Number(timestamp) > 5 * 60 * 1000) {
  throw new Error('Validation failed. Timestamp signature is older than 5 minutes');
}

// Recreate the string that was used to generate the digest
const dataToSign = `${resourceDomain}-${timestamp}`; // e.g. 'contentsquare.com-1596706743675'

// Create a new digest using the same secret and algorithm
const hmac = crypto.createHmac('sha256', secret);
hmac.update(dataToSign);
const generatedDigest = hmac.digest('base64');

// Compare the newly generated digest with the one received in the header
if (receivedDigest === generatedDigest) {
  console.log('Header is valid');
  // Code to validate request
}
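
For use in a request handler, the same steps can be packaged as a function that returns a boolean instead of throwing. This is a sketch under stated assumptions: `verifySignature` and `MAX_AGE_MS` are hypothetical names, the 5-minute window mirrors the example above, and the constant-time comparison via `crypto.timingSafeEqual` is an optional hardening step, not something Contentsquare requires.

```javascript
const crypto = require('crypto');

const MAX_AGE_MS = 5 * 60 * 1000; // same 5-minute window as the example above

// Hypothetical helper: returns true when `header` carries a fresh, valid signature.
// `now` is injectable so the check can be tested with a fixed clock.
function verifySignature(header, secret, resourceDomain, now = Date.now()) {
  if (typeof header !== 'string') return false;
  const separator = header.indexOf('-');
  if (separator === -1) return false;

  const rawTimestamp = header.slice(0, separator);
  const timestamp = Number(rawTimestamp);
  if (!Number.isFinite(timestamp) || now - timestamp > MAX_AGE_MS) return false;

  // Recompute the HMAC over '<RESOURCE_DOMAIN>-<TIMESTAMP>' as raw bytes.
  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${resourceDomain}-${rawTimestamp}`)
    .digest();
  const received = Buffer.from(header.slice(separator + 1), 'base64');

  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}
```

In a Node.js server, the header value would typically be read from `req.headers['x-contentsquare-signature']`, since incoming header names are lowercased.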