
DOM Scrapers and DataLayer Events

DOM scrapers and the DataLayer Events allowlist let the Chrome extension capture data from JavaScript on the page — useful for marketing analytics data, user information, GTM events, and third-party tool data.

DOM Scrapers

DOM scrapers extract data from JavaScript objects accessible on the browser's window object.

When to Use DOM Scrapers

Capture data from:

  • Google Tag Manager: Extract dataLayer events and variables
  • User Information: Pull user.email or user.id for identification
  • Analytics Tools: Capture data from analytics, gtag, _hsq (HubSpot), or other tracking libraries
  • Custom Objects: Any JavaScript object exposed on window

Creating a DOM Scraper

  1. Open the side panel
  2. Click the DOM Scrapers tab
  3. Click + New DOM Scraper (or use a quick-add template)
  4. Fill in the form:
Field | Description | Example
Event Name | Your tracking name | DataLayer Capture
Pattern Type | URL matching type | Wildcard, Hostname, or Regex
URL Pattern | Where to run | *.example.com/*
Window Object Path | Dot-notation path to object | dataLayer or user.email
Capture Mode | When to capture | On Page Load or On Change (polling)
  5. Check the preview, which shows the current value (if available on the active page)
  6. Click Save DOM Scraper
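
To make the three pattern types concrete, here is a minimal sketch of how a URL could be tested against each one. The matching rules are assumptions, not the extension's documented behavior: Wildcard is assumed to treat * as "any run of characters" and to compare against hostname plus path, Hostname is assumed to be an exact match, and Regex is assumed to run against the full URL. The names PatternType, wildcardToRegExp, and matchesUrl are hypothetical.

```typescript
// Hypothetical pattern types mirroring the "Pattern Type" field.
type PatternType = "wildcard" | "hostname" | "regex";

// Escape regex metacharacters so they match literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumption: "*" matches any run of characters, everything else is literal.
function wildcardToRegExp(pattern: string): RegExp {
  const body = pattern.split("*").map(escapeRegExp).join(".*");
  return new RegExp(`^${body}$`);
}

// Assumption: wildcard patterns are compared against hostname + path,
// hostname patterns must match exactly, regex patterns see the full URL.
function matchesUrl(type: PatternType, pattern: string, url: string): boolean {
  const parsed = new URL(url);
  if (type === "hostname") return parsed.hostname === pattern;
  if (type === "regex") return new RegExp(pattern).test(url);
  return wildcardToRegExp(pattern).test(parsed.hostname + parsed.pathname);
}
```

Under these assumptions, *.example.com/* matches https://shop.example.com/cart but not the bare domain https://example.com/page, because the pattern requires a subdomain before .example.com. If the preview stays empty on a page you expect to match, the URL pattern is worth checking first.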

Quick-Add Templates

Pre-configured for common objects:

  • dataLayer — Google Tag Manager data layer
  • user — User object (common in many apps)
  • user.email — Extract user email for identification
  • HubSpot (_hsq) — HubSpot tracking queue
  • Analytics — Generic analytics object

Window Object Paths

Enter the path without the window. prefix.

Simple paths:

window.dataLayer → "dataLayer"
window.user → "user"
window.analytics → "analytics"
window.__NEXT_DATA__ → "__NEXT_DATA__"

Nested paths:

window.user.email → "user.email"
window.config.apiKey → "config.apiKey"
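
As an illustration of what a dot-notation path means, the sketch below walks a path one segment at a time and returns undefined as soon as a segment is missing. The helper resolvePath and the fakeWindow object are hypothetical stand-ins for the extension's lookup and for the page's window object.

```typescript
// Hypothetical helper: follow a dot-notation path from a root object.
// Returns undefined if any segment along the way does not exist.
function resolvePath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    const isContainer =
      current !== null &&
      (typeof current === "object" || typeof current === "function");
    if (!isContainer) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Stand-in for the page's window object (illustrative data only).
const fakeWindow = {
  dataLayer: [{ event: "Page View" }],
  user: { id: "123", email: "ada@example.com" },
};

const email = resolvePath(fakeWindow, "user.email");    // "ada@example.com"
const missing = resolvePath(fakeWindow, "config.apiKey"); // undefined
```

A path that resolves to undefined corresponds to an empty preview in the form: either the object is not exposed on window, or it has not been created yet when the scraper runs.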

Capture Modes

On Page Load:

  • Captures once when page loads
  • Best for: Static user info, page metadata, initial state
  • Example: User ID, session token, page category

On Change (Polling):

  • Checks the value every 5 seconds
  • Captures when the value changes
  • Best for: Dynamic data, real-time updates
  • Example: Cart items, notification count, dataLayer entries pushed after page load

Event Data Structure

DOM scraper events include:

{
  "event_type": "DataLayer Capture",
  "window_path": "dataLayer",
  "captured_data": "[{\"event\":\"Page View\",\"userId\":\"123\"}]",
  "url": "https://example.com/page",
  "path": "/page",
  "scraper_id": "scraper_abc123"
}
  • captured_data is JSON-stringified (max 10 KB)
  • Circular references are handled gracefully
  • For security, paths are restricted to dot-separated names made of letters, digits, and underscores (as in user.email or _hsq)
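
The serialization rules above can be sketched roughly as follows. This is an assumption-laden illustration, not the extension's code: the "[Circular]" placeholder, the serializeCapture name, and the Serialized shape are all hypothetical. What happens to a value that exceeds 10 KB is not documented here, so the sketch only reports whether the limit is exceeded rather than guessing at truncation.

```typescript
const MAX_BYTES = 10 * 1024; // the 10 KB cap stated above

interface Serialized {
  json: string;
  bytes: number;
  withinLimit: boolean;
}

// Hypothetical serializer: JSON-stringify a value without throwing on cycles.
function serializeCapture(value: unknown, maxBytes: number = MAX_BYTES): Serialized {
  const seen = new WeakSet<object>();
  const json =
    JSON.stringify(value, (_key: string, val: unknown) => {
      if (typeof val === "object" && val !== null) {
        // Simplification: any object seen before is replaced, so repeated
        // (non-circular) references are also written as "[Circular]".
        if (seen.has(val)) return "[Circular]";
        seen.add(val);
      }
      return val;
    }) ?? "null"; // JSON.stringify yields undefined for e.g. functions

  const bytes = new TextEncoder().encode(json).length; // UTF-8 size
  return { json, bytes, withinLimit: bytes <= maxBytes };
}
```

Because captured_data arrives as a string, consumers need to JSON.parse it before reading individual fields.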

Rate Limiting

  • Minimum 5 seconds between captures per scraper
  • Same value won't trigger multiple events (deduplication)
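
The two rules above combine into a simple gate per scraper. The sketch below is one possible way to express them, assuming values are compared in their serialized form and that a capture blocked by the rate limit is simply skipped until a later poll. CaptureGate and shouldCapture are hypothetical names; the clock is passed in so the logic can be checked without waiting.

```typescript
const MIN_INTERVAL_MS = 5000; // minimum gap between captures per scraper

interface GateState {
  lastValue: string;
  lastCaptureAt: number;
}

// Hypothetical gate applying deduplication and rate limiting per scraper.
class CaptureGate {
  private readonly state = new Map<string, GateState>();

  // Returns true when the serialized value should be sent as an event.
  shouldCapture(scraperId: string, serialized: string, nowMs: number): boolean {
    const previous = this.state.get(scraperId);
    if (previous !== undefined) {
      // Deduplication: an unchanged value never triggers a new event.
      if (previous.lastValue === serialized) return false;
      // Rate limit: too soon after the last capture for this scraper.
      if (nowMs - previous.lastCaptureAt < MIN_INTERVAL_MS) return false;
    }
    this.state.set(scraperId, { lastValue: serialized, lastCaptureAt: nowMs });
    return true;
  }
}
```

In On Change mode, a poller would call shouldCapture once per check with the freshly serialized value and Date.now(). State is keyed by scraper ID, so one busy scraper does not delay another.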

DataLayer Events Allowlist

The DataLayer Events section in the DOM Scrapers tab controls which GTM dataLayer.push() events the extension captures in real time. This is separate from DOM scrapers — instead of polling on a schedule, this captures events the instant they are pushed.

By default, dataLayer interception is disabled. The extension captures no dataLayer events until you add at least one event name to the allowlist.

Configuring the Allowlist

  1. Open the side panel
  2. Click the DOM Scrapers tab
  3. Scroll to the DataLayer Events section
  4. Enter an event name (e.g., purchase) in the input field and click Add
  5. Repeat for each event you want to capture

You can also enter multiple events at once as a comma-separated list (e.g., purchase, add_to_cart, generate_lead).

To remove an event, click the x button next to its name.
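
For reference, a comma-separated entry can be thought of as being split into individual names along these lines. The trimming, lowercasing, and de-duplication shown here are assumptions chosen to be consistent with the case-insensitive matching described in this section; parseAllowlist is a hypothetical name.

```typescript
// Hypothetical parser for the allowlist input field.
function parseAllowlist(input: string): string[] {
  const names = input
    .split(",")
    .map((name) => name.trim().toLowerCase()) // matching is case-insensitive
    .filter((name) => name.length > 0);       // ignore empty segments
  return Array.from(new Set(names));           // drop duplicates, keep order
}

const parsed = parseAllowlist("purchase, add_to_cart, generate_lead");
// parsed: ["purchase", "add_to_cart", "generate_lead"]
```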

How It Works

When one or more events are configured:

  1. The extension intercepts dataLayer.push() calls on the current page
  2. Each push is checked against your allowlist (case-insensitive)
  3. Matching events are sent to kenbun as DataLayer: <event_name> events with the full dataLayer entry as metadata
  4. GTM internal events (like gtm.js, gtm.dom, gtm.load) are always excluded automatically

When the allowlist is empty, the extension does not intercept dataLayer pushes at all.
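
The steps above can be sketched as a wrapper around dataLayer.push(). This is an illustration under stated assumptions, not the extension's implementation: the function names and the forward callback are hypothetical, the allowlist is assumed to hold lowercase names, and GTM internal events are assumed to be recognizable by a gtm. prefix (the documentation only lists gtm.js, gtm.dom, and gtm.load as examples).

```typescript
// Read the "event" field from a dataLayer entry, if it has one.
function eventNameOf(entry: unknown): string | undefined {
  if (typeof entry !== "object" || entry === null) return undefined;
  const name = (entry as Record<string, unknown>)["event"];
  return typeof name === "string" ? name : undefined;
}

// Case-insensitive allowlist check; allowlist entries are assumed lowercase.
function isAllowed(entry: unknown, allowlist: ReadonlySet<string>): boolean {
  const name = eventNameOf(entry);
  if (name === undefined) return false;
  const lower = name.toLowerCase();
  if (lower.startsWith("gtm.")) return false; // assumed marker of GTM internals
  return allowlist.has(lower);
}

// Hypothetical interceptor: wraps push() and forwards matching entries.
function interceptPushes(
  dataLayer: unknown[],
  allowlist: ReadonlySet<string>,
  forward: (eventType: string, entry: unknown) => void,
): void {
  if (allowlist.size === 0) return; // empty allowlist: push() is left untouched
  const originalPush = dataLayer.push.bind(dataLayer);
  dataLayer.push = (...entries: unknown[]): number => {
    for (const entry of entries) {
      if (isAllowed(entry, allowlist)) {
        forward(`DataLayer: ${eventNameOf(entry)}`, entry);
      }
    }
    return originalPush(...entries); // the page's own push still runs
  };
}
```

Calling the original push last keeps GTM's behavior intact; the wrapper only observes entries and never modifies or drops them.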

When to Use This

  • Lead generation: Capture generate_lead, sign_up, form_submit events
  • Video engagement: Capture video_start, video_complete events
  • Sites without beacon access: If you cannot add the data-datalayer-events attribute to the beacon script tag, the extension provides the same filtering capability

DataLayer Events vs. DOM Scrapers

Feature | DataLayer Events | DOM Scrapers
What it captures | Individual dataLayer.push() calls | A snapshot of any window object
When it captures | Instantly on each push | On page load or every 5 seconds (polling)
Filtering | By event name (allowlist) | By URL pattern
Best for | Real-time GTM events (purchases, form submits) | Static data (user info, page metadata)

Interaction with the Web Beacon

If the Web Beacon is also installed and has data-datalayer-events configured, both the beacon and the extension can capture dataLayer events. To avoid duplicates, use one approach per site:

  • Beacon (data-datalayer-events attribute): Recommended for production sites where you control the beacon snippet
  • Extension (DataLayer Events allowlist): Useful for sites where you cannot modify the script tag

Troubleshooting

DOM scraper not capturing data

  • Use the preview in the DOM Scraper form to verify the path exists
  • Try switching from "On Page Load" to "On Change" for objects that load asynchronously
  • Check the browser console (F12) for error messages
  • Verify the path syntax: window.user.email → enter user.email (without window.)

DataLayer events not being captured

  • Open the DOM Scrapers tab and check the DataLayer Events section
  • Add at least one event name to the allowlist (e.g., purchase)
  • Verify the event name matches what GTM pushes (matching is case-insensitive)
  • Refresh the page after updating the allowlist
  • Check the browser console for kenbun: dataLayer allowlist loaded to confirm the configuration was applied