DOM Scrapers and DataLayer Events
DOM scrapers and the DataLayer Events allowlist let the Chrome extension capture data from JavaScript on the page — useful for marketing analytics data, user information, GTM events, and third-party tool data.
DOM Scrapers
DOM scrapers extract data from JavaScript objects accessible on the browser's window object.
When to Use DOM Scrapers
Capture data from:
- Google Tag Manager: Extract dataLayer events and variables
- User Information: Pull user.email or user.id for identification
- Analytics Tools: Capture data from analytics, gtag, _hsq (HubSpot), or other tracking libraries
- Custom Objects: Any JavaScript object exposed on window
Creating a DOM Scraper
- Open the side panel
- Click the DOM Scrapers tab
- Click + New DOM Scraper (or use a quick-add template)
- Fill in the form:
| Field | Description | Example |
|---|---|---|
| Event Name | Your tracking name | DataLayer Capture |
| Pattern Type | URL matching type | Wildcard, Hostname, or Regex |
| URL Pattern | Where to run | *.example.com/* |
| Window Object Path | Dot-notation path to object | dataLayer or user.email |
| Capture Mode | When to capture | On Page Load or On Change (polling) |
- Preview shows the current value (if available on the active page)
- Click Save DOM Scraper
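As a rough illustration of how the three pattern types in the form could decide whether a scraper runs on a page, here is a minimal sketch. The names PatternType, wildcardToRegExp, and matchesUrl are hypothetical, and the exact matching rules (what a wildcard is tested against, how hostnames are compared) are assumptions rather than the extension's documented behavior.

```typescript
// Hypothetical sketch: none of these names come from the extension itself.
type PatternType = "wildcard" | "hostname" | "regex";

// Escape regex metacharacters, then turn each "*" into ".*".
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matchesUrl(type: PatternType, pattern: string, url: string): boolean {
  const parsed = new URL(url);
  switch (type) {
    case "hostname":
      // Assumption: exact, case-insensitive hostname comparison.
      return parsed.hostname === pattern.toLowerCase();
    case "wildcard":
      // Assumption: the pattern is tested against host + path, without the scheme.
      return wildcardToRegExp(pattern).test(parsed.host + parsed.pathname);
    case "regex":
      // Assumption: the regular expression is tested against the full URL.
      return new RegExp(pattern).test(url);
  }
}
```

Under these assumptions, the pattern *.example.com/* matches https://shop.example.com/cart but not https://example.org/.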
Quick-Add Templates
Pre-configured for common objects:
- dataLayer — Google Tag Manager data layer
- user — User object (common in many apps)
- user.email — Extract user email for identification
- HubSpot (_hsq) — HubSpot tracking queue
- Analytics — Generic analytics object
Window Object Paths
Simple paths:
window.dataLayer → "dataLayer"
window.user → "user"
window.analytics → "analytics"
window.__NEXT_DATA__ → "__NEXT_DATA__"
Nested paths:
window.user.email → "user.email"
window.config.apiKey → "config.apiKey"
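To show what a dot-notation path means in practice, the sketch below walks a path one segment at a time from a root object. resolvePath and fakeWindow are hypothetical names used for illustration; the root is passed in as a parameter so the example runs outside a browser, where the real root would be the page's window object.

```typescript
// Walk a dot-notation path ("user.email") from a root object.
// Returns undefined as soon as a segment is missing.
function resolvePath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of path.split(".")) {
    if (current === null || (typeof current !== "object" && typeof current !== "function")) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Stand-in for the page's window object (illustrative values only).
const fakeWindow = {
  user: { email: "ada@example.com" },
  __NEXT_DATA__: { page: "/home" },
};

console.log(resolvePath(fakeWindow, "user.email"));    // prints "ada@example.com"
console.log(resolvePath(fakeWindow, "config.apiKey")); // prints undefined
```

A missing intermediate object (here, config) yields undefined rather than an error, which is roughly what an empty preview in the form would correspond to.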
Capture Modes
On Page Load:
- Captures once when page loads
- Best for: Static user info, page metadata, initial state
- Example: User ID, session token, page category
On Change (Polling):
- Checks the value every 5 seconds
- Captures when the value changes
- Best for: Dynamic data, real-time updates
- Example: Cart items, notification count, dataLayer entries pushed after page load
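The two capture modes can be sketched as follows. This is a simplified illustration, not the extension's code: startCapture, CaptureMode, and the callback parameters are hypothetical, and it assumes that both modes capture the initial value and that a "change" is detected by comparing JSON snapshots.

```typescript
type CaptureMode = "load" | "change";

const POLL_INTERVAL_MS = 5000; // "every 5 seconds", as described above

// Start capturing a value. Returns a function that stops any polling.
function startCapture(
  mode: CaptureMode,
  read: () => unknown,
  capture: (value: unknown) => void,
  intervalMs: number = POLL_INTERVAL_MS
): () => void {
  const initial = read();
  capture(initial); // assumption: the initial value is captured in both modes
  if (mode === "load") {
    return () => {}; // nothing to stop
  }
  // Compare serialized snapshots so that in-place mutations are detected,
  // e.g. new entries pushed onto the same dataLayer array.
  let last = String(JSON.stringify(initial));
  const timer = setInterval(() => {
    const value = read();
    const snapshot = String(JSON.stringify(value));
    if (snapshot !== last) {
      last = snapshot;
      capture(value);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}
```

Comparing snapshots rather than object identity matters for objects like dataLayer, where the array itself stays the same while its contents grow.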
Event Data Structure
DOM scraper events include:
{
  "event_type": "DataLayer Capture",
  "window_path": "dataLayer",
  "captured_data": "[{\"event\":\"Page View\",\"userId\":\"123\"}]",
  "url": "https://example.com/page",
  "path": "/page",
  "scraper_id": "scraper_abc123"
}
- captured_data is JSON-stringified (max 10 KB)
- Circular references are handled gracefully
- Paths may contain only letters, digits, and underscores, separated by dots (a security restriction)
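One way to assemble an event with the fields shown above is sketched here. Everything in it is illustrative: safeStringify, buildScraperEvent, and PATH_PATTERN are hypothetical names, the "[Circular]" marker and truncation at the size limit are assumptions about what "handled gracefully" and "max 10 KB" mean, and the path rule assumes underscores are accepted because the documented examples (_hsq, __NEXT_DATA__) contain them.

```typescript
const MAX_CAPTURE_CHARS = 10 * 1024; // assumption: 10 KB measured in characters

// Assumption: segments of letters, digits, or underscores, separated by dots.
const PATH_PATTERN = /^[A-Za-z0-9_]+(\.[A-Za-z0-9_]+)*$/;

interface ScraperEvent {
  event_type: string;
  window_path: string;
  captured_data: string;
  url: string;
  path: string;
  scraper_id: string;
}

// JSON-stringify a value, replacing circular references with a marker string.
function safeStringify(value: unknown): string {
  const ancestors: unknown[] = [];
  const json = JSON.stringify(value, function (this: unknown, _key: string, val: unknown): unknown {
    if (typeof val !== "object" || val === null) return val;
    // `this` is the parent holding `val`; drop entries no longer on the current branch.
    while (ancestors.length > 0 && ancestors[ancestors.length - 1] !== this) {
      ancestors.pop();
    }
    if (ancestors.indexOf(val) !== -1) return "[Circular]";
    ancestors.push(val);
    return val;
  });
  // Assumption: oversized payloads are truncated rather than dropped.
  return String(json).slice(0, MAX_CAPTURE_CHARS);
}

function buildScraperEvent(
  eventName: string,
  windowPath: string,
  value: unknown,
  pageUrl: string,
  scraperId: string
): ScraperEvent {
  if (!PATH_PATTERN.test(windowPath)) {
    throw new Error(`Invalid window path: ${windowPath}`);
  }
  return {
    event_type: eventName,
    window_path: windowPath,
    captured_data: safeStringify(value),
    url: pageUrl,
    path: new URL(pageUrl).pathname,
    scraper_id: scraperId,
  };
}
```

Tracking only the current chain of ancestors means an object referenced twice without a cycle is still serialized in full both times.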
Rate Limiting
- Minimum 5 seconds between captures per scraper
- Same value won't trigger multiple events (deduplication)
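The two rules above can be combined into a single per-scraper check, sketched below. createCaptureGate and shouldCapture are hypothetical names, and the order of the checks (deduplication first, then the time limit) is an assumption. The clock is injectable so the behavior can be exercised without waiting.

```typescript
const MIN_INTERVAL_MS = 5000; // "Minimum 5 seconds between captures per scraper"

interface ScraperState {
  lastCaptureAt: number;
  lastValue: string;
}

// Returns a function that decides whether a serialized value should be sent.
function createCaptureGate(now: () => number = Date.now) {
  // Object.create(null) avoids collisions with inherited keys such as "constructor".
  const states: Record<string, ScraperState> = Object.create(null);
  return function shouldCapture(scraperId: string, serialized: string): boolean {
    const state = states[scraperId];
    if (state !== undefined) {
      if (serialized === state.lastValue) return false; // deduplication
      if (now() - state.lastCaptureAt < MIN_INTERVAL_MS) return false; // rate limit
    }
    states[scraperId] = { lastCaptureAt: now(), lastValue: serialized };
    return true;
  };
}
```

In this sketch a value rejected by the rate limit is not remembered, so it is captured on a later check once the interval has elapsed.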
DataLayer Events Allowlist
The DataLayer Events section in the DOM Scrapers tab controls which GTM dataLayer.push() events the extension captures in real time. This is separate from DOM scrapers — instead of polling on a schedule, this captures events the instant they are pushed.
By default, dataLayer interception is disabled. The extension captures no dataLayer events until you add at least one event name to the allowlist.
Configuring the Allowlist
- Open the side panel
- Click the DOM Scrapers tab
- Scroll to the DataLayer Events section
- Enter an event name (e.g., purchase) in the input field and click Add
- Repeat for each event you want to capture
You can also enter multiple events at once as a comma-separated list (e.g., purchase, add_to_cart, generate_lead).
To remove an event, click the x button next to its name.
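A comma-separated entry could be normalized along these lines. parseAllowlistInput is a hypothetical helper; trimming whitespace, lowercasing, and dropping duplicates are assumptions that fit the case-insensitive matching the extension describes, not confirmed behavior.

```typescript
// Split comma-separated input into trimmed, lowercased, de-duplicated event names.
function parseAllowlistInput(input: string): string[] {
  const names: string[] = [];
  for (const part of input.split(",")) {
    const name = part.trim().toLowerCase();
    if (name !== "" && names.indexOf(name) === -1) {
      names.push(name);
    }
  }
  return names;
}

console.log(parseAllowlistInput("purchase, add_to_cart, generate_lead"));
// prints [ 'purchase', 'add_to_cart', 'generate_lead' ]
```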
How It Works
When one or more events are configured:
- The extension intercepts dataLayer.push() calls on the current page
- Each push is checked against your allowlist (case-insensitive)
- Matching events are sent to kenbun as DataLayer: <event_name> events with the full dataLayer entry as metadata
- GTM internal events (like gtm.js, gtm.dom, gtm.load) are always excluded automatically
When the allowlist is empty, the extension does not intercept dataLayer pushes at all.
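The interception described above can be pictured as wrapping the array's push method. The sketch below is an illustration under stated assumptions, not the extension's implementation: interceptDataLayer, isGtmInternal, and the send callback are hypothetical, GTM internal events are assumed to be those whose name starts with "gtm.", and the event name is forwarded with its original casing.

```typescript
type DataLayerEntry = Record<string, unknown>;

// Assumption: GTM internal events are those whose name starts with "gtm.".
function isGtmInternal(name: string): boolean {
  return name.indexOf("gtm.") === 0;
}

function interceptDataLayer(
  dataLayer: unknown[],
  allowlist: string[],
  send: (eventType: string, entry: DataLayerEntry) => void
): void {
  if (allowlist.length === 0) return; // empty allowlist: push is left untouched
  const allowed = allowlist.map((name) => name.toLowerCase());
  const originalPush = dataLayer.push;
  dataLayer.push = function (...entries: unknown[]): number {
    for (const entry of entries) {
      if (typeof entry === "object" && entry !== null) {
        const name = (entry as DataLayerEntry)["event"];
        if (
          typeof name === "string" &&
          !isGtmInternal(name) &&
          allowed.indexOf(name.toLowerCase()) !== -1
        ) {
          send("DataLayer: " + name, entry as DataLayerEntry);
        }
      }
    }
    // Always forward to the original push so the page keeps working normally.
    return originalPush.apply(dataLayer, entries);
  };
}
```

Because the wrapper forwards every call to the original push, entries that do not match the allowlist still reach the data layer unchanged.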
When to Use This
- Lead generation: Capture generate_lead, sign_up, form_submit events
- Video engagement: Capture video_start, video_complete events
- Sites without beacon access: If you cannot add the data-datalayer-events attribute to the beacon script tag, the extension provides the same filtering capability
DataLayer Events vs. DOM Scrapers
| Feature | DataLayer Events | DOM Scrapers |
|---|---|---|
| What it captures | Individual dataLayer.push() calls | A snapshot of any window object |
| When it captures | Instantly on each push | On page load or every 5 seconds (polling) |
| Filtering | By event name (allowlist) | By URL pattern |
| Best for | Real-time GTM events (purchases, form submits) | Window state: static data (user info, page metadata) or polled values (cart items) |
Interaction with the Web Beacon
If the Web Beacon is also installed and has data-datalayer-events configured, both the beacon and the extension can capture dataLayer events. To avoid duplicates, use one approach per site:
- Beacon (data-datalayer-events attribute): Recommended for production sites where you control the beacon snippet
- Extension (DataLayer Events allowlist): Useful for sites where you cannot modify the script tag
Troubleshooting
DOM scraper not capturing data
- Use the preview in the DOM Scraper form to verify the path exists
- Try switching from "On Page Load" to "On Change" for objects that load asynchronously
- Check the browser console (F12) for error messages
- Verify the path syntax: window.user.email → enter user.email (without window.)
DataLayer events not being captured
- Open the DOM Scrapers tab and check the DataLayer Events section
- Add at least one event name to the allowlist (e.g., purchase)
- Verify the event name matches what GTM pushes (matching is case-insensitive)
- Refresh the page after updating the allowlist
- Check the browser console for "kenbun: dataLayer allowlist loaded" to confirm the configuration was applied