POST /site-scoring/analyze

Analyzes a website URL and returns proposed page-view scoring rules based on the content and structure of the site. This endpoint powers the Site Analysis step in the Quick Start wizard, but can also be called directly to generate scoring rule suggestions for any public website.

When to Use This

  • Quick Start setup: Automatically seed your scoring configuration with rules based on your own website's page structure
  • Rule discovery: Identify which page categories exist on a site and get suggested point values for each
  • Onboarding automation: Programmatically bootstrap scoring configurations for new workspaces or clients

How It Works

When you provide a website URL, kenbun:

  1. Attempts to fetch and parse the site's sitemap.xml (checking robots.txt for the sitemap location first)
  2. If no sitemap is found, crawls the homepage and follows top-level navigation links one level deep
  3. Categorizes discovered pages into 17 intent categories based on URL patterns (demo, pricing, contact, signup, product, solutions, and more)
  4. Returns one proposed Page View scoring rule per category found, with a suggested point weight and an attribute filter that matches pages in that category

The suggested weights reflect typical buyer intent: a visit to /pricing signals more intent than a visit to /blog, so it receives a higher proposed score.
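Steps 3 and 4 can be sketched as follows. This is an illustration only: the endpoint's actual matching logic is not documented here, so the substring match, the matching order, and the names CATEGORIES, categorize, and propose_rules are assumptions. Only four of the 17 categories are included, using the example patterns from this page.

```python
from urllib.parse import urlparse

# Subset of the documented categories: (label, URL fragments, suggested weight).
# The matching order and the substring test are assumptions for illustration.
CATEGORIES = [
    ("Very High Intent - Demo/Book", ("demo", "schedule"), 15),
    ("High Intent - Pricing", ("pricing", "plans", "buy"), 10),
    ("Low Intent - Content", ("blog", "webinars", "podcast"), 2),
    ("Not a Buyer - Careers", ("careers", "jobs"), -5),
]


def categorize(url):
    """Return (label, fragment, weight) for the first matching category, or None."""
    path = urlparse(url).path.lower()
    for label, fragments, weight in CATEGORIES:
        for fragment in fragments:
            if fragment in path:
                return label, fragment, weight
    return None


def propose_rules(urls):
    """Build one Page View rule per category found among the given URLs."""
    rules = {}
    for url in urls:
        match = categorize(url)
        if match is None:
            continue
        label, fragment, weight = match
        # Keep only the first rule seen for each category.
        rules.setdefault(label, {
            "event_type": "Page View",
            "weight": weight,
            "label": label,
            "attribute_filters": [
                {"field": "path", "operator": "contains", "value": fragment}
            ],
        })
    return list(rules.values())
```

Passing several blog URLs plus one pricing URL to propose_rules yields two rules, one per category, regardless of how many pages matched.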

Authentication

Required: Yes — Bearer token or HTTP Basic credentials

Request

Endpoint: POST /site-scoring/analyze

Headers:

Content-Type: application/json
Authorization: Bearer <token>

Request Body

| Field | Required | Type | Description |
|-------|----------|------|-------------|
| url | Yes | string | The website URL to analyze. The scheme (https://) is added automatically if omitted. |
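The API adds the scheme for you, so the following is optional. If you want to log or deduplicate inputs on the client in the same form, a minimal sketch might look like this. The function name normalize_url is hypothetical, and the server's own normalization may do more than this.

```python
def normalize_url(url):
    """Prepend https:// when the caller omits the scheme."""
    url = url.strip()
    if not url.lower().startswith(("http://", "https://")):
        url = "https://" + url
    return url
```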

Example Request Body:

{
  "url": "https://www.example.com"
}

Example Request:

curl -X POST "https://api.kenbun.io/site-scoring/analyze" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.example.com"}'
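The same request can be built in Python with the standard library. This sketch only constructs the request object and does not send it. The host comes from the curl example; the helper name build_analyze_request is hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.kenbun.io"  # host taken from the curl example


def build_analyze_request(site_url, token):
    """Build (but do not send) the POST /site-scoring/analyze request."""
    body = json.dumps({"url": site_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/site-scoring/analyze",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the returned object to urllib.request.urlopen and parse the response body with json.load.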

Response

Status: 200 OK

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| domain | string | The root domain analyzed (e.g., example.com) |
| fetched_url | string | The normalized URL that was fetched |
| page_count | integer | Number of unique pages discovered |
| skipped_urls | integer | Number of URLs found but not analyzed (cap of 300 pages per analysis) |
| source | string | How pages were discovered: sitemap, crawl, or crawl+depth |
| categories | array | Page categories found on the site (see Category Fields) |
| proposed_rules | array | Ready-to-use scoring rules (see Proposed Rule Fields) |

Category Fields

Each entry in categories describes a group of pages on the site that share the same intent signal:

| Field | Type | Description |
|-------|------|-------------|
| category | string | Human-readable category name (e.g., High Intent - Pricing) |
| pattern | string | URL fragment used to identify pages in this category (e.g., pricing) |
| urls | array of strings | Specific page paths on the site that matched this category |
| suggested_weight | integer | Recommended point value for this category |
| icon | string | Icon name suggestion for the rule (Lucide icon name) |

Proposed Rule Fields

Each entry in proposed_rules is a scoring rule ready to be created via the Rules API:

| Field | Type | Description |
|-------|------|-------------|
| event_type | string | Always Page View |
| weight | integer | Suggested point value |
| label | string | Human-readable rule name (matches the category name) |
| icon | string | Icon name for the rule |
| attribute_filters | array | Filters that match pages in this category |

Each entry in attribute_filters has:

| Field | Type | Description |
|-------|------|-------------|
| field | string | Always path |
| operator | string | Always contains |
| value | string | The URL fragment to match (e.g., pricing, demo) |

Example Response

{
  "domain": "example.com",
  "fetched_url": "https://www.example.com",
  "page_count": 47,
  "skipped_urls": 0,
  "source": "sitemap",
  "categories": [
    {
      "category": "Very High Intent - Demo/Book",
      "pattern": "demo",
      "urls": ["/demo", "/book-demo"],
      "suggested_weight": 15,
      "icon": "calendar"
    },
    {
      "category": "High Intent - Pricing",
      "pattern": "pricing",
      "urls": ["/pricing"],
      "suggested_weight": 10,
      "icon": "dollar-sign"
    },
    {
      "category": "Low Intent - Content",
      "pattern": "blog",
      "urls": ["/blog", "/blog/post-1", "/blog/post-2"],
      "suggested_weight": 2,
      "icon": "file-text"
    }
  ],
  "proposed_rules": [
    {
      "event_type": "Page View",
      "weight": 15,
      "label": "Very High Intent - Demo/Book",
      "icon": "calendar",
      "attribute_filters": [
        { "field": "path", "operator": "contains", "value": "demo" }
      ]
    },
    {
      "event_type": "Page View",
      "weight": 10,
      "label": "High Intent - Pricing",
      "icon": "dollar-sign",
      "attribute_filters": [
        { "field": "path", "operator": "contains", "value": "pricing" }
      ]
    },
    {
      "event_type": "Page View",
      "weight": 2,
      "label": "Low Intent - Content",
      "icon": "file-text",
      "attribute_filters": [
        { "field": "path", "operator": "contains", "value": "blog" }
      ]
    }
  ]
}
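Once the response is parsed into a dictionary, a short helper can turn it into a review-friendly summary. The sketch below relies only on the fields documented on this page; the helper name summarize and the trimmed sample are illustrative, and it assumes each rule carries at least one attribute filter, as in the example response.

```python
def summarize(result):
    """Summarize an analysis result, listing rules from highest to lowest weight."""
    rows = [
        (rule["label"], rule["weight"], rule["attribute_filters"][0]["value"])
        for rule in result["proposed_rules"]
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return {
        "domain": result["domain"],
        "source": result["source"],
        # A non-zero skipped_urls means some URLs were found but not analyzed.
        "truncated": result["skipped_urls"] > 0,
        "rules": rows,
    }


# Trimmed sample in the documented response shape.
sample = {
    "domain": "example.com",
    "page_count": 47,
    "skipped_urls": 0,
    "source": "sitemap",
    "proposed_rules": [
        {
            "event_type": "Page View",
            "weight": 2,
            "label": "Low Intent - Content",
            "attribute_filters": [
                {"field": "path", "operator": "contains", "value": "blog"}
            ],
        },
        {
            "event_type": "Page View",
            "weight": 15,
            "label": "Very High Intent - Demo/Book",
            "attribute_filters": [
                {"field": "path", "operator": "contains", "value": "demo"}
            ],
        },
    ],
}

summary = summarize(sample)
```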

Intent Categories and Suggested Weights

The analysis recognizes these page categories, ordered from highest to lowest intent:

| Category | Example URL Patterns | Suggested Weight |
|----------|----------------------|------------------|
| Very High Intent - Demo/Book | /demo, /book-demo, /schedule | +15 |
| High Intent - Contact | /contact, /talk-to-sales | +12 |
| High Intent - Pricing | /pricing, /plans, /buy | +10 |
| High Intent - Signup | /signup, /trial, /free-trial | +10 |
| Mid-High Intent - Social Proof | /case-studies, /customers, /testimonials | +8 |
| Mid Intent - Product | /product, /features, /platform | +6 |
| Mid Intent - Solutions | /solutions, /use-cases, /for-enterprise | +6 |
| Mid Intent - Tools | /calculator, /estimate, /compare | +6 |
| Mid Intent - Location Pages | /locations, /cities | +5 |
| Mid Intent - Documentation | /docs, /help, /support | +5 |
| Mid Intent - Research | /research, /reports, /insights | +4 |
| Low Intent - Partners | /partners, /affiliates | +3 |
| Low Intent - Resources | /resources, /templates | +3 |
| Low Intent - Content | /blog, /webinars, /podcast | +2 |
| Neutral - Company Info | /about, /team, /mission | +1 |
| Not a Buyer - Careers | /careers, /jobs | -5 |
| Not a Buyer - Legal | /privacy, /terms, /gdpr | -3 |

Note that careers and legal pages carry negative weights. Leads who only visit these pages are typically job seekers or auditors rather than buyers, so scoring them down helps your model stay accurate.

Common Errors

| Status | Meaning | Solution |
|--------|---------|----------|
| 400 | Invalid or missing URL | Provide a valid public URL (e.g., https://www.example.com) |
| 400 | Internal address blocked | The URL resolves to a private or internal IP address, which is not allowed |
| 401 | Unauthorized | Check your authentication credentials |
| 502 | Site unreachable | The website could not be fetched: it may be down, require a login, or block automated requests |
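A client can map these status codes to actionable messages. The sketch below copies the meanings from the table; the names ERROR_HINTS and describe_error are hypothetical, and the format of any error body returned by the API is not assumed.

```python
# Status codes and meanings copied from the Common Errors table.
# 400 covers two cases, so the hint mentions both.
ERROR_HINTS = {
    400: "Bad request: the URL is invalid, missing, or resolves to a private or internal address.",
    401: "Unauthorized: check your authentication credentials.",
    502: "Site unreachable: it may be down, require a login, or block automated requests.",
}


def describe_error(status):
    """Return a hint for a documented error status, or None for 200 OK."""
    if status == 200:
        return None
    return ERROR_HINTS.get(status, f"Unexpected status {status}")
```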

Important Notes

Public sites only: This endpoint fetches your website from kenbun's servers. The site must be publicly accessible without authentication.

15-second timeout: The full analysis must complete within 15 seconds. Very large sites may not return all pages within this window; sites that publish a sitemap.xml tend to give the most complete results.

300-page cap: The analysis evaluates up to 300 unique pages. Pages beyond this limit are counted in skipped_urls but not categorized.

One rule per category: Even if multiple pages match a category (e.g., several blog posts), a single rule with a path contains blog filter is proposed. This keeps your scoring configuration clean and maintainable.

Negative weights: Careers and legal pages receive negative suggested weights. Review and adjust these before applying them. If your product is aimed at compliance teams, for example, you may want to remove the negative weight from the Not a Buyer - Legal category (/privacy, /terms, /gdpr).
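One way to apply that review step in code is to drop specific negative-weight proposals before creating rules. This is a sketch of one possible approach, not a built-in feature; the helper name without_penalties is hypothetical.

```python
def without_penalties(proposed_rules, labels):
    """Drop negative-weight rules whose label is listed in `labels`; keep everything else."""
    labels = set(labels)
    return [
        rule for rule in proposed_rules
        if not (rule["weight"] < 0 and rule["label"] in labels)
    ]
```

For a compliance-focused product, calling without_penalties(rules, ["Not a Buyer - Legal"]) keeps the careers penalty and removes only the legal one.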

Using Results with the Rules API

The proposed_rules array in the response maps directly to fields accepted by POST /engagement-scoring/rules. After reviewing the proposals, you can create rules in bulk. Here is an example of creating one proposed rule:

curl -X POST "https://api.kenbun.io/engagement-scoring/rules" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "event_type": "Page View",
    "weight": 10,
    "label": "High Intent - Pricing",
    "attribute_filters": [
      { "field": "path", "operator": "contains", "value": "pricing" }
    ]
  }'
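To create several proposed rules, you can repeat that request once per rule. The sketch below builds one request object per rule without sending anything. It sends only the fields shown in the curl example, since this page does not confirm whether icon is accepted by the Rules API; the names RULE_FIELDS and build_rule_requests are hypothetical.

```python
import json
import urllib.request

RULES_URL = "https://api.kenbun.io/engagement-scoring/rules"  # from the curl example

# Fields sent in the curl example; "icon" is left out as an assumption.
RULE_FIELDS = ("event_type", "weight", "label", "attribute_filters")


def build_rule_requests(proposed_rules, token):
    """Build (but do not send) one POST request per proposed rule."""
    requests = []
    for rule in proposed_rules:
        payload = {field: rule[field] for field in RULE_FIELDS}
        requests.append(urllib.request.Request(
            RULES_URL,
            data=json.dumps(payload).encode("utf-8"),
            method="POST",
            headers={
                "Authorization": f"Bearer {token}",
                "Content-Type": "application/json",
            },
        ))
    return requests
```

Each returned object can be passed to urllib.request.urlopen after you have reviewed the proposals.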