Semantic selectors solved half the problem. A browser agent can find role=button, name="Sign In" on any page, regardless of DOM structure. But what happens when the agent comes back tomorrow and the button's label changed to "Log In"? Or when the page has three buttons named "Submit" and the agent needs the second one?

Figure: uSEID pipeline. Snapshot the accessibility tree, map elements to stable semantic IDs, verify cross-page, cross-run identity.

Session-scoped refs die with the session. CSS selectors die with the next deploy. The element is still there. The agent just can't prove it.

I ran into this building BAP. The protocol gives agents stable refs within a session (like @e1, @e2), but those refs are ephemeral. Close the browser, open a new one, and every ref resets. For any agent that needs to remember elements across runs (monitoring workflows, regression checks, multi-session tasks), you need identity that persists beyond a single page load.

uSEID is what I built to solve that. It's a portable element signature that encodes what an element is, where it sits in the DOM, and where it appears on screen, then resolves that signature against a live page with a confidence score and safety gate.

Figure: a login form ("Welcome back", Email, Password) before and after a deploy.
Before: div.auth > button:first-child ✓ matches "Sign In"
After deploy (a "Sign Up" button now precedes "Sign In"): div.auth > button:first-child ✗ matches "Sign Up"
uSEID: button "Sign In" ✓ confidence 0.951
A UI redesign breaks CSS selectors but uSEID signatures resolve by meaning, not position.

The three signals

People find elements using multiple cues at once. You don't find the "Sign In" button purely by its label, or purely by its position, or purely by its context. You triangulate. uSEID does the same thing with three weighted signals:

Signal      Weight  What it captures
Semantic    50%     ARIA role + accessible name: what the element is
Structural  30%     Ancestor roles, sibling labels, depth: where it sits in the DOM
Spatial     20%     Bounding box position: where it appears on screen

Semantic carries the most weight because role and name are the most stable properties of an element. A button named "Sign In" remains a button named "Sign In" through most refactors. But when the name does change, or when multiple elements share the same role and name, the structural and spatial signals disambiguate.

The structural signal captures context: what landmark region is this element in? What are its siblings labeled? How deep is it in the tree? The spatial signal captures position: is this element in the header or the footer, left column or right? Together, the three signals form a fingerprint that's resilient to the kinds of changes real UIs go through.
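One way to picture how each signal might be scored on its own is sketched below. This is not uSEID's actual implementation: the post does not specify the similarity functions, so the choices here (exact match on role and normalized name, Jaccard overlap of ancestor roles, intersection-over-union of bounding boxes) and every name in the block are assumptions made for illustration.

```typescript
// Hypothetical shapes and functions; none of these names come from the uSEID package.
interface Box { x: number; y: number; width: number; height: number }

// Semantic: 1 when the role matches and the normalized names are equal, else 0.
// A real matcher would likely score partial name matches; this is the simplest stand-in.
function semanticSimilarity(
  a: { role: string; name: string },
  b: { role: string; name: string },
): number {
  const norm = (s: string) => s.trim().toLowerCase().replace(/\s+/g, " ");
  return a.role === b.role && norm(a.name) === norm(b.name) ? 1 : 0;
}

function unique(items: string[]): string[] {
  return items.filter((item, i) => items.indexOf(item) === i);
}

// Structural (assumed): Jaccard overlap between the two sets of ancestor roles.
function structuralSimilarity(a: string[], b: string[]): number {
  const ua = unique(a);
  const ub = unique(b);
  if (ua.length === 0 && ub.length === 0) return 1;
  const shared = ua.filter((role) => ub.indexOf(role) !== -1).length;
  return shared / (ua.length + ub.length - shared);
}

// Spatial (assumed): intersection-over-union of the two bounding boxes.
function spatialSimilarity(a: Box, b: Box): number {
  const overlapW = Math.max(0, Math.min(a.x + a.width, b.x + b.width) - Math.max(a.x, b.x));
  const overlapH = Math.max(0, Math.min(a.y + a.height, b.y + b.height) - Math.max(a.y, b.y));
  const intersection = overlapW * overlapH;
  const union = a.width * a.height + b.width * b.height - intersection;
  return union > 0 ? intersection / union : 0;
}
```

Each function returns a value between 0 and 1, which is what lets the three signals be combined with fixed weights.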

Example: resolving a "Sign In" button
Semantic    1.00 × 0.50 = 0.500
Structural  0.89 × 0.30 = 0.267
Spatial     0.92 × 0.20 = 0.184
Total       0.951  RESOLVED (threshold: 0.85)
Weighted scoring for a "Sign In" button. Semantic similarity dominates, with structural and spatial signals providing disambiguation.
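The arithmetic in that example is a plain weighted sum. Here is a minimal sketch of it, using the 50/30/20 weights and the 0.85 threshold stated in this post; the type and function names are illustrative assumptions, not the package's API.

```typescript
// Weights as described in this post; names are hypothetical.
const WEIGHTS = { semantic: 0.5, structural: 0.3, spatial: 0.2 };

interface SignalScores { semantic: number; structural: number; spatial: number }

// Combine the three per-signal similarities (each in [0, 1]) into one confidence value.
function combinedConfidence(s: SignalScores): number {
  return (
    s.semantic * WEIGHTS.semantic +
    s.structural * WEIGHTS.structural +
    s.spatial * WEIGHTS.spatial
  );
}

const signIn = combinedConfidence({ semantic: 1.0, structural: 0.89, spatial: 0.92 });
const clearsThreshold = signIn >= 0.85;
// signIn.toFixed(3) is "0.951", so clearsThreshold is true
```

Because semantic similarity carries half the weight, a candidate with the wrong name cannot reach 0.85 on structure and position alone: the other two signals contribute at most 0.50 combined.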

How it works

uSEID operates in two phases. First, you capture a signature from a snapshot of the page. Later (minutes, hours, or several deploys later) you resolve that signature against the current page.

Diagram: the BUILD and RESOLVE phases. Components, in order: DOM + A11y Snapshots, Extractor, Builder, USEIDSignature, Candidates, Matcher, Safety Gate, ResolveResult.
uSEID operates in two phases: build a portable signature, then resolve it against a live page.

Building a signature:

You pass DOM and accessibility snapshots plus an element index, and get back a JSON signature containing the semantic identity (role, name), structural context (ancestor roles, siblings), and spatial position (bounding box). The signature is portable: store it in a database, write it to a file, pass it between services.
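To make "portable" concrete, here is a rough sketch of what such a signature could look like and how it survives a round trip through JSON. The field names are assumptions based on the description above, not the package's actual schema.

```typescript
// Hypothetical signature shape; the real uSEID schema may differ.
interface ElementSignature {
  origin: string; // page binding, e.g. "https://example.com"
  path: string;   // e.g. "/login"
  semantic: { role: string; name: string };
  structural: { ancestorRoles: string[]; siblingLabels: string[]; depth: number };
  spatial: { x: number; y: number; width: number; height: number };
}

const signature: ElementSignature = {
  origin: "https://example.com",
  path: "/login",
  semantic: { role: "button", name: "Sign In" },
  structural: { ancestorRoles: ["form", "main"], siblingLabels: ["Email", "Password"], depth: 6 },
  spatial: { x: 540, y: 320, width: 200, height: 40 },
};

// Plain JSON in, plain JSON out: the string can go into a database row, a file, or a message.
const stored: string = JSON.stringify(signature);
const restored = JSON.parse(stored) as ElementSignature;
```

Nothing in the signature refers to a live browser object, which is what allows it to outlive the session that produced it.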

Resolving against a live page:

To resolve, pass the stored signature plus current page snapshots. The resolver returns a confidence score, a selector hint, and when it can't resolve, ranked candidates with an explanation of why.
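An agent consuming that result has two branches to handle: act on a resolved match, or back off and report why. The sketch below models this with a discriminated union. The type names and fields are assumptions for illustration; only the four reason codes are taken from this post.

```typescript
// Hypothetical result types; the actual ResolveResult shape may differ.
type AbstainReason =
  | "binding_mismatch"
  | "no_candidates"
  | "below_threshold"
  | "ambiguous_match";

interface Candidate { selectorHint: string; confidence: number }

type ResolveOutcome =
  | { status: "resolved"; confidence: number; selectorHint: string }
  | { status: "abstained"; reason: AbstainReason; candidates: Candidate[] };

// Decide what the agent does next. It never acts on an abstention.
function nextAction(outcome: ResolveOutcome): string {
  if (outcome.status === "resolved") {
    return `click ${outcome.selectorHint}`;
  }
  const best = outcome.candidates[0];
  const hint = best
    ? `; best candidate ${best.selectorHint} at ${best.confidence.toFixed(2)}`
    : "";
  return `skip (${outcome.reason})${hint}`;
}
```

Keeping the ranked candidates on the abstention branch means a human or a higher-level planner can still inspect what the resolver almost matched.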

Safety-first resolution

This is the design decision I care about most. uSEID would rather abstain than act on the wrong element.

The safety gate enforces four constraints:

01 Binding check

Signatures are locked to their origin and page path. A signature captured on example.com/login will not resolve against example.com/dashboard. Cross-page resolution is blocked, not degraded.

02 Role gate

If no candidates match the expected ARIA role, resolution stops. A button signature will never match a link, even if every other signal aligns perfectly.

03 Confidence threshold

The default threshold is 0.85 out of 1.0. Below that, the resolver abstains and returns ranked candidates instead of a potentially wrong match.

04 Ambiguity margin

If the top two candidates score within 0.1 of each other, the resolver flags the match as ambiguous rather than guessing. When two elements look equally like the target, neither can be trusted.

Origin + path match?   No → ABSTAIN: binding_mismatch
Any role matches?      No → ABSTAIN: no_candidates
Confidence ≥ 0.85?     No → ABSTAIN: below_threshold
Gap ≥ 0.10?            No → ABSTAIN: ambiguous_match
All four pass          → RESOLVED
The safety gate runs four checks in sequence. Any failure triggers an explicit abstention with a diagnostic reason.
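The four checks can be sketched as a single function that returns early on the first failure. This is a minimal illustration under stated assumptions, not the package's code: the function name, parameters, and result shape are hypothetical, candidates are assumed to arrive already scored, and the binding comparison uses the standard URL API to compare origin and pathname.

```typescript
// Hypothetical gate; only the thresholds and reason codes come from this post.
interface Scored { role: string; confidence: number }

type GateResult =
  | { status: "resolved"; match: Scored }
  | {
      status: "abstained";
      reason: "binding_mismatch" | "no_candidates" | "below_threshold" | "ambiguous_match";
      ranked: Scored[];
    };

function safetyGate(
  signatureUrl: string,
  pageUrl: string,
  expectedRole: string,
  candidates: Scored[],
  threshold = 0.85,
  margin = 0.1,
): GateResult {
  // 1. Binding check: origin and path must both match (query and hash are ignored here).
  const bound = new URL(signatureUrl);
  const page = new URL(pageUrl);
  if (bound.origin !== page.origin || bound.pathname !== page.pathname) {
    return { status: "abstained", reason: "binding_mismatch", ranked: [] };
  }

  // 2. Role gate: only candidates with the expected role are considered at all.
  const ranked = candidates
    .filter((c) => c.role === expectedRole)
    .sort((a, b) => b.confidence - a.confidence);
  const top = ranked[0];
  const runnerUp = ranked[1];
  if (top === undefined) {
    return { status: "abstained", reason: "no_candidates", ranked };
  }

  // 3. Confidence threshold.
  if (top.confidence < threshold) {
    return { status: "abstained", reason: "below_threshold", ranked };
  }

  // 4. Ambiguity margin: the winner must lead the runner-up by at least `margin`.
  if (runnerUp !== undefined && top.confidence - runnerUp.confidence < margin) {
    return { status: "abstained", reason: "ambiguous_match", ranked };
  }

  return { status: "resolved", match: top };
}
```

Ordering the checks from cheapest and most categorical (binding, role) to most nuanced (margin) means the diagnostic reason always names the earliest check that failed.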

The result is a system that fails explicitly. When the UI changed too much, when the element moved to a different page, when two elements are too similar, the agent gets a clear abstention with a reason, not a silent misidentification.

In agent workflows, clicking the wrong element is almost always worse than not clicking at all.

What this enables

Session-scoped refs work fine for single-run tasks: fill a form, extract data, navigate a flow. But a growing class of agent tasks requires cross-run memory:

  • Monitoring: "Check if this price changed since yesterday." Needs to find the same element across page versions.
  • Regression testing: "Verify this button still exists after the deploy." Needs identity that survives code changes.
  • Multi-session workflows: "Resume where I left off." Needs to re-anchor to elements from a prior run.
  • Agent memory: "I learned that clicking this element leads to the checkout flow." Needs a portable reference that other agents can resolve.

uSEID gives agents a way to say "this element" and mean it across time. Not "the element at this CSS path" or "the element at these coordinates" or "the third button on the page." The element that is a button, named Sign In, inside a form, inside the main landmark, near the top-center of the viewport.

Get started

$ sig = buildUSEID({ snapshot, a11y, idx: 42, url })
→ { role: "button", name: "Sign In", hash: "a1b2c3..." }
--- deploy #847 ships ---
$ result = resolveUSEID({ sig, snapshot, a11y, url })
→ Resolved with confidence 0.951: button[name="sign in"]
A signature built today resolves tomorrow, even after a deploy changes the page.

Links: GitHub / npm

The missing layer

uSEID started as a module inside BAP. The protocol needed a way to identify elements that outlived a browser session: deterministic replay, regression workflows, agent memory. The identity system turned out to be useful enough on its own to extract as a standalone package.

The web has had stable identity for documents (URLs) and for users (cookies, tokens) for decades. It never had stable identity for individual elements. data-testid was the closest thing, and it requires developers to opt in. uSEID derives identity from what the browser already knows (the accessibility tree, the DOM structure, the visual layout) without requiring any changes to the page being observed.

Selectors tell you how to find an element. Signatures tell you which element you're looking for. That distinction is the foundation for agents that can remember.