Some protocols become infrastructure. HTTP, USB, and JSON-RPC are so deeply embedded that most developers never think about them. Others become cautionary tales: SOAP, XMPP, CORBA. The difference comes down to a few consistent patterns, and I've seen them firsthand while designing the Browser Agent Protocol and building tooling for the SKILL.md ecosystem.

Figure: Four protocol design principles (start with use cases, design for the 80% case, ship a day-one implementation, choose human-readable formats).

Start with the use case, not the spec

The single most common mistake in protocol design is starting with the abstraction. You model the entities, define the message types, write the schema, and then go looking for use cases that fit. This is backwards.

BAP started with a concrete use case: I wanted to book a pickleball court through Lifetime's web app before the slots disappeared. That meant an AI agent needed to navigate a browser, fill forms, click buttons, and handle dynamic content. The use case defined the protocol surface. Semantic selectors came from needing to find UI elements without brittle CSS selectors. Composite actions came from needing to reduce round trips. MCP integration came from needing to work with existing agent frameworks.

Most successful protocols follow this pattern. HTTP started with Tim Berners-Lee needing to share documents at CERN. REST emerged from Roy Fielding observing how the actual web worked. SOAP went the other direction, designing for every possible enterprise messaging scenario before asking what developers actually needed. It became a cautionary tale in specification-first thinking.

Design for the 80% case

REST uses standard HTTP methods: GET, POST, PUT, DELETE. Four verbs cover the vast majority of web API interactions. You don't need a WSDL file to understand what GET /users/123 does. Simplicity is not just aesthetic. It drives adoption.

In BAP, I made the same bet. The core actions are: navigate, click, type, extract, screenshot. Five operations cover 80% of browser automation tasks. Want to fill a form? That's type into fields and click submit. Want to scrape data? That's extract with a selector. The protocol supports composite actions for the remaining 20%, but the entry point is five verbs.
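
To show how small that surface is, here is a minimal sketch that models the five verbs as JSON-RPC 2.0 requests. The method names mirror the verbs above, but the parameter names (`url`, `selector`, `text`) and the `make_request` helper are assumptions made for illustration, not BAP's actual wire format.

```python
import itertools
import json

# The five core verbs named in the text. Parameter shapes are hypothetical.
CORE_ACTIONS = ("navigate", "click", "type", "extract", "screenshot")

_ids = itertools.count(1)


def make_request(method: str, **params) -> dict:
    """Build a JSON-RPC 2.0 request for one of the five core actions."""
    if method not in CORE_ACTIONS:
        raise ValueError(f"unknown core action: {method}")
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# "Fill a form" is navigate + type + click; no extra verb is needed.
fill_form = [
    make_request("navigate", url="https://example.com/login"),
    make_request("type", selector="Email", text="me@example.com"),
    make_request("click", selector="Sign in"),
]

if __name__ == "__main__":
    for request in fill_form:
        print(json.dumps(request))
```

The point of the sketch is that a common task composes out of the existing verbs rather than demanding a new one.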

SKILL.md follows the same principle. A minimal valid skill file is a name, description, and instructions. Three fields. You can write one in 30 seconds. The spec supports optional sections (scripts, references, assets, examples, agents), but none are required. The 80% case, "I have instructions for an agent," is trivially easy.

The data backs this up. As of 2025, REST holds 93% team adoption versus GraphQL's 33% (Postman's State of the API report), despite GraphQL solving a real problem. Complexity is a tax on adoption, and it compounds.


The extensibility vs. simplicity tradeoff

Every protocol designer faces this tension: make it extensible enough to handle future use cases, or keep it simple enough that people actually adopt it. Get this wrong and you end up at one of two extremes. SOAP was so extensible it became unusable. Bespoke formats are so simple they can't evolve.

The pattern I've seen work: a minimal core with explicit extension points. HTTP has headers. JSON-RPC has the params object. SKILL.md has optional frontmatter fields and a freeform Markdown body. BAP has capability negotiation, where clients and servers declare supported features during connection.
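
Capability negotiation can be sketched as a set intersection performed once at connection time. The feature names and the `negotiate` and `can_send` helpers below are hypothetical; only "composite actions" is a feature this article actually names, and BAP's real handshake may look different.

```python
from __future__ import annotations

# Hypothetical feature names declared by a server during connection.
SERVER_FEATURES = {"composite_actions", "session_persistence"}


def negotiate(client_features: set[str], server_features: set[str]) -> set[str]:
    """Return the extensions both peers declared; the core needs no negotiation."""
    return client_features & server_features


def can_send(action_kind: str, negotiated: set[str]) -> bool:
    """Core actions always pass; an extension passes only if it was negotiated."""
    return action_kind == "core" or action_kind in negotiated


# A client that declares nothing still gets the core.
basic_client = negotiate(set(), SERVER_FEATURES)
full_client = negotiate({"composite_actions"}, SERVER_FEATURES)

if __name__ == "__main__":
    print(can_send("core", basic_client))               # True
    print(can_send("composite_actions", basic_client))  # False
    print(can_send("composite_actions", full_client))   # True
```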

The key is that extensions don't break the core. A SKILL.md parser that only understands the three required fields still works when it encounters a file with six optional sections. It just ignores what it doesn't recognize. A BAP client that doesn't support composite actions can still send basic actions. This is progressive disclosure applied to protocol design.
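
A tolerant reader of this kind takes only a few lines. The sketch below is not the skill-tools parser: it assumes flat `key: value` frontmatter (real YAML is richer), and `parse_skill` is a hypothetical name. It keeps the three required fields and silently skips everything else.

```python
REQUIRED_FRONTMATTER = ("name", "description")


def parse_skill(text: str) -> dict:
    """Extract name, description, and instructions; ignore unknown fields."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("missing frontmatter")
    try:
        end = lines.index("---", 1)  # closing frontmatter delimiter
    except ValueError:
        raise ValueError("unterminated frontmatter") from None

    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip anything that is not a flat key: value pair
            meta[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    missing = [field for field in REQUIRED_FRONTMATTER if not meta.get(field)]
    if not body:
        missing.append("instructions")
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    # Unknown keys such as optional sections are dropped, not rejected.
    return {"name": meta["name"], "description": meta["description"], "instructions": body}
```

Rejecting unknown keys would be the stricter choice, but it would also mean every new optional section breaks every older parser.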

XMPP is the cautionary tale here. It had extension protocols for everything from file transfer to video calling, but the extension mechanism was so heavyweight that implementations diverged. Google Talk, Facebook Messenger, and WhatsApp all adopted then abandoned XMPP. The extensions existed. The ecosystem fragmented anyway.

Reference implementations matter more than specs

A spec without a reference implementation is a wish list. USB succeeded not because the specification was well-written, but because manufacturers could test against working implementations and earn compliance certification.

For skill-tools, the reference implementation is the toolchain itself: a parser that validates structure, a linter that enforces rules, a scorer that quantifies quality, a router that demonstrates retrieval. You can run skill-tools check my-skill.md and verify conformance in milliseconds.

For BAP, the reference implementation includes TypeScript and Python SDKs, plus a working browser connector. You can install the package, connect to a browser, and run your first agent action in under five minutes. The protocol spec is the contract. The SDK is the proof.

MCP followed this pattern too. Anthropic shipped the protocol with Python and TypeScript SDKs from day one. Within a year, the ecosystem grew to 10,000+ active servers and 97 million monthly SDK downloads (as of late 2025). The SDKs did more for adoption than the spec document ever could.

Why JSON-RPC over custom protocols

Both BAP and MCP chose JSON-RPC as their wire format. This isn't a coincidence. JSON-RPC is described as "probably the simplest, most lightweight, cleanest ASCII-RPC out there." The entire spec fits on one page.

The practical advantages for protocol designers:

  • No schema compilation. Unlike gRPC/Protobuf, you don't need to compile stubs or manage binary serialization. JSON-RPC can be adopted quickly with minimal tooling overhead.
  • Transport-independent. The same message format works over HTTP, WebSocket, stdio, or file descriptors. BAP uses WebSocket for real-time browser control. MCP supports both stdio and Streamable HTTP.
  • Human-readable. You can debug with curl. You can read messages in logs. You can paste them into documentation. Try that with Protobuf.
  • Request-response + notifications. JSON-RPC natively supports both request-response patterns and one-way notifications, which maps perfectly to agent communication.
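
The last two points can be made concrete with a small dispatcher. In JSON-RPC 2.0 a notification is a request object without an `id`, and the receiver must not reply to it. The `click` and `log` methods below are hypothetical stand-ins, and `handle` is a sketch under those assumptions, not the BAP or MCP server code. Because it maps one string to an optional string, the same function could sit behind HTTP, WebSocket, or stdio.

```python
from __future__ import annotations

import json
from typing import Callable, Optional

# Hypothetical methods for illustration only.
HANDLERS: dict[str, Callable[[dict], object]] = {
    "click": lambda params: {"clicked": params["selector"]},
    "log": lambda params: None,
}


def handle(raw: str) -> Optional[str]:
    """Process one JSON-RPC 2.0 message; return a response, or None for notifications."""
    message = json.loads(raw)
    is_notification = "id" not in message
    handler = HANDLERS.get(message["method"])

    if handler is None:
        if is_notification:
            return None  # notifications never get a reply, even on error
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": message["id"], "error": error})

    result = handler(message.get("params", {}))
    if is_notification:
        return None
    return json.dumps({"jsonrpc": "2.0", "id": message["id"], "result": result})


if __name__ == "__main__":
    print(handle('{"jsonrpc": "2.0", "id": 7, "method": "click", "params": {"selector": "Submit"}}'))
    print(handle('{"jsonrpc": "2.0", "method": "log", "params": {"msg": "page loaded"}}'))
```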

Could we get better performance with a binary protocol? Yes. Would it matter for the use cases? No. The bottleneck in browser automation is the browser rendering, not the message serialization. The bottleneck in skill routing is the BM25 scoring, not the JSON parsing. Pick the format that minimizes adoption friction, not the one that minimizes microseconds.


Progressive disclosure reduces adoption friction

The fastest way to kill a protocol is to require users to understand the entire thing before they can start. Progressive disclosure means the learning curve matches the capability curve: simple things are simple, complex things are possible.

A minimal BAP interaction is three steps: connect, send an action, disconnect. You don't need to understand capability negotiation, composite actions, or session management to click a button on a web page. Those features exist for when you need them, but they're invisible until then.

A minimal SKILL.md is two YAML frontmatter fields and a Markdown body of instructions:

---
name: my-skill
description: Does a specific thing when asked
---
# Instructions
Do the thing in this specific way.

That's a valid, parseable, lintable, scorable skill file. It won't score 94/100, but it will work. The scoring system tells you exactly what's missing and how much each addition is worth. Progressive improvement, not binary compliance.
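
The idea of a score that names what is missing can be sketched as a weighted checklist. The rubric below is entirely hypothetical: the check names and the 20-point weights are invented for illustration and are not skill-tools' actual scoring rules.

```python
from __future__ import annotations

# Hypothetical rubric of (check name, points, predicate). Not the real weights.
RUBRIC = [
    ("has name", 20, lambda skill: bool(skill.get("name"))),
    ("has description", 20, lambda skill: bool(skill.get("description"))),
    ("has instructions", 20, lambda skill: bool(skill.get("instructions"))),
    ("has examples", 20, lambda skill: bool(skill.get("examples"))),
    ("has references", 20, lambda skill: bool(skill.get("references"))),
]


def score(skill: dict) -> tuple[int, list[tuple[str, int]]]:
    """Return the points earned and a list of (failed check, points it is worth)."""
    earned = 0
    missing = []
    for name, points, passed in RUBRIC:
        if passed(skill):
            earned += points
        else:
            missing.append((name, points))
    return earned, missing


minimal_skill = {
    "name": "my-skill",
    "description": "Does a specific thing when asked",
    "instructions": "Do the thing in this specific way.",
}

if __name__ == "__main__":
    earned, missing = score(minimal_skill)
    print(earned)   # 60
    print(missing)  # [('has examples', 20), ('has references', 20)]
```

A minimal file passes, and the report doubles as a to-do list for improving it.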

REST mastered this. GET /users is the entire protocol you need to fetch a list of users. Content negotiation, caching headers, and HATEOAS links all exist, but none are required to make your first API call.

A checklist for protocol designers

After designing BAP and building tooling for SKILL.md, here's what I'd tell anyone designing a protocol or standard:

  • Start from a real use case. If you can't describe the first user and their specific problem, you're designing prematurely.
  • Cover the 80% case in under 5 minutes. If a developer can't go from zero to a working call in that time, your adoption curve is too steep.
  • Ship a reference implementation on day one. A spec without running code is fiction.
  • Make the minimal case minimal. Three fields, five verbs, one page of documentation. The hello-world should fit in a tweet.
  • Design extension points, not extensions. Let the community build what they need. Don't try to anticipate every use case in v1.
  • Choose boring transport. JSON-RPC, HTTP, WebSocket. Battle-tested, debuggable, universally supported.
  • Invest in compliance tooling. USB has certification testing. skill-tools has its check command. If you can't verify conformance programmatically, you'll get fragmentation.
  • Measure adoption, not completeness. A simple protocol that 10,000 people use beats a comprehensive one that 10 people understand.

Protocols are infrastructure. They succeed by being boring, predictable, and easy to adopt. They fail by being clever, comprehensive, and impossible to implement correctly. Every time I'm tempted to add a feature to BAP or SKILL.md, I ask: does this make the first five minutes easier? If not, it waits.