OIP Reference Architecture
Document class: Informative (architecture & rationale)
Version: v0.1
Companion to: OIP-Specification-v1.0.md (Normative) · OIP-Developer-Guide-v0.1.md (Practical)
Audience: protocol architects, AI platform builders, standards organizations, engineering leaders, investors
How to read this document. The Specification tells you what to implement. This document tells you why it is shaped that way. It is the equivalent of the Git Architecture Overview or the OAuth Architecture document: a conceptual model you can hold in your head, from which the wire format becomes obvious. A standards architect should be able to implement OIP from the Specification alone; this document exists so they implement it correctly — preserving the invariants that make the protocol durable rather than merely functional. No schemas are reproduced here; where JSON appears it is illustrative only.
1. Introduction
1.1 The problem OIP solves
An autonomous agent acting for a person can call any model, invoke any tool, and read any API. What it cannot reliably do is answer one question:
"Which packaged expertise should I trust to guide this person toward this real-world outcome — and how do I know?"
Today that question is answered with proxies that are not evidence: star ratings, download counts, model benchmarks, marketing copy. None of these establish that a piece of packaged expertise actually closed a gap for a real human. A five-star skill with no outcomes and a battle-tested capability with hundreds of verified successes look identical to a ranking algorithm that counts stars.
OIP — the Outcome Intelligence Protocol — is the trust-and-attribution layer for that question. It standardizes how reusable, provenance-backed, versioned Outcome Intelligence is published, discovered, verified, licensed, and — crucially — how the evidence behind its trustworthiness is exposed and attributed. OIP does not move money, invoke tools, or run agents. It is the layer that lets an agent or a person choose whom to trust to deliver an outcome, and lets that trust be earned through repeated verified results rather than asserted through popularity.
Diagram (Mermaid source):
flowchart LR
A[Agent or Person] -- "needs an outcome" --> Q{"Whom do I trust<br/>to deliver it?"}
Q -- "today" --> V["Vibes:<br/>stars · downloads · benchmarks · ads"]
Q -- "with OIP" --> E["Evidence:<br/>verified outcomes · provenance ·<br/>conservative attribution · reputation"]
V --> X["✗ unfalsifiable"]
E --> Y["✓ auditable in 3 hops"]
1.2 The evolution: AI → Agents → Outcomes → Agentic Commerce
OIP sits at the fourth stage of a progression, and each stage made the next one necessary:
Diagram (Mermaid source):
flowchart TB
AI["1 · AI<br/>models that generate"] --> AG["2 · Agents<br/>models that act via tools"]
AG --> OUT["3 · Outcomes<br/>did the action close the gap<br/>for a real human?"]
OUT --> COM["4 · Agentic Commerce<br/>paying for trusted outcomes,<br/>delivered by accountable identities"]
style OUT fill:#eef,stroke:#446
style COM fill:#efe,stroke:#464
- AI gave us generation. Value was the model.
- Agents gave us action — tool use, orchestration, MCP, A2A. Value moved to scaffolding.
- Outcomes is where DeepSenz's thesis begins: the unit that matters is not a generated artifact or an executed tool call, but a verified change in the real world — the gap between a person's current state and desired state, closed and confirmed.
- Agentic Commerce is the consequence: once outcomes are verifiable, they are purchasable, and a market forms around who can be trusted to deliver them. A market needs identity, accountability, evidence, and reputation. That is the layer OIP standardizes.
The reason a protocol is required — rather than a product feature — is in the next section.
1.3 Why this must be a protocol, not a product
A single company can build a marketplace with a trust score. But three forces make that insufficient and demand a protocol:
- Outcomes are multi-party by nature. An outcome is delivered by an accountable identity that orchestrates models, tools, runtimes, and sub-specialists it does not own. Trust must compose across parties that do not share a codebase. Cross-party composition is what protocols are for.
- Evidence must outlive any implementation. A reputation earned over a decade cannot be hostage to one company's database schema, one model vendor, or one runtime. The evidence format must be stable and portable across registries and across technology generations.
- The ecosystem already has interoperability standards — A2A for agent communication, MCP for tool invocation, SKILL.md for skill packaging, ACP/UCP for commerce, Apple App Intents for native execution. None of them carry outcome evidence, attribution, or reputation. OIP's job is to be the thin, complementary layer that adds exactly that — and a layer that complements many standards must itself be a standard.
OIP is therefore designed the way TCP/IP, OAuth, and Git are designed: a small set of durable objects and rules, neutral to any vendor, that many independent parties can implement and still interoperate.
2. Design Philosophy
The entire architecture rests on one organizing decision: every object in an Outcome Intelligence ecosystem belongs to exactly one of four categories, and each category has opposite durability rules. Conflating them is the single most common way these systems rot — they grow by adding "pipeline stages" until a derived cache can silently disagree with the evidence it was derived from, and explainability dies.
2.1 The four categories
Diagram (Mermaid source):
flowchart LR
subgraph IDENTITY["IDENTITY · persistent"]
I["who is accountable"]
end
subgraph ASSET["ASSET · publishable, versioned"]
AS["what is offered"]
end
subgraph EVIDENCE["EVIDENCE · append-only"]
EV["what actually happened"]
end
subgraph PROJECTION["PROJECTION · recomputable"]
PR["what it all means, right now"]
end
IDENTITY -->|earns| EVIDENCE
ASSET -->|used in| EVIDENCE
EVIDENCE -->|projects to| PROJECTION
PROJECTION -->|guides choice of| ASSET
| Category | Question it answers | Durability rule | Examples |
|---|---|---|---|
| Identity | Who is accountable? | Persistent; reputation-bearing; never deleted | Principal, Execution Partner, Domain Partner, Delegate, Publisher, Organization |
| Asset | What is offered? | Publishable; versioned; content-addressed | Outcome Manifest, Capability, Knowledge Module, Trust Manifest, License, Compatibility Adapter |
| Evidence | What actually happened? | Append-only; immutable; system of record | Execution Configuration, Execution Record, Outcome Event, Outcome Evaluation, Delegation Edge, Attribution |
| Projection | What does it mean now? | Recomputable; never authoritative; rebuildable from evidence | TrustSenz scores, Reputation Graph, Discovery index, Recommendations, Commerce rankings |
2.2 Why these four, and why the durability rules differ
The categories are not arbitrary taxonomy — they are defined by how the truth of each one is established and how it may change:
- Identity is true by continuity. An Execution Partner is the same partner tomorrow even if every model it uses is replaced. Identity must persist precisely because reputation attaches to it; deleting or forking an identity would erase or launder earned trust.
- Assets are true by publication and signature. They are claims a publisher makes ("this package does X"). They are versioned because claims evolve, and content-addressed so a claim cannot be silently altered after it is trusted.
- Evidence is true by occurrence. Something happened; it was recorded. Evidence is append-only because the past cannot be edited — a recorded outcome, including a verified failure, is a fact. The moment evidence becomes mutable, every reputation built on it becomes deniable.
- Projections are true by derivation. A trust score is not a fact about the world; it is a current interpretation of the evidence under a stated method. It must be recomputable because the method improves, new evidence arrives, and old evidence is sometimes discredited. A projection that cannot be rebuilt from scratch is a liability: it can drift from its own evidence and no one can tell.
Diagram (Mermaid source):
flowchart LR
P["The past"] --> R{"Can it change?"}
R -->|"No — it happened"| EVID["EVIDENCE<br/>append-only"]
N["The present meaning"] --> R2{"Should it update<br/>as method & data improve?"}
R2 -->|"Yes — always"| PROJ["PROJECTION<br/>recomputable"]
This single distinction — append-only evidence vs. recomputable projection — is the spine of the whole architecture. Sections 6 and 7 develop it.
2.3 The durability contract
The categories yield a contract that lets OIP evolve for 5–10 years without protocol-breaking changes:
New capability arrives as data (a new role, a new verdict code, a new configuration dimension), as a projection (a new ranking or recommendation derived from existing evidence), or as an extension profile — never as a change to a core object's envelope.
If a proposed change requires altering one of the eight core object envelopes, that is the signal to stop and ask whether the change is really an envelope change or whether it belongs in a registry value, a projection, or the private extension profile. Almost always, it belongs in one of those three.
3. Core Object Model
There are eight core objects plus one shared evidence primitive. This section explains their responsibilities and relationships; the Specification carries the schemas. The objects map cleanly onto the four categories of §2.
Diagram (Mermaid source):
flowchart TB
subgraph ID["IDENTITY"]
Identity["①Identity<br/>(role-based)"]
end
subgraph AS["ASSET"]
Manifest["②Outcome Manifest"]
Capability["③Capability"]
Trust["⑧Trust Manifest / Metric"]
end
subgraph EV["EVIDENCE (append-only)"]
Config["④Execution Configuration"]
Record["⑤Execution Record"]
Eval["⑥Outcome Evaluation"]
Attr["⑦Attribution"]
Edge["⟂ Delegation Edge"]
end
Identity -- publishes --> Manifest
Manifest -- references --> Capability
Manifest -- carries --> Trust
Record -- references --> Identity
Record -- references --> Config
Record -- references --> Manifest
Record -- contains --> Edge
Record -- closes with --> Eval
Eval -- grounds --> Attr
Attr -- credits --> Identity
Attr -- correlates with --> Config
Attr -- feeds --> Trust
| # | Object | Category | Responsibility |
|---|---|---|---|
| ① | Identity | Identity | Names an accountable party; anchors reputation; carries cross-standard links. One object, many roles. |
| ② | Outcome Manifest | Asset | The canonical description of one Outcome Intelligence Object — what outcome it drives, how to install it, what it costs, who published it. |
| ③ | Capability | Asset | A reusable implementation component ("Skill"), referenced by id + hash; substance lives in MCP/SKILL.md. |
| ④ | Execution Configuration | Evidence | A content-addressed record of how an outcome was produced (planner, model, runtime, device, platform, persona, tools). An evidence dimension, never a reputation subject. |
| ⑤ | Execution Record | Evidence | The append-only system of record for one completed Human Outcome Loop: who, what config, what delegations, what result. |
| ⑥ | Outcome Evaluation | Evidence | The verdict on a record — did the outcome close the gap? — plus experience dimensions and the evidence class that backs it. |
| ⑦ | Attribution | Evidence | A conservative, evidence-led statement that some identity or configuration dimension contributed to an outcome. The bridge from evidence to reputation. |
| ⑧ | Trust Manifest / Metric | Asset (carrier) + Projection (values) | The published trust envelope; its values are a projection recomputed from evidence. Keyable per-Outcome and per-Identity. |
| ⟂ | Delegation Edge | Evidence | Not a standalone top-level object — a typed edge inside an Execution Record recording that one identity delegated a role to another. |
The relationships above are the whole model. Everything else in OIP — discovery, licensing, ranking, the reputation graph — is a projection over these objects or an asset published by an identity.
4. Identity Architecture
4.1 One object, many roles
OIP models identity once. There is a single Identity object whose role is a registry-governed value. This is the most important extensibility decision in the protocol.
Diagram (Mermaid source):
flowchart TB
Identity["Identity<br/>{ id · role · keys · links }"]
Identity --> P["role: principal"]
Identity --> EP["role: execution_partner"]
Identity --> DP["role: domain_partner"]
Identity --> D["role: delegate<br/>(+ specialty)"]
Identity --> PUB["role: publisher"]
Identity --> ORG["role: organization"]
Identity -.->|"future roles"| FUT["role: …<br/>(registry addition,<br/>not a schema change)"]
| Role | Who it is | Reputation? |
|---|---|---|
| Principal | The human or organization on whose behalf an agent acts; the party that hires and pays. | Holds preferences, not professional reputation. |
| Execution Partner | A persistent professional identity that accepts accountability for delivering an outcome — the "General Contractor." | Yes — primary. |
| Domain Partner | A persistent domain specialist an Execution Partner engages (in the DeepSenz reference implementation, a "Genie"). | Yes. |
| Delegate | A specialist role engaged for part of a journey: Planner, Researcher, Negotiator, Verifier, Reviewer, Coach, Curator, Concierge, … | Yes, scoped to the role. |
| Publisher | The party that signs and publishes an Asset. | Reputation as a publisher of quality assets. |
| Organization | A legal/operational umbrella that may employ or stand behind partners. | Aggregate / derived. |
The 12+ delegate specialties are not separate object types. They are role: delegate with a registry-governed specialty. Adding "Strategist" or "Curator" next year is a registry entry, not a protocol revision. This is why identity is role-based: the set of professional roles in an outcome economy is open-ended, and an open-ended set must be data, not schema.
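The "roles are data, not schema" idea can be sketched in a few lines. This is an illustration only, not the Specification's schema: the `Identity` field names follow the diagram in §4.1 (id, role, keys, links), while `ROLE_REGISTRY`, `SPECIALTY_REGISTRY`, the validation logic, and the `did:example:` identifier are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical in-memory registries. In OIP these are registry-governed
# values; the role names are taken from the table in section 4.1.
ROLE_REGISTRY = {
    "principal", "execution_partner", "domain_partner",
    "delegate", "publisher", "organization",
}
SPECIALTY_REGISTRY = {
    "planner", "researcher", "negotiator", "verifier",
    "reviewer", "coach", "curator", "concierge",
}


@dataclass(frozen=True)
class Identity:
    id: str
    role: str
    specialty: Optional[str] = None   # only meaningful for role == "delegate"
    keys: Tuple[str, ...] = ()
    links: Tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Roles and specialties are checked against data, never against types.
        if self.role not in ROLE_REGISTRY:
            raise ValueError(f"unregistered role: {self.role}")
        if self.specialty is not None:
            if self.role != "delegate":
                raise ValueError("specialty applies only to delegates")
            if self.specialty not in SPECIALTY_REGISTRY:
                raise ValueError(f"unregistered specialty: {self.specialty}")


# Adding a new specialty is a registry entry; the Identity class is untouched.
SPECIALTY_REGISTRY.add("strategist")
strategist = Identity(
    id="did:example:strategist-1", role="delegate", specialty="strategist"
)
```

The point of the sketch is the last three lines: a new professional role arrives by changing a set of values, and no object definition changes.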
4.2 Why reputation survives technology
The defining property of an Execution Partner is that it is the same identity across technology generations.
Diagram (Mermaid source):
flowchart LR
EP["Execution Partner 'Herman'<br/>(persistent identity)"]
EP -- 2026 --> T1["model: GPT-6<br/>runtime: MySenz<br/>framework: Hermes"]
EP -- 2029 --> T2["model: GPT-9<br/>runtime: (new)<br/>framework: (new)"]
T1 --> R1["reputation += verified outcomes"]
T2 --> R2["reputation += verified outcomes"]
R1 --> ACC[("one accumulating<br/>reputation")]
R2 --> ACC
A homeowner hires a general contractor, not a brand of drill. When the contractor upgrades tools, their reputation does not reset. OIP encodes exactly this: the technology used appears only inside Execution Configuration (evidence), never inside Identity. Because reputation is a projection over evidence keyed to the identity, swapping the model or runtime changes future configurations but never erases past reputation. A future Herman running GPT-9 is still Herman.
This is also why reputation must not belong to models. A model is a fungible, replaceable input that many identities use for many purposes; crediting it would (a) attribute outcomes to a covariate rather than a cause, and (b) make reputation evaporate the moment the model is deprecated. Reputation belongs to the party that chose, orchestrated, and stood behind the work — the identity.
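A minimal sketch of why the upgrade from GPT-6 to GPT-9 leaves reputation intact: if the tally is keyed to the identity and the model is only a recorded field, a change of model never resets the count. The `ExecutionRecord` shape and the `verified_outcomes` helper below are hypothetical simplifications, not the Specification's objects.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionRecord:
    partner_id: str   # the accountable identity
    model: str        # a configuration dimension, recorded as evidence only
    verdict: str      # e.g. "closed" or "not_closed"


def verified_outcomes(records, partner_id):
    """Count verdicts for one identity, ignoring which technology was used."""
    return Counter(r.verdict for r in records if r.partner_id == partner_id)


records = [
    ExecutionRecord("herman", "GPT-6", "closed"),
    ExecutionRecord("herman", "GPT-6", "not_closed"),
    ExecutionRecord("herman", "GPT-9", "closed"),   # after the model upgrade
]

herman_tally = verified_outcomes(records, "herman")
```

Here `herman_tally` contains two `closed` verdicts and one `not_closed`, spanning both models. Nothing in the function reads the `model` field, which is the structural version of "a future Herman running GPT-9 is still Herman."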
4.3 Portable identity
Identities carry links to their counterparts in adjacent standards (an A2A agent card, an MCP server identity, an ACP/UCP merchant identity) using a decentralized-identifier-style URI. This makes reputation portable across registries and across standards without any of those standards needing to know OIP exists. A partner's identity can be cryptographically the same entity whether it is discovered through OIP, contacted through A2A, or paid through ACP/UCP.
5. Asset Architecture
Assets are the publishable, claimable objects — the supply side of the ecosystem. They are versioned and content-addressed because a publisher's claims evolve and must not be silently mutable after they are trusted.
Diagram (Mermaid source):
flowchart TB
Publisher["Identity{role: publisher}"] -- signs & publishes --> OM
subgraph "Outcome Intelligence Object"
OM["Outcome Manifest<br/>(the canonical description)"]
OM --> PKG["Package refs<br/>(CAP / App Intent / MCP server / …)"]
OM --> CAP["Capability refs<br/>(Skills)"]
OM --> KM["Knowledge Module refs"]
OM --> TM["Trust Manifest"]
OM --> LIC["License terms"]
OM --> ADP["Compatibility Adapters<br/>(A2A · MCP · SKILL.md · App Intents)"]
end
| Asset | What it is | Why it is publishable |
|---|---|---|
| Outcome Intelligence Object | The asset as a whole: reusable, provenance-backed expertise for one outcome. | It is the thing discovered, trusted, licensed, and used. |
| Outcome Manifest | The canonical machine-readable description and source of truth for the object. | Agents must be able to read what an object does without running it. |
| CAP / Package | One packaged runtime format the object can be delivered as. | A runtime needs installable bytes; the manifest references them by hash. |
| Capability ("Skill") | A reusable implementation component, referenced by id + hash. | Skills compose across many objects; referencing (not copying) keeps them reusable and verifiable. |
| Knowledge Module | A packaged, refreshable body of situational knowledge an object draws on. | Knowledge goes stale; packaging it separately lets it be refreshed and freshness-scored independently. |
| Trust Manifest | The trust + provenance envelope, separate from marketing copy. | Trust must be verifiable independently of the sales pitch. |
| License | The entitlement to install/execute, as a token + entitlement record. | Commerce requires a portable, verifiable grant of access. |
| Compatibility Adapter | Declared (and optionally materialized) export to another ecosystem's format. | Interop requires the object to present itself to A2A/MCP/SKILL.md/App Intents on their terms. |
The architectural point: assets are claims; evidence is what tests those claims. A manifest claims an outcome; the trust values on its Trust Manifest are a projection of the evidence that the claim has held. The publisher controls the claim; the publisher does not control the trust values — those are computed from evidence the publisher cannot forge (§6, §7).
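Content addressing is what makes "a claim cannot be silently altered after it is trusted" checkable. The sketch below assumes SHA-256 over a sorted-key JSON serialization; the Specification may define a different canonicalization, and the `content_address` and `verify_ref` helpers, the `sha256:` prefix, and the capability fields are all illustrative.

```python
import hashlib
import json


def content_address(obj: dict) -> str:
    """Hash a canonical serialization so equal content yields an equal address."""
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_ref(ref: dict, fetched: dict) -> bool:
    """Check that fetched content matches the hash pinned in a reference."""
    return ref["hash"] == content_address(fetched)


# A hypothetical capability and the id + hash reference a manifest would carry.
capability = {"name": "route-planner", "version": "1.0.0"}
manifest_ref = {"id": "capability:route-planner", "hash": content_address(capability)}

# Any change to the content, however small, breaks the pinned reference.
tampered = {**capability, "version": "1.0.1"}
```

With these definitions `verify_ref(manifest_ref, capability)` holds and `verify_ref(manifest_ref, tampered)` does not, so a consumer that trusted the reference can detect the alteration without trusting the publisher's word.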
6. Evidence Architecture
Evidence is the heart of OIP. It is the only category that is a system of record, and it is append-only.
6.1 The append-only philosophy
Diagram (Mermaid source):
flowchart LR
subgraph "Append-only evidence log"
direction LR
e1["record 1"] --> e2["record 2"] --> e3["record 3"] --> e4["… record n"]
end
e4 -. "never edited" .-> e1
NEW["new outcome"] --> APP["append record n+1"]
APP --> e4
Three reasons evidence must be append-only, each load-bearing:
- The past is a fact. An outcome happened; recording it does not make it negotiable. Editing evidence would make every reputation built on it deniable.
- Verified failures are valuable, not embarrassing. OIP inherits the runtime principle that verification verifies truth, not success: a recorded outcome of not_closed, with the failure reason captured, is among the most valuable evidence in the system, because it is the closest thing to a controlled experiment. A mutable log invites quietly deleting failures, which is exactly how trust becomes theater.
- Auditability in three hops. Because nothing is overwritten, any trust claim can be traced (score → supporting Attribution → Execution Record) in a bounded number of hops, forever. This property is destroyed by in-place mutation.
When the world changes, evidence is added, never edited. A later record may contradict an earlier interpretation; the protocol's response is to flag the affected projection for review, never to average the contradiction away or rewrite history.
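The append-only rule can be made structural rather than a convention. In this sketch (an in-memory illustration, not a storage design) the log exposes `append` and read access only, and each record is a frozen dataclass; the `EvidenceLog` and `Record` names and fields are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    subject: str          # identity the record concerns
    verdict: str          # e.g. "closed", "not_closed"
    failure_reason: str = ""


class EvidenceLog:
    """An in-memory log that can grow but offers no way to edit or delete."""

    def __init__(self):
        self._records = []

    def append(self, record: Record) -> int:
        self._records.append(record)
        return len(self._records) - 1   # position of the new record

    def __getitem__(self, index: int) -> Record:
        return self._records[index]

    def __len__(self) -> int:
        return len(self._records)

    def __iter__(self):
        return iter(tuple(self._records))

    # Deliberately no __setitem__ or __delitem__: history is not editable.


log = EvidenceLog()
log.append(Record("herman", "not_closed", failure_reason="booking unavailable"))
log.append(Record("herman", "closed"))   # a later success is added, not swapped in
```

The failed outcome stays at position 0 after the success is recorded. Attempting `log[0] = ...` raises `TypeError`, and assigning to a field of a stored record raises `dataclasses.FrozenInstanceError`.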
6.2 The evidence objects and their lifecycle
Diagram (Mermaid source):
flowchart TB
CFG["④Execution Configuration<br/>(content-addressed: how it was produced)"]
REC["⑤Execution Record<br/>(append-only: the loop)"]
EDG["⟂ Delegation Edges<br/>(who did which role)"]
EVL["⑥Outcome Evaluation<br/>(verdict + experience + evidence class)"]
OEV["Outcome Events<br/>(atomic, comparable projections of the record)"]
ATR["⑦Attribution<br/>(conservative credit to identities / dimensions)"]
CFG --> REC
EDG --> REC
REC --> EVL
REC --> OEV
EVL --> ATR
OEV --> ATR
- Execution Configuration records how an outcome was produced — the planner, model, reasoning model, agent framework, runtime, device, platform, persona, and tools. It is content-addressed (its id is a hash of its contents) and is an evidence dimension: something outcomes can be correlated with, never something that earns reputation (§6.4).
- Execution Record is the append-only system of record for one completed Human Outcome Loop. It references the principal, the accountable Execution Partner, the configuration used, the manifest delivered, and the delegation edges within. It is sealed once written.
- Delegation Edge is a typed edge inside the record: role → delegate identity, under a parent identity. The set of edges in a record is a DAG, not a fixed tree: a skill may serve many roles, and a partner may be reused. Delegation is visible in evidence; the policy that produced it lives in the runtime/extension layer.
- Outcome Evaluation is the verdict: did the outcome close the gap (closed | partially_closed | not_closed | worsened | …), plus experience dimensions (stress reduced, time saved, would-use-again) and the evidence class that backs the verdict (third-party > sensor > receipt > media > self-report). The evaluation becomes evidence; it is not "a review."
- Outcome Events are the atomic, comparable unit of learning: deterministic projections of a sealed record, one per gap addressed, that can be aggregated across journeys. They are how the system compares like with like. (Where the Execution Record is journey-shaped and narrative, the Outcome Event is small and comparable: a commit vs. a diff stat.)
- Attribution is the conservative bridge from evidence to reputation (§6.3).
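The statement that delegation edges form a DAG rather than a tree can be checked mechanically before a record is sealed. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the `(parent, role, delegate)` tuple shape and the `edges_form_dag` helper are assumptions for illustration, not the Specification's edge format.

```python
from graphlib import CycleError, TopologicalSorter


def edges_form_dag(edges) -> bool:
    """Return True if (parent, role, delegate) edges contain no cycle."""
    predecessors = {}
    for parent, _role, delegate in edges:
        predecessors.setdefault(parent, set())
        predecessors.setdefault(delegate, set()).add(parent)
    try:
        # static_order() raises CycleError once iteration starts if a cycle exists.
        tuple(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True


# A partner reused by two parents: valid as a DAG, impossible in a strict tree.
shared_delegate = [
    ("herman", "planner", "planner-1"),
    ("herman", "researcher", "researcher-1"),
    ("planner-1", "verifier", "verifier-1"),
    ("researcher-1", "verifier", "verifier-1"),
]

# A delegation loop, which a sealed record should never contain.
loop = [
    ("herman", "planner", "planner-1"),
    ("planner-1", "reviewer", "herman"),
]
```

`edges_form_dag(shared_delegate)` is true even though `verifier-1` has two parents, while `edges_form_dag(loop)` is false.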
6.3 Conservative attribution
Attribution is where most trust systems cheat, by assigning simple credit ("this got 5 stars, so the model is great"). OIP forbids that shortcut structurally.
Diagram (Mermaid source):
flowchart LR
OEV["Outcome Evaluation<br/>(one journey)"] --> A1{"enough<br/>evidence?"}
A1 -- "n small" --> LOW["Attribution: low-confidence<br/>sample_size_warning = true<br/>correlation only"]
A1 -- "n large, repeated" --> HIGH["Attribution: higher-confidence<br/>causal language earned<br/>across many evaluations"]
LOW --> REP["Reputation (projection)"]
HIGH --> REP
Rules that make attribution conservative:
- Correlation before causation. A single evaluation produces a low-confidence attribution carrying a sample-size warning. Causal language is earned only by repeated evidence across many evaluations.
- Sample-size-aware by construction. Attribution confidence uses the same statistical discipline as the trust metric (a Wilson lower bound / Beta posterior on outcome counts), so "one success out of one" never reads as proven.
- Explainable and versioned. Attribution weighting is pinned to a stated method_version. Learning which configurations correlate with good outcomes is permitted; learning an opaque credit-assignment policy is not.
- Separated subjects. An attribution may credit an identity (which accrues reputation) or note correlation with a configuration dimension (which does not). The two never blur.
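The sample-size rule can be made concrete with the Wilson score lower bound, one of the two techniques named above. This is a sketch of the standard formula, not the pinned method from the Specification: the 95% z-value, the `min_sample` threshold of 30, and the `attribution_confidence` wrapper are assumptions chosen for illustration.

```python
import math


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a success proportion."""
    if trials == 0:
        return 0.0
    p = successes / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    return (centre - margin) / (1 + z2 / trials)


def attribution_confidence(successes: int, trials: int, min_sample: int = 30) -> dict:
    """Pair the bound with a warning flag; min_sample is an illustrative cutoff."""
    return {
        "lower_bound": wilson_lower_bound(successes, trials),
        "sample_size_warning": trials < min_sample,
    }


one_of_one = attribution_confidence(1, 1)
ninety_of_hundred = attribution_confidence(90, 100)
```

One success out of one yields a lower bound of roughly 0.21 with the warning set, while 90 out of 100 yields roughly 0.83 with no warning. A raw success rate would rank the first case higher (1.00 versus 0.90); the bound reverses that ordering, which is the behavior the rule asks for.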
6.4 Why configurations are evidence dimensions, not reputation objects
This deserves its own statement because it is the most common category error in outcome systems.
Diagram (Mermaid source):
flowchart TB
subgraph WRONG["✗ Category error"]
M1["GPT-6"] --> S1["⭐ 4.8 reputation"]
end
subgraph RIGHT["✓ OIP"]
OUT["verified outcome"] --> EP2["credited to: Execution Partner<br/>(identity, accountable)"]
OUT -. "correlated with (covariate)" .-> CFG2["config: model=GPT-6<br/>(evidence dimension)"]
end
A configuration — a model, a runtime, a device — is a fungible input that countless identities use for countless purposes. Crediting it with reputation is wrong on two counts:
- It mistakes a covariate for a cause. That GPT-6 was present when an outcome succeeded does not mean GPT-6 caused it; the Execution Partner's planning, sequencing, and judgment did. Configuration is something to correlate against, which is exactly the kind of evidence dimension that helps a partner choose better tools — but it is not the accountable party.
- It makes reputation evaporate. Tie reputation to GPT-6 and it vanishes when GPT-6 is deprecated. Tie it to the identity that chose GPT-6, and it persists across the upgrade to GPT-9.
So configurations are recorded richly (they are valuable evidence for learning what works) but they carry no trust score and no reputation field. They are dimensions of evidence, full stop.
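One way to picture "recorded richly, but no reputation field" is a correlation table grouped by a configuration dimension. The event shape and the `correlate_by_dimension` helper below are hypothetical; the output deliberately contains counts only, with nothing that could be read as a score for the model.

```python
from collections import defaultdict


def correlate_by_dimension(events, dimension):
    """Group verdict counts by one configuration dimension (counts, not scores)."""
    table = defaultdict(lambda: {"closed": 0, "total": 0})
    for event in events:
        row = table[event["config"][dimension]]
        row["total"] += 1
        if event["verdict"] == "closed":
            row["closed"] += 1
    return dict(table)


events = [
    {"partner": "herman", "config": {"model": "GPT-6"}, "verdict": "closed"},
    {"partner": "herman", "config": {"model": "GPT-6"}, "verdict": "not_closed"},
    {"partner": "herman", "config": {"model": "GPT-9"}, "verdict": "closed"},
]

by_model = correlate_by_dimension(events, "model")
```

A partner can use `by_model` to decide which tools to pick next, which is the legitimate use of a covariate. Credit for the outcomes still goes to the `partner` field, and the table has no place to store a rating for GPT-6.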
7. Projection Architecture
Projections are everything the ecosystem computes from evidence: trust scores, the reputation graph, discovery results, recommendations, commerce rankings. None of them are systems of record.
7.1 Why projections are recomputable
Diagram (Mermaid source):
flowchart LR
EVID[("Append-only evidence<br/>(system of record)")]
EVID --> TS["TrustSenz scores"]
EVID --> RG["Reputation Graph"]
EVID --> DISC["Discovery index"]
EVID --> REC["Recommendations"]
EVID --> RANK["Commerce rankings"]
TS -. "rebuild any time" .-> EVID
RG -. "rebuild any time" .-> EVID
DISC -. "rebuild any time" .-> EVID
REC -. "rebuild any time" .-> EVID
RANK -. "rebuild any time" .-> EVID
A projection is a current interpretation of evidence under a stated method. It must be recomputable for three reasons:
- Methods improve. A better trust formula or attribution model should be applied to all history by re-deriving — not by migrating stored conclusions that would otherwise disagree with the new method.
- Evidence arrives and is sometimes discredited. New verified outcomes change the interpretation; a discredited evidence source should change every score that depended on it. Only recomputation makes this clean.
- Authority must stay with evidence. If a stored score is authoritative, it can drift from the evidence and no one can tell which is right. Making projections explicitly non-authoritative and rebuildable keeps the evidence as the sole source of truth.
This is the discipline that prevents the "70-table" sprawl: nothing derived is ever migrated; it is re-run. A trust score, a reputation graph, a ranking — all can be thrown away and rebuilt byte-for-byte from the evidence log under a pinned method version.
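The "re-run, never migrate" discipline amounts to treating a projection as a pure function of the evidence and a pinned method version. The sketch below is illustrative only: the two methods (a raw success rate for `v1`, a Beta(1,1) posterior mean for `v2`) are stand-ins and are not the TrustSenz formula, and the evidence tuples are a simplified shape.

```python
import json
from collections import defaultdict


def project_trust(evidence, method_version):
    """Derive per-identity scores from (identity_id, verdict) evidence tuples."""
    counts = defaultdict(lambda: [0, 0])   # identity -> [closed, total]
    for identity_id, verdict in evidence:
        counts[identity_id][1] += 1
        if verdict == "closed":
            counts[identity_id][0] += 1

    scores = {}
    for identity_id, (closed, total) in sorted(counts.items()):
        if method_version == "v1":
            value = closed / total               # raw success rate
        elif method_version == "v2":
            value = (closed + 1) / (total + 2)   # Beta(1,1) posterior mean
        else:
            raise ValueError(f"unknown method_version: {method_version}")
        scores[identity_id] = {
            "score": round(value, 4),
            "n": total,
            "method_version": method_version,
        }
    return scores


def serialize(projection) -> str:
    """Stable serialization, so two rebuilds can be compared byte for byte."""
    return json.dumps(projection, sort_keys=True)


EVIDENCE = [
    ("herman", "closed"),
    ("herman", "closed"),
    ("herman", "not_closed"),
    ("genie", "closed"),
]

first_build = serialize(project_trust(EVIDENCE, "v1"))
rebuilt = serialize(project_trust(list(EVIDENCE), "v1"))   # thrown away, re-run
upgraded = project_trust(EVIDENCE, "v2")                    # new method, same evidence
```

`first_build` and `rebuilt` are identical strings, and moving to `v2` re-derives every score from the same evidence without touching any stored conclusion.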
7.2 The projection objects
| Projection | What it derives | From |
|---|---|---|
| TrustSenz | The Outcome Trust Metric: an explainable, sample-size-aware trust score with band, confidence, and exposed evidence. | Outcome Evaluations + Attributions |
| Reputation Graph | The web of identities and their earned standing, traversable for discovery and delegation. | Attributions keyed to identities |
| Discovery | The searchable index of outcome objects, filterable by domain, band, confidence. | Manifests + Trust projections |
| Recommendations | The currently-active, sufficiently-confident suggestions for a given need. | Trust + Reputation + context |
| Commerce rankings | The ordering of options a buyer or agent sees. | Trust + Reputation + license/price |
The Reputation Graph is a projection, not a system of record. It is one of the most tempting objects to treat as authoritative — it feels like the asset. But it is recomputed from append-only Attributions keyed to identities. Reputation is not stored as a number on an identity; it is derived from the identity's evidence trail whenever needed. This is what makes "reputation must be earned" structurally true: there is no field to set, only evidence to accumulate.
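Because a projection is derived rather than stored, each value can carry pointers back to the evidence that produced it. The sketch below walks the three-hop trace from §6.1 (score → supporting Attribution → Execution Record). All field names, ids, and the `audit_trail` helper are hypothetical.

```python
def audit_trail(score, attributions, records):
    """Follow a score to its attributions, then to the records they cite."""
    trail = []
    for attribution_id in score["attribution_ids"]:        # hop 1 -> hop 2
        attribution = attributions[attribution_id]
        record = records[attribution["record_id"]]         # hop 2 -> hop 3
        trail.append(
            (score["subject"], attribution_id, record["id"], record["verdict"])
        )
    return trail


# Illustrative data: one projected score backed by two attributions.
records = {
    "rec-1": {"id": "rec-1", "verdict": "closed"},
    "rec-2": {"id": "rec-2", "verdict": "not_closed"},
}
attributions = {
    "attr-1": {"record_id": "rec-1", "credited": "herman"},
    "attr-2": {"record_id": "rec-2", "credited": "herman"},
}
score = {"subject": "herman", "attribution_ids": ["attr-1", "attr-2"]}

trail = audit_trail(score, attributions, records)
```

Each entry in `trail` names the subject, the attribution, and the record with its verdict, including the failed one. The walk is bounded by the number of attributions a score cites, and it works only because the records it points at are never overwritten.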
8. Execution Flow
The end-to-end loop ties the four categories together. A principal experiences one trusted relationship; everything below the Execution Partner is recorded as evidence and rendered, when needed, as "your partner did X."
Diagram (Mermaid source):
flowchart TB
PR["Principal<br/>(Identity)"] -->|hires| EP["Execution Partner<br/>(Identity · accountable)"]
EP -->|plans| PLAN["Outcome Planning<br/>(gap → desired state)"]
PLAN -->|selects| CFG["Execution Configuration<br/>(Evidence dimension)"]
CFG -->|runs on| RT["Runtime<br/>(transient Execution Instance)"]
RT -->|seals| REC["Execution Record<br/>(append-only Evidence)"]
REC -->|verdict| EVL["Outcome Evaluation<br/>(Evidence)"]
EVL -->|conservative| TS["TrustSenz<br/>(Projection)"]
TS -->|derives| REP["Reputation<br/>(Projection, on the Identity)"]
REP -. "informs the next" .-> EP
classDef id fill:#eef,stroke:#446;
classDef ev fill:#fee,stroke:#644;
classDef pj fill:#efe,stroke:#464;
class PR,EP id;
class CFG,REC,EVL ev;
class TS,REP pj;
Reading the flow by category:
- Identity → Identity: the principal binds to one Execution Partner. This is the trusted relationship; it is the only thing the principal must understand.
- Planning → Configuration → Runtime: the partner decides how to produce the outcome and on what technology. The configuration is captured as evidence; the runtime/instance is transient and earns no reputation.
- Execution Record → Outcome Evaluation: the loop is sealed as immutable evidence and given a verdict, including the honest cases of partial or failed outcomes.
- TrustSenz → Reputation: the evaluation is attributed conservatively and projected into trust and reputation, which then inform the next engagement — closing the flywheel (§11).
The transient layer is deliberately small and disposable: a runtime, a model, an Execution Instance (e.g. a process handle like Herman-501) exist for the duration of one loop and leave behind only evidence. Nothing durable depends on which runtime ran.
9. DeepSenz Reference Implementation
DeepSenz is one implementation of OIP, useful for grounding the abstract objects in concrete actors. Nothing in this section is part of core OIP. Another organization could implement OIP with entirely different actors and vocabulary and interoperate fully.
Diagram (Mermaid source):
flowchart TB
H["Execution Partner: 'Herman'<br/>core role: execution_partner"] --> G
subgraph G["Domain Partners: 'Genies'"]
G1["Travel Genie SEA"]
G2["Wellness Genie"]
G3["Finance / Health / Longevity Genie"]
end
G --> OP["Outcome Packages (CAPs)<br/>core asset: Outcome Manifest + package"]
OP --> SK["Skills<br/>core asset: Capability"]
SK --> RT["Runtime: MySenz, Apple Intelligence, …<br/>core: Execution Configuration + Instance"]
The mapping from DeepSenz vocabulary to core OIP objects:
| DeepSenz term (reference impl) | Core OIP object |
|---|---|
| Herman (the Execution Partner) | Identity{role: execution_partner} |
| Genie (Travel Genie SEA, Wellness Genie, …) | Identity{role: domain_partner} |
| CAP / Outcome Package | Outcome Manifest + package ref |
| Skill | Capability |
| Knowledge Module | Knowledge Module asset |
| MySenz / device / platform | Execution Configuration dimensions + Execution Instance |
| Persona (e.g. Luxury Wellness Concierge) | a presentation dimension of Execution Configuration |
The crucial discipline: the brand vocabulary stays out of the core schema. Core OIP speaks of execution_partner, domain_partner, outcome package, and capability. "Genie," "CAP," "Herman," and "MySenz" are reference-implementation names. This keeps the protocol adoptable by parties who are not DeepSenz, and keeps DeepSenz free to rename its products without touching the standard.
10. Private Extension Profile Boundary
Reference-implementation extension-profile details are intentionally out of scope for this public preview. Core OIP remains discoverable, verifiable, licensable, and attributable without reading private extension-profile contents.
11. Agentic Commerce
OIP's economic model follows directly from the architecture: a market in which trust is the scarce asset, earned through repeated verified outcomes rather than asserted through marketing, benchmarks, or popularity.
11.1 The participants and what they exchange
Diagram — view Mermaid source
flowchart LR
PR["Principal<br/>(demand · pays)"] -->|hires & licenses| EP["Execution Partner<br/>(accountable supply)"]
EP -->|engages| DP["Domain Partners<br/>(specialist supply)"]
DP -->|acquire & deliver| OP["Outcome Packages<br/>(the goods)"]
OP -->|priced via| LIC["License<br/>(free / one-time / future terms)"]
EP -->|merchant-of-record on| ACP["ACP / UCP rails<br/>(settlement)"]
- Execution Partners sell accountability for outcomes; they are the merchant-of-record and the party a principal trusts.
- Domain Partners supply specialist depth; they earn their own reputation and are engaged by Execution Partners.
- Outcome Packages are the goods — licensed, installed, executed.
- Trust, discovery, licensing, and reputation are the market infrastructure OIP standardizes; settlement rides existing commerce rails (ACP/UCP), which OIP complements rather than re-implements.
11.2 The commerce flywheel
Diagram — view Mermaid source
flowchart TB
A["Execution Partner selected"] --> B["Outcome delivered"]
B --> C["Execution Record + Evaluation<br/>(append-only evidence)"]
C --> D["Conservative Attribution"]
D --> E["Reputation improves<br/>(projection)"]
E --> F["Discovery & ranking improve"]
F --> G["More principals choose the partner"]
G --> H["More verified outcomes"]
H --> C
E --> I["Higher-quality recommendations"]
I --> J["Better outcomes"]
J --> H
The flywheel's defining property: every turn deposits more append-only evidence, which makes reputation more defensible and discovery more accurate, which in turn attracts more outcomes. Because the evidence is verified and attributed conservatively, the loop rewards actually delivering outcomes, not gaming a rating. This is the conceptual foundation for future commercial work: the asset that compounds is the verified outcome corpus, and OIP is the standard that makes that corpus portable, auditable, and trustworthy.
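The relationship between append-only evidence and reputation as a projection can be sketched in a few lines of Python. This is an illustration under stated assumptions, not OIP's attribution method: the `EvidenceRecord` fields, the `EvidenceLog` class, and the scoring rule (count only verified records, then apply a smoothed success rate) are all hypothetical stand-ins for "conservative attribution".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRecord:
    """One immutable evidence entry (hypothetical shape)."""
    partner_id: str
    verified: bool   # whether the outcome was verified
    succeeded: bool  # whether the evaluation judged it a success


class EvidenceLog:
    """Append-only log: records can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._records: list[EvidenceRecord] = []

    def append(self, record: EvidenceRecord) -> None:
        self._records.append(record)

    def records(self) -> tuple[EvidenceRecord, ...]:
        # Hand out a snapshot so callers cannot alter the log.
        return tuple(self._records)


def reputation(log: EvidenceLog, partner_id: str) -> float:
    """Recompute a score from scratch as a projection over the log.

    Assumed conservative rule: unverified records are ignored, and a
    smoothed success rate keeps a thin track record close to 0.5.
    """
    verified = [
        r for r in log.records()
        if r.partner_id == partner_id and r.verified
    ]
    successes = sum(r.succeeded for r in verified)
    return (successes + 1) / (len(verified) + 2)
```

Because the score is never stored, only derived, a change to the scoring rule means recomputing over the same log rather than migrating data. Under this particular rule, unverified successes do not move the score at all, which is the sense in which the loop rewards delivered outcomes rather than asserted ones.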
11.3 Why this resists commoditization
If frontier models become commodities, then prompts, scaffolding, and orchestration are commodities too. What cannot be commoditized within a quarter is a structured, attributable, evidence-classed corpus of what actually worked and failed for real people, together with the persistent professional identities whose reputation that corpus backs. OIP is designed so that this corpus, not any single technology, is where durable value accrues.
12. Future Evolution
OIP v1.0 is deliberately minimal. The architecture is designed so the following can be added later without protocol-breaking changes — each as data, a projection, or an extension profile.
Diagram — view Mermaid source
flowchart TB
NOW["OIP v1.0<br/>single registry · first-party · centralized trust"]
NOW --> A["Multiple registries"]
NOW --> B["Cross-registry reputation<br/>(portable identities)"]
NOW --> C["Distributed / decentralized trust"]
NOW --> D["Decentralized attribution"]
NOW --> E["Learning attribution weights<br/>(explainable, versioned)"]
NOW --> F["Outcome Recipes<br/>(reusable proven configurations)"]
NOW --> G["Long-term governance<br/>of core & extension profiles"]
- Distributed trust & multiple registries. v1.0 assumes one registry and a centralized trust authority; the schema does not require singularity. Federation of registries is additive.
- Cross-registry reputation. Because identities are portable (did-style URIs) and reputation is a projection over evidence, an identity's reputation can be recomputed from evidence held across registries — the groundwork is already in the identity model.
- Decentralized attribution & trust networks. A natural long-horizon question, deliberately deferred: keep the trust authority centralized until there is trust worth decentralizing. The append-only evidence model is compatible with later decentralization.
- Learning attribution. The system may learn which configurations correlate with good outcomes, but attribution weighting must remain explainable and method_version-pinned, never an opaque learned credit policy. This constraint is permanent, not transitional.
- Outcome Recipes. Reusable, proven configurations ("this combination consistently produces good outcomes for this kind of person under these circumstances") emerge naturally as projections over the corpus, without new core objects.
- Portable identities & long-term governance. The open governance questions — who recognizes extension profiles, how core versions are ratified, how cross-organization partner collaboration and reputation transfer work — are the right questions for a maturing standard. They are explicitly out of scope for v1.0 and in scope for the standardization track.
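Two of the items above, cross-registry reputation and method_version-pinned attribution, can be illustrated together. The Python sketch below is a minimal illustration under assumptions, not a proposed design: the record fields (`record_id`, `identity`, `evidence_class`, `succeeded`), the evidence class names, the weight values, and both function names are hypothetical. Only the ideas of portable identities, reputation as a projection over evidence, and pinning to a `method_version` come from the text.

```python
# Hypothetical, human-readable weight tables keyed by method_version.
# A new method adds a key; published versions are never edited in place.
ATTRIBUTION_METHODS = {
    "v1": {"verified": 1.0, "self_reported": 0.0},
    "v2": {"verified": 1.0, "self_reported": 0.25},
}


def merge_evidence(*registries):
    """Union evidence from several registries, de-duplicated by record id."""
    merged = {}
    for registry in registries:
        for record in registry:
            # The first copy of a record wins; later duplicates are ignored.
            merged.setdefault(record["record_id"], record)
    return list(merged.values())


def recompute_reputation(identity, evidence, method_version):
    """Weighted success share for one identity under a pinned method version."""
    weights = ATTRIBUTION_METHODS[method_version]  # KeyError if unknown
    total = 0.0
    credited = 0.0
    for record in evidence:
        if record["identity"] != identity:
            continue
        weight = weights[record["evidence_class"]]
        total += weight
        if record["succeeded"]:
            credited += weight
    return {
        "identity": identity,
        "method_version": method_version,  # recorded so the score is reproducible
        "score": credited / total if total else None,
    }
```

Every score carries the method version that produced it, and the weights are a plain table a reviewer can read, so a result can be explained and recomputed. The same evidence scored under a different version yields a different, separately labeled result rather than silently replacing the old one.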
Closing note
The measure of this architecture is not how much it does in v1.0, but how much it can absorb over a decade without breaking. Every hard choice in it — four categories, one identity object, append-only evidence, recomputable projections, configurations as evidence dimensions, the extension boundary — exists to make growth arrive as data, projection, or extension profile, never as a protocol-breaking change. That is what lets OIP aspire to be, like TCP/IP or OAuth, an architecture that outlives the technologies it coordinates.
This is an informative companion to the normative OIP-Specification-v1.0.md. Where it and the Specification appear to differ, the Specification governs the wire format and this document governs the intent.