Rendered from the canonical OIP-Specification-v1.0.md — the normative wire format. Public Review Draft.

SkillSenz Outcome Intelligence Protocol (OIP) v1.0

Status: Canonical specification, v1.0 (draft for Phase 1 implementation). Includes §5A, the OIP v0.2 Object Model (Execution Partner Architecture), a forward-looking non-breaking superset folded in 2026-06-25.

Author role: Protocol architect / API designer / trust-system designer

Date: 2026-06-25

Grounded against: EvoSenz v2 review (OutcomeEvent as atomic unit, three-artifact rule, evidence-class weighting, Wilson/Beta confidence), RuntimeEvidenceAdapter review (border-checkpoint, no trust laundering, unknowns-not-defaults, provenance-by-pointer), Travel Genie Journey Capture review (verdict enum, skip≠no, synthetic flag), SignalSenz salience review (default-deny, explainable decisions), and the Execution Partner Architecture review (docs/skillsenz-oip-execution-partner-architecture-review-june-2026.md: identity-as-role, four object categories, attribution-by-evidence). Registry conventions follow registry/signals/*.v1.json.

One-line thesis. OIP is the wire format and trust contract for publishing, discovering, verifying, licensing, installing, and rating Outcome Intelligence objects. SkillSenz.ai is one renderer of this protocol — a projection of Outcome Manifests into HTML — not the protocol itself.

Founder doctrine encoded here. Prove a human or agent will pay for a trusted outcome before building marketplace machinery. v1.0 ships exactly enough to publish and sell one first-party Outcome Intelligence object (Travel Genie Da Nang Relocation) with an honest, evidence-led trust score. Trust fields exist from day one; most are null, and null is treated as an honest answer, never a zero.


1. Executive Summary

AI agents can call models, tools, and APIs. What they cannot reliably do is answer "which packaged expertise should I trust to guide this principal toward this real-world outcome, and how do I know?" Today that judgment is made from vibes: star ratings, download counts, marketing copy. None of those are evidence that the packaged expertise actually closed a gap for a real human.

OIP defines a machine-readable, agent-first way to exchange Outcome Intelligence — reusable, provenance-backed, versioned knowledge/action packages that help an agent guide a principal toward a specific real-world outcome — together with an evidence-led trust metric that exposes why an object is trustworthy, not just how many stars it has.

The problem OIP solves: discovery without trust is noise, and trust without evidence is theater. OIP makes every trust score auditable down to the journeys behind it, refuses to let a one-journey 100%-success object outrank a hundred-journey object, and treats verified outcome events as worth more than unverified reviews.

OIP v1.0 is deliberately small: one Outcome Manifest schema, one Trust Manifest, one trust metric, a thin REST+MCP+A2A discovery surface, and free / one-time-purchase licensing. Everything else — third-party publishing, negotiation, subscriptions, public reviews — is named as a future extension and explicitly excluded from v1.


2. Design Goals

  1. Machine-readable first. The Outcome Manifest is the source of truth; it is valid JSON against a published schema.
  2. Agent-first. An autonomous agent can discover, evaluate trust, license, and install with no human-readable page in the loop.
  3. Human-readable second. Websites and docs are projections of manifests, always linking back to the JSON.
  4. API-compatible. A small, stable REST surface plus OpenAPI.
  5. Registry-compatible. A registry is an index of manifests; multiple registries can coexist (federation is a Phase 3 concern, but the schema does not assume a single registry).
  6. Runtime-neutral. No assumption that MySenz, MCP, Apple App Intents, or any one runtime wins. Compatibility is declared metadata.
  7. Trust-centric. Every object carries a Trust Manifest; you cannot list without one (even if most fields are null).
  8. Score-aware. A single comparable overall_trust_score exists for ranking — but never travels without its band, confidence, sample size, and evidence.
  9. Rating-aware. Ratings are first-class inputs but are gated by sample size and outweighed by verified evidence.
  10. Versioned. Manifests, trust scores, score methods, and packages all carry explicit versions; semver governs package evolution.
  11. Portable. Provenance is by pointer (content digests + refs), so an object can move between registries and hosts without losing its evidence chain.
  12. Minimal MVP surface. One paid first-party object end-to-end; no speculative infrastructure.
  13. Future-extensible. Closed vocabularies with explicit extension points (registry-governed claim/evidence lists, reserved namespaces) so growth happens by adding data, not by mutating the envelope.

3. Non-Goals (explicitly excluded from v1.0)


4. Core Concepts and Vocabulary

These terms are load-bearing and must not be collapsed into one another.

| Term | Definition |
| --- | --- |
| Outcome Intelligence Object | The asset: a reusable, provenance-backed, versioned knowledge/action package designed to guide an agent and principal toward one real-world outcome. Discovered, trusted, licensed, installed, rated, used. |
| Outcome Manifest | The canonical machine-readable JSON description of an Outcome Intelligence Object. The system of record for what the object is. |
| Outcome Trust Metric | The protocol-defined trust model used to rank, compare, and evaluate objects. Produces an overall_trust_score, a band, a confidence, and the evidence behind them. |
| TrustSenz | The DeepSenz engine that calculates and updates trust metrics from evidence, provenance, freshness, outcomes, ratings, and repeated execution. TrustSenz verifies truth, not success; a verified failure is valid evidence. |
| Trust Delta | The change in trust caused by a verified journey, rating, correction, freshness update, failure, or successful outcome. Append-only; deltas are flagged for review on contradiction, never silently averaged. |
| CAP | One possible packaged runtime format for execution (Capsule / Experience Capsule). A CAP is a package an object may reference; it is not the object itself. |
| Skill export | A compatibility/export layer that renders an object into a SKILL.md-style package for broader skill ecosystems. |
| Registry | An index of Outcome Manifests. SkillSenz.ai is one registry implementation. |
| Runtime | The system that installs or executes a referenced package: MySenz, an MCP tool host, Apple App Intents, a local agent runtime, future runtimes. |
| Publisher | The entity that creates and signs an Outcome Intelligence Object. In v1, first-party (DeepSenz) only. |
| Principal | The human or organization on whose behalf an agent acts. |
| License | The entitlement to access/install/execute an object's package, expressed as a license token + entitlement record. |
| Trust Manifest | The trust + provenance layer, kept separate from marketing claims: identity, signatures, hashes, citations, freshness TTLs, trust score object, score/delta history, rating summary, journey evidence summary, disclaimers, limitations. |
| Provenance | The chain of where the object's knowledge and packages came from, expressed by pointer (refs + content digests), not by copying source bodies. |
| Freshness | The recency state of an object's situational knowledge: a freshness date, an expiry/TTL, and a derived freshness score. Stale knowledge decays trust on schedule. |
| Signature | A publisher's cryptographic signature over the manifest hash and package hash. |
| Install Endpoint / Install Ref | A reference describing how a runtime obtains and installs the package. |
| Verification Endpoint | The service endpoint that verifies manifest authenticity, package hash, signature, license validity, and freshness/score status. |
| Rating Event | A submitted rating or outcome-feedback record from a user, agent, runtime, or evaluator. |
| Journey Event | Evidence generated by usage of an object: the journey exhaust (accepted/rejected/abandoned/completed/verified-failed). The denominator. Maps to EvoSenz OutcomeEvents. |
| Compatibility Adapter | Declared metadata (and optional export artifact) describing which runtimes/standards an object is compatible with. |

5. Protocol Actors

| Actor | Role | v1 status |
| --- | --- | --- |
| Publisher | Authors, validates, signs, publishes manifests. | First-party only |
| Registry | Indexes manifests; serves discovery + trust + ratings endpoints. | SkillSenz.ai (single) |
| Runtime | Installs/executes referenced packages. | MySenz + reference CLI |
| Agent | Acts for a principal; discovers, evaluates trust, licenses, installs. | Supported |
| Principal | Human/org the agent serves; the license holder. | Supported |
| License Server | Issues and verifies license tokens / entitlements. | Co-located with registry in v1 |
| Verification Service | Verifies manifests, hashes, signatures, license, freshness, score. | Co-located with registry in v1 |
| Payment Provider | Processes one-time purchase. | External (Stripe-class), referenced |
| TrustSenz Evaluator | Computes/updates the Outcome Trust Metric from evidence. | First-party engine |
| Trust Authority | The party whose signature/policy defines what an evidence class and score method mean. | DeepSenz, in v1 |
| Rating Submitter | User / agent / runtime / evaluator submitting a Rating Event. | Submission allowed; only verified events affect score in v1 |
| Host / Package Provider | Stores the actual package bytes (DeepSenz host, GitHub release, object storage, enterprise endpoint, App Intent bundle, MCP server). | Pluggable |

Interaction sketch: Publisher signs a manifest and registers it with the Registry. An Agent (for a Principal) discovers the manifest via the Registry, asks the Verification Service to confirm authenticity, reads the Outcome Trust Metric from TrustSenz (served by the Registry), requests a License from the License Server (paying via the Payment Provider if priced), then asks the Host for the package via the install ref and hands it to a Runtime. Usage produces Journey Events; the Agent/Runtime/Principal may submit Rating Events; the TrustSenz Evaluator folds verified Journey Events into a new Trust Delta. The Trust Authority's keys/policy define what those evidence classes and the score version mean.


5A. OIP v0.2 Object Model — Execution Partner Architecture (forward-looking; non-breaking superset of v1.0)

Status of this section. v1.0 (the Phase-1 spec above) ships with the Outcome Manifest, Trust Manifest, and Trust Metric. This section folds in the Execution Partner Architecture as the durable object model OIP grows into — it is a superset, not a replacement. Nothing here changes a v1.0 envelope; every addition arrives as a registry value, a projection, or an extension profile, which is precisely the property that lets OIP evolve for 5–10 years without protocol-breaking changes. v1.0 implementers can ignore this section and remain conformant; v0.2 formalizes the identity/attribution layer once first-party commerce demands it.

5A.1 The realization

The durable, reputation-bearing anchor in an Outcome Intelligence ecosystem is a persistent professional identity — the Execution Partner — which accepts accountability for planning, orchestrating, and delivering an outcome on a principal's behalf. Everything beneath it (domain partner, outcome package, capability, model, runtime, instance) is replaceable implementation. The principal experiences one trusted relationship; the orchestration underneath is recorded as evidence, not surfaced as complexity. The General Contractor principle: a human hires the partner, the partner chooses (and replaces) the technology, the partner stays accountable, and professional reputation persists across technology churn.

Two units, reconciled (do not collapse them):

5A.2 Four object categories (the organizing principle)

The model is not a linear hierarchy of stages. It is four categories with opposite durability rules. Conflating them is the "pipeline-stage" failure mode this protocol exists to avoid.

| Category | Durability rule | Members |
| --- | --- | --- |
| Identity | Persistent; reputation-bearing; never deleted | Identity (any role) |
| Asset | Catalog; versioned; content-addressed | Outcome Manifest, Capability, Execution Configuration (as a referenced descriptor) |
| Evidence | Append-only system of record | Execution Record, Outcome Evaluation, Attribution, Delegation Edge |
| Projection | Recomputable; never authoritative | Reputation Graph, Delegation Graph, Trust Metric (derived view) |

5A.3 The eight core object types

1. Identity            { identity_id (did-style URI), role*, display_name, public_key,
                         identity_status, links{a2a_card, mcp_server, acp_merchant} }
                         * role ∈ registry-governed enum:
                           principal | execution_partner | domain_partner | delegate | …

2. OutcomeManifest     (the v1.0 Outcome Intelligence Object; publisher → Identity ref)

3. Capability          { capability_id, hash, kind: mcp_tool | skill_md | openapi, ref }
                         // the "Skill", by pointer — substance lives in MCP / SKILL.md

4. ExecutionConfiguration  { config_id (content hash),
                             dimensions{ planner, model, reasoning_model, agent_framework,
                                         runtime, device, platform, persona, tools[] } }
                         // EVIDENCE DIMENSION ONLY — has no reputation field, no score

5. ExecutionRecord     (append-only; references principal_ref, execution_partner_ref,
                         delegation_edges[], config_ref, manifest_ref, outcome_evaluation_ref)

6. OutcomeEvaluation   { verdict (registry enum), execution_quality, outcome_quality,
                         experience_quality[], evidence_class, confidence }

7. Attribution         { execution_record_ref,
                         subject: identity_ref | config_dimension_ref,
                         contribution_class (registry enum), confidence,
                         sample_context, method_version }   // conservative, evidence-led

8. TrustManifest/Metric (the v1.0 trust objects; in v0.2 keyable per-Identity AND per-Outcome)

Plus one shared evidence primitive that is not a standalone top-level object — it is recorded inside an Execution Record:

DelegationEdge   { role (registry enum), delegate_ref (Identity), parent_ref (Identity), at }
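
The ExecutionConfiguration listing above gives config_id as a content hash but does not say how the hash is computed. The following is a minimal Python sketch under stated assumptions: canonical JSON (sorted keys, compact separators, UTF-8) hashed with SHA-256 and written with a `sha256:` prefix, mirroring the content_hash style used in manifests. The function name `config_id` and the canonicalization rule are illustrative, not part of the specification.

```python
import hashlib
import json


def config_id(dimensions: dict) -> str:
    """Content-address a configuration by hashing its dimensions.

    Assumption (not from the spec): the canonical form is JSON with
    sorted keys and compact separators, encoded as UTF-8.
    """
    canonical = json.dumps(
        dimensions, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical dimension values, for illustration only.
first = {"model": "model-x", "runtime": "mysenz", "tools": ["calendar"]}
same_content = {"tools": ["calendar"], "runtime": "mysenz", "model": "model-x"}
changed = {"model": "model-y", "runtime": "mysenz", "tools": ["calendar"]}
```

Because the identifier depends only on content, two records that ran the same configuration share one config_id regardless of key order, while any changed dimension yields a different identifier.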

5A.4 Identity is modeled once (role-as-data)

There is one Identity object. The Execution Partner is Identity{role: execution_partner}; a Foundational Domain Partner ("Genie") is Identity{role: domain_partner}; the open-ended set of delegation roles (Planner, Strategist, Researcher, Reviewer, Negotiator, Verifier, Coach, Curator, Concierge, …) are Identity{role: delegate} carrying a registry-governed specialty. This makes the Execution Partner first-class (reputation and portable identity require it) while keeping every future role a registry addition, not a schema change. EPI is therefore first-class as a role of the unified Identity object — not a bespoke type, not an extension-only concept, not a reference-implementation detail.

Identities carry links to their counterparts in adjacent standards (A2A agent card, MCP server identity, ACP/UCP merchant identity) via a decentralized-identifier-style URI, so reputation is portable across registries and standards without any of those standards needing to know OIP exists.

5A.5 Structural rules (enforced by schema + tests, like the spine's forbidden-fields)

  1. Configuration carries no reputation. ExecutionConfiguration has no trust score and no reputation field. An Attribution may reference a config dimension as a covariate with confidence, but a config is never a reputation subject. (Prevents "GPT-6 has 4.8 stars" category errors.)
  2. Reputation lives in evidence, not in graphs. The durable truth is append-only Execution Records + Attribution edges keyed to Identities. The Reputation Graph and Delegation Graph are projections, rebuilt from evidence; they are never the system of record.
  3. Attribution is conservative. Single-evaluation attribution is flagged low-confidence and carries sample_size_warning; causal language is earned only by repeated evidence across many evaluations, using the same Wilson-lower-bound / Beta-posterior treatment as the Trust Metric (§8.5). Attribution weights are versioned and explainable (method_version) — never a black-box learned policy.
  4. Instances are disposable. A transient Execution Instance (e.g. Herman-501) is referenced by evidence but accrues no reputation; reputation belongs to the persistent Execution Partner identity.
  5. Delegation is a DAG, not a fixed tree. Delegation is recorded as typed DelegationEdges inside the Execution Record (the example chain Planner→partner, Research→partner, Verification→partner, … is a graph of role→identity assignments). The policy that produces delegation is runtime/extension; the edge is core evidence.
  6. Persona is presentation, not principal-modeling. persona is a configuration dimension describing how the partner presented (e.g. "Luxury Wellness Concierge"); it must never become inferred behavioral modeling of the principal.
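
Rules 1 and 5 above lend themselves to mechanical checks. The sketch below is one possible way to test them in Python, not the reference implementation: the forbidden-field list, the function names, and the (parent_ref, delegate_ref) tuple shape are all assumptions made for illustration.

```python
from collections import defaultdict, deque

# Hypothetical list of reputation-like keys a configuration must not carry.
FORBIDDEN_CONFIG_FIELDS = {"reputation", "trust_score", "overall_trust_score", "rating"}


def config_violations(config: dict) -> list:
    """Return dotted paths of forbidden reputation-like keys found in a config."""
    found = []
    stack = [("", config)]
    while stack:
        path, node = stack.pop()
        if isinstance(node, dict):
            for key, value in node.items():
                here = f"{path}.{key}" if path else key
                if key in FORBIDDEN_CONFIG_FIELDS:
                    found.append(here)
                stack.append((here, value))
        elif isinstance(node, list):
            for index, item in enumerate(node):
                stack.append((f"{path}[{index}]", item))
    return sorted(found)


def is_dag(edges: list) -> bool:
    """True when (parent_ref, delegate_ref) edges contain no cycle (Kahn's algorithm)."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, delegate in edges:
        children[parent].append(delegate)
        indegree[delegate] += 1
        nodes.update((parent, delegate))
    queue = deque(node for node in nodes if indegree[node] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Every node is visited exactly once only if no cycle exists.
    return visited == len(nodes)
```

A delegate reached through two parents is still a valid DAG under this check; only a cycle (for example, a partner delegating back to itself through a chain) is rejected.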

5A.6 The durability theorem

Over a 5–10 year horizon, new capability arrives without touching the eight envelopes:

| New thing | Arrives as |
| --- | --- |
| New roles (Strategist, Curator, …) | registry role / specialty value |
| New verdicts / contribution classes / evidence classes | registry enum value |
| New model / runtime / device / agent framework | data inside ExecutionConfiguration |
| New interoperability standard (next A2A/MCP) | an Identity.links field + a compatibility binding |
| New DeepSenz orchestration intelligence | a new private extension-profile revision |

This is the concrete mechanism behind v1.0 Design Goal 13 (future extensibility): growth by data, projection, and extension profile — never by envelope mutation.

5A.7 Standards binding (complement, never replace)

| Standard | Owns | OIP v0.2 binding |
| --- | --- | --- |
| A2A | agent communication, agent cards, task delegation | Identity ↔ A2A Agent Card; DelegationEdge ↔ A2A task delegation |
| MCP | tool invocation | Capability{kind: mcp_tool}; Config records which MCP servers ran |
| SKILL.md | skill packaging | Capability{kind: skill_md}; CAP exports to SKILL.md |
| ACP / UCP | payment / settlement | Execution Partner is the merchant-of-record; OIP licensing rides ACP/UCP rails (no reinvented settlement) |
| Apple App Intents | native execution surface | a CAP package_ref may be an App Intent bundle; appears as platform in Config |
| Future runtimes | whatever comes next | confined to ExecutionInstance + ExecutionConfiguration (the replaceable zone) |

5A.8 Core vs. private extension-profile boundary

Reference-implementation extension-profile details are intentionally out of scope for this public preview. Core OIP remains discoverable, verifiable, licensable, and attributable without reading private extension-profile contents.

5A.9 Open questions carried by v0.2


6. OIP Lifecycle

| # | Step | v1.0 | Notes |
| --- | --- | --- | --- |
| 1 | Author | ✅ Required | Write Outcome Manifest + Trust Manifest. |
| 2 | Validate | ✅ Required | Schema-validate; enforce null↔unknown pairing; refuse on drift. |
| 3 | Sign | ✅ Required | Sign manifest hash + package hash. |
| 4 | Publish | ✅ Required | Register manifest with the Registry. |
| 5 | Discover | ✅ Required | /v1/outcomes, .well-known/oip.json, llms.txt, MCP, A2A card. |
| 6 | Evaluate trust | ✅ Required | /v1/outcomes/{id}/trust; band + confidence + evidence. |
| 7 | License | ✅ Required | free / one-time purchase. |
| 8 | Install | ✅ Required | Resolve install ref → package. |
| 9 | Verify | ✅ Required | Manifest/package/signature/license/freshness/score checks. |
| 10 | Execute in runtime | ⛔ Out of scope (runtime owns semantics) | OIP hands off; runtime executes. |
| 11 | Capture Journey Event | ◐ Partial | v1 captures first-party journeys via the Travel Genie capture stack; ingestion endpoint is internal. |
| 12 | Submit Rating / Outcome Feedback | ◐ Partial | Endpoint exists; in v1 only verified events move the score. |
| 13 | Update Trust Metric | ✅ Required | TrustSenz recomputes; append-only delta. |
| 14 | Update | ✅ Required | semver bump → new manifest version. |
| 15 | Refresh trust/freshness | ✅ Required | Freshness TTL drives freshness_score; expiry flips lifecycle. |
| 16 | Deprecate | ✅ Required | lifecycle_status: deprecated, with supersession ref. |
| 17 | Retire | ✅ Required | lifecycle_status: retired; manifest remains resolvable for audit. |

Future extensions: community ratings affecting score (Phase 2), third-party Journey Event ingestion (Phase 2), webhooks for update/trust notifications (Phase 2), cross-registry trust arbitration (Phase 3).


7. Outcome Manifest v1 Schema

Conventions inherited from the spine: JSON, additionalProperties: false, draft 2020-12, registry-bound identifiers, unknowns expressed as null with an unknowns[] entry, never as guessed defaults.
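
The null-with-unknowns convention can be checked mechanically. The sketch below takes a deliberately strict reading, assuming every null found in nested objects must have a matching top-level unknowns[] entry whose reason comes from the closed set; the function names are hypothetical. It does not handle nested unknowns arrays or fields where null is a legitimate value rather than null-by-absence, which a real validator would need to distinguish.

```python
ALLOWED_REASONS = {"not_present_in_source", "no_evidence_yet", "not_applicable_v1"}


def null_paths(node: dict, prefix: str = ""):
    """Yield dotted paths of every null value inside nested dicts."""
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if value is None:
            yield path
        elif isinstance(value, dict):
            yield from null_paths(value, path)


def check_unknowns(manifest: dict) -> list:
    """Return a list of problems; an empty list means the pairing holds."""
    problems = []
    declared = set()
    for entry in manifest.get("unknowns", []):
        if entry.get("reason") not in ALLOWED_REASONS:
            problems.append(f"bad reason for {entry.get('field')}: {entry.get('reason')}")
        declared.add(entry.get("field"))
    body = {key: value for key, value in manifest.items() if key != "unknowns"}
    for path in null_paths(body):
        if path not in declared:
            problems.append(f"null without unknowns entry: {path}")
    return problems


# Hypothetical fragments, for illustration only.
paired = {
    "trust": {"community_score": None},
    "unknowns": [{"field": "trust.community_score", "reason": "no_evidence_yet"}],
}
unpaired = {"trust": {"community_score": None}, "unknowns": []}
```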

7.1 Field summary

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| schema_version | string | | "oip.outcome-manifest.v1" |
| outcome_id | string | | Stable identity of the outcome intelligence across versions. Namespace oi:. |
| manifest_id | string | | Identity of this version's manifest = sha256_16(outcome_id + version + content_digest). |
| display_name | string | | Human label. |
| short_description | string | | ≤160 chars; what outcome it drives. |
| long_description | string | | Full description; may be markdown. |
| domain | string (registry-bound) | | e.g. relocation, travel, health. |
| outcome_type | string (registry-bound) | | e.g. guided_relocation, focus_session. |
| publisher | object | | identity + signature pointer (see Trust Manifest). |
| version | string (semver) | | e.g. 1.0.0. |
| lifecycle_status | enum | | One of draft, active, deprecated, expired, retired. |
| package_refs[] | array | | One or more package descriptors (host-neutral). |
| install_refs[] | array | | How a runtime installs each package. |
| compatibility | object | | Declared runtime/standard compatibility (§12). |
| trust | object | | Embedded Outcome Trust Metric summary + pointer to full Trust Manifest (§8/§9). |
| provenance | object | | Source pointers + digests (§9). |
| freshness | object | | freshness_date, expires_date, freshness_score. |
| license | object | | Licensing terms (§14). |
| pricing | object | | model (free or one_time), amount, currency. |
| security | object | | requested scopes, sandbox hints, signature algorithm. |
| dependencies[] | array | ✅ (may be empty) | Other outcome_ids / runtime requirements. |
| update_policy | object | | semver policy, breaking-change rule, supersedes/superseded_by. |
| created_at | string (RFC3339) | | |
| updated_at | string (RFC3339) | | |
| links | object | | machine + human URLs (manifest JSON, trust JSON, docs page). |
| unknowns[] | array | ✅ (may be empty) | {field, reason} for every null-by-absence field. Reasons from closed set: not_present_in_source, no_evidence_yet, not_applicable_v1. |
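
The manifest_id row gives the derivation as sha256_16(outcome_id + version + content_digest). One way to read that in Python is sketched below. Several details are assumptions rather than stated rules: plain string concatenation with no separator, UTF-8 encoding, truncation to the first 16 hex characters, and the `mf_` prefix (inferred from the example identifier in §7.2). The digest used here is a placeholder, so the result is not expected to match any published manifest_id.

```python
import hashlib


def manifest_id(outcome_id: str, version: str, content_digest: str) -> str:
    """Derive a short, stable identifier for one manifest version.

    Assumptions (not confirmed by the spec): inputs are concatenated
    as-is, hashed as UTF-8, and the first 16 hex characters are kept.
    """
    joined = outcome_id + version + content_digest
    return "mf_" + hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]


# Placeholder digest for illustration; real digests come from the package bytes.
PLACEHOLDER_DIGEST = "sha256:" + "0" * 64
v1 = manifest_id("oi:travel_genie.danang_relocation", "1.0.0", PLACEHOLDER_DIGEST)
v2 = manifest_id("oi:travel_genie.danang_relocation", "1.0.1", PLACEHOLDER_DIGEST)
```

Under this reading the outcome_id stays fixed across versions while manifest_id changes whenever the version or the content digest changes.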

7.2 Complete example — Travel Genie Da Nang Relocation Outcome Intelligence

{
  "schema_version": "oip.outcome-manifest.v1",
  "outcome_id": "oi:travel_genie.danang_relocation",
  "manifest_id": "mf_9f2a1c7b4e0d6a3c",
  "display_name": "Travel Genie — Da Nang Relocation",
  "short_description": "Guides a remote worker through relocating to Da Nang: visa runway, neighborhood fit, banking, SIM, and a verified first-week setup.",
  "long_description": "An Outcome Intelligence object that models the principal's current state (foreign resident, no local accounts, visa clock running) versus the desired state (legally present, banked, connected, housed) and guides them gap-by-gap. Built from one verified first-party journey plus first-party situational research. Treats each precondition (visa runway, proof-of-address for banking, eSIM vs local SIM) as a checklist item with verified-failure handling.",
  "domain": "relocation",
  "outcome_type": "guided_relocation",
  "publisher": {
    "publisher_id": "pub:deepsenz",
    "display_name": "DeepSenz",
    "identity_status": "first_party_verified",
    "signature_ref": "tm_danang_reloc_v1#publisher_signature"
  },
  "version": "1.0.0",
  "lifecycle_status": "active",
  "package_refs": [
    {
      "package_id": "pkg:cap.travel_genie.danang_relocation.v1",
      "package_type": "cap",
      "format": "deepsenz.experience-capsule.v2",
      "content_hash": "sha256:6b1f…e90a",
      "host": "deepsenz",
      "uri": "https://packages.skillsenz.ai/cap/travel_genie/danang_relocation/1.0.0.capsule",
      "size_bytes": 184320
    }
  ],
  "install_refs": [
    {
      "for_package": "pkg:cap.travel_genie.danang_relocation.v1",
      "runtime": "mysenz",
      "method": "capsule_install",
      "endpoint": "https://api.skillsenz.ai/v1/outcomes/oi:travel_genie.danang_relocation/package",
      "requires_license": true
    }
  ],
  "compatibility": {
    "runtimes": ["mysenz"],
    "standards": {
      "mcp": { "supported": true, "tool_manifest_ref": "links.mcp" },
      "a2a": { "supported": true, "agent_card_ref": "links.a2a" },
      "apple_app_intents": { "supported": false, "reason": "not_applicable_v1" },
      "skill_md": { "supported": true, "export_ref": "links.skill_export" },
      "openapi": { "supported": true, "spec_ref": "links.openapi" }
    }
  },
  "trust": {
    "trust_manifest_ref": "https://api.skillsenz.ai/v1/outcomes/oi:travel_genie.danang_relocation/trust",
    "overall_trust_score": 72,
    "trust_band": "Promising",
    "confidence_score": 76,
    "sample_size_warning": true,
    "score_version": "trustsenz.score.v1",
    "last_trust_update": "2026-06-09T12:00:00Z"
  },
  "provenance": {
    "source_refs": [
      { "kind": "journey_evidence", "ref": "execution_record:er_danang_focus_run1", "content_digest": "sha256:a31c…77be" },
      { "kind": "first_party_research", "ref": "doc:danang_relocation_research_v1", "content_digest": "sha256:0c44…11ad" }
    ],
    "derived_from_outcome_id": null,
    "unknowns": [
      { "field": "provenance.third_party_corroboration", "reason": "no_evidence_yet" }
    ]
  },
  "freshness": {
    "freshness_date": "2026-06-08",
    "expires_date": "2026-09-08",
    "freshness_score": 0.74,
    "recency_sensitive_fields": ["visa_rules", "bank_account_requirements", "sim_providers"]
  },
  "license": {
    "license_types": ["free_preview", "one_time"],
    "default_license_type": "one_time",
    "verification_endpoint": "https://api.skillsenz.ai/v1/licenses/{token}/verify"
  },
  "pricing": {
    "model": "one_time",
    "amount": 19.0,
    "currency": "USD",
    "free_preview": true
  },
  "security": {
    "requested_scopes": ["location.geofence", "calendar.read"],
    "sandbox": "runtime_recommended",
    "signature_algorithm": "ed25519",
    "least_privilege": true
  },
  "dependencies": [],
  "update_policy": {
    "semver": true,
    "breaking_change_rule": "major_bump_required_for_outcome_or_evidence_change",
    "supersedes": null,
    "superseded_by": null,
    "freshness_refresh_required_by": "2026-09-08"
  },
  "created_at": "2026-06-09T12:00:00Z",
  "updated_at": "2026-06-09T12:00:00Z",
  "links": {
    "manifest_json": "https://api.skillsenz.ai/v1/outcomes/oi:travel_genie.danang_relocation",
    "trust_json": "https://api.skillsenz.ai/v1/outcomes/oi:travel_genie.danang_relocation/trust",
    "ratings_json": "https://api.skillsenz.ai/v1/outcomes/oi:travel_genie.danang_relocation/ratings",
    "human_page": "https://skillsenz.ai/o/travel-genie-danang-relocation",
    "docs": "https://skillsenz.ai/docs/oip",
    "mcp": "https://api.skillsenz.ai/mcp",
    "a2a": "https://skillsenz.ai/.well-known/a2a/agent-card.json",
    "openapi": "https://api.skillsenz.ai/openapi.json",
    "skill_export": "https://packages.skillsenz.ai/skill/travel_genie_danang_relocation/SKILL.md"
  },
  "unknowns": []
}

The trust object embedded in the manifest is a summary; the full evidence lives in the Trust Manifest (§9), reachable by trust_manifest_ref. The website must never render the embedded summary without resolving the full object (§19).
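
Each entry in package_refs carries a content_hash of the form `sha256:<hex>`. The sketch below shows how a runtime might compare downloaded package bytes against that field before installing. It is a hedged illustration: the function name is hypothetical, only SHA-256 is handled, and signature and license checks are out of scope here. An elided digest such as the one printed in the example is rejected rather than guessed.

```python
import hashlib
import hmac
import string


def verify_content_hash(package_bytes: bytes, content_hash: str) -> bool:
    """Return True only when the bytes hash to the declared sha256 digest."""
    algorithm, _, expected = content_hash.partition(":")
    expected = expected.lower()
    # A usable digest is exactly 64 hex characters; anything else fails closed.
    is_full_hex = len(expected) == 64 and all(c in string.hexdigits for c in expected)
    if algorithm != "sha256" or not is_full_hex:
        return False
    actual = hashlib.sha256(package_bytes).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(actual, expected)


# Hypothetical package bytes, for illustration only.
package = b"example capsule bytes"
declared = "sha256:" + hashlib.sha256(package).hexdigest()
```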


8. TrustSenz / Outcome Trust Metric v1 (mandatory section)

8.1 The core principle

TrustSenz does not count stars. It accumulates evidence.

Every overall_trust_score must expose the evidence beneath it. A score is a compression of evidence for ranking convenience, never a substitute for it. Ratings may influence trust — but only after a minimum journey threshold, and never more than verified outcome evidence.

8.2 Required fields

overall_trust_score        0–100 integer, derived (never hand-assigned)
trust_band                 enum (see 8.3)
confidence_score           0–100 integer; how much the score can be trusted given sample size & evidence quality
sample_size_warning        bool; true when journey_count < min_journey_threshold (default 20)
journey_count              int; total journeys (the denominator — includes negatives)
successful_journey_count   int; verified closed/partially_closed journeys
outcome_success_rate       float | null; successful/total, null if journey_count == 0
latest_trust_delta         float | null; most recent change
last_trust_update          RFC3339 | null
rating_count               int
user_rating_average        float | null
agent_rating_average       float | null
combined_rating_average    float | null
evidence_score             float 0–1 | null; weighted evidence-class strength of journeys
freshness_score            float 0–1 | null; from freshness TTL
identity_score             float 0–1; publisher identity strength
provenance_score           float 0–1 | null; source-chain completeness
correctness_score          float 0–1 | null; verified-correctness of claims (maps 5C correctness)
conversion_score           float 0–1 | null; verdict-statistics conversion (maps 5C conversion)
citation_score             float 0–1 | null; evidence-class-weighted citation density (maps 5C citations)
community_score            float | null; community signal — null in v1 (no community)
limitations[]              array of strings; honest known limits
scoring_explanation        string; human-readable basis
score_inputs               object; the actual numbers that fed the score
score_version              string; e.g. "trustsenz.score.v1"
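
Two of the fields above are fully determined by the journey counts: outcome_success_rate is null when there are no journeys, and sample_size_warning is true below the threshold. A minimal Python sketch of that derivation follows; the function name and the returned dict shape are illustrative assumptions, and the threshold of 20 is the default stated in the field list.

```python
MIN_JOURNEY_THRESHOLD = 20  # default from the field list; pinned by score_version


def derive_counts(journey_count: int, successful_journey_count: int) -> dict:
    """Derive the count-based trust fields from raw journey counts."""
    if not 0 <= successful_journey_count <= journey_count:
        raise ValueError("successful_journey_count must be between 0 and journey_count")
    # Null, not zero: with no journeys there is no rate to report.
    rate = successful_journey_count / journey_count if journey_count else None
    return {
        "journey_count": journey_count,
        "successful_journey_count": successful_journey_count,
        "outcome_success_rate": rate,
        "sample_size_warning": journey_count < MIN_JOURNEY_THRESHOLD,
    }
```

For the seed object (one journey, one success) this yields a raw rate of 1.0 with the warning set, which is why the raw rate is never used on its own for ranking.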

8.3 Trust bands

| Range | Band | Meaning |
| --- | --- | --- |
| 0–39 | Unverified | No or near-no verified evidence. |
| 40–59 | Experimental | Some evidence; high uncertainty; small sample. |
| 60–74 | Promising | Real verified evidence, positive early signal, sample still small. |
| 75–84 | Validated | Adequate sample, consistent verified outcomes. |
| 85–94 | Trusted | Strong, repeated, evidence-classed outcomes. |
| 95–100 | Proven at Scale | Large sample, durable, fresh, third-party-corroborated. |

A band is never shown without the confidence_score and sample_size_warning beside it.
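
The band ranges translate directly into a lookup, and the display rule can be enforced at render time. The Python sketch below is one possible shape for that, with hypothetical function names and a hypothetical label format; only the ranges and band names come from the table.

```python
BANDS = [
    (0, 39, "Unverified"),
    (40, 59, "Experimental"),
    (60, 74, "Promising"),
    (75, 84, "Validated"),
    (85, 94, "Trusted"),
    (95, 100, "Proven at Scale"),
]


def trust_band(score: int) -> str:
    """Map a 0-100 integer score to its band name."""
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        raise ValueError("overall_trust_score must be an integer from 0 to 100")
    for low, high, name in BANDS:
        if low <= score <= high:
            return name
    raise AssertionError("band ranges must cover 0 to 100")


def band_label(trust: dict) -> str:
    """Render a band only together with its confidence and sample-size flag."""
    for key in ("overall_trust_score", "confidence_score", "sample_size_warning"):
        if trust.get(key) is None:
            raise ValueError(f"refusing to render band without {key}")
    band = trust_band(trust["overall_trust_score"])
    note = "small sample" if trust["sample_size_warning"] else "adequate sample"
    return f"{band} (confidence {trust['confidence_score']}, {note})"
```

Raising instead of rendering a partial label keeps a renderer from showing a bare band when the companion fields are missing.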

8.4 Score principles (binding)

8.5 Statistical treatment (lightweight, not a data-science platform)

Three honest mechanisms, no more:

  1. Wilson score lower bound for outcome_success_rate → drives conversion_score. With 1 success of 1 journey, the Wilson 95% lower bound is ≈0.21, not 1.0. This is the single most important anti-inflation device: it converts "100% of one" into "we can defend ~0.21 so far."
  2. Beta(α₀, β₀) prior on success, with a weak prior of α₀=β₀=1 (uniform). Posterior mean = (successes+1)/(journeys+2). One-of-one → 0.67, not 1.0. Used for the displayed outcome_success_rate companion and the conversion component.
  3. Confidence from sample size + evidence class. confidence_score = f(journey_count, evidence_score), monotonic in both, capped below 80 while sample_size_warning is true. Self-report-only evidence caps confidence lower than sensor/third-party evidence (evidence-class ladder from the capture review: third_party > sensor > receipt > media > self_report).

min_journey_threshold for clearing sample_size_warning: 20 (configurable, pinned by score_version). Below threshold, ratings carry zero score weight.
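
The first two mechanisms are standard formulas and can be reproduced in a few lines. The Python sketch below computes the Wilson score lower bound (with z = 1.96 for a 95% interval) and the posterior mean under a uniform Beta(1, 1) prior. The function names are illustrative; how these values are combined into conversion_score is not specified here and is not attempted.

```python
import math
from typing import Optional


def wilson_lower_bound(successes: int, journeys: int, z: float = 1.96) -> Optional[float]:
    """Lower bound of the Wilson score interval; None when there are no journeys."""
    if journeys == 0:
        return None
    p = successes / journeys
    z2 = z * z
    centre = p + z2 / (2 * journeys)
    margin = z * math.sqrt(p * (1 - p) / journeys + z2 / (4 * journeys**2))
    return (centre - margin) / (1 + z2 / journeys)


def beta_posterior_mean(successes: int, journeys: int,
                        alpha0: float = 1.0, beta0: float = 1.0) -> float:
    """Posterior mean of the success rate under a Beta(alpha0, beta0) prior."""
    return (successes + alpha0) / (journeys + alpha0 + beta0)


one_of_one = wilson_lower_bound(1, 1)        # about 0.21
ninety_of_hundred = wilson_lower_bound(90, 100)
```

With these bounds, an object with 90 successes in 100 journeys ranks above a one-journey object with a perfect raw rate, which is the anti-inflation behavior the list describes.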

This subsection describes reference-implementation detail and is redacted from the public preview. The trust model's bands, confidence, evidence components, and statistical treatment are documented above; exact component weights are not disclosed here.

8.7 Example trust object (the seed)

{
  "schema_version": "oip.trust-metric.v1",
  "outcome_id": "oi:travel_genie.danang_relocation",
  "score_version": "trustsenz.score.v1",
  "overall_trust_score": 72,
  "trust_band": "Promising",
  "confidence_score": 76,
  "sample_size_warning": true,
  "journey_count": 1,
  "successful_journey_count": 1,
  "outcome_success_rate": 1.0,
  "outcome_success_rate_wilson_lb": 0.21,
  "outcome_success_rate_beta_mean": 0.67,
  "latest_trust_delta": 0.18,
  "last_trust_update": "2026-06-09T12:00:00Z",
  "rating_count": 0,
  "user_rating_average": null,
  "agent_rating_average": null,
  "combined_rating_average": null,
  "evidence_components_5c": {
    "completeness": 0.86,
    "context": 0.82,
    "citations": 0.64,
    "correctness": 0.74,
    "conversion": 0.88,
    "overall_5c": 0.79
  },
  "evidence_score": 0.79,
  "freshness_score": 0.74,
  "freshness_date": "2026-06-08",
  "expires_date": "2026-09-08",
  "identity_score": 0.90,
  "provenance_score": 0.70,
  "correctness_score": 0.74,
  "conversion_score": 0.55,
  "citation_score": 0.64,
  "community_score": null,
  "limitations": [
    "Single verified journey (n=1); success rate is not yet statistically meaningful.",
    "Evidence is primarily first-party self-report plus sensor timestamps; no third-party corroboration yet.",
    "Situational knowledge (visa, banking, SIM) expires 2026-09-08 and must be refreshed.",
    "No community or independent ratings yet (rating_count = 0)."
  ],
  "scoring_explanation": "Score 72 (Promising) is driven by strong first-party evidence quality (5C overall 0.79) and good freshness, deliberately held down by a Wilson-bounded conversion component (0.55, from 1-of-1 journeys) and a sample-size confidence cap. Ratings contribute 0 (below n=20 threshold). community_score is null and excluded, not zeroed.",
  "score_inputs": {
    "weights": "trustsenz.score.v1",
    "components_used": ["conversion_score","evidence_score","correctness_score","provenance_score","freshness_score","citation_score","identity_score"],
    "components_null_dropped": ["combined_rating_average","community_score"],
    "journey_refs": ["execution_record:er_danang_focus_run1"]
  }
}

Why this CAP is Promising but not Trusted at scale: it has real verified evidence of high quality (5C 0.79, fresh, first-party-identity-strong), which lifts it well above Unverified/Experimental, but it has exactly one journey. The Wilson lower bound (0.21) drags the conversion component to 0.55, the sample_size_warning holds confidence under the 80 cap (76 here), and ratings contribute nothing. It is honestly good early evidence, explicitly not yet proven. A second journey that fails could move it down; that is the system working, not a bug.
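The "null and excluded, not zeroed" rule in the example's scoring_explanation is easy to get wrong, so a small sketch may help. It assumes a plain weighted mean with equal placeholder weights. The real trustsenz.score.v1 weights are not disclosed and the real formula may not be a weighted mean at all. The function name is hypothetical.

```python
from typing import Mapping, Optional, Tuple, List


def combine_components(components: Mapping[str, Optional[float]],
                       weights: Mapping[str, float]) -> Tuple[Optional[float], List[str]]:
    """Weighted mean over non-null components; null components are dropped, not zeroed."""
    dropped = [name for name, value in components.items() if value is None]
    used = {name: value for name, value in components.items()
            if value is not None and name in weights}
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        return None, dropped  # nothing to score: report unknown rather than 0
    score = sum(weights[name] * value for name, value in used.items()) / total_weight
    return score, dropped


# Placeholder weights, NOT the undisclosed trustsenz.score.v1 weights.
placeholder_weights = {"evidence_score": 1.0, "freshness_score": 1.0, "community_score": 1.0}
example = {"evidence_score": 0.79, "freshness_score": 0.74, "community_score": None}

score, dropped = combine_components(example, placeholder_weights)
print(round(score, 3), dropped)  # 0.765 ['community_score']
```

Had the null been treated as zero, the same inputs would average to 0.51, which is exactly the silent penalty the rule forbids.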


9. Trust Manifest v1

The Trust Manifest is the trust + provenance layer, kept separate from marketing copy so trust can be verified independently of the sales pitch. It is served at /v1/outcomes/{id}/trust.

{
  "schema_version": "oip.trust-manifest.v1",
  "outcome_id": "oi:travel_genie.danang_relocation",
  "manifest_id": "mf_9f2a1c7b4e0d6a3c",
  "publisher_identity": {
    "publisher_id": "pub:deepsenz",
    "identity_status": "first_party_verified",
    "verification_method": "internal_key_custody_v1"
  },
  "signatures": {
    "signature_algorithm": "ed25519",
    "manifest_hash": "sha256:1d4e…aa90",
    "signed_package_hash": "sha256:6b1f…e90a",
    "publisher_signature": "ed25519:9af2…",
    "verification_date": "2026-06-09T12:00:00Z"
  },
  "provenance": {
    "source_provenance": [
      { "kind": "journey_evidence", "ref": "execution_record:er_danang_focus_run1", "content_digest": "sha256:a31c…77be", "evidence_class": "self_report+sensor", "contract_status": "unverified" },
      { "kind": "first_party_research", "ref": "doc:danang_relocation_research_v1", "content_digest": "sha256:0c44…11ad" }
    ],
    "citations": [
      { "claim": "30-day visa-exempt entry then extension path", "source_ref": "doc:danang_relocation_research_v1#visa", "evidence_class": "document", "freshness_ttl": "2026-09-08" }
    ]
  },
  "freshness": {
    "freshness_date": "2026-06-08",
    "expires_date": "2026-09-08",
    "freshness_score": 0.74,
    "recency_sensitive_fields": ["visa_rules", "bank_account_requirements", "sim_providers"]
  },
  "disclaimers": [
    "Not legal, financial, or immigration advice. Visa and banking rules change; verify with official sources before acting.",
    "Single-journey evidence; outcomes may differ for your situation."
  ],
  "verification_status": "signed_and_indexed",
  "known_limitations": [
    "n=1 verified journey", "self-report-dominant evidence", "first-party only"
  ],
  "trust_score": { "$ref": "oip.trust-metric.v1 object from §8.7" },
  "score_history": [
    { "score_version": "trustsenz.score.v1", "overall_trust_score": 72, "at": "2026-06-09T12:00:00Z" }
  ],
  "trust_delta_history": [
    { "delta": 0.18, "cause": "first_verified_journey", "at": "2026-06-09T12:00:00Z", "review_flag": false }
  ],
  "rating_summary": {
    "rating_count": 0,
    "user_rating_average": null,
    "agent_rating_average": null,
    "combined_rating_average": null,
    "low_confidence": true,
    "score_weight_active": false
  },
  "journey_evidence_summary": {
    "journey_count": 1,
    "successful_journey_count": 1,
    "verdict_breakdown": { "closed": 1, "partially_closed": 0, "not_closed": 0, "abandoned": 0, "rejected": 0, "ignored": 0, "unknown": 0 },
    "evidence_class_breakdown": { "third_party": 0, "sensor": 1, "receipt": 0, "media": 0, "self_report": 1 }
  }
}

Honesty rules (binding): if there is no population data, rating_average, success_rate-derived community fields, and review_count are null or explicitly low_confidence: true, never fabricated. trust_delta_history is append-only; a delta that contradicts prior evidence is review_flag: true, not averaged away.
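One way to picture the append-only rule is an immutable history that only ever grows. The sketch below is an illustration under a loud assumption: the spec does not define "contradicts prior evidence", so this example treats a delta as contradicting when its sign opposes the net of all earlier deltas. The class and function names, and the cause string "failed_second_journey", are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrustDelta:
    delta: float
    cause: str
    at: str
    review_flag: bool = False


def append_delta(history: Tuple[TrustDelta, ...], delta: float,
                 cause: str, at: str) -> Tuple[TrustDelta, ...]:
    """Return a new, longer history; earlier entries are never edited or removed."""
    net_so_far = sum(entry.delta for entry in history)
    # Assumption: "contradicts" means opposite sign to the net of prior deltas.
    contradicts = bool(history) and net_so_far * delta < 0
    return history + (TrustDelta(delta, cause, at, review_flag=contradicts),)


h1 = append_delta((), 0.18, "first_verified_journey", "2026-06-09T12:00:00Z")
h2 = append_delta(h1, -0.25, "failed_second_journey", "2026-07-01T09:00:00Z")
print([entry.review_flag for entry in h2])  # [False, True]
```

The contradicting delta is kept and flagged for review; the first entry is untouched.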


10. Rating and Outcome Feedback Protocol

A Rating Event is submitted to POST /v1/outcomes/{id}/ratings.

{
  "schema_version": "oip.rating-event.v1",
  "rating_event_id": "re_…",
  "outcome_id": "oi:travel_genie.danang_relocation",
  "license_id": "lic_… | null",
  "journey_id": "jr_… | null",
  "submitter_type": "user | agent | runtime | evaluator",
  "rating_score": 4,
  "outcome_success": "closed | partially_closed | not_closed | unknown | null",
  "reuse_intent": "yes | no | null",
  "recommendation_intent": "yes | no | null",
  "qualitative_feedback": "string | null",
  "failure_reason": "registry-bound code | null",
  "verified_execution": false,
  "anti_gaming": {
    "license_bound": true,
    "journey_bound": false,
    "rate_limited": true,
    "duplicate_window_checked": true
  },
  "timestamp": "2026-06-20T08:30:00Z",
  "privacy": { "scope": "personal", "pii_present": false }
}

Five distinct evidence tiers — kept separate, never merged into one star average:

| Tier | Affects v1 score? | Why |
| --- | --- | --- |
| Verified outcome event (verified_execution: true, journey-bound) | ✅ Yes (primary) | Closest thing to a controlled result. |
| Runtime execution report | ✅ Yes (as Journey Event) | Machine-observed usage. |
| Agent rating | ❌ Not in v1 (displayed, gated) | Influences only past n=20; weighted separately from users (§22). |
| User rating | ❌ Not in v1 (displayed, gated) | Same gate. |
| Unverified review | ❌ Never scores | Display only, clearly labeled unverified. |

Anti-gaming in v1: ratings must be license-bound; a verified outcome event must be journey-bound to a real ExecutionRecord; rate-limit + duplicate-window per principal; verified-execution events outweigh everything; unverified reviews are quarantined from the score entirely.
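The tiering and anti-gaming rules above can be read as a small classifier over a Rating Event. The sketch below is one possible reading, not the reference implementation. The tier names, the function name, and the mapping from submitter_type to tier are assumptions. In particular, it treats any event that is not license-bound as an unverified review and leaves the evaluator submitter type unclassified, because the tiers do not place it.

```python
from typing import Any, Mapping


def classify_rating_event(event: Mapping[str, Any]) -> dict:
    """Map a Rating Event to an evidence tier and whether it can touch the v1 score."""
    guard = event.get("anti_gaming") or {}
    license_bound = bool(guard.get("license_bound")) and event.get("license_id") is not None
    journey_bound = bool(guard.get("journey_bound")) and event.get("journey_id") is not None
    submitter = event.get("submitter_type")

    if not license_bound:
        tier, scores = "unverified_review", False        # quarantined from the score
    elif event.get("verified_execution") is True and journey_bound:
        tier, scores = "verified_outcome_event", True    # primary evidence
    elif submitter == "runtime" and journey_bound:
        tier, scores = "runtime_execution_report", True  # counted as a Journey Event
    elif submitter == "agent":
        tier, scores = "agent_rating", False             # displayed, gated in v1
    elif submitter == "user":
        tier, scores = "user_rating", False              # displayed, gated in v1
    else:
        tier, scores = "unclassified", False             # assumption: e.g. evaluator
    return {"tier": tier, "affects_v1_score": scores}


sample_event = {
    "submitter_type": "user",
    "license_id": "lic_example",   # hypothetical id
    "journey_id": None,
    "verified_execution": False,
    "anti_gaming": {"license_bound": True, "journey_bound": False},
}
print(classify_rating_event(sample_event))
# {'tier': 'user_rating', 'affects_v1_score': False}
```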

Privacy: ratings default scope: personal; PII is flagged; nothing is published in v1 (no public review surface exists).


11. Package References and Distribution

OIP indexes manifests; it does not require centralized package hosting. A package_ref is host-neutral:

{
  "package_id": "pkg:…",
  "package_type": "cap | skill | mcp_tool | app_intent_bundle",
  "format": "deepsenz.experience-capsule.v2 | skill.md | openapi | …",
  "content_hash": "sha256:…",
  "host": "deepsenz | github_release | object_storage | enterprise | apple_app_intent | mcp_server",
  "uri": "https://…",
  "size_bytes": 0,
  "access": "license_required | public"
}

Supported hosts in the schema: DeepSenz-hosted package, GitHub release, object storage (S3-class), private enterprise endpoint, Apple App Intent bundle reference, MCP tool-server reference, and a reserved slot for future distributed storage. SkillSenz.ai may host packages (and does for the v1 first-party object), but that is an implementation choice, not a protocol requirement. The content_hash is what binds the manifest to the bytes regardless of host.
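Because content_hash is what binds a manifest to its bytes, a consumer can check it without trusting the host. A minimal sketch, assuming the "sha256:" prefix followed by a lowercase or uppercase hex digest as in the examples in this document; the function names are hypothetical.

```python
import hashlib


def content_hash_of(data: bytes) -> str:
    """Compute a content_hash string in the 'sha256:<hex>' form used in this spec's examples."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def matches_content_hash(data: bytes, expected: str) -> bool:
    """True only if the bytes hash to the declared value; unknown algorithms never match."""
    algorithm, _, digest = expected.partition(":")
    if algorithm != "sha256" or not digest:
        return False
    return hashlib.sha256(data).hexdigest() == digest.lower()


package_bytes = b"example package bytes"  # stand-in for bytes fetched from any host
declared = content_hash_of(package_bytes)
print(matches_content_hash(package_bytes, declared))        # True
print(matches_content_hash(package_bytes + b"!", declared))  # False
```

The same comparison works whether the bytes came from a DeepSenz-hosted package, a GitHub release, or object storage, which is the point of keeping package_ref host-neutral.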


12. Compatibility Layer

An object declares compatibility as metadata; it does not assume one runtime wins.

"compatibility": {
  "runtimes": ["mysenz", "local_agent"],
  "standards": {
    "mcp":               { "supported": true,  "tool_manifest_ref": "…" },
    "a2a":               { "supported": true,  "agent_card_ref": "…" },
    "apple_app_intents": { "supported": false, "reason": "not_applicable_v1" },
    "skill_md":          { "supported": true,  "export_ref": "…SKILL.md" },
    "openapi":           { "supported": true,  "spec_ref": "…openapi.json" }
  },
  "compatibility_adapters": [
    { "target": "skill_md", "adapter": "skill_export.v1", "artifact_ref": "…" }
  ]
}

A Compatibility Adapter is the declared (and optionally materialized) export that renders the object into another ecosystem's format — e.g. a skill_export.v1 adapter producing a SKILL.md. v1 ships the MySenz runtime as primary and a SKILL.md export as the compatibility proof-point; MCP and A2A are declared and served (§18) but minimal.
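A consumer reading the compatibility block only needs two lookups. The sketch below uses the field names from the example above; the function names are hypothetical, and treating a missing entry as unsupported is this example's choice (unknown is not a yes).

```python
from typing import Any, Mapping


def runtime_supported(compatibility: Mapping[str, Any], runtime: str) -> bool:
    """True if the runtime is explicitly listed in compatibility.runtimes."""
    return runtime in compatibility.get("runtimes", [])


def standard_supported(compatibility: Mapping[str, Any], standard: str) -> bool:
    """True only when the standard is declared with supported: true."""
    entry = compatibility.get("standards", {}).get(standard)
    return bool(entry) and entry.get("supported") is True


compat = {
    "runtimes": ["mysenz", "local_agent"],
    "standards": {
        "mcp": {"supported": True},
        "apple_app_intents": {"supported": False, "reason": "not_applicable_v1"},
    },
}
print(runtime_supported(compat, "mysenz"))               # True
print(standard_supported(compat, "apple_app_intents"))   # False
print(standard_supported(compat, "not_declared"))        # False
```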


13. Discovery Protocol

| Surface | Returns |
| --- | --- |
| /.well-known/oip.json | Registry capability document: protocol version, endpoints, supported license types, score_version, MCP/A2A pointers. |
| /llms.txt | Human+agent plain-text index pointing at the machine endpoints and the one v1 object. |
| /v1/outcomes | Paged list of Outcome Manifest summaries (id, name, domain, band, confidence, sample_size_warning). |
| /v1/outcomes/{id} | Full Outcome Manifest. |
| /v1/outcomes/{id}/trust | Full Trust Manifest + Trust Metric. |
| /v1/outcomes/{id}/ratings | Rating summary + (in v1) verified-event summary; unverified reviews labeled. |
| /openapi.json | OpenAPI 3.1 spec of the REST surface. |
| /mcp | MCP endpoint exposing the OIP tools (§18). |
| /a2a/agent-card.json | A2A agent card advertising OIP capabilities (§18). |

Example discovery flow (agent, autonomous):

  1. Agent has a principal who wants relocation guidance → fetches https://skillsenz.ai/.well-known/oip.json.
  2. Calls GET /v1/outcomes?domain=relocation → finds oi:travel_genie.danang_relocation.
  3. Calls GET /v1/outcomes/{id}/trust → reads overall_trust_score: 72, band: Promising, confidence: 76, sample_size_warning: true, and the limitations. Decides the evidence is honest and adequate for a low-stakes purchase.
  4. Calls GET /v1/outcomes/{id} → checks pricing, license, compatibility.runtimes includes its runtime.
  5. POST /v1/outcomes/{id}/license (paying if needed) → receives a license token.
  6. GET /v1/outcomes/{id}/package with the token → resolves install ref → hands package to runtime.

The agent never needed the HTML page. That is the design intent.
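Steps 1 to 4 of the flow above can be sketched as a read-only walk over the JSON endpoints. This is an illustration, not a client library. The fetch callable is injected so the sketch runs without a network. The "items" key on the listing, the "trust_score" nesting on the trust response, and the function name are assumptions about response shapes that this section does not pin down. Steps 5 and 6 (licensing and package retrieval) are left out because they have side effects.

```python
from typing import Any, Callable, Optional
from urllib.parse import urlencode


def find_candidate(fetch: Callable[[str], Any], domain: str, runtime: str) -> Optional[dict]:
    """Return the first outcome in a domain that supports the runtime, with its trust facts."""
    fetch("/.well-known/oip.json")                                     # step 1: registry resolves
    listing = fetch("/v1/outcomes?" + urlencode({"domain": domain}))   # step 2
    for summary in listing["items"]:                                   # "items" is hypothetical
        outcome_id = summary["id"]
        metric = fetch(f"/v1/outcomes/{outcome_id}/trust")["trust_score"]  # step 3
        manifest = fetch(f"/v1/outcomes/{outcome_id}")                     # step 4
        if runtime not in manifest["compatibility"]["runtimes"]:
            continue
        # Hand back band, confidence, and warning together; the caller decides.
        return {
            "outcome_id": outcome_id,
            "overall_trust_score": metric["overall_trust_score"],
            "trust_band": metric["trust_band"],
            "confidence_score": metric["confidence_score"],
            "sample_size_warning": metric["sample_size_warning"],
            "limitations": metric["limitations"],
        }
    return None


oid = "oi:travel_genie.danang_relocation"
fake_registry = {  # in-memory stand-in for the HTTP surface
    "/.well-known/oip.json": {"protocol": "oip"},
    "/v1/outcomes?domain=relocation": {"items": [{"id": oid}]},
    f"/v1/outcomes/{oid}/trust": {"trust_score": {
        "overall_trust_score": 72, "trust_band": "Promising",
        "confidence_score": 76, "sample_size_warning": True, "limitations": []}},
    f"/v1/outcomes/{oid}": {"compatibility": {"runtimes": ["mysenz", "local_agent"]}},
}
candidate = find_candidate(fake_registry.__getitem__, "relocation", "mysenz")
print(candidate["trust_band"], candidate["sample_size_warning"])  # Promising True
```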


14. Licensing Protocol (v1)

Supported v1 license types: free / free_preview and one_time. Nothing else.

{
  "schema_version": "oip.license.v1",
  "license_id": "lic_…",
  "license_token": "lt_…",
  "outcome_id": "oi:travel_genie.danang_relocation",
  "license_type": "one_time",
  "entitlement": "install_and_execute",
  "principal": "principal:…",
  "purchased_by": "principal:… | agent:… (authorized)",
  "agent_authorization": { "agent_id": "agent:…", "authorized": true },
  "expiration": null,
  "verification_endpoint": "https://api.skillsenz.ai/v1/licenses/{token}/verify",
  "package_access": ["pkg:cap.travel_genie.danang_relocation.v1"]
}

Reserved (Phase 2+, present as null-able placeholders, not implemented): subscription, rental, pay_per_use, enterprise, revenue_share, a2a_negotiated.
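A runtime holding a license object can run a quick local pre-check before calling the verification_endpoint, which remains the authority. The sketch below is an assumption-heavy illustration: the function name is hypothetical, a null expiration is read as "does not expire", timestamps are assumed to carry a timezone like the examples in this document, and the returned strings reuse the refusal codes listed in §17.

```python
from datetime import datetime, timezone
from typing import Any, Mapping, Optional

V1_LICENSE_TYPES = {"free", "free_preview", "one_time"}


def license_refusal(license_obj: Mapping[str, Any], package_id: str, principal: str,
                    now: Optional[datetime] = None) -> Optional[str]:
    """Return None if the license covers this package and principal, else a refusal code."""
    if license_obj.get("license_type") not in V1_LICENSE_TYPES:
        return "license_invalid"   # reserved Phase 2+ types are not honoured in v1
    if license_obj.get("principal") != principal:
        return "license_invalid"
    if package_id not in license_obj.get("package_access", []):
        return "license_invalid"
    expiration = license_obj.get("expiration")
    if expiration is not None:     # assumption: null means no expiry
        expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
        if expires_at <= (now or datetime.now(timezone.utc)):
            return "license_expired"
    return None


example_license = {
    "license_type": "one_time",
    "principal": "principal:example",   # hypothetical principal
    "expiration": None,
    "package_access": ["pkg:cap.travel_genie.danang_relocation.v1"],
}
print(license_refusal(example_license, "pkg:cap.travel_genie.danang_relocation.v1",
                      "principal:example"))  # None
```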


15. Verification Protocol

Recommended verification order (an agent/runtime SHOULD verify in this sequence and stop on first failure):

  1. Manifest authenticity — resolve manifest, recompute manifest_id digest, check it matches.
  2. Publisher signature — verify publisher_signature over manifest_hash with the publisher's key.
  3. Package hash — after fetching bytes, recompute content_hash; must equal package_ref.content_hash and the Trust Manifest's signed_package_hash.
  4. License validity: GET /v1/licenses/{token}/verify; confirm entitlement covers the package and principal.
  5. Freshness status — check expires_date; if past, treat situational claims as stale and warn.
  6. Trust score status — confirm score_version is one the agent understands; read band + confidence + sample_size_warning.
  7. Rating integrity — confirm displayed ratings are labeled verified/unverified; do not let unverified reviews drive the decision.
  8. Compatibility — confirm the target runtime/standard is in compatibility.
  9. Deprecation status — confirm lifecycle_status is active (or accept deprecated with superseded_by knowingly).

A failure at any step is reported with a stable refusal code (§17), not a silent fallback.
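The "verify in sequence, stop on first failure, report a refusal code" behaviour can be sketched as a tiny runner. The individual checks (signature verification, hashing, license lookup) are injected as callables and are not implemented here. The step labels and the function name are hypothetical; the refusal code strings are taken from §17.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Check = Tuple[str, Callable[[], Optional[str]]]  # (step label, returns refusal code or None)


def verify_in_order(checks: Sequence[Check]) -> Optional[dict]:
    """Run checks in the given order; return the first failure, or None if all pass."""
    for step, check in checks:
        refusal_code = check()
        if refusal_code is not None:
            # Stop here: later steps are not attempted and nothing falls back silently.
            return {"failed_step": step, "refusal_code": refusal_code}
    return None


calls: List[str] = []


def recorded(step: str, refusal_code: Optional[str] = None) -> Check:
    """Build a stand-in check that records being called and returns a fixed result."""
    def run() -> Optional[str]:
        calls.append(step)
        return refusal_code
    return (step, run)


result = verify_in_order([
    recorded("manifest_authenticity"),
    recorded("publisher_signature", "signature_invalid"),
    recorded("package_hash"),
])
print(result)  # {'failed_step': 'publisher_signature', 'refusal_code': 'signature_invalid'}
print(calls)   # ['manifest_authenticity', 'publisher_signature']
```

The package hash check never runs once the signature fails, which mirrors the recommended order.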


16. Update Protocol


17. Security and Anti-Gaming Model

Refusal codes (stable, API-level): unknown_outcome_id, manifest_hash_mismatch, signature_invalid, package_hash_mismatch, license_invalid, license_revoked, license_expired, score_version_unsupported, outcome_deprecated, outcome_retired, incompatible_runtime, rating_unauthorized, rating_duplicate, trust_escalation_refused (any attempt to present verified trust without a verification ref).


18. API Surface v1

REST

GET  /.well-known/oip.json
GET  /v1/outcomes
GET  /v1/outcomes/{id}
GET  /v1/outcomes/{id}/trust
GET  /v1/outcomes/{id}/ratings
POST /v1/outcomes/{id}/ratings
POST /v1/outcomes/{id}/license
GET  /v1/licenses/{token}/verify
GET  /v1/outcomes/{id}/package        (license-gated; resolves install ref → bytes/redirect)
GET  /openapi.json

MCP endpoint tools (/mcp)

A2A agent card capabilities (/a2a/agent-card.json)

compare_trust MUST return band, confidence, and sample_size_warning alongside the raw score so an agent cannot rank on a naked number.
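A sketch of what that requirement implies for a compare_trust implementation: refuse to emit a row unless all four fields are present. The tool's real signature is not given in this public preview, so the function shape, the row layout, and the choice to raise an error are assumptions of this example. Field names follow the trust object in §8.7.

```python
from typing import Any, List, Mapping, Sequence

REQUIRED_FIELDS = ("overall_trust_score", "trust_band", "confidence_score", "sample_size_warning")


def compare_trust(metrics: Sequence[Mapping[str, Any]]) -> List[dict]:
    """Build comparison rows that always carry band, confidence, and warning with the score."""
    rows = []
    for metric in metrics:
        missing = [field for field in REQUIRED_FIELDS if field not in metric]
        if missing:
            raise ValueError(f"refusing to compare on a naked number; missing {missing}")
        row = {"outcome_id": metric.get("outcome_id")}
        row.update({field: metric[field] for field in REQUIRED_FIELDS})
        rows.append(row)
    return rows  # input order is preserved; ranking is left to the caller


rows = compare_trust([{
    "outcome_id": "oi:travel_genie.danang_relocation",
    "overall_trust_score": 72,
    "trust_band": "Promising",
    "confidence_score": 76,
    "sample_size_warning": True,
}])
print(rows[0]["trust_band"], rows[0]["sample_size_warning"])  # Promising True
```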


19. Human-Readable Projection

The website SkillSenz.ai is a projection, not the protocol.


20. Phase 1 Implementation Plan (brutally practical MVP)

Goal: publish and sell one paid first-party object — Travel Genie Da Nang Relocation — with an honest trust score.

Build, in order:

  1. One Outcome Manifest (§7.2 JSON) — schema + validator (additionalProperties:false, null↔︎unknown pairing).
  2. One Trust Manifest (§9).
  3. One Outcome Trust Metric object (§8.7) — TrustSenz score computed from the one ExecutionRecord using Wilson + Beta + evidence-class caps; weights pinned to trustsenz.score.v1.
  4. One rating placeholder object — endpoint live, rating_count: 0, score weight inactive.
  5. One package reference — the CAP capsule, content-hashed.
  6. One install reference — MySenz capsule_install.
  7. One free preview — preview package or limited manifest.
  8. One paid license flow: POST /license → one-time purchase via external payment provider → license token.
  9. One verification endpoint: GET /v1/licenses/{token}/verify + manifest/package/signature checks.
  10. One minimal landing page — explains OIP, renders the one object's trust honestly, links to JSON.
  11. One docs page — the protocol overview.
  12. One llms.txt.
  13. One .well-known/oip.json.
  14. One OpenAPI file.

Do NOT build in Phase 1: third-party publishing, public reviews, marketplace search, negotiation, subscriptions, rental, revenue share, creator portal, webhooks, semantic search, cross-registry federation.

Tripwire (borrowed from the SignalSenz review's 500-line rule): if the Phase 1 build grows a second registry, a review queue, or a ratings-affect-score path, stop — something from Phase 2 is being smuggled in.


21. Phase 2 and Phase 3 Extensions

Phase 2 (after the first object demonstrably sells and repeated usage moves trust honestly):

Phase 3 (platform scale):


22. Open Questions (the decisions that actually matter)

  1. Is OIP proprietary, open, or open-core? Recommendation: open-core — the manifest/trust schemas and discovery API are open (adoption needs them to be); the TrustSenz evaluator and the verified-outcome corpus are the proprietary moat.
  2. Is SkillSenz.ai the canonical registry or one of many? v1: one registry. Schema avoids assuming singularity so federation stays possible.
  3. How is publisher identity verified? v1: first-party key custody only (first_party_verified). Third-party identity verification is a Phase 2 prerequisite, unsolved here deliberately.
  4. Minimum trust metadata required to list? A signed manifest, a Trust Manifest with identity_score and freshness, and an honest journey_count (which may be 0 with band Unverified). You may list with no journeys; you may not list while claiming journeys you cannot evidence.
  5. Legal disclaimers for visa/health/legal/financial/safety outcomes? Required disclaimers[] in the Trust Manifest; CRITICAL-tier domains (health, finance, legal, credentials) cannot auto-promote trust and need domain disclaimer text. Get real legal advice before the first health object.
  6. Is CAP installation local, cloud, or runtime-dependent? Runtime-dependent — OIP supplies install refs; the runtime decides. v1 is MySenz.
  7. Simplest paid Travel Genie test? The Da Nang Relocation object at one-time $19 with a free preview — the §20 plan. The success metric is a single real purchase, not protocol elegance.
  8. Does OIP need a namespace separate from CAP IDs? Yes. outcome_id (oi:) ≠ package_id (pkg:) ≠ cap_id. One outcome may reference multiple packages/CAPs across versions; collapsing the namespaces would make versioning and host-swapping ambiguous.
  9. Minimum journey count before ratings affect score? Default 20 (min_journey_threshold, pinned by score_version). Below it, ratings display but carry zero score weight.
  10. Weight agent vs user ratings differently? Yes, once they count (Phase 2): agent ratings are more reproducible but more gameable at scale; keep agent_rating_average and user_rating_average separate forever, combine only with declared, versioned weights.
  11. Who can submit verified outcome feedback? Only a license-bound principal/agent/runtime whose feedback is journey-bound to a real ExecutionRecord. Everyone else can submit unverified reviews that never touch the score.
  12. Can a publisher appeal a negative trust delta? Deltas are append-only and never deleted; a publisher may submit new evidence that produces a new delta, and a contradicting delta is review_flag: true. There is no "edit the past" path — that is the integrity guarantee.

23. Final Recommendation

Build OIP v1 only to the degree needed to publish and sell one trusted Travel Genie Da Nang Relocation Outcome Intelligence object. Do not build a marketplace. Do not build protocol complexity before usage.

Treat OIP v1 as a lightweight publishing, trust, licensing, rating, and installation standard — one Outcome Manifest, one Trust Manifest, one evidence-led Trust Metric, a thin REST+MCP+A2A surface, and free / one-time licensing.

Trust scoring is required from day one, but it must be:

The success metric is not protocol elegance. It is whether a real human or agent pays for a trusted outcome, and whether repeated usage moves trust up or down based on evidence. Everything in this spec that does not serve that metric is Phase 2.


Spec complete. The discipline that built ExecutionRecord v0 and the RuntimeEvidenceAdapter — append-only, forbidden fields, refusal codes, provenance-by-pointer, unknowns-not-defaults, no trust laundering — is the discipline that governs OIP. OIP succeeds by being the smallest honest standard that lets a trusted outcome be found, verified, bought, and improved; it fails by becoming a marketplace before anyone has paid.