Where AI memory lives: the substrate underneath SAIHM

2026-05-21 · SAIHM · ~7 min read · for engineers and architects evaluating AI memory infrastructure

The vendor-memory failure mode

Almost every AI memory feature shipped in the last twelve months lives inside the vendor: on the model provider's servers, behind their authentication, governed by their retention policy, terminated by their billing system. That works until any of the following happens:

The vendor changes the retention policy.
The vendor deprecates the memory product.
You move to a different model and the memory does not move with you.
A regulator asks for proof of erasure and the vendor cannot produce it.
An auditor asks for a timeline of what the agent knew when, and the vendor cannot produce it either.

The fix is not a better vendor. The fix is moving memory off the vendor stack onto a substrate that the agent owner controls, that any party can verify, and that survives the vendor.

What “substrate” means here

The SAIHM protocol is a contract: eight tools, an identity scheme, an erasure receipt, an audit anchor format. The substrate is the physical infrastructure that contract runs on — the ledger that records the audit anchors, the storage network that holds the encrypted cell data, the key material the agent identity is derived from.

The distinction matters because the protocol is the part you depend on. The substrate is the part you can swap.

The four jobs a substrate must do

Any substrate that hosts AI memory in production has to do four jobs, each with its own engineering constraints:

Identity anchoring. The agent's identity must be derivable from key material the agent owner controls (a wallet, an HSM, a KMS) and provable to third parties without trusting the vendor. This is where wallet-derived keys and HKDF chains earn their keep.
Audit anchoring. Each memory write needs a tamper-evident timestamp on a public, append-only ledger so a regulator, auditor, or counterparty can verify the timeline without re-trusting the operator. Finality must be fast enough that an anchor is settled before anyone needs to rely on it, and the per-write cost must be low enough that anchoring every cell is economically viable.
Ciphertext storage. The encrypted cell content has to live somewhere durable, distributed, and addressable by content hash so any party with the right key can fetch it and any party without the key sees only random bytes.
Erasure proof. When the user invokes the right to erasure, the substrate must let the protocol destroy the only decryption key and stop serving the stored bytes — producing a tamper-evident receipt that a regulator can verify against the public ledger.
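
As a rough illustration of the audit-anchoring and ciphertext-storage jobs, the sketch below derives a content identifier from encrypted bytes and builds hash-chained anchor records. It is a minimal sketch under stated assumptions: the field names, the `content_id`, `anchor_record`, and `verify_chain` helpers, and the plain Python list standing in for a public ledger are all hypothetical and are not taken from the SAIHM specification.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # hypothetical placeholder for "no previous anchor"

def content_id(ciphertext: bytes) -> str:
    """Content address: SHA-256 hex digest of the encrypted bytes."""
    return hashlib.sha256(ciphertext).hexdigest()

def _digest(cid: str, prev: str, ts: int) -> str:
    # Canonical JSON so every verifier hashes exactly the same bytes.
    body = json.dumps({"cid": cid, "prev": prev, "ts": ts}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()

def anchor_record(cid: str, prev_anchor: str, timestamp: int) -> Dict:
    """Build an anchor that commits to the cell and to the previous anchor."""
    return {"cid": cid, "prev": prev_anchor, "ts": timestamp,
            "anchor": _digest(cid, prev_anchor, timestamp)}

def verify_chain(ledger: List[Dict]) -> bool:
    """Recompute every anchor; any edited field or reordering breaks the chain."""
    prev = GENESIS
    for rec in ledger:
        if rec["prev"] != prev:
            return False
        if rec["anchor"] != _digest(rec["cid"], rec["prev"], rec["ts"]):
            return False
        prev = rec["anchor"]
    return True
```

A verifier needs only the ledger records to run `verify_chain`; it never sees keys or plaintext, which is the property the audit-anchoring job asks for.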

The candidate landscape, by job

Different substrate choices satisfy different jobs well or badly:

Identity: any chain with mature wallet tooling works. Ethereum, Bitcoin, the EVM L2s, Solana, Cosmos, Substrate-family chains, COTI V2: all are viable for the “agent owns its keys” primitive.
Audit anchor: needs low write cost and reasonable finality. Ethereum L1 is too expensive at scale (every cell-mint would cost real dollars). L2 rollups are cheaper but inherit the rollup's data-availability assumptions. Cheaper L1s and app-chains widen the candidate set.
Ciphertext storage: decentralized, content-addressed networks are the obvious open candidates. A permanence-oriented (immutable) network is technically a candidate, but its permanent-storage model conflicts with the right-to-erasure requirement (you cannot blacklist an immutable content identifier; such a network is designed not to forget). Centralized object stores (S3 / GCS / Azure Blob) work but reintroduce a custodial layer.
Erasure proof: not a property of the substrate per se — it is a property of the protocol's key-management discipline. Any chain that can take a small write can record a tombstone; the protocol decides what the tombstone means.
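
To make the key-management point concrete, here is a minimal sketch of erasure by key destruction followed by a tombstone record. Everything in it is an assumption made for illustration: `KeyVault`, `tombstone`, `verify_tombstone`, and the receipt layout are hypothetical names, not the SAIHM erasure receipt format. A real deployment would hold keys in an HSM or KMS, since Python cannot guarantee that key bytes are wiped from memory.

```python
import hashlib
import secrets
from typing import Dict, Optional

class KeyVault:
    """Hypothetical in-memory holder of one decryption key per cell."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}

    def create(self, cell_id: str) -> bytes:
        key = secrets.token_bytes(32)
        self._keys[cell_id] = key
        return key

    def get(self, cell_id: str) -> Optional[bytes]:
        return self._keys.get(cell_id)

    def destroy(self, cell_id: str) -> str:
        """Drop the only copy of the key; return its fingerprint for the receipt."""
        key = self._keys.pop(cell_id)
        return hashlib.sha256(key).hexdigest()

def _receipt(cell_id: str, key_fp: str, erased_at: int) -> str:
    return hashlib.sha256(f"{cell_id}|{key_fp}|{erased_at}".encode()).hexdigest()

def tombstone(cell_id: str, key_fp: str, erased_at: int) -> Dict:
    """Small record a ledger could store; the protocol defines what it means."""
    return {"cell_id": cell_id, "key_fp": key_fp, "erased_at": erased_at,
            "receipt": _receipt(cell_id, key_fp, erased_at)}

def verify_tombstone(record: Dict) -> bool:
    """Anyone holding the record can recompute the receipt hash."""
    expected = _receipt(record["cell_id"], record["key_fp"], record["erased_at"])
    return expected == record["receipt"]
```

The ledger only has to accept the small tombstone write. Whether that write counts as proof of erasure depends entirely on the discipline around `destroy`, which is why the job belongs to the protocol rather than the chain.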

No single substrate is obviously dominant on all four jobs. The honest answer is: you pick the combination that satisfies your constraints, and you document the choice so others can audit it.

SAIHM's deployment choice

SAIHM's reference deployment uses COTI V2 mainnet for identity and audit anchoring, and a decentralized, erasure-compatible storage tier for encrypted cell ciphertext. The reasoning is narrow and engineering-only:

Native privacy primitives at the protocol level. COTI V2 supports wallet-derived encryption keys and private-data computation primitives, which lets the SAIHM agent identity HKDF chain (MPS-PQC-KEY-GEN-v1 → MPS-AGENT-IDENTITY-v1) tie audit records to the agent without putting sensitive key material on-chain.
Per-anchor cost low enough to anchor every cell. SAIHM mints one transaction per memory cell. The cost ceiling matters: at Ethereum L1 prices this workload would not be viable. The substrate has to make per-cell anchoring economically routine.
No EVM / no Solidity surface area. SAIHM has zero EVM dependency in protocol code: no ethers.js, no Solidity contracts. Protocol code talks to the substrate through its native JSON-RPC. This is a deliberate protocol-design choice (smaller attack surface, no Solidity-version churn, no EVM gas-pricing dependency).
An erasure-compatible storage tier, not a permanence-oriented one. A permanence-oriented (immutable) network's storage model is incompatible with the GDPR Article 17 right to erasure. SAIHM needs a storage tier that can stop serving a cell once its key is destroyed, so the ciphertext cannot be re-fetched. An erasure-compatible network's incentive model supports that; an immutable one's does not.
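
The two-step HKDF chain mentioned above can be sketched with the standard library alone. This is an illustration, not the SAIHM derivation. It assumes HKDF as defined in RFC 5869 with HMAC-SHA256, and it assumes the two labels are used as HKDF `info` strings, which the text here does not confirm. The functions `derive_agent_identity` and `public_identifier` are hypothetical names.

```python
import hashlib
import hmac
from typing import Tuple

def hkdf_sha256(ikm: bytes, info: bytes, salt: bytes = b"", length: int = 32) -> bytes:
    """HKDF (RFC 5869) extract-then-expand using HMAC-SHA256."""
    if not salt:
        salt = b"\x00" * hashlib.sha256().digest_size
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                             # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

def derive_agent_identity(wallet_secret: bytes) -> Tuple[bytes, bytes]:
    """Chain two derivations; using the labels as `info` is an assumption."""
    keygen = hkdf_sha256(wallet_secret, info=b"MPS-PQC-KEY-GEN-v1")
    identity = hkdf_sha256(keygen, info=b"MPS-AGENT-IDENTITY-v1")
    return keygen, identity

def public_identifier(identity_key: bytes) -> str:
    """Hypothetical on-chain handle: a hash of the identity key, never the key itself."""
    return hashlib.sha256(identity_key).hexdigest()
```

Only the output of `public_identifier` would need to appear in an audit record in this sketch, so the wallet secret and both derived keys stay off-chain while the derivation remains reproducible by the key owner.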

The choice is a tradeoff, not a coronation. A SAIHM deployment that satisfied the four jobs on a different substrate combination would still be SAIHM — the protocol contract does not change.

The portability point

Pulling these threads together: the substrate is a deployment configuration; the protocol is the load-bearing surface. If COTI V2 disappeared tomorrow, SAIHM the protocol would not disappear; a fresh deployment would stand up on whichever substrate combination satisfied the same four jobs, and the contract surface presented to applications and regulators would be unchanged.

This matters for evaluating any AI memory product, not just SAIHM:

If you are building an agent: think about your memory protocol independently from the substrate it runs on. Lock-in to either is a problem; lock-in to both at once is worse.
If you are a CISO or DPO: ask vendors for their substrate story — what is the audit ledger, what is the ciphertext store, what is the erasure-proof mechanism, what is the migration story if any of those changes.
If you are a regulator: the substrate is not a black box. Audit anchors are public; the candidate ledgers are well-documented; you can verify the chain of receipts yourself.

How to evaluate a substrate yourself

For each of the four jobs, ask:

Identity: Can the agent owner derive and rotate keys without trusting the vendor? Is the derivation auditable?
Audit anchor: What is the per-write cost? What is the finality timeline? Is the chain public and indexable by third-party explorers?
Ciphertext: Is content addressed by hash? Is the storage network distributed? Can a CID be made unfetchable when its key is destroyed?
Erasure: Does the protocol destroy the only key, or does it merely mark data deleted in a table? Where does the erasure receipt live? Can a regulator verify it against the public ledger without the operator's cooperation?

If a vendor cannot answer any of these for their AI memory product, the substrate story is incomplete.
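
If you want to run this checklist across several vendors, one option is to encode the questions as data and flag whatever is left blank. The sketch below is only one way to do that; the question keys and the `gaps` and `is_complete` helpers are hypothetical names chosen for this example.

```python
from typing import Dict, List

# Question keys paraphrase the checklist above; the names are illustrative.
QUESTIONS: Dict[str, List[str]] = {
    "identity": ["derive_and_rotate_without_vendor", "derivation_auditable"],
    "audit_anchor": ["per_write_cost", "finality_timeline", "public_and_indexable"],
    "ciphertext": ["content_addressed", "distributed",
                   "unfetchable_after_key_destruction"],
    "erasure": ["destroys_only_key", "receipt_location",
                "verifiable_without_operator"],
}

def gaps(answers: Dict[str, Dict[str, str]]) -> List[str]:
    """Return 'job.question' for every question with no non-blank answer."""
    missing = []
    for job, questions in QUESTIONS.items():
        for q in questions:
            if not answers.get(job, {}).get(q, "").strip():
                missing.append(f"{job}.{q}")
    return missing

def is_complete(answers: Dict[str, Dict[str, str]]) -> bool:
    """A substrate story is complete only when no question is left open."""
    return not gaps(answers)
```

The code checks only that an answer exists. Judging whether the answer is any good remains a human review step.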

Try it: a drop-in memory contract

Here is the fastest way to feel the difference. Paste this into your agent's system prompt. It assumes the SAIHM MCP tools saihm_recall, saihm_remember, and saihm_forget are wired into your harness. Following the contract is what produces the context-token savings:

## Memory contract

On every turn, before you act:
1. RECALL, don't re-read. Call saihm_recall with keywords for the task to load a small, bounded set
   of cells. Do NOT re-send prior turns - the recalled cells ARE your context.
2. Prefer the CURRENT fact. If two recalled cells conflict, the most recent /
   non-superseded one wins - never act on a decision a later cell reversed.
3. REMEMBER durably. Call saihm_remember to persist decisions, conventions,
   and constraints as cells - one fact each, in your own words.
4. On a "delete my data" request, call saihm_forget on those cells: erasure
   is per-record and provable, not a soft delete.

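Bounded recall flattens the resend curve from O(N²) to O(N·cap), where N is the number of turns and cap is the maximum number of cells recalled per turn. That is the source of the 62.8–85.9% reduction in context tokens the open benchmark shows. Start with a small recall cap and raise it only if recall misses.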
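
The difference between the two curves is easy to see with a toy cost model. The sketch below is a simplification, not the open benchmark: it assumes every turn and every recalled cell costs the same number of tokens, and the function names and inputs are made up for illustration.

```python
def full_resend_tokens(turns: int, tokens_per_turn: int) -> int:
    """Turn k resends all k turns so far: t * (1 + 2 + ... + N), which is O(N^2)."""
    return tokens_per_turn * turns * (turns + 1) // 2

def bounded_recall_tokens(turns: int, tokens_per_turn: int, cap: int) -> int:
    """Turn k loads at most `cap` units of context, which is O(N * cap)."""
    return sum(tokens_per_turn * min(k, cap) for k in range(1, turns + 1))

def reduction(turns: int, tokens_per_turn: int, cap: int) -> float:
    """Fraction of context tokens saved by bounding recall."""
    full = full_resend_tokens(turns, tokens_per_turn)
    bounded = bounded_recall_tokens(turns, tokens_per_turn, cap)
    return 1 - bounded / full
```

With the made-up inputs of 100 turns, 200 tokens per turn, and a cap of 10, the model gives 1,010,000 tokens for full resend against 191,000 for bounded recall. Those figures show the shape of the curve under these assumptions; they do not reproduce the benchmark numbers, which depend on real usage patterns.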

Independence notice. SAIHM is an Apache-2.0 protocol authored independently. It is not affiliated with COTI, Ethereum, Solana, Cosmos, or any ledger, storage network, or cloud provider named in this post; references to them describe publicly observable technical characteristics, not endorsements. The substrate combination described is a reference deployment, not a requirement of the protocol contract. The context-token reduction (around 80% on long sessions) is reproducible independently with the open benchmark, and varies by usage pattern. Regulatory citations (GDPR Article 17) are correct as of publication; consult counsel for application to your jurisdiction.