two-bets-on-where-agent-context-lives - AIXplore

# Two Bets on Where Agent Context Lives

> [!tip] TLDR
> **The why.** Two vendors shipped answers to the same unsolved question within days of each other (Google on June 12, AWS on June 16). AWS put agent context next to the object bytes in S3. Google put it in a portable format that travels independent of any store. The obvious read is two rival products. It is the wrong one. They answer the question at two different layers, and reading them as competitors hides the architecture you actually want.
>
> **The shape.** AWS attaches structured context (annotations) to S3 objects, so the meaning rides with the bytes, addressed and governed the way the object is. Google's Open Knowledge Format describes context as portable markdown that any system can read or write, so the meaning moves freely between backends. One bets on locality. One bets on portability.
>
> **The hard part.** Naming the tradeoff without picking a side. Coupling context to bytes puts it in one store under one lifecycle, which collapses the sync problem and loses portability. Freeing context solves portability and multi-source reasoning and reopens the drift problem (the context and the thing it describes can silently diverge). Neither is the answer. The mature pattern uses both, and the design work is deciding which layer owns truth and which layer the agent reasons over.
>
> **Key takeaway:** treat "where does context live" as a two-layer decision, not a vendor pick. Couple your source-of-truth context to the bytes it describes; expose a portable working layer the agent reasons over; and make the sync between them explicit, because that seam is where correctness leaks.

For two years the agent conversation has been about the model, then the harness, then the tools. The quiet problem underneath all of it is storage. Specifically: when your agent learns something about a document, a dataset, or a prior decision, where does that knowledge live, and who owns its lifecycle.

Most teams answer this by accident. The context ends up smeared across a vector database, a few JSON sidecars, an `AGENTS.md`, and whatever the last engineer hardcoded into a prompt. It works until the underlying file changes and nobody updates the embedding. Then the agent confidently reasons over a description of a thing that no longer matches the thing.

Two vendors just shipped opposite answers to this. They are worth reading together, because the disagreement between them is the actual design space.

---

## The problem both bets are answering

An agent's context has a lifecycle the same way data has a lifecycle. It gets created, updated, invalidated, and garbage-collected. The difference is that almost nobody manages the context lifecycle on purpose.

A document gets ingested. An agent annotates it: this is a Q3 contract, the counterparty is X, the renewal clause is unusual. That annotation is knowledge worth keeping. But it lives somewhere disconnected from the document. When a new version of the contract lands, the annotation does not move with it, does not get invalidated, does not even know the document changed. The agent keeps citing a fact that went stale three versions ago.

> **The unowned lifecycle is the bug both vendors are trying to design out. The disagreement is about where you put context so it stays in step with what it describes.**

I wrote a year ago that we were [arguing about the nouns while the verbs were on fire](https://rundatarun.io/p/the-context-graph-ais-trillion-dollar): everyone debating what to call agent memory, nobody shipping a durable place to keep it. These two releases are the verbs. They are infrastructure for the context lifecycle, and they make opposite bets on locality.

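
To make "managing the context lifecycle on purpose" concrete, here is a minimal sketch of the four lifecycle steps (created, updated, invalidated, garbage-collected) wired to document events. Everything in it is an assumption for illustration: `ContextLifecycle`, `ContextRecord`, and the integer document versions are hypothetical names, not part of either vendor's product.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class State(Enum):
    VALID = "valid"
    INVALIDATED = "invalidated"


@dataclass
class ContextRecord:
    doc_id: str
    doc_version: int  # version of the document this fact was derived from
    fact: str
    state: State = State.VALID


class ContextLifecycle:
    """Hypothetical in-memory registry that reacts to document events."""

    def __init__(self) -> None:
        self._records: dict[str, list[ContextRecord]] = {}

    def annotate(self, doc_id: str, doc_version: int, fact: str) -> ContextRecord:
        # Created or updated: the fact is tied to a specific document version.
        record = ContextRecord(doc_id, doc_version, fact)
        self._records.setdefault(doc_id, []).append(record)
        return record

    def on_document_changed(self, doc_id: str, new_version: int) -> None:
        # Invalidated: facts derived from older versions stop being trusted.
        for record in self._records.get(doc_id, []):
            if record.doc_version < new_version:
                record.state = State.INVALIDATED

    def on_document_deleted(self, doc_id: str) -> None:
        # Garbage-collected: context does not outlive the thing it describes.
        self._records.pop(doc_id, None)

    def valid_facts(self, doc_id: str) -> list[str]:
        return [
            r.fact
            for r in self._records.get(doc_id, [])
            if r.state is State.VALID
        ]
```

The stale-contract failure described above is what happens when nothing calls `on_document_changed`: the annotation stays `VALID` forever because no component owns the invalidation step.
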
---

## Bet one: couple context to the bytes

AWS [S3 Annotations](https://aws.amazon.com/blogs/aws/amazon-s3-annotations-attach-rich-queryable-context-directly-to-your-objects/) attach structured context directly to an S3 object: up to a thousand named annotations per object, mutable in place, auto-indexed so you can query them with Athena or a natural-language MCP server. The annotation lives in the same store as the object and shares its lifecycle. When you read the object, the meaning is right there with it. When the object is copied or deleted, its context comes along or gets cleaned up by the same machinery.

The bet is **locality**. Put the knowledge next to the thing it describes, in the same system that already owns the thing's durability and access control.

What that buys you is large and under-discussed:

- **Co-location, not a sidecar.** The context lives in the same store as the object and reads back with it. "What does the agent know about this object" becomes a read against the object's own record, not a join against a separate database you have to keep in sync.
- **Lifecycle comes mostly for free.** AWS is explicit: "your context moves automatically with the object during copy, replication, and cross-region transfers, and S3 removes it when you delete the object." You do not build a second lifecycle system for your context. The storage layer carries it along.
- **Drift becomes a single-store update, not a reconciliation job.** Annotations are mutable in place, so when the object changes you refresh its context with one call against the same system. The two can still fall out of step if you skip that call. But there is no second store to reconcile, and an unreconciled second store is exactly the failure mode the unowned lifecycle creates.

What it costs you is just as real. Context that lives inside one store is hard to reason over across stores.
If your agent's working memory needs to span objects in S3, rows in a warehouse, and messages in a queue, bytes-coupled annotations give you a clean answer for the S3 slice and nothing for the rest. The knowledge is anchored, which is the point, and anchored knowledge does not travel.

> [!note] What "annotation" means here
> Not a free-text comment. Structured, queryable context attached to the object record, which is why it can be governed and versioned rather than just stored. Treat the exact schema and API surface as something to confirm against the current AWS docs before you build, because that part moves.

---

## Bet two: free context from the store

Google's [Open Knowledge Format](https://cloud.google.com/blog/products/data-analytics/how-the-open-knowledge-format-can-improve-data-sharing) takes the opposite position. Describe context as portable markdown files with structured frontmatter, linked into a graph and deliberately store-agnostic, so any system can read or write it and the meaning moves independent of where the underlying data sits. It formalizes the "LLM wiki" pattern, the same instinct behind the `AGENTS.md` and `CLAUDE.md` files agents already read: knowledge as plain files a human and an agent can both open, kept in version control next to the thing it describes.

The bet is **portability**. Knowledge should not be trapped in the storage layer that happens to hold the bytes today, because the bytes will move, the vendor will change, and the agent's view of the world should survive both.

What that buys you:

- **Multi-source reasoning is native.** A portable format does not care whether a fact came from object storage, a CRM, or a PDF. The agent reasons over one knowledge layer assembled from many sources, which is what real agent work needs.
- **Vendor independence.** Context expressed in an open format outlives the store. You can migrate backends and keep the knowledge, which is the inverse of the lock-in coupling creates.
- **Composability across tools.** Two systems that both speak the format can exchange knowledge without a custom adapter, the same way they can exchange JSON instead of inventing a wire protocol per pair.

The cost is the drift problem, reopened. The moment context lives apart from the bytes it describes, the two can diverge, and now you own the sync. A portable knowledge layer that says a file has a certain property has no built-in guarantee the file still does. You are back to the staleness failure, except now it is your job to detect and repair it rather than the storage layer's job to prevent it.

---

## The two altitudes

Read together, the disagreement resolves. These are not competing products. They sit at different altitudes of one stack.

```mermaid
flowchart TB
    subgraph WORK["Working layer: portable, store-agnostic (OKF-shaped)"]
        direction LR
        A["agent reasons here"] --- B["multi-source knowledge graph"]
    end
    subgraph TRUTH["Source-of-truth layer: coupled to bytes (S3 Annotations-shaped)"]
        direction LR
        C["object + annotation"] --- D["object + annotation"] --- E["object + annotation"]
    end
    WORK -- "projects from / writes back to" --> TRUTH
    TRUTH -- "feeds verified context up" --> WORK
```

The bytes-coupled layer is where truth lives. It is anchored, governed, versioned, and resistant to drift because the context cannot detach from what it describes. The portable layer is where the agent works. It is assembled, cross-source, and disposable, because it is a projection, not the original.

> **Coupling answers "is this still true." Portability answers "can I reason across everything." You want both answers, which means you want both layers.**

Named plainly, the tradeoff is locality versus reach, and you do not resolve it by choosing. You resolve it by deciding which layer owns truth (the coupled one) and which layer the agent reasons over (the portable one), then making the projection between them explicit.


| Dimension | Coupled to bytes | Freed from store |
|---|---|---|
| Keeping it current | single-store update | cross-system sync you own |
| Lifecycle (copy, delete, replicate) | inherited from the object | your responsibility |
| Cross-source reasoning | weak | native |
| Vendor independence | low | high |
| Best role | source of truth | agent working memory |

---

## Composing both

Here is the pattern I would build, and it falls straight out of the two-altitude read.

**Anchor truth at the bytes.** When an agent learns something durable about an object, write it as a coupled annotation on that object. This is the record. It is governed by the storage layer's lifecycle, it moves and dies with the object, and refreshing it is one call against one store rather than a pipeline between two.

**Project a portable working layer for reasoning.** When the agent needs to think across many sources, assemble a store-agnostic knowledge view from the coupled annotations plus whatever else (the warehouse, the ticketing system, the docs). The agent reasons over this layer. It is allowed to be lossy and disposable because it is rebuilt from anchored truth.

**Make the seam explicit.** The single most important design decision is the projection between the two layers, because that seam is where correctness leaks. Two rules keep it honest:

- **Writes of durable facts go down to the coupled layer, not just into the working graph.** If the agent learns something true, it belongs on the bytes, or it will not survive the next rebuild of the working layer.
- **The working layer carries a freshness stamp tied to the object version it was projected from (its ETag or version id).** When that version changes, the projection is stale and must be rebuilt or re-verified before the agent trusts it. This is exactly the invalidation step the unowned lifecycle skips.

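
The two seam rules above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the S3 Annotations API or the Open Knowledge Format: `TruthStore`, `Annotation`, and `ProjectedFact` are hypothetical names, and the string `version_id` stands in for an ETag or version id.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    value: str
    verified_at: str  # object version this fact was last checked against


@dataclass(frozen=True)
class ProjectedFact:
    """One entry in the portable working layer, carrying its freshness stamp."""
    key: str
    name: str
    value: str
    source_version: str


class TruthStore:
    """Hypothetical stand-in for the coupled layer (object plus annotations)."""

    def __init__(self) -> None:
        self._versions: dict[str, str] = {}
        self._annotations: dict[str, dict[str, Annotation]] = {}

    def put_object(self, key: str, version_id: str) -> None:
        # New bytes arrive. Old annotations remain but keep their old stamp.
        self._versions[key] = version_id
        self._annotations.setdefault(key, {})

    def annotate(self, key: str, name: str, value: str) -> None:
        # Rule 1: durable facts are written down to the coupled layer,
        # stamped with the object version they were verified against.
        self._annotations[key][name] = Annotation(value, self._versions[key])

    def project(self, key: str) -> list[ProjectedFact]:
        # Build working-layer facts; each inherits the annotation's stamp.
        return [
            ProjectedFact(key, name, ann.value, ann.verified_at)
            for name, ann in self._annotations[key].items()
        ]

    def is_fresh(self, fact: ProjectedFact) -> bool:
        # Rule 2: trust a projected fact only while its stamp matches
        # the current version of the object it describes.
        return fact.source_version == self._versions[fact.key]
```

In this sketch a rebuilt projection does not become fresh on its own after the object changes. The fact stays stale until something re-verifies it by calling `annotate` against the new version, which keeps "rebuilt" and "re-verified" from being silently conflated.
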

This is the same shape as the grounding pattern I have argued for elsewhere: let the deterministic, governed layer hold authority, and let the flexible layer do the reasoning, with a hard, checkable boundary between them. It is the storage-layer version of the [[AI Systems & Architecture/grounded-llm-triage-layer|LLM triage layer that can't freelance]]: the model gets reach, the substrate keeps truth, and the seam is where you spend your engineering attention.

---

## What this means for what you build now

You do not have to adopt either product to use the insight. The two-altitude frame is portable on its own.

If you are storing agent context today, ask which of your context is **truth** and which is **working memory**. Truth wants to be coupled to the bytes it describes, even if your version of "coupled" is just a sidecar with a version hash and an invalidation rule. Working memory wants to be portable and cheap to rebuild, even if your version of "portable" is just a graph you reassemble per task.

The mistake the unowned lifecycle makes is treating all context as one undifferentiated blob in a vector store, with no anchor and no expiry. Both vendors just told you, from opposite directions, that the blob is the bug.

One says anchor it. One says free it. The architecture that holds up does both, and puts the hard work at the seam between them. That seam is where your agent stops citing facts that are no longer true.