Research: The Workspace Boundary for Agent Memory

A clear pattern is emerging in how major AI and workspace platforms handle long-term agent memory. The core idea is simple: store memory in the smallest durable workspace that users already recognize, such as a project, repository, document, workspace, namespace, or site. Then, rely on the platform’s existing permission system to decide who can access that memory. Around that core, keep only lightweight layers for personal preferences and short-lived conversation context. The names vary across Anthropic, OpenAI, GitHub, Google, AWS, Salesforce, and related tools, but the product shape is becoming consistent. This post looks at that pattern, why it matters, and how WordPress fits into it.

This is the companion to a recent post on memory in WordPress core, which argues that the Guidelines feature is the right place for agent memory to grow inside WordPress. That post made the case from first principles. This one shows the research that actually informed it, surveying how the rest of the industry has approached the same problem.

The problem: the same human, different intents

Memory scoping sounds trivial until you try to design it.

Consider a single person, call her Grażyna, who has an account on a WordPress multisite network. Grażyna owns a personal travel blog. She also administers five client sites for an agency. She’s a contributor on one more, her friend’s photography portfolio. Same human, one account, seven different contexts.

If the agent learns “the brand color is dark green” while Grażyna is working on client A’s site, that fact must not surface when Grażyna switches to her personal travel blog. If Grażyna tells the agent “I prefer concise responses” once, that preference should probably follow her across sites, but it shouldn’t apply to a different user editing the same shared site. If a junior contributor on the photography portfolio asks the agent to remember something, the agent must not let that memory leak into what the site owner sees.

Mixing memory across those contexts is both a privacy failure (client A’s data showing up in a conversation about client B) and a quality failure (contradictory suggestions that erode trust in the agent). Avoiding the mix requires a scoping model that the system applies before retrieval, not after.
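To make "before retrieval, not after" concrete, here is a minimal sketch under loud assumptions: the `score` function is a toy word-overlap ranker, the store is a list of dicts, and none of the names come from any platform discussed in this post. It contrasts narrowing to the current site first with ranking everything and filtering afterward.

```python
def score(query, text):
    """Toy relevance: how many query words appear in the text (assumption, not a real ranker)."""
    words = set(text.lower().split())
    return sum(1 for w in query.lower().split() if w in words)

def filter_then_rank(store, site_id, query, k=1):
    """Scope first: out-of-scope memories never enter the candidate set."""
    in_scope = [m for m in store if m["site_id"] == site_id]
    ranked = sorted(in_scope, key=lambda m: score(query, m["text"]), reverse=True)
    return [m["text"] for m in ranked[:k]]

def rank_then_filter(store, site_id, query, k=1):
    """Scope last: other sites' memories compete for the top-k slots, then get dropped."""
    ranked = sorted(store, key=lambda m: score(query, m["text"]), reverse=True)
    return [m["text"] for m in ranked[:k] if m["site_id"] == site_id]

store = [
    {"site_id": 2, "text": "the brand color is dark green"},    # client A
    {"site_id": 1, "text": "the accent color is warm orange"},  # personal blog
]
```

In this toy setup, asking about "brand color" on site 1 returns the site's own note when scoping comes first. When scoping comes last, client A's memory wins the single slot and is then discarded, so the agent gets nothing back, and client A's data was still loaded into the ranking step along the way.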

What the industry has converged on

Several major AI platforms have published enough about their memory architecture to be compared directly. The shape is consistent, even though the implementation details differ.

Project or workspace as the primary scope. Anthropic’s Claude Projects creates separate memory per project and frames the separation explicitly as a safety guardrail. ChatGPT Projects supports project-only memory, and shared projects are automatically switched to project-only memory and cannot revert to default memory. GitHub Copilot Memory is repository-scoped, gated by repo write permission, with citation validation against the live codebase and a 28-day expiry. Of the systems surveyed here, it’s the closest direct analog to the WordPress site-memory question. Linear treats agents as workspace members, with team guidance that can override workspace guidance.

Compositional scope under the hood. The workspace boundary is the one users see, but underneath, each of these systems stores memory against a richer key. Google’s Memory Bank keys memory off a structured scope dictionary with customizable composition. AWS AgentCore uses hierarchical namespace paths like /strategies/{id}/actors/{actorId}/sessions/{sessionId}/, with IAM conditions enforcing access. Mem0’s State of AI Agent Memory 2026 report, the most comprehensive recent survey of the space, describes scope composed as user_id × agent_id × session_id × org_id. The pattern across all of them: a single primary scope surfaces in UX, while a compositional scope tuple powers storage and retrieval underneath.
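The idea of a compositional scope tuple can be sketched in a few lines. This is an illustration only: the `Scope` type, the `covers` helper, and the sample data are hypothetical, and the field names simply echo the user_id × agent_id × session_id × org_id composition mentioned above rather than any vendor's actual schema.

```python
from typing import NamedTuple, Optional

class Scope(NamedTuple):
    org_id: str                       # the primary scope a user would see in the UI
    user_id: str
    agent_id: str
    session_id: Optional[str] = None  # None in a query means "any session"

def covers(query, stored):
    """A stored scope matches when every field set in the query agrees with it."""
    return all(q is None or q == s for q, s in zip(query, stored))

memories = {
    Scope("acme", "u1", "writer", "s1"): "Prefers short paragraphs",
    Scope("acme", "u1", "writer", "s2"): "Working on the spring campaign",
    Scope("globex", "u1", "writer", "s3"): "Brand color is dark green",
}

def recall(query):
    """Return every memory whose stored scope is covered by the query scope."""
    return sorted(text for scope, text in memories.items() if covers(query, scope))
```

Leaving `session_id` unset widens the query to everything the same user and agent stored inside one organization, while the organization field keeps the other tenant's memory out of reach regardless of how the narrower fields are set.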

Permission delegation, not a new ACL. The strongest enterprise examples avoid making memory access a separate content ACL. They delegate read and write boundaries to the platform identity and permission systems that already govern the underlying resources. Memory Bank delegates to IAM. AgentCore delegates to IAM. Glean continues to honor source-system ACLs for content access while using its own roles for product-level administration. Salesforce Agentforce ties agent context to existing sharing rules, with documentation that puts it bluntly: employee-facing agents commonly run in the logged-in user’s context, where existing permissions and sharing settings determine access. The dominant enterprise pattern is to avoid a standalone memory ACL when a mature platform permission system already exists, because every parallel ACL is an opportunity for the two systems to drift out of sync.
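A rough sketch of what delegation looks like in code, with every name invented for illustration: `PlatformPermissions` stands in for whatever permission system already governs the underlying resource (it is not IAM or any real API), and `MemoryStore` deliberately keeps no access list of its own.

```python
class PlatformPermissions:
    """Hypothetical stand-in for an existing platform permission system."""
    def __init__(self):
        self._grants = set()

    def grant(self, user, action, resource):
        self._grants.add((user, action, resource))

    def revoke(self, user, action, resource):
        self._grants.discard((user, action, resource))

    def can(self, user, action, resource):
        return (user, action, resource) in self._grants

class MemoryStore:
    """Stores notes per resource and asks the platform on every read and write."""
    def __init__(self, platform):
        self._platform = platform
        self._notes = {}  # resource -> list of notes

    def write(self, user, resource, note):
        if not self._platform.can(user, "write", resource):
            raise PermissionError(f"{user} may not write to {resource}")
        self._notes.setdefault(resource, []).append(note)

    def read(self, user, resource):
        if not self._platform.can(user, "read", resource):
            return []
        return list(self._notes.get(resource, []))
```

Because the store holds no grants of its own, revoking access in the platform takes effect on the very next memory read. There is no second list that could drift out of sync.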

When memory features are not fully integrated into existing governance systems, documented gaps follow. Microsoft 365 Copilot is the clearest public counterexample. Its memory is stored in the user’s Exchange mailbox, but Purview retention policies and labels don’t apply to it, and memory and personalization actions don’t generate Purview audit log entries. That’s the failure mode the delegation pattern is built to avoid, and it’s why Microsoft belongs here as a cautionary case rather than as part of the converging pattern.

Unified agent with roles, not multi-agent fragmentation. Intercom’s Fin folds service, sales, and other personas into a single agent keyed off the customer record, with memory spanning the customer lifecycle. Intercom’s leadership has argued, more pointedly than most, that multi-agent designs with separated memory are the wrong approach. The architectural takeaway is milder than the rhetoric: the workspace boundary handles inter-site isolation, while role-based behavior inside a unified agent handles intra-workspace specialization. The two are complementary, not competing.

Platform	Primary scope	Access enforcement	Notable feature
Claude Projects	Project	Anthropic account model	Incognito mode, per-project memory summary
ChatGPT Projects	Project (toggle)	OpenAI account	Sharing forces project-only
GitHub Copilot	Repository	Repo write permission	Citation validation, 28-day expiry
Memory Bank	Compositional scope dict	Google Cloud IAM	Customizable scope keys
AWS AgentCore	Namespace path	AWS IAM	Hierarchical namespaces
M365 Copilot	Tenant + user	Entra ID	No Purview retention, no audit log
Salesforce Agentforce	Org sharing model	Existing sharing rules	Trust Layer (zero LLM retention)
Linear	Workspace + team	Workspace membership	Team guidance overrides workspace
Glean	(Mirrors source ACLs)	IdP identity sync	Real-time permission sync
Intercom Fin	Customer record	Identity Verification	Single-agent design with role switching

Memory poisoning as a real architectural concern

The scoping model handles “the same human in different contexts.” A separate question is “the wrong information ending up in the memory at all,” and recent research has shown this isn’t a theoretical worry.

Two threats matter most. First, a low-privilege user poisoning a shared memory pool. The mitigation is straightforward: writes to shared memory require an editorial-level permission, while lower-privilege users only write into their own private pool. Promotion from private to shared is always explicit.
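The first mitigation is simple enough to sketch. Everything here is hypothetical: the `can_edit_shared` flag stands in for whatever editorial-level permission the host platform defines, and the class and method names are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class User:
    name: str
    can_edit_shared: bool = False  # stand-in for an editorial-level permission

@dataclass
class MemoryPools:
    shared: list = field(default_factory=list)
    private: dict = field(default_factory=dict)  # user name -> list of notes

    def remember(self, user, note):
        """Any user may write, but only into their own private pool."""
        self.private.setdefault(user.name, []).append(note)

    def promote(self, approver, owner_name, note):
        """Explicitly copy one private note into the shared pool."""
        if not approver.can_edit_shared:
            raise PermissionError(f"{approver.name} cannot write to shared memory")
        if note not in self.private.get(owner_name, []):
            raise LookupError("note not found in that user's private pool")
        self.shared.append(note)
```

There is no code path from `remember` to the shared pool. The only way in is `promote`, and it checks the approver's permission rather than the original author's.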

Second, and more interesting, agent inferences becoming self-reinforcing false facts over time. An agent infers something plausible but wrong, writes it to memory, retrieves it next session, treats it as established, and accumulates more inferences on top of the false foundation. The class of attack is now formalized in the academic literature. MemoryGraft and MINJA both show how memory injection can persistently corrupt agent behavior across sessions. This isn’t only an academic concern: Google’s Memory Bank documentation explicitly warns that memory poisoning can store false or malicious information that affects future sessions.

The architectural response is a provenance field on every stored memory, recording who or what created it. Values like “user said this directly,” “agent inferred this from a conversation,” “agent extracted this from site content,” “support team curated this,” and “system generated this” each carry different trust levels. Provenance determines two things at retrieval: how much the memory is trusted, and which validator runs against the underlying substrate. A memory about a site option validates against the current option value. A memory about a theme setting validates against the current theme. A content-derived inference stays at low confidence until a user assertion confirms it.

Provenance isn’t optional in this model. It’s the field that decides which validator runs and what the memory’s confidence ceiling is. Systems that store memory without provenance have no principled way to distinguish user assertions, agent inferences, extracted content facts, and curated guidance, which is the kind of ambiguity MemoryGraft and MINJA exploit.
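As a sketch of how a provenance field could drive both decisions, consider the following. The confidence ceilings, the validator assignments, and all names are assumptions chosen for illustration, and `live_state` is a plain dict standing in for the current site options or theme settings.

```python
from dataclasses import dataclass
from enum import Enum

class Provenance(Enum):
    USER_ASSERTED = "user said this directly"
    AGENT_INFERRED = "agent inferred this from a conversation"
    CONTENT_EXTRACTED = "agent extracted this from site content"
    TEAM_CURATED = "support team curated this"
    SYSTEM_GENERATED = "system generated this"

@dataclass
class Memory:
    key: str          # what the memory is about, e.g. an option name
    value: str
    provenance: Provenance
    confidence: float

def always_valid(memory, live_state):
    return True

def matches_live_state(memory, live_state):
    """Valid unless the live value exists and disagrees with the stored one."""
    live = live_state.get(memory.key)
    return live is None or live == memory.value

# Illustrative policy: (confidence ceiling, validator) per provenance value.
POLICY = {
    Provenance.USER_ASSERTED: (1.0, always_valid),
    Provenance.TEAM_CURATED: (0.9, always_valid),
    Provenance.SYSTEM_GENERATED: (0.8, matches_live_state),
    Provenance.CONTENT_EXTRACTED: (0.5, matches_live_state),
    Provenance.AGENT_INFERRED: (0.4, matches_live_state),
}

def effective_confidence(memory, live_state):
    """Run the validator chosen by provenance, then cap by its ceiling."""
    ceiling, validator = POLICY[memory.provenance]
    if not validator(memory, live_state):
        return 0.0  # the underlying substrate no longer agrees
    return min(memory.confidence, ceiling)
```

An agent inference stored with high confidence is still capped at its ceiling, and drops to zero once the live value contradicts it. That is what stops a plausible but wrong inference from being treated as established fact in the next session.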

What this means for WordPress

WordPress is unusually well-positioned for the workspace pattern, because WordPress already has the workspace boundary built in. It’s called a site.

WordPress already has the two primitives the workspace pattern needs. The permission system (site partitioning, post authorship, post statuses, roles, capabilities, and meta-capability checks) answers whether a given user can see a given memory on a given site, using the same checks WordPress already runs for posts. And the core of the compositional key that the industry stores memory against, user × site, is native to WordPress as user ID paired with site ID (blog_id in multisite internals). These aren’t new primitives that need inventing. They’re existing primitives the memory layer should use directly.

It helps to name the layers, because they don’t all share one scope. A global user preference like “I prefer concise responses” is keyed by user ID alone, and is meant to travel with the user across every site they touch. Private memory is keyed by user and site together, so what Grażyna prefers on a client site stays on that client site. Site-shared guidance like brand voice or tone is keyed by site alone, because it’s a property of the site rather than of any one user-on-site pair, and writing to it is gated by an editorial-level capability. Three layers, three different keys, one permission system enforcing all of them.
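The three keys are easy to see side by side in a small sketch. This is illustrative Python rather than WordPress code, the class and method names are invented, and permission checks are left out on purpose so that only the keying is visible.

```python
class LayeredMemory:
    """Three layers, each stored under a different key."""
    def __init__(self):
        self.user_prefs = {}   # user_id -> notes that follow the user everywhere
        self.private = {}      # (user_id, site_id) -> notes for this user on this site
        self.site_shared = {}  # site_id -> guidance that belongs to the site itself

    def context_for(self, user_id, site_id):
        """Assemble what the agent may draw on for one user on one site."""
        return (
            self.site_shared.get(site_id, [])
            + self.private.get((user_id, site_id), [])
            + self.user_prefs.get(user_id, [])
        )

memory = LayeredMemory()
GRAZYNA, OTHER_USER = 7, 8
PERSONAL_BLOG, CLIENT_A = 1, 2

memory.user_prefs[GRAZYNA] = ["Prefers concise responses"]
memory.private[(GRAZYNA, CLIENT_A)] = ["Brand color is dark green"]
memory.site_shared[CLIENT_A] = ["Brand voice: formal"]
```

With this data, Grażyna's preference follows her to the personal blog, her private note stays on the client site, and a different user on the client site sees only the site-shared guidance.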

This is also why the analogy to AWS AgentCore and Google Memory Bank only partially holds. Those systems sit at the infrastructure layer and have to provide their own primitives for tenancy, identity, and access, because the application above them varies. WordPress has the application-level model already, with twenty years of refinement. The memory store should sit on top of that model, not next to it.

The Guidelines feature is the most natural place for this conversation to start. It shipped as a Gutenberg experiment, and the current WordPress AI posture is deliberately incremental: foundation pieces first, opt-in experiments, and open community discussion about whether and where each piece belongs in Core. That posture matters here. The argument isn’t “AI vendors do this, therefore WordPress must.” It’s that this pattern fits the way WordPress is already exploring AI, and that the experiment is a sensible place to test it. The choice to scope memory this way isn’t an idiosyncratic WordPress choice either. It’s the WordPress-shaped version of the choice the rest of the industry has landed on.

Back to Grażyna. When she chats with the agent on her personal travel blog, the memories that conversation produces belong to “Grażyna on her personal blog.” When the same Grażyna switches to a client site she administers, they belong to “Grażyna on that client site.” Two different pools, same person, no bleed. That structure isn’t a WordPress invention. It’s what the workspace pattern says, applied to the WordPress object model.

Closing

The most underappreciated insight from this research is that the hardest parts of memory aren’t model-side. They’re access control, scope composition, audit, and adversarial robustness. The platforms that have shipped serious agent memory keep landing on the same move: don’t reinvent these primitives, delegate to the platform that already has them. The clearest counterexample, Microsoft 365 Copilot, shows what happens when memory sits outside the existing governance system.

For WordPress, the platform that already has these primitives is WordPress itself. The Guidelines experiment is the connective tissue between the workspace boundary the industry has landed on and the authorization model WordPress has spent twenty years refining. Building memory there isn’t a guess about where it should live. It’s the path of least resistance once you take the industry pattern seriously.

The industry comparison was compiled through research conducted with Claude and cross-verified with ChatGPT. The sources are provided inline. The platform feature claims are based on the public documentation available at the time of writing and may change as these systems evolve.

Discover more from Grzegorz Ziółkowski