Couchbase brings persistent agent memory to the edge | VentureBeat

The competitive edge in enterprise AI is shifting to context: which platform can give an agent the right memory, the right retrieval and the right data at the moment of decision.

Couchbase on Tuesday announced its AI Data Plane, combining persistent agent memory, real-time context retrieval and an enterprise-managed MCP server in a single operational platform. 

Couchbase’s roots are in caching and high-transaction databases — an architecture the company argues makes it better suited for agent memory than vendors that came to the problem from search or analytics. The AI Data Plane runs identically across cloud, on-premises and disconnected edge environments, extending agent memory and local vector search to devices with no network connection.

“How do you make sure that the intelligence that you get out of these models are the ones that databases specialize in?” Gopi Duddi, CTO at Couchbase, told VentureBeat. “How can you get that value out of storage systems, which are still going to be databases?”

What the AI Data Plane delivers

The AI Data Plane packages three components designed to replace the fragmented stacks most enterprises are currently running.

Agent memory: A unified persistence layer for conversational context, structured operational data and vector embeddings. Couchbase says the guardrails are what distinguish it from standalone memory services: token constraints per session, time-to-live limits on stored memories and metering controls that cap compute consumption per agent session.

Enterprise MCP server: An enterprise-supported, self-managed server for standardized Model Context Protocol (MCP) integration, shipping as part of the platform rather than requiring a separate service.

Agent catalog: A function-level catalog of discoverable agent tooling built by Couchbase. Duddi distinguished it from metadata catalogs like Databricks Unity or AWS Glue — describing it, in his words, as closer to a glorified MCP that surfaces agent functions as callable tools within the platform.
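To make the agent memory guardrails described above more concrete, here is a minimal Python sketch of a per-session token cap and a time-to-live on stored memories. It is an illustration only and does not use Couchbase's API: the SessionMemory class, its parameters and the whitespace-based token count are all assumptions made for this example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class SessionMemory:
    """Hypothetical per-session memory store with two guardrails."""

    max_tokens: int      # token budget for the whole session
    ttl_seconds: float   # how long a stored memory stays valid
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    # Each item is (stored_at, token_cost, text).
    _items: List[Tuple[float, int, str]] = field(default_factory=list, init=False)

    def _evict_expired(self) -> None:
        # Drop every memory whose age has reached the time-to-live.
        now = self.clock()
        self._items = [
            (t, n, s) for (t, n, s) in self._items if now - t < self.ttl_seconds
        ]

    def tokens_used(self) -> int:
        self._evict_expired()
        return sum(n for _, n, _ in self._items)

    def remember(self, text: str) -> bool:
        """Store text if it fits the budget; return False if it is rejected."""
        cost = len(text.split())  # crude whitespace token count (assumption)
        if self.tokens_used() + cost > self.max_tokens:
            return False
        self._items.append((self.clock(), cost, text))
        return True

    def recall(self) -> List[str]:
        self._evict_expired()
        return [s for _, _, s in self._items]
```

In this sketch a write that would exceed the session budget is simply refused, and expired memories stop counting against the budget. A production system would likely meter compute as well, which is not modeled here.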

Memory-first architecture takes agent context to the disconnected edge

Duddi says Couchbase's lineage and core architecture are what give it an edge when it comes to context.

“We were a cache before we became a database,” Duddi said.

Writing to memory is 10x faster than writing to disk, Duddi said — a speed advantage he argues separates Couchbase from NoSQL databases that layer memory workloads on top of disk-based storage.

Couchbase isn’t the only data technology with roots in a caching layer. Redis is similarly rooted in caching and also recently announced an agentic AI context layer. Duddi argued that Couchbase is different in that it maintains an ACID-compliant (atomicity, consistency, isolation and durability) database, which matters for transactional workloads. Couchbase also has a long history across multiple deployment models.

That architecture extends to the edge through Couchbase Lite, the platform’s on-device runtime. It runs SQL, full-text search and vector search locally without a network connection, using a proprietary sync mechanism to replicate bidirectionally back to cloud or between edge nodes when connectivity returns. The target environments are retail floor operations, field service, industrial deployments and regulated settings where agent data cannot leave the device.

Duddi cited hotel reservations as an early example: multiple agents serving customers concurrently, each pulling local context and running vector search on-device, with shared session memory synchronizing centrally. The practical benefit is token efficiency. Rather than every agent independently retrieving and processing the same data, the platform caches shared context so concurrent sessions draw on it without burning tokens repeatedly.
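The token-efficiency idea can be sketched in a few lines of Python. This is a hedged illustration of shared-context caching in general, not Couchbase's implementation: the SharedContextCache class, the retrieve callback and the key format are hypothetical names chosen for the example.

```python
import threading
from typing import Callable, Dict


class SharedContextCache:
    """Hypothetical cache shared by concurrent agent sessions."""

    def __init__(self, retrieve: Callable[[str], str]) -> None:
        self._retrieve = retrieve   # the expensive path, e.g. retrieval plus a model call
        self._cache: Dict[str, str] = {}
        self._lock = threading.Lock()
        self.retrievals = 0         # how many times the expensive path actually ran

    def get(self, key: str) -> str:
        # Hold the lock for the whole lookup so each key is retrieved at most once,
        # even when several sessions ask for it at the same time.
        with self._lock:
            if key not in self._cache:
                self.retrievals += 1
                self._cache[key] = self._retrieve(key)
            return self._cache[key]
```

If three sessions ask for the same key, the retrieve callback runs once and the other two read the cached result. Holding one lock across the retrieval keeps the sketch simple at the cost of serializing lookups for different keys.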

Agora’s view from production

Agora, a platform that helps developers embed real-time voice, video and conversational AI into enterprise applications, has run Couchbase in production since February 2024.

The initial use case was its Signaling product, managing channel setup and state synchronization for live calls. Expanding into conversational AI agents brought stricter requirements: memory-first architecture, full JSON support for storage and query, cross-datacenter replication for high availability and enterprise-grade vendor support.

“Couchbase was the best fit based on these criteria,” Patrick Ferriter, SVP of Product at Agora, told VentureBeat.

Agora is now extending that relationship to support context retrieval for conversational AI agents.

“This will simplify the architecture and deliver enterprise grade RAG with predictable lower latency required for conversational AI use cases,” Ferriter said.

For data professionals trying to figure out the best approach to context, there is no single answer. On platform selection, Ferriter was direct.

“It depends on the preference and goals of the organization, including timing,” Ferriter said. “If they want something enterprise grade and optimal for immediate production and scale vs. having to optimize and maintain an open-source solution with community support. We wanted the former and that is why we looked at an expanded partnership with Couchbase.”

Competitive context: following the right trend

The context layer has become a crowded space in 2025.

Oracle added a memory core to its database in March to provide a context layer. Redis added a context layer in May, as did vector-native database vendor Pinecone.

“Couchbase is following this trend, not setting it, but it’s the right one to follow,” Devin Pratt, Research Director for AI, Automation, Data and Analytics at IDC, told VentureBeat. “Its real edge is reach, running the same platform from cloud to edge to mobile, which is how enterprises actually operate. The test now is to scale against bigger names.”

For teams navigating the vendor landscape, Pratt’s framing is direct. “Match the tool to the workload. Consolidate where it makes sense, use a specialized engine like a graph database where relationship-heavy reasoning earns it, and let governance drive the call rather than treating memory as plumbing,” Pratt said.
