Code Lib
Updated a month ago

memory-systems

muratcankoylan
7.4k
muratcankoylan/Agent-Skills-for-Context-Engineering/skills/memory-systems
Agent Score: 82

💡 Summary

A comprehensive conceptual guide to designing and implementing layered memory architectures for AI agents, covering everything from fundamentals to advanced temporal knowledge graphs.

🎯 Who It's For

  • AI agent architects
  • Machine learning engineers building persistent agents
  • Researchers working on agent memory systems
  • Technical project managers for AI products

🤖 AI Snark: This is an excellent textbook chapter that forgot to include an actual runnable library or code.

Security Analysis: Medium Risk

The guide discusses patterns involving file-system and database access. Specific risks include: 1) unsafe file operations in the "file-system-as-memory" pattern could enable path traversal attacks; 2) injection risks when storing and querying unstructured data in vector/graph databases. Mitigations: enforce strict input validation and parameterized queries in any data storage layer, and sanitize all file paths.


name: memory-systems
description: This skill should be used when the user asks to "implement agent memory", "persist state across sessions", "build knowledge graph", "track entities", or mentions memory architecture, temporal knowledge graphs, vector stores, entity memory, or cross-session persistence.

Memory System Design

Memory provides the persistence layer that allows agents to maintain continuity across sessions and reason over accumulated knowledge. Simple agents rely entirely on context for memory, losing all state when sessions end. Sophisticated agents implement layered memory architectures that balance immediate context needs with long-term knowledge retention. The evolution from vector stores to knowledge graphs to temporal knowledge graphs represents increasing investment in structured memory for improved retrieval and reasoning.

When to Activate

Activate this skill when:

  • Building agents that must persist across sessions
  • Needing to maintain entity consistency across conversations
  • Implementing reasoning over accumulated knowledge
  • Designing systems that learn from past interactions
  • Creating knowledge bases that grow over time
  • Building temporal-aware systems that track state changes

Core Concepts

Memory exists on a spectrum from immediate context to permanent storage. At one extreme, working memory in the context window provides zero-latency access but vanishes when sessions end. At the other extreme, permanent storage persists indefinitely but requires retrieval to enter context.

Simple vector stores lack relationship and temporal structure. Knowledge graphs preserve relationships for reasoning. Temporal knowledge graphs add validity periods for time-aware queries. Implementation choices depend on query complexity, infrastructure constraints, and accuracy requirements.

Detailed Topics

Memory Architecture Fundamentals

The Context-Memory Spectrum
Memory exists on a spectrum from immediate context to permanent storage. At one extreme, working memory in the context window provides zero-latency access but vanishes when sessions end. At the other extreme, permanent storage persists indefinitely but requires retrieval to enter context. Effective architectures use multiple layers along this spectrum.

The spectrum includes working memory (context window, zero latency, volatile), short-term memory (session-persistent, searchable, volatile), long-term memory (cross-session persistent, structured, semi-permanent), and permanent memory (archival, queryable, permanent). Each layer has different latency, capacity, and persistence characteristics.
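As a rough sketch, the four layers can be written down as data. The layer names come from the guide; the latency, capacity, and persistence notes are paraphrases of the descriptions above, not measurements:

```python
from dataclasses import dataclass

# Illustrative model of the context-memory spectrum described above.
@dataclass(frozen=True)
class MemoryLayer:
    name: str
    latency: str
    capacity: str
    persistence: str

SPECTRUM = [
    MemoryLayer("working", "zero (already in context)", "context window", "volatile, gone at session end"),
    MemoryLayer("short-term", "low, searchable", "session-sized", "volatile, session-scoped"),
    MemoryLayer("long-term", "one retrieval step", "grows over time", "semi-permanent, cross-session"),
    MemoryLayer("permanent", "archival query", "unbounded", "permanent"),
]
```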

Why Simple Vector Stores Fall Short
Vector RAG provides semantic retrieval by embedding queries and documents in a shared embedding space. Similarity search retrieves the most semantically similar documents. This works well for document retrieval but lacks structure for agent memory.

Vector stores lose relationship information. If an agent learns that "Customer X purchased Product Y on Date Z," a vector store can retrieve this fact if asked directly. But it cannot answer "What products did customers who purchased Product Y also buy?" because relationship structure is not preserved.

Vector stores also struggle with temporal validity. Facts change over time, but vector stores provide no mechanism to distinguish "current fact" from "outdated fact" except through explicit metadata and filtering.

The Move to Graph-Based Memory
Knowledge graphs preserve relationships between entities. Instead of isolated document chunks, graphs encode that Entity A has Relationship R to Entity B. This enables queries that traverse relationships rather than just similarity.

Temporal knowledge graphs add validity periods to facts. Each fact has a "valid from" and optionally "valid until" timestamp. This enables time-travel queries that reconstruct knowledge at specific points in time.
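A minimal sketch of why relationship structure matters: with facts stored as (subject, relation, object) triples, the "customers who purchased Product Y also bought" query from above becomes a two-hop traversal. All identifiers here are illustrative:

```python
from collections import defaultdict

# Toy triple store: (subject, relation, object).
edges = [
    ("customer_x", "purchased", "product_y"),
    ("customer_x", "purchased", "product_z"),
    ("customer_w", "purchased", "product_y"),
    ("customer_w", "purchased", "product_q"),
]

buyers_of = defaultdict(set)   # object  -> subjects
purchases = defaultdict(set)   # subject -> objects
for s, r, o in edges:
    if r == "purchased":
        buyers_of[o].add(s)
        purchases[s].add(o)

def also_bought(product: str) -> set[str]:
    """Hop 1: who bought `product`? Hop 2: what else did they buy?"""
    return {p for b in buyers_of[product] for p in purchases[b]} - {product}

print(also_bought("product_y"))  # {'product_z', 'product_q'} (set order varies)
```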

Benchmark Performance Comparison
The Deep Memory Retrieval (DMR) benchmark provides concrete performance data across memory architectures:

| Memory System | DMR Accuracy | Retrieval Latency | Notes |
|---------------|--------------|-------------------|-------|
| Zep (Temporal KG) | 94.8% | 2.58s | Best accuracy, fast retrieval |
| MemGPT | 93.4% | Variable | Good general performance |
| GraphRAG | ~75-85% | Variable | 20-35% gains over baseline RAG |
| Vector RAG | ~60-70% | Fast | Loses relationship structure |
| Recursive Summarization | 35.3% | Low | Severe information loss |

Zep demonstrated a 90% reduction in retrieval latency compared to full-context baselines (2.58s vs 28.9s for the full-context baseline). This efficiency comes from retrieving only relevant subgraphs rather than entire context history.

GraphRAG achieves approximately 20-35% accuracy gains over baseline RAG in complex reasoning tasks and reduces hallucination by up to 30% through community-based summarization.

Memory Layer Architecture

Layer 1: Working Memory
Working memory is the context window itself. It provides immediate access to information currently being processed but has limited capacity and vanishes when sessions end.

Working memory usage patterns include scratchpad calculations where agents track intermediate results, conversation history that preserves dialogue for current task, current task state that tracks progress on active objectives, and active retrieved documents that hold information currently being used.

Optimize working memory by keeping only active information, summarizing completed work before it falls out of attention, and using attention-favored positions for critical information.
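One way to make "summarizing completed work before it falls out of attention" concrete is a bounded scratchpad that folds its oldest entries into a summary. A sketch, where summarize() is a placeholder for what would be an LLM call in practice:

```python
# Placeholder for an LLM summarization call.
def summarize(entries: list[str]) -> str:
    return f"[summary of {len(entries)} earlier steps]"

class Scratchpad:
    def __init__(self, max_entries: int = 8, chunk: int = 4):
        self.max_entries = max_entries
        self.chunk = chunk          # how many old entries to fold per pass
        self.entries: list[str] = []

    def add(self, note: str) -> None:
        self.entries.append(note)
        if len(self.entries) > self.max_entries:
            # Compress the oldest entries instead of silently dropping them.
            head, self.entries = self.entries[:self.chunk], self.entries[self.chunk:]
            self.entries.insert(0, summarize(head))
```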

Layer 2: Short-Term Memory
Short-term memory persists across the current session but not across sessions. It provides search and retrieval capabilities without the latency of permanent storage.

Common implementations include session-scoped databases that persist until session end, file-system storage in designated session directories, and in-memory caches keyed by session ID.

Short-term memory use cases include tracking conversation state across turns without stuffing context, storing intermediate results from tool calls that may be needed later, maintaining task checklists and progress tracking, and caching retrieved information within sessions.
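A minimal sketch of the "in-memory cache keyed by session ID" option; class and method names are illustrative, not a specific library's API:

```python
# Session-scoped store: searchable within the session, discarded when it ends.
class SessionMemory:
    def __init__(self) -> None:
        self._sessions: dict[str, dict[str, object]] = {}

    def put(self, session_id: str, key: str, value: object) -> None:
        self._sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id: str, key: str, default=None):
        return self._sessions.get(session_id, {}).get(key, default)

    def end_session(self, session_id: str) -> None:
        self._sessions.pop(session_id, None)  # volatility is the point

mem = SessionMemory()
mem.put("sess-1", "checklist", ["parse input", "call tool", "summarize"])
mem.put("sess-1", "tool_cache:weather", {"temp_c": 21})
```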

Layer 3: Long-Term Memory
Long-term memory persists across sessions indefinitely. It enables agents to learn from past interactions and build knowledge over time.

Long-term memory implementations range from simple key-value stores to sophisticated graph databases. The choice depends on complexity of relationships to model, query patterns required, and acceptable infrastructure complexity.

Long-term memory use cases include learning user preferences across sessions, building domain knowledge bases that grow over time, maintaining entity registries with relationship history, and storing successful patterns that can be reused.
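At the simple end of that range, a sketch of a SQLite-backed key-value store (stdlib only, with parameterized queries as the security analysis above recommends); the schema and names are illustrative:

```python
import json
import sqlite3

# Durable key-value memory: survives process restarts.
class LongTermMemory:
    def __init__(self, path: str = "agent_memory.db"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory (key TEXT PRIMARY KEY, value TEXT)"
        )

    def remember(self, key: str, value) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO memory VALUES (?, ?)", (key, json.dumps(value))
        )
        self.conn.commit()

    def recall(self, key: str, default=None):
        row = self.conn.execute(
            "SELECT value FROM memory WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else default

ltm = LongTermMemory()
ltm.remember("user:alice:preferences", {"tone": "concise"})
print(ltm.recall("user:alice:preferences"))  # available in future sessions
```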

Layer 4: Entity Memory
Entity memory specifically tracks information about entities (people, places, concepts, objects) to maintain consistency. This creates a rudimentary knowledge graph where entities are recognized across multiple interactions.

Entity memory maintains entity identity by tracking that "John Doe" mentioned in one conversation is the same person in another. It maintains entity properties by storing facts discovered about entities over time. It maintains entity relationships by tracking relationships between entities as they are discovered.
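A rudimentary sketch covering those three responsibilities (identity, properties, relationships). Identity resolution here is naive lowercase alias matching; a real system would need proper entity resolution:

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    entity_id: str
    aliases: set[str] = field(default_factory=set)
    properties: dict[str, str] = field(default_factory=dict)
    relations: list[tuple[str, str]] = field(default_factory=list)  # (relation, other_id)

class EntityMemory:
    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.alias_index: dict[str, str] = {}  # surface mention -> entity_id

    def resolve(self, mention: str) -> Entity:
        """Map a surface mention to one entity, registering it if unseen."""
        eid = self.alias_index.setdefault(
            mention.lower(), mention.lower().replace(" ", "_")
        )
        entity = self.entities.setdefault(eid, Entity(eid))
        entity.aliases.add(mention)
        return entity

mem = EntityMemory()
john = mem.resolve("John Doe")          # same entity across conversations
john.properties["role"] = "customer"
john.relations.append(("purchased", mem.resolve("Product Y").entity_id))
```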

Layer 5: Temporal Knowledge Graphs
Temporal knowledge graphs extend entity memory with explicit validity periods. Facts are not just true or false but true during specific time ranges.

This enables queries like "What was the user's address on Date X?" by retrieving facts valid during that date range. It prevents context clash when outdated information contradicts new data. It enables temporal reasoning about how entities changed over time.
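A sketch of facts with validity windows and the "address on Date X" query; field names are illustrative:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TemporalFact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_until: date | None = None  # None = still valid today

facts = [
    TemporalFact("user_1", "address", "12 Oak St", date(2021, 1, 1), date(2023, 6, 30)),
    TemporalFact("user_1", "address", "99 Pine Ave", date(2023, 7, 1)),
]

def as_of(subject: str, predicate: str, when: date) -> list[str]:
    """Time-travel query: reconstruct what was true on a given date."""
    return [
        f.obj for f in facts
        if f.subject == subject and f.predicate == predicate
        and f.valid_from <= when
        and (f.valid_until is None or when <= f.valid_until)
    ]

print(as_of("user_1", "address", date(2022, 5, 1)))  # ['12 Oak St']
```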

Memory Implementation Patterns

Pattern 1: File-System-as-Memory
The file system itself can serve as a memory layer. This pattern is simple, requires no additional infrastructure, and enables the same just-in-time loading that makes file-system-based context effective.

Implementation uses the file system hierarchy for organization. Use naming conventions that convey meaning. Store facts in structured formats (JSON, YAML). Use timestamps in filenames or metadata for temporal tracking.

Advantages: Simplicity, transparency, portability. Disadvantages: No semantic search, no relationship tracking, manual organization required.
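A minimal sketch of the pattern, assuming an illustrative memory/<topic>/<timestamp>.json layout (the layout and helper names are not from the original guide). Note the path sanitization, per the security analysis above:

```python
import json
import re
import time
from pathlib import Path

MEMORY_ROOT = Path("memory")  # memory/<topic>/<timestamp>.json

def safe_name(name: str) -> str:
    """Sanitize a path component to block traversal attacks."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]", "_", name)
    if not cleaned:
        raise ValueError(f"unusable name: {name!r}")
    return cleaned

def store_fact(topic: str, fact: dict) -> Path:
    path = MEMORY_ROOT / safe_name(topic) / f"{time.time_ns()}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(fact, indent=2))
    return path

def load_facts(topic: str) -> list[dict]:
    folder = MEMORY_ROOT / safe_name(topic)
    return [json.loads(p.read_text()) for p in sorted(folder.glob("*.json"))]

store_fact("user_preferences", {"tone": "concise", "recorded": "2024-03-01"})
```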

Pattern 2: Vector RAG with Metadata
Vector stores enhanced with rich metadata provide semantic search with filtering capabilities.

Implementation embeds facts or documents and stores with metadata including entity tags, temporal validity, source attribution, and confidence scores. Query includes metadata filters alongside semantic search.
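A toy sketch of the pattern. embed() is a stand-in for a real embedding model (stable within one process run only), and the records carry the metadata fields named above: entity tags, validity, source attribution, and confidence:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy deterministic-per-run embedding; substitute any real model.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

records = [
    {"text": "User prefers dark mode", "entities": ["user_1"],
     "valid": True, "source": "chat-2024-03-01", "confidence": 0.9},
    {"text": "User prefers light mode", "entities": ["user_1"],
     "valid": False, "source": "chat-2022-01-15", "confidence": 0.9},
]
matrix = np.stack([embed(r["text"]) for r in records])

def query(text: str, entity: str | None = None, only_valid: bool = True, k: int = 5):
    scores = matrix @ embed(text)  # cosine similarity on unit vectors
    ranked = sorted(range(len(records)), key=lambda i: -scores[i])
    # Metadata filters applied alongside semantic ranking.
    return [records[i] for i in ranked
            if (not only_valid or records[i]["valid"])
            and (entity is None or entity in records[i]["entities"])][:k]

print(query("what theme does the user like?", entity="user_1"))
```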

Pattern 3: Knowledge Graph
Knowledge graphs explicitly model entities and relationships. Implementation defines entity types and relationship types, uses graph database or property graph storage, and maintains indexes for common query patterns.
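A property-graph sketch using networkx as one possible in-process backend (a graph database is another); entity and relation names are illustrative:

```python
import networkx as nx

# Typed nodes plus keyed (typed) edges form a property graph.
G = nx.MultiDiGraph()
G.add_node("customer_x", type="Customer")
G.add_node("product_y", type="Product")
G.add_node("product_z", type="Product")
G.add_edge("customer_x", "product_y", key="purchased", date="2024-03-01")
G.add_edge("customer_x", "product_z", key="purchased", date="2024-03-05")

# Traverse relationships instead of relying on similarity search:
also_bought = {
    p
    for buyer, _, rel in G.in_edges("product_y", keys=True) if rel == "purchased"
    for _, p, rel2 in G.out_edges(buyer, keys=True)
    if rel2 == "purchased" and p != "product_y"
}
print(also_bought)  # {'product_z'}
```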

Pattern 4: Temporal Knowledge Graph
Temporal knowledge graphs add validity periods to facts, enabling time-travel queries and preventing context clash from outdated information.
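A sketch of the write-path discipline that makes this work: a new fact closes the old fact's validity window rather than deleting it, which prevents context clash while keeping time-travel queries possible. The Fact type and names are illustrative:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_until: date | None = None

def assert_fact(store: list[Fact], subject: str, predicate: str, obj: str, when: date) -> None:
    """Supersede, don't delete: close the open window on any prior fact."""
    for f in store:
        if (f.subject, f.predicate) == (subject, predicate) and f.valid_until is None:
            f.valid_until = when  # old fact stays queryable for time-travel
    store.append(Fact(subject, predicate, obj, valid_from=when))

store: list[Fact] = []
assert_fact(store, "user_1", "address", "12 Oak St", date(2021, 1, 1))
assert_fact(store, "user_1", "address", "99 Pine Ave", date(2023, 7, 1))
# The old address is now bounded rather than deleted: no context clash.
```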

Memory Retrieval Patterns

Semantic Retrieval
Retrieve memories semantically similar to the current query using embedding similarity search.

Entity-Based Retrieval
Retrieve all memories related to specific entities by traversing graph relationships.

Temporal Retrieval
Retrieve memories valid at a specific time or within a time range using validity period filters.
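The three patterns can be sketched as different filters and rankings over one fact store. Here score() is a crude lexical stand-in for embedding similarity, and the dict field names are illustrative:

```python
from datetime import date

def score(query: str, text: str) -> float:
    # Word-overlap proxy for semantic similarity; swap in real embeddings.
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / max(len(q), 1)

def semantic(store: list[dict], query: str, k: int = 3) -> list[dict]:
    return sorted(store, key=lambda f: -score(query, f["text"]))[:k]

def by_entity(store: list[dict], entity_id: str) -> list[dict]:
    return [f for f in store if entity_id in f["entities"]]

def valid_at(store: list[dict], when: date) -> list[dict]:
    return [f for f in store
            if f["valid_from"] <= when
            and (f["valid_until"] is None or when <= f["valid_until"])]
```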

Memory Consolidation

Memories accumulate over time and require consolidation to prevent unbounded growth and remove outdated information.

Consolidation Triggers
Trigger consolidation after significant memory accumulation or when retrieval quality begins to degrade.
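A sketch of one consolidation pass, assuming facts carry subject tags and validity windows as in the earlier sketches (the per-subject cap is an illustrative threshold):

```python
from collections import defaultdict

MAX_CLOSED_PER_SUBJECT = 50  # illustrative threshold

def consolidate(store: list[dict]) -> list[dict]:
    """One pass: drop exact duplicates, keep every currently-valid fact,
    and cap how many superseded facts are retained per subject."""
    seen, kept = set(), []
    closed = defaultdict(int)
    for f in sorted(store, key=lambda f: f["valid_from"], reverse=True):
        key = (f["subject"], f["predicate"], f["obj"])
        if key in seen:
            continue                      # exact duplicate
        seen.add(key)
        if f["valid_until"] is not None:  # outdated fact
            closed[f["subject"]] += 1
            if closed[f["subject"]] > MAX_CLOSED_PER_SUBJECT:
                continue                  # drop/archive the oldest leftovers
        kept.append(f)
    return kept
```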

Five-Dimension Analysis
Clarity: 8/10
Innovation: 7/10
Practicality: 9/10
Completeness: 9/10
Maintainability: 8/10
Pros and Cons

Pros

  • Extremely thorough conceptual breakdown of the memory layers.
  • Provides clear implementation patterns and benchmarks.
  • Well suited to architecture decisions and planning.

Cons

  • Purely conceptual, with no executable code or tooling.
  • Lacks concrete, copy-pasteable implementation examples.
  • Assumes substantial infrastructure knowledge on the reader's part.

Related Skills

pytorch
S · Tool · Code Lib · 92/100

"It's the Swiss Army knife of deep learning, but good luck finding, among 47 installation methods, the one that won't wreck your system."

agno
S · Tool · Code Lib · 90/100

"It promises to be the Kubernetes of the agent world, but that depends on developers having the patience to learn yet another orchestration layer."

nuxt-skills
S · Tool · Co-Pilot · 90/100

"Essentially a well-organized cheat sheet that turns your AI assistant into a parrot for the Nuxt framework."

Disclaimer: This content is sourced from an open-source GitHub project and is presented for display and rating analysis only.

Copyright remains with the original author, muratcankoylan.