Architecture
System Layers
┌─────────────────────────────────┐
│ Agent Tools │ memory_search, memory_block_update
├─────────────────────────────────┤
│ MemoryManager │ Orchestration & extraction
├─────────────┬───────────────────┤
│ LanceDB │ MarkdownStore │ Dual-write storage
│ (search) │ (source of truth)│
├─────────────┴───────────────────┤
│ Session Layer │ Transcripts, state, lifecycle
└─────────────────────────────────┘
Storage Tiers
Markdown Files (/shared/memory/)
Human-readable source of truth. Agent-facing — the agent can read/write these files directly via ReadFileTool/WriteFileTool.
- Daily logs (
YYYY-MM-DD.md): Append-only, timestamped facts per conversation turn - MEMORY.md: Curated long-term summary, auto-updated by
refresh_core_memory_facts()
LanceDB (.suzent/memory/)
Vector search index. Rebuilt from markdown if corrupted via MarkdownIndexer.reindex_from_markdown().
- Core memory blocks (persona, user, facts, context)
- Archival memories with vector embeddings
- Hybrid search: semantic + full-text + importance + recency
Session Data (.suzent/)
Internal operational data. Not agent-facing — accessed via API endpoints.
- Transcripts (
.suzent/transcripts/{id}.jsonl): Append-only per-session conversation logs - State snapshots (
.suzent/state/{id}.json): Inspectable JSON agent state mirrors
Components
MarkdownMemoryStore (markdown_store.py)
Human-readable persistence layer
- Manages files in
/shared/memory/(cross-session, agent-accessible) - Daily log format: lean single-line-per-fact (
- [category] content \tag1 tag2``) - Methods:
append_daily_log(),get_recent_logs(),write_memory_file(),read_memory_file()
LanceDBMemoryStore (lancedb_store.py)
Search index layer
- LanceDB connection management
- Core memory block CRUD with scoping priority (chat > user > global)
- Archival memory with vector embeddings
- Semantic and hybrid search
MemoryManager (manager.py)
Orchestration layer
- Dual-write: every extracted fact goes to both LanceDB and markdown
- Automatic LLM-based fact extraction (one concise sentence per fact)
- Deduplication via similarity threshold (0.85)
- Core memory refresh when high-importance facts are found
- Transcript linkage: facts track
source_session_id,source_transcript_line,source_timestamp
MarkdownIndexer (indexer.py)
Recovery mechanism
reindex_from_markdown(): Rebuilds LanceDB from markdown daily logs- Parses facts from markdown format, generates embeddings, stores in LanceDB
TranscriptIndexer (indexer.py)
Cross-session search
- Chunks JSONL transcripts (~400 tokens, 80 overlap) into LanceDB
- Enables semantic search across past session conversations
- Opt-in via
transcript_indexing_enabledconfig
ContextCompressor (context_compressor.py)
Context window management with pre-compaction flush
- Before compressing steps, extracts facts from steps about to be removed
- Feeds synthetic
ConversationTurntoMemoryManagerfor dual-write - Ensures no valuable context is lost when the context window is trimmed
Memory Tools (tools.py)
Agent interface
MemorySearchTool: Semantic search across archival memoryMemoryBlockUpdateTool: Update core blocks (replace, append, search_replace)- Thread-safe via
asyncio.run_coroutine_threadsafe()
Memory Context (memory_context.py)
Prompt templates
FACT_EXTRACTION_SYSTEM_PROMPT: One concise sentence per fact, no fillerCORE_MEMORY_SUMMARIZATION_PROMPT: Max 200 words, bullet points onlyformat_core_memory_section(): Includes Memory Workspace location (/shared/memory/)format_retrieved_memories_section(): Single-line format per memory
Session Components
TranscriptManager (session/transcript.py)
- Append-only JSONL files at
.suzent/transcripts/{session_id}.jsonl - Each line:
{"ts", "role", "content", "actions", "meta"} - Thread-safe with async locks per session
StateMirror (session/state_mirror.py)
- Writes inspectable JSON snapshots to
.suzent/state/{session_id}.json - Parses JSON v2 agent state; writes placeholder for legacy pickle
SessionLifecycle (session/lifecycle.py)
SessionPolicy: daily reset hour, idle timeout, max turnsshould_reset(): checks all policies against session metadata
Agent Serializer (core/agent_serializer.py)
- JSON v2 format: Human-readable, inspectable agent state
- Backward-compatible: auto-detects and loads legacy pickle format
- Serializes smolagents step objects (ActionStep, PlanningStep, TaskStep, FinalAnswerStep)
Data Flow
Write Path (Extraction + Dual-Write)
Conversation Turn (User + Agent + Response)
↓
manager.process_conversation_turn_for_memories()
↓
LLM extracts concise facts
↓
┌────────────────┬─────────────────────┐
│ LanceDB store │ Markdown daily log │
│ (search index) │ (source of truth) │
└────────────────┴─────────────────────┘
↓
If High Importance → refresh_core_memory_facts() → MEMORY.md
Read Path (Memory Injection)
User Query
↓
manager.retrieve_relevant_memories()
↓
Generate embedding → Hybrid search (LanceDB)
↓
Format results (single-line per memory)
↓
Inject into agent prompt
Pre-Compaction Flush
Context compression triggered
↓
_pre_compaction_flush(steps_to_compress)
↓
Build synthetic ConversationTurn from steps
↓
Feed to memory extraction pipeline (dual-write)
↓
Compress steps into summary
Session Lifecycle
New message arrives
↓
Write JSONL transcript entry
↓
Extract memories (dual-write)
↓
Check compression → pre-compaction flush if needed
↓
Persist state (JSON v2 + StateMirror)
↓
Update last_active_at, turn_count
Memory Scoping
User-Level
- Scope: All chats
- Storage:
user_id="x", chat_id=NULL - Use: Preferences, facts, persona
Chat-Level
- Scope: Single conversation
- Storage:
user_id="x", chat_id="y" - Use: Current context, session state
Global
- Scope: All users/chats
- Storage:
user_id=NULL, chat_id=NULL - Use: Default persona
Priority
- Chat-specific (most specific)
- User-level (persistent)
- Global (fallback)
File Structure
src/suzent/memory/
├── __init__.py # Exports
├── lancedb_store.py # LanceDB search index
├── markdown_store.py # Markdown source of truth (/shared/memory/)
├── indexer.py # MarkdownIndexer + TranscriptIndexer
├── manager.py # Orchestration & dual-write
├── memory_context.py # Prompt templates
├── tools.py # Agent tools
├── models.py # Pydantic models
└── lifecycle.py # Initialization
src/suzent/session/
├── __init__.py
├── transcript.py # JSONL transcripts
├── state_mirror.py # Agent state snapshots
└── lifecycle.py # Session reset policies
src/suzent/core/
├── agent_serializer.py # JSON v2 agent state
├── context_compressor.py # Pre-compaction flush
└── chat_processor.py # Integration point
src/suzent/routes/
└── session_routes.py # Transcript/state/memory APIs
Data Layout
data/sandbox-data/
shared/ # Cross-session workspace (agent sees /shared/)
memory/
MEMORY.md # Curated long-term memory
2026-02-08.md # Daily append-only logs
.suzent/
chats.db # SQLite + session lifecycle fields
memory/ # LanceDB search index
transcripts/
{session_id}.jsonl # Per-session conversation logs
state/
{session_id}.json # Agent state snapshots
Design Principles
- Markdown as Source of Truth - Human-readable, git-friendly, recoverable
- Dual-Write - Every fact persisted to both markdown and LanceDB
- No Context Loss - Pre-compaction flush captures facts before compression
- Inspectable State - JSON v2 replaces opaque pickle; state mirrors on disk
- Separation of Concerns - Agent-facing (
/shared/) vs internal (.suzent/) - Async by Default - Non-blocking I/O throughout
- Backward Compatible - Legacy pickle deserialization, nullable migrations