Architecture
System Layers
┌─────────────────────────────────┐ ┌──────────────────────────────┐
│ Agent Tools │ │ Notebook Skill │
│ memory_search, block_update │ │ ingest / lint / query-file │
├─────────────────────────────────┤ ├──────────────────────────────┤
│ MemoryManager │ │ WikiManager │
│ Orchestration & extraction │ │ Bootstrap schema/index/log │
├─────────────┬───────────────────┤ ├──────────────────────────────┤
│ LanceDB │ MarkdownStore │ │ Obsidian Vault │
│ (search) │ (source of truth)│ │ /mnt/notebook │
├─────────────┴───────────────────┤ └──────────────────────────────┘
│ Session Layer │
│ Transcripts, state, lifecycle │
└─────────────────────────────────┘
        Conversation Memory                     LLM Wiki
Storage Tiers
Markdown Files (/shared/memory/)
Human-readable source of truth. Agent-facing — the agent can read/write these files directly via ReadFileTool/WriteFileTool.
- Daily logs (`YYYY-MM-DD.md`): Append-only, timestamped facts per conversation turn
- `MEMORY.md`: Curated long-term summary, auto-updated by `refresh_core_memory_facts()`
LanceDB (.suzent/memory/)
Vector search index. If corrupted, it can be rebuilt from markdown via `MarkdownIndexer.reindex_from_markdown()`.
- Core memory blocks (persona, user, facts, context)
- Archival memories with vector embeddings
- Hybrid search: semantic + full-text + importance + recency
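Hybrid ranking of this kind is typically a weighted blend of the individual signals. The sketch below is illustrative only: the weights, the 30-day half-life, and the function name are invented here, not taken from `lancedb_store.py`.

```python
import math
import time

def hybrid_score(semantic, fulltext, importance, created_at,
                 w_sem=0.5, w_txt=0.2, w_imp=0.2, w_rec=0.1,
                 half_life_days=30.0, now=None):
    """Blend four signals into one ranking score. semantic, fulltext,
    and importance are assumed normalized to [0, 1]; created_at is a
    Unix timestamp. Weights and half-life are illustrative defaults."""
    now = now or time.time()
    age_days = max(0.0, (now - created_at) / 86400.0)
    # Exponential decay: recency halves every `half_life_days` days.
    recency = math.exp(-math.log(2) * age_days / half_life_days)
    return w_sem * semantic + w_txt * fulltext + w_imp * importance + w_rec * recency
```

With equal semantic, full-text, and importance scores, a freshly created memory outranks a year-old one purely on the recency term.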
Session Data (.suzent/)
Internal operational data. Not agent-facing — accessed via API endpoints.
- Transcripts (`.suzent/transcripts/{id}.jsonl`): Append-only per-session conversation logs
- State snapshots (`.suzent/state/{id}.json`): Inspectable JSON agent state mirrors
Components
MarkdownMemoryStore (markdown_store.py)
Human-readable persistence layer
- Manages files in `/shared/memory/` (cross-session, agent-accessible)
- Daily log format: lean single-line-per-fact (`- [category] content \tag1 tag2`)
- Methods: `append_daily_log()`, `get_recent_logs()`, `write_memory_file()`, `read_memory_file()`
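To make the daily-log contract concrete, here is a minimal sketch of what an `append_daily_log()` could look like, assuming the single-line format above; the real method in `markdown_store.py` may differ in signature and behavior.

```python
from datetime import datetime, timezone
from pathlib import Path

def append_daily_log(memory_dir, category, content, tags=()):
    """Append one fact as `- [category] content \\tag1 tag2` to today's
    YYYY-MM-DD.md file, creating the directory and file on demand.
    Hypothetical signature, for illustration only."""
    memory_dir = Path(memory_dir)
    memory_dir.mkdir(parents=True, exist_ok=True)
    line = f"- [{category}] {content}"
    if tags:
        line += " \\" + " ".join(tags)
    day_file = memory_dir / (datetime.now(timezone.utc).strftime("%Y-%m-%d") + ".md")
    with open(day_file, "a", encoding="utf-8") as f:
        f.write(line + "\n")
    return day_file
```

Append-only writes keep the file git-friendly: each turn adds lines, never rewrites them.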
LanceDBMemoryStore (lancedb_store.py)
Search index layer
- LanceDB connection management
- Core memory block CRUD with scoping priority (chat > user > global)
- Archival memory with vector embeddings
- Semantic and hybrid search
MemoryManager (manager.py)
Orchestration layer
- Dual-write: every extracted fact goes to both LanceDB and markdown
- Automatic LLM-based fact extraction (one concise sentence per fact)
- Deduplication via similarity threshold (0.85)
- Core memory refresh when high-importance facts are found
- Transcript linkage: facts track `source_session_id`, `source_transcript_line`, `source_timestamp`
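The 0.85 deduplication threshold can be pictured as a cosine-similarity gate over candidate embeddings. A sketch; the function and argument names are hypothetical, not the actual `manager.py` API.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def is_duplicate(candidate, existing, threshold=0.85):
    """True if the candidate embedding is at least `threshold` similar
    to any already-stored fact embedding; such facts are skipped."""
    return any(cosine(candidate, v) >= threshold for v in existing)
```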
WikiManager (wiki_manager.py)
LLM Wiki bootstrap
- Initializes the Obsidian-style vault at the notebook root (`/mnt/notebook` in the sandbox)
- Seeds three navigation files on first init: `schema.md` (from `skills/notebook/schema_example.md`), `index.md`, `log.md`
- After bootstrap, the agent owns the vault entirely; `WikiManager` makes no further writes
- Separate from conversation memory: no LanceDB involvement, no automatic extraction
MarkdownIndexer (indexer.py)
Recovery mechanism
- `reindex_from_markdown()`: Rebuilds LanceDB from markdown daily logs
- Parses facts from the markdown format, generates embeddings, stores them in LanceDB
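Rebuilding the index means parsing the lean daily-log lines back into structured facts. A plausible parser for the `- [category] content \tag1 tag2` format (the actual parsing in `indexer.py` may differ):

```python
import re

# One line per fact: "- [category] content \tag1 tag2" (tags optional).
_LOG_LINE = re.compile(r"^- \[(?P<category>[^\]]+)\] (?P<content>.*?)(?: \\(?P<tags>.+))?$")

def parse_log_line(line):
    """Turn one daily-log line into a fact dict, or None if the line
    does not match the expected format."""
    m = _LOG_LINE.match(line.strip())
    if m is None:
        return None
    tags = m.group("tags")
    return {
        "category": m.group("category"),
        "content": m.group("content"),
        "tags": tags.split() if tags else [],
    }
```

Non-matching lines return None rather than raising, so a recovery pass can skip headings or free-form notes without aborting the reindex.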
TranscriptIndexer (indexer.py)
Cross-session search
- Chunks JSONL transcripts (~400 tokens, 80 overlap) into LanceDB
- Enables semantic search across past session conversations
- Opt-in via the `transcript_indexing_enabled` config
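The ~400-token / 80-overlap chunking above can be sketched as a sliding window with stride `size - overlap`. The defaults mirror those numbers; a plain integer list stands in for real tokenizer output.

```python
def chunk_tokens(tokens, size=400, overlap=80):
    """Split a token sequence into windows of `size` tokens, with
    `overlap` tokens shared between consecutive chunks."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # final window already reached the end
    return chunks
```

The overlap keeps a fact that straddles a chunk boundary fully visible in at least one chunk, at the cost of ~20% index redundancy.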
ContextCompressor (context_compressor.py)
Context window management with pre-compaction flush
- Before compressing steps, extracts facts from steps about to be removed
- Feeds a synthetic `ConversationTurn` to `MemoryManager` for dual-write
- Ensures no valuable context is lost when the context window is trimmed
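One way to picture the flush is collapsing the steps about to be removed into a single turn-shaped object that the normal extraction path already understands. This sketch invents a minimal `ConversationTurn` and step shape; the real models live in `models.py` and `context_compressor.py`.

```python
from dataclasses import dataclass

@dataclass
class ConversationTurn:
    # Minimal stand-in for the real model; field names are assumptions.
    user_message: str
    agent_response: str

def build_synthetic_turn(steps):
    """Merge the steps scheduled for compression into one synthetic
    turn so they can flow through the usual fact-extraction pipeline."""
    user_parts = [s["input"] for s in steps if s.get("input")]
    agent_parts = [s["output"] for s in steps if s.get("output")]
    return ConversationTurn(
        user_message="\n".join(user_parts),
        agent_response="\n".join(agent_parts),
    )
```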
Memory Tools (tools.py)
Agent interface
- `MemorySearchTool`: Semantic search across archival memory
- `MemoryBlockUpdateTool`: Update core blocks (replace, append, search_replace)
- Thread-safe via `asyncio.run_coroutine_threadsafe()`
Memory Context (memory_context.py)
Prompt templates
- `FACT_EXTRACTION_SYSTEM_PROMPT`: One concise sentence per fact, no filler
- `CORE_MEMORY_SUMMARIZATION_PROMPT`: Max 200 words, bullet points only
- `format_core_memory_section()`: Includes the Memory Workspace location (`/shared/memory/`)
- `format_retrieved_memories_section()`: Single-line format per memory
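As an illustration of the single-line contract, here is a sketch of what a `format_retrieved_memories_section()` might emit; the heading text and field names are assumptions, not the real template in `memory_context.py`.

```python
def format_retrieved_memories_section(memories):
    """Render retrieved memories one per line under a section heading;
    returns an empty string when there is nothing to inject."""
    if not memories:
        return ""
    lines = ["## Retrieved Memories"]
    for m in memories:
        lines.append(f"- [{m['category']}] {m['content']}")
    return "\n".join(lines)
```

Keeping each memory to one line bounds the prompt-token cost per retrieved item.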
Session Components
TranscriptManager (session/transcript.py)
- Append-only JSONL files at `.suzent/transcripts/{session_id}.jsonl`
- Each line: `{"ts", "role", "content", "actions", "meta"}`
- Thread-safe with async locks per session
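A minimal sketch of the append path, assuming one asyncio lock per session and the line shape listed above; the real `TranscriptManager` API will differ.

```python
import asyncio
import json
import time
from collections import defaultdict
from pathlib import Path

_locks = defaultdict(asyncio.Lock)  # one lock per session_id

async def append_transcript(root, session_id, role, content, actions=None, meta=None):
    """Append one JSON line in the documented shape
    {"ts", "role", "content", "actions", "meta"}, serialized through
    the session's lock so concurrent writers cannot interleave lines."""
    entry = {"ts": time.time(), "role": role, "content": content,
             "actions": actions or [], "meta": meta or {}}
    path = Path(root) / "transcripts" / f"{session_id}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    async with _locks[session_id]:
        with open(path, "a", encoding="utf-8") as f:
            f.write(json.dumps(entry) + "\n")
    return path
```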
StateMirror (session/state_mirror.py)
- Writes inspectable JSON snapshots to `.suzent/state/{session_id}.json`
- Parses JSON v2 agent state; writes a placeholder for the legacy pickle format
SessionLifecycle (session/lifecycle.py)
- `SessionPolicy`: daily reset hour, idle timeout, max turns
- `should_reset()`: checks all policies against session metadata
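The three policies can be sketched as independent checks, any one of which forces a reset. Field names, defaults, and the boundary logic here are illustrative assumptions, not the actual `lifecycle.py` code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class SessionPolicy:
    daily_reset_hour: int = 4                     # local hour of the daily boundary
    idle_timeout: timedelta = timedelta(hours=2)  # max gap between messages
    max_turns: int = 200                          # turn budget per session

def should_reset(policy, last_active_at, turn_count, now):
    """True if any policy fires: turn budget spent, idle too long, or
    the daily reset boundary was crossed since the last activity."""
    if turn_count >= policy.max_turns:
        return True
    if now - last_active_at >= policy.idle_timeout:
        return True
    # Most recent daily boundary at or before `now`.
    boundary = now.replace(hour=policy.daily_reset_hour, minute=0,
                           second=0, microsecond=0)
    if boundary > now:
        boundary -= timedelta(days=1)
    return last_active_at < boundary
```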
Agent Serializer (core/agent_serializer.py)
- JSON v2 format: Human-readable, inspectable agent state
- Backward-compatible: auto-detects and loads legacy pickle format
- Serializes smolagents step objects (ActionStep, PlanningStep, TaskStep, FinalAnswerStep)
Data Flow
Write Path (Extraction + Dual-Write)
Conversation Turn (User + Agent + Response)
↓
manager.process_conversation_turn_for_memories()
↓
LLM extracts concise facts
↓
┌────────────────┬─────────────────────┐
│ LanceDB store │ Markdown daily log │
│ (search index) │ (source of truth) │
└────────────────┴─────────────────────┘
↓
If High Importance → refresh_core_memory_facts() → MEMORY.md
Read Path (Memory Injection)
User Query
↓
manager.retrieve_relevant_memories()
↓
Generate embedding → Hybrid search (LanceDB)
↓
Format results (single-line per memory)
↓
Inject into agent prompt
Pre-Compaction Flush
Context compression triggered
↓
_pre_compaction_flush(steps_to_compress)
↓
Build synthetic ConversationTurn from steps
↓
Feed to memory extraction pipeline (dual-write)
↓
Compress steps into summary
Session Lifecycle
New message arrives
↓
Write JSONL transcript entry
↓
Extract memories (dual-write)
↓
Check compression → pre-compaction flush if needed
↓
Persist state (JSON v2 + StateMirror)
↓
Update last_active_at, turn_count
Memory Scoping
User-Level
- Scope: All chats
- Storage: `user_id="x"`, `chat_id=NULL`
- Use: Preferences, facts, persona
Chat-Level
- Scope: Single conversation
- Storage: `user_id="x"`, `chat_id="y"`
- Use: Current context, session state
Global
- Scope: All users/chats
- Storage: `user_id=NULL`, `chat_id=NULL`
- Use: Default persona
Priority
- Chat-specific (most specific)
- User-level (persistent)
- Global (fallback)
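The priority order amounts to picking the most specific matching block. A sketch using an invented in-memory block shape (dicts with `label`, `user_id`, `chat_id`), not the actual `lancedb_store.py` query:

```python
def resolve_block(blocks, label, user_id, chat_id):
    """Return the highest-priority core-memory block for `label`:
    chat-scoped beats user-scoped beats global; None if no match."""
    def specificity(b):
        if b["user_id"] == user_id and b["chat_id"] == chat_id:
            return 3  # chat-scoped: this user, this conversation
        if b["user_id"] == user_id and b["chat_id"] is None:
            return 2  # user-scoped: this user, all chats
        if b["user_id"] is None and b["chat_id"] is None:
            return 1  # global fallback
        return 0      # belongs to someone else; ignore
    candidates = [b for b in blocks if b["label"] == label and specificity(b) > 0]
    return max(candidates, key=specificity, default=None)
```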
File Structure
src/suzent/memory/
├── __init__.py # Exports
├── lancedb_store.py # LanceDB search index
├── markdown_store.py # Markdown source of truth (/shared/memory/)
├── indexer.py # MarkdownIndexer + TranscriptIndexer
├── manager.py # Orchestration & dual-write
├── wiki_manager.py # LLM Wiki bootstrap (schema/index/log)
├── memory_context.py # Prompt templates
├── tools.py # Agent tools
├── models.py # Pydantic models
└── lifecycle.py # Initialization
skills/notebook/
├── SKILL.md # Notebook skill definition
├── schema_example.md # Default schema seeded into new vaults
├── ingest.md # Ingest procedure
└── lint.md # Lint procedure
src/suzent/session/
├── __init__.py
├── transcript.py # JSONL transcripts
├── state_mirror.py # Agent state snapshots
└── lifecycle.py # Session reset policies
src/suzent/core/
├── agent_serializer.py # JSON v2 agent state
├── context_compressor.py # Pre-compaction flush
└── chat_processor.py # Integration point
src/suzent/routes/
└── session_routes.py # Transcript/state/memory APIs
Data Layout
data/sandbox-data/
shared/ # Cross-session workspace (agent sees /shared/)
memory/
MEMORY.md # Curated long-term memory
2026-02-08.md # Daily append-only logs
/mnt/notebook/ (or $MOUNT_NOTEBOOK) # LLM Wiki vault (Obsidian-style)
schema.md # Vault conventions — agent reads first
index.md # Catalog of synthesized pages
log.md # Append-only operation log
<user notes and folders...>
.suzent/
chats.db # SQLite + session lifecycle fields
memory/ # LanceDB search index
transcripts/
{session_id}.jsonl # Per-session conversation logs
state/
{session_id}.json # Agent state snapshots
Design Principles
- Markdown as Source of Truth - Human-readable, git-friendly, recoverable
- Dual-Write - Every fact persisted to both markdown and LanceDB
- No Context Loss - Pre-compaction flush captures facts before compression
- Inspectable State - JSON v2 replaces opaque pickle; state mirrors on disk
- Separation of Concerns - Agent-facing (`/shared/`) vs internal (`.suzent/`)
- Async by Default - Non-blocking I/O throughout
- Backward Compatible - Legacy pickle deserialization, nullable migrations
- Wiki as Agent-Owned Knowledge - The LLM Wiki vault is bootstrapped once, then entirely maintained by the agent via direct file tool access; the system does not auto-write to it. Pattern from Karpathy's LLM Wiki: raw sources → LLM-maintained wiki → schema conventions