Update docs

Author: HyunjunJeon
Date: 2026-01-13 18:57:35 +09:00
Commit: 6a4f3dcdd9 (parent fa67114806)
6 changed files with 631 additions and 144 deletions

AGENTS.md

# PROJECT KNOWLEDGE BASE
## Project Structure & Module Organization
- `research_agent/` contains the core Python agents, prompts, tools, and subagent utilities.
- `skills/` holds project-level skills as `SKILL.md` files (YAML frontmatter + instructions).
- `research_workspace/` is the agents working filesystem for generated outputs; keep it clean or example-only.
- `deep-agents-ui/` is the Next.js/React UI with source under `deep-agents-ui/src/`.
- `deepagents_sourcecode/` vendors upstream library sources for reference and comparison.
- `rust-research-agent/` is a standalone Rust tutorial agent with its own build/test flow.
- `langgraph.json` defines the LangGraph deployment entrypoint for the research agent.
**Generated:** 2026-01-13
---
## OVERVIEW
Multi-agent research system demonstrating **FileSystem-based Context Engineering** using LangChain's DeepAgents framework. Includes Python orchestrator, Rust port via Rig framework, and Next.js chat UI.
---
## STRUCTURE
```
/
  research_agent/                        # Python DeepAgent orchestrator (see AGENTS.md)
  context_engineering_research_agent/    # Extended agent with 5 strategies
  deep-agents-ui/                        # Next.js React frontend (see AGENTS.md)
  rust-research-agent/                   # Rust implementation (see AGENTS.md)
    rig-deepagents/                      # Pregel-based middleware runtime
    rig-rlm/                             # Recursive Language Model agent
  tests/                                 # pytest test suite (see AGENTS.md)
  skills/                                # Project-level skills (SKILL.md per skill)
  research_workspace/                    # Agent output directory (ephemeral)
  deepagents_sourcecode/                 # Vendor: upstream library reference
```
## Coding Style & Naming Conventions
- Python: follow ruff defaults and Google-style docstrings (see `pyproject.toml`); prefer `snake_case` modules and functions.
- TypeScript/React: keep `PascalCase` for components, `camelCase` for hooks/utilities; rely on ESLint + Prettier (Tailwind plugin).
- Skill definitions: keep one skill per directory with a `SKILL.md` entrypoint and clear, task-focused naming.
---
## WHERE TO LOOK
| Task | Location | Notes |
|------|----------|-------|
| Modify orchestrator | `research_agent/agent.py` | SubAgent assembly, tools, middleware |
| Add research tool | `research_agent/tools.py` | `tavily_search`, `think_tool` |
| Autonomous researcher logic | `research_agent/researcher/` | Three-phase workflow |
| Context strategies | `context_engineering_research_agent/context_strategies/` | 5 patterns |
| Frontend components | `deep-agents-ui/src/app/components/` | Chat UI |
| Rust Pregel runtime | `rust-research-agent/rig-deepagents/src/pregel/` | Graph execution |
| Rust middleware | `rust-research-agent/rig-deepagents/src/middleware/` | Tool injection |
| Add new skill | `skills/{skill-name}/SKILL.md` | YAML frontmatter + instructions |
---
## CONVENTIONS
### Deviations from Standard Patterns
- **Backend factory pattern**: Always use `backend_factory(rt: ToolRuntime)` - middleware depends on this signature
- **SubAgent naming**: Use `researcher`, `explorer`, `synthesizer` - hardcoded in prompts
- **File paths**: Paths starting with "/" route to `research_workspace/`; others are in-memory
- **Korean comments**: Docstrings and some comments in Korean (bilingual codebase)
---
## ANTI-PATTERNS
- **DO NOT** commit `.env` files (contains API keys)
- **DO NOT** instantiate researcher directly - use `get_researcher_subagent()`
- **DO NOT** skip `think_tool()` between searches - explicit reflection required
- **NEVER** write raw JSON to user - always format responses (see `deepagents_cli/tools.py`)
- **NEVER** lie to exit early - complete TODO items fully (see `researcher/runner.py`)
---
## COMMANDS
### Python Development
```bash
uv sync # Install dependencies
langgraph dev # Start backend (port 2024)
uv run ruff format . # Format code
uv run ruff check . # Lint
uv run mypy . # Type check
uv run pytest tests/ # Run tests
```
### Frontend Development
```bash
cd deep-agents-ui
yarn install && yarn dev # Dev server (port 3000)
yarn build # Production build
yarn lint && yarn format # Lint + format
```
### Rust Development
```bash
cd rust-research-agent/rig-deepagents
cargo test # Run tests (159)
cargo clippy -- -D warnings # Lint (strict)
cargo build --features checkpointer-sqlite # Build with features
```
---
## ENVIRONMENT VARIABLES
Copy `env.example` to `.env`:
| Variable | Required | Purpose |
|----------|----------|---------|
| `OPENAI_API_KEY` | Yes | gpt-4.1 model |
| `TAVILY_API_KEY` | Yes | Web search |
| `LANGSMITH_API_KEY` | No | Tracing (`lsv2_pt_...`) |
| `ANTHROPIC_API_KEY` | No | Claude models + caching |
---
## SUBDIRECTORY KNOWLEDGE
See AGENTS.md files in:
- `research_agent/AGENTS.md` - Orchestrator details
- `context_engineering_research_agent/AGENTS.md` - Context strategies
- `deep-agents-ui/AGENTS.md` - Frontend architecture
- `tests/AGENTS.md` - Test organization
- `rust-research-agent/AGENTS.md` - Rust overview (3 tiers)
- `rust-research-agent/rig-deepagents/AGENTS.md` - Middleware architecture
- `rust-research-agent/rig-rlm/AGENTS.md` - Recursive LLM pattern

CLAUDE.md

A multi-agent research system demonstrating **FileSystem-based Context Engineering** using LangChain's DeepAgents framework. The system includes:
- **Python DeepAgents**: LangChain-based multi-agent orchestration with web research capabilities
- **Context Engineering Module**: Experimental platform with 5 context optimization strategies
- **Rust `rig-deepagents`**: Pregel-inspired graph execution runtime using Rig framework
## Development Commands
```bash
uv sync
langgraph dev

# Linting and formatting
uv run ruff check .
uv run ruff format .

# Type checking
uv run mypy .

# Run tests
uv run pytest tests/                   # All tests
uv run pytest tests/test_agent.py -v   # Single file
uv run pytest -k "test_researcher" -v  # Pattern match
```
### Frontend UI (deep-agents-ui/)
```bash
yarn lint    # ESLint
yarn format  # Prettier
```
### Rust `rig-deepagents`
```bash
cd rust-research-agent/rig-deepagents

# Run all tests
cargo test

# Linting (strict, treats warnings as errors)
cargo clippy -- -D warnings

# Build with features
cargo build --features checkpointer-postgres
```
## Required Environment Variables
Copy `env.example` to `.env`:
| Variable | Required | Purpose |
|----------|----------|---------|
| `OPENAI_API_KEY` | Yes | gpt-4.1 model |
| `TAVILY_API_KEY` | Yes | Web search |
| `ANTHROPIC_API_KEY` | No | Claude models + prompt caching |
| `LANGSMITH_API_KEY` | No | Tracing (`lsv2_pt_...`) |
## Architecture
### Multi-SubAgent System
The system uses a three-tier agent hierarchy with two distinct SubAgent types:
```
Main Orchestrator Agent (agent.py)
 |
 +-- researcher (CompiledSubAgent)
 +-- explorer (Simple SubAgent)
 +-- synthesizer (Simple SubAgent)
```

| Type | Definition | Execution | Use Case |
|------|------------|-----------|----------|
| CompiledSubAgent | `{"runnable": CompiledStateGraph}` | Multi-turn autonomous | Complex research with self-planning |
| Simple SubAgent | `{"system_prompt": str}` | Single response | Quick tasks, file ops |
### Context Engineering Strategies (context_engineering_research_agent/)
Five strategies for optimizing LLM context window usage:
| Strategy | File | Trigger |
|----------|------|---------|
| **Offloading** | `context_strategies/offloading.py` | Tool result > 20,000 tokens |
| **Reduction** | `context_strategies/reduction.py` | Context usage > 85% |
| **Retrieval** | grep/glob/read_file tools | Always available |
| **Isolation** | SubAgent `task()` tool | Complex subtasks |
| **Caching** | `context_strategies/caching.py` | Anthropic provider |
Middleware stack order matters: Offloading → Reduction → Caching → Telemetry
### Backend Factory Pattern
The `backend_factory(rt: ToolRuntime)` function demonstrates the recommended pattern:
```python
CompositeBackend(
default=StateBackend(rt), # In-memory state (temporary files)
routes={"/": fs_backend} # Route "/" paths to FilesystemBackend
)
```
Paths starting with "/" go to persistent local filesystem (`research_workspace/`), others use ephemeral state.
### DeepAgents Auto-Injected Tools
The `create_deep_agent()` function automatically adds these tools via middleware:
- **TodoListMiddleware**: `write_todos` - Task planning and progress tracking
- **FilesystemMiddleware**: `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`
- **SubAgentMiddleware**: `task` - Delegate work to sub-agents
- **SkillsMiddleware**: Progressive skill disclosure via `skills/` directory
Custom tools (`tavily_search`, `think_tool`) are added explicitly in `agent.py`.
### Skills System
Project-level skills in `skills/`:
- `academic-search/` - arXiv paper search
- `data-synthesis/` - Multi-source data integration
- `report-writing/` - Structured report generation
- `skill-creator/` - Meta-skill for creating new skills
Each skill has `SKILL.md` with YAML frontmatter. SkillsMiddleware uses Progressive Disclosure: only metadata injected at session start, full content read on-demand.
## Rust `rig-deepagents` Architecture
Pregel-inspired graph execution runtime for agent workflows.
### Module Structure
```
rust-research-agent/rig-deepagents/src/
├── lib.rs                    # Library entry point and re-exports
├── pregel/                   # Pregel Runtime (graph execution engine)
│   ├── runtime.rs            # Superstep orchestration, CheckpointingRuntime
│   ├── vertex.rs             # Vertex trait and compute context
│   ├── message.rs            # Inter-vertex message passing
│   ├── config.rs             # PregelConfig, RetryPolicy
│   ├── checkpoint/           # Fault tolerance via checkpointing
│   │   ├── mod.rs            # Checkpointer trait
│   │   ├── file.rs           # FileCheckpointer
│   │   ├── sqlite.rs         # SQLiteCheckpointer
│   │   ├── redis.rs          # RedisCheckpointer
│   │   └── postgres.rs       # PostgresCheckpointer
│   └── state.rs              # WorkflowState trait
├── workflow/                 # Workflow Builder DSL
│   ├── node.rs               # NodeKind (Agent, Tool, Router, SubAgent, FanOut/FanIn)
│   ├── compiled.rs           # CompiledWorkflow with checkpoint support
│   ├── graph.rs              # WorkflowGraph builder API
│   └── vertices/             # Node implementations (Agent, Tool, Router, etc.)
├── compat/                   # Rig Framework Compatibility Layer
│   ├── rig_agent_adapter.rs  # RigAgentAdapter (primary LLM integration)
│   └── rig_tool_adapter.rs   # RigToolAdapter for Rig Tool compatibility
├── middleware/               # AgentMiddleware trait and MiddlewareStack
│   └── summarization/        # Token counting and context summarization
├── backends/                 # Backend trait (Memory, Filesystem, Composite)
├── llm/                      # LLMProvider abstraction (uses RigAgentAdapter)
└── tools/                    # Tool implementations (read_file, write_file, etc.)
```
### LLM Integration
**Use `RigAgentAdapter`** to wrap Rig's native providers (OpenAI, Anthropic, etc.):
```rust
use rig::providers::openai::Client;
use rig_deepagents::{RigAgentAdapter, AgentExecutor};
let client = Client::from_env();
let agent = client.agent("gpt-4").build();
let provider = RigAgentAdapter::new(agent);
```
Legacy `OpenAIProvider` and `AnthropicProvider` have been removed.
### Pregel Execution Model
The runtime executes workflows using synchronized supersteps:
- **Vertex**: Computation unit with `compute()` method (Agent, Tool, Router)
- **Message**: Communication between vertices across supersteps
- **Checkpointing**: Fault tolerance via periodic state snapshots (File, SQLite, Redis, Postgres)
- **Retry Policy**: Exponential backoff with configurable max retries
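The superstep model above can be sketched in a few lines of toy Python (illustrative only; the actual rig-deepagents runtime is Rust and its API differs):

```python
# Toy sketch of the Pregel superstep model (illustrative only; the real
# rig-deepagents runtime is Rust with Vertex traits and retry policies).
def run_supersteps(vertices, inboxes, max_supersteps=10):
    """vertices: {name: compute(msgs) -> {dest: msg}}.
    Runs until no messages are pending or the superstep budget is hit."""
    for _ in range(max_supersteps):
        if not any(inboxes.values()):
            break  # every vertex is idle: workflow halts
        outboxes = {}
        for name, compute in vertices.items():
            msgs = inboxes.get(name, [])
            outboxes[name] = compute(msgs) if msgs else {}
        # Messages sent in superstep N are delivered in superstep N+1.
        inboxes = {name: [] for name in vertices}
        for outbox in outboxes.values():
            for dest, msg in outbox.items():
                inboxes[dest].append(msg)
    return inboxes

# Two-vertex pipeline: "double" transforms, "report" collects.
collected = []
vertices = {
    "double": lambda msgs: {"report": sum(msgs) * 2},
    "report": lambda msgs: (collected.extend(msgs), {})[1],
}
run_supersteps(vertices, {"double": [21], "report": []})
```

The key property, as in the diagram's bullets, is that message delivery is synchronized at superstep boundaries, which is what makes periodic checkpointing between supersteps safe.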
### Key Types
| Type | Purpose |
|------|---------|
| `PregelRuntime<S, M>` | Executes workflow graph with state S and message M |
| `CheckpointingRuntime<S, M>` | PregelRuntime with checkpoint/resume support |
| `RigAgentAdapter` | Wraps any Rig Agent for LLMProvider compatibility |
| `CompiledWorkflow` | Builder result with optional checkpointing |
## Key Files for Understanding the System
2. `research_agent/researcher/agent.py` - Autonomous researcher factory (CompiledSubAgent pattern)
3. `research_agent/researcher/prompts.py` - Three-phase autonomous workflow
4. `research_agent/prompts.py` - Orchestrator and Simple SubAgent prompts
**Context Engineering:**
5. `context_engineering_research_agent/agent.py` - Context-aware agent factory
6. `context_engineering_research_agent/context_strategies/` - 5 optimization strategies
**Rust rig-deepagents:**
7. `rust-research-agent/rig-deepagents/src/pregel/runtime.rs` - Pregel + Checkpointing
8. `rust-research-agent/rig-deepagents/src/compat/rig_agent_adapter.rs` - LLM integration
9. `rust-research-agent/rig-deepagents/src/workflow/compiled.rs` - Workflow compilation
**Documentation:**
11. `DeepAgents_Technical_Guide.md` - Python DeepAgents reference (Korean)
12. `docs/plans/2026-01-02-rig-deepagents-pregel-design.md` - Rust Pregel design
## Critical Patterns
### SubAgent Creation
Always use factory functions, never instantiate directly:
```python
# Correct
researcher_subagent = get_researcher_subagent()
# Wrong - bypasses middleware setup
researcher = create_researcher_agent()
```
### File Path Routing
Paths starting with "/" persist to `research_workspace/`, others are in-memory:
```python
write_file("/reports/summary.md", content) # Persists
write_file("temp/scratch.txt", content) # Ephemeral
```
### Reflection Loop
Always use `think_tool()` between web searches - explicit reflection is required:
```
Search → think_tool() → Decide → Search → think_tool() → Synthesize
```
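The loop above can be sketched as a minimal Python driver; `search` and `think` are hypothetical stand-ins for `tavily_search` and `think_tool`:

```python
# Toy sketch of the Search -> think_tool -> Decide loop described above.
# `search` and `think` are hypothetical stand-ins for the real tools
# in research_agent/tools.py; the actual agent decides via the LLM.
def research_loop(query, search, think, max_searches=5):
    notes = []
    for i in range(max_searches):
        results = search(query)                                   # 1. search
        reflection = think(f"pass {i}: {len(results)} results")   # 2. reflect
        notes.append((results, reflection))
        if "enough" in reflection:                                # 3. decide
            break
    return notes

notes = research_loop(
    "pregel supersteps",
    search=lambda q: [f"result about {q}"],
    think=lambda summary: "enough evidence gathered",
)
```

The `max_searches` cap mirrors the token budget (5-6 max searches) that sub-agents operate under.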
## Tech Stack

context_engineering_research_agent/AGENTS.md
# AGENTS.md - Context Engineering Research Agent
> **Component**: `context_engineering_research_agent/`
> **Type**: Extended DeepAgent with Context Strategies
> **Role**: Experimental platform for 5 Context Engineering patterns
---
## 1. Module Purpose
This module extends the base research agent with explicit **Context Engineering** strategies. It serves as a research testbed for optimizing LLM context window usage.
---
## 2. The 5 Context Engineering Strategies
| Strategy | Implementation | Trigger |
|----------|----------------|---------|
| **Offloading** | `context_strategies/offloading.py` | Tool result > 20,000 tokens |
| **Reduction** | `context_strategies/reduction.py` | Context usage > 85% |
| **Retrieval** | grep/glob/read_file tools | Always available |
| **Isolation** | SubAgent `task()` tool | Complex subtasks |
| **Caching** | `context_strategies/caching.py` | Anthropic provider |
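The offloading trigger in the first row can be sketched as follows (a minimal sketch with assumed names; `ContextOffloadingStrategy`'s real middleware hooks differ):

```python
# Sketch of the offloading trigger: tool results over a token budget are
# written to the filesystem backend and replaced by a small pointer.
# (Names and the 4-chars-per-token heuristic are assumptions.)
def estimate_tokens(text: str) -> int:
    return len(text) // 4  # rough heuristic: ~4 characters per token

def maybe_offload(result: str, store: dict, path: str, limit: int = 20_000) -> str:
    if estimate_tokens(result) <= limit:
        return result  # small enough: keep inline in context
    store[path] = result  # offload full content to the backend
    return f"[offloaded to {path}: {estimate_tokens(result)} tokens; use read_file/grep]"

fs = {}
big = "x" * 100_000            # ~25,000 tokens, over the 20,000 limit
kept = maybe_offload("short result", fs, "/offload/a.txt")
pointer = maybe_offload(big, fs, "/offload/b.txt")
```

The Retrieval strategy (grep/glob/read_file) is what makes the pointer useful: the agent can pull back only the relevant slice of the offloaded content.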
---
## 3. Key Files
| File | Purpose |
|------|---------|
| `agent.py` | Main factory: `create_context_aware_agent()` |
| `context_strategies/__init__.py` | Re-exports all strategy classes |
| `context_strategies/offloading.py` | `ContextOffloadingStrategy` middleware |
| `context_strategies/reduction.py` | `ContextReductionStrategy` middleware |
| `context_strategies/caching.py` | `ContextCachingStrategy` + provider detection |
| `context_strategies/caching_telemetry.py` | `PromptCachingTelemetryMiddleware` |
| `context_strategies/isolation.py` | State isolation utilities for SubAgents |
| `context_strategies/retrieval.py` | Selective file loading patterns |
| `backends/docker_sandbox.py` | Sandboxed execution backend |
| `backends/pyodide_sandbox.py` | Browser-based Python sandbox |
---
## 4. Agent Factory Pattern
```python
# Simple usage (defaults)
agent = get_agent()
# Customized configuration
agent = create_context_aware_agent(
    model="anthropic/claude-sonnet-4",
    enable_offloading=True,
    enable_reduction=True,
    enable_caching=True,
    offloading_token_limit=20000,
    reduction_threshold=0.85,
)
```
---
## 5. Multi-Provider Support
Provider detection is automatic via `detect_provider(model)`:
| Provider | Features |
|----------|----------|
| Anthropic | Full cache_control markers |
| OpenAI | Standard caching |
| OpenRouter | Pass `openrouter_model_name` for specific routing |
| Gemini | Standard caching |
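A hedged sketch of what prefix-based detection might look like (the real `detect_provider` in `context_strategies/caching.py` may differ):

```python
# Illustrative sketch of provider detection from a model string.
# (Assumed prefixes; the actual detect_provider implementation may differ.)
def detect_provider(model: str) -> str:
    prefixes = {
        "anthropic": "anthropic",
        "claude": "anthropic",
        "openrouter": "openrouter",
        "gemini": "gemini",
        "google": "gemini",
        "openai": "openai",
        "gpt": "openai",
    }
    name = model.lower()
    for prefix, provider in prefixes.items():
        if name.startswith(prefix):
            return provider
    return "openai"  # assumed fallback

provider = detect_provider("anthropic/claude-sonnet-4")
```

Only the Anthropic path would then attach full `cache_control` markers; the other providers fall back to their standard caching behavior.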
---
## 6. Middleware Stack Order
Middlewares execute in registration order. The recommended stack:
```python
middleware=[
    ContextOffloadingStrategy,         # 1. Evict large results FIRST
    ContextReductionStrategy,          # 2. Compress if still too large
    ContextCachingStrategy,            # 3. Mark cacheable sections
    PromptCachingTelemetryMiddleware,  # 4. Collect metrics
]
```
**Order matters:** Offloading before reduction prevents unnecessary summarization.
---
## 7. Sandbox Backends
For secure code execution:
| Backend | Environment | Isolation Level |
|---------|-------------|-----------------|
| `DockerSandbox` | Container | High (network isolated) |
| `PyodideSandbox` | WASM | Medium (browser-like) |
| `DockerSession` | Persistent container | High + state persistence |
---
## 8. Testing
Tests are in `tests/context_engineering/`:
- `test_caching.py` - Cache strategy unit tests
- `test_offloading.py` - Eviction threshold tests
- `test_reduction.py` - Summarization trigger tests
- `test_integration.py` - Full agent integration tests

deep-agents-ui/AGENTS.md

# AGENTS.md - Deep Agents UI
> **Component**: `deep-agents-ui/`
> **Type**: Next.js 16 React Frontend
> **Role**: Chat interface for LangGraph DeepAgents
---
## 1. Module Purpose
React-based chat UI that connects to a LangGraph backend. Displays agent messages, tool calls, SubAgent activity, tasks/files sidebar, and handles human-in-the-loop interrupts.
---
## 2. Quick Start
```bash
yarn install
yarn dev # localhost:3000
```
Configure via Settings dialog:
- **Deployment URL**: `http://127.0.0.1:2024` (LangGraph dev server)
- **Assistant ID**: `research` (or UUID)
---
## 3. Directory Structure
```
src/
  app/
    page.tsx                    # Main entry, config handling
    layout.tsx                  # Root layout with providers
    components/
      ChatInterface.tsx         # Message input/display area
      ChatMessage.tsx           # Individual message rendering
      ToolCallBox.tsx           # Tool invocation display
      SubAgentIndicator.tsx     # Active SubAgent status
      TasksFilesSidebar.tsx     # TODO list + file tree
      ThreadList.tsx            # Conversation history
      ToolApprovalInterrupt.tsx # HITL approval UI
      ConfigDialog.tsx          # Settings modal
      FileViewDialog.tsx        # File content viewer
      MarkdownContent.tsx       # Markdown renderer
    components/ui/              # Radix UI primitives (shadcn)
  providers/
    ChatProvider.tsx            # Chat state context
    ClientProvider.tsx          # LangGraph SDK client
  lib/
    config.ts                   # LocalStorage config persistence
```
---
## 4. Key Components
| Component | Function |
|-----------|----------|
| `ChatProvider` | Manages message state, streaming, thread lifecycle |
| `ClientProvider` | Wraps `@langchain/langgraph-sdk` client |
| `ChatInterface` | Main chat view with input area |
| `ToolCallBox` | Renders tool name, args, result with syntax highlighting |
| `SubAgentIndicator` | Shows which SubAgent is currently active |
| `ToolApprovalInterrupt` | Human-in-the-loop approval/rejection UI |
---
## 5. State Management
| State | Location | Persistence |
|-------|----------|-------------|
| Config | `lib/config.ts` | LocalStorage |
| Thread ID | URL query param `?threadId=` | URL |
| Sidebar | URL query param `?sidebar=` | URL |
| Messages | `ChatProvider` context | Server (LangGraph) |
---
## 6. Styling
- **TailwindCSS** with custom theme
- **shadcn/ui** components (Radix primitives)
- **Dark mode** via CSS variables
---
## 7. Development Commands
```bash
yarn dev # Start dev server (port 3000)
yarn build # Production build
yarn lint # ESLint
yarn format # Prettier
```
---
## 8. Backend Connection
The UI connects to LangGraph API endpoints:
- `POST /threads` - Create thread
- `POST /threads/{id}/runs` - Stream messages
- `GET /assistants/{id}` - Fetch assistant config
Configured via `ClientProvider` with deployment URL and optional LangSmith API key.
---
## 9. Extension Points
| Task | Where to Modify |
|------|-----------------|
| Add new message type | `ChatMessage.tsx` + type in `ChatProvider` |
| Custom tool rendering | `ToolCallBox.tsx` |
| New sidebar panel | `TasksFilesSidebar.tsx` |
| Theme customization | `tailwind.config.js` + `globals.css` |

research_agent/AGENTS.md

# AGENTS.md - Research Agent Module
> **Component**: `research_agent/`
> **Type**: Python DeepAgent Orchestrator
> **Role**: Multi-SubAgent Research System with Skills Integration
---
## 1. Module Purpose
This module implements the main research orchestrator using LangChain's DeepAgents framework. It coordinates three specialized SubAgents and integrates a project-level skills system.
---
## 2. Architecture: Three-Tier SubAgent System
```
Orchestrator (agent.py)
 |
 +-- researcher (CompiledSubAgent)
 |      Autonomous, self-planning DeepAgent
 |      "Breadth-first, then depth" research pattern
 |
 +-- explorer (Simple SubAgent)
 |      Fast read-only filesystem exploration
 |
 +-- synthesizer (Simple SubAgent)
        Multi-source result integration
```
### SubAgent Types
| Type | Definition | Execution | Use Case |
|------|------------|-----------|----------|
| CompiledSubAgent | `{"runnable": CompiledStateGraph}` | Multi-turn autonomous | Complex research |
| Simple SubAgent | `{"system_prompt": str}` | Single response | Quick tasks |
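The two definition shapes in the table can be illustrated with minimal dicts (keys follow the table; the exact deepagents SubAgent schema, and the `description` field, are assumptions):

```python
# Hypothetical shapes of the two SubAgent definitions from the table above.
# Keys mirror the table; the real deepagents schema may carry more fields.
explorer_agent = {
    "name": "explorer",
    "description": "Fast read-only filesystem exploration",
    "system_prompt": "You are a read-only filesystem explorer...",
}

def get_researcher_subagent(compiled_graph):
    # CompiledSubAgent: wraps a pre-compiled LangGraph state graph,
    # enabling multi-turn autonomous execution.
    return {
        "name": "researcher",
        "description": "Autonomous multi-turn researcher",
        "runnable": compiled_graph,
    }
```

The presence of `runnable` versus `system_prompt` is what distinguishes the multi-turn CompiledSubAgent from the single-response Simple SubAgent.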
---
## 3. Key Files
| File | Purpose |
|------|---------|
| `agent.py` | Orchestrator assembly: model, backend, SubAgents, middleware |
| `prompts.py` | Orchestrator prompts: workflow, delegation, explorer, synthesizer |
| `tools.py` | `tavily_search()`, `think_tool()` implementations |
| `researcher/agent.py` | `get_researcher_subagent()` factory for CompiledSubAgent |
| `researcher/prompts.py` | Three-phase autonomous workflow (Exploratory -> Directed -> Synthesis) |
| `researcher/depth.py` | Research depth configuration (shallow/medium/deep) |
| `researcher/ralph_loop.py` | Iterative refinement loop pattern |
| `skills/middleware.py` | SkillsMiddleware with Progressive Disclosure |
---
## 4. Backend Configuration
The module uses a **CompositeBackend** pattern:
```python
CompositeBackend(
default=StateBackend(rt), # In-memory (temporary files)
routes={"/": fs_backend} # "/" paths -> research_workspace/
)
```
- Paths starting with "/" persist to `research_workspace/`
- Other paths use ephemeral in-memory state
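The routing rule above can be sketched in a few lines (illustrative only; the real deepagents backends expose richer read/write/list interfaces, and the dict-based stand-ins are assumptions):

```python
# Minimal sketch of the CompositeBackend prefix-routing rule described above.
# (Illustrative; the real deepagents backend interface differs.)
class CompositeBackend:
    def __init__(self, default, routes):
        self.default = default  # e.g. in-memory StateBackend
        self.routes = routes    # e.g. {"/": filesystem_backend}

    def backend_for(self, path: str):
        for prefix, backend in self.routes.items():
            if path.startswith(prefix):
                return backend  # "/" paths -> research_workspace/
        return self.default     # everything else -> ephemeral state

    def write_file(self, path: str, content: str):
        self.backend_for(path)[path] = content

memory, disk = {}, {}  # stand-ins for StateBackend / FilesystemBackend
cb = CompositeBackend(default=memory, routes={"/": disk})
cb.write_file("/reports/summary.md", "persists")  # routed to disk
cb.write_file("temp/scratch.txt", "ephemeral")    # stays in memory
```

This is also why the `backend_factory` signature must not change: middleware constructs this composite per tool runtime.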
---
## 5. Skills System
Skills are loaded from `PROJECT_ROOT/skills/` via `SkillsMiddleware`.
**Progressive Disclosure Pattern:**
1. Session start: Only skill metadata injected
2. Agent request: Full SKILL.md content loaded on-demand
3. Token efficiency: ~90% reduction in initial context
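The three steps above can be sketched with a toy frontmatter parser (a hypothetical helper; the real SkillsMiddleware handles YAML properly and differs in detail):

```python
# Sketch of Progressive Disclosure: only frontmatter metadata is surfaced
# at session start; the full SKILL.md body is read on demand.
# (Toy parser for illustration; real SKILL.md files use proper YAML.)
def split_frontmatter(skill_md: str):
    _, frontmatter, body = skill_md.split("---", 2)
    meta = dict(line.split(":", 1) for line in frontmatter.strip().splitlines())
    return {k.strip(): v.strip() for k, v in meta.items()}, body.strip()

skill = """---
name: academic-search
description: arXiv paper search
---
Full, token-heavy instructions go here..."""

meta, body = split_frontmatter(skill)
# Step 1: only this stub enters the system prompt at session start.
system_prompt_stub = f"Skill available: {meta['name']} - {meta['description']}"
# Step 2: `body` is loaded only when the agent requests the skill.
```

Keeping bodies out of the initial prompt is what yields the ~90% reduction in initial context.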
---
## 6. Anti-Patterns
- **DO NOT** directly instantiate researcher - use `get_researcher_subagent()`
- **DO NOT** skip `think_tool()` between searches - explicit reflection required
- **DO NOT** modify `backend_factory` signature - middleware depends on it
---
## 7. Extension Points
| Task | Where to Modify |
|------|-----------------|
| Add new SubAgent | Define in `agent.py`, add to `SIMPLE_SUBAGENTS` or `ALL_SUBAGENTS` |
| New research tool | Add to `tools.py`, include in `create_deep_agent(tools=[...])` |
| Custom middleware | Create in `skills/`, add to middleware list in `agent.py` |
| Modify researcher behavior | Edit `researcher/prompts.py` |

tests/AGENTS.md

# AGENTS.md - Test Suite
> **Component**: `tests/`
> **Type**: pytest Test Suite
> **Role**: Unit and integration tests for all Python modules
---
## 1. Test Organization
```
tests/
  context_engineering/           # Context strategy tests
    test_caching.py              # Cache control marker tests
    test_offloading.py           # Token eviction tests
    test_reduction.py            # Summarization trigger tests
    test_isolation.py            # SubAgent state isolation
    test_retrieval.py            # Selective file loading
    test_integration.py          # Full agent integration
    test_openrouter_models.py    # Multi-provider tests
  researcher/                    # Research agent tests
    test_depth.py                # Research depth configuration
    test_ralph_loop.py           # Iterative refinement loop
    test_runner.py               # Agent runner tests
    test_tools.py                # Tool unit tests
    test_integration.py          # End-to-end research flow
  backends/                      # Backend implementation tests
    test_docker_sandbox_integration.py
  conftest.py                    # Shared fixtures
```
---
## 2. Running Tests
```bash
# All tests
uv run pytest tests/
# Specific module
uv run pytest tests/context_engineering/
# Single test file
uv run pytest tests/researcher/test_depth.py
# With coverage
uv run pytest --cov=research_agent tests/
```
---
## 3. Key Fixtures
Located in `conftest.py` files:
- `mock_model` - Mocked LLM for unit tests
- `temp_workspace` - Temporary filesystem backend
- `docker_sandbox` - Docker container fixture (integration)
---
## 4. Test Categories
| Category | Marker | Speed |
|----------|--------|-------|
| Unit | (default) | Fast |
| Integration | `@pytest.mark.integration` | Slow |
| Docker | `@pytest.mark.docker` | Requires Docker |
| LLM | `@pytest.mark.llm` | Requires API keys |
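The markers above are typically registered and applied like this (a hedged sketch; the repo's actual `conftest.py` registration may differ):

```python
# Sketch of custom pytest marker registration and use.
# (Assumed to live in conftest.py / a test module; adapt to the repo.)
import pytest

def pytest_configure(config):  # in conftest.py
    config.addinivalue_line("markers", "integration: slow integration test")
    config.addinivalue_line("markers", "docker: requires a Docker daemon")
    config.addinivalue_line("markers", "llm: requires real API keys")

@pytest.mark.integration
def test_full_research_flow():
    assert True  # placeholder for an end-to-end assertion
```

With markers registered, slow categories can be excluded from a fast local run via `uv run pytest -m "not integration"`.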
---
## 5. Environment Variables
For integration tests requiring real APIs:
- `OPENAI_API_KEY` - OpenAI tests
- `TAVILY_API_KEY` - Search tool tests
- `ANTHROPIC_API_KEY` - Caching tests with Claude