From 6a4f3dcdd9fee35a35b53203868a015ea290d3f9 Mon Sep 17 00:00:00 2001
From: HyunjunJeon
Date: Tue, 13 Jan 2026 18:57:35 +0900
Subject: [PATCH] Update docs

---
 AGENTS.md                                    | 149 ++++++++++---
 CLAUDE.md                                    | 221 ++++++++++---------
 context_engineering_research_agent/AGENTS.md | 111 ++++++++++
 deep-agents-ui/AGENTS.md                     | 118 ++++++++++
 research_agent/AGENTS.md                     |  97 ++++++++
 tests/AGENTS.md                              |  79 +++++++
 6 files changed, 631 insertions(+), 144 deletions(-)
 create mode 100644 context_engineering_research_agent/AGENTS.md
 create mode 100644 deep-agents-ui/AGENTS.md
 create mode 100644 research_agent/AGENTS.md
 create mode 100644 tests/AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
index ab08d30..753c13a 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,43 +1,122 @@
-# Repository Guidelines
+# PROJECT KNOWLEDGE BASE
 
-## Project Structure & Module Organization
-- `research_agent/` contains the core Python agents, prompts, tools, and subagent utilities.
-- `skills/` holds project-level skills as `SKILL.md` files (YAML frontmatter + instructions).
-- `research_workspace/` is the agent’s working filesystem for generated outputs; keep it clean or example-only.
-- `deep-agents-ui/` is the Next.js/React UI with source under `deep-agents-ui/src/`.
-- `deepagents_sourcecode/` vendors upstream library sources for reference and comparison.
-- `rust-research-agent/` is a standalone Rust tutorial agent with its own build/test flow.
-- `langgraph.json` defines the LangGraph deployment entrypoint for the research agent.
+**Generated:** 2026-01-13
+
+---
+
+## OVERVIEW
+
+Multi-agent research system demonstrating **FileSystem-based Context Engineering** using LangChain's DeepAgents framework. Includes Python orchestrator, Rust port via Rig framework, and Next.js chat UI.
+
+---
+
+## STRUCTURE
 
-## Build, Test, and Development Commands
-Use the UI commands from `deep-agents-ui/` when working on the frontend:
-```bash
-cd deep-agents-ui && yarn install # install deps
-cd deep-agents-ui && yarn dev # run local UI
-cd deep-agents-ui && yarn build # production build
-cd deep-agents-ui && yarn lint # eslint checks
-cd deep-agents-ui && yarn format # prettier format
 ```
-Python tooling is configured in `pyproject.toml` (ruff + mypy):
-```bash
-uv run ruff format .
-uv run ruff check .
-uv run mypy .
+/
+  research_agent/                      # Python DeepAgent orchestrator (see AGENTS.md)
+  context_engineering_research_agent/  # Extended agent with 5 strategies
+  deep-agents-ui/                      # Next.js React frontend (see AGENTS.md)
+  rust-research-agent/                 # Rust implementation (see AGENTS.md)
+    rig-deepagents/                    # Pregel-based middleware runtime
+    rig-rlm/                           # Recursive Language Model agent
+  tests/                               # pytest test suite (see AGENTS.md)
+  skills/                              # Project-level skills (SKILL.md per skill)
+  research_workspace/                  # Agent output directory (ephemeral)
+  deepagents_sourcecode/               # Vendor: upstream library reference
 ```
 
-## Coding Style & Naming Conventions
-- Python: follow ruff defaults and Google-style docstrings (see `pyproject.toml`); prefer `snake_case` modules and functions.
-- TypeScript/React: keep `PascalCase` for components, `camelCase` for hooks/utilities; rely on ESLint + Prettier (Tailwind plugin).
-- Skill definitions: keep one skill per directory with a `SKILL.md` entrypoint and clear, task-focused naming.
+---
 
-## Testing Guidelines
-- There are no repository-wide tests for `research_agent/` yet; add `pytest` tests when introducing new logic.
-- Subprojects have their own suites: see `deepagents_sourcecode/libs/*/Makefile` and `rust-research-agent/README.md` for `make test` or `cargo test`.
+## WHERE TO LOOK
 
-## Commit & Pull Request Guidelines
-- Git history uses short, descriptive messages in English or Korean with no enforced prefix; keep summaries concise and imperative.
-- For PRs, include: a brief summary, testing notes (or “not run”), linked issues, and UI screenshots for frontend changes.
+| Task | Location | Notes |
+|------|----------|-------|
+| Modify orchestrator | `research_agent/agent.py` | SubAgent assembly, tools, middleware |
+| Add research tool | `research_agent/tools.py` | `tavily_search`, `think_tool` |
+| Autonomous researcher logic | `research_agent/researcher/` | Three-phase workflow |
+| Context strategies | `context_engineering_research_agent/context_strategies/` | 5 patterns |
+| Frontend components | `deep-agents-ui/src/app/components/` | Chat UI |
+| Rust Pregel runtime | `rust-research-agent/rig-deepagents/src/pregel/` | Graph execution |
+| Rust middleware | `rust-research-agent/rig-deepagents/src/middleware/` | Tool injection |
+| Add new skill | `skills/{skill-name}/SKILL.md` | YAML frontmatter + instructions |
 
-## Configuration & Secrets
-- Copy `env.example` to `.env` for API keys; never commit secrets.
-- UI-only keys can be set via `NEXT_PUBLIC_LANGSMITH_API_KEY` in `deep-agents-ui/`.
+---
+
+## CONVENTIONS
+
+### Deviations from Standard Patterns
+
+- **Backend factory pattern**: Always use `backend_factory(rt: ToolRuntime)` - middleware depends on this signature
+- **SubAgent naming**: Use `researcher`, `explorer`, `synthesizer` - hardcoded in prompts
+- **File paths**: Paths starting with "/" route to `research_workspace/`; others are in-memory
+- **Korean comments**: Docstrings and some comments in Korean (bilingual codebase)
+
+---
+
+## ANTI-PATTERNS
+
+- **DO NOT** commit `.env` files (contains API keys)
+- **DO NOT** instantiate researcher directly - use `get_researcher_subagent()`
+- **DO NOT** skip `think_tool()` between searches - explicit reflection required
+- **NEVER** write raw JSON to user - always format responses (see `deepagents_cli/tools.py`)
+- **NEVER** lie to exit early - complete TODO items fully (see `researcher/runner.py`)
+
+---
+
+## COMMANDS
+
+### Python Development
+
+```bash
+uv sync # Install dependencies
+langgraph dev # Start backend (port 2024)
+uv run ruff format . # Format code
+uv run ruff check . # Lint
+uv run mypy . # Type check
+uv run pytest tests/ # Run tests
+```
+
+### Frontend Development
+
+```bash
+cd deep-agents-ui
+yarn install && yarn dev # Dev server (port 3000)
+yarn build # Production build
+yarn lint && yarn format # Lint + format
+```
+
+### Rust Development
+
+```bash
+cd rust-research-agent/rig-deepagents
+cargo test # Run tests (159)
+cargo clippy -- -D warnings # Lint (strict)
+cargo build --features checkpointer-sqlite # Build with features
+```
+
+---
+
+## ENVIRONMENT VARIABLES
+
+Copy `env.example` to `.env`:
+
+| Variable | Required | Purpose |
+|----------|----------|---------|
+| `OPENAI_API_KEY` | Yes | gpt-4.1 model |
+| `TAVILY_API_KEY` | Yes | Web search |
+| `LANGSMITH_API_KEY` | No | Tracing (`lsv2_pt_...`) |
+| `ANTHROPIC_API_KEY` | No | Claude models + caching |
+
+---
+
+## SUBDIRECTORY KNOWLEDGE
+
+See AGENTS.md files in:
+- `research_agent/AGENTS.md` - Orchestrator details
+- `context_engineering_research_agent/AGENTS.md` - Context strategies
+- `deep-agents-ui/AGENTS.md` - Frontend architecture
+- `tests/AGENTS.md` - Test organization
+- `rust-research-agent/AGENTS.md` - Rust overview (3 tiers)
+- `rust-research-agent/rig-deepagents/AGENTS.md` - Middleware architecture
+- `rust-research-agent/rig-rlm/AGENTS.md` - Recursive LLM pattern
diff --git a/CLAUDE.md b/CLAUDE.md
index 153ba18..405a292 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,9 +6,8 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 A multi-agent research system demonstrating **FileSystem-based Context Engineering** using LangChain's DeepAgents framework. The system includes:
 - **Python DeepAgents**: LangChain-based multi-agent orchestration with web research capabilities
-- **Rust `rig-deepagents`**: A port/reimagining using the Rig framework with Pregel-inspired graph execution
-
-The system enables agents to conduct web research, delegate tasks to sub-agents, and generate comprehensive reports with persistent filesystem state.
+- **Context Engineering Module**: Experimental platform with 5 context optimization strategies
+- **Rust `rig-deepagents`**: Pregel-inspired graph execution runtime using Rig framework
 
 ## Development Commands
 
@@ -22,11 +21,16 @@ uv sync
 langgraph dev
 
 # Linting and formatting
-ruff check research_agent/
-ruff format research_agent/
+uv run ruff check .
+uv run ruff format .
 
 # Type checking
-mypy research_agent/
+uv run mypy .
+
+# Run tests
+uv run pytest tests/ # All tests
+uv run pytest tests/test_agent.py -v # Single file
+uv run pytest -k "test_researcher" -v # Pattern match
 ```
 
 ### Frontend UI (deep-agents-ui/)
@@ -40,29 +44,19 @@ yarn lint # ESLint
 yarn format # Prettier
 ```
 
-### Interactive Notebook Development
+### Rust `rig-deepagents`
 
 ```bash
-# Open Jupyter for interactive agent testing
-jupyter notebook DeepAgent_research.ipynb
-```
+cd rust-research-agent/rig-deepagents
 
-The `research_agent/utils.py` module provides Rich-formatted display helpers for notebooks:
-- `format_messages(messages)` - Renders messages with colored panels (Human=blue, AI=green, Tool=yellow)
-- `show_prompt(text, title)` - Displays prompts with XML/header syntax highlighting
-
-### Rust `rig-deepagents` Crate
-
-```bash
-cd rust-research-agent/crates/rig-deepagents
-
-# Run all tests (159 tests)
+# Run all tests
 cargo test
 
 # Run tests for a specific module
 cargo test pregel:: # Pregel runtime tests
 cargo test workflow:: # Workflow node tests
 cargo test middleware:: # Middleware tests
+cargo test checkpointing # Checkpointing tests
 
 # Linting (strict, treats warnings as errors)
 cargo clippy -- -D warnings
@@ -82,17 +76,18 @@ cargo build --features checkpointer-postgres
 ## Required Environment Variables
 
 Copy `env.example` to `.env`:
-- `OPENAI_API_KEY` - For gpt-4.1 model
-- `TAVILY_API_KEY` - For web search functionality
-- `LANGSMITH_API_KEY` - Optional, format `lsv2_pt_...` for tracing
-- `LANGSMITH_TRACING` / `LANGSMITH_PROJECT` - Optional tracing config
+
+| Variable | Required | Purpose |
+|----------|----------|---------|
+| `OPENAI_API_KEY` | Yes | gpt-4.1 model |
+| `TAVILY_API_KEY` | Yes | Web search |
+| `ANTHROPIC_API_KEY` | No | Claude models + prompt caching |
+| `LANGSMITH_API_KEY` | No | Tracing (`lsv2_pt_...`) |
 
 ## Architecture
 
 ### Multi-SubAgent System
 
-The system uses a three-tier agent hierarchy with two distinct SubAgent types:
-
 ```
 Main Orchestrator Agent (agent.py)
 │
@@ -113,52 +108,36 @@ Main Orchestrator Agent (agent.py)
 | CompiledSubAgent | `{"runnable": CompiledStateGraph}` | Multi-turn autonomous | Complex research with self-planning |
 | Simple SubAgent | `{"system_prompt": str}` | Single response | Quick tasks, file ops |
 
-### Core Components
+### Context Engineering Strategies (context_engineering_research_agent/)
 
-**`research_agent/agent.py`** - Orchestrator configuration:
-- LLM: `ChatOpenAI(model="gpt-4.1", temperature=0.0)`
-- Creates researcher via `get_researcher_subagent()` (CompiledSubAgent)
-- Defines `explorer_agent`, `synthesizer_agent` (Simple SubAgents)
-- Assembles `ALL_SUBAGENTS = [researcher_subagent, *SIMPLE_SUBAGENTS]`
+Five strategies for optimizing LLM context window usage:
 
-**`research_agent/researcher/`** - Autonomous researcher module:
-- `agent.py`: `create_researcher_agent()` factory and `get_researcher_subagent()` wrapper
-- `prompts.py`: `AUTONOMOUS_RESEARCHER_INSTRUCTIONS` with three-phase workflow (Exploratory → Directed → Synthesis)
+| Strategy | File | Trigger |
+|----------|------|---------|
+| **Offloading** | `context_strategies/offloading.py` | Tool result > 20,000 tokens |
+| **Reduction** | `context_strategies/reduction.py` | Context usage > 85% |
+| **Retrieval** | grep/glob/read_file tools | Always available |
+| **Isolation** | SubAgent `task()` tool | Complex subtasks |
+| **Caching** | `context_strategies/caching.py` | Anthropic provider |
 
-**Backend Factory Pattern** - The `backend_factory(rt: ToolRuntime)` function demonstrates the recommended pattern:
+Middleware stack order matters: Offloading → Reduction → Caching → Telemetry
+
+### Backend Factory Pattern
+
+The `backend_factory(rt: ToolRuntime)` function demonstrates the recommended pattern:
 ```python
 CompositeBackend(
     default=StateBackend(rt), # In-memory state (temporary files)
     routes={"/": fs_backend} # Route "/" paths to FilesystemBackend
 )
 ```
-This enables routing: paths starting with "/" go to persistent local filesystem (`research_workspace/`), others use ephemeral state.
-
-**`research_agent/prompts.py`** - Prompt templates:
-- `RESEARCH_WORKFLOW_INSTRUCTIONS` - Main workflow (plan → save → delegate → synthesize → write → verify)
-- `SUBAGENT_DELEGATION_INSTRUCTIONS` - When to parallelize (comparisons) vs single agent (overviews)
-- `EXPLORER_INSTRUCTIONS` - Fast read-only exploration with filesystem tools
-- `SYNTHESIZER_INSTRUCTIONS` - Multi-source integration with confidence levels
-
-**`research_agent/tools.py`** - Research tools:
-- `tavily_search(query, max_results, topic)` - Searches web, fetches full page content, converts to markdown
-- `think_tool(reflection)` - Explicit reflection step for deliberate research
-
-**`langgraph.json`** - Deployment config pointing to `./research_agent/agent.py:agent`
-
-### Context Engineering Pattern
-
-The filesystem acts as long-term memory:
-1. Agent reads/writes files in virtual `research_workspace/`
-2. Structured outputs: reports, TODOs, request files
-3. Middleware auto-injects filesystem and sub-agent tools
-4. Automatic context summarization for token efficiency
+Paths starting with "/" go to persistent local filesystem (`research_workspace/`), others use ephemeral state.
 
 ### DeepAgents Auto-Injected Tools
 
 The `create_deep_agent()` function automatically adds these tools via middleware:
 - **TodoListMiddleware**: `write_todos` - Task planning and progress tracking
-- **FilesystemMiddleware**: `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep` - File operations
+- **FilesystemMiddleware**: `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep`
 - **SubAgentMiddleware**: `task` - Delegate work to sub-agents
 - **SkillsMiddleware**: Progressive skill disclosure via `skills/` directory
 
@@ -166,60 +145,65 @@ Custom tools (`tavily_search`, `think_tool`) are added explicitly in `agent.py`.
 
 ### Skills System
 
-Project-level skills are located in `PROJECT_ROOT/skills/`:
-- `academic-search/` - arXiv paper search with structured output
-- `data-synthesis/` - Multi-source data integration and analysis
+Project-level skills in `skills/`:
+- `academic-search/` - arXiv paper search
+- `data-synthesis/` - Multi-source data integration
 - `report-writing/` - Structured report generation
 - `skill-creator/` - Meta-skill for creating new skills
 
-Each skill has a `SKILL.md` file with YAML frontmatter (name, description) and detailed instructions. The SkillsMiddleware uses Progressive Disclosure: only skill metadata is injected into the system prompt at session start; full skill content is read on-demand when needed.
-
-### Research Workflow
-
-**Orchestrator workflow:**
-```
-Plan → Save Request → Delegate to Sub-agents → Synthesize → Write Report → Verify
-```
-
-**Autonomous Researcher workflow (breadth-first, then depth):**
-```
-Phase 1: Exploratory Search (1-2 searches) → Identify directions
-Phase 2: Directed Research (1-2 searches per direction) → Deep dive
-Phase 3: Synthesis → Combine findings with source agreement analysis
-```
-
-Sub-agents operate with token budgets (5-6 max searches) and explicit reflection loops (Search → think_tool → Decide → Repeat).
+Each skill has `SKILL.md` with YAML frontmatter. SkillsMiddleware uses Progressive Disclosure: only metadata injected at session start, full content read on-demand.
 
 ## Rust `rig-deepagents` Architecture
 
-The Rust crate provides a Pregel-inspired graph execution runtime for agent workflows.
+Pregel-inspired graph execution runtime for agent workflows.
 
 ### Module Structure
 
 ```
-rust-research-agent/crates/rig-deepagents/src/
+rust-research-agent/rig-deepagents/src/
 ├── lib.rs                  # Library entry point and re-exports
 ├── pregel/                 # Pregel Runtime (graph execution engine)
-│   ├── runtime.rs          # Superstep orchestration, workflow timeout, retry policies
+│   ├── runtime.rs          # Superstep orchestration, CheckpointingRuntime
 │   ├── vertex.rs           # Vertex trait and compute context
 │   ├── message.rs          # Inter-vertex message passing
 │   ├── config.rs           # PregelConfig, RetryPolicy
 │   ├── checkpoint/         # Fault tolerance via checkpointing
-│   │   ├── mod.rs          # Checkpointer trait and factory
-│   │   └── file.rs         # FileCheckpointer implementation
-│   └── state.rs            # WorkflowState trait, UnitState
+│   │   ├── mod.rs          # Checkpointer trait
+│   │   ├── file.rs         # FileCheckpointer
+│   │   ├── sqlite.rs       # SQLiteCheckpointer
+│   │   ├── redis.rs        # RedisCheckpointer
+│   │   └── postgres.rs     # PostgresCheckpointer
+│   └── state.rs            # WorkflowState trait
 ├── workflow/               # Workflow Builder DSL
-│   ├── node.rs             # NodeKind (Agent, Tool, Router, SubAgent, FanOut/FanIn)
-│   └── mod.rs              # WorkflowGraph builder API
+│   ├── compiled.rs         # CompiledWorkflow with checkpoint support
+│   ├── graph.rs            # WorkflowGraph builder API
+│   └── vertices/           # Node implementations (Agent, Tool, Router, etc.)
+├── compat/                 # Rig Framework Compatibility Layer
+│   ├── rig_agent_adapter.rs # RigAgentAdapter (primary LLM integration)
+│   └── rig_tool_adapter.rs # RigToolAdapter for Rig Tool compatibility
 ├── middleware/             # AgentMiddleware trait and MiddlewareStack
+│   └── summarization/      # Token counting and context summarization
 ├── backends/               # Backend trait (Memory, Filesystem, Composite)
-├── llm/                    # LLMProvider abstraction (OpenAI, Anthropic)
-└── tools/                  # Tool implementations (read_file, write_file, grep, etc.)
+├── llm/                    # LLMProvider abstraction (uses RigAgentAdapter)
+└── tools/                  # Tool implementations (read_file, write_file, etc.)
 ```
 
-### Pregel Execution Model
+### LLM Integration
 
-The runtime executes workflows using synchronized supersteps:
+**Use `RigAgentAdapter`** to wrap Rig's native providers (OpenAI, Anthropic, etc.):
+
+```rust
+use rig::providers::openai::Client;
+use rig_deepagents::{RigAgentAdapter, AgentExecutor};
+
+let client = Client::from_env();
+let agent = client.agent("gpt-4").build();
+let provider = RigAgentAdapter::new(agent);
+```
+
+Legacy `OpenAIProvider` and `AnthropicProvider` have been removed.
+
+### Pregel Execution Model
 
 ```
 ┌─────────────────────────────────────────────────────────────┐
@@ -236,7 +220,7 @@ The runtime executes workflows using synchronized supersteps:
 
 - **Vertex**: Computation unit with `compute()` method (Agent, Tool, Router)
 - **Message**: Communication between vertices across supersteps
-- **Checkpointing**: Fault tolerance via periodic state snapshots
+- **Checkpointing**: Fault tolerance via periodic state snapshots (File, SQLite, Redis, Postgres)
 - **Retry Policy**: Exponential backoff with configurable max retries
 
 ### Key Types
@@ -244,15 +228,9 @@ The runtime executes workflows using synchronized supersteps:
 | Type | Purpose |
 |------|---------|
 | `PregelRuntime` | Executes workflow graph with state S and message M |
-| `Vertex` | Trait for computation nodes |
-| `WorkflowState` | Trait for workflow state (must be serializable) |
-| `PregelConfig` | Runtime configuration (max supersteps, parallelism, timeout) |
-| `Checkpointer` | Trait for state persistence (Memory, File, SQLite, Redis, Postgres) |
-
-### Design Documents
-
-- `docs/plans/2026-01-02-rig-deepagents-pregel-design.md` - Comprehensive Pregel runtime design
-- `docs/plans/2026-01-02-rig-deepagents-implementation-tasks.md` - Implementation task breakdown
+| `CheckpointingRuntime` | PregelRuntime with checkpoint/resume support |
+| `RigAgentAdapter` | Wraps any Rig Agent for LLMProvider compatibility |
+| `CompiledWorkflow` | Builder result with optional checkpointing |
 
 ## Key Files for Understanding the System
 
@@ -261,18 +239,43 @@ The runtime executes workflows using synchronized supersteps:
 2. `research_agent/researcher/agent.py` - Autonomous researcher factory (CompiledSubAgent pattern)
 3. `research_agent/researcher/prompts.py` - Three-phase autonomous workflow
 4. `research_agent/prompts.py` - Orchestrator and Simple SubAgent prompts
-5. `research_agent/tools.py` - Tool implementations
-6. `research_agent/skills/middleware.py` - SkillsMiddleware with progressive disclosure
+
+**Context Engineering:**
+5. `context_engineering_research_agent/agent.py` - Context-aware agent factory
+6. `context_engineering_research_agent/context_strategies/` - 5 optimization strategies
 
 **Rust rig-deepagents:**
-7. `rust-research-agent/crates/rig-deepagents/src/pregel/runtime.rs` - Pregel execution engine
-8. `rust-research-agent/crates/rig-deepagents/src/pregel/vertex.rs` - Vertex abstraction
-9. `rust-research-agent/crates/rig-deepagents/src/workflow/node.rs` - Node type definitions
-10. `rust-research-agent/crates/rig-deepagents/src/llm/provider.rs` - LLMProvider trait
+7. `rust-research-agent/rig-deepagents/src/pregel/runtime.rs` - Pregel + Checkpointing
+8. `rust-research-agent/rig-deepagents/src/compat/rig_agent_adapter.rs` - LLM integration
+9. `rust-research-agent/rig-deepagents/src/workflow/compiled.rs` - Workflow compilation
 
-**Documentation:**
-11. `DeepAgents_Technical_Guide.md` - Python DeepAgents reference (Korean)
-12. `docs/plans/2026-01-02-rig-deepagents-pregel-design.md` - Rust Pregel design
+## Critical Patterns
+
+### SubAgent Creation
+
+Always use factory functions, never instantiate directly:
+```python
+# Correct
+researcher_subagent = get_researcher_subagent()
+
+# Wrong - bypasses middleware setup
+researcher = create_researcher_agent()
+```
+
+### File Path Routing
+
+Paths starting with "/" persist to `research_workspace/`, others are in-memory:
+```python
+write_file("/reports/summary.md", content) # Persists
+write_file("temp/scratch.txt", content) # Ephemeral
+```
+
+### Reflection Loop
+
+Always use `think_tool()` between web searches - explicit reflection is required:
+```
+Search → think_tool() → Decide → Search → think_tool() → Synthesize
+```
 
 ## Tech Stack
 
diff --git a/context_engineering_research_agent/AGENTS.md b/context_engineering_research_agent/AGENTS.md
new file mode 100644
index 0000000..f2bb188
--- /dev/null
+++ b/context_engineering_research_agent/AGENTS.md
@@ -0,0 +1,111 @@
+# AGENTS.md - Context Engineering Research Agent
+
+> **Component**: `context_engineering_research_agent/`
+> **Type**: Extended DeepAgent with Context Strategies
+> **Role**: Experimental platform for 5 Context Engineering patterns
+
+---
+
+## 1. Module Purpose
+
+This module extends the base research agent with explicit **Context Engineering** strategies. It serves as a research testbed for optimizing LLM context window usage.
+
+---
+
+## 2. The 5 Context Engineering Strategies
+
+| Strategy | Implementation | Trigger |
+|----------|----------------|---------|
+| **Offloading** | `context_strategies/offloading.py` | Tool result > 20,000 tokens |
+| **Reduction** | `context_strategies/reduction.py` | Context usage > 85% |
+| **Retrieval** | grep/glob/read_file tools | Always available |
+| **Isolation** | SubAgent `task()` tool | Complex subtasks |
+| **Caching** | `context_strategies/caching.py` | Anthropic provider |
+
+---
+
+## 3. Key Files
+
+| File | Purpose |
+|------|---------|
+| `agent.py` | Main factory: `create_context_aware_agent()` |
+| `context_strategies/__init__.py` | Re-exports all strategy classes |
+| `context_strategies/offloading.py` | `ContextOffloadingStrategy` middleware |
+| `context_strategies/reduction.py` | `ContextReductionStrategy` middleware |
+| `context_strategies/caching.py` | `ContextCachingStrategy` + provider detection |
+| `context_strategies/caching_telemetry.py` | `PromptCachingTelemetryMiddleware` |
+| `context_strategies/isolation.py` | State isolation utilities for SubAgents |
+| `context_strategies/retrieval.py` | Selective file loading patterns |
+| `backends/docker_sandbox.py` | Sandboxed execution backend |
+| `backends/pyodide_sandbox.py` | Browser-based Python sandbox |
+
+---
+
+## 4. Agent Factory Pattern
+
+```python
+# Simple usage (defaults)
+agent = get_agent()
+
+# Customized configuration
+agent = create_context_aware_agent(
+    model="anthropic/claude-sonnet-4",
+    enable_offloading=True,
+    enable_reduction=True,
+    enable_caching=True,
+    offloading_token_limit=20000,
+    reduction_threshold=0.85,
+)
+```
+
+---
+
+## 5. Multi-Provider Support
+
+Provider detection is automatic via `detect_provider(model)`:
+
+| Provider | Features |
+|----------|----------|
+| Anthropic | Full cache_control markers |
+| OpenAI | Standard caching |
+| OpenRouter | Pass `openrouter_model_name` for specific routing |
+| Gemini | Standard caching |
+
+---
+
+## 6. Middleware Stack Order
+
+Middlewares execute in registration order. The recommended stack:
+
+```python
+middleware=[
+    ContextOffloadingStrategy, # 1. Evict large results FIRST
+    ContextReductionStrategy, # 2. Compress if still too large
+    ContextCachingStrategy, # 3. Mark cacheable sections
+    PromptCachingTelemetryMiddleware, # 4. Collect metrics
+]
+```
+
+**Order matters:** Offloading before reduction prevents unnecessary summarization.
+
+---
+
+## 7. Sandbox Backends
+
+For secure code execution:
+
+| Backend | Environment | Isolation Level |
+|---------|-------------|-----------------|
+| `DockerSandbox` | Container | High (network isolated) |
+| `PyodideSandbox` | WASM | Medium (browser-like) |
+| `DockerSession` | Persistent container | High + state persistence |
+
+---
+
+## 8. Testing
+
+Tests are in `tests/context_engineering/`:
+- `test_caching.py` - Cache strategy unit tests
+- `test_offloading.py` - Eviction threshold tests
+- `test_reduction.py` - Summarization trigger tests
+- `test_integration.py` - Full agent integration tests
diff --git a/deep-agents-ui/AGENTS.md b/deep-agents-ui/AGENTS.md
new file mode 100644
index 0000000..6fd1315
--- /dev/null
+++ b/deep-agents-ui/AGENTS.md
@@ -0,0 +1,118 @@
+# AGENTS.md - Deep Agents UI
+
+> **Component**: `deep-agents-ui/`
+> **Type**: Next.js 16 React Frontend
+> **Role**: Chat interface for LangGraph DeepAgents
+
+---
+
+## 1. Module Purpose
+
+React-based chat UI that connects to a LangGraph backend. Displays agent messages, tool calls, SubAgent activity, tasks/files sidebar, and handles human-in-the-loop interrupts.
+
+---
+
+## 2. Quick Start
+
+```bash
+yarn install
+yarn dev # localhost:3000
+```
+
+Configure via Settings dialog:
+- **Deployment URL**: `http://127.0.0.1:2024` (LangGraph dev server)
+- **Assistant ID**: `research` (or UUID)
+
+---
+
+## 3. Directory Structure
+
+```
+src/
+  app/
+    page.tsx                      # Main entry, config handling
+    layout.tsx                    # Root layout with providers
+    components/
+      ChatInterface.tsx           # Message input/display area
+      ChatMessage.tsx             # Individual message rendering
+      ToolCallBox.tsx             # Tool invocation display
+      SubAgentIndicator.tsx       # Active SubAgent status
+      TasksFilesSidebar.tsx       # TODO list + file tree
+      ThreadList.tsx              # Conversation history
+      ToolApprovalInterrupt.tsx   # HITL approval UI
+      ConfigDialog.tsx            # Settings modal
+      FileViewDialog.tsx          # File content viewer
+      MarkdownContent.tsx         # Markdown renderer
+
+  components/ui/                  # Radix UI primitives (shadcn)
+  providers/
+    ChatProvider.tsx              # Chat state context
+    ClientProvider.tsx            # LangGraph SDK client
+  lib/
+    config.ts                     # LocalStorage config persistence
+```
+
+---
+
+## 4. Key Components
+
+| Component | Function |
+|-----------|----------|
+| `ChatProvider` | Manages message state, streaming, thread lifecycle |
+| `ClientProvider` | Wraps `@langchain/langgraph-sdk` client |
+| `ChatInterface` | Main chat view with input area |
+| `ToolCallBox` | Renders tool name, args, result with syntax highlighting |
+| `SubAgentIndicator` | Shows which SubAgent is currently active |
+| `ToolApprovalInterrupt` | Human-in-the-loop approval/rejection UI |
+
+---
+
+## 5. State Management
+
+| State | Location | Persistence |
+|-------|----------|-------------|
+| Config | `lib/config.ts` | LocalStorage |
+| Thread ID | URL query param `?threadId=` | URL |
+| Sidebar | URL query param `?sidebar=` | URL |
+| Messages | `ChatProvider` context | Server (LangGraph) |
+
+---
+
+## 6. Styling
+
+- **TailwindCSS** with custom theme
+- **shadcn/ui** components (Radix primitives)
+- **Dark mode** via CSS variables
+
+---
+
+## 7. Development Commands
+
+```bash
+yarn dev # Start dev server (port 3000)
+yarn build # Production build
+yarn lint # ESLint
+yarn format # Prettier
+```
+
+---
+
+## 8. Backend Connection
+
+The UI connects to LangGraph API endpoints:
+- `POST /threads` - Create thread
+- `POST /threads/{id}/runs` - Stream messages
+- `GET /assistants/{id}` - Fetch assistant config
+
+Configured via `ClientProvider` with deployment URL and optional LangSmith API key.
+
+---
+
+## 9. Extension Points
+
+| Task | Where to Modify |
+|------|-----------------|
+| Add new message type | `ChatMessage.tsx` + type in `ChatProvider` |
+| Custom tool rendering | `ToolCallBox.tsx` |
+| New sidebar panel | `TasksFilesSidebar.tsx` |
+| Theme customization | `tailwind.config.js` + `globals.css` |
diff --git a/research_agent/AGENTS.md b/research_agent/AGENTS.md
new file mode 100644
index 0000000..a5f6d05
--- /dev/null
+++ b/research_agent/AGENTS.md
@@ -0,0 +1,97 @@
+# AGENTS.md - Research Agent Module
+
+> **Component**: `research_agent/`
+> **Type**: Python DeepAgent Orchestrator
+> **Role**: Multi-SubAgent Research System with Skills Integration
+
+---
+
+## 1. Module Purpose
+
+This module implements the main research orchestrator using LangChain's DeepAgents framework. It coordinates three specialized SubAgents and integrates a project-level skills system.
+
+---
+
+## 2. Architecture: Three-Tier SubAgent System
+
+```
+Orchestrator (agent.py)
+    |
+    +-- researcher (CompiledSubAgent)
+    |       Autonomous, self-planning DeepAgent
+    |       "Breadth-first, then depth" research pattern
+    |
+    +-- explorer (Simple SubAgent)
+    |       Fast read-only filesystem exploration
+    |
+    +-- synthesizer (Simple SubAgent)
+            Multi-source result integration
+```
+
+### SubAgent Types
+
+| Type | Definition | Execution | Use Case |
+|------|------------|-----------|----------|
+| CompiledSubAgent | `{"runnable": CompiledStateGraph}` | Multi-turn autonomous | Complex research |
+| Simple SubAgent | `{"system_prompt": str}` | Single response | Quick tasks |
+
+---
+
+## 3. Key Files
+
+| File | Purpose |
+|------|---------|
+| `agent.py` | Orchestrator assembly: model, backend, SubAgents, middleware |
+| `prompts.py` | Orchestrator prompts: workflow, delegation, explorer, synthesizer |
+| `tools.py` | `tavily_search()`, `think_tool()` implementations |
+| `researcher/agent.py` | `get_researcher_subagent()` factory for CompiledSubAgent |
+| `researcher/prompts.py` | Three-phase autonomous workflow (Exploratory -> Directed -> Synthesis) |
+| `researcher/depth.py` | Research depth configuration (shallow/medium/deep) |
+| `researcher/ralph_loop.py` | Iterative refinement loop pattern |
+| `skills/middleware.py` | SkillsMiddleware with Progressive Disclosure |
+
+---
+
+## 4. Backend Configuration
+
+The module uses a **CompositeBackend** pattern:
+
+```python
+CompositeBackend(
+    default=StateBackend(rt), # In-memory (temporary files)
+    routes={"/": fs_backend} # "/" paths -> research_workspace/
+)
+```
+
+- Paths starting with "/" persist to `research_workspace/`
+- Other paths use ephemeral in-memory state
+
+---
+
+## 5. Skills System
+
+Skills are loaded from `PROJECT_ROOT/skills/` via `SkillsMiddleware`.
+
+**Progressive Disclosure Pattern:**
+1. Session start: Only skill metadata injected
+2. Agent request: Full SKILL.md content loaded on-demand
+3. Token efficiency: ~90% reduction in initial context
+
+---
+
+## 6. Anti-Patterns
+
+- **DO NOT** directly instantiate researcher - use `get_researcher_subagent()`
+- **DO NOT** skip `think_tool()` between searches - explicit reflection required
+- **DO NOT** modify `backend_factory` signature - middleware depends on it
+
+---
+
+## 7. Extension Points
+
+| Task | Where to Modify |
+|------|-----------------|
+| Add new SubAgent | Define in `agent.py`, add to `SIMPLE_SUBAGENTS` or `ALL_SUBAGENTS` |
+| New research tool | Add to `tools.py`, include in `create_deep_agent(tools=[...])` |
+| Custom middleware | Create in `skills/`, add to middleware list in `agent.py` |
+| Modify researcher behavior | Edit `researcher/prompts.py` |
diff --git a/tests/AGENTS.md b/tests/AGENTS.md
new file mode 100644
index 0000000..133d686
--- /dev/null
+++ b/tests/AGENTS.md
@@ -0,0 +1,79 @@
+# AGENTS.md - Test Suite
+
+> **Component**: `tests/`
+> **Type**: pytest Test Suite
+> **Role**: Unit and integration tests for all Python modules
+
+---
+
+## 1. Test Organization
+
+```
+tests/
+  context_engineering/          # Context strategy tests
+    test_caching.py             # Cache control marker tests
+    test_offloading.py          # Token eviction tests
+    test_reduction.py           # Summarization trigger tests
+    test_isolation.py           # SubAgent state isolation
+    test_retrieval.py           # Selective file loading
+    test_integration.py         # Full agent integration
+    test_openrouter_models.py   # Multi-provider tests
+
+  researcher/                   # Research agent tests
+    test_depth.py               # Research depth configuration
+    test_ralph_loop.py          # Iterative refinement loop
+    test_runner.py              # Agent runner tests
+    test_tools.py               # Tool unit tests
+    test_integration.py         # End-to-end research flow
+
+  backends/                     # Backend implementation tests
+    test_docker_sandbox_integration.py
+  conftest.py                   # Shared fixtures
+```
+
+---
+
+## 2. Running Tests
+
+```bash
+# All tests
+uv run pytest tests/
+
+# Specific module
+uv run pytest tests/context_engineering/
+
+# Single test file
+uv run pytest tests/researcher/test_depth.py
+
+# With coverage
+uv run pytest --cov=research_agent tests/
+```
+
+---
+
+## 3. Key Fixtures
+
+Located in `conftest.py` files:
+- `mock_model` - Mocked LLM for unit tests
+- `temp_workspace` - Temporary filesystem backend
+- `docker_sandbox` - Docker container fixture (integration)
+
+---
+
+## 4. Test Categories
+
+| Category | Marker | Speed |
+|----------|--------|-------|
+| Unit | (default) | Fast |
+| Integration | `@pytest.mark.integration` | Slow |
+| Docker | `@pytest.mark.docker` | Requires Docker |
+| LLM | `@pytest.mark.llm` | Requires API keys |
+
+---
+
+## 5. Environment Variables
+
+For integration tests requiring real APIs:
+- `OPENAI_API_KEY` - OpenAI tests
+- `TAVILY_API_KEY` - Search tool tests
+- `ANTHROPIC_API_KEY` - Caching tests with Claude
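
Reviewer note (not part of the patch): the environment-variable list that closes `tests/AGENTS.md` could be paired with a small helper that reports which keys a test group is missing, so API-dependent tests can be skipped rather than fail. The sketch below is a minimal illustration under assumptions: `REQUIRED_KEYS`, the group names, and `missing_keys` are hypothetical and do not exist in the repository; only the three variable names come from the patch.

```python
from __future__ import annotations

import os
from typing import Mapping

# Hypothetical grouping: variable names come from tests/AGENTS.md,
# the group labels are assumptions made for illustration only.
REQUIRED_KEYS: dict[str, tuple[str, ...]] = {
    "openai": ("OPENAI_API_KEY",),
    "search": ("TAVILY_API_KEY",),
    "caching": ("ANTHROPIC_API_KEY",),
}


def missing_keys(group: str, env: Mapping[str, str] | None = None) -> list[str]:
    """Return the variables required by `group` that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS[group] if not source.get(name)]
```

A fixture or collection hook could call `missing_keys("caching")` and skip tests when the returned list is non-empty; how that would be wired into the existing `conftest.py` files is left open here.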