* fix: fill implementation gaps across core modules - Replace ConfidenceChecker placeholder methods with real implementations that search the codebase for duplicates, verify architecture docs exist, check research references, and validate root cause specificity - Fix intelligent_execute() error capture: collect actual errors from failed tasks instead of hardcoded None, format tracebacks as strings, and fix variable shadowing bug where loop var overwrote task parameter - Implement ReflexionPattern mindbase integration via HTTP API with graceful fallback when service is unavailable - Fix .gitignore: remove duplicate entries, add explicit !-rules for .claude/settings.json and .claude/skills/, remove Tests/ ignore - Remove unnecessary sys.path hack in cli/main.py - Fix FailureEntry.from_dict to not mutate input dict - Add comprehensive execution module tests: 62 new tests covering ParallelExecutor, ReflectionEngine, SelfCorrectionEngine, and the intelligent_execute orchestrator (136 total, all passing) https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * chore: include test-generated reflexion artifacts https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * fix: address 5 open GitHub issues (#536, #537, #531, #517, #534) Security fixes: - #536: Remove shell=True and user-controlled $SHELL from _run_command() to prevent arbitrary code execution. Use direct list-based subprocess.run without passing full os.environ to child processes. - #537: Add SHA-256 integrity verification for downloaded docker-compose and mcp-config files. Downloads are deleted on hash mismatch. Gateway config supports pinned hashes via docker_compose_sha256/mcp_config_sha256. Bug fixes: - #531: Add agent file installation to `superclaude install` and `update` commands. 20 agent markdown files are now copied to ~/.claude/agents/ alongside command installation. - #517: Fix MCP env var flag from --env to -e for API key passthrough, matching the Claude CLI's expected format. Usability: - #534: Replace Japanese trigger phrases and report labels in pm-agent.md and pm.md (both src/ and plugins/) with English equivalents for international accessibility. https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * docs: align documentation with Claude Code and fix version/count gaps - Update CLAUDE.md project structure to include agents/ (20 agents), modes/ (7 modes), commands/ (30 commands), skills/, hooks/, mcp/, and core/ directories. Add Claude Code integration points section. - Fix version references: 4.1.5 -> 4.2.0 in installation.md, quick-start.md, and package.json (was 4.1.7) - Fix feature counts across all docs: - Commands: 21 -> 30 - Agents: 14/16 -> 20 - Modes: 6 -> 7 - MCP Servers: 6 -> 8 - Update README.md agent count from 16 to 20 - Add docs/user-guide/claude-code-integration.md explaining how SuperClaude maps to Claude Code's native features (commands, agents, hooks, skills, settings, MCP servers, pytest plugin) https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * chore: update test-generated reflexion log https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * docs: comprehensive Claude Code gap analysis and integration guide - Rewrite docs/user-guide/claude-code-integration.md with full feature mapping: all 28 hook events, skills system with YAML frontmatter, 5 settings scopes, permission rules, plan mode, extended thinking, agent teams, voice, desktop features, and session management. Includes detailed gap table showing where SuperClaude under-uses Claude Code capabilities (skills migration, hooks integration, plan mode, settings profiles). - Add Claude Code native features section to CLAUDE.md with extension points we use vs should use more (hooks, skills, plan mode, settings) - Add Claude Code integration gap analysis to KNOWLEDGE.md with prioritized action items for skills migration, hooks leverage, plan mode integration, and settings profiles https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * chore: update test-generated reflexion log https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * chore: bump version to 4.3.0 Bump version across all 15 files: - VERSION, pyproject.toml, package.json - src/superclaude/__init__.py, src/superclaude/__version__.py - CLAUDE.md, PLANNING.md, TASK.md, CHANGELOG.md - README.md, README-zh.md, README-ja.md, README-kr.md - docs/getting-started/installation.md, quick-start.md - docs/Development/pm-agent-integration.md Also fixes __version__.py which was out of sync at 0.4.0. Adds comprehensive CHANGELOG entry for v4.3.0. https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 * i18n: replace all Japanese/Chinese text with English in source files Replace CJK text with English across all non-translation files: - src/superclaude/commands/pm.md: 38 Japanese strings in PDCA cycle, error handling patterns, anti-patterns, document templates - src/superclaude/agents/pm-agent.md: 20 Japanese strings in PDCA phases, self-evaluation, documentation sections - plugins/superclaude/: synced from src/ copies - .github/workflows/readme-quality-check.yml: all Chinese comments, table headers, report strings, and PR comment text - .github/workflows/pull-sync-framework.yml: Japanese comment - .github/PULL_REQUEST_TEMPLATE.md: complete rewrite from Japanese Translation files (README-ja.md, docs/user-guide-jp/, etc.) are intentionally kept in their respective languages. https://claude.ai/code/session_01AnGJMAA6Qp2j9WKKHHZfB9 --------- Co-authored-by: Claude <noreply@anthropic.com>
139 lines
4.7 KiB
Python
139 lines
4.7 KiB
Python
"""
|
|
Integration tests for the execution engine orchestrator
|
|
|
|
Tests intelligent_execute, quick_execute, and safe_execute functions
|
|
that combine reflection, parallel execution, and self-correction.
|
|
"""
|
|
|
|
import pytest
|
|
|
|
from superclaude.execution import intelligent_execute, quick_execute, safe_execute
|
|
|
|
|
|
class TestQuickExecute:
|
|
"""Test quick_execute convenience function"""
|
|
|
|
def test_quick_execute_simple_ops(self):
|
|
"""Quick execute should run simple operations and return results"""
|
|
results = quick_execute([
|
|
lambda: "result_a",
|
|
lambda: "result_b",
|
|
lambda: 42,
|
|
])
|
|
|
|
assert results == ["result_a", "result_b", 42]
|
|
|
|
def test_quick_execute_empty(self):
|
|
"""Quick execute with no operations should return empty list"""
|
|
results = quick_execute([])
|
|
assert results == []
|
|
|
|
def test_quick_execute_single(self):
|
|
"""Quick execute with single operation"""
|
|
results = quick_execute([lambda: "only"])
|
|
assert results == ["only"]
|
|
|
|
|
|
class TestIntelligentExecute:
|
|
"""Test the intelligent_execute orchestrator"""
|
|
|
|
def test_execute_with_clear_task(self, tmp_path):
|
|
"""Clear task with simple operations should succeed"""
|
|
# Create PROJECT_INDEX.md so context check passes
|
|
(tmp_path / "PROJECT_INDEX.md").write_text("# Index")
|
|
(tmp_path / "docs" / "memory").mkdir(parents=True, exist_ok=True)
|
|
|
|
result = intelligent_execute(
|
|
task="Create a new function called validate_email in validators.py",
|
|
operations=[lambda: "validated"],
|
|
context={
|
|
"project_index": "loaded",
|
|
"current_branch": "main",
|
|
"git_status": "clean",
|
|
},
|
|
repo_path=tmp_path,
|
|
)
|
|
|
|
assert result["status"] in ("success", "blocked")
|
|
assert "confidence" in result
|
|
|
|
def test_execute_blocked_by_low_confidence(self, tmp_path):
|
|
"""Vague task should be blocked by reflection engine"""
|
|
(tmp_path / "docs" / "memory").mkdir(parents=True, exist_ok=True)
|
|
|
|
result = intelligent_execute(
|
|
task="fix",
|
|
operations=[lambda: "done"],
|
|
repo_path=tmp_path,
|
|
)
|
|
|
|
# Very short vague task may get blocked
|
|
assert result["status"] in ("blocked", "success", "partial_failure")
|
|
assert "confidence" in result
|
|
|
|
def test_execute_with_failing_operation(self, tmp_path):
|
|
"""Failing operation should trigger self-correction"""
|
|
(tmp_path / "PROJECT_INDEX.md").write_text("# Index")
|
|
(tmp_path / "docs" / "memory").mkdir(parents=True, exist_ok=True)
|
|
|
|
def failing():
|
|
raise ValueError("Test failure")
|
|
|
|
result = intelligent_execute(
|
|
task="Create validation endpoint in api/validate.py",
|
|
operations=[lambda: "ok", failing],
|
|
context={
|
|
"project_index": "loaded",
|
|
"current_branch": "main",
|
|
"git_status": "clean",
|
|
},
|
|
repo_path=tmp_path,
|
|
auto_correct=True,
|
|
)
|
|
|
|
assert result["status"] in ("partial_failure", "blocked", "failed")
|
|
|
|
def test_execute_no_auto_correct(self, tmp_path):
|
|
"""Disabling auto_correct should skip self-correction phase"""
|
|
(tmp_path / "PROJECT_INDEX.md").write_text("# Index")
|
|
(tmp_path / "docs" / "memory").mkdir(parents=True, exist_ok=True)
|
|
|
|
result = intelligent_execute(
|
|
task="Create helper function in utils.py for date formatting",
|
|
operations=[lambda: "done"],
|
|
context={
|
|
"project_index": "loaded",
|
|
"current_branch": "main",
|
|
"git_status": "clean",
|
|
},
|
|
repo_path=tmp_path,
|
|
auto_correct=False,
|
|
)
|
|
|
|
assert result["status"] in ("success", "blocked")
|
|
|
|
|
|
class TestSafeExecute:
|
|
"""Test safe_execute convenience function"""
|
|
|
|
def test_safe_execute_success(self, tmp_path):
|
|
"""Safe execute should return result on success"""
|
|
(tmp_path / "PROJECT_INDEX.md").write_text("# Index")
|
|
(tmp_path / "docs" / "memory").mkdir(parents=True, exist_ok=True)
|
|
|
|
try:
|
|
result = safe_execute(
|
|
task="Create user validation function in validators.py",
|
|
operation=lambda: "validated",
|
|
context={
|
|
"project_index": "loaded",
|
|
"current_branch": "main",
|
|
"git_status": "clean",
|
|
},
|
|
)
|
|
# If it proceeds, should get result
|
|
assert result is not None
|
|
except RuntimeError:
|
|
# If blocked by low confidence, that's also valid
|
|
pass
|