Compare multiple ML experiment runs side-by-side to identify the best configuration.
Steps
- Load experiment records from the tracking store.
- Select the experiments to compare.
- Build a comparison table of parameters and metrics.
- Analyze parameter sensitivity.
- Generate visualizations.
- Identify the winning configuration.
- Recommend next experiments to try.
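The steps above can be sketched as a small comparison helper. This is a minimal sketch, assuming experiment records are dicts with `name`, `params`, and `metrics` keys (a hypothetical schema; adapt it to your tracking store) and that a single target metric is compared across runs:

```python
# Minimal sketch of the comparison steps above. The record schema
# ("name"/"params"/"metrics") and the metric name are assumptions.
import pandas as pd

def compare_experiments(runs, metric="val_accuracy"):
    # Flatten each run into one row: name, hyperparameters, target metric.
    rows = [
        {"name": r["name"], **r["params"], metric: r["metrics"][metric]}
        for r in runs
    ]
    # Comparison table, best run first.
    table = pd.DataFrame(rows).sort_values(metric, ascending=False)

    # Parameter sensitivity: correlation of each numeric parameter
    # with the target metric, strongest effect first.
    numeric = table.select_dtypes("number").drop(columns=[metric])
    sensitivity = numeric.corrwith(table[metric]).sort_values(
        key=abs, ascending=False
    )

    best = table.iloc[0]["name"]
    return table, sensitivity, best

runs = [
    {"name": "run-a", "params": {"lr": 0.1, "batch_size": 32},
     "metrics": {"val_accuracy": 0.81}},
    {"name": "run-b", "params": {"lr": 0.01, "batch_size": 32},
     "metrics": {"val_accuracy": 0.88}},
    {"name": "run-c", "params": {"lr": 0.001, "batch_size": 64},
     "metrics": {"val_accuracy": 0.85}},
]
table, sensitivity, best = compare_experiments(runs)
```

Correlation-based sensitivity is only a first pass; with a handful of runs it flags candidates for follow-up experiments rather than proving causation.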
Format
Comparison: <N> experiments
Best Run: <experiment name>
Key Findings:
- <parameter X> has <impact> on <metric Y>
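A small renderer for the format above, as a sketch (the function name and argument names are assumptions, not part of the spec):

```python
# Render the comparison summary in the format above.
def render_summary(n, best_run, findings):
    lines = [
        f"Comparison: {n} experiments",
        f"Best Run: {best_run}",
        "Key Findings:",
    ]
    # One bullet per finding, e.g. "lr has a strong effect on val_accuracy".
    lines += [f"- {finding}" for finding in findings]
    return "\n".join(lines)

summary = render_summary(3, "run-b", ["lr has a strong effect on val_accuracy"])
```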
Rules
- Only compare experiments with the same dataset version.
- Use consistent metrics across all compared runs.
- Statistical significance matters; do not draw conclusions from single runs.
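The dataset-version rule can be enforced before any comparison runs. A minimal sketch, assuming each record carries a `dataset_version` field (a hypothetical field name):

```python
# Keep only runs recorded against the same dataset version,
# per the rule above. The "dataset_version" field is an assumption.
def filter_comparable(runs, dataset_version):
    return [r for r in runs if r.get("dataset_version") == dataset_version]

runs = [
    {"name": "run-a", "dataset_version": "v2"},
    {"name": "run-b", "dataset_version": "v1"},
    {"name": "run-c", "dataset_version": "v2"},
]
comparable = filter_comparable(runs, "v2")
```

Runs missing the field are excluded as well, which errs on the side of not comparing across unknown data.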