Compare multiple ML experiment runs side-by-side to identify the best configuration.
Steps
- Load experiment records from the tracking store.
- Select the experiments to compare.
- Build a comparison table of parameters and metrics.
- Analyze parameter sensitivity.
- Generate visualizations.
- Identify the winning configuration.
- Recommend next experiments to try.
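The steps above can be sketched as a small comparison helper. This is a minimal sketch, assuming experiment records are dicts with `name`, `params`, and `metrics` keys (a hypothetical schema; adapt it to your tracking store) and that a single target metric is compared across runs:

```python
# Minimal sketch of the comparison steps above. The record schema
# ("name"/"params"/"metrics") and the metric name are assumptions.
import pandas as pd

def compare_experiments(runs, metric="val_accuracy"):
    # Flatten each run into one row: name, hyperparameters, target metric.
    rows = [
        {"name": r["name"], **r["params"], metric: r["metrics"][metric]}
        for r in runs
    ]
    # Comparison table, best run first.
    table = pd.DataFrame(rows).sort_values(metric, ascending=False)

    # Parameter sensitivity: correlation of each numeric parameter
    # with the target metric, strongest effect first.
    numeric = table.select_dtypes("number").drop(columns=[metric])
    sensitivity = numeric.corrwith(table[metric]).sort_values(
        key=abs, ascending=False
    )

    best = table.iloc[0]["name"]
    return table, sensitivity, best

runs = [
    {"name": "run-a", "params": {"lr": 0.1, "batch_size": 32},
     "metrics": {"val_accuracy": 0.81}},
    {"name": "run-b", "params": {"lr": 0.01, "batch_size": 32},
     "metrics": {"val_accuracy": 0.88}},
    {"name": "run-c", "params": {"lr": 0.001, "batch_size": 64},
     "metrics": {"val_accuracy": 0.85}},
]
table, sensitivity, best = compare_experiments(runs)
```

Correlation-based sensitivity is only a first pass; with a handful of runs it flags candidates for follow-up experiments rather than proving causation.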
Format
Comparison: <N> experiments
Best Run: <experiment name>
Key Findings:
- <parameter X> has <impact> on <metric Y>
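A small renderer for the format above, as a sketch (the function name and argument names are assumptions, not part of the spec):

```python
# Render the comparison summary in the format above.
def render_summary(n, best_run, findings):
    lines = [
        f"Comparison: {n} experiments",
        f"Best Run: {best_run}",
        "Key Findings:",
    ]
    # One bullet per finding, e.g. "lr has a strong effect on val_accuracy".
    lines += [f"- {finding}" for finding in findings]
    return "\n".join(lines)

summary = render_summary(3, "run-b", ["lr has a strong effect on val_accuracy"])
```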
Rules
- Only compare experiments with the same dataset version.
- Use consistent metrics across all compared runs.
- Statistical significance matters; do not draw conclusions from single runs.
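The dataset-version rule can be enforced before any comparison runs. A minimal sketch, assuming each record carries a `dataset_version` field (a hypothetical field name):

```python
# Keep only runs recorded against the same dataset version,
# per the rule above. The "dataset_version" field is an assumption.
def filter_comparable(runs, dataset_version):
    return [r for r in runs if r.get("dataset_version") == dataset_version]

runs = [
    {"name": "run-a", "dataset_version": "v2"},
    {"name": "run-b", "dataset_version": "v1"},
    {"name": "run-c", "dataset_version": "v2"},
]
comparable = filter_comparable(runs, "v2")
```

Runs missing the field are excluded as well, which errs on the side of not comparing across unknown data.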