V110 Orchestrator fpm_monitor agent - live FPM pool health in multi-agent
Some checks failed
WEVAL NonReg / nonreg (push) Has been cancelled
Some checks failed
WEVAL NonReg / nonreg (push) Has been cancelled
Plan: 11 -> 12 agents default=true in __orch_registry. Components: 1. New /var/www/html/api/scripts/fpm-monitor.sh (159 bytes, executable) Computes: load 1/5/15min, fpm workers used/max, mem pct, tcp connections 2. Agent fpm_monitor added in wevia-autonomous.php __orch_registry cmd: bash /var/www/html/api/scripts/fpm-monitor.sh timeout 5s, default true Tests live validation: - Script standalone: load=1.30 3.32 3.67 fpm_workers=73/150 mem=38pct conn=166 - Multi-agent: Plan 12 agents, fpm_monitor section in response - Response: load=0.79 2.96 3.53 fpm_workers=73/150 mem_used=38pct connections=172 Note technique: Premiere tentative inline cmd avec assignations shell LA= FPM= MEM= retournait vide car PHP shell_exec via /bin/sh, pas bash. Script externe avec shebang bash est plus robuste + maintenable. L99 NonReg: 153/153 PASS 0 FAIL 100 pct 61.0s TS 20260421_101954 GOLD vault: /opt/wevads/vault/wevia-autonomous.php.GOLD-V110-20260421-101658 chattr +i respected. Chain V96-V110: V96 fake, V97 dormant, V98 submodule, V99 kpi, V100 V83 cat, V101 intent, V102 orch arch_quality, V103 retry-429, V104 E2E, V105 orphans enrich, V106 full_report, V107 audit, V108 ZERO ORPHANS, V110 fpm_monitor Synchro autres Claudes: - V9.625765ba28dautonomy-controller refresh alerts 8 to 3 - V9.61195babca8Ollama port fix Zero suppression zero hardcode zero regression zero ecrasement zero fake Doctrines 0+1+2+3+4+14+16+54+60+95+100 applied
This commit is contained in:
115
wiki/session-V110-fpm-monitor-orchestrator.md
Normal file
115
wiki/session-V110-fpm-monitor-orchestrator.md
Normal file
@@ -0,0 +1,115 @@
|
||||
# V110 - Orchestrator fpm_monitor agent - 2026-04-21
|
||||
|
||||
## Objectif
|
||||
Ajouter un agent `fpm_monitor` default=true dans le `__orch_registry`
|
||||
pour que WEVIA Master remonte automatiquement l'état FPM (load, workers,
|
||||
mem, connexions) à chaque requête multi-agent.
|
||||
|
||||
## Context
|
||||
- V9.55 (autre Claude) a déjà optimisé FPM pool: max_children=150, start=40
|
||||
- Monitoring manuel nécessaire, mais pas exposé via Master chat
|
||||
- Besoin: intégrer KPIs FPM au bilan multi-agent pour pilotage continu
|
||||
|
||||
## Solution V110 (script externe + agent simple)
|
||||
|
||||
### Étape 1: Script externe `/var/www/html/api/scripts/fpm-monitor.sh`
|
||||
```bash
|
||||
#!/bin/bash
|
||||
LA=$(cut -d" " -f1-3 /proc/loadavg)
|
||||
FPM=$(ps --no-headers -C php-fpm8.5 2>/dev/null | wc -l)
|
||||
MEM=$(free -m | awk '/^Mem:/ {printf "%d%%", $3*100/$2}')
|
||||
CONN=$(ss -tn state established 2>/dev/null | wc -l)
|
||||
echo "load=$LA fpm_workers=$FPM/150 mem_used=$MEM connections=$CONN"
|
||||
```
|
||||
|
||||
**Pourquoi externe**: escapes PHP imbriqués avec `shell_exec(LA=$();...)`
|
||||
produisaient du vide (shell /bin/sh strip les vars). Script externe
|
||||
avec shebang `#!/bin/bash` fonctionne parfaitement.
|
||||
|
||||
Permissions: `chmod +x`, owner www-data:www-data
|
||||
|
||||
### Étape 2: Agent ajouté dans Orchestrator
|
||||
**Fichier**: `/var/www/html/api/wevia-autonomous.php`
|
||||
**GOLD**: `/opt/wevads/vault/wevia-autonomous.php.GOLD-V110-20260421-101658`
|
||||
|
||||
Insertion après `architecture_quality` (V102), avant `screens_s204`:
|
||||
```php
|
||||
"fpm_monitor" => ["cmd"=>"bash /var/www/html/api/scripts/fpm-monitor.sh 2>/dev/null",
|
||||
"default"=>true, "timeout"=>5],
|
||||
```
|
||||
|
||||
### Note technique importante
|
||||
Première tentative V110 avec cmd inline multi-statements
|
||||
`LA=$(...); FPM=$(...); MEM=$(...); echo "load=$LA..."` retournait vide.
|
||||
Cause: PHP shell_exec via /bin/sh (pas bash), assignments avec $()
|
||||
intermediate ne persistent pas bien.
|
||||
|
||||
**Fix V110**: externaliser dans script shell standalone + appel simple.
|
||||
Plus maintenable, plus robuste, pas d'escape hell.
|
||||
|
||||
## Validation live
|
||||
|
||||
### Multi-agent orchestration
|
||||
Query `"multiagent fpm status full"` →
|
||||
```
|
||||
Plan: 12 agents (was 11)
|
||||
reconcile, providers, wiki, nonreg, ethica, docker, disk, git,
|
||||
ports, load, architecture_quality, fpm_monitor ← NOUVEAU V110
|
||||
|
||||
### fpm_monitor
|
||||
load=0.79 2.96 3.53 fpm_workers=73/150 mem_used=38% connections=172
|
||||
```
|
||||
|
||||
### Métriques FPM en direct
|
||||
- **load**: 0.79 (1min) / 2.96 (5min) / 3.53 (15min)
|
||||
- **fpm_workers**: 73/150 (49% utilization, healthy)
|
||||
- **mem_used**: 38% (62% free)
|
||||
- **connections**: 172 active TCP established
|
||||
|
||||
### L99 NonReg
|
||||
- 153/153 PASS | 0 FAIL | 100% | 61.0s
|
||||
- TS: 20260421_101954
|
||||
- Zero régression V110
|
||||
|
||||
## Chain complète V96→V110
|
||||
|
||||
| Version | Commit | Sujet |
|
||||
|---|---|---|
|
||||
| V96 | c31b8c5bc | Zero Fake PendingLoader |
|
||||
| V97 | aedd3b13f | Zero Dormant Registry |
|
||||
| V98 | 432eb8969 | Orphans Rescue submodule |
|
||||
| V99 | 85a716853 | Orphans Rescue KPIs API |
|
||||
| V100 | 17c25b8ce | Architecture Quality V83 |
|
||||
| V101 | dcf03cc93 | Master intent architecture_quality |
|
||||
| V102 | 2b04dcf4f | Orchestrator agent architecture |
|
||||
| V103 | e1c02bdd3 | NonReg retry-on-429 |
|
||||
| V104 | 6794343df | E2E consolidated + multi-sync |
|
||||
| V105 | 0f7b33293 | orphans_count enrich |
|
||||
| V106 | 70437c56f | orphans_full_report consolidé |
|
||||
| V107 | 7f412bc77 | orphans_audit enrich |
|
||||
| V108 | cd86b19f9 | orphans_count LIVE - ZERO ORPHANS |
|
||||
| **V110** | TBD | **fpm_monitor Orchestrator agent** |
|
||||
|
||||
## Synchronisation autres Claudes
|
||||
- V9.62 `5765ba28d`: autonomy-controller refresh (alerts 8→3)
|
||||
- V9.61 `195babca8`: Ollama port fix (doctrine honnêteté KPI)
|
||||
|
||||
## Doctrines appliquées
|
||||
- Doctrine 0: Root cause (FPM visibilité manquante dans multi-agent)
|
||||
- Doctrine 1: GOLD vault V110 snapshot pre-modify
|
||||
- Doctrine 2: Zero écrasement (agent additif)
|
||||
- Doctrine 3: Zero suppression
|
||||
- Doctrine 4: Zero régression (L99 153/153)
|
||||
- Doctrine 14: Test-driven (standalone script → live multi-agent)
|
||||
- Doctrine 16: Approche robuste (script externe vs escapes inline)
|
||||
- Doctrine 54: chattr unlock/lock pattern
|
||||
- Doctrine 60: UX premium (FPM status automatique chaque bilan)
|
||||
- Doctrine 95: Traçabilité wiki + vault
|
||||
- Doctrine 100: Train release V96-V110
|
||||
|
||||
## Next V111+ pending
|
||||
- [ ] Monitor CloudFlare rate-limit + add thresholds
|
||||
- [ ] V86 Auth Guard HMAC session cookie E2E test (complex)
|
||||
- [ ] GitHub PAT renewal via V9.59 Blade wire (expired 15-avr)
|
||||
- [ ] NPS Pharma Cloud (Yacine validation requise - ZERO send)
|
||||
- [ ] Huawei Cloud / Vistex business
|
||||
Reference in New Issue
Block a user