V124 - FPM Saturation Guard - detection + alerting (NO auto-restart) - 2026-04-21
Objective
Resolve the recurring V9.67 pattern (FPM false positive at 11:00 UTC) by adding a guard that detects and alerts without touching the pools (critical port = Yacine doctrine).
Identified gap
Existing watchdog (/opt/php-fpm-watchdog.sh, cron every 2 min):
#!/bin/bash
for ver in 8.4 8.5; do
systemctl is-active --quiet php${ver}-fpm || systemctl restart php${ver}-fpm
done
systemctl is-active --quiet apache2 || systemctl restart apache2
The watchdog is a binary UP/DOWN check only. It does NOT detect:
- Worker saturation (active / max_children approaching 100%)
- Recurring pressure patterns (e.g. 11:00 UTC, V9.67)
- 24h history for trend analysis
Multi-pool FPM reality on S204
5 active pools across 3 PHP versions:
| Pool | max_children | Usage |
|---|---|---|
| php7.4/www | 30 | legacy |
| php8.4/www-fast | 50 | fast endpoints |
| php8.4/www | 70 | standard 8.4 |
| php8.5/exec | 60 | exec intents |
| php8.5/www | 150 | principal |
Total capacity: 360 workers aggregated (30 + 50 + 70 + 60 + 150).
Solution V124: fpm-saturation-guard.sh
File: /var/www/html/api/scripts/fpm-saturation-guard.sh (2462 bytes)
Logic
- Sum max_children across all active pools → TOTAL_MAX (= 360)
- Count active workers via ps -ef → TOTAL_ACTIVE
- Focus on the main pool php8.5/www (principal traffic) → MAIN_ACTIVE / MAIN_MAX
- Calculate SAT_PCT = MAIN_ACTIVE * 100 / MAIN_MAX
- Classify status:
  - < 70%: healthy
  - 70% to < 85%: warn
  - ≥ 85%: SATURATED (logged to syslog)
- Append an entry to /tmp/fpm-saturation-history.json (rolling 24h = 288 entries at one run every 5 min)
- Exit 0 always (NO auto-restart)
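The percentage and classification steps above can be sketched as two small bash helpers. This is a minimal illustration, not the contents of the real script: the function names `sat_pct` and `classify` are hypothetical, and integer division is assumed because the sample output shows whole percentages.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative, not taken from the real guard script.

# Integer saturation percentage of a pool: active * 100 / max.
sat_pct() {
  local active=$1 max=$2
  # Guard against a missing or zero max_children to avoid division by zero.
  if [ "${max:-0}" -le 0 ]; then
    echo 0
    return
  fi
  echo $(( active * 100 / max ))
}

# Map a percentage to the three statuses listed above.
classify() {
  local pct=$1
  if [ "$pct" -ge 85 ]; then
    echo "SATURATED"
  elif [ "$pct" -ge 70 ]; then
    echo "warn"
  else
    echo "healthy"
  fi
}

pct=$(sat_pct 110 150)
echo "sat_pct=${pct} status=$(classify "$pct")"   # prints: sat_pct=73 status=warn
```

With the document's figures (110 active workers out of 150 on the main pool), the sketch yields 73% and `warn`, matching the sample output.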
Compact single-line output
sat_pct=73 main=110/150 total=137/360 load1=2.31 conn=78 status=warn ts=2026-04-21T12:36:33+02:00
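Because the line is plain space-separated key=value pairs, a consumer can pull out one field without a JSON parser. The sketch below is an assumption about how a reader of /var/log/fpm-saturation.log might do it; `guard_field` is a hypothetical helper name and is not part of the guard itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of KEY from a "k=v k=v ..." line.
guard_field() {
  local key=$1 line=$2 pair
  # Word splitting is intentional here; the guard's tokens contain no spaces
  # or glob characters.
  for pair in $line; do
    case $pair in
      "$key"=*)
        printf '%s\n' "${pair#*=}"
        return 0
        ;;
    esac
  done
  return 1   # key not present
}

line="sat_pct=73 main=110/150 total=137/360 load1=2.31 conn=78 status=warn ts=2026-04-21T12:36:33+02:00"
main=$(guard_field main "$line")
# Split "active/max" with parameter expansion.
echo "status=$(guard_field status "$line") main_active=${main%%/*} main_max=${main##*/}"
```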
History JSON structure
[{
"ts": 1776767793,
"iso": "2026-04-21T12:36:33+02:00",
"sat_pct": 73,
"main_active": 110,
"main_max": 150,
"total_active": 137,
"total_max": 360,
"load1": 2.31,
"conn": 78,
"status": "warn"
}]
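A rolling window over a JSON array like the one above could be maintained as follows. This is a sketch under stated assumptions: it relies on `jq` being installed, and the function name `append_history` is hypothetical. The source does not say how the real script writes its history.

```bash
#!/usr/bin/env bash
# Hypothetical helper: append one JSON object to an array file and keep only
# the newest MAX entries (default 288 = 24h at one entry every 5 minutes).
# Assumes jq is available; the real script may do this differently.
append_history() {
  local file=$1 entry=$2 max=${3:-288} tmp
  # Start from an empty array if the file is missing or empty.
  [ -s "$file" ] || echo '[]' > "$file"
  # Write to a sibling temp file so a failed jq run never truncates the history.
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if jq --argjson e "$entry" --argjson n "$max" \
        '. + [$e] | .[(0 - $n):]' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

A call such as `append_history /tmp/fpm-saturation-history.json '{"sat_pct":73,"status":"warn"}'` would add one entry and drop the oldest once the array exceeds 288 items.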
Setup Cron */5min
*/5 * * * * /var/www/html/api/scripts/fpm-saturation-guard.sh >> /var/log/fpm-saturation.log 2>&1
Orchestrator agent fpm_saturation
Added to /api/wevia-autonomous.php after kpi_unified (V118):
"fpm_saturation" => [
"cmd" => "bash /var/www/html/api/scripts/fpm-saturation-guard.sh 2>/dev/null | head -1",
"keywords" => ["fpm","saturation","workers","pool","charge","sature"],
"timeout" => 10
]
Orchestrator plan: 16 → 17 agents (+ V118 kpi_unified + V124 fpm_saturation).
Live validation via the multi-agent report
Query: "multiagent fpm saturation check" →
### fpm_saturation
sat_pct=73 main=110/150 total=137/360 load1=1.48 conn=87 status=warn ts=2026-04-21T12:37:11+02:00
Engine: Orchestrator/fpm_saturation + Orchestrator/fpm_monitor + Orchestrator/token_health in parallel.
Development: initial bug + fix
Initial phase: max_children = 30 (wrong pool conf picked up; the legacy php7.4/www pool matched first).
sat_pct=456 active=137 max=30 status=SATURATED ← WRONG
Root cause: grep | head -1 took the FIRST match in alphabetical order, not the main pool.
Fix V124-v2:
- Sum total max_children via a for loop over /etc/php/*/fpm/pool.d/*.conf
- Focus on the main pool /etc/php/8.5/fpm/pool.d/www.conf for the principal SAT_PCT
- Display the total for the global view + the main pool for the alert threshold
Fix validation phase:
sat_pct=73 main=110/150 total=137/360 load1=2.31 status=warn ← CORRECT
1 buggy entry removed from the history, 2 valid entries kept.
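The difference between the buggy and fixed approaches can be sketched in bash: read `pm.max_children` per file, sum over every pool file, and read the main pool by its explicit path instead of taking the first grep match. The function names are hypothetical, and the sketch assumes one uncommented `pm.max_children` line per pool file; the real script may parse differently.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the V124-v2 fix; not the real script.

# Print pm.max_children from ONE pool file (0 if absent). Lines starting
# with ';' are comments and do not match the pattern.
pool_max_children() {
  awk -F= '/^[ \t]*pm\.max_children[ \t]*=/ { gsub(/[ \t]/, "", $2); v = $2 }
           END { print v + 0 }' "$1"
}

# Sum pm.max_children over every pool file under ROOT (default /etc/php).
sum_max_children() {
  local root=${1:-/etc/php} total=0 conf
  for conf in "$root"/*/fpm/pool.d/*.conf; do
    [ -f "$conf" ] || continue   # glob matched nothing
    total=$(( total + $(pool_max_children "$conf") ))
  done
  echo "$total"
}
```

On the layout described above, `sum_max_children` would give the aggregate capacity, and `pool_max_children /etc/php/8.5/fpm/pool.d/www.conf` would give the main pool's limit used for SAT_PCT.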
L99 NonReg V124
153/153 PASS | 0 FAIL | 100% | 56.3s
TS: 20260421_123827
Yacine doctrine RESPECTED
"WE DO NOT OVERWRITE OR CHANGE ANY PAGE, MODULE OR PORT WITHOUT MY AUTHORIZATION"
V124 performs detection + alerting only. NO kill, NO systemctl restart, NO change to pool config. The guard observes and records. Corrective actions remain manual (Yacine decides).
If persistent saturation is observed, the alert shows up in:
- syslog (tag: fpm-saturation-guard)
- /var/log/fpm-saturation.log
- Trend history /tmp/fpm-saturation-history.json
- Multi-agent Master report (direct output)
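The syslog path can be illustrated with the standard `logger` utility. This sketch makes several assumptions: the function name `alert_if_saturated` is hypothetical, the `daemon.warning` priority is a guess (the source only gives the tag), and the real script may call `logger` differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a syslog line only when the status is SATURATED.
# It never restarts or kills anything, in line with the detection-only rule.
alert_if_saturated() {
  local status=$1 line=$2
  if [ "$status" = "SATURATED" ]; then
    # -t sets the syslog tag; -p sets facility.level (assumed value).
    logger -t fpm-saturation-guard -p daemon.warning "$line"
  fi
  return 0   # always succeed, mirroring the guard's "exit 0 always"
}
```

Entries written this way can then be filtered by tag, for example with `journalctl -t fpm-saturation-guard` on a systemd host.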
Chain V96→V124
| Version | Subject |
|---|---|
| V96-V108 | Orphans Rescue ZERO ORPHANS |
| V110-V113 | Monitoring suite (fpm_monitor, token_health, infra_health, cache 5min) |
| V114 | V86 Auth HMAC E2E 7/7 |
| V115 | wevia-master providers fix |
| V116-V117 | 7 business intents batch |
| V118 | kpi-unified SINGLE SOURCE OF TRUTH |
| V119 | Playwright portfolio 7/7 + triggers enrich |
| V120 | dev_project_auto META ROUTER |
| V121 | Learnings from the disappearance of 4 stubs |
| V122 | Reaper investigation NO auto-reaper |
| V123 | 4 tech domains recreated committed |
| V124 | FPM saturation guard detection + alerting (17 agents Orchestrator) |
Other Claudes synchronized during the V124 window
- a28480a5a + wevia-em module 78→79 modules
- V9.73 992871232 WIRE TOUT ANDON KANBAN - 8 dormants wired
- V9.72 ZERO BROKEN achieved
- HTMLGUARD doctrine wiki additions
Doctrines applied in V124
- Doctrine 0: Root cause (max_children multi-pool bug identified)
- Doctrine 2: Zero overwrite (purely additive, new script)
- Doctrine 3: Zero deletion
- Doctrine 4: Zero regression (L99 153/153)
- Doctrine 14: Test-driven (output verified live before commit)
- Doctrine 24: Monitoring pattern V9.67 (saturation tracking)
- Doctrine 54: chattr unlock/lock wevia-autonomous.php
- Doctrine 60: UX premium (output compact one-line + JSON history structured)
- Doctrine 95: Traceability wiki + vault + /var/log
- Doctrine 100: Train release
Full V124 ecosystem state
- L99: 153/153 PASS continuously V118→V124
- kpi-unified (V118) : cache 60s live
- fpm_monitor (V110) : workers count
- token_health (V111) : providers status
- infra_health_report (V112) : aggregate
- fpm_saturation (V124) : % saturation + threshold alert
- WEVIA Master : 17 agents plan + 12 business intents + 218 triggers
- 22 wikis V96-V124 published
Potential next steps V125+
- Interrogative pattern variants "comment faire Y" (how to do Y)
- Dashboard widget for the saturation history trend (if chattr -i cleared)
- Alerting (email/webhook) when SATURATED persists > 15min (with Yacine's authorization)
- GitHub PAT renewal (Yacine action)
- Memory pressure monitoring (complements saturation)