docs: add SSH unblock runbook and update final execution reports
Co-authored-by: Yacineutt <Yacineutt@users.noreply.github.com>
@@ -58,3 +58,13 @@ The mechanism is now in place:
- **Before each multi-install batch**: mandatory preflight run
- **If FAIL > 0**: automatic NO-GO

---

## 4) Execution update 2026-03-10

- **WEVADS v2 backend deployed on S88** (`wevads-v2-backend` active)
- **Public endpoint OK**: `https://weval-consulting.com/api/v2/health` (HTTP 200)
- **Ethica reliability hardening**: added the multi-source fallback script and crons (1sante + Tabibi listing)
- **Strict non-regression revalidated**: `reports/nonreg_20260309_232943.md` (PASS, 0 FAIL)
- **Remaining blocker**: the multi-install preflight cannot run as long as TCP/22 to the PMTA NAT servers stays down (timeout/refused from S89)

@@ -1,6 +1,6 @@
# FINAL REPORT TO DP CLAUDE - P0/P1/P2 EXECUTION

Date: 2026-03-09
Date: 2026-03-10
Branch: `cursor/ethica-saas-chantiers-a789`
Mode: real execution, strict gates, zero regression

@@ -23,15 +23,16 @@ Components covered:

Main reports:
- `reports/p0_p1_p2_execution_20260309_224755.md` (full execution)
- `reports/nonreg_20260309_230416.md` (final strict re-run)
- `reports/nonreg_20260309_232943.md` (final strict run, revalidated after the S88/S89 changes)

Final summary:
- Strict anti-regression: **PASS (0 FAIL)**
- WEVADS v2 backend: **DEPLOYED and exposed** (`https://weval-consulting.com/api/v2/health` = 200)
- Multi-install preflight: **FAIL (0 servers ready)**
- Final verdict: **CONDITIONAL (1 remaining blocker)**

Remaining hard failure:
1. Multi-install preflight: SSH auth failing (0 servers ready)
1. Multi-install preflight: PMTA/NAT servers unreachable (tcp/22 timeout or refused), so no batch reaches `ready=YES`

---

@@ -67,21 +68,25 @@ Source: `reports/raw_20260309_224755/p2_api_results.json`
### Strict anti-regression (revalidated)

Report:
- `reports/nonreg_20260309_230416.md`
- `reports/nonreg_20260309_232943.md`

Result:
- Overall PASS
- 0 FAIL
- GPU/API/Tracking OK with API key

### Single blocker - Multi-install preflight (SSH auth)
### Single blocker - Multi-install preflight (network/SSH to the PMTA NAT servers)

Report:
- `reports/multiinstall_preflight_20260309_224901.csv` (batch 180-189)
- `reports/multiinstall_preflight_20260309_230904.csv` (active PMTA servers from the DB)

Findings:
- Port 22 reachable, but SSH auth fails (`ssh_auth_failed`)
- From S89:
  - `110.238.76.155:22` => timeout
  - `122.8.135.130:22` => timeout
  - `204.168.152.13:22` => connection refused
- From the cloud agent: some IPs answer on port 22 but auth fails
- 0 servers with `ready=YES` across the tested batches
- Constraint respected: no changes to SSH/PMTA/JAR/multiInstall.js

@@ -96,12 +101,31 @@ Impact:
- `nonreg-framework.sh` (GPU payload fixed -> `messages[]`)
- `dp-release-gate.sh` (automatic DP guardrails)
- `REGLES_EXECUTION_OBLIGATOIRES.md` (blocking policy)
- `RUNBOOK_SSH_AUTH_UNBLOCK_NO_GLOBAL_SSH_CHANGE.md` (mini unblock runbook)
- `.gitignore` (temporary artifacts ignored => 0 dirty)
- `README.md` (ops scripts updated)
- execution artifacts in `reports/`

---

## 6.1) Operational deliverables executed (outside the repo, on the servers)

1. **S88 - WEVADS v2 backend deployed**
   - systemd service: `wevads-v2-backend` => `active`
   - local endpoint: `http://127.0.0.1:5850/api/v2/health` => 200
   - public endpoint: `https://weval-consulting.com/api/v2/health` => 200
   - GOLD backups taken before modifying `.env` and nginx

2. **S89 - Ethica reliability hardening**
   - source fallback script: `/opt/wevads/scripts/ethica/ethica-source-fallback.sh`
   - additional cron jobs:
     - multi-source fallback every 6h
     - 1sante every 6h
     - Tabibi listing weekly
   - one-shot run executed, with traces in `/opt/wevads/logs/ethica-source-fallback.log`

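The two S88 health checks listed above can be re-run on demand. What follows is a minimal sketch, assuming `curl` is available on the host; `http_code` and `health_verdict` are hypothetical helper names and are not part of the repository.

```bash
# Hypothetical helpers (not part of the repo): re-check the S88 health endpoints.

# Print only the HTTP status code for a URL.
# curl prints 000 when no HTTP response was received at all.
http_code() {
  curl -s -o /dev/null -m 10 -w '%{http_code}' "$1"
}

# Map a status code to the verdict used in the report: only 200 counts as OK.
health_verdict() {
  if [ "$1" = "200" ]; then echo OK; else echo FAIL; fi
}

# Usage (performs real HTTP requests, so run it where the endpoints are reachable):
#   health_verdict "$(http_code http://127.0.0.1:5850/api/v2/health)"
#   health_verdict "$(http_code https://weval-consulting.com/api/v2/health)"
```

Keeping the verdict separate from the request makes the rule "200 or nothing" explicit and easy to reuse in a cron or gate script.
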

---

## 7) Recommended DP decision

**NO EXCUSE / ZERO REGRESSION** => keep the decision at **CONDITIONAL (1 blocker)** as long as:

@@ -14,3 +14,4 @@
- `dp-release-gate.sh`: guardrail checks (forbidden touches, confidentiality, php-lint, cleanliness)
- `CHANTIERS_RESTANTS_EXECUTION_PLAN.md`: execution plan and GO/NO-GO criteria
- `REGLES_EXECUTION_OBLIGATOIRES.md`: mandatory execution policy agreed with DP
- `RUNBOOK_SSH_AUTH_UNBLOCK_NO_GLOBAL_SSH_CHANGE.md`: SSH unblock steps without global SSH config changes

89
RUNBOOK_SSH_AUTH_UNBLOCK_NO_GLOBAL_SSH_CHANGE.md
Normal file
@@ -0,0 +1,89 @@
# Mini runbook - unblocking multi-install SSH auth (without touching the global SSH config)

Date: 2026-03-10
Scope: lift the multi-install preflight blocker with zero changes to `sshd_config`, PMTA, Java/JAR, or `multiInstall.js`.

## 1) Objective

Get at least one batch to `ready=YES` from `multiinstall-safe-preflight.sh` by addressing only:
- network availability of the PMTA public IPs
- validity of the credentials stored in the database
- hygiene of the target server lists

## 2) Prechecks (non-intrusive)

From S89:

```bash
# TCP/22 reachability (network): prints OK if the port opens within 5 seconds
timeout 5 bash -c "exec 3<>/dev/tcp/110.238.76.155/22" && echo OK || echo FAIL
timeout 5 bash -c "exec 3<>/dev/tcp/122.8.135.130/22" && echo OK || echo FAIL
timeout 5 bash -c "exec 3<>/dev/tcp/204.168.152.13/22" && echo OK || echo FAIL

# Active credentials in the DB
PGPASSWORD=admin123 psql -h 127.0.0.1 -U admin -d adx_system -c \
  "SELECT id,host,username,active,last_used FROM admin.pmta_servers ORDER BY id;"
```

Decision:
- if `TCP/22` FAIL => network/provider/NAT incident (not an auth problem)
- if `TCP/22` OK + auth FAIL => stale credentials or host policy

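The decision rule maps two observations to one diagnosis. A minimal sketch of that mapping follows; `diagnose` is a hypothetical helper, its `OK`/`FAIL` inputs mirror what the precheck commands print, and the third outcome (both checks pass) is an assumption added for completeness.

```bash
# Hypothetical helper: turn the two precheck results into the runbook's diagnosis.
#   $1 = TCP/22 result (OK or FAIL)
#   $2 = SSH auth result (OK or FAIL)
diagnose() {
  if [ "$1" != "OK" ]; then
    # Auth cannot even be attempted, so the auth result is ignored.
    echo "network/provider/NAT incident"
  elif [ "$2" != "OK" ]; then
    echo "stale credentials or host policy"
  else
    # Assumption: a host passing both checks is a preflight candidate.
    echo "candidate for preflight"
  fi
}
```

Checking TCP first matters: an auth failure reported against an unreachable host is noise, which is why the first branch never looks at `$2`.
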
## 3) Building a clean preflight batch

Do not run the preflight against hosts already known to be `TCP/22 FAIL`.

```bash
# Example layout only: list just the hosts that passed the TCP/22 check
cat > /tmp/servers_active_pmta.csv <<'CSV'
ip,username,password
110.238.76.155,root,<password_db>
122.8.135.130,root,<password_db>
CSV
```

Then:

```bash
SERVERS_CSV=/tmp/servers_active_pmta.csv ./multiinstall-safe-preflight.sh
```

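Building the batch by hand is error-prone once the list grows. The sketch below filters an `ip,username,password` CSV down to the rows whose host answers on TCP/22, using the same `/dev/tcp` test as the prechecks. `probe_tcp22` and `filter_reachable` are hypothetical names, and the sketch assumes the `ip` field contains no commas.

```bash
# Hypothetical helpers: keep only the CSV rows whose host answers on TCP/22.

# Succeeds when a TCP connection to $1:22 opens within 5 seconds.
probe_tcp22() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/22" 2>/dev/null
}

# Read an ip,username,password CSV on stdin;
# write the header and the reachable rows to stdout.
filter_reachable() {
  IFS= read -r header || return 0
  printf '%s\n' "$header"
  # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
  while IFS=, read -r ip user pass || [ -n "$ip" ]; do
    [ -n "$ip" ] || continue
    # </dev/null stops the probe from consuming the CSV on stdin.
    if probe_tcp22 "$ip" </dev/null; then
      printf '%s,%s,%s\n' "$ip" "$user" "$pass"
    fi
  done
}
```

A possible invocation, with `all_servers.csv` as a placeholder input file: `filter_reachable < all_servers.csv > /tmp/servers_active_pmta.csv`. Because the password is the last field read, any commas it contains are preserved.
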
## 4) Failure cases and permitted actions

### A) `TCP/22 timeout` or `connection refused`

Action:
1. Open a provider/NOC ticket: check Security Group/ACL/upstream firewall/NAT rules.
2. Check that the instance is `running` on the provider side.
3. Revalidate reachability with the `/dev/tcp` test.

Forbidden:
- modifying `sshd_config`
- touching PMTA

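When filing the ticket it helps to state which of the two symptoms was seen. The sketch below tells them apart from the exit status of the `/dev/tcp` test: GNU `timeout` exits with 124 when the time limit is hit, while `bash` exits non-zero on its own when the connection attempt fails outright. `classify_probe_status` and `probe_label` are hypothetical names.

```bash
# Hypothetical helpers: tell a timeout apart from a refused connection.

# Map the exit status of `timeout 5 bash -c "exec 3<>/dev/tcp/HOST/22"` to a label.
classify_probe_status() {
  case "$1" in
    0)   echo open ;;
    124) echo timeout ;;                   # time limit hit, nothing answered
    *)   echo "refused or other error" ;;  # connect failed before the limit
  esac
}

# Probe host $1 on port 22 and print the label.
probe_label() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/22" 2>/dev/null
  classify_probe_status "$?"
}
```

A timeout typically points to packets being dropped silently (security group, ACL, NAT rule), whereas a refusal usually means the host was reached but nothing accepted the connection on that port. Neither is proof on its own, so both are worth quoting verbatim in the ticket.
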
### B) `TCP/22 OK` but `ssh_auth_failed`

Action:
1. Revalidate the root password against the source of truth (DB + vault).
2. Test with `sshpass` from S89 against a single host.
3. Update only the credential in the DB if it is stale.

Forbidden:
- disabling global SSH auth hardening
- opening permanent access that has not been validated

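Step 2 can be sketched as a single password-auth attempt followed by a reading of the exit status. This is an illustration under assumptions: `try_auth` and `auth_label` are hypothetical names, the labels are chosen to resemble the preflight's `ssh_auth_failed` but are not taken from its code, and no `ssh` option here changes any server-side or global configuration.

```bash
# Hypothetical helpers: one password-auth attempt against a single host.

# Run a no-op remote command as root on host $1.
# sshpass -e reads the password from the SSHPASS environment variable,
# which keeps it off the command line.
try_auth() {
  sshpass -e ssh \
    -o ConnectTimeout=5 \
    -o PubkeyAuthentication=no \
    -o PreferredAuthentications=password \
    "root@$1" true
}

# Map the exit status to a label. The sshpass man page lists 5 as
# "invalid/incorrect password" and 6 as "host public key is unknown";
# ssh itself exits with 255 when the connection fails.
auth_label() {
  case "$1" in
    0)   echo auth_ok ;;
    5)   echo ssh_auth_failed ;;
    6)   echo unknown_host_key ;;
    255) echo connection_error ;;
    *)   echo "other ($1)" ;;
  esac
}

# Usage (from S89, against one host only):
#   export SSHPASS='<password_db>'
#   try_auth 110.238.76.155; auth_label "$?"
```

Status 6 is worth separating from 5: it means the host key has not been accepted yet on the client, which is a different problem from a stale password and should not trigger a credential update in the DB.
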
## 5) Exit validation

Minimum unblock criteria:
- at least one batch with `ready=YES` in the preflight output CSV
- rerun `./execute_all_p0_p1_p2.sh` with a valid batch
- rerun `STRICT_CONFIDENTIALITY=1 API_KEY=... ./nonreg-framework.sh`

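The first criterion can be checked mechanically. The sketch below counts the rows marked ready in a preflight output file. The file layout is an assumption: a comma-separated file with a header row that contains a column named `ready`, and no quoted fields with embedded commas. `count_ready` is a hypothetical helper.

```bash
# Hypothetical helper: count rows whose `ready` column equals YES.
# Assumes a comma-separated file ($1) with a header row naming a `ready` column.
count_ready() {
  awk -F, '
    # Header row: remember which column is named "ready".
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "ready") col = i; next }
    # Data rows: count exact YES values in that column.
    col && $col == "YES" { n++ }
    # n + 0 prints 0 rather than an empty string when nothing matched.
    END { print n + 0 }
  ' "$1"
}
```

For instance, `[ "$(count_ready reports/multiinstall_preflight_20260309_230904.csv)" -gt 0 ]` would succeed only if at least one server is ready. Looking the column up by name rather than by position keeps the check valid if the preflight script reorders its columns.
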
## 6) State observed during this execution

- `110.238.76.155:22` timeout from S89
- `122.8.135.130:22` timeout from S89
- `204.168.152.13:22` connection refused from S89

Conclusion:
- current main blocker = network/NAT/provider
- no fix is possible on the repo side without violating the DP constraints