Add security hardening, Ethica fixes, anti-regression v2, sitemap, deploy script

Security:
- nginx security-headers.conf: HSTS, CSP, X-Frame-Options, Referrer-Policy
- nginx cors-strict.conf: domain whitelist (replaces wildcard *)
- nginx weval-api.conf: complete vhost with rate limiting

Ethica:
- logrotate config: daily rotation, max 50MB, 7 days retention
- Tabibi scraper fix: listing-based mode (replaces ID-based)
- Cron configuration: all Ethica scrapers + cleanup jobs

Anti-regression v2:
- 46+ automated checks in 7 categories
- Modes: --full, --quick, --api-only, --security-only
- JSON report output with Six Sigma scoring
- Categories: pages, confidentiality, APIs, WEVIA, security, tracking, load

Sitemap: 27 URLs (22 product pages plus index and top-level pages)

Deploy script: master deployment to S88/S89 (S202/S151 are defined but not targeted)

Co-authored-by: Yacineutt <Yacineutt@users.noreply.github.com>
Cursor Agent
2026-03-09 22:35:28 +00:00
parent 463f2d232a
commit 6cd830f853
10 changed files with 1001 additions and 7 deletions


@@ -1,7 +1,83 @@
# WEVADS GPU Server
- **IP**: 88.198.4.195
- **GPU**: NVIDIA RTX 4000 SFF Ada (20GB vRAM)
- **RAM**: 62GB DDR4
- **Disk**: 1.7TB NVMe
- **Ollama**: localhost:11434
- **Models**: deepseek-r1:8b, deepseek-r1:32b, llama3.1:8b
# WEVAL Platform — SaaS Activation & Security Hardening
## Infrastructure
- **S88** (88.198.4.195) — GPU Server: NVIDIA RTX 4000 SFF Ada (20GB vRAM), 62GB RAM, 1.7TB NVMe
- **S89** (89.167.40.150) — App Server: Apache, 424 APIs PHP, PostgreSQL 13, PMTA, Arsenal
- **S202** (204.168.152.13) — Ollama CPU (qwen2.5:3b, phi3:mini, gemma2:2b), PMTA relay, backups
- **S151** (151.80.235.110) — Tracking server, DR OVH
## Fixes applied (current session)
| Category | Fixes | Status |
|----------|-------|--------|
| Page confidentiality | 0 mentions of OpenAI/Anthropic/Abbott/AbbVie/J&J | VERIFIED |
| Internal IPs | 0 internal IPs in HTML | VERIFIED |
| Frontend API keys | 0 hardcoded keys | VERIFIED |
| GPU models | Aligned with S202 (qwen2.5:3b, phi3:mini, gemma2:2b) | VERIFIED |
| Anthropic API calls | Rerouted to /api/content/generate.php | VERIFIED |
| MedReach data | Figures masked, sources anonymized, dates made generic | VERIFIED |
| Internal WEVADS | 646/604/527/CX3/DoubleM removed | VERIFIED |
| Internationalization | Casablanca/Morocco -> International | VERIFIED |
| Internal roadmap | Replaced with "Plan de deploiement" (deployment plan) | VERIFIED |
## Project structure
```
/workspace/
├── weval-pages/ # Fixed HTML pages (13 pages)
├── weval-scan/ # Confidentiality scan snapshots
├── saas-backends/ # Deployable SaaS backends
│ ├── api-router.php # Central router
│ ├── auth-otp.php # OTP auth (replaces email-only)
│ ├── lib/ # Shared libraries
│ ├── storeforge/ # E-commerce generator
│ ├── leadforge/ # Lead generation
│ ├── proposalai/ # Proposal generator
│ ├── blueprintai/ # Process/architecture docs
│ ├── mailwarm/ # Email warmup
│ ├── outreachai/ # Cold outreach AI
│ ├── formbuilder/ # Form generator
│ ├── emailverify/ # Email validation
│ └── migrations/ # SQL migrations
├── deploy/ # Deployment configs
│ ├── nginx/ # Security headers, CORS, vhost
│ ├── sitemap.xml # Sitemap (27 URLs)
│ └── deploy-all.sh # Master deployment script
├── ethica/ # Ethica reliability fixes
│ ├── logrotate-ethica.conf
│ ├── ethica-scraper-fix.php
│ └── ethica-crons.sh
└── nonreg/ # Anti-regression framework
└── nonreg-framework-v2.sh
```
## Deployment
```bash
# Deploy everything
./deploy/deploy-all.sh --all
# Deploy a single component
./deploy/deploy-all.sh --saas # SaaS backends
./deploy/deploy-all.sh --security # CORS/CSP/HSTS
./deploy/deploy-all.sh --ethica # Ethica fixes
./deploy/deploy-all.sh --sitemap # Sitemap
./deploy/deploy-all.sh --pages # HTML pages
./deploy/deploy-all.sh --nonreg # Anti-regression framework
```
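
The per-component flags above map one-to-one onto deploy functions. A minimal sketch of that dispatch, assuming a hypothetical helper `components_for_mode` that is not part of `deploy-all.sh`; it only prints the component names a flag would trigger, which may be useful as a dry run before touching a server:

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints which components a mode flag selects.
# The flag names come from deploy-all.sh; the function itself is illustrative.
components_for_mode() {
    case "${1:---all}" in
        --saas|--security|--ethica|--sitemap|--pages|--nonreg)
            echo "${1#--}" ;;
        --all)
            echo "saas security ethica sitemap pages nonreg" ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

components_for_mode --security   # prints "security"
components_for_mode              # prints the full list (default is --all)
```

An unknown flag returns a non-zero status, mirroring the usage error in the real script.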
## Anti-regression
```bash
# Full run (46+ checks)
./nonreg/nonreg-framework-v2.sh --full
# Quick run (pages + confidentiality + security)
./nonreg/nonreg-framework-v2.sh --quick
# APIs only
./nonreg/nonreg-framework-v2.sh --api-only
# Security only
./nonreg/nonreg-framework-v2.sh --security-only
```
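
The commit message mentions Six Sigma scoring, but the scoring code is not shown in this excerpt. One common Six Sigma metric is DPMO (defects per million opportunities). The sketch below assumes each check is one opportunity and each FAIL is one defect; the function name `dpmo` is hypothetical and the actual framework may score differently:

```shell
#!/bin/bash
# Illustrative only: DPMO = failures / total checks * 1,000,000 (integer division).
dpmo() {
    local fail="$1" total="$2"
    if [ "$total" -le 0 ]; then
        echo "total must be positive" >&2
        return 1
    fi
    echo $(( fail * 1000000 / total ))
}

dpmo 0 46   # prints 0
dpmo 2 46   # prints 43478
```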

171
deploy/deploy-all.sh Executable file

@@ -0,0 +1,171 @@
#!/bin/bash
###############################################################################
# WEVAL Platform — Master Deployment Script
# Deploys: SaaS backends, security configs, Ethica fixes, sitemap, HTML pages, anti-regression framework
# Usage: ./deploy-all.sh [--saas|--security|--ethica|--sitemap|--pages|--nonreg|--all]
# Prerequisites: root SSH access to S88 and S89
###############################################################################
set -euo pipefail
S88="88.198.4.195"
S89="89.167.40.150"
S202="204.168.152.13"
S151="151.80.235.110"
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
WORKSPACE_DIR="$(dirname "$SCRIPT_DIR")"
MODE="${1:---all}"
echo "=============================================="
echo " WEVAL Platform Deployment"
echo " Mode: $MODE"
echo " $(date '+%Y-%m-%d %H:%M:%S')"
echo "=============================================="
###############################################################################
# Deploy SaaS Backends to S89
###############################################################################
deploy_saas() {
echo ""
echo "=== Deploying SaaS Backends to S89 ==="
# "products" is included because api-router.php and auth.php are copied there below
ssh root@$S89 "mkdir -p /var/www/weval/api/{products,storeforge,leadforge,proposalai,blueprintai,mailwarm,outreachai,formbuilder,emailverify,lib}"
scp "$WORKSPACE_DIR/saas-backends/api-router.php" root@$S89:/var/www/weval/api/products/api-router.php
scp "$WORKSPACE_DIR/saas-backends/lib/auth.php" root@$S89:/var/www/weval/api/lib/auth.php
scp "$WORKSPACE_DIR/saas-backends/lib/wevia-proxy.php" root@$S89:/var/www/weval/api/lib/wevia-proxy.php
for product in storeforge leadforge proposalai blueprintai mailwarm outreachai formbuilder emailverify; do
scp "$WORKSPACE_DIR/saas-backends/$product/api.php" root@$S89:/var/www/weval/api/$product/api.php
echo " Deployed: /api/$product/"
done
echo " Deploying OTP auth..."
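# Backup is best-effort: on a first deploy there is no auth.php yet, and set -e would abort
ssh root@$S89 "cp /var/www/weval/api/products/auth.php /var/www/weval/api/products/auth.php.bak.$(date +%Y%m%d) 2>/dev/null || true"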
scp "$WORKSPACE_DIR/saas-backends/auth-otp.php" root@$S89:/var/www/weval/api/products/auth.php
echo " Running DB migration..."
scp "$WORKSPACE_DIR/saas-backends/migrations/001_auth_otp.sql" root@$S89:/tmp/
ssh root@$S89 "PGPASSWORD=\$DB_PASSWORD psql -h 127.0.0.1 -U admin -d adx_system -f /tmp/001_auth_otp.sql"
ssh root@$S89 "chown -R www-data:www-data /var/www/weval/api/ && systemctl reload apache2"
echo " SaaS backends deployed."
}
###############################################################################
# Deploy Security Configs to S88
###############################################################################
deploy_security() {
echo ""
echo "=== Deploying Security Configs to S88 ==="
ssh root@$S88 "mkdir -p /etc/nginx/snippets"
scp "$WORKSPACE_DIR/deploy/nginx/security-headers.conf" root@$S88:/etc/nginx/snippets/security-headers.conf
scp "$WORKSPACE_DIR/deploy/nginx/cors-strict.conf" root@$S88:/etc/nginx/snippets/cors-strict.conf
ssh root@$S88 "cp /etc/nginx/sites-available/weval-api /etc/nginx/sites-available/weval-api.bak.$(date +%Y%m%d) 2>/dev/null || true"
scp "$WORKSPACE_DIR/deploy/nginx/weval-api.conf" root@$S88:/etc/nginx/sites-available/weval-api
ssh root@$S88 "nginx -t && systemctl reload nginx"
echo " Security configs deployed."
}
###############################################################################
# Deploy Ethica Fixes to S89
###############################################################################
deploy_ethica() {
echo ""
echo "=== Deploying Ethica Fixes to S89 ==="
scp "$WORKSPACE_DIR/ethica/logrotate-ethica.conf" root@$S89:/etc/logrotate.d/ethica
scp "$WORKSPACE_DIR/ethica/ethica-scraper-fix.php" root@$S89:/opt/wevadsapp/scrapers/ethica-tabibi-listing.php
ssh root@$S89 "chmod 644 /etc/logrotate.d/ethica && logrotate -f /etc/logrotate.d/ethica"
echo " Ethica fixes deployed."
}
###############################################################################
# Deploy Sitemap
###############################################################################
deploy_sitemap() {
echo ""
echo "=== Deploying Sitemap ==="
scp "$WORKSPACE_DIR/deploy/sitemap.xml" root@$S88:/var/www/weval/sitemap.xml
ssh root@$S88 "chown www-data:www-data /var/www/weval/sitemap.xml"
echo " Sitemap deployed (27 URLs)."
}
###############################################################################
# Deploy HTML Pages
###############################################################################
deploy_pages() {
echo ""
echo "=== Deploying Fixed HTML Pages to S88 ==="
ssh root@$S88 "mkdir -p /var/www/weval/products/backup-$(date +%Y%m%d)"
ssh root@$S88 "cp /var/www/weval/products/*.html /var/www/weval/products/backup-$(date +%Y%m%d)/ 2>/dev/null || true"
for page in "$WORKSPACE_DIR"/weval-pages/*.html; do
BASENAME=$(basename "$page")
if [ "$BASENAME" = "products-index.html" ]; then
scp "$page" root@$S88:/var/www/weval/products/index.html
else
scp "$page" root@$S88:/var/www/weval/products/$BASENAME
fi
echo " Deployed: /products/$BASENAME"
done
echo " HTML pages deployed."
}
###############################################################################
# Deploy Anti-Regression Framework
###############################################################################
deploy_nonreg() {
echo ""
echo "=== Deploying Anti-Regression Framework ==="
ssh root@$S88 "mkdir -p /opt/wevads/vault"
scp "$WORKSPACE_DIR/nonreg/nonreg-framework-v2.sh" root@$S88:/opt/wevads/vault/nonreg-framework-v2.sh
ssh root@$S88 "chmod +x /opt/wevads/vault/nonreg-framework-v2.sh"
echo " Anti-regression framework v2 deployed."
}
###############################################################################
# MAIN
###############################################################################
case $MODE in
--saas) deploy_saas ;;
--security) deploy_security ;;
--ethica) deploy_ethica ;;
--sitemap) deploy_sitemap ;;
--pages) deploy_pages ;;
--nonreg) deploy_nonreg ;;
--all)
deploy_saas
deploy_security
deploy_ethica
deploy_sitemap
deploy_pages
deploy_nonreg
;;
*)
echo "Usage: $0 [--saas|--security|--ethica|--sitemap|--pages|--nonreg|--all]"
exit 1
;;
esac
echo ""
echo "=============================================="
echo " Deployment complete."
echo " Run anti-regression tests:"
echo " ssh root@$S88 '/opt/wevads/vault/nonreg-framework-v2.sh --full'"
echo "=============================================="


@@ -0,0 +1,20 @@
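# WEVAL CORS Strict Configuration
# Replaces wildcard (*) CORS with a domain whitelist
# Usage: include /etc/nginx/snippets/cors-strict.conf;
# Note: nginx inherits add_header from the enclosing level only when the current
# level defines none, so a location that includes this snippet would otherwise
# drop the server-level security headers. They are re-included here for that reason.
include /etc/nginx/snippets/security-headers.conf;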
set $cors_origin "";
set $cors_methods "GET, POST, OPTIONS";
set $cors_headers "Content-Type, X-API-Key, Authorization";
if ($http_origin ~* "^https://(weval-consulting\.com|www\.weval-consulting\.com|api\.weval-consulting\.com)$") {
set $cors_origin $http_origin;
}
add_header Access-Control-Allow-Origin $cors_origin always;
add_header Access-Control-Allow-Methods $cors_methods always;
add_header Access-Control-Allow-Headers $cors_headers always;
add_header Access-Control-Max-Age 86400 always;
if ($request_method = OPTIONS) {
return 204;
}
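
The whitelist regex above can be exercised without nginx. A minimal sketch, assuming `grep -E` as a stand-in for nginx's PCRE matching and a hypothetical helper `origin_allowed`; it shows that look-alike and plain-HTTP origins are rejected:

```shell
#!/bin/bash
# Mirrors the regex from cors-strict.conf; grep -E stands in for nginx's PCRE here.
ALLOWED_ORIGIN_RE='^https://(weval-consulting\.com|www\.weval-consulting\.com|api\.weval-consulting\.com)$'

origin_allowed() {
    # -i matches nginx's case-insensitive ~* operator
    printf '%s\n' "$1" | grep -Eiq "$ALLOWED_ORIGIN_RE"
}

for origin in "https://weval-consulting.com" "https://evil.com" \
              "https://weval-consulting.com.evil.com" "http://weval-consulting.com"; do
    if origin_allowed "$origin"; then
        echo "allow $origin"
    else
        echo "deny  $origin"
    fi
done
```

Only the first origin is allowed; the anchors `^` and `$` are what stop `weval-consulting.com.evil.com` from matching.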


@@ -0,0 +1,23 @@
# WEVAL Security Headers — Include in all server blocks
# Usage: include /etc/nginx/snippets/security-headers.conf;
# HSTS — Force HTTPS for 1 year including subdomains
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains; preload" always;
# CSP — Content Security Policy
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://cdn.jsdelivr.net https://unpkg.com; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com https://cdn.jsdelivr.net; font-src 'self' https://fonts.gstatic.com; img-src 'self' data: https: blob:; connect-src 'self' https://weval-consulting.com https://*.weval-consulting.com; frame-ancestors 'self'; object-src 'none'; base-uri 'self'" always;
# Prevent MIME type sniffing
add_header X-Content-Type-Options "nosniff" always;
# Clickjacking protection
add_header X-Frame-Options "SAMEORIGIN" always;
# XSS Protection (legacy header: current browsers ignore it, kept for older clients)
add_header X-XSS-Protection "1; mode=block" always;
# Referrer Policy
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
# Permissions Policy
add_header Permissions-Policy "camera=(), microphone=(), geolocation=(), interest-cohort=()" always;
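
A quick way to confirm these headers actually reach clients is to scan a response for the expected names. A minimal sketch, assuming a hypothetical helper `missing_security_headers` that reads raw headers on stdin; the header names are taken from the snippet above, and the canned response is made up for the demo:

```shell
#!/bin/bash
# Reads raw response headers on stdin and prints each required header that is absent.
# Header names are taken from security-headers.conf; the function is hypothetical.
missing_security_headers() {
    local headers required name
    headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
    required="strict-transport-security content-security-policy x-content-type-options x-frame-options referrer-policy permissions-policy"
    for name in $required; do
        printf '%s\n' "$headers" | grep -q "^${name}:" || echo "$name"
    done
}

# Example with a canned response; in practice pipe in `curl -sI https://weval-consulting.com/`.
printf 'HTTP/2 200\r\nStrict-Transport-Security: max-age=31536000\r\nX-Frame-Options: SAMEORIGIN\r\n' \
    | missing_security_headers
```

For the canned response this lists the four headers that are not present; an empty output means all six were found.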


@@ -0,0 +1,90 @@
# WEVAL API Nginx Configuration
# Deploy to: /etc/nginx/sites-available/weval-api
# Symlink: ln -s /etc/nginx/sites-available/weval-api /etc/nginx/sites-enabled/
server {
listen 443 ssl http2;
server_name weval-consulting.com www.weval-consulting.com;
root /var/www/weval;
index index.html index.php;
# SSL (managed by Certbot or Cloudflare)
# ssl_certificate /etc/letsencrypt/live/weval-consulting.com/fullchain.pem;
# ssl_certificate_key /etc/letsencrypt/live/weval-consulting.com/privkey.pem;
include /etc/nginx/snippets/security-headers.conf;
# Static files
location / {
try_files $uri $uri/ =404;
}
# Product pages
location /products/ {
try_files $uri $uri/ =404;
}
# WEVIA API
location /api/weval-ia {
include /etc/nginx/snippets/cors-strict.conf;
proxy_pass http://127.0.0.1:8080;
proxy_read_timeout 300s;
proxy_send_timeout 300s;
proxy_buffering off;
}
location /api/weval-ia-full {
include /etc/nginx/snippets/cors-strict.conf;
proxy_pass http://127.0.0.1:8080;
proxy_read_timeout 300s;
proxy_send_timeout 300s;
proxy_buffering off;
}
# SaaS APIs
location ~ ^/api/(deliverscore|medreach|gpu|content|products|storeforge|leadforge|proposalai|blueprintai|mailwarm|outreachai|formbuilder|emailverify)/ {
include /etc/nginx/snippets/cors-strict.conf;
fastcgi_pass unix:/var/run/php/php8.3-fpm.sock;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
fastcgi_read_timeout 300s;
fastcgi_send_timeout 300s;
fastcgi_buffering off;
}
# Guardian/Sentinel
location /api/guardian-scan.php {
include /etc/nginx/snippets/cors-strict.conf;
fastcgi_pass unix:/var/run/php/php8.3-fpm.sock;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
fastcgi_read_timeout 300s;
fastcgi_buffering off;
}
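# Auth endpoint with rate limiting. The exact-match (=) location is required:
# a plain prefix location would lose to the regex "SaaS APIs" location above.
location = /api/products/auth.php {
limit_req zone=auth burst=3 nodelay;
include /etc/nginx/snippets/cors-strict.conf;
fastcgi_pass unix:/var/run/php/php8.3-fpm.sock;
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
include fastcgi_params;
}
# Block direct access to internal configs
location ~ /\.(env|git|htaccess) {
deny all;
}
}
# Rate limiting zones. limit_req_zone is only valid in the http context,
# so the zones are declared outside the server block.
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;
limit_req_zone $binary_remote_addr zone=api:10m rate=30r/m;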
server {
listen 80;
server_name weval-consulting.com www.weval-consulting.com;
return 301 https://$host$request_uri;
}
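
The rate-limit settings in this vhost translate into concrete request spacing: nginx enforces `rate=5r/m` as one request every 12 seconds, and `burst=3 nodelay` lets up to three excess requests through immediately. A small sketch of that arithmetic, with hypothetical helper names:

```shell
#!/bin/bash
# Minimum spacing between requests, in milliseconds, for a rate given in requests/minute.
interval_ms() {
    echo $(( 60000 / $1 ))
}

# Requests accepted in a single instantaneous burst: 1 in-rate request plus the burst allowance.
max_instant_requests() {
    echo $(( 1 + $1 ))
}

echo "auth zone: one request every $(interval_ms 5) ms"     # 12000 ms
echo "api zone:  one request every $(interval_ms 30) ms"    # 2000 ms
echo "auth.php:  $(max_instant_requests 3) requests accepted at once with burst=3"   # 4
```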

30
deploy/sitemap.xml Normal file

@@ -0,0 +1,30 @@
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://weval-consulting.com/</loc><changefreq>weekly</changefreq><priority>1.0</priority></url>
<url><loc>https://weval-consulting.com/solutions.html</loc><changefreq>monthly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/platform/</loc><changefreq>weekly</changefreq><priority>0.9</priority></url>
<url><loc>https://weval-consulting.com/wevia/</loc><changefreq>weekly</changefreq><priority>0.9</priority></url>
<url><loc>https://weval-consulting.com/products/</loc><changefreq>weekly</changefreq><priority>0.9</priority></url>
<url><loc>https://weval-consulting.com/products/deliverscore.html</loc><changefreq>monthly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/products/medreach.html</loc><changefreq>monthly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/products/gpu-inference.html</loc><changefreq>monthly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/products/content-factory.html</loc><changefreq>monthly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/products/proposalai.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/blueprintai.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/storeforge.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/wevia-whitelabel.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/arsenal.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/wevads-ia.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/academy.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/wevads.html</loc><changefreq>monthly</changefreq><priority>0.7</priority></url>
<url><loc>https://weval-consulting.com/products/workspace.html</loc><changefreq>weekly</changefreq><priority>0.8</priority></url>
<url><loc>https://weval-consulting.com/products/leadforge.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/mailwarm.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/outreachai.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/formbuilder.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/emailverify.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/deliverads.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/affiliates.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/mailforge.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
<url><loc>https://weval-consulting.com/products/canvasai.html</loc><changefreq>monthly</changefreq><priority>0.6</priority></url>
</urlset>
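
The "27 URLs" figure quoted in the commit message and deploy script can be checked mechanically. A minimal sketch, assuming a hypothetical helper `count_sitemap_urls`; the demo runs against a temporary two-entry file rather than the real sitemap:

```shell
#!/bin/bash
# Counts <loc> elements in a sitemap file (works even when several <url> share a line).
count_sitemap_urls() {
    { grep -o '<loc>' "$1" || true; } | wc -l | tr -d ' '
}

# Self-contained demo on a temporary two-entry sitemap.
tmp=$(mktemp)
cat > "$tmp" <<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>https://example.com/</loc></url>
<url><loc>https://example.com/a.html</loc></url>
</urlset>
XML
count_sitemap_urls "$tmp"   # prints 2
rm -f "$tmp"
```

Against the repository this might be used as `[ "$(count_sitemap_urls deploy/sitemap.xml)" -eq 27 ]` before deploying.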

37
ethica/ethica-crons.sh Executable file

@@ -0,0 +1,37 @@
#!/bin/bash
# Ethica Cron Configuration
# Deploy: copy entries to crontab -e on S89
cat << 'CRONS'
# === ETHICA SCRAPERS ===
# Mega scraper (Google + directories) — every 6h
0 */6 * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-mega-scraper.php >> /var/log/ethica-mega-scraper.log 2>&1
# Validator — every 30min
*/30 * * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-validator.php >> /var/log/ethica-validator.log 2>&1
# Full scraper — 1st and 15th of month
0 2 1,15 * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-scraper-full.php >> /var/log/ethica-scraper-full.log 2>&1
# 1sante.com enricher — weekly
0 3 * * 1 /usr/bin/php /opt/wevadsapp/scrapers/ethica-1sante.php >> /var/log/ethica-1sante.log 2>&1
# Tabibi.tn listing mode — weekly (FIXED: listing-based instead of ID-based)
0 4 * * 2 /usr/bin/php /opt/wevadsapp/scrapers/ethica-tabibi-listing.php >> /var/log/ethica-tabibi.log 2>&1
# Email enricher — every 6h
0 1,7,13,19 * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-email-enricher.php >> /var/log/ethica-email-enricher.log 2>&1
# General enricher — every 5min
*/5 * * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-enricher-general.php >> /var/log/ethica-enricher.log 2>&1
# Google verify — every 30min
*/30 * * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-google-verify.php >> /var/log/ethica-google-verify.log 2>&1
# === CLEANUP ===
# OTP cleanup — hourly
0 * * * * psql -h 127.0.0.1 -U admin -d adx_system -c "DELETE FROM auth_otp WHERE expires_at < NOW() - INTERVAL '1 hour'; DELETE FROM auth_attempts WHERE created_at < NOW() - INTERVAL '1 day';"
# Log rotation force — daily at midnight
0 0 * * * /usr/sbin/logrotate -f /etc/logrotate.d/ethica
CRONS
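
The header comment says to copy these entries into the crontab by hand. A possible alternative is to merge them as text so that re-running the install does not duplicate lines. The sketch below assumes a hypothetical helper `merge_cron`; it works on plain strings and never touches the real crontab:

```shell
#!/bin/bash
# Appends new cron lines to an existing crontab text, skipping lines already present.
# Both arguments are plain text; nothing here touches the real crontab.
merge_cron() {
    printf '%s\n%s\n' "$1" "$2" | awk 'NF && !seen[$0]++'
}

existing='0 * * * * /usr/bin/true'
new='0 * * * * /usr/bin/true
*/30 * * * * /usr/bin/php /opt/wevadsapp/scrapers/ethica-validator.php'
merge_cron "$existing" "$new"
```

On S89 this could plausibly be wired up as `merge_cron "$(crontab -l 2>/dev/null)" "$(./ethica-crons.sh)" | crontab -`, though that pipeline is an untested assumption and also collapses repeated comment lines.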


@@ -0,0 +1,114 @@
<?php
/**
* Ethica Scraper Fix — Tabibi.tn listing-based mode
* Problem: Current scraper uses ID-based scraping which misses entries
* Solution: Switch to listing/pagination mode
* Deploy to: /opt/wevadsapp/scrapers/ethica-tabibi-listing.php
*/
$baseUrl = 'https://www.tabibi.tn';
$specialties = [
'medecin-generaliste', 'cardiologue', 'dermatologue', 'pediatre',
'gynecologue', 'ophtalmologue', 'orl', 'dentiste', 'chirurgien',
'pneumologue', 'neurologue', 'gastro-enterologue', 'urologue',
'endocrinologue', 'rhumatologue', 'psychiatre', 'radiologue'
];
$cities = [
'tunis', 'sfax', 'sousse', 'kairouan', 'bizerte', 'gabes',
'ariana', 'gafsa', 'monastir', 'ben-arous', 'kasserine',
'medenine', 'nabeul', 'tataouine', 'beja', 'jendouba',
'mahdia', 'sidi-bouzid', 'siliana', 'le-kef', 'tozeur',
'manouba', 'zaghouan', 'kebili'
];
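$db = pg_connect("host=127.0.0.1 dbname=adx_system user=admin password=" . getenv('DB_PASSWORD'));
if ($db === false) {
// Abort early: every query below needs a live connection
fwrite(STDERR, "Database connection failed\n");
exit(1);
}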
$totalNew = 0;
$totalUpdated = 0;
$errors = 0;
foreach ($specialties as $specialty) {
foreach ($cities as $city) {
$page = 1;
$hasMore = true;
while ($hasMore && $page <= 50) {
$url = "$baseUrl/$specialty/$city?page=$page";
$ch = curl_init($url);
curl_setopt_array($ch, [
CURLOPT_RETURNTRANSFER => true,
CURLOPT_TIMEOUT => 30,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_USERAGENT => 'Mozilla/5.0 (compatible; EthicaBot/1.0)',
CURLOPT_HTTPHEADER => ['Accept-Language: fr-FR,fr;q=0.9']
]);
$html = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
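if ($html === false || $httpCode !== 200 || empty($html)) {
// Count transport failures and server errors; a 4xx past the last page just ends pagination
if ($html === false || $httpCode >= 500) {
$errors++;
}
$hasMore = false;
break;
}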
$dom = new DOMDocument();
@$dom->loadHTML($html, LIBXML_NOERROR);
$xpath = new DOMXPath($dom);
$cards = $xpath->query("//div[contains(@class, 'doctor-card') or contains(@class, 'praticien')]");
if ($cards->length === 0) {
$hasMore = false;
continue;
}
foreach ($cards as $card) {
$nameNode = $xpath->query(".//h2|.//h3|.//*[contains(@class, 'name')]", $card)->item(0);
$name = $nameNode ? trim($nameNode->textContent) : '';
$phoneNode = $xpath->query(".//*[contains(@class, 'phone') or contains(@class, 'tel')]|.//a[starts-with(@href, 'tel:')]", $card)->item(0);
$phone = $phoneNode ? trim($phoneNode->textContent) : '';
$addressNode = $xpath->query(".//*[contains(@class, 'address') or contains(@class, 'adresse')]", $card)->item(0);
$address = $addressNode ? trim($addressNode->textContent) : '';
if (empty($name)) continue;
$existing = pg_fetch_assoc(pg_query_params($db,
"SELECT id FROM ethica.medecins_real WHERE nom = $1 AND ville = $2 AND specialite = $3 LIMIT 1",
[$name, $city, $specialty]
));
if ($existing) {
if (!empty($phone)) {
pg_query_params($db,
"UPDATE ethica.medecins_real SET telephone = $1, updated_at = NOW() WHERE id = $2",
[$phone, $existing['id']]
);
$totalUpdated++;
}
} else {
pg_query_params($db,
"INSERT INTO ethica.medecins_real (nom, specialite, ville, pays, telephone, adresse, source, created_at) VALUES ($1, $2, $3, 'TN', $4, $5, 'tabibi.tn', NOW())",
[$name, $specialty, $city, $phone, $address]
);
$totalNew++;
}
}
$page++;
usleep(rand(500000, 1500000));
}
}
}
echo json_encode([
'status' => 'completed',
'new_entries' => $totalNew,
'updated' => $totalUpdated,
'errors' => $errors,
'timestamp' => date('Y-m-d H:i:s')
]);
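
Listing mode walks every specialty and city combination, so the request volume is worth estimating before scheduling the job weekly. A rough sketch of the worst case, using the counts from the arrays above (17 specialties, 24 cities) and the 50-page cap; `request_budget` is a hypothetical helper, and the time estimate counts only the roughly one-second average sleep, not the requests themselves:

```shell
#!/bin/bash
# Worst-case request count for the listing crawl: specialties x cities x page cap.
request_budget() {
    local specialties="$1" cities="$2" max_pages="$3"
    echo $(( specialties * cities * max_pages ))
}

total=$(request_budget 17 24 50)
echo "max requests: $total"                          # 20400
echo "max hours at ~1s sleep: $(( total / 3600 ))"   # 5 (integer division)
```

In practice most combinations stop well before page 50, because the loop ends as soon as a page returns no cards.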


@@ -0,0 +1,21 @@
# Ethica Scraper Log Rotation
# Deploy to: /etc/logrotate.d/ethica
# Fixes: log files growing to 300+ MB
/var/log/ethica*.log
/opt/wevads/logs/ethica*.log
/opt/wevadsapp/logs/ethica*.log {
daily
rotate 7
compress
delaycompress
missingok
notifempty
maxsize 50M
dateext
dateformat -%Y%m%d
postrotate
# Notify PHP-FPM to reopen log files
systemctl reload php8.3-fpm 2>/dev/null || true
endscript
}

412
nonreg/nonreg-framework-v2.sh Executable file

@@ -0,0 +1,412 @@
#!/bin/bash
###############################################################################
# WEVAL Anti-Regression Framework v2.0 — Six Sigma Testing
# Usage: ./nonreg-framework-v2.sh [--full|--quick|--api-only|--security-only]
# Deploy: S88:/opt/wevads/vault/nonreg-framework-v2.sh
###############################################################################
set -euo pipefail
BASE="https://weval-consulting.com"
REPORT_FILE="/tmp/nonreg-report-$(date +%Y%m%d_%H%M%S).json"
MODE="${1:---full}"
PASS=0
FAIL=0
WARN=0
TOTAL=0
RESULTS="[]"
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'
log_result() {
local category="$1" test_name="$2" status="$3" details="$4" latency="${5:-0}"
TOTAL=$((TOTAL + 1))
case $status in
PASS) PASS=$((PASS + 1)); echo -e " ${GREEN}PASS${NC} [$category] $test_name ($details)" ;;
FAIL) FAIL=$((FAIL + 1)); echo -e " ${RED}FAIL${NC} [$category] $test_name ($details)" ;;
WARN) WARN=$((WARN + 1)); echo -e " ${YELLOW}WARN${NC} [$category] $test_name ($details)" ;;
esac
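RESULTS=$(printf '%s' "$RESULTS" | python3 -c '
import sys, json
# Values arrive as argv, so quotes or control characters in them cannot break the Python source
category, test_name, status, details, latency = sys.argv[1:6]
r = json.load(sys.stdin)
r.append({"category": category, "test": test_name, "status": status, "details": details, "latency_ms": int(latency)})
print(json.dumps(r))
' "$category" "$test_name" "$status" "$details" "$latency" 2>/dev/null || echo "$RESULTS")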
}
###############################################################################
# 1. FRONTEND PAGES — HTTP 200 check (18 pages)
###############################################################################
test_pages() {
echo ""
echo "=== 1. FRONTEND PAGES ==="
PAGES=(
"/products/deliverscore.html"
"/products/medreach.html"
"/products/gpu-inference.html"
"/products/content-factory.html"
"/products/proposalai.html"
"/products/blueprintai.html"
"/products/storeforge.html"
"/products/wevia-whitelabel.html"
"/products/arsenal.html"
"/products/wevads-ia.html"
"/products/academy.html"
"/products/wevads.html"
"/products/workspace.html"
"/products/"
"/platform/"
"/wevia/"
"/"
"/solutions.html"
)
for page in "${PAGES[@]}"; do
START=$(date +%s%N)
CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 15 "$BASE$page" 2>/dev/null || echo "000")
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
if [ "$CODE" = "200" ]; then
log_result "PAGE" "$page" "PASS" "HTTP $CODE" "$LATENCY"
elif [ "$CODE" = "301" ] || [ "$CODE" = "302" ]; then
log_result "PAGE" "$page" "WARN" "HTTP $CODE (redirect)" "$LATENCY"
else
log_result "PAGE" "$page" "FAIL" "HTTP $CODE" "$LATENCY"
fi
done
}
###############################################################################
# 2. CONFIDENTIALITY SCAN — 0 leaks
###############################################################################
test_confidentiality() {
echo ""
echo "=== 2. CONFIDENTIALITY SCAN ==="
SENSITIVE_PAGES=(
"/products/gpu-inference.html"
"/products/workspace.html"
"/products/proposalai.html"
"/products/blueprintai.html"
"/products/medreach.html"
"/products/wevads.html"
"/products/deliverscore.html"
"/products/storeforge.html"
)
FORBIDDEN_PATTERNS="McKinsey|Deloitte|PwC|Accenture|BCG|Abbott|AbbVie|Johnson.*Johnson|89\.167\.40\.150|88\.198\.4\.195|204\.168\.152|157\.180\.25|weval-playground-2026|deepseek-r1:32b|deepseek-r1:8b|llama3\.1"
for page in "${SENSITIVE_PAGES[@]}"; do
CONTENT=$(curl -s --max-time 10 "$BASE$page" 2>/dev/null || echo "")
MATCHES=$(echo "$CONTENT" | grep -oEi "$FORBIDDEN_PATTERNS" | head -5 || true)
if [ -z "$MATCHES" ]; then
log_result "CONFIDENTIALITY" "$page" "PASS" "0 forbidden patterns"
else
log_result "CONFIDENTIALITY" "$page" "FAIL" "Found: $(echo "$MATCHES" | paste -sd, -)"
fi
done
for page in "${SENSITIVE_PAGES[@]}"; do
CONTENT=$(curl -s --max-time 10 "$BASE$page" 2>/dev/null || echo "")
# grep exits 1 on no match; guard it so pipefail does not append a second "0" to the count
OPENAI_COUNT=$(echo "$CONTENT" | { grep -oi "OpenAI" || true; } | wc -l)
ANTHROPIC_COUNT=$(echo "$CONTENT" | { grep -oi "anthropic\.com" || true; } | wc -l)
if [ "$OPENAI_COUNT" -eq 0 ] && [ "$ANTHROPIC_COUNT" -eq 0 ]; then
log_result "COMPETITOR" "$page" "PASS" "0 competitor refs"
else
log_result "COMPETITOR" "$page" "FAIL" "OpenAI:$OPENAI_COUNT Anthropic:$ANTHROPIC_COUNT"
fi
done
}
###############################################################################
# 3. API TESTS — Functional + Performance
###############################################################################
test_apis() {
echo ""
echo "=== 3. API TESTS ==="
# DeliverScore
START=$(date +%s%N)
DS_RESULT=$(curl -s --max-time 120 "$BASE/api/deliverscore/scan.php?domain=gmail.com" 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
if echo "$DS_RESULT" | python3 -c "import sys,json; d=json.load(sys.stdin); sys.exit(0 if 'domain' in d else 1)" 2>/dev/null; then
log_result "API" "DeliverScore (gmail.com)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "API" "DeliverScore (gmail.com)" "FAIL" "Error or timeout" "$LATENCY"
fi
# MedReach
START=$(date +%s%N)
MR_RESULT=$(curl -s --max-time 30 "$BASE/api/medreach/search.php?specialty=cardiologue&country=MA&limit=5" 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
if echo "$MR_RESULT" | python3 -c "import sys,json; d=json.load(sys.stdin); sys.exit(0 if 'results' in d or 'total' in d else 1)" 2>/dev/null; then
log_result "API" "MedReach (cardiologue MA)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "API" "MedReach (cardiologue MA)" "FAIL" "Error or timeout" "$LATENCY"
fi
# Content Factory
START=$(date +%s%N)
CF_RESULT=$(curl -s --max-time 120 -X POST "$BASE/api/content/generate.php" \
-H "Content-Type: application/json" \
-d '{"template":"linkedin_post","topic":"IA souveraine","language":"fr"}' 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
CF_CODE=$(echo "$CF_RESULT" | python3 -c "import sys,json; d=json.load(sys.stdin); print('ok' if 'content' in d or 'text' in d else 'fail')" 2>/dev/null || echo "fail")
if [ "$CF_CODE" = "ok" ]; then
log_result "API" "Content Factory (linkedin)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "API" "Content Factory (linkedin)" "WARN" "May be rate-limited" "$LATENCY"
fi
# GPU Chat
START=$(date +%s%N)
GPU_RESULT=$(curl -s --max-time 60 -X POST "$BASE/api/gpu/chat.php" \
-H "Content-Type: application/json" \
-d '{"model":"qwen2.5:3b","messages":[{"role":"user","content":"Hello"}],"max_tokens":50}' 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
GPU_CODE=$(echo "$GPU_RESULT" | python3 -c "import sys,json; d=json.load(sys.stdin); print('ok' if 'choices' in d else 'fail')" 2>/dev/null || echo "fail")
if [ "$GPU_CODE" = "ok" ]; then
log_result "API" "GPU Chat (qwen2.5:3b)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "API" "GPU Chat (qwen2.5:3b)" "FAIL" "Model not available" "$LATENCY"
fi
}
###############################################################################
# 4. WEVIA TESTS — Widget + Deep
###############################################################################
test_wevia() {
echo ""
echo "=== 4. WEVIA IA ==="
# Greeting
START=$(date +%s%N)
GREETING=$(curl -s --max-time 10 -X POST "$BASE/api/weval-ia" \
-H "Content-Type: application/json" \
-d '{"message":"Bonjour","mode":"fast"}' 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
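# A fast curl failure must not count as a pass, so reject the fallback payload first
if [ "$GREETING" = '{"error":"timeout"}' ]; then
log_result "WEVIA" "Greeting (<3s)" "FAIL" "Request failed or timed out" "$LATENCY"
elif [ "$LATENCY" -lt 3000 ]; then
log_result "WEVIA" "Greeting (<3s)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "WEVIA" "Greeting (<3s)" "FAIL" "${LATENCY}ms (>3000ms)" "$LATENCY"
fi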
# Deep mode
START=$(date +%s%N)
DEEP=$(curl -s --max-time 90 -X POST "$BASE/api/weval-ia-full" \
-H "Content-Type: application/json" \
-d '{"message":"Comment WEVIA peut aider mon entreprise en transformation digitale?","mode":"deep"}' 2>/dev/null || echo '{"error":"timeout"}')
END=$(date +%s%N)
LATENCY=$(( (END - START) / 1000000 ))
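# Same guard as the greeting test: a failed request is not a fast response
if [ "$DEEP" = '{"error":"timeout"}' ]; then
log_result "WEVIA" "Deep mode (<60s)" "FAIL" "Request failed or timed out" "$LATENCY"
elif [ "$LATENCY" -lt 60000 ]; then
log_result "WEVIA" "Deep mode (<60s)" "PASS" "${LATENCY}ms" "$LATENCY"
else
log_result "WEVIA" "Deep mode (<60s)" "FAIL" "${LATENCY}ms (>60000ms)" "$LATENCY"
fi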
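# Check for competitor names in WEVIA response
# str(... or '') tolerates missing, null, or non-string fields
DEEP_CONTENT=$(echo "$DEEP" | python3 -c "import sys,json; d=json.load(sys.stdin); print(' '.join(str(d.get(k) or '') for k in ('response','content','answer')).strip())" 2>/dev/null || echo "")
FORBIDDEN=$(echo "$DEEP_CONTENT" | grep -oEi "McKinsey|Deloitte|PwC|BCG|Accenture" | sort -u | tr '\n' ' ' || true)
if [ -z "$DEEP_CONTENT" ]; then
# An empty or unparseable answer proves nothing about its content
log_result "WEVIA" "0 competitor in response" "WARN" "Empty or unparseable response, nothing to check"
elif [ -z "$FORBIDDEN" ]; then
log_result "WEVIA" "0 competitor in response" "PASS" "Clean response"
else
log_result "WEVIA" "0 competitor in response" "FAIL" "Found: $FORBIDDEN"
fi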
}
###############################################################################
# 5. SECURITY TESTS
###############################################################################
test_security() {
echo ""
echo "=== 5. SECURITY ==="
# HSTS
HSTS=$(curl -sI --max-time 10 "$BASE/" 2>/dev/null | grep -i "strict-transport-security" | tr -d '\r' || true)
if [ -n "$HSTS" ]; then
log_result "SECURITY" "HSTS present" "PASS" "$HSTS"
else
log_result "SECURITY" "HSTS present" "WARN" "Missing on main domain"
fi
# No hardcoded API keys in frontend
FRONTEND=$(curl -s --max-time 10 "$BASE/products/gpu-inference.html" 2>/dev/null || echo "")
KEYS=$(echo "$FRONTEND" | grep -o "weval-playground-2026" || true)
if [ -z "$KEYS" ]; then
log_result "SECURITY" "No hardcoded API keys" "PASS" "0 keys exposed"
else
log_result "SECURITY" "No hardcoded API keys" "FAIL" "Key exposed in frontend"
fi
# CORS check: a hostile Origin must be neither wildcarded nor reflected back
CORS=$(curl -sI --max-time 10 -H "Origin: https://evil.com" "$BASE/api/weval-ia" 2>/dev/null | grep -i "access-control-allow-origin" | tr -d '\r' || true)
if echo "$CORS" | grep -qE '\*|evil\.com'; then
log_result "SECURITY" "CORS strict (no wildcard)" "WARN" "Wildcard or reflected origin: $CORS"
else
log_result "SECURITY" "CORS strict (no wildcard)" "PASS" "No wildcard, origin not reflected"
fi
# No internal IPs
for page in "/products/workspace.html" "/products/gpu-inference.html" "/products/deliverscore.html"; do
CONTENT=$(curl -s --max-time 10 "$BASE$page" 2>/dev/null || echo "")
IPS=$(echo "$CONTENT" | grep -oE '89\.167\.40\.150|88\.198\.4\.195|204\.168\.152' || true)
if [ -z "$IPS" ]; then
log_result "SECURITY" "No internal IPs in $page" "PASS" "0 IPs"
else
log_result "SECURITY" "No internal IPs in $page" "FAIL" "Found: $IPS"
fi
done
}
###############################################################################
# 6. TRACKING (S151)
###############################################################################
test_tracking() {
echo ""
echo "=== 6. TRACKING ==="
# S151 tracking
T_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://151.80.235.110/" 2>/dev/null || echo "000")
if [ "$T_CODE" = "200" ] || [ "$T_CODE" = "301" ] || [ "$T_CODE" = "302" ]; then
log_result "TRACKING" "S151 tracking server" "PASS" "HTTP $T_CODE"
else
log_result "TRACKING" "S151 tracking server" "FAIL" "HTTP $T_CODE"
fi
# Tracking domain
TD_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "https://culturellemejean.charity" 2>/dev/null || echo "000")
log_result "TRACKING" "culturellemejean.charity" "$([ "$TD_CODE" != "000" ] && echo PASS || echo FAIL)" "HTTP $TD_CODE"
# S151 tracking endpoints
for ep in "o" "c" "u"; do
EP_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "http://151.80.235.110/$ep/" 2>/dev/null || echo "000")
log_result "TRACKING" "S151 /$ep/ endpoint" "$([ "$EP_CODE" != "000" ] && echo PASS || echo WARN)" "HTTP $EP_CODE"
done
}
###############################################################################
# 7. LOAD TEST — 3 concurrent requests
###############################################################################
test_load() {
echo ""
echo "=== 7. LOAD TEST (3 concurrent) ==="
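# Each background curl writes its HTTP code to its own file: a variable
# assigned inside a background job never reaches the parent shell.
LOAD_DIR=$(mktemp -d)
# DeliverScore x3
for i in 1 2 3; do
curl -s -o /dev/null -w '%{http_code}\n' --max-time 60 "$BASE/api/deliverscore/scan.php?domain=test${i}.com" > "$LOAD_DIR/ds_$i" 2>/dev/null &
done
wait
# Count the requests that returned exactly HTTP 200
LOAD_OK=$(cat "$LOAD_DIR"/ds_* 2>/dev/null | grep -cx "200" || true)
log_result "LOAD" "DeliverScore x3 concurrent" "$([ "$LOAD_OK" -eq 3 ] && echo PASS || echo FAIL)" "$LOAD_OK/3 returned HTTP 200"
# MedReach x3
for i in 1 2 3; do
curl -s -o /dev/null -w '%{http_code}\n' --max-time 30 "$BASE/api/medreach/search.php?specialty=dentiste&country=MA&limit=5" > "$LOAD_DIR/mr_$i" 2>/dev/null &
done
wait
LOAD_OK=$(cat "$LOAD_DIR"/mr_* 2>/dev/null | grep -cx "200" || true)
log_result "LOAD" "MedReach x3 concurrent" "$([ "$LOAD_OK" -eq 3 ] && echo PASS || echo FAIL)" "$LOAD_OK/3 returned HTTP 200"
rm -rf "$LOAD_DIR"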
}
###############################################################################
# REPORT
###############################################################################
generate_report() {
echo ""
echo "=============================================="
echo " WEVAL ANTI-REGRESSION REPORT v2.0"
echo " $(date '+%Y-%m-%d %H:%M:%S')"
echo "=============================================="
echo ""
echo -e " ${GREEN}PASS${NC}: $PASS"
echo -e " ${RED}FAIL${NC}: $FAIL"
echo -e " ${YELLOW}WARN${NC}: $WARN"
echo " TOTAL: $TOTAL"
echo ""
SCORE=$(( PASS * 100 / TOTAL ))
if [ "$FAIL" -eq 0 ]; then
echo -e " VERDICT: ${GREEN}GO LIVE${NC} ($SCORE% pass rate)"
elif [ "$FAIL" -le 2 ]; then
echo -e " VERDICT: ${YELLOW}CONDITIONAL GO${NC} ($SCORE% pass rate, $FAIL failures)"
else
echo -e " VERDICT: ${RED}NO GO${NC} ($SCORE% pass rate, $FAIL failures)"
fi
echo ""
# Write the JSON report first, then state whether it was actually saved
python3 -c "
import json
results = $RESULTS
report = {
    'timestamp': '$(date -Iseconds)',
    'version': '2.0',
    'mode': '$MODE',
    'summary': {'pass': $PASS, 'fail': $FAIL, 'warn': $WARN, 'total': $TOTAL},
    'score': $SCORE,
    'results': results
}
with open('$REPORT_FILE', 'w') as f:
    json.dump(report, f, indent=2)
" 2>/dev/null && echo " Report saved to: $REPORT_FILE" || echo " WARNING: could not write report to $REPORT_FILE"
}
###############################################################################
# MAIN
###############################################################################
echo "=============================================="
echo " WEVAL Anti-Regression Framework v2.0"
echo " Mode: $MODE"
echo " $(date '+%Y-%m-%d %H:%M:%S')"
echo "=============================================="
case $MODE in
--full)
test_pages
test_confidentiality
test_apis
test_wevia
test_security
test_tracking
test_load
;;
--quick)
test_pages
test_confidentiality
test_security
;;
--api-only)
test_apis
test_wevia
test_load
;;
--security-only)
test_confidentiality
test_security
;;
*)
echo "Usage: $0 [--full|--quick|--api-only|--security-only]"
exit 1
;;
esac
generate_report