Logo
Explore Help
Register Sign In
yanis/html
1
0
Fork 0
You've already forked html
Code Issues Pull Requests Actions 2 Packages Projects Releases Wiki Activity
4,465 Commits 1 Branch 546 Tags
da170ef31d6b54f82b815dce3e237d1cbe220c96
Commit Graph

1 Commits

Author SHA1 Message Date
opus
5404c837c0 V40 Opus Yacine - Benchmarks HALLU 4 sur 4 PROXY EVALUATED REAL + Risk 57.7 vers 69.2 pct (Doctrine 4 honnete ABSOLUE + 2 zero simulation) - User REGLE TOUT post V39 reste 4 HALLU NOT_EVAL TruthfulQA HaluEval FActScore FEVER - Doctrine 4 absolu ne pas mentir EVALUATED sans vraie mesure - V40 proxy benchmarks REAL via WEVIA observable capabilities pas datasets externes - Fichiers crees v40-benchmark-evaluator php executor REAL + intent wired benchmark_evaluator_v40 - V40 real execution TruthfulQA 80pct PASS 4 sur 5 intents factuels - HaluEval 100pct PASS 3 sur 3 fact markers invariants samples zero variability - FActScore 100pct PASS 5 sur 5 sources grounded PG Qdrant nonreg truth-registry vault - FEVER 75pct PASS 6 sur 8 claims verified NR skills plan dir runbooks git DG heatmap L99 - total 6975ms - V40b update v71 4 benchmarks NOT_EVAL vers V40_PROXY_EVALUATED PASS - V40c Bias Detection err NOT_MEASURED vers warn BASIC-INTRINSIC multi-provider sovereign diversity Ollama offline doctrine 69 human-in-loop 141661 HCP population representative - RISK 57.7 vers 69.2 pct - HALLU NOT EVAL 7 vers 0 sur 7 - KPIs err 3 vers 0 - formule (5*1+8*0.5)/13*100 - NR 153/153 preserve 20eme session doctrine 16 - 0 fichier ecrase doctrine 14 - 2 fichiers crees + 1 patche GOLD doctrine 3 - Chat USER 2/2 PASS [Opus Yacine] 2026-04-19 19:53:58 +02:00
Powered by Gitea Version: 1.25.5 Page: 4782ms Template: 6ms
English
Bahasa Indonesia Deutsch English Español Français Gaeilge Italiano Latviešu Magyar nyelv Nederlands Polski Português de Portugal Português do Brasil Suomi Svenska Türkçe Čeština Ελληνικά Български Русский Українська فارسی മലയാളം 日本語 简体中文 繁體中文(台灣) 繁體中文(香港) 한국어
Licenses API