Changelog

Release notes for the Pauhu® platform. All infrastructure runs on Cloudflare Workers in EU jurisdiction (Helsinki).

2026-03-11 v5

CUA browser automation, 6 new documentation pages, staging consolidation, security hardening, and link integrity verification across all 5 domains.

Computer Use Agent (CUA)

New CUA architecture — screenshot→reason→act→verify loop with deontic safety gates. 7 action types, 12 allowlisted EU portals, human-in-the-loop confirmation for all form submissions.
New CUA API — 8 endpoints: /v1/cua/start, action, screenshot, stream (SSE), confirm, rollback, stop, stats. Full request/response schemas with curl examples.
New CUA MCP tools — 8 Playwright actions exposed as MCP tools (click, type, scroll, navigate, select, submit, wait, screenshot) for programmatic browser automation.
New CUA safety model — sandbox boundaries, M1 prohibited actions, audit trail (90-day retention), GDPR Art. 17 erasure support, EU-jurisdiction screenshot storage.
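
The screenshot, reason, act, verify loop and its safety gates can be sketched roughly as below. This is an illustration under stated assumptions, not the Pauhu implementation: the `Action` type, `gate`, `run_loop`, the callback signatures, and the two allowlisted hostnames are all hypothetical (the release notes mention 12 allowlisted portals without listing them).

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlparse

# Hypothetical allowlist; the real list of 12 portals is not given in these notes.
ALLOWED_HOSTS = {"ted.europa.eu", "eur-lex.europa.eu"}


@dataclass
class Action:
    kind: str    # e.g. "click", "type", "navigate", "submit"
    target: str  # URL for "navigate", element selector otherwise


def gate(action: Action, confirm: Callable[[Action], bool]) -> bool:
    """Return True if the action may run."""
    if action.kind == "navigate":
        host = urlparse(action.target).hostname or ""
        if host not in ALLOWED_HOSTS:
            return False  # outside the sandbox boundary
    if action.kind == "submit":
        return confirm(action)  # human-in-the-loop for form submissions
    return True


def run_loop(
    screenshot: Callable[[], bytes],
    reason: Callable[[bytes], Optional[Action]],
    act: Callable[[Action], None],
    verify: Callable[[Action], bool],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> list:
    """Run the loop until the planner returns None, a gate blocks, or steps run out."""
    log = []
    for _ in range(max_steps):
        action = reason(screenshot())
        if action is None:
            break
        if not gate(action, confirm):
            log.append((action.kind, "blocked"))
            break
        act(action)
        log.append((action.kind, "ok" if verify(action) else "failed"))
    return log
```

The returned log stands in for the audit trail; a blocked action ends the session rather than being silently skipped, which is one possible design choice among several.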

Documentation

New 6 CUA documentation pages — architecture, API reference, safety model, quickstart (TED eProcurement walkthrough), MCP tools, FAQ (18 Q&As).
New Getting Started pages for pauhu.eu (data feeds) and pauhu.com (search + translate). 3-step format with curl examples.
Improved Documentation index updated with CUA section links. All 5 staging domains verified: 43 same-domain targets, 12 cross-domain targets, 0 broken links.

Staging consolidation (v4 → v5)

Improved Direct index.html serving — removed _index.html indirection layer. Staging domains now serve content directly instead of “Building…” placeholder pages.
Improved pauhu.eu index restored — ticker ribbons, all product sections, and prior improvements merged into a single index (commit 653df7c5d).
Improved 52 files changed across staging domains: pricing pages, source pages, quality dashboards, welcome flows refreshed.

Security

Security X-Pauhu-Domain injection fix — header always deleted and overwritten in gateway proxy, preventing cross-product ACL bypass.
Security /v1/feature-health auth — endpoint now requires admin/premium auth with 10/min rate limit. Evidence field redacted to prevent binding name leakage.
Security CORS + chat fix — Gateway CORS headers and chat endpoint corrected.
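
The "always delete, then overwrite" pattern behind the X-Pauhu-Domain fix might look like the sketch below. The function name and the idea that the gateway derives the trusted value itself (for example from the route) are assumptions for illustration; the actual gateway code is not shown in these notes.

```python
def sanitize_headers(incoming: dict[str, str], trusted_domain: str) -> dict[str, str]:
    """Drop any client-supplied X-Pauhu-Domain and set the gateway's own value."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    cleaned = {k: v for k, v in incoming.items() if k.lower() != "x-pauhu-domain"}
    # The value comes from the gateway, never from the client request.
    cleaned["X-Pauhu-Domain"] = trusted_domain
    return cleaned
```

Because the header is removed unconditionally before being set, a client cannot smuggle in a different product domain by varying the header's capitalization.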

Internationalization

Fixed Hardcoded locale parameters in 3 staging files (brief.js, translate-monitor.js, quality/index.html).
New 68 CUA i18n keys added for overlay labels, action types, error messages, and TED demo flow in 24 EU languages.

2026-03-07 Update

Sovereign Brain architecture documentation, 3-tier adaptive model loading, supply chain sovereignty, annotation inheritance, and vectorize embedding pipeline fixes.

Sovereign Brain architecture

New Sovereign Brain documentation — full architecture guide explaining two-hemisphere design: left hemisphere (Laine search, 26ms paragraph retrieval) and right hemisphere (FiD mT5, grounded generation with citations).
New Thalamus gateway — single entry point that validates every query and routes to the correct hemisphere before either side does any work.
New Supply chain sovereignty — models run in the browser or on your server. No cloud API dependency, no model hosting subscription, no inference-per-token billing. Models use the open ONNX format and ship in OCI-compliant containers.

Adaptive model loading

New 3-tier loading — Lite (<4 GB, search only), Standard (4–16 GB, search + FiD generation), Full (>16 GB, all models including 552 NMT pairs and 21 domain classifiers).
New Progressive download — search models load first (available in seconds), FiD generation loads second (10–30s), translation pairs load on demand.
Improved FiD generation model fits in 300 MB of DRAM. Runs on commodity hardware without special procurement.
Improved Browser-native inference via ONNX Runtime for WebAssembly. No server, no GPU required. Data never leaves the browser process.
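
As a rough sketch of the tier thresholds listed above: the function name, tier identifiers, and model group labels are hypothetical, and how available memory is measured is left open.

```python
def select_tier(memory_gb: float) -> str:
    """Map available memory to a loading tier using the thresholds above."""
    if memory_gb < 4:
        return "lite"       # search only
    if memory_gb <= 16:
        return "standard"   # search + FiD generation
    return "full"           # all models


# Progressive order: search first, FiD second; translation pairs are
# fetched on demand and therefore not part of the eager list.
EAGER_LOAD_ORDER = {
    "lite": ["search"],
    "standard": ["search", "fid"],
    "full": ["search", "fid", "classifiers"],
}
```

For example, a device reporting 8 GB would fall into the Standard tier and load the search models before the FiD generation model.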

Annotation inheritance

New English-first Rosetta pattern — English is annotated with highest-quality NLP models, and structural annotations (topic, deontic modality, cross-references) are inherited by all 24 parallel language versions.
New COALESCE/NULLIF SQL — inheritance logic prefers language-specific annotations when available, falls back to English otherwise. 24/24 EU language coverage for all topic domains.
Improved Multilingual rollout in progress: multilingual indexing, backfill of existing EN-only documents, cross-language annotation verification.
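
The COALESCE/NULLIF fallback can be illustrated with an in-memory SQLite database. The table and column names here are invented for the example, and the production schema (on D1) may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE annotations (doc_id TEXT, lang TEXT, topic TEXT);
    INSERT INTO annotations VALUES
      ('doc1', 'en', 'procurement'),
      ('doc1', 'fi', ''),
      ('doc1', 'sv', NULL),
      ('doc1', 'de', 'vergabe');
    """
)

# NULLIF turns an empty string into NULL; COALESCE then falls back to the
# English annotation whenever the language-specific value is missing.
QUERY = """
SELECT t.lang, COALESCE(NULLIF(t.topic, ''), en.topic) AS topic
FROM annotations AS t
JOIN annotations AS en
  ON en.doc_id = t.doc_id AND en.lang = 'en'
WHERE t.doc_id = ?
ORDER BY t.lang
"""

rows = conn.execute(QUERY, ("doc1",)).fetchall()
```

In this sample the German row keeps its own annotation, while the Finnish (empty) and Swedish (NULL) rows inherit the English one.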

Vectorize embedding pipeline

Fixed Vector serialization — Float32Array output from BGE-M3 was not serializing correctly to the vector index. Fixed via Array.from() normalization before storage. Vectors now populate correctly across all 20 product indexes.
Improved Documented full embedding path: object storage → annotation engine (STAM) → structured index (D1) → embedding service (BGE-M3, 1024 dimensions) → vector index (cosine similarity).
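
The serialization fix above was made in JavaScript with Array.from(). As an analogy only, the same class of problem appears in Python when a typed float32 buffer is passed to a JSON encoder: it must be converted to a plain list first. The payload shape below is an assumption, not the vector index's actual format.

```python
import json
from array import array

# A typed float32 buffer, standing in for the embedding model's output.
embedding = array("f", [0.25, -0.5, 1.0])

try:
    json.dumps({"values": embedding})
    raw_serializable = True
except TypeError:
    # The typed buffer is not a JSON-serializable type.
    raw_serializable = False

# Normalizing to a plain list of numbers makes the payload serializable.
payload = json.dumps({"values": list(embedding)})
```

The sample values are exactly representable in float32, so the round trip is lossless here; real embeddings would show the usual float32-to-float64 widening in their decimal form.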

Documentation

New CRM API reference — internal API documentation for sales pipeline: contacts, companies, deals, activities, tasks, email sequences, AI lead scoring.
New Data pipeline guide — 14-section documentation covering ingestion, STAM annotation, paragraph indexing, semantic search, grounded generation, multilingual flow, annotation inheritance, vectorize embedding, adaptive loading, browser sidebar integration.
New “Works Alongside Your Tools” — browser sidebar overlay for contextual search, grounded answers, terminology lookup, and local translation. No vendor lock-in, no per-seat licensing.

2026-03-06 Update

Security audit clearance, E2E staging verification, FiD dual-brain architecture, 24-language translation at 100% coverage, and container hardening.

Security

Security FiD dual-brain audit complete — 0 CRITICAL findings. ONNX model SHA-256 checksums verified. STAM bounds checking confirmed. Server-side grounding guarantee validated.
Security Container security clearance — all sovereign container images audited: FiD container cleared, Compass container cleared (XSS fix applied), LDS container conditionally passed.
Security GDPR Art. 17 erasure — right-to-erasure flow verified end-to-end. Data retention policy documented.
Security Air-gap readiness — government readiness audit scored 72/100. Offline mode detection and Finnish/Dutch language detection added.

Staging verification

Improved End-to-end launch testing across all staging domains (pauhu.eu, pauhu.com, pauhu.ai, pauhu.dev). 8 tests PASS, 2 PASS with runtime notes, 2 deferred.
Improved Data feeds page verified: 20 EU institutional source cards with correct attribution and licensing.
Improved Sovereign deployment connector page verified.

FiD architecture

New FiD browser neural network specification — Fusion-in-Decoder architecture for grounded answer generation. Encoder processes each retrieved paragraph independently; decoder attends to all simultaneously for cross-document reasoning.
Improved FiD data format cleaned and validated. Retrieval comparison test against BGE-M3 cross-lingual embeddings completed.
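
The data flow of Fusion-in-Decoder, independent encoding followed by joint decoding, can be shown with a toy stand-in. There is no neural network here: `encode` merely splits text into tokens, and the prompt layout and function names are assumptions for illustration.

```python
def encode(question: str, passage: str) -> list[str]:
    """Stand-in for the encoder: one 'state' per whitespace token."""
    return f"question: {question} context: {passage}".split()


def fuse(question: str, passages: list[str]) -> list[tuple[int, str]]:
    """Encode each passage on its own, then concatenate all states.

    The decoder would cross-attend over this single fused sequence, which
    is what lets it combine evidence from several passages at once.
    """
    fused = []
    for index, passage in enumerate(passages):
        # Each call sees only one passage, so the encodings are independent.
        fused.extend((index, state) for state in encode(question, passage))
    return fused
```

Tagging each state with its passage index is one simple way to keep the provenance needed for citations.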

Internationalization

Improved 24/24 EU languages at 100% coverage — all translation keys verified across Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish.
Improved FiD 24-language prompt templates delivered. Decoder locale keys wired.

Infrastructure

Improved Worker count increased from 97 to 150. All new workers are placed in EU jurisdiction.
Improved 4.7M+ objects verified across 24 product R2 buckets.
Improved LLM adapter port changed from 8000 to 8001 to resolve collision with TTS container.
Fixed Chat renderer wired into workspace UI.

2026-03-03 Launch

The European launch release brings server-side document extraction, a transparent two-part pricing model, 20 live EU institutional data feeds, 2.4M IATE terminology terms, and browser-native ONNX inference with full offline support.

Document extraction integration

New POST /api/v1/extract — Server-side document extraction via headless Chrome hosted on Hetzner Helsinki (EU). Extracts text from any URL with optional IATE terminology lookup and STAM annotation.
New POST /api/v1/extract-and-index — Extract, annotate, and store documents in EU storage for automatic indexing. Writes STAM sidecar JSON with full provenance metadata.
New POST /api/v1/pdf-render — Render any URL to PDF via server-side Chrome. Returns raw PDF binary with inline Content-Disposition.
Improved Document extraction runs on Hetzner Helsinki — documents never leave the EU. Tab is closed after each extraction (stateless).
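
A request to the extraction endpoint might be built as follows. Only the path and the `terminology` flag come from these notes; the base URL, the `url` field name, and the absence of authentication headers are assumptions, and the request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.example.invalid"  # hypothetical placeholder host


def build_extract_request(url: str, terminology: bool = False) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/extract."""
    body = json.dumps({"url": url, "terminology": terminology}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/extract",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_extract_request("https://eur-lex.europa.eu/", terminology=True)
```

Passing the request to `urllib.request.urlopen` would send it; a real client would also need whatever credentials the API requires.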

Two-part tariff pricing

New GET /v1/pricing/two-part — Transparent two-part tariff: Azure pass-through cost (live from Azure Retail Prices API, northeurope region, EUR) plus Pauhu Data License (fixed EUR amounts). No markup on compute, clear separation of infrastructure and data costs.
New GET /v1/pricing/data-licenses — Enterprise data licensing endpoint with tiered pricing.
New GET /v1/pricing/ranker — Semantic ranker add-on pricing.
Improved All pricing pages across 5 domains (pauhu.eu, pauhu.com, pauhu.ai, pauhu.dev, pauhu.io) now use pauhu-pricing.js for live client-side price filling from the Azure Retail Prices API.
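
The two-part structure amounts to adding an unmarked-up compute cost to a fixed license fee. The sketch below uses invented numbers and a hypothetical function name; real compute prices come from the Azure Retail Prices API and real license fees from the pricing endpoints.

```python
from decimal import Decimal


def two_part_total(unit_price_eur: Decimal, units: Decimal, data_license_eur: Decimal) -> dict:
    """Return both tariff parts and their sum, keeping them visibly separate."""
    compute = unit_price_eur * units  # pass-through: no markup applied
    return {
        "compute_pass_through": compute,
        "data_license": data_license_eur,
        "total": compute + data_license_eur,
    }


# Illustrative values only.
quote = two_part_total(Decimal("0.50"), Decimal("100"), Decimal("250.00"))
```

Decimal is used instead of float so that currency amounts are not affected by binary rounding.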

20 EU institutional data feeds

New All 20 EU data sources are live with dedicated R2 buckets, D1 databases, Vectorize indexes, and queue pipelines per product.
New Products: Commission, Consilium, CORDIS, CURIA, Data Europa, DPP, ECB, ECHA, EMA, EPO, Europarl, EUR-Lex, Eurostat, IATE, National Law (28 countries), OEIL, Publications, TED, Who is Who, Wiki.
Improved Hybrid search across all 20 indexes: 70% BGE-M3 semantic similarity (1024 dimensions, cosine) + 30% BM25 keyword matching via the Laine Algorithm.
Improved STAM (Stand-off Text Annotation Model) sidecar format for non-destructive annotations with full provenance tracking. All producers and consumers migrated from .annotation.json to .stam.json.
Improved 11 data sync services migrated to eternal protocols (SPARQL, SDMX, REST) — zero fragile API patterns.
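
The 70/30 weighting used in hybrid search can be sketched as a weighted sum. The notes do not say how BM25 scores are brought onto the same scale as cosine similarity; min-max normalization is assumed here, and all names are hypothetical.

```python
def normalize(scores: list[float]) -> list[float]:
    """Min-max normalize to [0, 1]; all-equal inputs map to 0.0."""
    low, high = min(scores), max(scores)
    if high == low:
        return [0.0 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


def hybrid_rank(docs: dict[str, tuple[float, float]], w_semantic: float = 0.7) -> list[str]:
    """Rank doc ids by w_semantic * cosine + (1 - w_semantic) * normalized BM25."""
    ids = list(docs)
    bm25 = normalize([docs[i][1] for i in ids])
    combined = {
        doc_id: w_semantic * docs[doc_id][0] + (1 - w_semantic) * bm25_score
        for doc_id, bm25_score in zip(ids, bm25)
    }
    return sorted(ids, key=lambda doc_id: combined[doc_id], reverse=True)


# (cosine similarity, raw BM25) per document; values are made up.
ranking = hybrid_rank({"a": (0.9, 2.0), "b": (0.4, 10.0), "c": (0.6, 6.0)})
```

In this sample, document "a" ranks first on semantic similarity alone even though it has the weakest keyword match.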

IATE terminology

Improved 2,456,445 terms across 24 EU official languages. Reliability scores and domain classification included in all responses.
Improved Fuzzy search, exact lookup, entry by concept ID, language stats, and quality dashboard endpoints all live.
Improved IATE terms automatically extracted during document extraction when terminology: true is set.

Browser-native inference

New ONNX Runtime Web integration with WebGPU/WASM backend auto-detection. Cold-start and warm-start benchmarking via the Onboarding Wizard.
New Full offline mode with Service Worker caching. Pre-flight asset audit, download progress indicators, and offline workflow verification.
New Browser-side vision models: Tesseract.js (OCR), ViT-GPT2 (captioning), MobileNetV3 (classification), Legal-BERT (NER), BART-Large-CNN (summarization). No data leaves the browser.
Improved Model downloads show granular progress indicators with phase labels (detecting device, cold loading, cold inference, warm loading, warm inference).

Infrastructure

Improved 97 Cloudflare Workers, all with [placement] jurisdiction = "eu". All R2 buckets EU jurisdiction. All D1 databases created with --location eu.
Improved 16-gate orchestrator loop with deontic modality classification (prohibition, obligation, permission, exemption).
Improved DSPy integration: 486 signatures, 401 modules, 34 orchestrators. MetaPromptEngine for prompt optimization with EU AI Act transparency compliance.
Improved EWMA-based gate health monitoring replaces Z-score. Trend-aware statistical process control with 3-sigma control limits.
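
An EWMA control chart with 3-sigma limits can be sketched with the textbook formulas. The smoothing factor of 0.2, the known baseline mean and standard deviation, and the function name are assumptions; the notes do not state the parameters used for gate health monitoring.

```python
import math


def ewma_chart(values, mean, sigma, lam=0.2, limit=3.0):
    """Return (ewma, lower, upper, out_of_control) for each observation.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, starting from z_0 = mean.
    Limits are mean +/- limit * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam)^(2t))).
    """
    z = mean
    points = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1 - lam) * z
        width = limit * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        points.append((z, mean - width, mean + width, abs(z - mean) > width))
    return points


# Two in-control readings followed by a large jump.
chart = ewma_chart([10.0, 10.0, 20.0], mean=10.0, sigma=1.0)
```

Because each point blends in earlier observations, the chart reacts to sustained drift that a per-sample Z-score would treat as a series of unremarkable values.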

Accessibility

Fixed Resolved 7 CRITICAL and 2 HIGH WCAG 2.1 AA violations across all staging domains.
Improved Skip-to-content links, ARIA landmarks, keyboard navigation, focus traps, text size controls (A+/A-), and reduced-motion support on all 5 domains.
Improved All images have descriptive alt text. All interactive elements are keyboard-navigable. All forms use proper role="search" and aria-label attributes.

Security

Security IEC 62443-3-3 zone-based security verified. Model Last pattern enforced: gates run before inference.
Security 81 services hardened with development mode disabled. Only the API router and landing page remain intentionally public.
Security SHA-256 checksums for all model files. Credential rotation tracking. Python credential audit complete.
Security GDPR Article 25 (Privacy by Design) and Article 32 (Security of Processing) compliance verified. All data flows EU-only — no third-country transfer.
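
Checksum verification of a model file can be sketched as below. The function names are hypothetical, and where the expected digests are published is not specified in these notes.

```python
import hashlib
import hmac
import io
from typing import BinaryIO


def sha256_of(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(stream: BinaryIO, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one in constant time."""
    return hmac.compare_digest(sha256_of(stream), expected_hex.lower())


# Demo on an in-memory stream; a real check would open the model file with "rb".
demo_digest = sha256_of(io.BytesIO(b"abc"))
```

A loader would typically refuse to initialize a model whose digest does not match the recorded value.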

Compliance

Improved EU AI Act Article 50 transparency endpoints on all AI-powered services.
Improved DSA Article 27 ranking transparency metadata in search results.
Improved NIS2 Important Entity self-assessment (Art. 21 partial).