Extension Catalog
96 core extensions for EU AI infrastructure. Data connectors, translation models, compliance validators, search and annotation tooling, export formats, developer tools, and infrastructure. All open source, all under EU jurisdiction.
Data Connectors
Pull from 20 EU institutional sources. Each connector handles authentication, rate limiting, pagination, and STAM annotation.
EUR-Lex Connector Core
EU legislation, directives, and regulations. CELLAR SPARQL + REST. 8 sectors, 24 languages.
IATE Terminology Core
2.4M institutional terms across 24 EU languages. TBX/TMX export. Vectorize-indexed.
CURIA Case Law Core
Court of Justice judgments, opinions, and orders. ECLI identifiers.
Eurostat Statistics Core
SDMX statistical data. GDP, trade, demographics, environment indicators.
European Parliament Core
Plenary debates, committee reports, legislative procedures via OEIL.
TED Procurement Core
Public procurement notices. Contract awards, calls for tenders. eForms native.
Data Europa Core
Open data portal. DCAT-AP metadata, 1.7M datasets from 36 countries.
ECB Financial Core
Exchange rates, monetary policy decisions, financial stability reports.
CORDIS Research Core
EU-funded research projects. Horizon Europe, FP7, H2020 results.
Commission Documents Core
Proposals, communications, delegated acts, implementing decisions.
Council of the EU Core
Council conclusions, meeting agendas, voting records, press releases.
Publications Office Core
Official Journal, Supplement, annual reports. CELLAR repository.
OEIL Legislative Core
Legislative Observatory. Track procedures, amendments, committee opinions.
Who is Who Core
EU institutional directory. Commissioners, DGs, agencies, committees.
ECHA Chemicals Core
REACH, CLP, biocides registrations. Substance dossiers, SVHC lists.
EMA Medicines Core
Authorised medicines, EPARs, safety signals, clinical trial data.
EPO Patents Core
European patents, patent families, legal status. Open Patent Services API.
National Law Core
28 country adapters. Finlex, BGBl, Legifrance, BOE, and more.
Wikipedia EU Core
Wikidata entities + Wikipedia summaries for EU legislation and institutions.
DPP Registry Core
Digital Product Passport. ESPR 2024/1781 categories, lifecycle data.
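The connectors in this section are described as handling rate limiting and pagination. A minimal sketch of that pattern follows, assuming a cursor-paginated source. The `fetch_page` callable, the `items`/`next` page shape, and `min_interval` are hypothetical names chosen for illustration, not the actual connector interface.

```python
import time
from typing import Callable, Iterator, Optional


def iter_records(
    fetch_page: Callable[[Optional[str]], dict],
    min_interval: float = 0.0,
) -> Iterator[dict]:
    """Yield every record from a cursor-paginated source, one page at a time.

    fetch_page is a hypothetical callable: given a cursor (None for the first
    page) it returns {"items": [...], "next": <cursor or None>}.
    min_interval is the minimum number of seconds between two requests.
    """
    cursor: Optional[str] = None
    last_call: Optional[float] = None
    while True:
        # Client-side rate limiting: wait until min_interval has elapsed.
        if last_call is not None:
            wait = min_interval - (time.monotonic() - last_call)
            if wait > 0:
                time.sleep(wait)
        last_call = time.monotonic()

        page = fetch_page(cursor)
        yield from page["items"]

        # Stop when the source reports no further page.
        cursor = page.get("next")
        if cursor is None:
            break
```

Because the function is a generator, callers can stop early without fetching the remaining pages.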
Translation
Neural machine translation models for regulated content. Helsinki-NLP OPUS models, browser-native ONNX, terminology-enforced.
OPUS MT en-fi Core
English to Finnish. Helsinki-NLP OPUS. INT8 quantized, ~60 MB ONNX.
OPUS MT fi-en Core
Finnish to English. Helsinki-NLP OPUS. INT8 quantized, ~60 MB ONNX.
OPUS MT en-de Core
English to German. Legal and technical domain fine-tuned.
OPUS MT en-fr Core
English to French. EU institutional register style.
OPUS MT en-sv Core
English to Swedish. Nordic legal terminology aligned.
OPUS MT en-es Core
English to Spanish. EUR-Lex parallel corpus trained.
OPUS MT Multilingual Core
M2M-100 multilingual model. Any-to-any among 24 EU languages.
Terminology Enforcer Core
IATE term enforcement during translation. Guaranteed institutional terminology.
TM Lookup Core
Translation memory segment matching. EUR-Lex parallel corpus, 24 languages.
QA Checker Core
Post-edit quality assurance. Terminology consistency, number validation, tag integrity.
TBX Import Core
Import customer terminology databases in TBX v3 format.
TMX Import Core
Import translation memories in TMX 1.4b format.
XLIFF Round-Trip Core
XLIFF 2.1 import/export. CAT tool interoperability.
SRX Segmenter Core
SRX-based sentence segmentation. Language-specific rules for EU languages.
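The QA Checker entry above lists number validation among its checks. One way such a check could work is sketched below: collect the numbers in the source and target segments and report any that do not appear on both sides. The function name and the normalisation rule (dropping `.` and `,` so that `7.5` and `7,5` compare equal) are assumptions made for this example. Numbers grouped with spaces, such as `1 000`, are not handled.

```python
import re
from collections import Counter

# A run of digits, optionally containing '.' or ',' separators (e.g. 7.5 or 1,000).
_NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def _digit_strings(text: str) -> Counter:
    """Count numbers in text after stripping decimal/thousands separators."""
    return Counter(re.sub(r"[.,]", "", match) for match in _NUMBER.findall(text))


def number_mismatches(source: str, target: str) -> dict:
    """Report numbers present on only one side of a translated segment."""
    src, tgt = _digit_strings(source), _digit_strings(target)
    return {
        "missing": sorted((src - tgt).elements()),      # in source, not in target
        "unexpected": sorted((tgt - src).elements()),   # in target, not in source
    }
```

An empty `missing` and `unexpected` list means the two segments carry the same numbers under this simplified comparison.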
Compliance
Regulatory compliance validators for EU legislation. Check obligations, track deadlines, classify risk levels.
EU AI Act Classifier Core
Risk classification per EU AI Act 2024/1689. Annexes I-VIII mapping.
GDPR Checker Core
Data processing agreement analyzer. Art. 6 legal basis, Art. 28 processor checks.
NIS2 Validator Core
NIS2 Directive compliance. Essential vs. important entity classification.
DORA Checker Core
Digital Operational Resilience Act. ICT risk management requirements.
DPP Validator Core
Digital Product Passport validation against ESPR 2024/1781 schema.
REACH Checker Core
REACH regulation substance screening. SVHC candidate list matching.
Deontic Classifier Core
Extract obligations (MUST), prohibitions (MUST NOT), permissions (MAY) from legal text.
Obligation Tracker Core
Track regulatory deadlines, transposition dates, and compliance milestones.
EuroVoc Tagger Core
Automatic EuroVoc topic classification. 21 domains, 7,000 concepts.
CELEX Resolver Core
Resolve CELEX IDs to full document metadata. Cross-reference resolver.
Accessibility Checker Core
European Accessibility Act compliance. WCAG 2.1 AA + EN 301 549.
MDR Classifier Core
Medical Device Regulation 2017/745. Device classification rules.
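The Deontic Classifier entry above separates obligations, prohibitions, and permissions. A deliberately naive keyword sketch of that three-way split for English text is shown below. The cue phrases and the function name are assumptions for illustration, and a production classifier would need far more than keyword matching.

```python
import re
from typing import Optional

# Prohibitions are checked first because "shall not" also contains "shall".
_RULES = [
    ("prohibition", re.compile(r"\b(shall not|must not|may not|is prohibited)\b", re.I)),
    ("obligation", re.compile(r"\b(shall|must|is required to)\b", re.I)),
    ("permission", re.compile(r"\b(may|is entitled to)\b", re.I)),
]


def classify_deontic(sentence: str) -> Optional[str]:
    """Return 'prohibition', 'obligation', 'permission', or None if no cue matches."""
    for label, pattern in _RULES:
        if pattern.search(sentence):
            return label
    return None
```

The rule order carries the logic here: testing the negated forms before the bare modal verbs is what keeps "shall not" from being read as an obligation.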
Search & RAG
Semantic search and retrieval-augmented generation across EU legal corpora. BGE-M3 embeddings, 1024 dimensions.
Semantic Search Core
Semantic search across 165K EU documents. BGE-M3 embeddings, cosine similarity.
Legal RAG Pipeline Core
Retrieval-augmented generation for legal questions. Citation-grounded answers.
Cross-Language Search Core
Query in one language, find results in all 24. Multilingual BGE-M3.
Citation Extractor Core
Extract and resolve legal citations. CELEX, ECLI, ELI identifiers.
Document Chunker Core
Intelligent document chunking for RAG. Article-aware, section-respecting splits.
Embedding Generator Core
Generate BGE-M3 embeddings for custom documents. 1024-dim vectors.
Hybrid Search Core
Combine keyword (BM25) and semantic (vector) search. Reciprocal rank fusion.
Context Window Core
Optimal context assembly for LLM prompts. Token budget management.
Reranker Core
Cross-encoder reranking for search results. Legal domain fine-tuned.
Answer Grounding Core
Verify LLM answers against source documents. Hallucination detection.
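The Hybrid Search entry above combines BM25 and vector results with reciprocal rank fusion. A minimal sketch of that fusion step follows. Each document scores the sum of `1 / (k + rank)` over the rankings it appears in. The function name is hypothetical, and `k = 60` is a commonly used constant rather than a value confirmed for this catalog.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list, best first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Only ranks are used, so the BM25 and cosine-similarity scores never need to be put on a common scale.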
Annotation
STAM-based document annotation. Product classification, entity extraction, and metadata enrichment.
STAM Annotator Core
Stand-off Text Annotation Model. Sidecar JSON annotations for any document.
Product Classifier Core
Classify documents into 11 ESPR product categories + DPP metadata.
Named Entity Recognition Core
Extract organizations, persons, locations, legal references from EU text.
Date Extractor Core
Extract and normalize dates, deadlines, and time periods from legal text.
Amount Extractor Core
Extract monetary amounts, thresholds, and penalties from regulations.
Legal Reference Linker Core
Resolve cross-references between EU legal acts. Build citation graphs.
Sector 7 Linker Core
Link EU directives to 290K national transposition measures.
Amendment Tracker Core
Track how legal acts are amended over time. Consolidated text diffs.
Language Detector Core
Detect document language among the 24 EU languages. fastText-based.
Readability Scorer Core
Score text readability. Flesch-Kincaid adapted for legal German, French, Finnish.
AlphaFold Lookup Core
Protein structure lookup for REACH/CLP chemical dossiers. PDB cross-reference.
eForms Parser Core
Parse eForms XML procurement notices. 800+ business terms mapped.
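Several entries in this section describe stand-off annotation, where annotations live beside the text and point into it by offset, as well as date extraction. The sketch below combines the two for English day-month-year dates. The record fields (`begin`, `end`, `type`, `value`) are a simplified shape assumed for this example and are not the actual STAM JSON schema.

```python
import re
from datetime import date
from typing import Dict, List

_MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]
_MONTHS = {name: number for number, name in enumerate(_MONTH_NAMES, start=1)}
_DATE = re.compile(r"\b(\d{1,2}) (" + "|".join(_MONTH_NAMES) + r") (\d{4})\b")


def annotate_dates(text: str) -> List[Dict]:
    """Return stand-off records for dates like '2 August 2026' found in text."""
    annotations = []
    for match in _DATE.finditer(text):
        day, month, year = int(match.group(1)), _MONTHS[match.group(2)], int(match.group(3))
        try:
            value = date(year, month, day).isoformat()
        except ValueError:
            continue  # skip impossible dates such as 31 February
        # Offsets refer to the untouched source text; the text itself is not modified.
        annotations.append(
            {"begin": match.start(), "end": match.end(), "type": "date", "value": value}
        )
    return annotations
```

The resulting list can be serialised with `json.dumps` and stored as a sidecar file next to the document.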
Export & Format
Transform EU data into standard formats. JSON-LD, CSV, Akoma Ntoso, ODRL, and more.
JSON-LD Export Core
Export annotations as schema.org JSON-LD. Legislation, Dataset, CreativeWork.
CSV/TSV Export Core
Flat file export. One row per document, configurable columns.
Akoma Ntoso Core
Export legislation in Akoma Ntoso XML. OASIS LegalDocML standard.
ODRL Policies Core
Generate ODRL usage policies for LDS data sharing agreements.
DCAT-AP Metadata Core
Generate DCAT-AP v3 metadata for open data portals.
PDF Generator Core
Generate PDF reports from search results, annotations, and analyses.
RSS Feed Core
Generate RSS/Atom feeds for regulatory updates. Per-domain or per-keyword.
Webhook Relay Core
Push regulatory updates to external systems. Configurable filters.
Parquet Export Core
Export datasets as Apache Parquet for data science workflows.
SPARQL Endpoint Core
SPARQL 1.1 query endpoint over annotated EU data graph.
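The JSON-LD Export entry above mentions the schema.org `Legislation` type. A minimal sketch of such a record is shown below. The function name and its parameters are assumptions for illustration, while the property names (`legislationIdentifier`, `name`, `inLanguage`, `legislationDate`) come from schema.org.

```python
import json


def to_legislation_jsonld(celex: str, title: str, language: str, adopted: str) -> dict:
    """Build a schema.org Legislation record as a JSON-LD dictionary.

    adopted is the adoption date as an ISO 8601 string (YYYY-MM-DD).
    """
    return {
        "@context": "https://schema.org",
        "@type": "Legislation",
        "legislationIdentifier": celex,
        "name": title,
        "inLanguage": language,
        "legislationDate": adopted,
    }


if __name__ == "__main__":
    record = to_legislation_jsonld(
        "32024R1689", "Regulation (EU) 2024/1689", "en", "2024-06-13"
    )
    print(json.dumps(record, indent=2))
```

Using the CELEX number as the identifier keeps the exported record resolvable against the other extensions in the catalog.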
Developer Tools
SDKs, CLI tools, and IDE integrations for building on Pauhu EU infrastructure.
Python SDK Core
Python client library. Type-safe, async-first. pip install pauhu.
TypeScript SDK Core
TypeScript/JavaScript client. Tree-shakeable, ESM + CJS. npm i @pauhu/sdk.
CLI Core
Command-line interface. Search, translate, annotate from your terminal.
VS Code Extension Core
EU regulation lookup, translation, and compliance checking in VS Code.
MCP Server Core
Model Context Protocol server. Connect any MCP-compatible AI to EU data.
OpenAPI Spec Core
OpenAPI 3.1 specification. Auto-generated client stubs for any language.
Postman Collection Core
Pre-built Postman collection with all API endpoints and example requests.
Docker Container Core
Self-hosted container. Air-gapped inference, sovereignty-first deployment.
Jupyter Notebooks Core
Example Jupyter notebooks for data analysis, search, and translation tasks.
GitHub Action Core
CI/CD integration. Compliance checks, terminology validation in pull requests.
Infrastructure
Monitoring, caching, security, and operational tooling for production EU AI systems.
EWMA Health Monitor Core
Exponentially weighted moving average gate health monitoring. 3-sigma alerts.
Rate Limiter Core
Sliding window rate limiting. Per-key, per-endpoint. KV-backed, Redis-portable.
Cache Manager Core
Multi-tier caching. Browser Cache API, KV, R2. Stale-while-revalidate.
Audit Logger Core
IEC 62443 FR6 audit logging. Tamper-evident, append-only, NDJSON to R2.
Error Collector Core
Structured error collection. Dead letter queue for failed operations.
Backup Manager Core
Automated D1/R2 backups. Encrypted, versioned, cross-region replication.
Migration Runner Core
Database migration runner for D1. Version-tracked, rollback-safe.
Health Endpoint Core
Standardized /health endpoint. Uptime, latency, dependency status checks.
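The EWMA Health Monitor entry above pairs an exponentially weighted moving average with 3-sigma alerts. A minimal sketch of that idea follows: track an EWMA of the mean and variance of a metric, and flag a sample that deviates from the mean by more than three standard deviations. The class name, the `alpha` smoothing factor, and the `warmup` sample count are assumptions for this example.

```python
import math
from typing import Optional


class EwmaMonitor:
    """Track an EWMA mean and variance and flag samples beyond N sigmas."""

    def __init__(self, alpha: float = 0.2, sigmas: float = 3.0, warmup: int = 5):
        self.alpha = alpha      # weight given to the newest sample
        self.sigmas = sigmas    # alert threshold in standard deviations
        self.warmup = warmup    # samples to observe before alerting
        self.mean: Optional[float] = None
        self.var = 0.0
        self.count = 0

    def observe(self, x: float) -> bool:
        """Record one sample; return True if it should raise an alert."""
        if self.mean is None:
            self.mean, self.count = x, 1
            return False
        diff = x - self.mean
        # Compare against the statistics from before this sample was added.
        alert = self.count >= self.warmup and abs(diff) > self.sigmas * math.sqrt(self.var)
        # Standard incremental EWMA update of mean and variance.
        incr = self.alpha * diff
        self.mean += incr
        self.var = (1 - self.alpha) * (self.var + diff * incr)
        self.count += 1
        return alert
```

A perfectly constant series leaves the variance at zero, so any deviation after warm-up alerts. A variance floor would soften that if it proves too sensitive.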
Start building
Get an API key and install extensions via CLI or API.