GPU Extensions

Bring your own API keys. Pauhu® routes each request to the provider you select. You pay the provider directly.

How It Works

GPU Extensions are a multi-provider API gateway that adds no markup to provider compute costs. You provide your own API keys for external providers (OpenAI, Anthropic, Google, Replicate, etc.), and Pauhu handles routing, rate limiting, and usage tracking behind a unified API surface. You pay the provider directly for compute; Pauhu charges only for the integration layer, through your subscription tier.

Your Application
      ↓
Pauhu GPU Gateway (Cloudflare Worker, EU jurisdiction)
  • JWT authentication
  • Tier-based rate limiting
  • API key format validation
  • Usage tracking (D1)
      ↓
External Provider (customer-paid)
  • OpenAI, Anthropic, Google, Replicate, etc.
  • You provide the API key
  • You pay the provider directly
      ↓
Response → Your Application

Policy Boundaries

Extensions operate under strict policy boundaries:

Authentication: Every request requires a valid Pauhu JWT token. Anonymous access is not permitted.
Rate limiting: Tier-based limits prevent abuse. Free: 10 generations/month. Starter: 100/month. Professional: 1,000/month. Enterprise: custom.
API key isolation: Your provider API keys are validated for format but never stored by Pauhu. They are forwarded to the provider in a single request and discarded.
EU jurisdiction: The gateway worker runs in EU jurisdiction. Requests are routed to the provider from EU edge nodes. Provider data processing is subject to the provider's own terms.
No data retention: Pauhu does not store prompts, responses, or generated content. Only usage counts and timestamps are recorded for billing.
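
The tier limits listed under rate limiting can also be checked on the client before a request is sent. The following is a minimal sketch, assuming you keep your own count of generations used this month. The names TIER_LIMITS and generations_remaining are hypothetical and not part of the Pauhu API; the gateway remains the authority on enforcement.

```python
from typing import Optional

# Monthly generation limits as listed above. Enterprise limits are custom,
# so None stands in for "no fixed number known here" (an assumption of this sketch).
TIER_LIMITS: dict[str, Optional[int]] = {
    "free": 10,
    "starter": 100,
    "professional": 1_000,
    "enterprise": None,
}


def generations_remaining(tier: str, used_this_month: int) -> Optional[int]:
    """Return the generations left this month, or None if the tier has no fixed limit."""
    if tier not in TIER_LIMITS:
        raise ValueError(f"unknown tier: {tier!r}")
    limit = TIER_LIMITS[tier]
    if limit is None:
        return None
    # Never report a negative remainder if usage already exceeds the limit.
    return max(limit - used_this_month, 0)
```

A caller could skip the request, or prompt for an upgrade, when the function returns 0.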

6 Extension Types

1. Large LLMs (70B+ Parameters)

Endpoint: /gpu/large-llms/chat

Chat completions with large language models that exceed browser-native capacity. Providers: OpenAI, Anthropic, Google Gemini, Together AI, Groq, Replicate.

POST /gpu/large-llms/chat
Authorization: Bearer YOUR_PAUHU_JWT
{
  "model": "gemini-1.5-pro",
  "messages": [{"role": "user", "content": "Translate to Finnish"}],
  "api_key": "YOUR_GOOGLE_API_KEY",
  "provider": "google"
}
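
The request body above can be assembled programmatically. This is a minimal sketch that only mirrors the four fields shown in the example; build_chat_payload is a hypothetical helper name, and any fields beyond these four are not covered here.

```python
import json


def build_chat_payload(provider: str, model: str, api_key: str, user_message: str) -> str:
    """Serialize a chat request body with the fields shown in the example above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "api_key": api_key,  # your own provider key, forwarded by the gateway
        "provider": provider,
    }
    return json.dumps(body)
```

For instance, build_chat_payload("google", "gemini-1.5-pro", "YOUR_GOOGLE_API_KEY", "Translate to Finnish") produces a JSON string equivalent to the body shown above.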

2. Video Generation

Endpoint: /gpu/video-generation/generate-video

Text-to-video and image-to-video generation. Providers: OpenAI (Sora), Replicate, RunwayML, Pika, Fal.ai. Cost: $0.002–$0.20 per second (customer-paid).

3. Image Generation

Endpoint: /gpu/image-generation/generate-image

Text-to-image generation. Providers: OpenAI (DALL-E 3), Replicate, Fal.ai, Together AI. Cost: $0.001–$0.04 per image (customer-paid).

4. Real-time Video

Endpoint: /gpu/realtime-video/process-frame

Real-time object detection and video analysis. Providers: Roboflow (YOLOv8), Ultralytics, AWS Rekognition, Replicate. Cost: $0.00001–$0.12 per frame or per minute, depending on the provider (customer-paid).

5. Audio Generation

Endpoint: /gpu/audio-generation/generate-music

Music generation and text-to-speech. Providers: Suno (music), ElevenLabs (speech), Replicate, Stability AI, Mubert. Cost: $0.02–$0.50 per generation (customer-paid).

6. 3D Generation

Endpoint: /gpu/3d-generation/generate-3d

Text-to-3D and image-to-3D model generation with textures, rigging, and LODs. Providers: Trellis (Microsoft), Meshy, Luma AI, Rodin, Stability AI, Replicate. Cost: $0.05–$2 per model (customer-paid).

POST /gpu/3d-generation/generate-3d
Authorization: Bearer YOUR_PAUHU_JWT
{
  "prompt": "A medieval fantasy knight with armor",
  "api_key": "YOUR_RODIN_API_KEY",
  "provider": "rodin",
  "output_format": "glb",
  "with_textures": true,
  "with_pbr": true
}
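
Since each extension type has its own endpoint, a client may find it convenient to keep the six paths in one lookup table. The sketch below copies the paths from the sections above; the short keys and the endpoint_for helper are hypothetical naming choices for illustration, not identifiers defined by Pauhu.

```python
# Endpoint paths for the six extension types, copied from the sections above.
# The dictionary keys are illustrative labels chosen for this sketch.
ENDPOINTS = {
    "large-llms": "/gpu/large-llms/chat",
    "video-generation": "/gpu/video-generation/generate-video",
    "image-generation": "/gpu/image-generation/generate-image",
    "realtime-video": "/gpu/realtime-video/process-frame",
    "audio-generation": "/gpu/audio-generation/generate-music",
    "3d-generation": "/gpu/3d-generation/generate-3d",
}


def endpoint_for(extension: str) -> str:
    """Return the gateway path for an extension type, failing loudly on typos."""
    if extension not in ENDPOINTS:
        known = ", ".join(sorted(ENDPOINTS))
        raise ValueError(f"unknown extension {extension!r}; expected one of: {known}")
    return ENDPOINTS[extension]
```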

Pricing

Tier	Monthly Fee	Generations / Month
Free	$0	10
Starter	$49	100
Professional	$199	1,000
Enterprise	Custom	Unlimited

These prices cover the Pauhu integration layer only. You pay the external provider separately for compute (LLM tokens, GPU time, etc.) using your own API key.
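
Because the two charges are billed separately, a rough monthly total is the tier fee plus the provider's per-generation price multiplied by the number of generations. The sketch below works through that arithmetic, assuming a flat per-generation provider price; real provider billing may differ. MONTHLY_FEE and estimated_monthly_total are hypothetical names.

```python
from decimal import Decimal

# Monthly Pauhu fees in USD, from the pricing table. Enterprise pricing is
# custom and is therefore left out of this sketch.
MONTHLY_FEE = {
    "free": Decimal("0"),
    "starter": Decimal("49"),
    "professional": Decimal("199"),
}


def estimated_monthly_total(tier: str, generations: int, provider_cost_each: Decimal) -> Decimal:
    """Pauhu tier fee plus the amount paid to the provider for the given generations."""
    if tier not in MONTHLY_FEE:
        raise ValueError(f"no fixed fee known for tier: {tier!r}")
    return MONTHLY_FEE[tier] + provider_cost_each * generations


# Starter tier with 100 images at the upper-bound $0.04 per image quoted above:
# 49 + 100 * 0.04 = 53 USD, of which only 49 USD is billed by Pauhu.
starter_example = estimated_monthly_total("starter", 100, Decimal("0.04"))
```

Decimal is used instead of float so that small per-unit prices add up without rounding surprises.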

Authentication

All GPU extension requests require two credentials:

Pauhu JWT token: sent in the Authorization: Bearer header. It identifies your account and subscription tier.
Provider API key: sent in the request body (api_key field). It is forwarded to the provider for the actual compute call.
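
The two credentials travel in different parts of the request, which the following sketch makes explicit using only the Python standard library. The base URL is a placeholder, the JSON Content-Type header is an assumption based on the JSON bodies shown earlier, and build_gateway_request is a hypothetical helper name. The function builds the request without sending it.

```python
import json
import urllib.request

# Placeholder host for illustration only; substitute your actual gateway URL.
BASE_URL = "https://gateway.example.com"


def build_gateway_request(path: str, jwt_token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying both credentials."""
    if not jwt_token:
        raise ValueError("a Pauhu JWT token is required")
    if not body.get("api_key"):
        raise ValueError("the request body must include the provider api_key")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Credential 1: the Pauhu JWT identifies your account and tier.
            "Authorization": f"Bearer {jwt_token}",
            # Assumed content type, since the documented bodies are JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Keeping construction separate from sending makes it easy to inspect the headers and body first.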
