Object Storage

This document explains how object storage works today, and how to control it across environments.

TL;DR

  • The app has one storage interface (StorageClient) and two implementations:
    • MinioClient for local/dev (S3-compatible MinIO)
    • BunnyClient for deployed environments (Bunny Storage + Bunny CDN)
  • Provider selection is controlled by STORAGE_PROVIDER:
    • local -> MinIO
    • bunny -> Bunny
  • The client is initialised once at API startup and stored as a process-level singleton.

Runtime Lifecycle

  1. API startup runs init_storage() from app.outbound.storage_factory.
  2. init_storage() reads config from app.config.settings.
  3. It creates either MinioClient or BunnyClient.
  4. The client instance is set via _set_storage_client(...).
  5. App code calls get_storage_client() anywhere it needs object URLs or file operations.

If get_storage_client() is called before startup initialisation has run, it raises an assertion error.
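
The lifecycle above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of app.outbound.storage_factory: the stub classes, the `provider` argument (the real init_storage() reads app.config.settings), and the ValueError for unknown providers are all assumptions.

```python
from typing import Optional


class StorageClient:
    """Stand-in for the real interface; present only so the sketch runs."""


class MinioClient(StorageClient):
    """Stub for the local/dev adapter."""


class BunnyClient(StorageClient):
    """Stub for the deployed adapter."""


# Process-level singleton, unset until init_storage() runs.
_storage_client: Optional[StorageClient] = None


def _set_storage_client(client: StorageClient) -> None:
    global _storage_client
    _storage_client = client


def init_storage(provider: str) -> None:
    """Create the client once at startup.

    Assumption: the provider is passed in directly here; the real function
    reads it from app.config.settings.
    """
    if provider == "local":
        _set_storage_client(MinioClient())
    elif provider == "bunny":
        _set_storage_client(BunnyClient())
    else:
        # Assumed behaviour for unknown values; not confirmed by the source.
        raise ValueError(f"Unknown STORAGE_PROVIDER: {provider!r}")


def get_storage_client() -> StorageClient:
    # Fails loudly if called before init_storage() has run.
    assert _storage_client is not None, "Storage client not initialised"
    return _storage_client
```

Because the instance lives in a module-level variable, every call to get_storage_client() within a process returns the same object.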

Interface Contract

StorageClient currently exposes:

  • upload(path, data) -> bool
  • get_url(path) -> str
  • get_public_url(path) -> str
  • delete(path) -> bool
  • download(path) -> (bytes, content_type)

Important behavior differences:

  • MinIO supports download(...) for API media proxy routes.
  • BunnyClient does not implement download(...) and raises NotImplementedError; callers should use the signed CDN URLs returned by get_url(...) instead.
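
As a rough sketch, the contract and the download difference might look like this. The method names and return shapes come from the list above; the argument types, the two stand-in classes, and the resolve_media helper are hypothetical and exist only to show how a caller could cope with an adapter that does not support download(...).

```python
from typing import Dict, Protocol, Tuple, Union


class StorageClient(Protocol):
    """Method names follow the contract above; argument types are assumed."""

    def upload(self, path: str, data: bytes) -> bool: ...
    def get_url(self, path: str) -> str: ...
    def get_public_url(self, path: str) -> str: ...
    def delete(self, path: str) -> bool: ...
    def download(self, path: str) -> Tuple[bytes, str]: ...


class InMemoryClient:
    """Hypothetical stand-in for an adapter that supports download (like MinioClient).

    Only the two methods used below are implemented.
    """

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects

    def get_url(self, path: str) -> str:
        return f"/media/{path}"

    def download(self, path: str) -> Tuple[bytes, str]:
        return self._objects[path], "audio/wav"


class CdnOnlyClient:
    """Hypothetical stand-in for an adapter without download (like BunnyClient)."""

    def get_url(self, path: str) -> str:
        return f"https://cdn.example.invalid/{path}"

    def download(self, path: str) -> Tuple[bytes, str]:
        raise NotImplementedError("use get_url() instead")


def resolve_media(client, path: str) -> Tuple[str, Union[Tuple[bytes, str], str]]:
    """Return ("bytes", (data, content_type)) or ("url", url), depending on the adapter."""
    try:
        return "bytes", client.download(path)
    except NotImplementedError:
        # Fall back to handing the caller a URL to fetch directly.
        return "url", client.get_url(path)
```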

URL Behavior

Local/MinIO mode (STORAGE_PROVIDER=local)

  • get_url(path) returns API-proxied URLs under /media/....
  • Browser requests go through the API media router.
  • The media router validates ownership and existence in the DB, then streams the bytes from storage.
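
For illustration, building an API-proxied URL in local mode could look like the sketch below. The helper name build_media_url, the slash normalisation, and the percent-encoding are assumptions; the source only states that URLs live under /media/... and are built from API_BASE_URL.

```python
from urllib.parse import quote


def build_media_url(api_base_url: str, path: str) -> str:
    """Join API_BASE_URL and an object path into a /media/... URL.

    Assumption: the object path is appended as-is after /media/, with
    unsafe characters percent-encoded and slashes preserved.
    """
    base = api_base_url.rstrip("/")
    return f"{base}/media/{quote(path.lstrip('/'))}"


# Example with a made-up base URL and object path.
print(build_media_url("http://localhost:8000/", "audio/lesson 1.wav"))
```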

Bunny mode (STORAGE_PROVIDER=bunny)

  • get_url(path) returns a signed Bunny CDN URL.
  • The signature is computed from the token auth key (BUNNY_TOKEN_AUTH_KEY), the path, and the expiry time; URLs currently expire after 1 hour.
  • Browser requests go directly to Bunny CDN (no API proxy hop).
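
A signed URL of this kind could be produced along the following lines. This is a sketch only: the hash function (SHA-256), the URL-safe Base64 encoding, and the token/expires query parameter names are assumptions, so check BunnyClient and Bunny's token authentication documentation for the exact format. Only the inputs (key, path, expiry) and the 3600-second lifetime come from this document.

```python
import base64
import hashlib
import time
from typing import Optional

_SIGNED_URL_EXPIRY_SECONDS = 3600  # 1 hour, as noted in this document


def sign_cdn_url(
    cdn_base_url: str,
    token_auth_key: str,
    path: str,
    now: Optional[int] = None,
) -> str:
    """Build a time-limited URL from key + path + expiry.

    Assumptions: SHA-256 digest, URL-safe Base64 without padding, and
    `token` / `expires` query parameters.
    """
    issued_at = int(time.time()) if now is None else now
    expires = issued_at + _SIGNED_URL_EXPIRY_SECONDS
    url_path = "/" + path.lstrip("/")

    # Hash the secret, the path being protected, and the expiry timestamp.
    digest = hashlib.sha256(f"{token_auth_key}{url_path}{expires}".encode("utf-8")).digest()
    token = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

    return f"{cdn_base_url.rstrip('/')}{url_path}?token={token}&expires={expires}"
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to unit test without mocking the clock.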

Configuration

Local/MinIO settings

  • STORAGE_PROVIDER=local
  • STORAGE_ENDPOINT_URL (for Docker dev: http://storage:9000)
  • STORAGE_ACCESS_KEY
  • STORAGE_SECRET_KEY
  • STORAGE_BUCKET
  • API_BASE_URL (used to build /media/... URLs)

On startup, MinioClient.ensure_bucket_exists() is called.

Bunny settings

  • STORAGE_PROVIDER=bunny
  • BUNNY_ZONE
  • BUNNY_API_KEY
  • BUNNY_CDN_BASE_URL
  • BUNNY_TOKEN_AUTH_KEY
  • BUNNY_STORAGE_ENDPOINT

On startup, BunnyClient runs list_directory("") as a connection test.
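
One way to catch misconfiguration early is to check that the settings for the selected provider are present before creating a client. The sketch below is hypothetical: the helper name missing_storage_settings is made up, and treating every listed variable as required is an assumption, since some may have defaults in app.config.settings.

```python
from typing import Dict, List, Mapping

# Setting names are taken from the lists above; whether each one is
# strictly required is an assumption.
REQUIRED_SETTINGS: Dict[str, List[str]] = {
    "local": [
        "STORAGE_ENDPOINT_URL",
        "STORAGE_ACCESS_KEY",
        "STORAGE_SECRET_KEY",
        "STORAGE_BUCKET",
        "API_BASE_URL",
    ],
    "bunny": [
        "BUNNY_ZONE",
        "BUNNY_API_KEY",
        "BUNNY_CDN_BASE_URL",
        "BUNNY_TOKEN_AUTH_KEY",
        "BUNNY_STORAGE_ENDPOINT",
    ],
}


def missing_storage_settings(env: Mapping[str, str]) -> List[str]:
    """Return the names of settings that are unset or empty for the chosen provider."""
    provider = env.get("STORAGE_PROVIDER", "")
    if provider not in REQUIRED_SETTINGS:
        raise ValueError(f"STORAGE_PROVIDER must be 'local' or 'bunny', got {provider!r}")
    return [name for name in REQUIRED_SETTINGS[provider] if not env.get(name)]
```

In practice this could be called with `os.environ` at boot, failing fast if the returned list is non-empty.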

Where Storage Is Used

  • BFF routers call get_storage_client().get_url(...) to expose audio URLs.
  • Media router calls get_storage_client().download(...) to stream files for /media/... routes.

Practically:

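  • In local mode, the /media/... endpoints are the expected way to fetch media and work end to end.
  • In Bunny mode, clients should consume the returned CDN URLs directly; the /media/... routes depend on download(...), which BunnyClient does not implement.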

Operational Notes

  • Upload content type is currently fixed to audio/wav in both adapters.
  • Bunny signed URL expiry is set by _SIGNED_URL_EXPIRY_SECONDS = 3600 (1 hour).
  • The storage client is per-process; each API process initialises its own instance at boot.