Language Learning App — API
Python/FastAPI HTTP API. See docs/architecture.md for design principles.
Stack
- Python 3.13, FastAPI, Uvicorn (ASGI)
- PostgreSQL via SQLAlchemy 2.0 async (asyncpg driver) + Alembic migrations
- Auth: JWT (PyJWT), passwords hashed with passlib pbkdf2_sha256
- External APIs: Anthropic (text generation), Google Gemini (TTS), DeepL (translation), Deepgram (STT), spaCy (NLP)
- Storage: S3-compatible object store via boto3 (app/storage.py)
- Background work: in-process asyncio.Queue worker (app/worker.py)
- Tests: pytest + httpx against a real Docker stack (tests/conftest.py)
Architecture: Domain-Driven + Hexagonal
app/
  domain/
    models/          # Pure dataclasses: NO ORM, NO methods, just data
    services/        # Orchestration logic; take repos as constructor params
  routers/
    api/             # RESTful resource endpoints (/api/...)
    bff/             # Screen-specific read-only endpoints (/bff/...), GET only
  outbound/
    postgres/
      entities/      # SQLAlchemy ORM table definitions
      repositories/  # CRUD; the ONLY place ORM entities are touched
    anthropic/       # AnthropicClient
    gemini/          # GeminiClient
    deepl/           # DeepLClient
    deepgram/        # DeepgramClient
    spacy/           # SpacyClient
  auth.py            # JWT helpers: create_access_token, verify_token, require_admin
  config.py          # Pydantic Settings (reads from .env)
  storage.py         # S3 upload/download helpers
  worker.py          # worker_loop() + enqueue()
Key Patterns
Domain models
Plain dataclasses. IDs stored as str (UUID converted at repo boundary).
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Flashcard:
    id: str          # UUID rendered as a string
    user_id: str
    created_at: datetime
ORM entities
SQLAlchemy 2.0 Mapped[] style. UUIDs as primary keys with default=uuid.uuid4.
class FlashcardEntity(Base):
    __tablename__ = "flashcard"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    user_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="CASCADE"), nullable=False, index=True)
    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), nullable=False, default=lambda: datetime.now(timezone.utc))
Repositories
Protocol + Postgres implementation. Private _to_model(entity) function at module level.
class PostgresFooRepository:
    def __init__(self, db: AsyncSession) -> None:
        self.db = db

    async def create(self, ...) -> Foo:
        entity = FooEntity(...)
        self.db.add(entity)
        await self.db.commit()
        await self.db.refresh(entity)
        return _to_model(entity)
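The Protocol half of this pattern and the module-level `_to_model` converter might fit together roughly as follows. This is a minimal, stdlib-only sketch: `FlashcardRepository`, `InMemoryFlashcardRepository`, and the `_Row` stand-in for an ORM entity are hypothetical names chosen for illustration, not the project's actual code.

```python
import asyncio
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Protocol


@dataclass
class Flashcard:
    """Domain model: IDs are plain strings."""
    id: str
    user_id: str
    created_at: datetime


@dataclass
class _Row:
    """Hypothetical stand-in for an ORM entity; IDs are uuid.UUID."""
    id: uuid.UUID
    user_id: uuid.UUID
    created_at: datetime


def _to_model(entity: _Row) -> Flashcard:
    # UUIDs become str here, at the repository boundary.
    return Flashcard(
        id=str(entity.id),
        user_id=str(entity.user_id),
        created_at=entity.created_at,
    )


class FlashcardRepository(Protocol):
    """Hypothetical port that services depend on."""

    async def create(self, user_id: uuid.UUID) -> Flashcard: ...


class InMemoryFlashcardRepository:
    """Satisfies the Protocol structurally; no inheritance needed."""

    def __init__(self) -> None:
        self.rows: list[_Row] = []

    async def create(self, user_id: uuid.UUID) -> Flashcard:
        row = _Row(id=uuid.uuid4(), user_id=user_id, created_at=datetime.now(timezone.utc))
        self.rows.append(row)
        return _to_model(row)


if __name__ == "__main__":
    card = asyncio.run(InMemoryFlashcardRepository().create(uuid.uuid4()))
    print(type(card.id).__name__)  # str
```

Because the Protocol is structural, a Postgres-backed class and an in-memory one can both be passed wherever a `FlashcardRepository` is expected.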
Services
Constructor receives all needed repos. Domain-language method names.
class FooService:
    def __init__(self, foo_repo: FooRepository, bar_repo: BarRepository) -> None:
        ...

    async def add_foo_for_user(self, user_id: uuid.UUID, ...) -> Foo:
        ...  # raises ValueError on bad input
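One benefit of passing repositories into the constructor is that a service can be exercised without a database. The sketch below illustrates this with an in-memory test double. `FakeFooRepository`, the `name` field, and the blank-name validation rule are assumptions made up for the example.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Foo:
    id: str
    user_id: str
    name: str


class FooRepository(Protocol):
    async def create(self, user_id: uuid.UUID, name: str) -> Foo: ...


class FakeFooRepository:
    """In-memory test double; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self.created: list[Foo] = []

    async def create(self, user_id: uuid.UUID, name: str) -> Foo:
        foo = Foo(id=str(uuid.uuid4()), user_id=str(user_id), name=name)
        self.created.append(foo)
        return foo


class FooService:
    def __init__(self, foo_repo: FooRepository) -> None:
        self.foo_repo = foo_repo

    async def add_foo_for_user(self, user_id: uuid.UUID, name: str) -> Foo:
        # Assumed validation rule: a blank name counts as bad input.
        if not name.strip():
            raise ValueError("name must not be blank")
        return await self.foo_repo.create(user_id, name.strip())


async def main() -> None:
    service = FooService(foo_repo=FakeFooRepository())
    foo = await service.add_foo_for_user(uuid.uuid4(), "  hello ")
    print(foo.name)  # hello
    try:
        await service.add_foo_for_user(uuid.uuid4(), "   ")
    except ValueError as exc:
        print(f"rejected: {exc}")  # rejected: name must not be blank


if __name__ == "__main__":
    asyncio.run(main())
```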
Routers
- Depends(get_db) for the DB session
- Depends(verify_token) for auth; extracts user_id = uuid.UUID(token_data["sub"])
- Depends(require_admin) for admin-only endpoints; use _: dict = Depends(require_admin) if token_data unused
- Private _service(db) factory function instantiates service + repos
- ValueError from service → HTTPException with appropriate status code
- Private _to_response(model) converts domain model to Pydantic response
def _service(db: AsyncSession) -> FooService:
    return FooService(foo_repo=PostgresFooRepository(db))


@router.post("", response_model=FooResponse, status_code=201)
async def create_foo(
    body: CreateFooRequest,
    db: AsyncSession = Depends(get_db),
    token_data: dict = Depends(verify_token),
) -> FooResponse:
    user_id = uuid.UUID(token_data["sub"])
    try:
        foo = await _service(db).add_foo_for_user(user_id, ...)
    except ValueError as exc:
        raise HTTPException(status_code=404, detail=str(exc))
    return _to_response(foo)
Background work
Enqueue callables into the in-process worker queue. The worker runs one task at a time.
await worker.enqueue(lambda: some_service.do_work(db, entity_id))
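The one-task-at-a-time behaviour can be pictured as a single consumer draining an asyncio.Queue. The sketch below is an assumption about how such a worker could look, not the contents of app/worker.py: it wraps `enqueue()` and `worker_loop()` in a hypothetical `Worker` class so the queue is created inside the running event loop, and the error handling is illustrative.

```python
import asyncio
from typing import Awaitable, Callable

Task = Callable[[], Awaitable[None]]


class Worker:
    """Hypothetical wrapper around a single in-process queue."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue[Task] = asyncio.Queue()

    async def enqueue(self, task: Task) -> None:
        """Put a zero-argument async callable on the queue."""
        await self._queue.put(task)

    async def join(self) -> None:
        """Wait until every queued task has finished."""
        await self._queue.join()

    async def worker_loop(self) -> None:
        """Run queued tasks strictly one after another."""
        while True:
            task = await self._queue.get()
            try:
                await task()
            except Exception as exc:  # keep the loop alive if one task fails
                print(f"task failed: {exc!r}")
            finally:
                self._queue.task_done()


async def main() -> list[str]:
    worker = Worker()
    results: list[str] = []

    async def job(name: str) -> None:
        await asyncio.sleep(0)
        results.append(name)

    loop_task = asyncio.create_task(worker.worker_loop())
    await worker.enqueue(lambda: job("first"))
    await worker.enqueue(lambda: job("second"))
    await worker.join()   # both jobs have run, in FIFO order
    loop_task.cancel()    # stop the consumer for this demo
    return results


if __name__ == "__main__":
    print(asyncio.run(main()))  # ['first', 'second']
```

Since the queue lives in process memory, anything still queued is lost when the process exits. That is the trade-off of an in-process worker compared with an external broker.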
SummariseService.run() is the canonical example: LLM → translate → TTS → S3 upload, all in one async method, called from a worker task. Use _anthropic_with_backoff() (defined in summarise_service.py) for retryable Anthropic calls.
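For readers unfamiliar with the retry idea behind a helper like _anthropic_with_backoff(), here is a generic exponential-backoff sketch. It is not that function's implementation: `with_backoff`, `TransientError`, and every default value are hypothetical, and which errors the real helper treats as retryable is not stated here.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. rate limits)."""


async def with_backoff(
    call: Callable[[], Awaitable[T]],
    *,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    """Retry `call` on TransientError, doubling the delay cap each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter spreads out retries from concurrent callers.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
    raise RuntimeError("max_attempts must be at least 1")


async def main() -> str:
    attempts = 0

    async def flaky() -> str:
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise TransientError("try again")
        return f"ok after {attempts} attempts"

    return await with_backoff(flaky, base_delay=0.01)


if __name__ == "__main__":
    print(asyncio.run(main()))  # ok after 3 attempts
```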
External clients
All use asyncio.to_thread(_call) to wrap synchronous SDK calls.
- AnthropicClient.new(api_key): text generation; model hardcoded as "claude-sonnet-4-6"
- GeminiClient(api_key): TTS via generate_audio(text, voice); get_voice_by_language(lang) maps ISO codes to voice names
- DeepLClient(api_key): translate(text, to_language); check can_translate_to(lang) first
- GeminiClient uses gemini-2.5-flash-preview-tts model; returns PCM converted to WAV via pcm_to_wav()
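The asyncio.to_thread(_call) wrapping and the PCM-to-WAV step might look roughly like the sketch below. `FakeSyncSdk` and `TtsClient` are hypothetical stand-ins rather than the real clients, and the WAV parameters (24 kHz, mono, 16-bit) are assumptions about the audio format, not values taken from the project.

```python
import asyncio
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM bytes in a WAV container (format defaults are assumptions)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


class FakeSyncSdk:
    """Hypothetical stand-in for a blocking vendor SDK."""

    def synthesize(self, text: str) -> bytes:
        # Pretend this blocks on network I/O; returns one silent 16-bit frame per character.
        return b"\x00\x00" * len(text)


class TtsClient:
    """Hypothetical async wrapper around the blocking SDK."""

    def __init__(self, sdk: FakeSyncSdk) -> None:
        self._sdk = sdk

    async def generate_audio(self, text: str) -> bytes:
        def _call() -> bytes:
            return self._sdk.synthesize(text)

        # Run the blocking call in a worker thread so the event loop stays free.
        pcm = await asyncio.to_thread(_call)
        return pcm_to_wav(pcm)


if __name__ == "__main__":
    audio = asyncio.run(TtsClient(FakeSyncSdk()).generate_audio("hello"))
    print(audio[:4])  # b'RIFF'
```

Defining `_call` as a closure keeps the arguments next to the SDK call, and asyncio.to_thread hands the whole thing to the default thread pool.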
Migrations
Naming: YYYYMMDD_NNNN_description.py.
Always include downgrade() that reverses upgrade() in reverse order. Use postgresql.JSONB() for arrays/objects, sa.func.now() for server-side timestamp defaults.
Tests
Session-scoped docker_stack fixture brings up docker-compose.test.yml (project langlearn-test, API on port 18000). Each test gets a fresh httpx.Client. Register + login to get a token; set client.headers["Authorization"] = f"Bearer {token}".
Route Registration
Add new routers in:
- app/routers/api/main.py: api_router.include_router(...)
- app/routers/bff/main.py: bff_router.include_router(...)
Config
All settings in app/config.py via pydantic_settings.BaseSettings (reads .env). Access via from .config import settings. Required keys: database_url, jwt_secret, anthropic_api_key, deepl_api_key, deepgram_api_key, gemini_api_key, storage_endpoint_url, storage_access_key, storage_secret_key.
Existing Domain Areas
| Area | Models | Router prefix |
|---|---|---|
| Auth | Account | /api/auth |
| Vocabulary | LearnableWordBankEntry, UserLanguagePair | /api/vocab |
| Flashcards | Flashcard, FlashcardEvent | /api/flashcards |
| Packs | WordBankPack, WordBankPackEntry | /api/packs, /api/admin/packs |
| Articles | TranslatedArticle, SummariseJob | /api/generation, /api/jobs |
| Adventures | (being built; see docs/technical-doc-choose-your-own-adventure.md) | /api/adventures |