# Language Learning App: API

Python/FastAPI HTTP API. See `docs/architecture.md` for design principles.

## Stack

- **Python 3.13**, FastAPI, Uvicorn (ASGI)
- **PostgreSQL** via SQLAlchemy 2.0 async (asyncpg driver) + Alembic migrations
- **Auth**: JWT (PyJWT), passwords hashed with passlib pbkdf2_sha256
- **External APIs**: Anthropic (text generation), Google Gemini (TTS), DeepL (translation), Deepgram (STT), spaCy (NLP)
- **Storage**: S3-compatible object store via boto3 (`app/storage.py`)
- **Background work**: in-process `asyncio.Queue` worker (`app/worker.py`)
- **Tests**: pytest + httpx against a real Docker stack (`tests/conftest.py`)

## Architecture: Domain-Driven + Hexagonal

```
app/
  domain/
    models/          # Pure dataclasses: NO ORM, NO methods, just data
    services/        # Orchestration logic; take repos as constructor params
  routers/
    api/             # RESTful resource endpoints (/api/...)
    bff/             # Screen-specific read-only endpoints (/bff/...), GET only
  outbound/
    postgres/
      entities/      # SQLAlchemy ORM table definitions
      repositories/  # CRUD; the ONLY place ORM entities are touched
    anthropic/       # AnthropicClient
    gemini/          # GeminiClient
    deepl/           # DeepLClient
    deepgram/        # DeepgramClient
    spacy/           # SpacyClient
  auth.py            # JWT helpers: create_access_token, verify_token, require_admin
  config.py          # Pydantic Settings (reads from .env)
  storage.py         # S3 upload/download helpers
  worker.py          # worker_loop() + enqueue()
```

## Key Patterns

### Domain models

Plain dataclasses. IDs stored as `str` (UUID converted at repo boundary).

```python
@dataclass
class Flashcard:
    id: str
    user_id: str
    created_at: datetime
```

### ORM entities

SQLAlchemy 2.0 `Mapped[]` style. UUIDs as primary keys with `default=uuid.uuid4`.
```python
class FlashcardEntity(Base):
    __tablename__ = "flashcard"

    id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    user_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), ForeignKey("users.id", ondelete="CASCADE"), nullable=False, index=True
    )
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), nullable=False, default=lambda: datetime.now(timezone.utc)
    )
```

### Repositories

Protocol + Postgres implementation. Private `_to_model(entity)` function at module level.

```python
class PostgresFooRepository:
    def __init__(self, db: AsyncSession) -> None:
        self.db = db

    async def create(self, ...) -> Foo:
        entity = FooEntity(...)
        self.db.add(entity)
        await self.db.commit()
        await self.db.refresh(entity)
        return _to_model(entity)
```

### Services

Constructor receives all needed repos. Domain-language method names.

```python
class FooService:
    def __init__(self, foo_repo: FooRepository, bar_repo: BarRepository) -> None: ...

    async def add_foo_for_user(self, user_id: uuid.UUID, ...) -> Foo: ...
    # raises ValueError on bad input
```

### Routers

- `Depends(get_db)` for the DB session
- `Depends(verify_token)` for auth; extracts `user_id = uuid.UUID(token_data["sub"])`
- `Depends(require_admin)` for admin-only endpoints; use `_: dict = Depends(require_admin)` if `token_data` is unused
- Private `_service(db)` factory function instantiates the service and its repos
- `ValueError` from a service → `HTTPException` with an appropriate status code
- Private `_to_response(model)` converts a domain model to a Pydantic response

```python
def _service(db: AsyncSession) -> FooService:
    # Pass every repo the service constructor expects.
    return FooService(foo_repo=PostgresFooRepository(db), bar_repo=PostgresBarRepository(db))


@router.post("", response_model=FooResponse, status_code=201)
async def create_foo(
    body: CreateFooRequest,
    db: AsyncSession = Depends(get_db),
    token_data: dict = Depends(verify_token),
) -> FooResponse:
    user_id = uuid.UUID(token_data["sub"])
    try:
        foo = await _service(db).add_foo_for_user(user_id, ...)
    except ValueError as exc:
        # Pick the status code that matches the failure; 404 is only an example.
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    return _to_response(foo)
```

### Background work

Enqueue callables into the in-process worker queue. The worker runs one task at a time.

```python
await worker.enqueue(lambda: some_service.do_work(db, entity_id))
```

`SummariseService.run()` is the canonical example: LLM → translate → TTS → S3 upload, all in one async method, called from a worker task. Use `_anthropic_with_backoff()` (defined in `summarise_service.py`) for retryable Anthropic calls.

### External clients

All use `asyncio.to_thread(_call)` to wrap synchronous SDK calls.

- `AnthropicClient.new(api_key)`: text generation; model hardcoded as `"claude-sonnet-4-6"`
- `GeminiClient(api_key)`: TTS via `generate_audio(text, voice)` using the `gemini-2.5-flash-preview-tts` model; returns PCM converted to WAV via `pcm_to_wav()`; `get_voice_by_language(lang)` maps ISO codes to voice names
- `DeepLClient(api_key)`: `translate(text, to_language)`; check `can_translate_to(lang)` first

### Migrations

Naming: `YYYYMMDD_NNNN_description.py`. Always include a `downgrade()` that undoes the `upgrade()` steps in reverse order. Use `postgresql.JSONB()` for arrays/objects and `sa.func.now()` for server-side timestamp defaults.

### Tests

Session-scoped `docker_stack` fixture brings up `docker-compose.test.yml` (project `langlearn-test`, API on port 18000). Each test gets a fresh `httpx.Client`. Register + login to get a token; set `client.headers["Authorization"] = f"Bearer {token}"`.

## Route Registration

Add new routers in:

- `app/routers/api/main.py`: `api_router.include_router(...)`
- `app/routers/bff/main.py`: `bff_router.include_router(...)`

## Config

All settings in `app/config.py` via `pydantic_settings.BaseSettings` (reads `.env`). Access via `from .config import settings`.

Required keys: `database_url`, `jwt_secret`, `anthropic_api_key`, `deepl_api_key`, `deepgram_api_key`, `gemini_api_key`, `storage_endpoint_url`, `storage_access_key`, `storage_secret_key`.

## Existing Domain Areas

| Area | Models | Router prefix |
|------|--------|---------------|
| Auth | `Account` | `/api/auth` |
| Vocabulary | `LearnableWordBankEntry`, `UserLanguagePair` | `/api/vocab` |
| Flashcards | `Flashcard`, `FlashcardEvent` | `/api/flashcards` |
| Packs | `WordBankPack`, `WordBankPackEntry` | `/api/packs`, `/api/admin/packs` |
| Articles | `TranslatedArticle`, `SummariseJob` | `/api/generation`, `/api/jobs` |
| Adventures | (being built; see `docs/technical-doc-choose-your-own-adventure.md`) | `/api/adventures` |
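
When a new area is added to the table above, the Key Patterns are meant to combine: a plain dataclass model, a repository protocol, and a service that receives its repos through the constructor and raises `ValueError` on bad input. The sketch below is one possible illustration using only the standard library. `Note`, `NoteRepository`, `InMemoryNoteRepository`, and `NoteService` are hypothetical names that do not exist in this codebase, and the in-memory repository merely stands in for a Postgres implementation.

```python
import asyncio
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Protocol


@dataclass
class Note:
    """Hypothetical domain model: plain data, IDs stored as str."""
    id: str
    user_id: str
    text: str
    created_at: datetime


class NoteRepository(Protocol):
    """Port the service depends on; implementations live in outbound/."""

    async def create(self, user_id: uuid.UUID, text: str) -> Note: ...


def _to_model(row: dict) -> Note:
    # UUIDs are converted to str at the repository boundary.
    return Note(
        id=str(row["id"]),
        user_id=str(row["user_id"]),
        text=row["text"],
        created_at=row["created_at"],
    )


class InMemoryNoteRepository:
    """Assumed stand-in for a Postgres repository; dict rows play the role of ORM entities."""

    def __init__(self) -> None:
        self._rows: dict[uuid.UUID, dict] = {}

    async def create(self, user_id: uuid.UUID, text: str) -> Note:
        row = {
            "id": uuid.uuid4(),
            "user_id": user_id,
            "text": text,
            "created_at": datetime.now(timezone.utc),
        }
        self._rows[row["id"]] = row
        return _to_model(row)


class NoteService:
    """Orchestration only: repos arrive through the constructor."""

    def __init__(self, note_repo: NoteRepository) -> None:
        self._note_repo = note_repo

    async def add_note_for_user(self, user_id: uuid.UUID, text: str) -> Note:
        cleaned = text.strip()
        if not cleaned:
            # Routers translate ValueError into an HTTPException.
            raise ValueError("note text must not be empty")
        return await self._note_repo.create(user_id, cleaned)


async def demo() -> Note:
    service = NoteService(note_repo=InMemoryNoteRepository())
    return await service.add_note_for_user(uuid.uuid4(), "  hola  ")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the service only sees the `NoteRepository` protocol, swapping the in-memory stand-in for a Postgres-backed class would not require changes to the service itself.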