From fb4ab69295f01a27e058bdc3c2ee154c01417074 Mon Sep 17 00:00:00 2001
From: Thomas <thomas@thomaswilson.xyz>
Date: Fri, 24 Apr 2026 07:54:56 +0100
Subject: [PATCH] docs: [api] Add flashcard-related models

---
 api/docs/design-doc-flashcard.md    | 102 ++++
 api/docs/design-doc-pricing.md      |   9 +
 api/docs/technical-doc-flashcard.md | 758 ++++++++++++++++++++++++++++
 3 files changed, 869 insertions(+)
 create mode 100644 api/docs/design-doc-flashcard.md
 create mode 100644 api/docs/design-doc-pricing.md
 create mode 100644 api/docs/technical-doc-flashcard.md

diff --git a/api/docs/design-doc-flashcard.md b/api/docs/design-doc-flashcard.md
new file mode 100644
index 0000000..93b9ff8
--- /dev/null
+++ b/api/docs/design-doc-flashcard.md
@@ -0,0 +1,102 @@
+# Design Doc: Flashcards
+
+This is a technical design document for Flashcards in the context of Language Learning App.
+
+You may wish to read both [architecture.md](./architecture.md) and [domain.md](./domain.md) for an understanding of the components of this API codebase and the wider problem domain, respectively.
+
+## The Problem
+
+Flashcards, in the physical world, are a common learning aid.  Each one is a two-sided piece of paper with information on both sides.  The learner views one side and attempts to recall what is on the other.  They are a tool to improve a learner's ability to recall specific information (e.g. specific words), and therefore a tool to increase fluency.
+
+Language Learning App relies on recognition and recall to improve learner fluency in a language.  To do so, learners must be presented with some kind of cue for which they must generate some kind of response.  It is important that we don't rely on learners' perceived fluency (e.g. present them with a cue, then show them the response and ask them how right/confident they were in retrospect), because language learners will consistently over-estimate their fluency.
+
+Providing opportunity for "meaning-focused input" (as researcher Paul Nation called it in their *Four Strands*) is one of the main focuses of Language Learning App.  That is - providing learners with realistic text to read and listen to, and having them engage in the sustained act of making sense of it.
+
+Flashcards are not meaning-focused input, because of how isolated they are.  Their benefit is in helping learners gain familiarity in isolated parts of language, e.g. specific words.  This comes in two flavours: recognition and generation.  Recognition of words can be seen, broadly, as "I see the word 'chat', I know that means 'cat'".  Generation is about producing the word for "cat" when it is needed. 
+
+Pedagogically, Flashcards allow us to model the recognition and generation steps at an artificially granular level: a single word looking for a single translation.  E.g. showing a French learner "banque" and expecting them to generate "bank".
+
+This is the central problem that Flashcards solve: helping learners become familiar with, and able to recall/generate, specific language items (e.g. words).  This can include:
+
+* a foreign language word they wish to become familiar with, 
+* the conjugations of a specific verb
+
+However there are numerous concerns, or limitations, with the simple idea of a two-sided piece of paper.  The first is that, philosophically, language learning is not about a one-to-one translation between words.  The second is that homonyms (i.e. the same literal words) can have multiple possible meanings or translations.  
+
+Because language is not a one-to-one translation activity, we must accept that there are multiple possible correct (or semi-correct) answers to a single cue.  For example, the expected response to a cue (e.g. "cold") may have synonyms (e.g. "chilly").  One of these responses might be the ideal target, but another would still let the learner convey their meaning.
+
+From a product positioning perspective, Language Learning App wishes to be more than "just" a flashcard app (e.g. Anki).  Firstly because Anki is an exceptional tool that already exists, but also because Language Learning App's central philosophy of teaching and language learning is that it is more valuable to learn a language through exposure to longer-form content than *just* a single word on a flashcard.  Specifically it centres written articles and audio podcasts (both AI-generated) - and having the learner identify words from that content that they wish to retain or be familiar with.
+
+This becomes apparent as Flashcards progress from being a simple "cue" and "response" to having contextual text (i.e. natural language sentences) with gaps in it.
+
+## Not in scope
+
+There are a number of problems not in scope in the Flashcard model. 
+
+The biggest category concerns the system that we will need to build to present users with words in an algorithmic, effective way that enhances learning.  We should bear these requirements in mind, but they are out of scope in the initial design of a Flashcard.
+
+There will need to be a system which grades "familiarity" or "fluency" against a specific Flashcard, or against a specific wordbank item/sense (and also lemma).  
+
+There will also need to be a system of "events" which records when a learner was last shown a specific flashcard, what they responded to it, and how correct their response was.
+
+## Example Flashcards
+
+The following examples all assume the learner is an English speaker learning French.
+
+A simple one-way Flashcard would look like:
+
+```txt
+Cue: être
+Answers: [to be, be]
+```
+
+In the above, the learner can respond "be" or "to be" and be considered correct.
+
+A simple two-way Flashcard would look like:
+
+```txt
+Cue: to have (v. inf.)
+Answer: [avoir]
+---
+Cue: avoir
+Answer: [to have, have]
+```
+
+A flashcard with "contextual text" may look like the following:
+
+```txt
+Contextual text: Il souhaite d'_____ un chat
+Cue: (He wishes [to have] a cat, He wishes to have a cat)
+Answer: [avoir]
+```
+
+Notice in the above that there are two forms of cue: one which [highlights] the word to fill in, and one which does not.  This should be a configurable property, and the presentation of it should run both ways.
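+
+As a rough sketch of how that configuration could work, the highlighted form of the cue can be stored and the plain form derived from it.  The `render_cue` function and its `highlight` flag are illustrative names only, not part of any existing API:
+
+```python
+import re
+
+
+def render_cue(cue: str, highlight: bool = True) -> str:
+    """Return the cue with or without its [bracketed] highlighting."""
+    if highlight:
+        return cue
+    # Drop the square brackets but keep the words inside them.
+    return re.sub(r"\[([^\]]*)\]", r"\1", cue)
+
+
+print(render_cue("He wishes [to have] a cat", highlight=False))
+```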
+
+Let's consider a Flashcard with contextual text where there are multiple words to fill:
+
+```txt
+Contextual text: Il _______ _'_____ un chat
+Cue: (He [wishes] [to have] a cat)
+Answers: [[souhaite], [d], [avoir]]
+```
+
+And let's consider a Flashcard that encourages the learner to "guess" a word from context:
+
+```txt
+Contextual text: Il [souhaite] d'avoir un chat
+Answers: [wishes]
+```
+
+In the above, the learning activity is having a learner "guess" a word, from context, which they may not have seen before.  Such activities focus on meaning-making from language.
+
+## Requirements 
+
+* Both the cue and the response(s) for a Flashcard need to be linked, in some way, to their underlying wordbank entries.  
+  * There can be multiple Flashcards against the same wordbank entry and sense (e.g. "cat <-> chat"), and Flashcards against the same lemma (e.g. "lent,lente <-> slow").
+  * There should be special attention paid to the modelling of verb conjugations.  Flashcards where the response is a verb's conjugation (e.g. 3rd person present tense of être) should be modelled as such
+* It should be possible for Flashcards to go both one-way (e.g. En -> Fr) and two ways (e.g. En <-> Fr).  
+* There need to be several options for "grading" a response to a Flashcard, and the option may be configured by the learner.  Grading may be binary (right/wrong), categorical (e.g. fully correct), or a percentage of correctness (e.g. 75% word similarity).
+* Audible components (likely AI-generated text-to-speech) for both contextual text _and_ foreign language answers are an essential component.  There are a variety of scenarios in which these could play.  There could be Flashcards which are _just_ audio, or we could play the audio of the correct answer after the learner has answered, so that they know how the word sounds out loud.
+* There needs to be a good interface for making Flashcards.  They can be made in a stand-alone "studio" (i.e. a CMS-like interface) where they are constructed, but they can also be taken from the natural language content (e.g. articles) within Language Learning App.  The latter would allow us to extract the contextual text from the original sentence.
+  * For admins, there should be a more powerful Flashcard Studio which uses generative AI to suggest contextual texts and examples for words, so that the Admin user can specify the wordform (and find the right sense) and then have a pre-populated list of possible Flashcards with contextual texts.
+  *
diff --git a/api/docs/design-doc-pricing.md b/api/docs/design-doc-pricing.md
new file mode 100644
index 0000000..76ea919
--- /dev/null
+++ b/api/docs/design-doc-pricing.md
@@ -0,0 +1,9 @@
+# Pricing 
+
+## Ideas
+
+- Dynamic subscription-based pricing where the floor is lower if the learner commits to more learning activity.  The price paid at the end of the month decreases for each day on which the learner does a certain amount of activity.
+  - Learner should be able to set the type of activity
+  - E.g. $5/mo floor, starting at $25/mo (20 learning days a mo)
+  - E.g. $10/mo floor, starting at $20/mo (10 learning days a mo)
+- Usage-based pricing where the learner commits an amount (e.g. $15/mo), and can
diff --git a/api/docs/technical-doc-flashcard.md b/api/docs/technical-doc-flashcard.md
new file mode 100644
index 0000000..3b476a7
--- /dev/null
+++ b/api/docs/technical-doc-flashcard.md
@@ -0,0 +1,758 @@
+# Technical Document: Flashcard
+
+This is the technical design document for building Flashcards.  See [design-doc-flashcard](./design-doc-flashcard.md) for the product requirements and domain analysis.
+
+## Summary
+
+The Flashcard domain implements a spaced-repetition learning system that supports contextual, multi-modal flashcards with bidirectional study patterns.  Unlike simple word-pair flashcards, this system integrates deeply with the vocabulary bank and dictionary to support contextual text, multiple correct answers per gap, verb conjugations, and audio components.
+
+## Current State Analysis
+
+The existing flashcard implementation provides basic functionality:
+- Simple bidirectional flashcards (`target_to_source`, `source_to_target`)
+- Basic event tracking (`shown`, `answered`, `skipped`)
+- Integration with vocabulary bank entries
+
+However, the design document requirements necessitate significant enhancements to support:
+- Contextual text with gap-fill exercises, including multiple simultaneous gaps
+- Multiple correct answers per gap, independently mapped
+- Bidirectional study as two distinct presentation rows
+- Full wordbank linkage on both cue and answer sides
+- Verb conjugation modelling
+- Audio (TTS) integration
+- AI-assisted flashcard generation from templates
+- Flashcard creation from article source sentences
+
+## Domain Entities
+
+### Core Flashcard Entity (Enhanced)
+
+```python
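+from __future__ import annotations  # Lets Flashcard reference GapPosition before it is defined
+
+from dataclasses import dataclass
+from datetime import datetime
+
+
+@dataclass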
+class Flashcard:
+    id: str
+    user_id: str
+
+    # Wordbank linkage — both sides must be anchored
+    bank_entry_id: str           # The vocabulary bank entry this card belongs to
+    prompt_sense_id: str | None  # Dictionary sense being tested on the prompt side
+    prompt_lemma_id: str | None  # Dictionary lemma for the prompt side
+
+    source_lang: str
+    target_lang: str
+
+    # Core content
+    # answer_text is removed; accepted_answers is the single canonical list
+    prompt_text: str
+    accepted_answers: list[str]  # All acceptable answer variations; empty only for gap_fill cards
+
+    # Contextual content
+    contextual_text: str | None
+    contextual_text_language: str | None
+    gap_positions: list[GapPosition] | None  # For fill-in-the-blank; each gap carries its own accepted_answers
+
+    # Card configuration
+    card_direction: str   # "target_to_source" | "source_to_target"
+                          # Bidirectional = two separate Flashcard rows, one per direction
+    card_type: str        # "simple" | "contextual" | "gap_fill" | "conjugation"
+    prompt_modality: str  # "text" | "audio" | "text_and_audio"
+
+    # Grading configuration
+    grading_mode: str  # "binary" | "fuzzy"
+                       # "multiple_choice" is deferred: distractors are not yet modelled
+
+    # Audio support
+    prompt_audio_url: str | None
+    answer_audio_url: str | None
+    contextual_audio_url: str | None
+
+    # Template relationship (null for cards extracted from articles)
+    template_id: str | None
+
+    # Article source (null for template-generated cards)
+    source_article_id: str | None
+    source_sentence_index: int | None  # Which sentence in the article was used as contextual_text
+
+    created_at: datetime
+    updated_at: datetime
+
+
+@dataclass
+class GapPosition:
+    """Represents a single gap in contextual text for fill-in-the-blank exercises.
+
+    Each GapPosition carries its own accepted_answers, enabling independent
+    grading of each gap in multi-gap cards.
+
+    Indices are assumed to be offsets into the ungapped sentence
+    ("Il souhaite d'avoir un chat"), with an exclusive end_index.
+
+    Example for "Il _______ _'_____ un chat" (He wishes to have a cat):
+        GapPosition(start_index=3, end_index=11, target_word="souhaite",
+                    accepted_answers=["souhaite"], ...)
+        GapPosition(start_index=14, end_index=19, target_word="avoir",
+                    accepted_answers=["avoir", "d'avoir"], ...)
+    """
+    start_index: int
+    end_index: int
+    target_word: str
+    accepted_answers: list[str]   # Answers specific to this gap
+    target_lemma_id: str | None
+    target_sense_id: str | None
+    bank_entry_id: str | None     # Wordbank linkage for this specific gap's word
+```
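+
+The sketch below shows one way the `start_index` and `end_index` values could be computed from an article sentence.  It assumes that indices are offsets into the ungapped sentence and that `end_index` is exclusive (so that `sentence[start:end]` yields the target word); `find_gap_span` and `blank_out` are hypothetical helper names, not existing functions:
+
+```python
+def find_gap_span(sentence: str, target_word: str) -> tuple[int, int]:
+    """Return (start_index, end_index) of the first match; end is exclusive."""
+    start = sentence.find(target_word)
+    if start == -1:
+        raise ValueError(f"{target_word!r} not found in sentence")
+    return start, start + len(target_word)
+
+
+def blank_out(sentence: str, spans: list[tuple[int, int]]) -> str:
+    """Replace every character inside each span with an underscore."""
+    chars = list(sentence)
+    for start, end in spans:
+        chars[start:end] = "_" * (end - start)
+    return "".join(chars)
+
+
+sentence = "Il souhaite d'avoir un chat"
+spans = [find_gap_span(sentence, word) for word in ("souhaite", "avoir")]
+gapped = blank_out(sentence, spans)
+print(spans, gapped)
+```
+
+Matching on the first occurrence is a simplification: a sentence that repeats the target word would need the caller to say which occurrence is meant.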
+
+### Bidirectionality
+
+A "bidirectional" flashcard is **not** a single entity with a `bidirectional` direction value.  It is represented as two separate `Flashcard` rows — one `target_to_source` and one `source_to_target` — sharing the same `bank_entry_id`.  This keeps each row's `prompt_sense_id`, `accepted_answers`, and grading configuration independently addressable, and avoids ambiguity in event recording.
+
+When generating flashcards for a vocabulary entry, the service creates both rows if bidirectional study is desired.
+
+```python
+# Example: two rows for "banque" ↔ "bank"
+Flashcard(
+    card_direction="target_to_source",
+    prompt_text="banque",
+    prompt_sense_id="dict-sense-banque-finance",
+    accepted_answers=["bank", "financial institution"],
+    ...
+)
+
+Flashcard(
+    card_direction="source_to_target",
+    prompt_text="bank (n, finance)",
+    prompt_sense_id="dict-sense-bank-finance-en",
+    accepted_answers=["banque", "la banque"],
+    ...
+)
+```
+
+### Multi-Gap Cards
+
+For cards with multiple simultaneous gaps, each `GapPosition` in the list carries its own `accepted_answers`.  The top-level `Flashcard.accepted_answers` field is not used for gap-fill cards; grading iterates `gap_positions` instead.
+
+```python
+# "Il _______ _'_____ un chat"
+# Cue: "(He [wishes] [to have] a cat)"
+Flashcard(
+    card_type="gap_fill",
+    contextual_text="Il _______ _'_____ un chat",
+    prompt_text="(He [wishes] [to have] a cat)",
+    accepted_answers=[],  # Unused for gap_fill; answers live on gap_positions
+    gap_positions=[
+        GapPosition(
+            start_index=3, end_index=11,  # End exclusive: "Il souhaite d'avoir un chat"[3:11] == "souhaite"
+            target_word="souhaite",
+            accepted_answers=["souhaite"],
+            target_lemma_id="lemma-souhaiter",
+            target_sense_id="sense-souhaiter-wish",
+            bank_entry_id="entry-souhaiter-user-123",
+        ),
+        GapPosition(
+            start_index=14, end_index=19,
+            target_word="avoir",
+            accepted_answers=["avoir", "d'avoir"],
+            target_lemma_id="lemma-avoir",
+            target_sense_id="sense-avoir-have",
+            bank_entry_id="entry-avoir-user-123",
+        ),
+    ],
+    ...
+)
+```
+
+### Flashcard Template Entity
+
+Templates define parameters for generating flashcards from dictionary senses.  They are used for AI-assisted generation only; cards extracted from articles do not require a template.
+
+```python
+@dataclass
+class FlashcardTemplate:
+    id: str
+    name: str
+    description: str
+    language_pair: str  # e.g., "en-fr"
+
+    card_type: str  # "simple" | "contextual" | "gap_fill" | "conjugation"
+
+    # AI generation settings
+    use_ai_for_context: bool
+    ai_context_prompt: str | None  # Supports {headword}, {gloss}, {proficiency} placeholders
+
+    # Answer generation settings
+    include_gender_hints: bool
+    include_conjugation_hints: bool
+    max_accepted_answers: int
+
+    created_at: datetime
+```
+
+### AI Generation Cache Entity
+
+```python
+@dataclass
+class AIGeneratedContent:
+    """Caches AI-generated contextual sentences for dictionary senses."""
+    id: str
+    sense_id: str
+    language: str
+
+    contextual_sentences: list[str]
+    difficulty_level: str  # "A1" | "A2" | "B1" | "B2" | "C1" | "C2"
+
+    ai_model_used: str     # Read from configuration, never hardcoded
+    generated_at: datetime
+    usage_count: int
+```
+
+### Enhanced FlashcardEvent
+
+```python
+@dataclass
+class FlashcardEvent:
+    id: str
+    flashcard_id: str
+    user_id: str
+    event_type: str  # "shown" | "answered" | "skipped" | "audio_played"
+
+    user_response: str | None
+    response_time_ms: int | None
+
+    # For gap_fill cards, per-gap results are stored here
+    gap_results: list[GapGradingResult] | None
+
+    correctness_score: float | None   # 0.0–1.0; mean of gap scores for multi-gap
+    accepted_answer_matched: str | None
+
+    study_session_id: str | None
+    card_presentation_order: int | None
+
+    audio_played: bool
+    audio_duration_played_ms: int | None
+
+    created_at: datetime
+
+
+@dataclass
+class GapGradingResult:
+    gap_index: int
+    user_response: str
+    is_correct: bool
+    correctness_score: float
+    matched_answer: str | None
+```
+
+### Conjugation Support Entity
+
+```python
+@dataclass
+class VerbConjugationCard:
+    id: str
+    base_flashcard_id: str
+
+    verb_lemma_id: str
+    tense: str    # "present" | "past" | "future" | "conditional" etc.
+    person: str   # "1s" | "2s" | "3s" | "1p" | "2p" | "3p"
+    mood: str | None  # "indicative" | "subjunctive" | "imperative"
+
+    conjugated_form: str
+    prompt_template: str  # e.g., "Conjugate 'aller' (to go) in 3rd person present"
+
+    created_at: datetime
+```
+
+---
+
+## Database Schema
+
+### New Tables
+
+#### `flashcard_template`
+```sql
+CREATE TABLE flashcard_template (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    name TEXT NOT NULL,
+    description TEXT,
+    language_pair TEXT NOT NULL,
+
+    card_type TEXT NOT NULL,  -- 'simple' | 'contextual' | 'gap_fill' | 'conjugation'
+
+    use_ai_for_context BOOLEAN DEFAULT FALSE,
+    ai_context_prompt TEXT,
+
+    include_gender_hints BOOLEAN DEFAULT FALSE,
+    include_conjugation_hints BOOLEAN DEFAULT FALSE,
+    max_accepted_answers INTEGER DEFAULT 3,
+
+    created_at TIMESTAMPTZ DEFAULT NOW()
+);
+
+CREATE INDEX idx_flashcard_template_language_pair ON flashcard_template(language_pair);
+CREATE INDEX idx_flashcard_template_type ON flashcard_template(card_type);
+```
+
+#### `ai_generated_content`
+```sql
+CREATE TABLE ai_generated_content (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    sense_id UUID REFERENCES dictionary_sense(id) ON DELETE CASCADE,
+    language TEXT NOT NULL,
+
+    contextual_sentences JSONB NOT NULL,
+    difficulty_level TEXT NOT NULL,
+
+    ai_model_used TEXT NOT NULL,  -- populated from application config, not hardcoded
+    generated_at TIMESTAMPTZ DEFAULT NOW(),
+    usage_count INTEGER DEFAULT 0,
+
+    UNIQUE(sense_id, language, difficulty_level)
+);
+
+CREATE INDEX idx_ai_content_sense_lang ON ai_generated_content(sense_id, language);
+CREATE INDEX idx_ai_content_difficulty ON ai_generated_content(difficulty_level);
+```
+
+#### `verb_conjugation_card`
+```sql
+CREATE TABLE verb_conjugation_card (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    base_flashcard_id UUID REFERENCES flashcard(id) ON DELETE CASCADE,
+
+    verb_lemma_id UUID REFERENCES dictionary_lemma(id) ON DELETE CASCADE,
+    tense TEXT NOT NULL,
+    person TEXT NOT NULL,
+    mood TEXT,
+
+    conjugated_form TEXT NOT NULL,
+    prompt_template TEXT NOT NULL,
+
+    created_at TIMESTAMPTZ DEFAULT NOW(),
+
+    UNIQUE(base_flashcard_id)
+);
+```
+
+### Enhanced Existing Tables
+
+#### `flashcard` (modifications)
+```sql
+ALTER TABLE flashcard
+-- Remove answer_text; accepted_answers is the single source of truth
+DROP COLUMN IF EXISTS answer_text,
+ADD COLUMN accepted_answers JSONB NOT NULL DEFAULT '[]',  -- list[str] for simple/contextual; empty for gap_fill
+
+-- Wordbank linkage on both sides
+ADD COLUMN prompt_sense_id UUID REFERENCES dictionary_sense(id) ON DELETE SET NULL,
+ADD COLUMN prompt_lemma_id UUID REFERENCES dictionary_lemma(id) ON DELETE SET NULL,
+
+-- Contextual content
+ADD COLUMN contextual_text TEXT,
+ADD COLUMN contextual_text_language TEXT,
+ADD COLUMN gap_positions JSONB,  -- list[GapPosition]; each GapPosition includes its own accepted_answers
+
+-- Card configuration
+ADD COLUMN card_direction TEXT NOT NULL DEFAULT 'target_to_source',
+  -- CONSTRAINT: values are 'target_to_source' or 'source_to_target' only
+  -- Bidirectionality = two rows, not a third value here
+ADD COLUMN card_type TEXT NOT NULL DEFAULT 'simple',
+ADD COLUMN prompt_modality TEXT NOT NULL DEFAULT 'text',
+ADD COLUMN grading_mode TEXT NOT NULL DEFAULT 'binary',
+
+-- Audio
+ADD COLUMN prompt_audio_url TEXT,
+ADD COLUMN answer_audio_url TEXT,
+ADD COLUMN contextual_audio_url TEXT,
+
+-- Provenance: template-generated vs article-extracted (mutually exclusive)
+ADD COLUMN template_id UUID REFERENCES flashcard_template(id) ON DELETE SET NULL,
+ADD COLUMN source_article_id UUID REFERENCES article(id) ON DELETE SET NULL,
+ADD COLUMN source_sentence_index INTEGER,
+
+ADD COLUMN updated_at TIMESTAMPTZ DEFAULT NOW();
+
+ALTER TABLE flashcard
+ADD CONSTRAINT chk_card_direction CHECK (card_direction IN ('target_to_source', 'source_to_target')),
+ADD CONSTRAINT chk_provenance CHECK (
+    NOT (template_id IS NOT NULL AND source_article_id IS NOT NULL)
+);
+
+CREATE INDEX idx_flashcard_card_type ON flashcard(card_type);
+CREATE INDEX idx_flashcard_direction ON flashcard(card_direction);
+CREATE INDEX idx_flashcard_source_article ON flashcard(source_article_id);
+```
+
+#### `flashcard_event` (modifications)
+```sql
+ALTER TABLE flashcard_event
+ADD COLUMN response_time_ms INTEGER,
+ADD COLUMN gap_results JSONB,                  -- list[GapGradingResult] for gap_fill cards
+ADD COLUMN correctness_score DECIMAL(3,2),
+ADD COLUMN accepted_answer_matched TEXT,
+ADD COLUMN study_session_id UUID,
+ADD COLUMN card_presentation_order INTEGER,
+ADD COLUMN audio_played BOOLEAN DEFAULT FALSE,
+ADD COLUMN audio_duration_played_ms INTEGER;
+
+CREATE INDEX idx_flashcard_event_session ON flashcard_event(study_session_id);
+CREATE INDEX idx_flashcard_event_correctness ON flashcard_event(correctness_score);
+```
+
+---
+
+## Service Layer Architecture
+
+### FlashcardService
+
+```python
+class FlashcardService:
+
+    def __init__(
+        self,
+        flashcard_repo: FlashcardRepository,
+        vocab_repo: VocabRepository,
+        dict_repo: DictionaryRepository,
+        template_repo: FlashcardTemplateRepository,
+        audio_service: AudioGenerationService,
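+        ai_service: AIContentGenerationService,
+        ai_content_repo: AIGeneratedContentRepository,  # Used by _get_or_generate_ai_content; type name is illustrative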
+        ai_model_name: str,  # Injected from application config; never hardcoded
+    ): ...
+
+    async def generate_flashcards_from_vocab_entry(
+        self,
+        entry_id: UUID,
+        user_proficiency: str = "B1",
+        template_types: list[str] | None = None,
+        bidirectional: bool = True,
+    ) -> list[Flashcard]:
+        """
+        Generate flashcards from a vocabulary entry using configured templates.
+
+        If bidirectional=True, both a target_to_source and a source_to_target
+        row are created for each template.  They are stored as independent rows.
+        """
+        entry = await self.vocab_repo.get_entry(entry_id)
+        sense = await self.dict_repo.get_sense(entry.sense_id)
+        lemma = await self.dict_repo.get_lemma(sense.lemma_id)
+
+        templates = await self.template_repo.get_templates_for_language_pair(
+            entry.language_pair,
+            template_types or ["simple", "contextual"]
+        )
+
+        flashcards = []
+        for template in templates:
+            contextual_text = None
+            if template.use_ai_for_context:
+                ai_content = await self._get_or_generate_ai_content(
+                    sense.id, sense.language, user_proficiency, template
+                )
+                contextual_text = random.choice(ai_content.contextual_sentences)
+
+            # Always create target_to_source
+            card_tts = await self._create_card(
+                template, entry, sense, lemma,
+                direction="target_to_source",
+                contextual_text=contextual_text,
+            )
+            flashcards.append(card_tts)
+
+            if bidirectional:
+                card_stt = await self._create_card(
+                    template, entry, sense, lemma,
+                    direction="source_to_target",
+                    contextual_text=contextual_text,
+                )
+                flashcards.append(card_stt)
+
+        return flashcards
+
+    async def create_flashcard_from_article_sentence(
+        self,
+        article_id: UUID,
+        sentence_index: int,
+        target_word: str,
+        bank_entry_id: UUID,
+        sense_id: UUID,
+        direction: str = "target_to_source",
+    ) -> Flashcard:
+        """
+        Create a contextual flashcard using a sentence from an article as the
+        contextual text.  The original sentence provides authentic context;
+        the target word is extracted as the gap.
+
+        This is the primary creation path for cards derived from article reading.
+        No template_id is set; source_article_id and source_sentence_index are.
+        """
+        sentence = await self._get_article_sentence(article_id, sentence_index)
+        gap = self._build_gap_from_sentence(sentence, target_word, sense_id, bank_entry_id)
+
+        return Flashcard(
+            bank_entry_id=str(bank_entry_id),
+            prompt_sense_id=str(sense_id),
+            card_type="gap_fill",
+            card_direction=direction,
+            contextual_text=sentence.text_with_gap,
+            contextual_text_language=sentence.language,
+            gap_positions=[gap],
+            accepted_answers=[],  # Answers live on gap_positions for gap_fill
+            template_id=None,
+            source_article_id=str(article_id),
+            source_sentence_index=sentence_index,
+            ...
+        )
+
+    async def grade_flashcard_response(
+        self,
+        flashcard: Flashcard,
+        user_response: str,
+        grading_mode: str = "fuzzy",
+    ) -> GradingResult:
+        """
+        Grade a user response.
+
+        For gap_fill cards with multiple gaps, user_response is expected to be
+        a pipe-delimited string of per-gap responses (e.g. "souhaite|avoir").
+        Per-gap GapGradingResult objects are returned inside the GradingResult.
+        """
+        if flashcard.card_type == "gap_fill" and flashcard.gap_positions:
+            return self._grade_multi_gap(flashcard, user_response, grading_mode)
+
+        if grading_mode == "binary":
+            return self._grade_binary(flashcard, user_response)
+        elif grading_mode == "fuzzy":
+            return self._grade_fuzzy(flashcard, user_response)
+        else:
+            raise ValueError(f"Unknown grading mode: {grading_mode}")
+
+    def _grade_multi_gap(
+        self,
+        flashcard: Flashcard,
+        user_response: str,
+        grading_mode: str,
+    ) -> GradingResult:
+        """
+        Grade each gap independently using its own accepted_answers list.
+        Overall correctness_score is the mean of per-gap scores.
+        """
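+        responses = user_response.split("|")
+        # Pad with empty responses so unanswered gaps are graded as incorrect
+        # rather than being silently dropped by zip().
+        responses += [""] * (len(flashcard.gap_positions) - len(responses))
+        gap_results = []
+
+        for i, (gap, response) in enumerate(zip(flashcard.gap_positions, responses)):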
+            temp_card = SimpleNamespace(accepted_answers=gap.accepted_answers)
+            gap_grade = (
+                self._grade_fuzzy(temp_card, response)
+                if grading_mode == "fuzzy"
+                else self._grade_binary(temp_card, response)
+            )
+            gap_results.append(GapGradingResult(
+                gap_index=i,
+                user_response=response,
+                is_correct=gap_grade.is_correct,
+                correctness_score=gap_grade.score,
+                matched_answer=gap_grade.matched_answer,
+            ))
+
+        mean_score = sum(r.correctness_score for r in gap_results) / len(gap_results)
+        return GradingResult(
+            is_correct=all(r.is_correct for r in gap_results),
+            score=mean_score,
+            gap_results=gap_results,
+        )
+
+    def _grade_fuzzy(self, flashcard, response: str) -> GradingResult:
+        """
+        Accept variations and use string similarity.  Checks accepted_answers
+        exactly first, then falls back to similarity threshold (>= 0.8).
+        """
+        response_clean = response.strip().lower()
+
+        for accepted in flashcard.accepted_answers:
+            if response_clean == accepted.lower():
+                return GradingResult(is_correct=True, score=1.0, matched_answer=accepted)
+
+        for accepted in flashcard.accepted_answers:
+            similarity = self._calculate_string_similarity(response_clean, accepted.lower())
+            if similarity >= 0.8:
+                return GradingResult(is_correct=True, score=similarity, matched_answer=accepted)
+
+        return GradingResult(is_correct=False, score=0.0, matched_answer=None)
+
+    async def _get_or_generate_ai_content(
+        self,
+        sense_id: UUID,
+        language: str,
+        proficiency: str,
+        template: FlashcardTemplate,
+    ) -> AIGeneratedContent:
+        cached = await self.ai_content_repo.get_content(sense_id, language, proficiency)
+        if cached:
+            await self.ai_content_repo.increment_usage(cached.id)
+            return cached
+
+        sense = await self.dict_repo.get_sense(sense_id)
+        lemma = await self.dict_repo.get_lemma(sense.lemma_id)
+
+        ai_prompt = template.ai_context_prompt.format(
+            headword=lemma.headword,
+            gloss=sense.gloss,
+            proficiency=proficiency,
+        )
+        sentences = await self.ai_service.generate_contextual_sentences(ai_prompt, count=5)
+
+        return await self.ai_content_repo.create(AIGeneratedContent(
+            sense_id=sense_id,
+            language=language,
+            contextual_sentences=sentences,
+            difficulty_level=proficiency,
+            ai_model_used=self.ai_model_name,  # From config
+            usage_count=1,
+        ))
+```
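+
+`_grade_binary` and `_calculate_string_similarity` are referenced above but not defined in this document.  The following is a minimal sketch of what they might look like, assuming similarity is measured with the standard library's `difflib.SequenceMatcher`; the real implementation may well use a different metric (e.g. edit distance), and the free-function names here are illustrative:
+
+```python
+from difflib import SequenceMatcher
+
+
+def calculate_string_similarity(a: str, b: str) -> float:
+    """Return a similarity ratio between 0.0 and 1.0."""
+    return SequenceMatcher(None, a, b).ratio()
+
+
+def grade_binary(accepted_answers: list[str], response: str) -> bool:
+    """Exact match only, ignoring case and surrounding whitespace."""
+    cleaned = response.strip().lower()
+    return any(cleaned == answer.strip().lower() for answer in accepted_answers)
+
+
+# A one-letter typo stays above the 0.8 threshold used by _grade_fuzzy,
+# while an unrelated word falls well below it.
+print(calculate_string_similarity("souhaitte", "souhaite") >= 0.8)
+print(calculate_string_similarity("chat", "bank") >= 0.8)
+print(grade_binary(["to be", "be"], "  To Be "))
+```
+
+Note that a ratio-based threshold is more forgiving on long words than on short ones, so the 0.8 cut-off may need tuning per language or word length.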
+
+### FlashcardTemplateService
+
+Manages templates and the admin Flashcard Studio experience.
+
+```python
+class FlashcardTemplateService:
+
+    async def create_template_for_word_class(
+        self,
+        word_class: str,   # "verb" | "noun" | "adjective" etc.
+        language_pair: str,
+        admin_user_id: UUID,
+    ) -> FlashcardTemplate: ...
+
+    async def generate_contextual_examples_for_admin(
+        self,
+        lemma: DictionaryLemma,
+        sense: DictionarySense,
+        proficiency: str,
+        count: int = 5,
+    ) -> list[str]:
+        """
+        Admin Flashcard Studio: given a headword and sense, generate candidate
+        contextual sentences that an admin can review and accept or discard before
+        a template is saved.  Results are not cached until the admin confirms.
+        """
+
+    async def suggest_flashcard_improvements(
+        self,
+        flashcard: Flashcard,
+        performance_data: list[FlashcardEvent],
+    ) -> list[str]: ...
+```
+
+### FlashcardStudyService
+
+Runs a learner's study session: starts the session, serves cards one at a time, records each interaction as a `FlashcardEvent`, and produces a summary on completion.
+
+```python
+class FlashcardStudyService:
+
+    async def start_study_session(
+        self,
+        user_id: UUID,
+        language_pair_id: UUID,
+        session_config: StudySessionConfig,
+    ) -> StudySession: ...
+
+    async def get_next_card_in_session(self, session_id: UUID) -> Flashcard | None: ...
+
+    async def record_card_interaction(
+        self,
+        flashcard_id: UUID,
+        user_response: str,
+        response_time_ms: int,
+        session_id: UUID,
+    ) -> FlashcardEvent: ...
+
+    async def complete_study_session(self, session_id: UUID) -> StudySessionSummary: ...
+```
+
+### AudioIntegrationService
+
+Generates audio for whole flashcards and for individual contextual sentences.
+
+```python
+class AudioIntegrationService:
+
+    async def generate_audio_for_flashcard(
+        self,
+        flashcard: Flashcard,
+        voice_config: VoiceConfig,
+    ) -> AudioFiles: ...
+
+    async def generate_contextual_audio(
+        self,
+        text: str,
+        language: str,
+        highlight_words: list[str] | None = None,
+    ) -> str: ...
+```
+
+---
+
+## Integration Points
+
+### Vocabulary Bank Integration
+
+- Each `Flashcard` links to a `LearnableWordBankEntry` via `bank_entry_id`
+- `prompt_sense_id` and `prompt_lemma_id` anchor the cue side to the dictionary
+- For gap-fill cards, each `GapPosition.bank_entry_id` anchors the answer side for each gap independently
+- Only resolved vocabulary entries (those with a `sense_id`) can generate standard flashcards
+- Flashcard performance events feed back into vocabulary familiarity scoring
+
+### Dictionary Integration
+
+- Verb lemmas link to specialised conjugation flashcard generation via `VerbConjugationCard`
+- Gender information influences `accepted_answers` construction (e.g. including "la banque" alongside "banque")
+- Multiple senses per lemma enable sense-specific flashcard variations with distinct `prompt_sense_id` values
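+
+As an illustration of the gender point above, the sketch below builds an `accepted_answers` list that accepts the headword with or without its article. The helper name `build_accepted_answers` and its `article` parameter are hypothetical; how the real service derives the article from the dictionary's gender data is not shown here.
+
+```python
+def build_accepted_answers(headword: str, article: str | None = None) -> list[str]:
+    """Accept the bare headword and, when an article is known, article + headword."""
+    answers = [headword]
+    if article:
+        # Elided articles such as "l'" attach to the headword without a space.
+        joiner = "" if article.endswith("'") else " "
+        answers.insert(0, f"{article}{joiner}{headword}")
+    return answers
+
+
+print(build_accepted_answers("banque", "la"))  # ['la banque', 'banque']
+```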
+
+### Article Extraction Integration
+
+- `source_article_id` and `source_sentence_index` on `Flashcard` record provenance for cards created during article reading
+- The `create_flashcard_from_article_sentence` service method is the dedicated creation path
+- These cards carry no `template_id`; a constraint on the table makes `template_id` and the article source fields mutually exclusive
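+
+The exclusivity rule can also be checked in application code before a row reaches the database. The sketch below is one hedged way to do that; the `FlashcardProvenance` class is hypothetical and holds only the three provenance fields named above, not the full `Flashcard` model.
+
+```python
+from dataclasses import dataclass
+from uuid import UUID, uuid4
+
+
+@dataclass
+class FlashcardProvenance:
+    template_id: UUID | None = None
+    source_article_id: UUID | None = None
+    source_sentence_index: int | None = None
+
+    def __post_init__(self) -> None:
+        # Mirror the table constraint: a card comes from a template or an article, not both.
+        if self.template_id is not None and self.source_article_id is not None:
+            raise ValueError("template_id and source_article_id are mutually exclusive")
+
+
+# An article-derived card: provenance fields set, no template.
+article_card = FlashcardProvenance(source_article_id=uuid4(), source_sentence_index=3)
+```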
+
+### Future Fluency System Integration
+
+- `FlashcardEvent` provides performance metrics per word and per sense
+- `GapGradingResult` enables per-word performance tracking within multi-gap cards
+- Spaced-repetition scheduling will be driven by fluency scores derived from event history
+
+---
+
+## Implementation Phases
+
+### Phase 1: Core Enhanced Flashcard System
+- Implement the enhanced `Flashcard` domain model with word bank linkage on both sides
+- Replace `answer_text` with `accepted_answers` throughout; migrate existing data
+- Implement `GapPosition` with per-gap `accepted_answers`
+- Enforce bidirectionality as two rows via the service layer
+
+### Phase 2: Article Extraction Path
+- Implement `create_flashcard_from_article_sentence` in `FlashcardService`
+- Wire up article sentence retrieval and gap construction
+- Surface this in the article reading UI
+
+### Phase 3: AI-Assisted Content Generation
+- Integrate AI service for contextual sentence generation; model name from config
+- Implement `FlashcardTemplateService` including the admin Flashcard Studio preview flow
+- Implement `ai_generated_content` caching
+
+### Phase 4: Advanced Card Types
+- Implement verb conjugation flashcards via `VerbConjugationCard`
+- Add audio support via `AudioIntegrationService`
+- Implement fuzzy grading and multi-gap grading
+
+### Phase 5: Study Session Management
+- Implement `FlashcardStudyService`
+- Add basic spaced-repetition scheduling
+- Produce session summaries and performance analytics
+
+### Phase 6: Integration and Polish
+- Integrate with fluency/familiarity system once designed
+- Add adaptive difficulty adjustment
+- Build administrative tooling
+
+---
+
+## Backward Compatibility
+
+- Existing flashcards are treated as `card_type: "simple"` and `card_direction: "target_to_source"`
+- Where `answer_text` exists in current data, it is migrated to a single-element `accepted_answers` list
+- Existing `FlashcardEvent` records remain valid; new columns are nullable
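+
+The rules above can be sketched as a per-row transformation. This is an illustration only: it assumes legacy rows are available as plain dictionaries, and the function name `migrate_legacy_flashcard` is hypothetical. A real migration would more likely run as SQL or through the project's migration tooling.
+
+```python
+def migrate_legacy_flashcard(row: dict) -> dict:
+    """Return a copy of a legacy row in the new shape, leaving the input untouched."""
+    migrated = {key: value for key, value in row.items() if key != "answer_text"}
+    # Legacy cards predate card types and directions, so apply the documented defaults.
+    migrated.setdefault("card_type", "simple")
+    migrated.setdefault("card_direction", "target_to_source")
+    if "accepted_answers" not in migrated:
+        answer = row.get("answer_text")
+        # A single legacy answer becomes a single-element list.
+        migrated["accepted_answers"] = [answer] if answer is not None else []
+    return migrated
+```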