# Technical Document: Flashcard

This is the technical design document for building Flashcards. See [design-doc-flashcard](./design-doc-flashcard.md) for the product requirements and domain analysis.

## Summary

The Flashcard domain implements a spaced-repetition learning system that supports contextual, multi-modal flashcards with bidirectional study patterns. Unlike simple word-pair flashcards, this system integrates deeply with the vocabulary bank and dictionary to support contextual text, multiple correct answers per gap, verb conjugations, and audio components.

## Current State Analysis

The existing flashcard implementation provides basic functionality:

- Simple bidirectional flashcards (`target_to_source`, `source_to_target`)
- Basic event tracking (`shown`, `answered`, `skipped`)
- Integration with vocabulary bank entries

However, the design document requirements necessitate significant enhancements to support:

- Contextual text with gap-fill exercises, including multiple simultaneous gaps
- Multiple correct answers per gap, independently mapped
- Bidirectional study as two distinct presentation rows
- Full wordbank linkage on both cue and answer sides
- Verb conjugation modelling
- Audio (TTS) integration
- AI-assisted flashcard generation from templates
- Flashcard creation from article source sentences

## Domain Entities

### Core Flashcard Entity (Enhanced)

```python
from __future__ import annotations  # Lets Flashcard reference GapPosition before it is defined

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Flashcard:
    id: str
    user_id: str

    # Wordbank linkage: both sides must be anchored
    bank_entry_id: str  # The vocabulary bank entry this card belongs to
    prompt_sense_id: str | None  # Dictionary sense being tested on the prompt side
    prompt_lemma_id: str | None  # Dictionary lemma for the prompt side
    source_lang: str
    target_lang: str

    # Core content
    # answer_text is removed; accepted_answers is the single canonical list
    prompt_text: str
    # All acceptable answer variations; non-empty except for gap_fill cards,
    # whose answers live on gap_positions
    accepted_answers: list[str]

    # Contextual content
    contextual_text: str | None
    contextual_text_language: str | None
    gap_positions: list[GapPosition] | None  # For fill-in-the-blank; each gap carries its own accepted_answers

    # Card configuration
    card_direction: str  # "target_to_source" | "source_to_target"
    # Bidirectional = two separate Flashcard rows, one per direction
    card_type: str  # "simple" | "contextual" | "gap_fill" | "conjugation"
    prompt_modality: str  # "text" | "audio" | "text_and_audio"

    # Grading configuration
    grading_mode: str  # "binary" | "fuzzy"
    # "multiple_choice" is deferred: distractors are not yet modelled

    # Audio support
    prompt_audio_url: str | None
    answer_audio_url: str | None
    contextual_audio_url: str | None

    # Template relationship (null for cards extracted from articles)
    template_id: str | None

    # Article source (null for template-generated cards)
    source_article_id: str | None
    source_sentence_index: int | None  # Which sentence in the article was used as contextual_text

    created_at: datetime
    updated_at: datetime


@dataclass
class GapPosition:
    """Represents a single gap in contextual text for fill-in-the-blank exercises.

    Each GapPosition carries its own accepted_answers, enabling independent
    grading of each gap in multi-gap cards.

    Example for "Il _______ _'_____ un chat" (He wishes to have a cat):
        GapPosition(start_index=3, end_index=10, target_word="souhaite",
                    accepted_answers=["souhaite"], ...)
        GapPosition(start_index=14, end_index=19, target_word="avoir",
                    accepted_answers=["avoir", "d'avoir"], ...)
    """
    start_index: int
    end_index: int
    target_word: str
    accepted_answers: list[str]  # Answers specific to this gap
    target_lemma_id: str | None
    target_sense_id: str | None
    bank_entry_id: str | None  # Wordbank linkage for this specific gap's word
```

### Bidirectionality

A "bidirectional" flashcard is **not** a single entity with a `bidirectional` direction value. It is represented as two separate `Flashcard` rows, one `target_to_source` and one `source_to_target`, sharing the same `bank_entry_id`.
This keeps each row's `prompt_sense_id`, `accepted_answers`, and grading configuration independently addressable, and avoids ambiguity in event recording. When generating flashcards for a vocabulary entry, the service creates both rows if bidirectional study is desired.

```python
# Example: two rows for "banque" ↔ "bank"
Flashcard(
    card_direction="target_to_source",
    prompt_text="banque",
    prompt_sense_id="dict-sense-banque-finance",
    accepted_answers=["bank", "financial institution"],
    ...
)
Flashcard(
    card_direction="source_to_target",
    prompt_text="bank (n, finance)",
    prompt_sense_id="dict-sense-bank-finance-en",
    accepted_answers=["banque", "la banque"],
    ...
)
```

### Multi-Gap Cards

For cards with multiple simultaneous gaps, each `GapPosition` in the list carries its own `accepted_answers`. The top-level `Flashcard.accepted_answers` field is not used for gap-fill cards; grading iterates `gap_positions` instead.

```python
# "Il _______ _'_____ un chat"
# Cue: "(He [wishes] [to have] a cat)"
Flashcard(
    card_type="gap_fill",
    contextual_text="Il _______ _'_____ un chat",
    prompt_text="(He [wishes] [to have] a cat)",
    accepted_answers=[],  # Unused for gap_fill; answers live on gap_positions
    gap_positions=[
        GapPosition(
            start_index=3,
            end_index=10,
            target_word="souhaite",
            accepted_answers=["souhaite"],
            target_lemma_id="lemma-souhaiter",
            target_sense_id="sense-souhaiter-wish",
            bank_entry_id="entry-souhaiter-user-123",
        ),
        GapPosition(
            start_index=14,
            end_index=19,
            target_word="avoir",
            accepted_answers=["avoir", "d'avoir"],
            target_lemma_id="lemma-avoir",
            target_sense_id="sense-avoir-have",
            bank_entry_id="entry-avoir-user-123",
        ),
    ],
    ...
)
```

### Flashcard Template Entity

Templates define parameters for generating flashcards from dictionary senses. They are used for AI-assisted generation only; cards extracted from articles do not require a template.
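As a rough illustration of how a template's AI settings might be consumed, the sketch below renders a prompt from the `use_ai_for_context` and `ai_context_prompt` fields of `FlashcardTemplate`. The `TemplatePromptSketch` class and the `render_context_prompt` helper are hypothetical names introduced only for this example, and the use of `str.format` placeholders is an assumption based on the `{headword}`, `{gloss}`, and `{proficiency}` convention used elsewhere in this document.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TemplatePromptSketch:
    """Hypothetical stand-in carrying only the two template fields used here."""
    use_ai_for_context: bool
    ai_context_prompt: str | None


def render_context_prompt(
    template: TemplatePromptSketch,
    headword: str,
    gloss: str,
    proficiency: str,
) -> str | None:
    """Return the filled-in AI prompt, or None when the template does not use AI.

    Guarding against a missing prompt avoids calling .format() on None.
    """
    if not template.use_ai_for_context or template.ai_context_prompt is None:
        return None
    return template.ai_context_prompt.format(
        headword=headword,
        gloss=gloss,
        proficiency=proficiency,
    )


# Example usage with an illustrative prompt string (not taken from a real template)
example = TemplatePromptSketch(
    use_ai_for_context=True,
    ai_context_prompt="Write one {proficiency}-level sentence using '{headword}' ({gloss}).",
)
rendered = render_context_prompt(example, "banque", "bank", "B1")
```

Returning `None` rather than raising keeps the caller's decision simple: a template without AI context just produces a card with no contextual sentence.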
```python
@dataclass
class FlashcardTemplate:
    id: str
    name: str
    description: str
    language_pair: str  # e.g., "en-fr"
    card_type: str  # "simple" | "contextual" | "gap_fill" | "conjugation"

    # AI generation settings
    use_ai_for_context: bool
    ai_context_prompt: str | None  # Supports {headword}, {gloss}, {proficiency} placeholders

    # Answer generation settings
    include_gender_hints: bool
    include_conjugation_hints: bool
    max_accepted_answers: int

    created_at: datetime
```

### AI Generation Cache Entity

```python
@dataclass
class AIGeneratedContent:
    """Caches AI-generated contextual sentences for dictionary senses."""
    id: str
    sense_id: str
    language: str
    contextual_sentences: list[str]
    difficulty_level: str  # "A1" | "A2" | "B1" | "B2" | "C1" | "C2"
    ai_model_used: str  # Read from configuration, never hardcoded
    generated_at: datetime
    usage_count: int
```

### Enhanced FlashcardEvent

```python
@dataclass
class FlashcardEvent:
    id: str
    flashcard_id: str
    user_id: str
    event_type: str  # "shown" | "answered" | "skipped" | "audio_played"

    user_response: str | None
    response_time_ms: int | None

    # For gap_fill cards, per-gap results are stored here
    gap_results: list[GapGradingResult] | None
    correctness_score: float | None  # 0.0–1.0; mean of gap scores for multi-gap
    accepted_answer_matched: str | None

    study_session_id: str | None
    card_presentation_order: int | None

    audio_played: bool
    audio_duration_played_ms: int | None

    created_at: datetime


@dataclass
class GapGradingResult:
    gap_index: int
    user_response: str
    is_correct: bool
    correctness_score: float
    matched_answer: str | None
```

### Conjugation Support Entity

```python
@dataclass
class VerbConjugationCard:
    id: str
    base_flashcard_id: str
    verb_lemma_id: str
    tense: str  # "present" | "past" | "future" | "conditional" etc.
    person: str  # "1s" | "2s" | "3s" | "1p" | "2p" | "3p"
    mood: str | None  # "indicative" | "subjunctive" | "imperative"
    conjugated_form: str
    prompt_template: str  # e.g., "Conjugate 'aller' (to go) in 3rd person present"
    created_at: datetime
```

---

## Database Schema

### New Tables

#### `flashcard_template`

```sql
CREATE TABLE flashcard_template (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name TEXT NOT NULL,
    description TEXT,
    language_pair TEXT NOT NULL,
    card_type TEXT NOT NULL,  -- 'simple' | 'contextual' | 'gap_fill' | 'conjugation'
    use_ai_for_context BOOLEAN DEFAULT FALSE,
    ai_context_prompt TEXT,
    include_gender_hints BOOLEAN DEFAULT FALSE,
    include_conjugation_hints BOOLEAN DEFAULT FALSE,
    max_accepted_answers INTEGER DEFAULT 3,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_flashcard_template_language_pair ON flashcard_template(language_pair);
CREATE INDEX idx_flashcard_template_type ON flashcard_template(card_type);
```

#### `ai_generated_content`

```sql
CREATE TABLE ai_generated_content (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    sense_id UUID REFERENCES dictionary_sense(id) ON DELETE CASCADE,
    language TEXT NOT NULL,
    contextual_sentences JSONB NOT NULL,
    difficulty_level TEXT NOT NULL,
    ai_model_used TEXT NOT NULL,  -- populated from application config, not hardcoded
    generated_at TIMESTAMPTZ DEFAULT NOW(),
    usage_count INTEGER DEFAULT 0,
    UNIQUE(sense_id, language, difficulty_level)
);

CREATE INDEX idx_ai_content_sense_lang ON ai_generated_content(sense_id, language);
CREATE INDEX idx_ai_content_difficulty ON ai_generated_content(difficulty_level);
```

#### `verb_conjugation_card`

```sql
CREATE TABLE verb_conjugation_card (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    base_flashcard_id UUID REFERENCES flashcard(id) ON DELETE CASCADE,
    verb_lemma_id UUID REFERENCES dictionary_lemma(id) ON DELETE CASCADE,
    tense TEXT NOT NULL,
    person TEXT NOT NULL,
    mood TEXT,
    conjugated_form TEXT NOT NULL,
    prompt_template TEXT NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    UNIQUE(base_flashcard_id)
);
```

### Enhanced Existing Tables

#### `flashcard` (modifications)

```sql
ALTER TABLE flashcard
    -- accepted_answers replaces answer_text as the single source of truth
    ADD COLUMN accepted_answers JSONB NOT NULL DEFAULT '[]',  -- list[str] for simple/contextual; empty for gap_fill

    -- Wordbank linkage on both sides
    ADD COLUMN prompt_sense_id UUID REFERENCES dictionary_sense(id) ON DELETE SET NULL,
    ADD COLUMN prompt_lemma_id UUID REFERENCES dictionary_lemma(id) ON DELETE SET NULL,

    -- Contextual content
    ADD COLUMN contextual_text TEXT,
    ADD COLUMN contextual_text_language TEXT,
    ADD COLUMN gap_positions JSONB,  -- list[GapPosition]; each GapPosition includes its own accepted_answers

    -- Card configuration
    ADD COLUMN card_direction TEXT NOT NULL DEFAULT 'target_to_source',
    -- CONSTRAINT: values are 'target_to_source' or 'source_to_target' only
    -- Bidirectionality = two rows, not a third value here
    ADD COLUMN card_type TEXT NOT NULL DEFAULT 'simple',
    ADD COLUMN prompt_modality TEXT NOT NULL DEFAULT 'text',
    ADD COLUMN grading_mode TEXT NOT NULL DEFAULT 'binary',

    -- Audio
    ADD COLUMN prompt_audio_url TEXT,
    ADD COLUMN answer_audio_url TEXT,
    ADD COLUMN contextual_audio_url TEXT,

    -- Provenance: template-generated vs article-extracted (mutually exclusive)
    ADD COLUMN template_id UUID REFERENCES flashcard_template(id) ON DELETE SET NULL,
    ADD COLUMN source_article_id UUID REFERENCES article(id) ON DELETE SET NULL,
    ADD COLUMN source_sentence_index INTEGER,
    ADD COLUMN updated_at TIMESTAMPTZ DEFAULT NOW();

-- Backfill accepted_answers before answer_text is dropped, so existing data is kept.
-- Assumes the legacy answer_text column is present; skip this statement otherwise.
UPDATE flashcard
SET accepted_answers = jsonb_build_array(answer_text)
WHERE answer_text IS NOT NULL;

-- Remove answer_text only after the backfill
ALTER TABLE flashcard
    DROP COLUMN IF EXISTS answer_text;

ALTER TABLE flashcard
    ADD CONSTRAINT chk_card_direction
        CHECK (card_direction IN ('target_to_source', 'source_to_target')),
    ADD CONSTRAINT chk_provenance
        CHECK (NOT (template_id IS NOT NULL AND source_article_id IS NOT NULL));

CREATE INDEX idx_flashcard_card_type ON flashcard(card_type);
CREATE INDEX idx_flashcard_direction ON flashcard(card_direction);
CREATE INDEX idx_flashcard_source_article ON
    flashcard(source_article_id);
```

#### `flashcard_event` (modifications)

```sql
ALTER TABLE flashcard_event
    ADD COLUMN response_time_ms INTEGER,
    ADD COLUMN gap_results JSONB,  -- list[GapGradingResult] for gap_fill cards
    ADD COLUMN correctness_score DECIMAL(3,2),
    ADD COLUMN accepted_answer_matched TEXT,
    ADD COLUMN study_session_id UUID,
    ADD COLUMN card_presentation_order INTEGER,
    ADD COLUMN audio_played BOOLEAN DEFAULT FALSE,
    ADD COLUMN audio_duration_played_ms INTEGER;

CREATE INDEX idx_flashcard_event_session ON flashcard_event(study_session_id);
CREATE INDEX idx_flashcard_event_correctness ON flashcard_event(correctness_score);
```

---

## Service Layer Architecture

### FlashcardService

```python
import random
from types import SimpleNamespace
from uuid import UUID


class FlashcardService:
    def __init__(
        self,
        flashcard_repo: FlashcardRepository,
        vocab_repo: VocabRepository,
        dict_repo: DictionaryRepository,
        template_repo: FlashcardTemplateRepository,
        # Used by _get_or_generate_ai_content; the type name here is a placeholder
        ai_content_repo: AIGeneratedContentRepository,
        audio_service: AudioGenerationService,
        ai_service: AIContentGenerationService,
        ai_model_name: str,  # Injected from application config; never hardcoded
    ): ...

    async def generate_flashcards_from_vocab_entry(
        self,
        entry_id: UUID,
        user_proficiency: str = "B1",
        template_types: list[str] | None = None,
        bidirectional: bool = True,
    ) -> list[Flashcard]:
        """
        Generate flashcards from a vocabulary entry using configured templates.

        If bidirectional=True, both a target_to_source and a source_to_target row
        are created for each template. They are stored as independent rows.
        """
        entry = await self.vocab_repo.get_entry(entry_id)
        sense = await self.dict_repo.get_sense(entry.sense_id)
        lemma = await self.dict_repo.get_lemma(sense.lemma_id)

        templates = await self.template_repo.get_templates_for_language_pair(
            entry.language_pair, template_types or ["simple", "contextual"]
        )

        flashcards = []
        for template in templates:
            contextual_text = None
            if template.use_ai_for_context:
                ai_content = await self._get_or_generate_ai_content(
                    sense.id, sense.language, user_proficiency, template
                )
                contextual_text = random.choice(ai_content.contextual_sentences)

            # Always create target_to_source
            card_tts = await self._create_card(
                template, entry, sense, lemma,
                direction="target_to_source",
                contextual_text=contextual_text,
            )
            flashcards.append(card_tts)

            if bidirectional:
                card_stt = await self._create_card(
                    template, entry, sense, lemma,
                    direction="source_to_target",
                    contextual_text=contextual_text,
                )
                flashcards.append(card_stt)

        return flashcards

    async def create_flashcard_from_article_sentence(
        self,
        article_id: UUID,
        sentence_index: int,
        target_word: str,
        bank_entry_id: UUID,
        sense_id: UUID,
        direction: str = "target_to_source",
    ) -> Flashcard:
        """
        Create a contextual flashcard using a sentence from an article as the
        contextual text. The original sentence provides authentic context; the
        target word is extracted as the gap.

        This is the primary creation path for cards derived from article reading.
        No template_id is set; source_article_id and source_sentence_index are.
        """
        sentence = await self._get_article_sentence(article_id, sentence_index)
        gap = self._build_gap_from_sentence(sentence, target_word, sense_id, bank_entry_id)

        return Flashcard(
            bank_entry_id=str(bank_entry_id),
            prompt_sense_id=str(sense_id),
            card_type="gap_fill",
            card_direction=direction,
            contextual_text=sentence.text_with_gap,
            contextual_text_language=sentence.language,
            gap_positions=[gap],
            accepted_answers=[],  # Answers live on gap_positions for gap_fill
            template_id=None,
            source_article_id=str(article_id),
            source_sentence_index=sentence_index,
            ...
        )

    async def grade_flashcard_response(
        self,
        flashcard: Flashcard,
        user_response: str,
        grading_mode: str = "fuzzy",
    ) -> GradingResult:
        """
        Grade a user response. For gap_fill cards with multiple gaps,
        user_response is expected to be a pipe-delimited string of per-gap
        responses (e.g. "souhaite|avoir"). Per-gap GapGradingResult objects
        are returned inside the GradingResult.
        """
        if flashcard.card_type == "gap_fill" and flashcard.gap_positions:
            return self._grade_multi_gap(flashcard, user_response, grading_mode)

        if grading_mode == "binary":
            return self._grade_binary(flashcard, user_response)
        elif grading_mode == "fuzzy":
            return self._grade_fuzzy(flashcard, user_response)
        else:
            raise ValueError(f"Unknown grading mode: {grading_mode}")

    def _grade_multi_gap(
        self,
        flashcard: Flashcard,
        user_response: str,
        grading_mode: str,
    ) -> GradingResult:
        """
        Grade each gap independently using its own accepted_answers list.
        Overall correctness_score is the mean of per-gap scores.
        """
        responses = user_response.split("|")
        # Pad with empty responses so unanswered gaps are graded as incorrect
        # instead of being silently dropped by zip()
        responses += [""] * (len(flashcard.gap_positions) - len(responses))

        gap_results = []
        for i, (gap, response) in enumerate(zip(flashcard.gap_positions, responses)):
            # Lightweight stand-in exposing only accepted_answers to the graders
            temp_card = SimpleNamespace(accepted_answers=gap.accepted_answers)
            gap_grade = (
                self._grade_fuzzy(temp_card, response)
                if grading_mode == "fuzzy"
                else self._grade_binary(temp_card, response)
            )
            gap_results.append(GapGradingResult(
                gap_index=i,
                user_response=response,
                is_correct=gap_grade.is_correct,
                correctness_score=gap_grade.score,
                matched_answer=gap_grade.matched_answer,
            ))

        mean_score = sum(r.correctness_score for r in gap_results) / len(gap_results)
        return GradingResult(
            is_correct=all(r.is_correct for r in gap_results),
            score=mean_score,
            gap_results=gap_results,
        )

    def _grade_fuzzy(self, flashcard, response: str) -> GradingResult:
        """
        Accept variations and use string similarity.
        Checks accepted_answers exactly first, then falls back to a
        similarity threshold (>= 0.8).
        """
        response_clean = response.strip().lower()

        # Pass 1: exact (case-insensitive) match
        for accepted in flashcard.accepted_answers:
            if response_clean == accepted.lower():
                return GradingResult(is_correct=True, score=1.0, matched_answer=accepted)

        # Pass 2: first accepted answer that is similar enough
        for accepted in flashcard.accepted_answers:
            similarity = self._calculate_string_similarity(response_clean, accepted.lower())
            if similarity >= 0.8:
                return GradingResult(is_correct=True, score=similarity, matched_answer=accepted)

        return GradingResult(is_correct=False, score=0.0, matched_answer=None)

    async def _get_or_generate_ai_content(
        self,
        sense_id: UUID,
        language: str,
        proficiency: str,
        template: FlashcardTemplate,
    ) -> AIGeneratedContent:
        cached = await self.ai_content_repo.get_content(sense_id, language, proficiency)
        if cached:
            await self.ai_content_repo.increment_usage(cached.id)
            return cached

        sense = await self.dict_repo.get_sense(sense_id)
        lemma = await self.dict_repo.get_lemma(sense.lemma_id)

        ai_prompt = template.ai_context_prompt.format(
            headword=lemma.headword,
            gloss=sense.gloss,
            proficiency=proficiency,
        )
        sentences = await self.ai_service.generate_contextual_sentences(ai_prompt, count=5)

        return await self.ai_content_repo.create(AIGeneratedContent(
            sense_id=sense_id,
            language=language,
            contextual_sentences=sentences,
            difficulty_level=proficiency,
            ai_model_used=self.ai_model_name,  # From config
            usage_count=1,
        ))
```

### FlashcardTemplateService

Manages templates and the admin Flashcard Studio experience.

```python
class FlashcardTemplateService:
    async def create_template_for_word_class(
        self,
        word_class: str,  # "verb" | "noun" | "adjective" etc.
        language_pair: str,
        admin_user_id: UUID,
    ) -> FlashcardTemplate: ...

    async def generate_contextual_examples_for_admin(
        self,
        lemma: DictionaryLemma,
        sense: DictionarySense,
        proficiency: str,
        count: int = 5,
    ) -> list[str]:
        """
        Admin Flashcard Studio: given a headword and sense, generate candidate
        contextual sentences that an admin can review and accept or discard
        before a template is saved. Results are not cached until the admin confirms.
        """

    async def suggest_flashcard_improvements(
        self,
        flashcard: Flashcard,
        performance_data: list[FlashcardEvent],
    ) -> list[str]: ...
```

### FlashcardStudyService

```python
class FlashcardStudyService:
    async def start_study_session(
        self,
        user_id: UUID,
        language_pair_id: UUID,
        session_config: StudySessionConfig,
    ) -> StudySession: ...

    async def get_next_card_in_session(self, session_id: UUID) -> Flashcard | None: ...

    async def record_card_interaction(
        self,
        flashcard_id: UUID,
        user_response: str,
        response_time_ms: int,
        session_id: UUID,
    ) -> FlashcardEvent: ...

    async def complete_study_session(self, session_id: UUID) -> StudySessionSummary: ...
```

### AudioIntegrationService

```python
class AudioIntegrationService:
    async def generate_audio_for_flashcard(
        self,
        flashcard: Flashcard,
        voice_config: VoiceConfig,
    ) -> AudioFiles: ...

    async def generate_contextual_audio(
        self,
        text: str,
        language: str,
        highlight_words: list[str] | None = None,
    ) -> str: ...
```

---

## Integration Points

### Vocabulary Bank Integration

- Each `Flashcard` links to a `LearnableWordBankEntry` via `bank_entry_id`
- `prompt_sense_id` and `prompt_lemma_id` anchor the cue side to the dictionary
- For gap-fill cards, each `GapPosition.bank_entry_id` anchors the answer side for each gap independently
- Only resolved vocabulary entries (with `sense_id`) can generate standard flashcards
- Flashcard performance events feed back into vocabulary familiarity scoring

### Dictionary Integration

- Verb lemmas link to specialised conjugation flashcard generation via `VerbConjugationCard`
- Gender information influences `accepted_answers` construction (e.g. including "la banque" alongside "banque")
- Multiple senses per lemma enable sense-specific flashcard variations with distinct `prompt_sense_id` values

### Article Extraction Integration

- `source_article_id` and `source_sentence_index` on `Flashcard` record provenance for cards created during article reading
- The `create_flashcard_from_article_sentence` service method is the dedicated creation path
- These cards carry no `template_id`; the constraint on the table enforces mutual exclusivity

### Future Fluency System Integration

- `FlashcardEvent` provides performance metrics per word and per sense
- `GapGradingResult` enables per-word performance tracking within multi-gap cards
- Spaced-repetition scheduling will be driven by fluency scores derived from event history

---

## Implementation Phases

### Phase 1: Core Enhanced Flashcard System

- Implement enhanced `Flashcard` domain model with wordbank linkage on both sides
- Replace `answer_text` with `accepted_answers` throughout; migrate existing data
- Implement `GapPosition` with per-gap `accepted_answers`
- Enforce bidirectionality as two rows via the service layer

### Phase 2: Article Extraction Path

- Implement `create_flashcard_from_article_sentence` in `FlashcardService`
- Wire up article sentence retrieval and gap construction
- Surface this in the article reading UI

### Phase 3: AI-Assisted Content Generation

- Integrate AI service for contextual sentence generation; model name from config
- Implement `FlashcardTemplateService` including the admin Flashcard Studio preview flow
- Implement `ai_generated_content` caching

### Phase 4: Advanced Card Types

- Implement verb conjugation flashcards via `VerbConjugationCard`
- Add audio support via `AudioIntegrationService`
- Implement fuzzy grading and multi-gap grading

### Phase 5: Study Session Management

- Implement `FlashcardStudyService`
- Basic spaced-repetition scheduling
- Session summaries and performance analytics

### Phase 6: Integration and Polish

- Integrate with fluency/familiarity system once designed
- Adaptive difficulty adjustment
- Administrative tooling

---

## Backward Compatibility

- Existing flashcards are treated as `card_type: "simple"`, `card_direction: "target_to_source"`
- Where `answer_text` exists in current data, it is migrated to a single-element `accepted_answers` list
- Existing `FlashcardEvent` records remain valid; new columns are nullable
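
The compatibility rules above can be sketched as a small in-memory upgrade step. This is only an illustration under stated assumptions: legacy rows are modelled as plain dictionaries, and `migrate_legacy_row` is a hypothetical helper name, not part of the service layer described in this document.

```python
from __future__ import annotations


def migrate_legacy_row(row: dict) -> dict:
    """Return an upgraded copy of a legacy flashcard row.

    - answer_text becomes a single-element accepted_answers list
    - missing card_type / card_direction fall back to the documented defaults
    The input row is left unmodified.
    """
    # Copy every field except the legacy answer_text column
    migrated = {key: value for key, value in row.items() if key != "answer_text"}

    # Only backfill when the row has no accepted_answers of its own
    if not migrated.get("accepted_answers"):
        answer = row.get("answer_text")
        migrated["accepted_answers"] = [answer] if answer else []

    migrated.setdefault("card_type", "simple")
    migrated.setdefault("card_direction", "target_to_source")
    return migrated


# Example usage with an illustrative legacy row
legacy = {"id": "card-1", "prompt_text": "banque", "answer_text": "bank"}
upgraded = migrate_legacy_row(legacy)
```

Working on a copy keeps the step idempotent: running it again on an already upgraded row leaves `accepted_answers` untouched.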