docs: [api] Update domain.md to explain the domain context

2026-04-14 10:18:16 +01:00 · 2026-04-14 10:18:16 +01:00 · 376757df51
commit 376757df51
parent a5ab9dde21
1 changed files with 65 additions and 7 deletions
--- a/api/docs/domain.md
+++ b/api/docs/domain.md
@ -105,11 +105,17 @@ Key fields: `user_id`, `source_lang`, `target_lang`. Unique per user per directi
 ## The vocab bank
-The vocab bank is the central concept of the system. It is the user's personal list of words they are actively learning.
+The vocab bank is the central concept of the system. It is the user's personal list of words they are actively learning.  Even when words "graduate" to _learned_ or _well known_ by a User, they stay in the vocab bank.
 Each user has their own Vocab bank.
 Items can be put into a Vocab Bank by either the user (e.g. through identifying a word they don't know in some natural language text, translating it in the app, then adding it), or by the system (e.g. by the user selecting predefined "packs" of words).
 ### `LearnableWordBankEntry`
-One row per word or phrase that a user has added to their bank. This is the bridge between the reference dictionary and the user's personal study material.
+Each `LearnableWordBankEntry` represents a word or phrase that a user has added to their bank, i.e. something they have identified that they want to learn.
 This is the bridge between the reference dictionary and the user's personal study material.
 Key fields:
@ -150,16 +156,64 @@ Only entries with `disambiguation_status` of `"auto_resolved"` or `"resolved"` h
 ## Flashcards
-A flashcard is a study card derived from a resolved vocab bank entry. It carries pre-computed prompt and answer text so the study session does not need to re-query the dictionary.
+A flashcard is a study card. Its analogue in the physical world is a piece of paper with writing on both sides.  A learner looks at one side and attempts to recall what is on the other side.  For example, for a French learner, one side would have "to go (v)" and the other would have "aller".
-### `Flashcard`
+At the core of Language Learning App is the idea that Flashcards are a good primitive for improving recall over time.  They should complement, not replace, immersion or exposure to foreign-language text.  They allow users to focus on one thing at a time, as opposed to the more cognitively demanding experience of reading.
 A User can have many Flashcards in their "bank", and flashcards can be arranged into "packs" of themes.  Flashcards can be created in multiple ways:
 1. Users can "open" (i.e. copy) Flashcards in pre-constructed Packs.  These might be, for example, "100 most common French Verbs, infinitive forms" or "Food and ingredients, French Words".  These packs are built and maintained by the system administrators, and it is possible for updates to the parent pack to trickle down to the child Flashcards in a User's account.
 2. Users can generate their own flashcards in the Web App using the dedicated Flashcard Interface.
 3. When a Learner is reading (or listening to) foreign language content they may look up a specific word for translation.  When they do so, they have the chance to automatically create a flashcard.
 4. Users can duplicate pre-existing Flashcards.
 ### Flashcard content
 The idea of a Flashcard starts with its paper analogue, but adds a lot of functionality on and around it to make it maximally useful to the learner.
 For example, a user may be trying to learn a single headword, so the system uses generative AI to generate several pieces of context text, because in real life a word appears in many contexts.
 Furthermore, we use generative AI to generate audio (text-to-speech) to allow the user to hear the word, as well as the wider context text.
 It is possible to have "simple" text flashcards which are _just_ a source language word and a target language word ("to go (v)" -> "aller").  It is also possible to have contextual text in both the source and the target.  E.g. "he wants [to go] to the cinema" -> "il veut [aller] au cinema".
 For these flashcards with more context text, it might be possible to present the user with e.g. "il veut _____ au cinema (to go, v)" as the prompt, as well as the whole original source text.
 It is important to have Text To Speech for both the answer (e.g. "aller") as well as the whole context text ("il veut aller au cinema") because a big part of the premise of Language Learning App is that you can't just learn a language one word at a time.
 We should design our Flashcard model with the idea that more than one element in the context text could be questioned on.  E.g. a user may wish to have "he wants [to go] [to the cinema]" and be presented "il veut _____ __ ______".  Within this single Flashcard we are helping the learner learn a number of words, each linked to separate wordforms and lemmas.
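The multi-blank idea above can be sketched as follows. This is only an illustration, not the project's model: `ClozeSpan` and `render_prompt` are hypothetical names, and storing questioned elements as character offsets into the target text is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClozeSpan:
    """One questioned element of the target-language context text (hypothetical)."""
    start: int  # inclusive character offset into the target text
    end: int    # exclusive character offset into the target text


def render_prompt(target_text: str, spans: list[ClozeSpan]) -> str:
    """Replace every questioned span with underscores, one per hidden character."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: s.start):
        pieces.append(target_text[cursor:span.start])
        hidden = target_text[span.start:span.end]
        # Spaces stay visible so the learner can see how many words are missing.
        pieces.append("".join(" " if ch == " " else "_" for ch in hidden))
        cursor = span.end
    pieces.append(target_text[cursor:])
    return "".join(pieces)


sentence = "il veut aller au cinema"
go = ClozeSpan(sentence.index("aller"), sentence.index("aller") + len("aller"))
place = ClozeSpan(sentence.index("au cinema"), len(sentence))
print(render_prompt(sentence, [go]))
print(render_prompt(sentence, [go, place]))
```

In a real model each span would presumably also reference the wordform or lemma it tests, so that a single Flashcard can feed several vocab bank entries.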
 ### Posing Questions / Prompts
 Presenting just a single word prompt to the user may not be enough to generate an accurate response, especially without context text.
 Notably, many European languages mark gender and tense agreement where English does not.
 For example, consider "went", the simple past of "go".  If you showed a learner "went" and asked for the French translation you may receive multiple viable answers.  "Allé" (the past participle, as in "je suis allé") is the most likely response, but "allai" (simple past, first person singular) is also a possible response.
 Therefore, the cue word for a Flashcard can possibly:
 1. Show the user explicit context: "Went (v, past participle)"
 2. Show the user context text: "Went.  Je suis _____"
 3. Some mixture of the two
 The same is true for plurality and gender on e.g. adjectives: "young" could be "jeune" or "jeunes".
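The three cue options above could be assembled by a small helper along these lines. This is a minimal sketch: `build_cue` and its parameter names are hypothetical, not part of the existing API.

```python
from typing import Optional


def build_cue(word: str,
              grammar_hint: Optional[str] = None,
              context_text: Optional[str] = None) -> str:
    """Combine a cue word with an optional grammar hint and optional context text."""
    cue = word
    if grammar_hint:
        # Option 1: explicit context, e.g. part of speech and form.
        cue = f"{cue} ({grammar_hint})"
    if context_text:
        # Option 2: context text with the answer blanked out.
        cue = f"{cue}. {context_text}"
    return cue


print(build_cue("Went", grammar_hint="v, past participle"))
print(build_cue("Went", context_text="Je suis _____"))
# Option 3: a mixture of the two.
print(build_cue("Went", grammar_hint="v, past participle", context_text="Je suis _____"))
```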
 ### Linking to the Bilingual Dictionary
 Two cards are typically generated per bank entry — one in each direction:
- **`target_to_en`** (recognition): prompt = `lemma.headword` (e.g. `"bisque"`), answer = `sense.gloss` (e.g. `"advantage"`). The learner sees the French word and must produce the English meaning.
+- **`target_to_source`** (recognition): prompt = `lemma.headword` (e.g. `"bisque"`), answer = `sense.gloss` (e.g. `"advantage"`). The learner sees the French word and must produce the English meaning.
- **`en_to_target`** (production): prompt = `sense.gloss` (e.g. `"advantage"`), answer = `lemma.headword` (e.g. `"bisque"`). The learner sees the English meaning and must produce the French word.
+- **`source_to_target`** (production): prompt = `sense.gloss` (e.g. `"advantage"`), answer = `lemma.headword` (e.g. `"bisque"`). The learner sees the English meaning and must produce the French word.
-Key fields: `bank_entry_id`, `user_id`, `source_lang`, `target_lang`, `prompt_text`, `answer_text`, `prompt_context_text` (optional sentence context), `answer_context_text`, `card_direction`, `prompt_modality` (`"text"` or `"audio"`).
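As a rough illustration of the two directions described above, the pair of cards for one bank entry might be derived like this. `Card` and `cards_for_entry` are hypothetical names, and the field names `card_direction`, `prompt_text` and `answer_text` are borrowed from the earlier key-fields list, so they may not match the current model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """Illustrative stand-in for a flashcard row (not the real model)."""
    card_direction: str
    prompt_text: str
    answer_text: str


def cards_for_entry(headword: str, gloss: str) -> list[Card]:
    """Build one recognition card and one production card for a bank entry."""
    return [
        # Recognition: learner sees the target-language word, produces the meaning.
        Card("target_to_source", prompt_text=headword, answer_text=gloss),
        # Production: learner sees the meaning, produces the target-language word.
        Card("source_to_target", prompt_text=gloss, answer_text=headword),
    ]


for card in cards_for_entry("bisque", "advantage"):
    print(card.card_direction, card.prompt_text, "->", card.answer_text)
```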
+## Fluency, familiarity, and struggle
 Ideally, over time, a User becomes familiar with words in their Word Bank.  They will do this through e.g. Flashcards, and also possibly through exposure to the word in Articles and natural language content they generate.
 It is also possible that a user consistently struggles with a certain word in a vocab bank, or a certain class of words (e.g. use of the subjunctive mood).
 The System takes an event-driven approach to recording fluency, with periodic roll-ups or aggregations of state to represent a learner's familiarity.  The exact nature of this system hasn't been thought through or designed yet.
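Since the design is still open, the following is only one possible shape for an event-plus-roll-up approach. Everything here is an assumption made for illustration: the `StudyEvent` record, its fields, and the choice of per-word accuracy as the aggregated state.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StudyEvent:
    """Hypothetical immutable record of one answer attempt."""
    word_id: int
    correct: bool
    occurred_at: datetime


def roll_up(events: list[StudyEvent]) -> dict[int, dict[str, float]]:
    """Aggregate raw events into per-word attempt counts and accuracy."""
    totals = defaultdict(lambda: {"attempts": 0, "correct": 0})
    for event in events:
        totals[event.word_id]["attempts"] += 1
        totals[event.word_id]["correct"] += int(event.correct)
    return {
        word_id: {**t, "accuracy": t["correct"] / t["attempts"]}
        for word_id, t in totals.items()
    }


now = datetime.now(timezone.utc)
events = [
    StudyEvent(word_id=1, correct=True, occurred_at=now),
    StudyEvent(word_id=1, correct=False, occurred_at=now),
    StudyEvent(word_id=2, correct=True, occurred_at=now),
]
print(roll_up(events))
```

Because the events themselves are never modified, a roll-up like this can be recomputed at any time if the definition of familiarity changes.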
 ### `FlashcardEvent`
@ -172,6 +226,10 @@ Event types:
 The spaced-repetition scheduling algorithm (not yet implemented) will consume these events to determine when each card should next be shown.
 ### `TranslatedArticleEvent`
 These are immutable records of something that happened with regard to an article.  For example, the user marked an article as read or played, loaded a TranslatedArticle in the WebUI which contained a word, or attempted to translate a word.
 ---
 ## NLP pipeline integration