Kolja Sam Pluemer

March 9, 2025

Interdependent Flashcards in Language Learning — Why and Where

Using digital flashcards with spaced repetition software (SRS) is an incredibly powerful method for language learning, or second language acquisition (SLA). While scheduling flashcards is now getting pretty good thanks to FSRS, there is one topic that needs R&D: interdependent flashcards. There are quite a lot of scenarios in language acquisition where the result of one flashcard interaction should influence the learning state of other flashcards. Most often, these scenarios cannot be represented in current software. This post gives an overview of where interdependent flashcards are needed in language learning and some pointers on implementation requirements.

Topics and Concepts

Say the learner is studying Egyptian Arabic verb conjugations in some flashcard app like Anki. They answer Hard for the word "مَتِسْتَخْدِمْش" — "don't use". This tells you not only how hard this word in itself was to remember for the learner but also about the perceived difficulty of:

using imperative
using imperative addressing a male person
using negation
using negated imperative
putting "use" in imperative
negating "use"
remembering the base form of "use"

Ideally, an SRS would also log these, so the learner can specifically study topics that are hard for them (e.g. negated imperative). This is something that I'd really love to see.
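
As a rough illustration of what such logging could look like, here is a minimal Python sketch. The TopicLog class, the topic labels and the 1 to 4 rating scale are assumptions made up for this example; none of them are part of Anki or any existing SRS.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical rating scale, loosely modeled on four answer buttons.
RATINGS = {"again": 1, "hard": 2, "good": 3, "easy": 4}


@dataclass
class TopicLog:
    """Collects ratings per topic so weak topics can be found later."""
    ratings: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, topics, rating):
        # One answer counts as an observation for every topic on the card.
        score = RATINGS[rating]
        for topic in topics:
            self.ratings[topic].append(score)

    def mean(self, topic):
        scores = self.ratings[topic]
        return sum(scores) / len(scores)

    def hardest(self, n=3):
        # Lowest average rating first; ties are broken alphabetically.
        return sorted(self.ratings, key=lambda t: (self.mean(t), t))[:n]


log = TopicLog()
log.record(["imperative", "negation", "negated imperative", "verb: use"], "hard")
log.record(["imperative", "verb: write"], "good")
log.record(["negation", "verb: write"], "easy")
print(log.hardest(2))
```

Averaging raw ratings is the crudest possible statistic. A real system would probably want to weight recent answers more heavily, but the data model (one answer, many topics) stays the same.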

Redundantly Learning the Same Thing

In almost all SLA scenarios, it's recommended to practice vocab both from native language to target language and vice versa. This creates an interesting scenario: Does the SR data from "apple → Apfel" predict how the user will do on "Apfel → apple"? Probably to some degree, though hardly perfectly.

This is just the simplest form of a much broader concept: How about we add cards where the user connects a picture of an apple with "Apfel", or a sound file (both of course bidirectionally)? Suddenly, you have a combinatorial explosion of possible relationships between flashcards. Also, the need for tracking these relationships becomes evident from a UX angle: Imagine having memorized "🍎 → Apfel" perfectly, but the SRS mercilessly presents you with 17 additional flashcards relating to apples, which are now all trivial. That'd be pretty lame.
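
One way to picture the tracking is to let every card point at its underlying learning item and let part of a review's effect spill over to sibling cards. The following Python sketch assumes a toy scheduler (double the interval on success, halve it on failure) and a made-up `spill` factor. It is not how FSRS or Anki work.

```python
from dataclasses import dataclass


@dataclass
class Card:
    prompt: str
    answer: str
    item: str                   # the underlying learning item, e.g. "apple"
    interval_days: float = 1.0


def review(cards, reviewed, success, spill=0.5):
    """Update the reviewed card and let part of the effect reach its siblings.

    `spill` is a hypothetical transfer factor: 0 treats cards as independent,
    1 treats every sibling as if it had been reviewed itself.
    """
    factor = 2.0 if success else 0.5   # toy scheduler, assumption only
    for card in cards:
        if card.item != reviewed.item:
            continue                   # unrelated item, leave untouched
        if card is reviewed:
            card.interval_days *= factor
        else:
            # Siblings move only part of the way (interpolated in log space).
            card.interval_days *= factor ** spill


cards = [
    Card("apple", "Apfel", item="apple"),
    Card("Apfel", "apple", item="apple"),
    Card("🍎", "Apfel", item="apple"),
    Card("sun", "Sonne", item="sun"),
]
review(cards, cards[0], success=True)
for card in cards:
    print(card.prompt, round(card.interval_days, 2))
```

A single global `spill` value is certainly too simple. How much "apple → Apfel" really says about "Apfel → apple" is exactly the open question, and the factor would likely have to be learned per pair of card types.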

Cloze Deletion

The method of ＿ deletion is an extreme variation of the problem described above. Imagine you want to learn the idiom del senno di poi son piene le fosse, using progressive cloze deletion. It may go something like this:

del senn＿ di poi son piene le foss＿ (practicing the gendered word endings)
del senno ＿ poi son ＿ le fosse (taking out key phrases)
the one about hindsight: del ＿＿＿＿＿＿＿ (just a memory anchor and the first word)

Obviously, all of these flashcards relate to the idiom, and how often you want to see the flashcards depends on how well you memorized the idiom itself. What's more, they get harder in a pretty obvious way: The more of the sentence you "cloze out", the harder it is. This leads to the intriguing idea of letting a computer generate cloze challenges for you, and this is what makes cloze deletion special. If you allow every possible cloze to be generated (from taking out one letter to taking out everything but one letter), you end up with thousands or even millions of flashcards.
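
To get a feeling for the numbers, here is a small Python sketch that generates every cloze at the word level. Working on whole words instead of letters is a simplification chosen for this example, and the `clozes` function is hypothetical.

```python
from itertools import combinations


def clozes(sentence, blank="＿"):
    """Yield (number_of_blanks, text) for every word-level cloze."""
    words = sentence.split()
    # Hide 1 .. len(words) - 1 words, so at least one word stays visible.
    for k in range(1, len(words)):
        for hidden in combinations(range(len(words)), k):
            text = " ".join(
                blank if i in hidden else word for i, word in enumerate(words)
            )
            yield k, text


idiom = "del senno di poi son piene le fosse"
all_clozes = list(clozes(idiom))
print(len(all_clozes))   # 2**8 - 2 = 254 for the eight words
print(all_clozes[0])
print(all_clozes[-1])
```

The eight words already give 254 cards. At the letter level, the same counting gives 2**28 - 2 for the 28 letters of the idiom, which is roughly 268 million, so storing each cloze as its own card quickly stops being practical. The number of blanks `k` also doubles as a crude difficulty estimate.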

Seeing every single flashcard as an individual entity with attached learning data (as is the wont of most SRS) may not even be the right paradigm here.

Progressive Difficulty

Another thing I touched upon in the previous paragraph is the idea that flashcards have difficulty and difficulty dependencies — we only want to start practicing "Äpfel" after getting "Apfel", for example.

The same thing can be done with topics, say, mastering simple past before past progressive.
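
Such dependencies can be written down as a simple prerequisite map. The Python sketch below is only an assumption about how this might be modeled: the `PREREQS` table and the `introducible` function are invented for illustration, and "mastered" is left as a plain set without saying how mastery would be decided.

```python
# Hypothetical prerequisite map: item -> items that must be mastered first.
PREREQS = {
    "Apfel": set(),
    "Äpfel": {"Apfel"},
    "simple past": set(),
    "past progressive": {"simple past"},
}


def introducible(prereqs, mastered):
    """Return items not yet mastered whose prerequisites are all mastered."""
    return sorted(
        item
        for item, needs in prereqs.items()
        if item not in mastered and needs <= mastered   # subset test
    )


print(introducible(PREREQS, set()))
print(introducible(PREREQS, {"Apfel"}))
```

With nothing mastered, only "Apfel" and "simple past" are offered. Once "Apfel" is in the mastered set, "Äpfel" becomes available too.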

In any case, this is mostly about introducing flashcards, which is a whole other topic that deserves more love, and likely its own post.

Attacking Several Learning Items with One Flashcard

This is sort of what I've been proposing in the first chapter, but it's relevant even if we ignore the complexity that comes with topics. For example, imagine the following flashcards:

A multiple choice card like "the sun --- soleil | fenêtre"
"Form a sentence with the following words: fenêtre + soleil"

In either case, the user's answer tells us something about their knowledge of both learning items: window and sun. We can actually implement this easily in just about any SR system. The interesting part, however, is that we cannot say with high confidence which of the items confounded the user in case of a struggle: were they confused about the sun, the window, or both?

This would be interesting to model, and I think it is possible when enough data is collected. Of course, intriguing dynamic possibilities emerge, like prompting the user to form a sentence with the two or three words that would be due anyway and adapting the SR data for all of them.
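
Until such a model exists, a simple heuristic could stand in: after a failed multi-item card, blame each item in proportion to how weak we already believed it to be. The Python sketch below assumes we have a recall probability estimate per item. The `split_blame` function and the example numbers are invented for illustration.

```python
def split_blame(strengths):
    """Distribute blame for a failed multi-item card.

    `strengths` maps each item to an estimated recall probability (0..1).
    Items believed to be weaker receive a larger share of the blame.
    This proportional rule is only one of many possible heuristics.
    """
    weakness = {item: 1.0 - p for item, p in strengths.items()}
    total = sum(weakness.values())
    if total == 0:
        # No prior information to go on: blame every item equally.
        return {item: 1.0 / len(strengths) for item in strengths}
    return {item: w / total for item, w in weakness.items()}


# Made-up estimates: "soleil" is well known, "fenêtre" is shaky.
blame = split_blame({"soleil": 0.9, "fenêtre": 0.6})
print(blame)
```

Here "fenêtre" takes about 80% of the blame and "soleil" about 20%. The shares could then scale how strongly the failure affects each item's schedule.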

Weird Relationships

Finally, I want to point out that strange relationships between language learning items exist everywhere: Learners may confuse a learning item with a specific other one every time. Does it help to know the antonym of a word? What implications does it have for the synonyms of the word I failed to remember? How do we model synonyms, anyway? If we're learning the script of a language, do we abstract the letters in a word? The digraphs, the bigrams? Phonemes?

What matters, what doesn't?

Closing Words

A lot of this stuff may not matter so much and may be answered with, "just do your daily hour of Anki". Yet I think there is real potential here, potential that can be used now that we have computers instead of physical Leitner boxes. I have a feeling that Intelligent Tutoring Systems explored a lot of these ideas already decades ago — this is literature I still have to read in more detail. Will do.

Let's see what FSRS, Anki and SuperMemo can do in the future.