An image of many different flags as buttons on a keyboard.

Duolingo’s Odd Beginnings as a Translator

Duolingo is pretty much synonymous with language learning these days. I’ve certainly spent my fair share of time being accosted by the owl and his friends at all hours of the day. And whether I’m on a bus, chilling in a coffee shop, or even one time dancing in a club, I’ve heard others practicing through those telltale Duolingo sounds - the happy ding for a right answer or the sad thump of a wrong one. Seriously, Duolingo is all over the place.

Duolingo’s popularity isn’t without reason. As of writing, you can download the app and start learning any one of over 40 languages completely for free. Duolingo keeps the app free by running ads and offering a premium subscription, but did you know that the minds behind the owl originally wanted to fund free language education through… translating internet articles?

It All Starts with a CAPTCHA

Our story about Duolingo as a translator starts with CAPTCHAs. You know, those check boxes that ask if you’re a human or not.

Nowadays, CAPTCHA is very sophisticated and often works in the background completely unnoticed, but this is a very far cry from where it started. In the early to mid-2000s, CAPTCHAs would present internet users with a blurry picture of a word and ask them what it said. The idea was that, at that point, computer programs couldn’t understand the blurry pictures, but humans could. So if you gave the correct answer, you were probably a human.

One of the creators of CAPTCHA was a Guatemalan-born graduate student at Carnegie Mellon University in Pittsburgh named Luis von Ahn. Alongside his interest in computer security, von Ahn became interested in human computation, or the idea that humans could collaborate to act as a sort of computer to solve a problem. Oddly enough, CAPTCHA was a great platform for von Ahn to test out his idea.

von Ahn identified a problem he believed could be solved through human computation. Countless books had been scanned as images but never converted to machine-readable text, because the optical character recognition (OCR) software of the time could not decipher many of the words. Believe it or not, von Ahn was able to use CAPTCHAs to digitize these books. He developed an enhanced version, called reCAPTCHA, which showed internet users two blurry words instead of just one. The first was a control word whose correct answer was already known, while the second was a word the software had failed to read. Websites could still confirm that a user was human by checking the control word. Simultaneously, reCAPTCHA could work out the unknown word from the most common responses. For example, if 75% of users thought that the blurry word was “owl”, then reCAPTCHA figured that it was probably the correct word. Repeating this process word by word, reCAPTCHA could convert entire books to text.
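To make the two-word trick concrete, here is a minimal Python sketch of the idea. It is not reCAPTCHA’s actual code: the function names, the `min_votes` cutoff, and the exact matching rules are assumptions made for illustration, and the 75% threshold is simply borrowed from the example above.

```python
from collections import Counter

def check_captcha(control_answer, user_control, user_unknown, votes):
    """Verify the user with the known word; if they pass, record their guess for the unknown word."""
    if user_control.strip().lower() != control_answer.lower():
        return False  # wrong on the known word: treat as a bot and discard the guess
    votes[user_unknown.strip().lower()] += 1
    return True

def resolve_unknown(votes, threshold=0.75, min_votes=4):
    """Return the consensus reading once one answer holds the threshold share of votes, else None."""
    total = sum(votes.values())
    if total < min_votes:  # hypothetical cutoff: too few answers to trust yet
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= threshold else None

# Five users see the control word "bird" next to an unreadable word.
votes = Counter()
answers = [("bird", "owl"), ("bird", "Owl"), ("bird", "ow1"), ("brid", "cat"), ("bird", "owl")]
for control_guess, unknown_guess in answers:
    check_captcha("bird", control_guess, unknown_guess, votes)

print(resolve_unknown(votes))  # owl
```

The user who mistyped the control word is ignored entirely, so only the four people who proved they were human get a vote, and three of those four agree on “owl”.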

This method proved to be incredibly effective and von Ahn’s reCAPTCHA ended up digitizing the entire archive of the New York Times this way. So, how does this relate to Duolingo? Inspired by the results, von Ahn sought to address another issue he believed could be tackled similarly - translating the internet. Around 2011, he and his graduate student, Severin Hacker, started developing Duolingo as a tool to do just this.

Duolingo as a Translator

With the ubiquity of translation tools like Google Translate, you may be wondering why Luis von Ahn even thought of using human computation to translate the Internet. In 2011, when von Ahn and Hacker started Duolingo, online translation software wasn’t as reliable as it is today. It wasn’t until 2016 that Google moved Google Translate to a neural machine translation system, the basis of the much more capable software we know today. von Ahn and Hacker thought that they could do better than Google Translate using human computation.

In the early days of Duolingo, its app had two sections. The first was a lesson section that taught users a new language. This section is probably what you think of when you think of Duolingo. It has a skill tree that teaches you different words and grammar. As you get better at it, the lessons get more complicated, and you learn more. The second section, meanwhile, was a practice section, and this was where the translation happened. Duolingo would prompt users to translate real sentences from various internet articles as practice, then treat the most frequently submitted translation as the correct answer. This approach mirrors how reCAPTCHA relied on the most common answers for unclear words.
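Picking the most frequently submitted translation can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Duolingo’s real pipeline, which was presumably more elaborate. The helper names and the normalization rules (ignoring case, extra spaces, and final punctuation) are assumptions.

```python
import re
from collections import Counter

def normalize(sentence):
    """Reduce a sentence to a comparable form: lowercase, single spaces, no final punctuation."""
    collapsed = re.sub(r"\s+", " ", sentence.strip().lower())
    return collapsed.rstrip(".!?")

def pick_translation(submissions):
    """Return the first submission whose normalized form was submitted most often, or None if empty."""
    if not submissions:
        return None
    counts = Counter(normalize(s) for s in submissions)
    best_form, _ = counts.most_common(1)[0]
    return next(s for s in submissions if normalize(s) == best_form)

submissions = [
    "The cat sleeps.",
    "the cat  sleeps",
    "The cat is asleep.",
    "The cat sleeps.",
]
print(pick_translation(submissions))  # The cat sleeps.
```

Normalizing first means that learners who agree on the wording but differ on capitalization or a final period still count as voting for the same translation.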

Duolingo managed to get contracts with companies like CNN and BuzzFeed to translate their websites using this method. So did it work? Based on some examples from both Luis von Ahn’s TED talk that introduced Duolingo and a paper evaluating Duolingo written by Ignacio Garcia, a lecturer at Western Sydney University, it seemed to work reasonably well compared to Google Translate, but not perfectly.

In his TED talk, Luis von Ahn gave the following example:

  • Original Sentence (German): Falls Pakistans Geschichte ein Indikator ist, so könnte Musharrafs Entscheidung, das Kriegsrecht zu verhängen, jener sprichwörtliche Tropfen sein, der das Fass zum Überlaufen bringt.
  • Professional Translation (English): If Pakistan’s history is any indicator, Musharraf’s decision to impose martial law may prove to be the proverbial straw that breaks the camel’s back.
  • Duolingo’s Translation (English): If Pakistan’s history is an indicator, Musharraf’s decision to impose martial law could be the proverbial straw that breaks the camel’s back.

Garcia’s example, on the other hand, showed that there was still room to improve Duolingo’s translation services:

  • Original Sentence (Spanish): Reconozco que la calculadora iPod canta un poco pero el podómetro Shuffle está muy conseguido (aunque sea un poco más grande)
  • Professional Translation (English): I admit the iPod calculator sucks a bit, but the Shuffle pedometer turned out quite well (even though it’s a little bigger).
  • Duolingo’s Translation (English): I recognize that the iPod calculator sings a little but the pedometer Shuffle is very achieved (even if it is a bit more large).

However well it ended up working, Duolingo eventually abandoned this business and began to focus more on advertising and premium subscriptions. Perhaps it was the improvement in services like Google Translate or that there just wasn’t enough money for translation. By the time the company went public in 2021, its SEC filing didn’t mention translation as part of its core business.

Translation for Language Learning

The last question that I had when doing research for this article was whether Duolingo’s model of translating sentences between your target language and native language actually helps you practice. Duolingo commissioned a study in its early days that suggested that 34 hours of using Duolingo is the equivalent of one semester of a college-level course. However, given that Duolingo funded this study, the results may be somewhat biased.

The question of whether or not translation is a valuable method for learning a new language is a bit of a hot topic in academia. Traditionally, language teachers have tended to steer clear of asking students to directly translate sentences. Chris Lonsdale, a linguist and author of the book The Third Ear, believes that language acquisition should focus primarily on the meaning of what you’re saying as opposed to translating your target language back to your native language. He even gave a TED talk explaining how he used this method to learn Chinese in six months.

However, there has been a bit of a shift in this mentality in the last decade or so. Several studies have suggested that translation, especially in the early days of learning a new language, can be a very useful tool for practicing a language. For example, a study by Pilar Munday, a former Spanish professor at Sacred Heart University in Connecticut, found a huge increase in Spanish test scores for students who practiced a little bit each day using Duolingo in addition to taking formal classes. Furthermore, she noticed that students who practiced every day did better than students who did the same number of lessons in fewer days.

In general, it seems like anything that gets you to practice and use the language every day is a great tool for learning a language!
