Key Takeaways

  • Published an e-learning course on Coursera that reached 100,000 students
  • Greatly simplified audio content production and streamlined workflow
  • Trained the TTS engine to pronounce complicated medical terminology

Founded in 1858, the Stanford University School of Medicine (Stanford MEdIC) is one of the nation’s most respected medical institutions. It has a long legacy of providing healthcare to millions and democratizing access to high-quality health education through a global network of health professionals and researchers. Its mandate includes creating professional education resources such as online graduate certificate programs and online certificate courses.

An initiative of Stanford MEdIC, Digital MEdIC was launched in 2016 to create credible e-learning experiences and provide digital expertise to NGOs, academic institutions, and government agencies addressing the world’s health crises. Its content reaches an audience of 1.5 million across 140 countries.

At the onset of the COVID-19 pandemic, Stanford MEdIC redirected its priorities to raising awareness of the disease and stimulating further virological study by sharing online resources. As the pandemic spread worldwide, Stanford MEdIC had to tailor its content to an increasingly multilingual audience, which introduced new challenges for the content production team.

Running a localized, multilingual e-learning course

Stanford MEdIC’s Coursera course, “COVID-19 Training for Healthcare Workers,” was a pioneering effort led by Dr. Matthew Stellow (Associate Professor of Emergency Medicine) and Dr. S.V. Mahadevan (Professor of Emergency Medicine). The course’s goal was to spread knowledge of COVID-19 symptomatology, prevention techniques for healthcare professionals in the field, methods for immediate care, and advanced patient management practices for stabilizing patients in severe respiratory distress.

“COVID-19 is rapidly spreading across the globe and all providers must be prepared to recognize, stabilize and treat patients with novel coronavirus infection. Following completion of this short course physicians, nurses, and other healthcare professionals will have a unified, evidenced-based approach to saving the lives of patients with COVID-19, including those who are critically ill.” - Coursera Course Introduction

Even on its own, traditional e-learning content creation involves several costly and time-consuming steps: ideation, copywriting, audio and video recording, post-production editing, and publishing. The Stanford MEdIC team’s main frustration was creating engaging content that could be localized efficiently for target communities, in their case the Latin American community.

Localizing voiceover content originally produced in English has several prerequisites. An organization not only has to go through a separate process of targeted research and translation, but may also have to source freelance voice actors. That can mean sampling hundreds of portfolios, negotiating rates, and directing audio sessions. Rates for freelance actors can also be prohibitively high, and they vary with the client’s chosen marketing channel and the demand for and availability of a particular freelancer. A 5-minute voiceover can reportedly cost up to $499, with an added premium of $749 if the audio is used in commercial broadcasts. Stanford MEdIC clearly needed a more financially scalable way to expand its content, especially with a total runtime of approximately 6 hours.
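To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the quoted 5-minute rate scales linearly across the full runtime and that the broadcast premium is charged per 5-minute segment on top of the base rate; real freelance pricing rarely works this neatly, and the function name and parameters are hypothetical.

```python
import math


def estimate_voiceover_cost(total_minutes, segment_minutes=5,
                            rate_per_segment=499, premium_per_segment=0):
    """Estimate a freelance voiceover bill under a linear-scaling assumption.

    All defaults come from the figures quoted in this article; the linear
    scaling itself is an assumption, not a quoted pricing model.
    """
    # Round up so a partial segment is billed as a full one.
    segments = math.ceil(total_minutes / segment_minutes)
    return segments * (rate_per_segment + premium_per_segment)


runtime_minutes = 6 * 60  # roughly 6 hours of course audio

base_cost = estimate_voiceover_cost(runtime_minutes)
broadcast_cost = estimate_voiceover_cost(runtime_minutes, premium_per_segment=749)

print(f"Base estimate: ${base_cost:,}")
print(f"With broadcast premium: ${broadcast_cost:,}")
```

Under these assumptions, 6 hours of audio breaks down into 72 five-minute segments, so even the base estimate lands in the tens of thousands of dollars for a single language.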


By leveraging LOVO Genny, the Stanford MEdIC team gained access to more than 400 human-like AI voices in over 100 languages. The tool simplified their production workflow and resolved their localization struggle: all they needed to do was type Spanish text into the Genny interface and download the voiceover file. Editing over 6 hours of audio content in Genny became as simple as copying and pasting text.

While the team was initially skeptical about the quality of artificial voiceovers, the course’s reception put those worries to rest. With an aggregate rating of 99% from 3,549 ratings on the Coursera platform, the Spanish voiceovers provided by LOVO were well received and easily intelligible. In the end, the course reached over 11,000 enrollments and 100,000 participants.

“Tratamiento del paciente con disnea moderada – Parte 1” by Digital Medic at Stanford University

Pronouncing specialized terminology

“Pronunciation Editor” by LOVO

Some words are hard to pronounce, especially in medicine, where accurate pronunciation matters for healthcare professionals to communicate requests and information smoothly. Terms like sphenopalatine ganglioneuralgia, fasciculation, or muscae volitantes can be too much for a typical TTS engine.

LOVO Genny, however, provided the Stanford MEdIC team with a suite of advanced customizations: specific, second-by-second cuts and edits, speed adjustments, and manual control over a voice’s emotional output. It also includes a pronunciation editor that let the team train LOVO’s engine to pronounce complicated terminology and proper nouns, keeping the course’s audio delivery concise, clear, and accurate.

All the pronunciation editor requires is the original term or phrase and its correct pronunciation, spelled out with phonetically sound letter combinations. The Help Center offers guides and resources, including the user manual, with a fuller explanation of the interface and examples of pronunciation corrections.

Once a pronunciation edit is made, the engine applies the change, and the user can type their script naturally from then on without further delays or mispronunciations.
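For readers curious about the general idea, pairing a term with a phonetic respelling can be sketched as a simple text substitution applied to a script before it is sent to a TTS engine. This is only an illustration of the concept, not LOVO's implementation or API; the function name, the lexicon format, and the example respellings below are all hypothetical.

```python
import re


def apply_pronunciations(script, lexicon):
    """Replace each known term in `script` with its phonetic respelling.

    `lexicon` maps a term to a respelling. Matching is case-insensitive
    and only replaces whole words or phrases.
    """
    if not lexicon:
        return script
    # Try longer terms first so multi-word phrases win over shorter entries.
    terms = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )
    by_lowercase = {term.lower(): spelling for term, spelling in lexicon.items()}
    return pattern.sub(lambda match: by_lowercase[match.group(0).lower()], script)


# Hypothetical respellings, for illustration only.
lexicon = {
    "fasciculation": "fuh-sik-yoo-LAY-shun",
    "muscae volitantes": "MUS-kee vol-ih-TAN-teez",
}

print(apply_pronunciations("Fasciculation may occur in the eyelid.", lexicon))
```

Defining the entry once and reusing it on every script mirrors the workflow described above: after the correction is saved, the writer no longer has to think about that term.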

In summary, LOVO Genny gave Stanford MEdIC a powerful TTS engine for its e-learning courses, helping spread critical information on COVID-19.

If you’d like to read more about LOVO Genny and its many use cases, visit our Blog to find out why over 300,000 users rely on LOVO to accelerate their audio content production.

Make sure to check out these other use cases as well:
What Would You Do If You Can Speak 100 Languages?
How Schools and Companies Can Leverage TTS