The Accountant. Stephen Hawking. Morgan Freeman. Clubhouse.

What do these four have in common? Yes, you guessed it: VOICE.

“Heavy Sigh”… What’s the Plan?

The 2016 movie stars Ben Affleck as Christian Wolff, a martial arts-trained CPA with high-functioning autism, and features Alison Wright as Justine, who goes by the moniker “The Voice” and uses text-to-speech to communicate with clients and with Christian. Stephen Hawking brought the speech synthesizer to global awareness; many people considered his robotic voice, among other traits, a core part of the late genius’ character. We’ve all had that one friend who prided themselves on doing the best Morgan Freeman impression, and Clubhouse… well, that’s probably the loudest bandwagon I’ve heard in a while.

Hawking and The Accountant represent the most fundamental use of synthetic voice: a basic means of communication. It was so rudimentary that “The Voice” in The Accountant reads “heavy sigh” out loud. Singers saw business value in enhancing their voices, while people on Clubhouse and the folks mimicking Morgan Freeman are finding entertainment value. Since The Accountant, we’ve seen synthesized voices play many roles in Hollywood, whether as a character (Jarvis in Iron Man, although the voice is actually performed by a human actor) or as the necessary audio behind the scenes (the Saw series).

This evolution of synthetic voice from a barebones single-purpose tool to a fleshed-out multi-purpose application came from significant improvements in the quality of the voices and the supporting technology, and from a society that is increasingly welcoming of synthetic media in general. In the visually crowded world we live in, audio has always been secondary. But platforms like Spotify, podcast apps, and most recently Clubhouse are liberating audio from its chains, and the newer generations seem to be all ears.

Voice is Heavy

So, what’s all this noise with Clubhouse?

There have been numerous social platforms emphasizing community, so why the clamor now? It’s because voice has traditionally been very “heavy”.

Voices pack a punch: they add depth to the text they carry, but they also convey meaning beyond the words. A voice can send shivers down the spine (hence the fandom for ASMR) or perk up ears (imagine a puppy hearing you call out). It is emotional and powerful, and people grow strongly attached to it. The DECtalk speech synthesizer, more famously known as Dr. Hawking’s voice, was based on the voice of its creator, Dennis H. Klatt. When Speech Plus offered Hawking an improved synthesizer with a different voice in 1988, he asked them to replace it with the original “Perfect Paul” voice recorded by Klatt himself. The synthetic voice had become a part of his identity.

But voice is also heavy in that it is cumbersome: you see people on the subway reading texts or watching YouTube on mute, but without earphones, your consumption of audio is limited. Plenty of people have phone phobia, but far fewer are afraid of reading texts or communicating over email, and information is absorbed much more slowly from spoken words than from written text. I still take a couple of minutes to gather my thoughts before answering a phone call, and when it comes to speaking in front of any number of people, you bet I’m prepping beforehand.

Homo Sapiens Vox

But when Clubhouse opened its doors, people embraced it with an unparalleled eagerness to talk and listen to each other. What made it so different from Chatroulette, WeChat groups, and other audio/video community players that have come and gone is that it made audio light in a not-so-shallow way. It was light in that people could listen on mute and slip in and out silently, but meaningful in that the invite-only structure, along with rapid participation from high-level industry leaders, led people to maintain a sense of professionalism, with each room built with a specific purpose in mind.

Combine this with the loss of in-person conversations due to COVID, and we have people hungry to talk, to chat, to get their voices heard. Watch out, visual world, here come the Homo Sapiens Vox – and their AI.

And if you are a Homo Sapiens Vox, you should give Genny a whirl.