Imagine walking into a doctor's clinic to have that cut on your tongue, the one that just isn't healing, examined. The biopsy confirms the presence of cancerous cells. At the next meeting with your oncologist, your worst nightmare comes true: two-thirds of your tongue will have to be removed, and with it you will lose your voice.
Millions of people around the world lose their voice every year to diseases such as ALS and oral cancers. The question arises: can technology help these patients get their voice back?
If you've watched the movie 'The Theory of Everything', the first image that pops into your mind at the words 'artificial voice box' is probably Stephen Hawking delivering lectures through his text-to-speech machine in a 'robotic' voice. Now imagine hundreds of patients using that same robotic voice to communicate, with no intonation and no eloquence.
Your voice is unique to you and is part of your identity. VocalID recognised the need to preserve voice in order to give the vocally impaired a customised, bespoke voice. VocalID set up the first ever 'Human Voicebank', which crowdsources voices: anyone with an internet connection can donate or preserve their voice, and the recordings are catalogued in its library.
How Donor Voice Is Used:
Donors can record two to three hours of their voice from the comfort of their homes. VocalID uses AI to synthesize a digital voice by combining a 'source' with a 'filter', the two halves of the classic source-filter model of speech. The source is the raw sound produced by the vocal folds in the larynx; it carries pitch, loudness and vocal quality, which is why even a laugh sounds like you. The filter is the vocal tract, the throat, tongue and lips, which shapes that raw sound into recognisable vowels and consonants. The patient or client supplies the source, often no more than a sustained vowel they can still produce, while the donor's recordings supply the filter. VocalID blends the two to create a bespoke voice, matching each donor to a patient by age, gender and other characteristics.
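To make the source-filter idea concrete, here is a minimal sketch (not VocalID's actual pipeline, and with illustrative textbook numbers) of how a pulse-train 'source' can be shaped by resonators at vowel formant frequencies, the role the 'filter' plays:

```python
# Minimal source-filter sketch: a glottal-like pulse train (source) shaped by
# two-pole resonators at formant frequencies (filter). The formant values are
# illustrative textbook figures for the vowel /a/, not data from VocalID.
import numpy as np
from scipy.signal import lfilter

SR = 16000        # sample rate, Hz
F0 = 120          # pitch of the source, Hz
DURATION = 0.5    # seconds of audio

def glottal_source(f0, duration, sr):
    """Impulse train at the pitch period: a crude stand-in for vocal-fold pulses."""
    source = np.zeros(int(duration * sr))
    source[::int(sr / f0)] = 1.0
    return source

def formant_filter(signal, formants, bandwidths, sr):
    """Shape the source with cascaded two-pole resonators, one per formant."""
    out = signal
    for f, bw in zip(formants, bandwidths):
        r = np.exp(-np.pi * bw / sr)                  # pole radius from bandwidth
        theta = 2 * np.pi * f / sr                    # pole angle from formant frequency
        a = [1.0, -2 * r * np.cos(theta), r * r]      # resonator denominator
        out = lfilter([1.0 - r], a, out)              # rough gain scaling
    return out

source = glottal_source(F0, DURATION, SR)
speech = formant_filter(source, [700, 1200, 2600], [80, 90, 120], SR)  # vowel /a/
speech /= np.max(np.abs(speech))   # normalise before saving or playback
```

Changing the pitch and amplitude of the pulse train changes who the voice sounds like, while changing the formants changes which vowel is being said, which is roughly the division of labour described in the paragraph above.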
The Technique Of Voice Preservation:
Clients or patients record two to three hours of their own speech, and the AI learns the vowels, the intonation and the other characteristics that can later be used to generate speech through a text-to-speech device. The difference is that the device now produces the patient's own voice instead of a pre-programmed robotic one.
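One of the things the system has to capture from those recordings is intonation, i.e. how the speaker's pitch moves over time. As a rough illustration (not VocalID's code), the sketch below estimates a pitch contour from a recording with simple frame-by-frame autocorrelation; this is the kind of feature a personalised text-to-speech voice can draw on.

```python
# Rough illustration (not VocalID's code): estimate a pitch contour from a
# voice recording using frame-by-frame autocorrelation.
import numpy as np

def frame_pitch(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame via autocorrelation."""
    frame = frame - np.mean(frame)
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)   # lag range for plausible pitches
    if hi >= len(corr) or corr[0] <= 0:
        return 0.0                            # frame too short or silent
    lag = lo + np.argmax(corr[lo:hi])
    return sr / lag

def pitch_contour(signal, sr, frame_len=0.03, hop=0.01):
    """Slide a window over the recording and collect per-frame pitch estimates."""
    n, step = int(frame_len * sr), int(hop * sr)
    return np.array([frame_pitch(signal[i:i + n], sr)
                     for i in range(0, len(signal) - n, step)])

# Demo on a synthetic 150 Hz tone standing in for a recorded vowel.
sr = 16000
t = np.arange(0, 1.0, 1 / sr)
print(pitch_contour(np.sin(2 * np.pi * 150 * t), sr)[:5])  # values near 150 Hz
```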
Once the digitized voice is created, it can be plugged into the device the patient is already using. With VocalID, a bedridden mother can read her child bedtime stories, a lawyer can present an argument, and a student can recite a poem in a classroom full of students.
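To picture what 'plug-in' means here, a hypothetical sketch follows; the VoiceModel and CommunicationDevice names are made up for illustration and are not part of VocalID's or any device maker's API.

```python
# Hypothetical sketch of "voice as a plug-in": a text-to-speech device whose
# default robotic voice can be swapped for a banked, personalised one.
from dataclasses import dataclass

@dataclass
class VoiceModel:
    """A synthesized voice profile, e.g. loaded from a voice bank."""
    name: str

    def speak(self, text: str) -> str:
        # Placeholder for real waveform synthesis.
        return f"[{self.name} voice] {text}"

class CommunicationDevice:
    """A device whose voice can be replaced at runtime; nothing else changes."""
    def __init__(self):
        self.voice = VoiceModel(name="default robotic")

    def install_voice(self, voice: VoiceModel):
        self.voice = voice

    def say(self, text: str) -> str:
        return self.voice.speak(text)

device = CommunicationDevice()
print(device.say("Once upon a time..."))                   # generic voice
device.install_voice(VoiceModel(name="patient's bespoke"))
print(device.say("Once upon a time..."))                   # the patient's own voice
```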