Google has introduced Translatotron, an end-to-end speech-to-speech translation model that converts speech in one language directly into speech in another, without an intermediate text translation. The model has several advantages over the cascaded approach currently in use, including:
- increased speeds
- fewer translation errors
- retention of the original speaker’s voice
- better handling of names and proper nouns that don’t need to be translated
The cascade models currently in use first transcribe the speech as text, then translate that text into another language, and finally use text-to-speech synthesis to convert the translated text into speech, wrote Ye Jia and Ron Weiss, software engineers at Google AI, in a recent blog post. Translatotron, on the other hand, is the first model that can translate speech in one language directly into speech in another language while preserving the speaker’s tone and voice.
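The three-stage cascade described above can be sketched roughly as follows. This is a toy illustration only, not Google's implementation: the Audio class, the TOY_DICTIONARY lookup, and the transcribe, translate_text, synthesize, and cascade_translate functions are all hypothetical stand-ins for real speech recognition, machine translation, and text-to-speech models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Audio:
    """Toy stand-in for a speech waveform (assumption: words replace samples)."""
    language: str
    words: str


# Hypothetical lookup table standing in for a real text translation model.
TOY_DICTIONARY = {"hola": "hello", "mundo": "world"}


def transcribe(audio: Audio) -> str:
    """Stage 1: speech recognition. The toy version just reads out the words."""
    return audio.words


def translate_text(text: str) -> str:
    """Stage 2: text translation. Words missing from the table pass through."""
    return " ".join(TOY_DICTIONARY.get(word, word) for word in text.split())


def synthesize(text: str, language: str) -> Audio:
    """Stage 3: text-to-speech. The toy version wraps the text as Audio again."""
    return Audio(language=language, words=text)


def cascade_translate(audio: Audio, target_language: str) -> Audio:
    """Chain the three stages; each stage sees only the previous stage's text."""
    text = transcribe(audio)
    translated = translate_text(text)
    return synthesize(translated, target_language)


result = cascade_translate(Audio("es", "hola mundo"), "en")
print(result)
```

Because every stage hands only text to the next one, a mistake made during transcription carries through to the final audio, and the original speaker's voice is lost at the first step. A direct model such as Translatotron replaces the three calls with a single speech-to-speech mapping.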
Translatotron Builds on Google Assistant’s Bilingual Feature
Google announced its bilingual feature for the Google Assistant at the IFA 2018 consumer electronics show in Berlin. In addition to English, the feature lets the Assistant recognize Spanish, French, German, Japanese, and Italian, so users can issue voice commands in those languages. The feature does not allow for multiple languages to be used at once, according to a post from Google’s artificial intelligence blog.
Polish Added to Google Assistant
At the beginning of 2019, Google added Polish to the Assistant, equipping it not only with Polish voice commands but also with knowledge of Polish culture. Users can ask the Google Assistant about Polish history, pop culture, songs, and more, according to a blog post from Google. The technology relies on LangID, a model that uses neural networks to identify which language is being spoken and so distinguish between languages. With multiple updates each week, Google is expected to keep expanding the Assistant’s language abilities in both voice commands and translation.