Google’s AI Translation Tool Creates Its Own Secret Language

Google’s Neural Machine Translation system, which the company’s AI team calls GNMT, went live back in September. It uses deep learning to produce better, more natural translations between languages, and it initially offered a less resource-intensive way to take a sentence in one language and produce the same sentence in another. Instead of treating each word or phrase as a standalone unit, as earlier methods do, GNMT takes in the entire sentence at once.

GNMT’s creators were curious about something. If you teach the translation system to translate English to Korean and vice versa, and also English to Japanese and vice versa… could it translate Korean to Japanese, without resorting to English as a bridge between them? They made this helpful gif to illustrate the idea of what they call “zero-shot translation” (it’s the orange one):

translate1.gif

As it turns out, the answer is yes! The system produces “reasonable” translations between two languages that were never explicitly paired during training. Remember, no English allowed.
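To make the setup concrete, here is a minimal sketch of which translation directions count as zero-shot in the example above. It assumes the mechanism Google's researchers have described for their multilingual model, where a token naming the target language is prepended to the source sentence. The `<2ja>`-style token spelling, the `tag_source` and `zero_shot_pairs` helpers, and the language codes are illustrative assumptions, not Google's code.

```python
from itertools import permutations

# Directions the model was trained on in the example above:
# English <-> Korean and English <-> Japanese.
TRAINED = {("en", "ko"), ("ko", "en"), ("en", "ja"), ("ja", "en")}


def tag_source(sentence: str, target_lang: str) -> str:
    """Prepend a target-language token so one model can serve every direction.

    The "<2xx>" spelling is an assumption made for illustration.
    """
    return f"<2{target_lang}> {sentence}"


def zero_shot_pairs(trained):
    """Return the (source, target) directions that never appeared in training."""
    langs = {lang for pair in trained for lang in pair}
    return sorted(set(permutations(langs, 2)) - set(trained))


# Korean <-> Japanese are the only directions the model never saw.
print(zero_shot_pairs(TRAINED))
# A zero-shot request looks like any other input, just with a different tag.
print(tag_source("안녕하세요", "ja"))
```

The point of the sketch is that a zero-shot request needs no special handling: the input has the same shape as a trained direction, and only the combination of source language and target token is new to the model.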

But this raised a second question. If the computer is able to make connections between concepts and words that have not been formally linked… does that mean that the computer has formed a concept of shared meaning for those words, meaning at a deeper level than simply that one word or phrase is the equivalent of another?

This could mean that the computer has developed its own internal language to represent the concepts it uses to translate between other languages.

transcape.png

A visualization of the translation system’s memory when translating a single sentence in multiple directions.
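One way to picture what “shared meaning” would look like is to compare internal vectors for the same sentence in different languages. The sketch below uses made-up three-dimensional vectors (real encoder states are far larger, and none are given in this article) and plain cosine similarity. It only illustrates the idea and is not how Google produced the visualization.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical vectors, invented for illustration only.
vectors = {
    ("en", "greeting"): [0.90, 0.10, 0.00],
    ("ko", "greeting"): [0.80, 0.20, 0.10],
    ("ja", "greeting"): [0.85, 0.15, 0.05],
    ("en", "weather"): [0.10, 0.90, 0.20],
}

# Same meaning, different languages.
same_meaning = cosine(vectors[("ko", "greeting")], vectors[("ja", "greeting")])
# Different meaning, different languages.
different_meaning = cosine(vectors[("ko", "greeting")], vectors[("en", "weather")])

print(same_meaning > different_meaning)
```

If a model really does represent meaning independently of language, sentences that mean the same thing should land close together regardless of which language they came from, which is what the comparison above checks for the toy data.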

In some cases, Google says its GNMT system is even approaching human-level translation accuracy. That near-parity is limited to translations between related languages, such as English to Spanish or French. However, Google is eager to gather more data for “notoriously difficult” use cases, all of which will help the system learn and improve over time. So starting today, Google is using GNMT for 100 percent of Chinese-to-English machine translations in the Google Translate mobile and web apps, which accounts for around 18 million translations per day.

Google admits that its approach still has a long way to go. “GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms,” Google researchers Quoc Le and Mike Schuster explain, “and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better.” The system should improve over time, and it may become a lot more efficient as well.

Sources: (TechCrunch, The Verge)
