A few weeks ago my roommate was using an app to read Chinese comic books. The problem is that she is still learning Chinese. I pulled up the Google Translate app on my phone and used the camera feature to try to translate the comic.
It handled basic words such as "free", "popular", and "today" well, which covered basically all of the menu items in the app. But it struggled with actual sentences. A sentence that should have read "I'm finally a college student now" came out as "I am college", and another came out as "I am now old university" when it should have been something like "I am old enough to attend university".
As my roommate put it, Google was trying to translate whole sentences, stringing the characters together into something coherent, and it was so focused on producing a sentence that it struggled. My roommate, meanwhile, translated each character and formed the sentence in her own head.
My roommate was drawing on everything she knew about English and Chinese to build a reasonable sentence. Google was using its limited knowledge to do the same thing, but it failed on more difficult sentences, especially when the text was decorated (which is typical of comic book lettering).
There's a clear gap between my roommate's knowledge and Google Translate's (especially for languages that use characters instead of an alphabet). But Google is working on bridging that gap. How?
By using machine learning.
What is machine learning?
But before we get into the how, let's first define what machine learning is.
First, it is a subfield of artificial intelligence, a wide-ranging branch of computer science concerned with building machines capable of performing tasks that typically require human intelligence.
Machine learning is a type of artificial intelligence that allows machines to become more accurate at predicting outcomes without being explicitly programmed to do so. It uses historical data as input to predict new output values.
Or, to put it simply, it is a way for machines to answer questions themselves without a human spelling out the answer. The machine works through numerous "practice problems", which train it to act without being explicitly told what to do. For example, a machine with no prior knowledge that looks at a picture of a dog won't be able to identify it as a dog. It must be given many pictures of all different types of dogs, and the more problems it works through, the better it gets at guessing whether a photo does or does not show a dog.
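To make the "practice problems" idea concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: instead of real photos, each animal is described by two made-up numbers (weight and height), and the "learning" is the simplest possible kind, where the program answers a new question by finding the most similar example it has already seen (a technique called nearest neighbor). The data and the predict function are hypothetical, not anything Google uses.

```python
import math

# Hypothetical "practice problems": (weight in kg, height in cm) -> label.
# All numbers are made up for illustration.
training_data = [
    ((30.0, 60.0), "dog"),
    ((25.0, 55.0), "dog"),
    ((8.0, 30.0), "dog"),
    ((4.0, 25.0), "not dog"),
    ((0.5, 10.0), "not dog"),
    ((3.5, 23.0), "not dog"),
]

def predict(features, examples):
    """Return the label of the most similar known example (1-nearest-neighbor)."""
    closest = min(examples, key=lambda example: math.dist(features, example[0]))
    return closest[1]

# Nobody wrote a rule like "heavier than 20 kg means dog".
# The answer comes entirely from the examples.
print(predict((28.0, 58.0), training_data))  # prints: dog
print(predict((1.0, 12.0), training_data))   # prints: not dog
```

With only six examples the guesses will often be wrong for unusual animals. Adding more and more varied examples is what makes the guesses better, which is the point of the paragraph above.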
Machine learning is similar to how we learn. When we were little, we didn't have the innate knowledge to distinguish a dog from any other animal. But others (teachers, parents, and other adults) told us again and again what a dog is, and we repeatedly identified dogs in various settings, like on flashcards or on walks around the neighborhood. Eventually we learned to recognize a dog in almost any situation.
Or think of learning to ride a bike. You can't do it automatically; you must practice and practice until you are able to do it without any help.
The human brain is a powerful machine. We do have powerful computers, but they are only capable of performing the tasks we tell them to do. If we don't tell them to do something, they won't. Computers can only see in binary: it's A or B, black or white, 0 or 1.
What is deep learning?
Okay, so now we understand what machine learning is: teaching a machine how to get a desired output without telling it the answer.
But how is this done?
With deep learning, the next branch down in artificial intelligence and itself a subfield of machine learning. Deep learning attempts to loosely mimic the human brain. It builds a system that can take a set of data and make predictions from it, often with high accuracy.
How does this apply to you?
Deep learning is behind everyday products and services like:
digital assistants - learning your voice and what information may be relevant to you
credit card fraud detection - learning your spending patterns and flagging any purchases that look suspicious (foreign transactions, large purchases)
natural language processing - understanding what a person means or wants, which can be used for better search results when you ask Google a question
We're getting more specific. We understand that machine learning is having a computer reach a desired output without being explicitly told how, and that deep learning is the process by which it's done: mimicking the human brain.
Now how does a machine learn?
What are neural networks?
A machine learns with neural networks. In the brain there are neurons, the cells that receive signals and pass them along to other neurons. Neural networks mimic this structure.
In the simplest terms, there are three main components to a neural network:
the input layer
the hidden layers
the output layer
Each layer is made up of nodes, and the nodes in one layer are connected to the nodes in the next, similar to how neurons are connected in our brains.
The input layer is where the data goes in. Say we are trying to scan a photo and decide whether it shows a cat, a dog, or a bird. The input will be the pixels of the photo (basically what we get when we scan it). Then we move to the hidden layers, which are composed of neurons. Each neuron contains a mathematical equation with parameters, and it generates an output number that is fed to all the neurons in the next layer. Finally, after going through the hidden layers, we reach the output layer. This gives us a probability for each possible answer: for example, a 20% chance the photo is a cat, a 20% chance it's a dog, and a 60% chance it's a bird.
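The trip from input layer to output layer can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real image classifier: the "photo" has only 4 pixels, there is a single hidden layer with 2 neurons, and every parameter (HIDDEN_WEIGHTS, OUTPUT_WEIGHTS, and so on) is a made-up number. The equation inside each neuron is assumed to be a weighted sum plus a bias, and the scores are turned into probabilities with a function called softmax. Both are common choices, but the article does not specify them.

```python
import math

LABELS = ["cat", "dog", "bird"]

def layer(inputs, weights, biases):
    """One layer: every neuron computes a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]

def relu(values):
    """A common activation function: negative numbers become 0."""
    return [max(0.0, v) for v in values]

def softmax(scores):
    """Turn raw scores into probabilities that add up to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up parameters: 4 input pixels -> 2 hidden neurons -> 3 outputs.
HIDDEN_WEIGHTS = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1]]
HIDDEN_BIASES = [0.0, 0.1]
OUTPUT_WEIGHTS = [[0.3, -0.5], [-0.4, 0.2], [0.6, 0.7]]
OUTPUT_BIASES = [0.0, 0.0, 0.0]

def classify(pixels):
    """Input layer -> hidden layer -> output layer."""
    hidden = relu(layer(pixels, HIDDEN_WEIGHTS, HIDDEN_BIASES))
    scores = layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)
    return dict(zip(LABELS, softmax(scores)))

photo = [0.9, 0.1, 0.4, 0.7]  # hypothetical brightness values for a 4-pixel "photo"
for label, probability in classify(photo).items():
    print(f"{label}: {probability:.0%}")
```

Because the parameters here are arbitrary, the probabilities it prints are meaningless. A network only becomes useful once those numbers have been adjusted using training data.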
For a neural network to work, it needs training data to learn from and improve its accuracy over time, just as we need to practice a skill, like playing the guitar, in order to get better at it.
Back to our example. Say we have 10 photos of dogs, cats, and birds, each labeled with the correct animal. We can feed these into our neural network. If a picture of a dog produces an output of 50% cat, 15% dog, and 35% bird, then we know we need to tweak the hidden layers, that is, the parameters of their mathematical equations. We repeat this over and over until all 10 photos get the correct result, meaning that a photo of a dog, when put into our neural network, is given the highest probability of being a dog.
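The "tweak and test again" loop can be sketched as follows. Several things here are assumptions made to keep the example short: each photo is reduced to two made-up numbers instead of pixels, the network has no hidden layer (just one set of weights per animal), and the tweaking is done with gradient descent, a standard technique the article does not name. The names TRAINING_DATA, train, and average_loss are hypothetical.

```python
import math

LABELS = ["cat", "dog", "bird"]

# Hypothetical training set: two made-up features per photo, plus the right answer.
TRAINING_DATA = [
    ((1.0, 0.1), "cat"), ((0.9, -0.1), "cat"), ((1.1, 0.0), "cat"),
    ((0.0, 1.0), "dog"), ((0.1, 0.9), "dog"), ((-0.1, 1.1), "dog"),
    ((-1.0, -1.0), "bird"), ((-0.9, -1.1), "bird"), ((-1.1, -0.9), "bird"),
]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, features):
    """Return one probability per label for a single photo."""
    scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
    return softmax(scores)

def average_loss(weights, data):
    """Cross-entropy: small when the correct label gets a high probability."""
    total = 0.0
    for features, label in data:
        total -= math.log(predict(weights, features)[LABELS.index(label)])
    return total / len(data)

def train(data, epochs=500, learning_rate=0.5):
    weights = [[0.0, 0.0] for _ in LABELS]  # start out knowing nothing
    for _ in range(epochs):
        gradients = [[0.0, 0.0] for _ in LABELS]
        for features, label in data:
            probs = predict(weights, features)
            for k, name in enumerate(LABELS):
                # How far off was the guess for this label?
                error = probs[k] - (1.0 if name == label else 0.0)
                for j, x in enumerate(features):
                    gradients[k][j] += error * x / len(data)
        # The "tweak": nudge every weight in the direction that reduces the error.
        for k in range(len(weights)):
            for j in range(len(weights[k])):
                weights[k][j] -= learning_rate * gradients[k][j]
    return weights

weights = train(TRAINING_DATA)
for features, label in TRAINING_DATA:
    probs = predict(weights, features)
    guess = LABELS[probs.index(max(probs))]
    print(f"actual: {label}, guess: {guess}")
```

Before training, every photo gets an even one-in-three guess for each animal. After training, each photo's correct label should receive the highest probability, which is the goal described above.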
The Original Problem
Let's go back to Google Translate. In the beginning, Google translated text piece by piece, word by word or phrase by phrase, which caused significant problems for languages with a different word order, such as German. In 2016, Google introduced Google Neural Machine Translation (GNMT), a neural machine translation system. Fancy words aside, this means that instead of mirroring the source text's word sequence, it focuses on meaning. Rather than translating the literal words in the order they come in, it takes in the whole sentence and tries to produce one that follows the target language's grammar and syntax rules (which is closer to how a human translates).
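The word-order problem is easy to reproduce with a toy word-by-word translator. This sketch is not how any version of Google Translate actually worked; the tiny DICTIONARY and the translate_word_by_word function are hypothetical and exist only to show what goes wrong when each word is swapped in place.

```python
# Toy German-to-English dictionary with just enough words for one sentence.
DICTIONARY = {
    "ich": "I",
    "habe": "have",
    "den": "the",
    "hund": "dog",
    "gesehen": "seen",
}

def translate_word_by_word(sentence):
    """Swap each word for its dictionary entry, keeping the original word order."""
    words = sentence.split()
    return " ".join(DICTIONARY.get(word.lower(), word) for word in words)

print(translate_word_by_word("Ich habe den Hund gesehen"))
# prints: I have the dog seen
```

Every individual word is translated correctly, yet the result is not natural English, because German puts "gesehen" (seen) at the end of the sentence. A translation based on meaning would produce "I have seen the dog".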
Google decided to translate meaning, not words. Meaning is something humans inherently understand but a concept machines have trouble with. While machine translation is still a long way from matching human translators, the technology is improving rapidly as machine learning progresses. Machine learning may sound like something from the distant future, but it is already in things we interact with every day.
Conclusion
We've gone through the definition (machine learning), the process (deep learning), and the how (neural networks). I've begun to understand the power of machine learning. Before, I saw it as a tool for abstract purposes like artificial intelligence, which seemed so far away, like Tony Stark's AI Jarvis in the Iron Man movies or the android C-3PO in Star Wars. But machine learning is used in our everyday lives, even if we don't realize it.
A few real-world examples:
automatically tagging people on social media and turning handwriting into text
recommending products and services: for example, Netflix uses machine learning to predict which shows you may enjoy most and makes recommendations based on those predictions
ranking posts on social media: Twitter uses machine learning to prioritize the tweets it thinks are most relevant to you
I'm not even close to an expert on machine learning; I'm just diving into the world of data science. But I wanted to write an article explaining machine learning as simply as possible, especially for those who aren't in the field of data science.
Resources:
Below are the resources I used for this article:
Articles
How Google is using emerging AI techniques to improve language translation quality
Google Translate starts using neural machine translation in 9 languages, coming to all 103
How Accurate is Google Translate in 2021? These 4 Tests Will Tell You
A Neural Network for Machine Translation, at Production Scale
AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What's the Difference?
17 Machine Learning Examples Your Industry Needs to Know Now
If you'd like to read about the more academic/scientific side of GNMT, see the paper "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation".