The black cat has crossed the road. The sentence sounds simple, but translating it into another language brings historical associations and semantic subtleties into play. A cat lover may picture a fluffy cat and ignore the road crossing entirely. For some readers it is an ominous warning, and for others a sign of prosperity. A harmless statement like this can throw amateur linguists into disarray.
Questions like these about the structure and meaning of language have persisted for a long time.
Our understanding of them, or the lack of it, matters little in task-specific machine translation. At a more general level it gets tricky, for instance when a machine must respond to a French query in Hindi, or to a question containing a pun that depends on a local cultural reference.
Humans understand cross-linguistic references from context. A word learnt in one language, say ‘mother’, is immediately associated with its counterpart ‘maa’ when heard in another. Machine learning comes close to this with transfer learning, in which knowledge gained on one task is reused on another. Since most real-world scenarios are unsupervised, and the available data is scarce and unstructured, teaching the machine to teach itself is often the best option.
Researchers at Amazon have trained a model on English so that adapting it to Japanese requires less training. Transfer learning between European languages and Japanese has been little explored because of the mismatch between character sets. The researchers address this problem by feeding the Japanese system both Japanese characters and their Roman-alphabet transliterations.
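To make the idea of a dual input concrete, here is a minimal sketch of pairing Japanese characters with their romanized form. The lookup table, `romanize` and `build_input` are hypothetical names invented for this illustration, and the table covers only a handful of kana; the researchers' actual transliteration method is not described here and would need a complete tool.

```python
# Toy kana-to-romaji table (an assumption for illustration only).
# Digraphs such as "きょ" are listed so they can be matched before single kana.
KANA_TO_ROMAJI = {
    "きょ": "kyo",
    "と": "to", "う": "u", "き": "ki", "ょ": "yo",
    "ね": "ne", "こ": "ko",
}


def romanize(text: str) -> str:
    """Greedy longest-match transliteration of kana into Roman letters."""
    out = []
    i = 0
    while i < len(text):
        # Try a two-character chunk first, then fall back to one character.
        for size in (2, 1):
            chunk = text[i:i + size]
            if chunk in KANA_TO_ROMAJI:
                out.append(KANA_TO_ROMAJI[chunk])
                i += len(chunk)
                break
        else:
            # Character not in the toy table: pass it through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def build_input(word: str) -> dict:
    """Pair the original characters with their romanized characters."""
    return {
        "japanese_chars": list(word),
        "roman_chars": list(romanize(word)),
    }


example = build_input("とうきょう")
```

With both character sequences available, the Roman-alphabet side shares a character set with the English model, which is what makes reusing English character-level knowledge plausible.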
How Does This Model Work
A model trained with transfer learning, using romanized Japanese words, is compared with a model trained from scratch on the same data.
Both models, the English one and the Japanese one, are trained for named entity recognition (NER), the task of finding entity names in text and assigning them to categories. The inputs to each model are word embeddings and character embeddings, both produced by neural networks. An embedding represents a data item as a point in a high-dimensional vector space.
Words whose embeddings lie close together in that space are assumed to have similar meanings.
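Closeness between embeddings is commonly measured with cosine similarity. The sketch below shows the computation on made-up three-dimensional vectors; real embeddings are learned by a network and have far more dimensions, and the words and values here are assumptions chosen only to illustrate the idea.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-written for this example (not learned).
embeddings = {
    "mother": [0.9, 0.1, 0.3],
    "maa": [0.8, 0.2, 0.3],
    "road": [0.1, 0.9, 0.0],
}


def nearest(word):
    """Return the other word whose embedding is most similar to `word`."""
    return max(
        (w for w in embeddings if w != word),
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )
```

Calling `nearest("mother")` on these toy vectors picks `"maa"` over `"road"`, because the first two vectors point in almost the same direction.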
Here the input is processed one word at a time: the characters of each word are embedded and passed to a bidirectional long short-term memory (bi-LSTM) neural network. This network reads the subword chunks of a single word in both forward and backward order.
The resulting character-level representation of each word is joined with that word's embedding, and the combined vectors for the whole sentence are passed to a second bi-LSTM, which captures meaning and context within the sentence.
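The two-level flow described above can be sketched in plain Python. To keep the example short and dependency-free, a simple tanh recurrent cell with fixed toy weights stands in for the LSTM cell, and the character and word vectors are derived from code points and word length rather than learned. All function names are hypothetical; this shows only how data moves through the two bidirectional stages, not the researchers' network.

```python
import math


def rnn_pass(vectors, hidden_size=3, recurrent_weight=0.5):
    """Run a toy tanh recurrent cell (NOT an LSTM) over a sequence."""
    h = [0.0] * hidden_size
    states = []
    for v in vectors:
        x = sum(v) / len(v)  # collapse the input vector to one number
        h = [math.tanh(recurrent_weight * h_j + x * (j + 1) / hidden_size)
             for j, h_j in enumerate(h)]
        states.append(h)
    return states


def bidirectional(vectors, hidden_size=3):
    """Concatenate forward and backward states at every position."""
    fwd = rnn_pass(vectors, hidden_size)
    bwd = rnn_pass(vectors[::-1], hidden_size)[::-1]
    return [f + b for f, b in zip(fwd, bwd)]


def char_vector(ch, dim=4):
    """Stand-in character embedding computed from the code point."""
    return [((ord(ch) * (k + 1)) % 7) / 7.0 for k in range(dim)]


def word_vector(word):
    """Stand-in word embedding (toy features, not learned)."""
    return [len(word) / 10.0, (ord(word[0]) % 5) / 5.0]


def encode_word(word, hidden_size=3):
    """Stage 1: read one word's characters in both directions."""
    chars = [char_vector(c) for c in word]
    fwd_last = rnn_pass(chars, hidden_size)[-1]
    bwd_last = rnn_pass(chars[::-1], hidden_size)[-1]
    return fwd_last + bwd_last


def encode_sentence(words, hidden_size=3):
    """Stage 2: join character-level and word-level vectors, then read
    the whole sentence in both directions."""
    joined = [encode_word(w) + word_vector(w) for w in words]
    return bidirectional(joined, hidden_size)


contextual = encode_sentence(["black", "cat", "crossed"])
```

Each entry of `contextual` has a forward half that depends only on the words up to that position and a backward half that depends on the words from that position onward, which is how a bidirectional network gives every word access to context on both sides.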
By the end of training, the network has learned to produce representations useful for the NER task, while the classifier has learned to assign entity types.
“Our first experiment was to determine which of the three modules of our English-language network — the character representation module, the word representation module, and the classifier — should be transferred over to the Japanese context. We were experimenting with three different data sets — the two public data sets and our own proprietary data set — and a different combination of modules yielded the best results on each. Nonetheless, the combination with the best results across the board was of the classifier and the character-level representations,” observe the researchers in their paper that will be presented in June at the annual meeting of the North American chapter of the Association for Computational Linguistics
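The module-selection experiment in the quote can be pictured as copying some parameter groups from the English network and leaving the rest freshly initialised. The sketch below represents each module as a short list of numbers; the module names follow the quote, but `init_model`, `transfer` and the random values are assumptions for illustration and do not reflect the actual parameter shapes or training code.

```python
import copy
import random

MODULES = ("char_representation", "word_representation", "classifier")


def init_model(seed, size=4):
    """Create a toy model: one small random parameter list per module."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name in MODULES}


def transfer(source, target, modules):
    """Return a copy of `target` with the chosen modules taken from `source`."""
    result = copy.deepcopy(target)
    for name in modules:
        result[name] = copy.deepcopy(source[name])
    return result


# Stand-in for a network already trained on English data.
english_model = init_model(seed=0)
# Freshly initialised network for Japanese.
japanese_init = init_model(seed=1)

# The combination reported as best across the board in the quote:
# the classifier plus the character-level representations.
japanese_model = transfer(
    english_model, japanese_init, ("char_representation", "classifier")
)
```

After the copy, the Japanese network would still be trained on Japanese data; the transferred modules simply give it a better starting point than random initialisation.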
The use of voice-controlled devices has grown over the past couple of years. Whether it is Amazon’s Alexa-enabled devices or Google Home, they rely heavily on machine learning models running in the background. These devices train themselves on new words spoken in new contexts and try to understand what the user wants, however vague the command. When they are deployed in a location where a different language is spoken, portability becomes a challenge. Large-scale deployment of cross-lingual transfer learning would make the devices ready for new environments sooner, since training new language representations from scratch takes time.
- A new model for named entity recognition is proposed for dissimilar languages.
- Part of the Japanese input is romanized to overcome the mismatch between the character sets of the two languages.
- This success could lead to knowledge transfer between other dissimilar language pairs, such as English and Chinese.