Google Translate Has Gender Bias. And It Needs Fixing

Machine prejudice is a growing cause for concern. Although translations have become more natural and fluid with advances in neural machine translation (NMT), they still reflect societal biases and stereotypes.

Gender bias surfaces when translating between languages that mark gender differently. For example, Google Translate has historically translated the gender-neutral Turkish phrase for “He/she is a doctor” into the masculine form, whereas the Turkish phrase for “He/she is a nurse” has been translated into the feminine form.

Prevailing gender bias in translation 

Now, Google is aiming to reduce gender bias in machine translation. In December 2018, it released gender-specific translations in Google Translate, which allow gender-neutral queries to be rendered in both feminine and masculine forms.

One of Google’s key research areas is using adjacent sentences and passages as context to improve gender accuracy. This is challenging because gender information is often not stated explicitly in each individual sentence. For instance, in the following Spanish passage, the first sentence names Marie Curie as the subject, but the second sentence does not. Taken on its own, the second sentence could refer to anyone, regardless of gender. To translate it accurately, the model has to draw on the first sentence to choose the right pronoun.

Spanish text: Marie Curie nació en Varsovia. Fue la primera persona en recibir dos premios Nobel en distintas especialidades.
Translation to English: Marie Curie was born in Warsaw. She was the first person to receive two Nobel Prizes in different specialties.
Source: AI blog

Furthermore, to address common problems in contextual translation (e.g., pronoun drop, gender agreement and appropriate possessives), Google is releasing the Translated Wikipedia Biographies dataset to evaluate the gender bias of translation models. The objective is to support long-term advances in machine learning systems that focus on pronouns and gender in translation by providing a benchmark against which the correctness of translations can be tested before and after model revisions.

Case study with Google Translate

In 2019, a paper published in Neural Computing and Applications, “Assessing gender bias in machine translation: a case study with Google Translate” by Prates, M.O.R., Avelar, P.H. & Lamb, L.C., studied gender bias in machine translation.

The researchers proposed that automatic translation systems can be probed through gender-neutral languages to give insight into gender biases in AI. The team began with a comprehensive list of job positions from the US Bureau of Labor Statistics (BLS) and used it to construct sentences in gender-neutral languages such as Hungarian, Chinese and Yoruba. They then used the Google Translate API to translate these sentences into English and collected statistics on the prevalence of female, male and gender-neutral pronouns in the output. The results showed a strong tendency of Google Translate towards masculine defaults, particularly in fields commonly associated with an unbalanced gender distribution or with stereotypes, such as science, technology, engineering and maths jobs. Comparing these figures with BLS data on female participation in each occupation showed that Google Translate fails to reproduce the real-world distribution of female workers.
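The counting step of such a study can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the translated English sentences have already been collected, the function names `classify_pronouns` and `pronoun_distribution` are hypothetical, and the sample sentences are made up for the example rather than taken from the paper.

```python
import re
from collections import Counter
from typing import Iterable

# Small hand-written pronoun lists (an assumption; a real study may use more).
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}
NEUTRAL = {"they", "them", "their", "theirs", "it", "its"}


def classify_pronouns(english_sentence: str) -> str:
    """Label one translated sentence by the gender of the pronouns it uses."""
    tokens = set(re.findall(r"[a-z]+", english_sentence.lower()))
    has_female = bool(tokens & FEMALE)
    has_male = bool(tokens & MALE)
    if has_female and has_male:
        return "neutral"  # e.g. an output such as "he/she"
    if has_female:
        return "female"
    if has_male:
        return "male"
    if tokens & NEUTRAL:
        return "neutral"
    return "none"


def pronoun_distribution(translations: Iterable[str]) -> Counter:
    """Count how many translated sentences fall into each category."""
    return Counter(classify_pronouns(t) for t in translations)


# Made-up outputs standing in for translations of occupation sentences.
sample_outputs = [
    "He is a doctor.",
    "She is a nurse.",
    "He is an engineer.",
    "They are a teacher.",
]
distribution = pronoun_distribution(sample_outputs)
total = sum(distribution.values())
for label, count in distribution.items():
    print(f"{label}: {count / total:.0%}")
```

The resulting shares per category are what would then be compared against BLS participation figures for each occupation.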

Translated Wikipedia Biographies dataset

The dataset was developed to examine common gender errors in machine translation. Each instance of the dataset represents an individual, a rock band, or a sports team (the latter two considered genderless). The articles are written in native English and have been professionally translated into Spanish and German. The same set can be used to examine pronoun drop (Spanish to English) and gender agreement (English to Spanish). Bands and sports teams are included as subjects that take non-gendered third-person pronouns.

The dataset was created by selecting an equal representation of examples across geographies and genders. To ensure an unbiased selection of occupations, researchers chose nine that covered a range of stereotypical gender associations (either feminine, masculine, or neither). Then, to account for any geographical bias, they divided these cases by geographic region. For each occupation, two biographies (one masculine and one feminine) were chosen for each of the seven geographic regions. Finally, 12 instances that lacked a gender were included. Rock bands and sports teams were chosen since they are frequently referred to by third-person non-gendered pronouns such as “it” or singular “they”.
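The selection scheme can be laid out as a simple grid. The sketch below assumes the reading that two biographies (one feminine, one masculine) were picked for every occupation in every region; the occupation and region names are placeholders, since the article does not list them.

```python
from itertools import product

# Placeholder labels: the article gives only the counts, not the names.
occupations = [f"occupation_{i}" for i in range(1, 10)]  # nine occupations
regions = [f"region_{i}" for i in range(1, 8)]           # seven regions
genders = ["feminine", "masculine"]
GENDERLESS_INSTANCES = 12  # rock bands and sports teams

# One biography per (occupation, region, gender) combination.
gendered_slots = list(product(occupations, regions, genders))
total_instances = len(gendered_slots) + GENDERLESS_INSTANCES

print(len(gendered_slots), total_instances)
```

Under that assumption the grid holds 126 gendered biographies, or 138 instances once the 12 genderless ones are added.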


Evaluating machine translation with this dataset opens up new kinds of analysis. Because each instance is tied to a subject of known gender, one can calculate the accuracy of the gender-specific translations that refer to that subject. This computation is easier when translating into English, since it rests mainly on English's gender-specific pronouns. According to Google, the gender datasets have also contributed to a 67% reduction in errors on context-aware models compared to earlier models. With this information, new lines of research can explore how different models perform across occupations or geographic regions.
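One way to picture this accuracy computation is the sketch below. It is a simplified, hypothetical metric, not Google's published evaluation code: it covers only feminine and masculine subjects, counts gendered pronouns in the English output, and treats a pronoun as correct when it matches the subject's known gender. The `Instance` class, the `gender_accuracy` function and the second example translation are invented for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional

GENDERED_PRONOUNS = {
    "feminine": {"she", "her", "hers", "herself"},
    "masculine": {"he", "him", "his", "himself"},
}


@dataclass
class Instance:
    expected_gender: str  # "feminine" or "masculine"
    translation: str      # English output of the model under test


def gender_accuracy(instances: Iterable[Instance]) -> Optional[float]:
    """Share of gendered pronouns that match each subject's known gender."""
    correct = 0
    total = 0
    for inst in instances:
        expected = GENDERED_PRONOUNS[inst.expected_gender]
        for token in re.findall(r"[a-z]+", inst.translation.lower()):
            if any(token in group for group in GENDERED_PRONOUNS.values()):
                total += 1
                correct += token in expected
    # No gendered pronouns found: accuracy is undefined.
    return correct / total if total else None


examples = [
    # Translation from the passage quoted earlier in the article.
    Instance("feminine", "Marie Curie was born in Warsaw. She was the first "
                         "person to receive two Nobel Prizes."),
    # Invented erroneous output for the same subject.
    Instance("feminine", "He was the first person to receive two Nobel Prizes."),
]
print(gender_accuracy(examples))
```

Running the same function on outputs from a model before and after a revision gives two scores that can be compared directly.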

Ritika Sagar
Ritika Sagar is currently pursuing PDG in Journalism from St. Xavier's, Mumbai. She is a journalist in the making who spends her time playing video games and analyzing the developments in the tech world.
