Team 18: Bias in Google Images Search

 Motivation

Most people trust the top search results returned by Google, and Google's image search is used to understand the world around us. It is therefore important that the results, especially at the top ranks, are not biased. In 2015, researchers from the University of Michigan claimed in their report that women were significantly underrepresented in Google image search results; across all professions, women were slightly underrepresented on average. The researchers claimed this can change searchers' worldviews. Although Google has stated that it fixed the underrepresentation of women described in that article, we observed a strong bias against women in Google image search results.

In careers related to technology, the representation of women remains limited even though the industry is booming. Many reports claim that even though the number of women who opt for STEM as a field of study has increased, this increase is not translating into jobs. One possible reason is that women are still considered outsiders in high-paying technology jobs.

 

Problem Statement-:  

Google image search results can increase this bias directly or indirectly. The direct way is when a user observes the results themselves; an indirect way is through advertising systems, where women are less likely to be shown ads for technology jobs.


This project aims to quantify the bias in Google search engine results. The first aim is to understand whether Google's ranking algorithms promote images containing males for job roles in the technology sector, and to compare this with job roles that are female-dominated.


In the second part, we observe the impact of search engine results on the behavior of users after they see the results.

 

Data Collection Methodology: 

 

To detect bias in Google image search, 20 job roles are used as search keywords, and their corresponding images are collected. 10 keywords are job roles from the technology sector, hand-picked after consulting people in the industry. The other 10 keywords are job roles that are female-dominated, chosen from data published by the World Trade Organization.


For each search query, the top 30 image results are collected. The images are web scraped using Selenium: we extract the top 30 URLs and then download the images through them. This process is carried out 3 times from 3 different locations in India, both to remove geospatial bias and to account for randomness, since results are slightly altered for every repetition of the same keyword. In total, 1,800 images are collected. In the second step, the images are labeled -1 (male in the image), +1 (female in the image), or 0 (neither or both present).
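As a rough illustration, the scraping step can be sketched as below. The URL pattern, the CSS selector, and the Chrome driver are assumptions for illustration only, not the project's exact code; Google's markup changes frequently, so the selector would need adjusting in practice.

```python
from urllib.parse import quote_plus

def image_search_url(query: str) -> str:
    # hypothetical Google Images URL pattern (assumption, not the project's exact code)
    return "https://www.google.com/search?tbm=isch&q=" + quote_plus(query)

def top_image_urls(query: str, k: int = 30):
    """Open the results page with Selenium and return the first k image URLs."""
    from selenium import webdriver            # imported lazily: needs a browser driver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(image_search_url(query))
        thumbs = driver.find_elements(By.CSS_SELECTOR, "img")[:k]  # selector is a guess
        return [t.get_attribute("src") for t in thumbs]
    finally:
        driver.quit()
```

Each URL can then be downloaded (for example with `urllib.request.urlretrieve`), and the loop repeated once per location and per keyword to produce the 1,800 images.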

 

 

 Part 1: Search bias quantification framework:


The search bias framework is taken from a paper (...) in which the authors quantified Twitter and Google search bias by collecting rankings of news media articles during the 2016 US Presidential elections. We have modified the methodology for gender bias in the image modality instead of text.

 

 

For a given query q, a set of data items relevant to the query is first selected. Each individual data item has an associated bias score. This set of relevant items is input to the ranking system, which produces a ranked list of the items. Our framework includes metrics for measuring the bias in the set of relevant items input to the ranking system (input bias) and the bias in the ranked list output by the ranking system (output bias).
 

Input bias can be understood as the ground truth for that job role. Output bias is what is actually observed. Ranking bias captures the bias introduced by the Google ranking algorithm for that job role. If the metric value is towards -1, the bias is toward males; if it is towards +1, the bias is toward females; values close to 0 represent neutrality. To understand how these metrics are calculated, refer to the supplementary section.
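As a concrete sketch: with the per-image labels from the data collection step (-1 male, +1 female, 0 neutral), the three metrics can be computed roughly as below. The logarithmic rank discount is our assumption; the weighting scheme in the original framework may differ.

```python
from math import log2

def input_bias(labels):
    # mean label of the relevant set: the "ground truth" gender mix for the role
    return sum(labels) / len(labels)

def output_bias(labels):
    # rank-discounted mean: images at top ranks count more
    # (the 1/log2(rank + 2) discount is an assumption, not necessarily the paper's)
    weights = [1 / log2(rank + 2) for rank in range(len(labels))]
    return sum(l * w for l, w in zip(labels, weights)) / sum(weights)

def ranking_bias(labels):
    # extra bias introduced by the ordering itself
    return output_bias(labels) - input_bias(labels)
```

For example, a result list containing equal numbers of male and female images but showing all the male ones first has input bias 0 and a negative ranking bias.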

 

Observation/ Evaluation: 

 

1. In the bar plot above, the input, output, and ranking bias are plotted; a negative value on the y-axis means a bias toward males, and a positive value implies a bias towards females. The input, output, and ranking bias of all technology job roles are male-biased: there are more male images than female images, and the ranking algorithm is further male-biased, promoting males to the top results.

 

2. In female-dominated job roles, the input and output bias are positive; that is, there are more female images and the overall results are female-dominated. Even so, the ranking system still promotes male images to the top results, as the ranking bias remains negative, with the exceptions of nurse_practitioner and veterinarian.

 

 
  
3. From the plot above, it can be observed that the Google search engine is more biased in technology job roles than in female-dominated job roles. From the pie chart, it can be seen that of the total bias against women in technology roles, Google's ranking algorithms contribute 34%. 
4. Another observation, from the line graph above, is that for job roles whose input is already highly male- or female-biased, the ranking algorithm makes smaller changes than it does for balanced job roles with low input bias. 


Summary of Observations: 

  1. Even though female-dominated job roles have many female images, the Google ranking algorithm is still biased towards male pictures, just as in technology careers. 

  2. Image search results are more biased in technology careers than in female-dominated careers, and the ranking algorithm contributes 34% of the total output bias in those results. 

  3. The ranking algorithm changes the bias less for search queries whose input results are already very biased.


Part 2: Impact on users due to image search bias:


Goal: The goal of this part of the study was to detect gender bias in society regarding different professional roles, and to understand the effect that the bias in search engine results has on general opinions within society.


For this, we ran a post-test true experimental design. We recruited volunteers through various channels; each volunteer performed a simple task digitally via a form. The experiment group received a slightly different task from the control group. 

 

Survey Design-:

 
The task we chose is 'translation.' Some languages, like English, do not have gendered verbs, but regional languages like Telugu and Hindi do. In general, when people translate from a non-gendered language to a gendered language, they choose the gender based on context.


We decided that a simple translation task could surface people's internal biases. Using this as the basis, we gave gender-neutral English sentences involving the professions, ensuring that male-biased and female-biased professions were evenly used. We asked the participants to translate each sentence into either Hindi or Telugu. Once the task was complete, we annotated the responses to find which gender was used.

For the experiment group, we modified the task slightly with a 'treatment': showing pictures of the relevant profession before asking for the translation. The pictures used have varying proportions of male and female faces. 
 

Survey structure: A total of 6 groups of participants were made: three control groups and three experiment groups. Each control-experiment pair of groups was asked to translate two questions: one based on a male-biased role and the other on a female-biased role. 



Examples of questions used were:


  1. My friend became the system admin of our college
  2. The vet near Indira nagar is a very friendly one
  3. After graduation, my classmate became a nurse practitioner
  4. My neighbor became a sales engineer 
  5. That dental hygienist charges a lot for cleaning
  6. The entrepreneur went to California.


 
 

One example set of possible responses (actual responses received) for the translation task: 


The vet near Indira nagar is a very friendly one.


Possible translations are: 

  • Indira Nagar ke paas wala vet kafi friendly hai - Male Vet
  • Indra nagar ke pass jo vet hai vo kaafi friendly hai - Neutral Vet
  • Indira nagar ke paas vali vet bahot friendly hai - Female Vet
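To give a flavor of the annotation step, a toy heuristic along these lines can label such romanized Hindi responses automatically. This is purely illustrative and our own assumption, not the annotation method actually used (the responses were annotated by hand, and Hindi gender marking is far richer than two suffixes):

```python
def classify_translation(text: str) -> str:
    """Toy labeler for romanized Hindi responses (illustrative assumption only):
    'wali'/'vali' marks a feminine form, 'wala'/'vala' a masculine one."""
    t = text.lower()
    if any(marker in t for marker in ("wali", "vali")):
        return "F"   # female-gendered wording
    if any(marker in t for marker in ("wala", "vala")):
        return "M"   # male-gendered wording
    return "N"       # gender-neutral wording
```

Applied to the three example responses above, it yields Male, Neutral, and Female respectively.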



Details of ‘Treatment’ For The Experimental Group :

 

Set 1: 10% female/male pictures used in the respective roles

Set 2: 20% female/male pictures used in the respective roles

Set 3: 30% female/male pictures used in the respective roles


Note: Sales engineer is a male-biased role, so we use more female pictures for it. Nurse is a female-biased role, so we use more male pictures for it.



Survey Results and Interpretation for Female Biased roles : 


A total of 100 responses were received across six groups. The groups were not perfectly balanced; hence we used ratios when computing our metrics. 


In the case of female-biased roles, we measured the percentage of female-gendered words used.
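Because the groups were of unequal sizes, the metric is a ratio rather than a raw count. A minimal sketch of the computation (the 'F'/'M'/'N' label names are our own convention, not the survey's):

```python
def pct_female_worded(responses):
    """Percentage of responses that used female-gendered words.
    responses: list of labels, 'F' (female), 'M' (male), 'N' (neutral).
    A percentage makes groups of different sizes comparable."""
    if not responses:
        return 0.0
    return 100.0 * sum(1 for r in responses if r == "F") / len(responses)

def diff_vs_control(control, treatment):
    # 'diff': drop in female-word usage in the treatment group relative to control
    return pct_female_worded(control) - pct_female_worded(treatment)
```

The same computation, with 'M' in place of 'F', gives the male-gendered-word percentages used for the male-biased roles.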

 

Female Biased roles: % use of female-gendered words:

 


 

As noted above, as we include more male images in the treatment, the usage of female-gendered words decreases. From the control group, we observe that people are inherently biased about these roles; by changing the images, this bias can be magnified or minimized. When we showed more male faces, people became more biased. This strongly suggests that people's responses can be biased by the images they see. 

The graph below illustrates this observation: as we use more male images, the 'diff' keeps increasing. 


Survey Results and Interpretation for Male Biased roles:


For the male-biased roles, we measured the responses that used male-gendered words. The 'treatment' here is to show more and more female faces for male-biased roles. The outcome is much more evident than for the female-biased roles: as we show more female faces, the use of male-gendered words in the experiment group drops, and the size of the reduction is directly proportional to how many more female faces we show. This suggests that search engine results can reinforce and magnify inherent bias. 

 


 

 

Male Biased roles results: % of Male Gendered words:

 


 

Results analysis: Gender-neutral translation:

 


As we see above, the usage of gender-neutral words is higher in the experiment group. When we show images with a balanced mix, people become more neutral. This shows that a more balanced set of images can reduce human bias.



Summary of results from Post Test experiment:

  1. People are inherently biased about professional roles. Bias is higher among male-dominated roles. 

  2. The bias of people can increase or decrease based on the images they see about particular roles. 

  3. We observed that bias can be reduced by showing more balanced images; people become more gender-neutral when we add more balanced images.


Future Work-:

We focused on using image search to find gender bias, and a translation task from English to a gendered language was used to find the impact on users. This leaves several avenues for future work:

 

1. In this work, binary labels are considered; the framework can be extended to use cases with more than two labels.

2. Other methods, beyond the translation task, can be proposed to measure the impact of Google image search on users. 

3. Other forms of bias can be measured, such as political bias or bias toward a sports team in a particular region, and search engines of social networking sites like Twitter can use this framework.

 


Supplementary:


The underlying framework, code, and analysis can be found on GitHub.

 

Team Members-: 

 

Mehul Arora (2021701033)
Manoj Sirvi (2019111016)
Shivanshu Jain (2019101001)
Zishan Kazi (2019111031)
Medha Vempati (2018101063)
Siddarth Vijay (2018113020)
Prudhvi Potuganti (20173049)
Venkat Sai Ram (2021204018) 

 
