Team-03: Understanding the Effects of Gender and Religion in Popular Perception of Celebrity Tweets

Introduction

The world is a happening place. Events occur throughout the year that change the world in ways we cannot imagine at present: scientific discoveries, economic downturns, wars, protests, stock market swings, and many more. These changes may result from scientific, social, economic, cultural, or political efforts. With the advent of social media, it has become easier for the common person to voice an opinion on these events. The motivation behind these opinions may be the direct effect the events have on people’s day-to-day lives, or simple intellectual interest in the topic. The influence of such posts is usually restricted to the poster’s own circle and is of limited significance. Only rarely does a post made by a common person gain significant traction and become subject to widespread judgement.

Celebrities on social media platforms usually have a far greater reach than, say, a medical representative from Mumbai or an IT professional from Bangalore. Their opinions gain significant attention and are subject to popular judgement because these individuals can influence the thoughts of the several thousand people who follow them.

Problem Statement

This project looks into tweets on social issues by individuals with celebrity status. Celebrity accounts here are verified accounts on Twitter. We wanted to see how other users, that is, the general audience, reacted to these tweets. In particular, we wanted to analyse what role the gender and religion of the celebrity play in these responses. Do celebrities practising a certain religion tend to get more positive or negative responses for expressing views on a certain issue? How differently does the general public respond to tweets by female celebrities as opposed to their male counterparts? These are some of the questions we tried to answer through this project.

Related Work

Twitter is a social media website that was started by Jack Dorsey and three other co-founders in 2006. It is primarily an online news and microblogging site where people can follow each other and communicate via short messages, or Tweets. Twitter is often used by celebrities to communicate with their fans and a wider audience. A lot of work has been done on how to use this microblogging site to communicate effectively.

Stever et al. view Twitter as a safe space for celebrities to share their thoughts with their fans and a wider audience without revealing intimate details about themselves or their lives. Twitter also lets fans send messages to a celebrity without having to gain access to the celebrity’s page or wall. For fans, then, it is both a source of celebrity news and a channel for communicating directly with celebrities. The authors note the presence of inappropriate fans and followers, including stalkers and toxic fans, and point out that Twitter protects the celebrity from these people physically. However, the paper does not discuss the mental effects of this toxicity on the celebrities subjected to it.

Methodology

For our problem, we initially considered a number of issues that both celebrities and the general public have talked about, such as the #MeToo movement, immigration, the Citizenship Amendment Act (CAA), the gender pay gap, and LGBT rights.

We finally shortlisted two topics: CAA and Gender Pay Gap. We chose these two topics for the following reasons:


  1. We wanted one topic directly related to each of the two parameters our study is based on, gender and religion. Gender is directly relevant to the Gender Pay Gap, as the name suggests. Similarly, religion is directly relevant to CAA.

  2. We wanted to consider one Indian issue (CAA) and one international issue (Gender Pay Gap) as part of our analysis.


We fine-tuned a RoBERTa model on the dataset that we created from the tweets for this task. RoBERTa is a model built on top of Google’s BERT. It modifies BERT’s pretraining recipe, for example by removing the next-sentence prediction objective and training with considerably larger mini-batches and learning rates. The model was trained on tweets paired with the sentiment score of the replies they received. The sentiment score is a number in the range [-1, 1], determined using a sentiment analysis model.
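
The pairing of tweets with reply sentiment can be sketched as follows. This is a minimal illustration rather than the project’s actual code: the record layout (`text`, `reply_scores`), the helper names, and the use of the mean as the aggregate are all assumptions, since the text does not say how the scores of a tweet’s replies were combined into a single training target. The example records are made up.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def clip_score(score: float) -> float:
    """Clamp a sentiment score to the [-1, 1] range used in the text."""
    return max(-1.0, min(1.0, score))


def tweet_target(reply_scores: Sequence[float]) -> float:
    """Combine per-reply scores into one target (assumption: the mean)."""
    if not reply_scores:
        raise ValueError("a tweet needs at least one scored reply")
    return mean([clip_score(s) for s in reply_scores])


def build_training_pairs(tweets: List[dict]) -> List[Tuple[str, float]]:
    """Return (tweet text, regression target) pairs for fine-tuning."""
    return [(t["text"], tweet_target(t["reply_scores"])) for t in tweets]


# Made-up records, for illustration only (hypothetical field names).
example_tweets = [
    {"text": "Equal work deserves equal pay.", "reply_scores": [0.8, 0.4, -0.2]},
    {"text": "My thoughts on the new law.", "reply_scores": [-0.6, -1.4]},
]
training_pairs = build_training_pairs(example_tweets)
```

Each pair can then serve as one regression example, with the tweet text as input and the aggregated reply score as the value the model learns to predict.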

We then scraped tweets on the above-mentioned issues by celebrities chosen at random and used them as inputs to the model. The predictions were compared to the actual sentiment scores that we observed.
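
One way to make the comparison between predicted and observed scores concrete is shown below. The text does not name the metric that was used, so the two measures here (mean absolute error and the Pearson correlation) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average absolute gap between predicted and observed sentiment scores."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean([abs(p - o) for p, o in zip(predicted, observed)])


def pearson_r(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation between predicted and observed sentiment scores."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("inputs must have equal length and at least two values")
    mean_p, mean_o = mean(predicted), mean(observed)
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(var_p * var_o)
```

Computing the same measures separately for each group of celebrities (for example by gender or by religion) would show whether the model’s predictions track real responses more closely for some groups than for others.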

Dataset

We scraped over 15,000 public tweets by regular users from Twitter on each of the two topics mentioned above. The tweets were then filtered based on the media they contained, their language, and the emojis used: tweets that were not textual in nature, were not in English, or had too many emojis were removed. We also filtered tweets based on the responses they received, removing all tweets with fewer than 10 replies. This cleaning resulted in a dataset of approximately 3,000 tweets.
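
The cleaning rules above can be sketched as a single filter function. This is an illustration under stated assumptions, not the code that was used: the `Tweet` fields are hypothetical, the language is assumed to be available as a tag on each tweet, the emoji pattern is only a rough approximation of common emoji ranges, and the emoji limit of 3 is a placeholder because the text does not say how many emojis counted as too many. Only the threshold of 10 replies comes from the text.

```python
import re
from dataclasses import dataclass
from typing import List

# Rough approximation of common emoji code point ranges (assumption).
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

MIN_REPLIES = 10  # from the text: tweets with fewer than 10 replies are dropped
MAX_EMOJIS = 3    # hypothetical limit; the text does not give a number


@dataclass
class Tweet:
    text: str
    lang: str          # language tag, e.g. "en" (assumed to be available)
    has_media: bool    # True if the tweet carries images, video, etc.
    reply_count: int


def keep_tweet(tweet: Tweet, max_emojis: int = MAX_EMOJIS,
               min_replies: int = MIN_REPLIES) -> bool:
    """Return True if the tweet passes all of the cleaning rules."""
    if tweet.has_media:
        return False
    if tweet.lang != "en":
        return False
    if len(EMOJI_PATTERN.findall(tweet.text)) > max_emojis:
        return False
    return tweet.reply_count >= min_replies


def clean_tweets(tweets: List[Tweet]) -> List[Tweet]:
    """Apply the filter to a list of scraped tweets."""
    return [t for t in tweets if keep_tweet(t)]
```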


Sentiment Analysis Model

We used a generic model from the Natural Language Toolkit (NLTK) for this task. We evaluated the model on a sample set and obtained an accuracy of 64%. The model was then used to score the sentiment of the replies that each tweet received.
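
An accuracy check of this kind can be sketched as below. The scorer here is a deliberately tiny word-list stand-in so that the example runs on its own; it is not the NLTK model, and the text does not say which NLTK model was used. Any function returning a score in [-1, 1] could be passed in instead. The label names, the neutral band of 0.05, and the sample sentences are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# Tiny stand-in word lists (hypothetical, for illustration only).
POSITIVE_WORDS = {"great", "love", "agree"}
NEGATIVE_WORDS = {"terrible", "hate", "wrong"}


def toy_scorer(text: str) -> float:
    """Stand-in sentiment scorer returning a value in [-1, 1]."""
    words = text.lower().split()
    hits = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    return max(-1.0, min(1.0, hits / 2))


def score_to_label(score: float, neutral_band: float = 0.05) -> str:
    """Map a score to a label; the width of the neutral band is an assumption."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def accuracy(samples: Sequence[Tuple[str, str]],
             score_text: Callable[[str], float]) -> float:
    """Fraction of hand-labelled samples whose predicted label matches."""
    if not samples:
        raise ValueError("need at least one labelled sample")
    correct = sum(score_to_label(score_text(text)) == label for text, label in samples)
    return correct / len(samples)


# Made-up labelled sample set.
labelled_samples = [
    ("I love this great point", "positive"),
    ("This is wrong and terrible", "negative"),
    ("See you tomorrow", "neutral"),
    ("I hate that I agree", "positive"),
]
sample_accuracy = accuracy(labelled_samples, toy_scorer)
```

The last sample shows how a simple scorer can miss the intended sentiment when positive and negative words cancel out, which is one reason accuracy on real replies can be modest.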


Observations
[Figure: Celebrities chosen for CAA]
[Figure: Celebrities chosen for Gender Pay Gap]
[Figure: Word Cloud for CAA]


Conclusions

Celebrity culture is quite prevalent in India. People aspire to be popular, rich and successful. But there is a dark side to popularity that we often ignore. Virat Kohli cannot be expected to talk only about cricket, and Deepika Padukone should be able to talk freely about mental health. The public trial that follows a celebrity’s post is what we explored here. Celebrity status comes with both pros and cons: the public may react very positively, or respond with trolling and harsh criticism. Our work simply explores whether such reactions depend on factors such as the gender and religion of the celebrity.


The results of the above study should not be used to derive strong conclusions of any kind. The main contribution of this work is a pipeline for studying such issues, in which we leverage machine learning-based prediction methods (which are themselves pre-trained on real-world data) and compare their results with what we actually see online. In our case, a high correlation between machine-predicted values and real-world data would also point to a bigger problem with large language models and the data they are trained on.


Acknowledgements

We would like to thank Professor PK and the TAs for their guidance, valuable inputs and suggestions.
