Top COPs: Analyzing Twitter-based Community-Oriented Policing & Engagement with Government Accounts
Introduction
Twitter is used by many people to reach government agencies and have their voices heard. It democratizes access to these agencies (for internet users) and also introduces an element of accountability, since a complaint is visible to other users on the platform.
However, not all tweets that tag a government agency get a reply. The question we wanted to answer was: how should one tweet about a grievance so that it gets a reply from a government organization? In other words, we wanted to see whether these government accounts follow some unwritten policies when replying to tweets. We studied this by analyzing emergent patterns in tweets that get replies versus those that do not.
Note that there has been some prior work on the interaction between the social media accounts of police agencies in India and civilians. That study, done by Niharika under Prof. PK, analyses the differences in emotion, reaction time, etc. between police replies and civilian posts. However, it was geared towards understanding patterns inherent to police-agency replies, not towards what kinds of tweets get replies.
Trends in Topics
To do this, we first collected Twitter data: 50,200 tweets that tagged 15 government Twitter accounts, posted between 5th February and 5th March. Of these, 11,100 tweets had a reply associated with them. We then recorded the frequency of occurrence of each English content word separately in replied and unreplied tweets. Taking the difference of the normalized exponentials (softmax) of the counts gives a differential score for each word, indicating how much more likely a tweet containing that word is to be replied to. Finally, these differential scores were plotted onto a semantic plane generated by dimensionally reducing English GloVe-300 vectors.
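The differential-score step can be sketched in a few lines of Python. This is a minimal illustration of the idea, not our actual pipeline; the toy tweets and the rescaling of counts before exponentiation (to avoid overflow on real count magnitudes) are our assumptions here.

```python
import math
from collections import Counter

def softmax_scores(counter):
    """Normalized exponential of word counts: exp(c_w) / sum(exp(c_v)).
    Counts are rescaled by the max count first to avoid overflow
    (an implementation assumption, not part of the write-up)."""
    max_c = max(counter.values())
    exps = {w: math.exp(c / max_c) for w, c in counter.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

def differential_scores(replied_tweets, unreplied_tweets):
    """Score each word as softmax(replied counts) - softmax(unreplied counts).
    Positive scores suggest the word is associated with getting a reply."""
    replied = Counter(w for t in replied_tweets for w in t.lower().split())
    unreplied = Counter(w for t in unreplied_tweets for w in t.lower().split())
    r_soft, u_soft = softmax_scores(replied), softmax_scores(unreplied)
    vocab = set(r_soft) | set(u_soft)
    return {w: r_soft.get(w, 0.0) - u_soft.get(w, 0.0) for w in vocab}

# Toy example with one tweet per class (hypothetical data):
scores = differential_scores(
    ["please fix the pothole near the station"],
    ["hey why is traffic so bad"],
)
```

Each word's score can then be used as the colour or size of its dot on the GloVe-based semantic plane.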
Through this, we find clusters of important and similar words in replied and unreplied tweets using a semantic-similarity model. The graph here is a flattened word-embedding plot: each dot represents a word, and the closeness between dots represents the similarity of those words. To view the interactive versions, please follow the HTML links here!
We divide our observations in two ways. First, we look at patterns within a particular city across all government agencies (in our case, Delhi and Mumbai). We then look at patterns within a type of agency across all cities (in our case, municipal corporations and police).
First, let us look at the observations from Delhi across all domains. Tweets about controversial topics tended not to get a reply; this comes from large clusters of words around terms like Hindu, Muslim, Jihad, and Terrorist. Informal speech also went unreplied: we saw a cluster of informal stop words like hey and ya, and Hindi words like bhi and na.
We also observed that cybercrime- and transportation-related tweets got relatively more replies. Finally, polite words occurred more frequently in tweets that got replies. This last observation is consistent with the previous study on this topic, which focused only on police agencies.
For tweets from Mumbai across domains, a similar trend is observed. Tweets about controversial topics get fewer replies, as do tweets with informal language or those merely expressing gratitude. We also observed that tweets that mentioned a location, as well as cybercrime-related tweets, got more replies. This suggests that being specific about a problem, rather than vague, improves the chance of getting a reply. Politeness followed the same trend in Mumbai as well.
For tweets to police organizations across cities, we saw that tweets with more non-English words received fewer replies, as did tweets with intensifying, non-informative adjectives. Tweets about less complex issues like fraud got more replies than those about rape and drugs. Politeness followed a similar trend, with polite tweets getting more replies. The complexity-of-issue finding was not covered by the earlier study, which had mentioned only non-controversial terms in its keyword analysis.
For tweets to municipal corporations across cities, we see a similar trend: tweets with more non-English words get fewer replies, and polite tweets get more. Tweets containing non-informative words, and tweets that merely expressed gratitude, received fewer replies than more informative ones that specified a location and time.
Trends in Emotions
We then used Empath (https://github.com/Ejhfast/empath-client), a tool for analyzing text across lexical categories (and for generating new categories), to mimic LIWC features using its built-in categories and sample words per category. This gives us a score for each tweet across various semantic categories. We applied this over the entire tweet set and generated two kinds of plots: semantic scores versus (1) days of the week and (2) times of the day. For each slot, we compute the ratio of the scores of replied (or unreplied) tweets to the total number of tweets made in that time slot. The warmer regions in the plots indicate topics that get replied to (or don't) during the corresponding time period. Below, we discuss the noticeable trends we observed.
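The per-tweet category scoring can be sketched in pure Python. Empath's `analyze()` does this over much larger, learned lexicons; the tiny category word lists below are illustrative stand-ins, not Empath's actual categories.

```python
# Minimal sketch of lexical-category scoring, in the spirit of Empath/LIWC.
# The category lexicons here are tiny illustrative assumptions.
CATEGORIES = {
    "anger": {"angry", "furious", "hate", "outrage"},
    "money": {"money", "cost", "fare", "refund", "price"},
    "hearing": {"noise", "loud", "sound", "music"},
}

def category_scores(tweet, categories=CATEGORIES):
    """Return, per category, the fraction of the tweet's tokens that fall
    in that category's lexicon (akin to Empath's normalize=True output)."""
    tokens = tweet.lower().split()
    if not tokens:
        return {cat: 0.0 for cat in categories}
    return {
        cat: sum(tok in words for tok in tokens) / len(tokens)
        for cat, words in categories.items()
    }

scores = category_scores("loud music and noise every night please help")
```

Averaging these per-tweet scores within each day-of-week or time-of-day slot, separately for replied and unreplied tweets, yields the heatmaps described above.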
We saw that death, anger, negative emotions and sexuality are consistently high across the week for unreplied tweets.
For unreplied tweets related to transport, discrepancy, future tense, and money use are high throughout the week, suggesting that queries about events that have not yet happened and cost-related topics do not get replies.
For municipality-related tweets, Hearing and Leisure both get a lot of replies, because of neighbourhood sound complaints: noise is Hearing-related and music is Leisure-related, in the context of events like late-night parties.
For replied tweets in Delhi, future tense, home, and discrepancy are talked about fairly equally throughout the day. Tweets related to anxiety increase as time passes and peak at night/early morning.
Trend in Response Time
We also measured the average response time of individual departments, mainly police and municipal, as well as the average across all of a city's government agencies. Among police departments, Mumbai responds the fastest while Delhi is the slowest. Among municipalities, BMC is the fastest, whereas Delhi's and Thane's are the slowest. For average reply time across a city's government agencies as a whole, Mumbai responds the fastest and Bengaluru the slowest.
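Given tweet and reply timestamps, the per-agency average response time is a simple aggregation. A minimal sketch follows; the `(agency, tweet_time, reply_time)` records and agency handles are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def average_response_times(records):
    """records: iterable of (agency, tweet_time, reply_time) tuples.
    Returns a dict mapping agency -> mean reply latency as a timedelta."""
    latencies = defaultdict(list)
    for agency, tweeted, replied in records:
        latencies[agency].append(replied - tweeted)
    return {
        agency: sum(deltas, timedelta()) / len(deltas)
        for agency, deltas in latencies.items()
    }

# Hypothetical records for illustration:
records = [
    ("MumbaiPolice", datetime(2021, 2, 5, 9, 0), datetime(2021, 2, 5, 9, 30)),
    ("MumbaiPolice", datetime(2021, 2, 5, 10, 0), datetime(2021, 2, 5, 11, 30)),
    ("DelhiPolice", datetime(2021, 2, 6, 9, 0), datetime(2021, 2, 6, 14, 0)),
]
avg = average_response_times(records)
```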
When to shoot your shot?
With the time-related data, we could also graph the probability of getting a reply given the time at which a tweet was made. For Delhi- and Mumbai-based agencies, there is a higher chance of a reply if a tweet is posted early in the morning or late at night, whereas for Hyderabad the probability is roughly equal at all times except around midday.
Conclusion
There are three benefits to this kind of study. First, as citizens, we learn the best practices for addressing government accounts on Twitter. Second, we get to see what problems people are posting about, and whether there are trends in that data that can help government agencies be better prepared: for example, the pattern of more noise complaints at night and anxiety-related complaints at night and early morning. Third, it helps hold government agencies accountable and uncover hidden biases. Through this study, we saw that certain topics related to anger, death, and sexuality did not get replies. Government agencies can use this data to see how to better address these deficits. While anger- and death-related topics are often hate speech that agencies will understandably not reply to, sexuality-related tweets include sexual violence, which should be addressed. Ours was a small-scale study, so we cannot claim our results are conclusive; however, we believe we have made a case for doing this kind of study at a larger scale, and regularly, to understand the troubles people face and to improve the quality and coverage of government outreach.
Team Members:
Mukund Choudhary (CLD | UG4)
Ishan Upadhyay (CLD | UG4)
KV Aditya Srivatsa (CLD | UG4)
Monil Gokani (CLD | UG4)
Rishav Kundu (CSD | UG4)