04 : User ratings and trust analysis for online articles
Introduction and Problem statement description
When individuals rate articles on online websites, is their behaviour influenced by the people they trust? To study this, we used a dataset that contains user ratings for different types of articles, in which every user has a trust circle and a distrust circle. A rating represents how highly a user rates a textual article, i.e. a review, written by another user. The trust circle of user X is the set of users that X trusts, and the distrust circle is the corresponding set of users that X distrusts. We want to analyse how the ratings given to a particular article by a user's trusted and distrusted circles relate to the user's own rating, and whether the user is influenced by them or not.
Datasets
The dataset has 132,000 users, who issued a total of 841,372 statements: 717,667 trust statements and 123,705 distrust statements. Around 85,000 article authors received at least one statement.
Our dataset consists of three tables:
Ratings table - Contains the ratings (from 0 to 5) that users gave to articles, along with some metadata such as the creation date and display_status.
Article table - Contains the details of each article, such as the Author Id, Content Id and Subject Id, where the Subject Id identifies the subject the article is about.
Trusted_Distrusted table - Contains the trust and distrust relations. The fields are User Id, Other User Id, Value (where 1 represents trust and -1 represents distrust) and the creation date.
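As an illustration of how the Trusted_Distrusted table can be turned into per-user circles, here is a minimal Python sketch. The rows, the tuple layout and the helper name build_circles are assumptions made for this example; they are not taken from the actual dataset files.

```python
from collections import defaultdict

# Hypothetical rows of the Trusted_Distrusted table:
# (user_id, other_user_id, value), where value is 1 (trust) or -1 (distrust).
trust_rows = [
    (1, 2, 1),
    (1, 3, 1),
    (1, 4, -1),
    (2, 3, 1),
    (3, 1, -1),
]


def build_circles(rows):
    """Return (trusted, distrusted): dicts mapping a user to a set of users."""
    trusted = defaultdict(set)
    distrusted = defaultdict(set)
    for user_id, other_user_id, value in rows:
        if value == 1:
            trusted[user_id].add(other_user_id)
        elif value == -1:
            distrusted[user_id].add(other_user_id)
    return trusted, distrusted


trusted, distrusted = build_circles(trust_rows)
print(sorted(trusted[1]))     # [2, 3]
print(sorted(distrusted[1]))  # [4]
```

Sets are used so that checking whether one user is in another user's circle is a constant-time lookup.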
The distribution of the overall ratings for the entire dataset is shown below.
Research Questions
We will explore the concepts of collaborative filtering and echo chambers, i.e. whether people are influenced by other people’s opinions, and whether a user’s trusted circle influences his/her judgement of whether something is good or bad.
- Firstly, we will correlate users’ ratings with the ratings of their trusted circle, as well as their distrusted circle. We hope to find patterns in the dataset that indicate whether users are in fact influenced by the people they trust.
- Secondly, we will analyse whether trust building is influenced by the people the users trust, i.e., whether people trust people trusted by their own trusted circle, and similarly for their distrusted circle.
- Thirdly, we will analyse whether the users’ perception of an article in binary terms, i.e. whether the article is good or bad, is influenced by the people they trust or distrust.
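The first question needs, for every rating, the average rating that the user's circle gave to the same article. A minimal sketch of that step follows; the dictionaries ratings and trusted and the helper circle_mean are hypothetical stand-ins for the real tables, and the values are made up.

```python
from statistics import mean

# Hypothetical inputs: ratings[(user_id, article_id)] = rating on the 0-5 scale,
# and trusted[user_id] = set of users that this user trusts.
ratings = {
    (1, "a"): 4, (2, "a"): 5, (3, "a"): 3,
    (1, "b"): 2, (2, "b"): 1,
}
trusted = {1: {2, 3}, 2: {3}}


def circle_mean(user_id, article_id, circle, ratings):
    """Mean rating that members of the user's circle gave the article, or None."""
    scores = [
        ratings[(member, article_id)]
        for member in circle.get(user_id, ())
        if (member, article_id) in ratings
    ]
    return mean(scores) if scores else None


# One row per rating for which at least one circle member rated the same article.
pairs = []
for (user_id, article_id), rating in ratings.items():
    circle_rating = circle_mean(user_id, article_id, trusted, ratings)
    if circle_rating is not None:
        pairs.append((user_id, article_id, rating, circle_rating))

print(len(pairs))  # 3
```

The same function can be called with a distrust dictionary to obtain the distrusted circle's mean rating.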
Results
Correlation Matrix
The correlation matrix shows a positive correlation between a user's rating and the ratings of the user's trusted circle, and a negative correlation between a user's rating and the ratings of the user's distrusted circle. Surprisingly, the matrix also shows a slightly positive correlation between the trusted circle's ratings and the distrusted circle's ratings.
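For readers who want to reproduce this kind of matrix, the sketch below computes pairwise Pearson correlations between three columns. It assumes Pearson correlation, which the write-up does not state explicitly, and the column values are made-up numbers chosen only to illustrate the signs, not our actual results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical columns, one entry per (user, article) pair.
columns = {
    "user": [4, 5, 2, 3, 1],
    "trusted_mean": [4.0, 4.5, 2.5, 3.0, 2.0],
    "distrusted_mean": [2.0, 1.5, 4.0, 3.0, 3.5],
}

# Nested dict: matrix[a][b] is the correlation between columns a and b.
matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in columns}
    for a in columns
}

for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

The matrix is symmetric with ones on the diagonal, so only the three off-diagonal values carry information.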
Triadic Closure
Triadic closure represents transitivity in a network: if User A trusts User B, and User B trusts User C, then User A may also trust User C. We checked whether such transitivity exists within the trusted circles of users. For this purpose we took 1000 random users and formed a connectivity graph among them. Among these 1000 users we found 129 such triadic closures, so there is some transitivity in the trusted user dataset.
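One possible way to count such closures on a directed trust graph is sketched below. The graph, the function name count_transitive_triads and the exact counting rule (ordered paths A -> B -> C with A, B, C distinct) are assumptions for illustration and may differ from the procedure used for the numbers above.

```python
def count_transitive_triads(trusted):
    """Count paths A -> B -> C and how many of them are closed by A -> C.

    trusted maps each user to the set of users they trust.
    Returns (closed, total).
    """
    closed = 0
    total = 0
    for a, a_trusts in trusted.items():
        for b in a_trusts:
            for c in trusted.get(b, ()):
                if c == a or c == b:
                    continue  # skip paths that loop back on themselves
                total += 1
                if c in a_trusts:
                    closed += 1
    return closed, total


# Hypothetical trust graph.
trusted = {
    "A": {"B", "C"},
    "B": {"C", "D"},
    "C": {"A"},
}

closed, total = count_transitive_triads(trusted)
print(closed, total)  # 1 4
```

To work on a sample of users, the dictionary can first be restricted to the sampled user ids (for example ids chosen with random.sample) before calling the function.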
Applications
- Estimate the actual rating of an article: If a large number of an article's raters belong to one trusted circle, the article's rating will be influenced by that circle. To mitigate this, we can give less weight to the ratings given by that circle. This can also help to counter fake reviews, which are mostly posted by a group of people to inflate a rating.
- Targeted marketing: If a product is bought or liked by most of a user's trusted circle, then through word-of-mouth marketing it is highly likely that the user will also buy or like that product. We can find the trusted circle of a user from mobile phone contacts or from friend circles on Facebook, Instagram, etc., and then do targeted marketing, which can help reduce marketing cost.
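The down-weighting idea from the first application could be sketched as a weighted average. The function adjusted_rating, the weight of 0.5 and the sample ratings are all hypothetical choices for illustration; we have not tuned or evaluated such a weight.

```python
def adjusted_rating(article_ratings, circle, circle_weight=0.5):
    """Weighted mean rating where raters inside `circle` count less.

    article_ratings maps user_id -> rating for one article.
    circle is the set of user ids whose ratings should be down-weighted.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for user_id, rating in article_ratings.items():
        weight = circle_weight if user_id in circle else 1.0
        weighted_sum += weight * rating
        weight_total += weight
    if weight_total == 0:
        raise ValueError("no ratings with non-zero weight")
    return weighted_sum / weight_total


# Hypothetical article: users 1-3 belong to one trusted circle and all rate 5.
article_ratings = {1: 5, 2: 5, 3: 5, 4: 2, 5: 3}
circle = {1, 2, 3}

plain = adjusted_rating(article_ratings, set())   # unweighted mean
adjusted = adjusted_rating(article_ratings, circle)
print(plain, round(adjusted, 2))  # 4.0 3.57
```

With these made-up numbers the circle's three identical ratings pull the plain mean up to 4.0, and halving their weight lowers the estimate to about 3.57.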
Conclusions
We determined the correlation between a user's rating and the ratings given by their trusted and distrusted circles. We found that:
- User rating is positively correlated with trusted circle rating.
- User rating is negatively correlated with distrusted circle rating.
- Trusted circle rating is slightly positively correlated with distrusted circle rating.
Also, people do get influenced by their trusted and distrusted circles. A user’s circles influence the perception they form about the articles they rate, leading to biased ratings.
The dataset shows some amount of transitivity within the trusted circles.
Acknowledgement
Team Members
Akshat Gupta (2020201050)
Aditya Rathi (2020201041)
Nisarg Sheth (2020201049)
Pujan Ghelani (2020201083)
Aviral Sharma (2020201062)
Ankit Parashar (2020201043)