Here’s an interesting piece of research on how the way people use social media can predict their age, gender, and even socioeconomic status.
This concept is not new to linguists, but now computer scientists have gone a step further, linking the online behaviour of more than 5,000 Twitter users to their income bracket. They published their results in the journal PLOS ONE.
For this experiment, the researchers started with Twitter users’ self-described occupations. Using a UK-based hierarchy of nine socioeconomic groups defined by occupation, they determined the average income for each group, then sought a representative sample of users from each one. They then used a statistical natural language processing algorithm to identify the words distinctive to each group and to test whether those words could predict a user’s socioeconomic group from the language of their tweets.
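To make the idea concrete, here is a minimal sketch of that kind of analysis. It is not the researchers’ method: the article does not name the algorithm, so this swaps in a simple naive Bayes word model as a stand-in. The tweets, the two group labels ("higher" and "lower"), and all function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy, invented data: (group label, tweet text). Not taken from the study.
TWEETS = [
    ("higher", "new report on market policy published today"),
    ("higher", "speaking at the policy conference today"),
    ("higher", "our quarterly report is published"),
    ("lower", "lol cant wait to see you tonight"),
    ("lower", "see you tonight mate lol"),
    ("lower", "cant wait for the match tonight"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train(tweets):
    """Count words per group and tweets per group, and collect the vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for group, text in tweets:
        doc_counts[group] += 1
        word_counts[group].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab


def log_prob(word, group, word_counts, vocab):
    """Log probability of a word within a group, with add-one smoothing."""
    total = sum(word_counts[group].values())
    return math.log((word_counts[group][word] + 1) / (total + len(vocab)))


def distinctive_words(group, word_counts, vocab, top_n=3):
    """Rank words by how much likelier they are in `group` than in any other group."""
    others = [g for g in word_counts if g != group]

    def score(word):
        rest = max(log_prob(word, g, word_counts, vocab) for g in others)
        return log_prob(word, group, word_counts, vocab) - rest

    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(vocab, key=lambda w: (-score(w), w))[:top_n]


def predict(text, word_counts, doc_counts, vocab):
    """Return the group with the highest naive Bayes score for the text."""
    n_docs = sum(doc_counts.values())
    best_group, best_score = None, -math.inf
    for group in sorted(doc_counts):
        score = math.log(doc_counts[group] / n_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += log_prob(word, group, word_counts, vocab)
        if score > best_score:
            best_group, best_score = group, score
    return best_group


if __name__ == "__main__":
    word_counts, doc_counts, vocab = train(TWEETS)
    print(distinctive_words("lower", word_counts, vocab))
    print(predict("policy report today", word_counts, doc_counts, vocab))
```

A real study would need thousands of users, a held-out test set, and a much richer model; the sketch only shows how word frequencies per group can be turned into both a list of distinctive words and a prediction.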
Some of the results validated what is already known, such as that a person’s words can reveal age and gender, and that these are tied to income. But there were also some surprises. The research found that those who earn more tend to express more fear and anger on Twitter, while perceived optimists have a lower mean income. Lower-income users use Twitter to communicate with each other, while higher-income users are more likely to use it to disseminate news or for professional, rather than personal, reasons.
The study was led by Daniel Preotiuc-Pietro, a post-doctoral researcher at the University of Pennsylvania’s Positive Psychology Center in the School of Arts & Sciences, collaborating with Svitlana Volkova of Johns Hopkins University, Vasileios Lampos and Nikolaos Aletras of University College London, and Yoram Bachrach of Microsoft Research.