Linguistic Shortcomings of Social Media Monitoring – notes.

It is difficult to get an accurate reading on how commonly a word is used in a given society. In fact, measuring word frequency in a fully objective way is impossible: the results will always be affected by the size of the corpus and the choice of texts entered into it. On a global scale, where words take on subtle new meanings as they are appropriated into the semiotic structure of the actor and thereby changed, the problem becomes even more obvious. Frequency means little without cultural context.
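
A minimal sketch of the corpus-dependence point, assuming two invented toy corpora (the names forum_posts and clinic_notes and their contents are hypothetical, not real data). It normalizes counts to occurrences per million tokens and shows that the same word gets a different rate depending on which texts were collected, while the number itself says nothing about which sense of the word is in play.

```python
import re
from collections import Counter


def per_million(word, text):
    """Occurrences of `word` per million tokens of `text`."""
    # Very rough tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens) * 1_000_000


# Two invented toy corpora (assumptions for illustration only).
forum_posts = "this phone is sick the camera is sick battery is fine"
clinic_notes = "the patient felt sick after lunch and stayed home"

forum_rate = per_million("sick", forum_posts)
clinic_rate = per_million("sick", clinic_notes)

# The rates differ because the corpora differ in size and selection;
# neither number reveals that "sick" is praise in one and illness in the other.
print(round(forum_rate), round(clinic_rate))
```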

This is not to say that frequency isn’t important. It is important and revealing. But frequencies are only broadly indicative of cultural salience, and they can be used only as one among many sources of information about a society’s cultural preoccupations. Measurements tell only part of the story. When they are decontextualized, or assigned meanings that reflect the assumptions of whoever developed the sentiment algorithm, they give a potentially false understanding. To be correctly interpreted, figures have to be considered in the context of an in-depth analysis of meanings.

If four thousand people call a product “shitty,” it is fair to say that four thousand people reacted negatively to it. But that measurement can’t tell us about the culture of those people. Are they engineers addressing it from a technological angle? Are they Venezuelan students reacting to a larger political issue? We assume that a word can be easily placed on a linear scale (negative to positive, and so on), but this isn’t necessarily the case. Words can be studied as focal points around which cultural domains are organized. By exploring these focal points in depth, we may be able to show the general organizing principles that lend structure and coherence to a cultural domain as a whole, and that often have explanatory power extending across multiple domains.
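
To make the limitation concrete, here is a rough sketch of a word-list sentiment scorer. It assumes a hypothetical two-entry polarity lexicon and two invented posts; it is not any monitoring vendor's actual method. Both posts receive the same score even though one is a technical complaint and the other a political one.

```python
# Hypothetical polarity lexicon: an assumption for illustration only.
LEXICON = {"shitty": -1, "great": 1}


def polarity(post):
    """Sum the lexicon scores of the words in `post`; unknown words score 0."""
    words = post.lower().split()
    return sum(LEXICON.get(w.strip('.,!?"'), 0) for w in words)


# Two invented posts from very different contexts.
engineer = "The firmware update is shitty, it drops Bluetooth every hour."
student = "Shitty that this brand still sponsors the government broadcast."

# Identical scores: the scale registers negativity but not what it is about.
print(polarity(engineer), polarity(student))
```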

In a sense it is true that words have no “fixed” meanings, because the meanings of words change. But if they were always fluid and without any “true” content, they could not be said to change either, since change presupposes something that changes. Words do have identifiable, “true” meanings, the precise outlines of which can be established on an empirical basis by studying their range of use and articulating the contexts that subtly repurpose them. The key point is that social media monitoring today does not account for semantic deviation, or for language as fundamentally tied to speech and discourse.
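
One simple way to start studying a word's range of use is a keyword-in-context listing, which shows each occurrence with a few words on either side. The sketch below assumes whitespace tokenization and two invented posts; the function name kwic and the window parameter are illustrative choices, not an established API.

```python
def kwic(word, texts, window=3):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for text in texts:
        tokens = text.lower().split()
        for i, tok in enumerate(tokens):
            # Compare with surrounding punctuation stripped from the token.
            if tok.strip('.,!?"') == word:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((left, word, right))
    return hits


# Invented posts (assumptions for illustration only).
posts = [
    "That new album is sick, play it again.",
    "I was sick all weekend and missed the show.",
]

for left, key, right in kwic("sick", posts):
    print(f"{left:>25} [{key}] {right}")
```

Reading the contexts side by side is what lets an analyst separate the senses; the listing itself makes no judgment about meaning.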



