Situated Data Analysis
Jill Walker Rettberg proposes situated data analysis as a new method for analysing how personal data is always constructed and represented in specific contexts.
Rettberg, Jill Walker. “Situated Data Analysis: A New Method for Analysing Encoded Power Relationships in Social Media Platforms and Apps.” Humanities and Social Sciences Communications 7, no. 1 (December 2020): 5. https://doi.org/10.1057/s41599-020-0495-3.
Donna Haraway coined the term “situated knowledges” in 1988 to demonstrate that knowledge can never be objective and that it is impossible to see the world (or anything) from a neutral, external standpoint. Haraway calls this fiction of objectivity “the god trick,” a fantasy that omniscience is possible.
With Facebook, Google Earth, smart homes and smartphones, vastly more data is collected about us and our behaviour than when Haraway wrote about situated knowledge. The god trick as it occurs when big data are involved has been given many names by researchers of digital media: Anthony McCosker and Rowan Wilken write about the fantasy of “total knowledge” in data visualisations; José van Dijck warns against an uncritical, almost religious “dataism”, a belief that human behaviour can be quantified; and Lisa Gitelman points out that “raw data” is an oxymoron in her anthology on the digital humanities. There is also an extensive body of work on algorithmic bias, analysing how machine learning using immense datasets is not objective but reinforces biases present in the datasets and inherent in the code itself (there are many references to this in the paper itself).
Situated data analysis provides us with a method for analysing how data is always situated, always partial and biased. The paper uses Strava as an example, but the method is equally applicable to other data used in digital platforms. Let’s consider selfies.
When you take a selfie and upload it to Facebook or Instagram, you’re creating a representation of yourself. You’ve probably taken selfies before, and you’ve probably learnt which camera angles and facial expressions tend to look best and get you the most likes. Maybe you’re more likely to get likes if you post a selfie of yourself with your friends, or in front of a tourist attraction, or wearing makeup – and probably what works best depends on your particular community or group of friends. You’ve internalised certain disciplinary norms that are reinforced by the likes and comments you get on your selfies. So at this level, there’s a disciplinary power of sorts. This is the first layer, where your selfie is situated as a representation of yourself that you share with friends or a broader audience.
Facebook, Instagram and other platforms will, of course, show your selfie to other people for their own purposes. Their goal is to earn money from advertisers by showing people content that will make them spend more time on the platform, and also by gathering personal data about users and their behaviour. Here your selfie is situated differently, as a piece of content shown to other users. There is an environmental power at work here – the environment (in this case the social media feed) is being altered in order to change people’s behaviour – for instance, I might pause to look at your selfie and then happen to notice the ad next to it.
A third level at which the data of your selfie is situated happens when a company like Clearview illegally scrapes your selfie, along with three million other selfies, and uses them as a dataset for training facial recognition algorithms. Next time you are at a protest or rally, the police or an extremist group might use Clearview to identify you. Perhaps in the future you’ll be banned from entering a shop or a concert or a country because your face was identified as being at that protest. Maybe a nightclub or a university will have a facial recognition gate and will only let in people without a history of attending protests. Obviously this was not something you had in mind when you uploaded that selfie – but it’s an example of how placing data in new contexts can make a huge difference to what the data means and what it can be used for. A facial recognition gate that refused entry to people would also be a kind of environmental power – the environment is simply changed so you can no longer act in certain ways.
Situated data analysis is a flexible model for understanding data in context, and Rettberg plans to continue to develop the framework, particularly in connection with machine vision technologies and with algorithmic bias.