Image: 2012 Ted Goff

I think we need to say upfront that all data is *much* greater than what people call big data. As someone who works with data in the development space, I believe in the use of all data.

That’s why it’s imperative to understand “all data > (is greater than) big data”.

Increasingly, conversations about the possibility of big data playing an emerging role in solving development challenges have me, like a few others, unsettled. Just because it has the term “big” in front of “data” does not mean it’s “all” data. Many definitions of big data exist. See here 40 such definitions. The common theme between them, for me, is that this data is ‘analyzable’ by machines to derive meaning from it.

This is problematic, for a few reasons:

1. Firstly, there are still significant amounts of data that are not digital. Sure, data is increasingly being digitized, but that does not mean all data is digitized or in an analyzable format today. Therefore, whatever constitutes the universe of ‘big data’ is a subset of ‘all data’.

2. Secondly, the digital data being created every second does not represent “all” and definitely not “us”. So any analysis that results in a public policy application will definitely not be reflective of the “us”. This is captured well by Nick Couldry in A necessary disenchantment: myth, agency and injustice in a digital world:

A new myth about the collectivities we form when we use platforms such as Facebook. An emerging myth of natural collectivity that is particularly seductive, because here traditional media institutions seem to drop out altogether from the picture: the story is focused entirely on what ‘we’ do naturally, when we have the chance to keep in touch with each other, as of course we want to do.

3. Thirdly, it’s a myth that big data is generating entirely new and better forms of knowledge which will help solve development issues. This is most problematic in the field of public policy. As Nick Couldry puts it:

“analysts are giving up on specific hypotheses and instead focussing on generating, through countless parallel calculations, ‘a really good proxy’ for whatever is associated with a phenomenon, and then relying on that as the predictor.”

The implication of development policy-making based on ‘a really good proxy’ sends nervous shivers down my spine.

4. Lastly, to me, there is a power differential at play in what the ‘data’ in big data represents. Whose data it is (digital haves vs. digital have-nots), who analyzes it (digitally savvy haves vs. digitally savvy have-nots) and how it’s analyzed are all subject to the biases and power relations that exist in the real world ‘we’ inhabit.

These reasons are important to remember as we invest time, energy and money in making arguments about ‘big data’ in development discourse. Some challenges may be met with timely, relevant data analysis. But development challenges are not always wanting for faster analysis; many are the result of long-standing socio-economic-political power struggles, which no analysis, however fast and timely, will solve on its own.

Ad series for UN Women by Memac Ogilvy & Mather Dubai


In my previous posts here and here I have lamented how mainstream advertising agencies most often perpetuate stereotypes when portraying women. So, lo and behold, when I saw a mainstream advertising firm taking up the issue of stereotyping I was excited, but unfortunately that excitement was short-lived. This recent campaign by UN Women, whose use of Google’s instant search box “reveals widespread sexism”, has stirred many an emotion. Reactions range from disbelief to confirmation of what people knew already.

But is Google telling the truth? Google Instant is an “autocomplete” feature which, according to their site, is “a search enhancement that shows results as you type.”

It is based on the company’s careful mining of hundreds of billions of searches performed each month. For anyone who wants to know how Google describes it, here’s the official blog post explaining how it works, published at its launch. It is important to understand how this works before we use its results and say it’s telling the truth. Perhaps its seamlessness in our daily searching imparts the impression that it’s the truth. Google most often seems to know what I am looking for when I start typing in the search box, hence my conclusion is that it knows what I want, and when what it throws up as a result is exactly what I wanted, I begin to feel that it is telling me “the truth”. I am exaggerating the chain of thought, but in a sense the UN Women campaign rests on this principle: what Google search shows is the “truth”.

The campaign, though powerful, misses an important consideration: Google Instant is based on an algorithm that uses factors such as your location, your search history, your language, your timing, your browsing history and the “freshness” of any topic in determining what text to complete your sentence with. Here’s a dated but useful article that shows how it works in detail. All these factors, including Google’s policy on what it censors, influence what shows up.

From a campaign point of view it’s capitalizing on one factor: shock. Not just the shock of how deeply rooted sexism is, but the shock of how seamlessly the Google Instant search box is integrated into our lives. That makes for a good campaign, but it also wrongly relies on the public perception that there is a direct, linear relationship between what Google autocomplete throws up and what we all really think (i.e. the truth). The bigger problem for me is not what this campaign is subtly but definitely exploiting, but rather the ability of a company to use its algorithmic prowess to feed us what we should believe to be the truth. The great danger is that we get stuck in a loop in which incorrect information is reinforced in all future searches. One way out, at least for me, was to turn autocomplete off just by changing one setting. I recommend trying it; then perhaps we will really search for what we wanted rather than be sucked into believing that what is being fed to us is what we were looking for.

Let this post not confuse you into thinking I am saying sexism does not exist; let it not be misread for one minute as denying that we are surrounded by injustice towards women. That injustice is the truth. But this post is not about that, and if you did not understand that, well, then this post was lost on you.
