Love Your Data. Can I have some context with that?

You know what is sexy? Presentations where the data and algorithms presented by researchers come with a healthy dose of real-life context. [Also, other researchers who read applied statistics textbooks in coffee shops early in the morning. I have been doing this a lot recently and just made friends with someone who was reading a different book by the same statistician I was reading.]

I constantly complain that we lose a lot of information when we work with big data analytics. Part of it is that many researchers are encouraged to work with data from their desks in offices tucked away inside of universities or office buildings in major cities, far away from the ecosystems they are trying to describe through numbers and algorithms.

Nate Silver spends a lot of time talking about the weakness of prediction models in his book The Signal and the Noise: Why So Many Predictions Fail - but Some Don’t. He points out that economists have trouble identifying relevant variables to make predictions. This is fair… economies are constantly changing in structure and dynamics. It would be really hard to collect appropriate data on the formal economy as it shifts, and even harder to keep track of informal economic activity in a way that would lend itself well to predicting future output.

I’ve found the only way that I truly understand the pulse of an economic ecosystem is by living and breathing the structure and community of it. After all, economies depend on communities and trust for transactions to take place at all. But this is for another post.

But I did find someone trying to add context to big data!

I watched this talk by Anna Rosling Rönnlund from TEDxStockholm yesterday, and while the introduction is a little confusing, the core of the talk is important. The best way to watch this talk, in my opinion, is to consider the implications of using photographs to describe the spread of the distribution.

In non-jargon speak, this means: consider how your perspective on wealth disparity changes when you see how people in the richest 25%, the middle, and the lowest 25% brush their teeth. This hits home a lot harder than quoting per capita numbers at someone would, because it also takes into account differences in pricing and living costs within the country. We can see where wages fall short and what that means in the day-to-day life of workers around the world. We gain perspective on data. And that’s sexy.
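If you want to try the same three-way split on your own numbers, here is a minimal sketch in Python. To be clear, this is my own illustration and not anything from the talk: the function name income_groups and the sample incomes are made up, and I am assuming the "inclusive" quartile method from Python's standard statistics module (Python 3.8 or later), which is only one of several reasonable ways to draw the cut-offs.

```python
from statistics import quantiles


def income_groups(incomes):
    """Split incomes into the lowest 25%, the middle, and the richest 25%.

    Hypothetical helper for illustration; the cut-offs are the first and
    third quartiles of the incomes passed in.
    """
    q1, _median, q3 = quantiles(incomes, n=4, method="inclusive")
    groups = {"lowest 25%": [], "middle": [], "richest 25%": []}
    for income in incomes:
        if income <= q1:
            groups["lowest 25%"].append(income)
        elif income >= q3:
            groups["richest 25%"].append(income)
        else:
            groups["middle"].append(income)
    return groups


# Made-up monthly household incomes, for illustration only.
sample = [40, 55, 80, 120, 200, 350, 600, 1500]
for name, members in income_groups(sample).items():
    print(name, members)
```

The numbers alone tell you which group a household falls in. The photographs are what tell you what life in that group looks like.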


Violence and Evaluation: Why It Matters To Document Progress

My preferred field of research is informal economies. Often this means that information is very limited and that existing data sets can be misleading, poorly cleaned, or simply incomplete. Unfortunately, a lot of the existing research is based on anecdotal evidence. I can prove some of the theories that I work with… after hours of compiling data from individual sources into my own data sets. Or going into the field and painstakingly collecting it myself.

I find that working with non-profits, especially those interested in reducing violence, presents similar challenges. The groups I work with and think about often devote their resources directly to the issues they are trying to address rather than to documentation, which might make sense in the short term… but then we run into issues where we can’t scale solutions or improve development models, because there was never a system to document progress before and after a program was implemented, or to measure the impact that program had on its specific target groups over time.

What do I mean by this? Look at Ciudad Juárez. The documented homicide rate has decreased significantly since 2010, there has been a ton of investment in local social programs, the military left the policing programs to local police forces… but what worked? Many things happened at once. Which social programs were most effective and why? How do all of these changes in the local fabric of the city interact with one another? What failed? And what were the negative side effects of these changes? What are we not seeing in these new numbers? How do we evaluate “positive change”?

It’s nice that sometimes there is enough clear data from different accounts that we can draw some conclusions after the fact. Sometimes, we receive anecdotes that offer enough context that we can compare data from one story to data from another. Compiling data from anecdotes and interviews is an extremely slow process, but it is possible.

I would love to see groups in all spheres of development, violence reduction, public investment, etc. trained to document their findings better and to make these records public. That would, of course, require them to disclose when their programs were not working… which is a public branding problem for non-profits, but would, overall, help us find better programs that really can scale to bring positive change.

A girl can dream.