The Unbearable Bigness Of Data

(And What We Should Be Doing About It)


Welcome to the Data Deluge.  

By now you’ve probably gotten sick of hearing about big data, little data, fat data, thin data and all manner of data. You’ve gotten your head around Terabytes, Exabytes and Zettabytes. You’ve noted that the price of data has crashed by 90% over the past few years on a per unit basis. Your CIO has mastered Hadoop and MongoDB and you understand the benefits of data lakes over, say, data puddles. The scary part of all of this is that we’re still in the early days of the data deluge. We are hurtling into a quantified universe fed by smart cities, homes and cars; platform-driven models and clickstream-driven relationships. In fact, I was having coffee this morning with the well-travelled, well-informed, and always insightful John McCarthy from Forrester, and we were positing that a few years from now, data will take over from ‘Digital’ as the centrepiece of organisational transformation and focus across the world.

Right now, though, we’re caught in a deluge with no real clarity about how we’re going to actually use all the data that’s floating around. And here are three key challenges we’re going to have to deal with:

What, not Why – A New Mindset

A question I often ask my colleagues who are experts in data sciences is as follows: let’s suppose that when it rains, people drink more cappuccinos. Now, if Starbucks knew this, it could advertise or promote cappuccinos every time it rained. It could even launch branded umbrellas. But how would it discover this? Historically, the story would be one of a smart store manager who one day realises that rainy days increase his cappuccino sales, and having defined the premise, starts to collect the data to validate his hypothesis. Or even more traditionally, Costa Coffee runs focus groups, and the link between weather and coffee preferences is established. Critically, a qualitative hypothesis would be at the front of the process and data collection would follow. Because how else would we know if it’s the rainfall or the pollen count or indeed the volume of traffic on the roads that we should be correlating coffee sales with?

In the new world of data, or ‘big data’, this works the other way around. A brand like Caffe Nero could take all its sales data across the world and run hundreds or thousands of analyses, searching for correlations with any number of external and easily accessible data sources. These include the obvious ones, such as weather or transport, but also, for example, day of the week or month, time of day, train and bus schedules, and sales in other retail stores. The list is limited only by your creativity and the availability of data.
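To make the idea concrete, here is a minimal sketch in Python of what such a correlation scan might look like. Everything in it is an assumption made for illustration: the data is synthetic, and the series names (rainfall_mm, pollen_count, road_traffic) and the helper scan_correlations are hypothetical, not taken from any real retailer's systems. It simply scores each candidate series against sales and ranks the results, with no hypothesis stated up front.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def scan_correlations(sales, candidates):
    """Score every candidate series against sales, strongest link first."""
    scores = {name: pearson(sales, series) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic data, invented for illustration only: one year of daily values.
rng = random.Random(42)
days = 365
rainfall_mm = [max(0.0, rng.gauss(3, 4)) for _ in range(days)]
pollen_count = [rng.uniform(0, 100) for _ in range(days)]
road_traffic = [rng.gauss(1000, 150) for _ in range(days)]

# Assumption baked into the toy data: sales rise with rainfall, plus noise.
cappuccino_sales = [200 + 8 * rain + rng.gauss(0, 10) for rain in rainfall_mm]

ranking = scan_correlations(
    cappuccino_sales,
    {
        "rainfall_mm": rainfall_mm,
        "pollen_count": pollen_count,
        "road_traffic": road_traffic,
    },
)

for name, r in ranking:
    print(f"{name:>14}: r = {r:+.2f}")
```

In this toy setup rainfall comes out on top because the link was built into the data. With real data and thousands of candidate series, some strong-looking correlations will appear purely by chance, so anything the scan surfaces would still need checking against data it has not yet seen.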

But most fundamentally, this is a shift from why to what. As Viktor Mayer-Schönberger and Kenneth Cukier highlight in their book Big Data, in this new world we find the correlation first and then the hypothesis. And we actually don’t care why. Let’s suppose we discovered that coffee consumption actually varied with the tides. We would need to verify that this was not simply a spurious correlation, but from there on, we could go straight to predictability and dispense with causality, or the ‘why’ question. This is a mind-shift for those of us who are used to a ‘scientific’ mentality, which requires us to establish causality before any approach can rise beyond heuristics into a scaled and logical argument.

The Crown Jewels?

If you haven’t read Adrian Slywotzky’s great book on Value Migration, this would be a great time to start. The book talks through how value migrates from older to newer business models, from one segment to another, or even from one firm to another.

We are going to see significant value moving to those companies in each industry that get the value of data, be it healthcare, education, automobiles or even heavy industry. The winner could be an incumbent such as GE, with its smart engines and its Predix platform, a challenger such as Amazon in retail, or an upstart such as 23andMe.

The question you want to be asking yourself is: in your industry and in your firm, what are some of the areas of opportunity where you can create new platforms to data-enable processes, or to deliver value to customers? How can you converge the primary and ancillary meaning in your data onto areas of your competitive strategy? You may also want to perform an audit of what data you might be giving away, perhaps because you feel that it’s not core to your business, or because another player in the industry has historically been collecting this data. Think of Experian and credit scores, for example. Ask yourself: are you merely giving away data that you don’t use, or are you handing over the source of competitive differentiation in your industry? Remember the story about IBM, Microsoft and Intel? I argued this point in my post about Uber and taxi companies, too.

To underscore the earlier point, I believe that value will increasingly migrate, in each industry, to those who best manage their data, and who build strategic and competitive alignment with their data strategies and/or with new offerings based on the data and its meaning.

Adding Love To Data

A couple of years ago, at the annual FT Innovate conference, a lively round table discussion followed a presentation about data and analysis by a well-known retail CEO. The presentation covered examples of analysing customers to great and occasionally worrying insight, within the industry. From knowing if a woman is pregnant even before she knows it herself, to people having affairs, or stacking beer and nappies together, in front of the stores, all of this can today be deduced from data itself. The debates afterward spilled over into lunch and led to the insight that while there’s been a lot of talk about analysing customers, it misses the point of empathy.

Let’s remind ourselves though – the customer does not want to be analysed. As with any relationship, he or she wants to be loved, cherished, understood and served better. At the end of the day, for most businesses, this again translates to a mind-shift: adding a layer of human understanding to data, creatively and emotionally assessing the customers’ needs, and allowing the analytics to feed off the empathy and emotional connection, rather than be driven purely by the algorithm.

In Sum:

You will hear a whole lot more about data in the coming weeks and months. However, for starters, you could keep these 3 guidelines in mind:

  • Look for correlations, not causality. You want to throw tons of data together and find patterns that aren’t born of some logical causal hypothesis but are simply correlations observed at the data level.
  • Be aware that the future of your industry, just like any industry, will involve the value of data. So try to identify and own areas of data which help you drive competitive advantage and/or new products and services, and start building proofs of concept.
  • Add love to data. Don’t just analyse your customers. Bring observation and empathy to the table as well, and marry the analytics with the empathy for best results.

What are your lessons from working with big, small and tiny data so far?

