Proof, if it were needed, of the degree of hype surrounding so-called 'big data' this year came in the form of John McDuling's analysis in Quartz of transcripts from investor conference calls and presentations at more than 5,000 companies. 'Big data' was mentioned during 841 separate calls and presentations this year, up 43% from the 589 in which it featured last year. That is a much larger increase than for other corporate buzzwords such as 'the cloud'.
I liked the point that Christopher Mims made on Quartz a few months back that what a lot of companies call 'big data' really isn’t that big. The term, he says, is usually used by consultants and IT companies that want to sell big pieces of kit (and consultancy). The hallmark of big data is the kind of server cluster that can crunch huge quantities of data concurrently, and yet the kinds of problems solved by engineers at even the most data-hungry firms can often be handled by a single server or even a PC (Mims references this paper from Microsoft Research, appropriately called 'Nobody ever got fired for buying a cluster'). The danger of a disproportionate focus on the scale of data is inappropriate application of technology, unnecessary complexity, and avoidable confusion. Ironically, a potentially far more significant and pressing issue than technology is the current and looming shortage of data analysts.