Tuesday, November 1, 2011

Small Data and Analytics

We hear a lot these days about "big data":  petabyte-sized databases, billions of records, in-memory data processing.  I was at a conference recently where the speaker discussed his 70 BILLION record database!  Several large organizations that deal with data at these volumes have successfully leveraged their "big data" with analytics to better manage their business.

But what about the vast number of businesses that are NOT part of the Fortune 500?  Those that are not capturing massive volumes of data as part of their core business?  Those that have small...data?  Mere gigabytes, perhaps.  Can analytics still benefit them?  The answer is: yes.

One key to leveraging small data for business success is to integrate data from disparate business systems: Google Analytics, Salesforce.com, SAP, legacy databases, etc.  For many businesses, these big systems are where their small data, and its potential for business insights, lives.

Integration need not happen within the systems themselves, but rather with the data that they generate, or expel, for the purpose of analytics.  Many business intelligence, reporting and analytics tools allow for dynamic and virtual integration of disparate systems, at the point of analysis.  This eliminates the need to monkey with the architecture of the systems themselves, and the need to develop ETL processes, data warehouses and data marts.  The best part of "small data" is that computing power exceeds data volume demands, allowing integrated analytical data sets to be generated dynamically. 
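To make "integration at the point of analysis" a little more concrete, here is a minimal sketch in Python.  It assumes two systems can each export a small CSV file (a web analytics export and a CRM export) that share a campaign name.  The column names, the sample values and the integrate() helper are all made up for illustration; real exports and real BI tools will look different.

```python
import csv
import io
from collections import defaultdict

# Hypothetical exports from two separate systems (made-up illustrative data).
WEB_CSV = """campaign,visits
spring_sale,1200
newsletter,450
"""

CRM_CSV = """campaign,deal_value
spring_sale,5000
spring_sale,2500
newsletter,900
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def integrate(web_rows, crm_rows):
    """Join the two exports on 'campaign' in memory, at analysis time."""
    # Total the CRM deal values per campaign.
    revenue = defaultdict(float)
    for row in crm_rows:
        revenue[row["campaign"]] += float(row["deal_value"])

    # Attach the revenue total to each web analytics record.
    combined = []
    for row in web_rows:
        visits = int(row["visits"])
        total = revenue.get(row["campaign"], 0.0)
        combined.append({
            "campaign": row["campaign"],
            "visits": visits,
            "revenue": total,
            "revenue_per_visit": total / visits if visits else 0.0,
        })
    return combined


combined = integrate(read_rows(WEB_CSV), read_rows(CRM_CSV))
for record in combined:
    print(record)
```

Neither source system is touched and nothing is stored: the combined data set exists only for as long as the analysis needs it, which is practical precisely because the data are small.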

If performance does become an issue, some business intelligence tools will create stand-alone "extracts" that are mini-data marts specific to the analytics at hand.  These extracts are a consequence of the analysis and do not need to be separately designed or optimized.  In addition, scheduled refreshes can be performed so that the data stay current.
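The idea of an extract can be sketched with Python's built-in sqlite3 module.  This is only an illustration of the concept, not how any particular BI tool implements it: the table names, columns and sample values are hypothetical, and the "refresh" here is simply a rebuild of the extract table that a scheduler could call.

```python
import sqlite3


def build_extract(conn):
    """(Re)create the extract table from the current source tables."""
    conn.execute("DROP TABLE IF EXISTS campaign_extract")
    conn.execute("""
        CREATE TABLE campaign_extract AS
        SELECT w.campaign,
               w.visits,
               COALESCE(SUM(d.deal_value), 0) AS revenue
        FROM web_visits AS w
        LEFT JOIN crm_deals AS d ON d.campaign = w.campaign
        GROUP BY w.campaign, w.visits
    """)
    conn.commit()


# Hypothetical source tables standing in for two business systems.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE web_visits (campaign TEXT, visits INTEGER)")
conn.execute("CREATE TABLE crm_deals (campaign TEXT, deal_value REAL)")
conn.executemany("INSERT INTO web_visits VALUES (?, ?)",
                 [("spring_sale", 1200), ("newsletter", 450)])
conn.executemany("INSERT INTO crm_deals VALUES (?, ?)",
                 [("spring_sale", 5000.0), ("newsletter", 900.0)])

# The first build creates the extract as a by-product of the analysis query.
build_extract(conn)

# New source data arrives; a scheduled refresh simply rebuilds the extract.
conn.execute("INSERT INTO crm_deals VALUES (?, ?)", ("spring_sale", 2500.0))
build_extract(conn)

rows = dict(conn.execute(
    "SELECT campaign, revenue FROM campaign_extract").fetchall())
print(rows)
```

The extract is defined entirely by the analysis query, so there is no separate design step; refreshing it means running the same query again.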

But what of these tools, systems and databases?  Will insights come simply by cross-system integration?  Well...maybe, but probably not.  Unlike big data environments, where analysts can go swimming in the data and come upon insights by thrashing about, small data requires a bit more finesse (not that big data analysts don't have finesse).  Given the relatively small size of "small data", and the somewhat complex nature of how any business defines, measures, characterizes and organizes itself, the number of different ways to slice and combine the data begins to dwarf the actual amount of data available.  Statisticians describe this as running out of "degrees of freedom": there are too few observations left to support every question being asked of them.

As such, it is imperative when working with small data to begin with clear and concise business goals and objectives (see previous blog post: Micro-goals).  This focus will help narrow down the perspective applied to small data and will increase the degrees of freedom necessary for valid, significant and insightful analysis.
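A rough back-of-the-envelope calculation shows why focus matters.  The sketch below is not a formal degrees-of-freedom computation; it only compares the number of records with the number of distinct segments created by crossing several dimensions.  The dimension names, their sizes and the record count are all made up for illustration.

```python
import math


def segment_count(dimensions):
    """Number of distinct segments when every dimension is crossed."""
    return math.prod(dimensions.values())


# Hypothetical small data set: 2,000 records described by four dimensions.
records = 2000
all_dimensions = {"region": 5, "product": 20, "channel": 4, "month": 12}

# A focused business question that only needs two of those dimensions.
focused_dimensions = {"region": 5, "channel": 4}

all_segments = segment_count(all_dimensions)
focused_segments = segment_count(focused_dimensions)

print("Segments, every dimension:", all_segments)
print("Records per segment:", records / all_segments)
print("Segments, focused question:", focused_segments)
print("Records per segment:", records / focused_segments)
```

With every dimension crossed there are 4,800 segments for 2,000 records, so most segments are empty or hold a single record.  The focused question leaves 20 segments with about 100 records each, which is enough to compare them meaningfully.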

While "big data" does seem to be getting a lot of attention, it is the collection of vast "small data" insights that will propel change in our businesses and economy.  Let's get started!