Wednesday, February 2, 2011

Machine vs. Human Data Capture

It is undeniable: the volume of data that businesses face today is unprecedented. The Boston Globe compares this "data deluge" to the flood of books that became available after Gutenberg invented the printing press in the 15th century ("Information overload, the early years").

What differs today is how information is captured. Most of the data we interact with on a daily basis is created by humans: articles, books, weather reports, emails, text messages, blog posts (like this one), Twitter feeds; the list is endless. In addition, much of the data stored in business databases is also entered or generated by humans: sales prospects, clinical codes, risk evaluations, incident codes, quality assessments, manufacturing results, etc. Name the business, and some aspect of its data capture likely relies on the effort and judgement of a human.

There is also a huge amount of data, though, that is captured automatically as a consequence of human interaction with a machine: user login timestamps, pages visited on a website, TV stations watched, links clicked, files accessed, the location where a photo was taken, etc. This "data exhaust" holds tremendous analytic value for understanding how humans interact with the world around them and with the machines they use. For a software company, it can show which aspects of an application are used the most and direct redesign. For a commerce website, it can help explain why buyers leave the site before committing to a sale. Google uses it to determine which search results come to the top of the list for a given set of search terms. Some smartphone apps even track where you are while they are being used.
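To make the commerce example a little more concrete, here is a minimal sketch in Python of how page-visit data might be summarized. Everything in it is an assumption made for illustration: the event log, the user IDs, the page names, and the four-step funnel are all hypothetical, and a real site would read these records from its own logs.

```python
from collections import Counter

# Hypothetical machine-captured events: (user_id, page) pairs.
EVENTS = [
    ("u1", "home"), ("u1", "product"), ("u1", "cart"), ("u1", "checkout"),
    ("u2", "home"), ("u2", "product"), ("u2", "cart"),
    ("u3", "home"), ("u3", "product"),
    ("u4", "home"),
]

# Hypothetical purchase funnel, in the order a buyer would move through it.
FUNNEL = ["home", "product", "cart", "checkout"]


def page_counts(events):
    """Count how many times each page was visited."""
    return Counter(page for _, page in events)


def funnel_dropoff(events, funnel):
    """Count the distinct users who reached each funnel step."""
    reached = {step: set() for step in funnel}
    for user, page in events:
        if page in reached:
            reached[page].add(user)
    return [(step, len(reached[step])) for step in funnel]


if __name__ == "__main__":
    print(page_counts(EVENTS).most_common())
    for step, users in funnel_dropoff(EVENTS, FUNNEL):
        print(f"{step}: {users} users")
```

A summary like this shows where buyers drop out of the funnel, not why they leave; it tells the analyst which step deserves a closer look.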

The benefit of machine data over data captured by humans is that its capture involves no human subjectivity or effort. The downside is that it is limited to what machines can capture or are programmed to capture. As business processes become more automated, however, and information systems more integrated, machine data capture and exchange will become more and more prevalent. This will open the door to ever more granular and accurate data describing human activities, which in the end is a major key to business growth. As businesses shift their people away from (but do not eliminate) human data capture, that effort can be redirected to more advanced analytics that yield important and more focused business insights.

Many businesses are just beginning to leverage and streamline the use of machine-captured data, so there is still a lot of growth potential in this area. The future challenge will be to discern what to capture, how to capture it most efficiently, and how to use it for business benefit.
