Data – The Unrealized Frontier

by Michael Schwartz

What can be done with the big data we have been collecting? The metrology industry is sitting on mountains of collected data. So why is it that most of us have yet to find value in our data?

I started thinking about the value of data when I asked myself, "How do software companies make money giving their software away?" Somehow, these companies have found value in their data, data collection, and advertising that is greater than the software itself. Companies like Rovio (the creators of Angry Birds) have converted their business model from selling software to giving it away for free; and surprisingly, they are making good money doing it.

What amazes me is that the data they are collecting is largely not measurement data like ours. By and large, they are collecting metadata—that is, data about data. From it, they are able to extract details about us: our habits, likes, dislikes, and so on.

I was talking to a political consultant in Granville, Ohio, and he was telling me about his company. His company collects data from our phones and analyzes it to determine whether we are swing voters in a swing state. From their mountains of data, they can project, for example, a 0.6% swing in votes for a specific candidate based on the weather on voting day.

When I compare the value and accuracy of our mountains of calibration data to the data they get from my phone, there is no comparison. I realize it is like comparing apples and oranges, but our calibration data is far more valuable and far more accurate. Our data makes everything in the modern world possible, from cell phones to the cellular networks they run on. So why have we not been able to leverage our data into something of value?

I see several things holding us back as an industry, things I believe are easily correctable; once corrected, they will allow us to leverage the value of our data.

The first one is really simple. We need to start telling the world that we have data, lots and lots of data. Not that we are trying to sell it; rather, we should talk about our data the way we talk about the weather. Then people will ask us whether we can get specific information out of it. We need to have those conversations with people outside our industry to discover what is hidden in our data.
The second thing holding us back is a much more difficult problem to resolve, and solving it will not happen overnight. As an industry, we have few data systems that communicate with each other; our data formats are non-standardized, and many of us have accepted scanning a PDF document and attaching it to a calibration record as the best way to standardize our business process. We need to create or adopt a standard.

The metrology industry needs to either adopt an existing standard or create one of its own, just as other industries have managed to create standard formats. Unlike other industries, this will be no easy feat; metrology, after all, covers all measurements, all industries, and all levels, from the NMI level all the way down to production. The sooner we adopt something, the better off we will be and the sooner we can start utilizing our data.
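To make the point concrete, here is a minimal sketch of what a machine-readable calibration record could look like. The field names and structure are illustrative assumptions, not any published standard; the point is simply that structured records, unlike scanned PDFs, can be exchanged and analyzed by software.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class TestPointResult:
    # Hypothetical per-test-point record; field names are illustrative.
    nominal: float     # nominal value of the test point
    measured: float    # as-found measured value
    tolerance: float   # allowed deviation from nominal
    unit: str          # unit of measure, e.g. "V"

    def in_tolerance(self) -> bool:
        return abs(self.measured - self.nominal) <= self.tolerance

@dataclass
class CalibrationRecord:
    instrument_id: str
    calibration_date: str   # ISO 8601 date string
    results: list           # list of test-point result dicts

    def to_json(self) -> str:
        # A structured record serializes cleanly, unlike a scanned PDF.
        return json.dumps(asdict(self))

record = CalibrationRecord(
    instrument_id="DMM-042",
    calibration_date="2024-05-01",
    results=[asdict(TestPointResult(10.0, 10.002, 0.005, "V"))],
)
print(record.to_json())
```

A shared schema along these lines, whatever its final shape, is what would let calibration systems from different labs and vendors finally talk to one another.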
Then we can start creating analytical tools to extract value from our calibration data. I am not talking about interval analysis tools—I know those are already available. I am talking about what lies beneath the tip of the iceberg: the hidden value in the data we have amassed.

One good example could potentially reduce calibration costs by not performing 100% of the calibration every year. What if we could calibrate some smaller portion of the unit and still maintain its reliability? Often, we as calibration technicians are testing points that never fail, or seldom fail. We could save our labs time and money if we could distinguish highly important test points from test points that seldom fail.

If you step back and think about it, many calibration labs follow the manufacturer's written procedure; most of these manuals were written by an engineer, not a metrologist, while the instrument was being designed. The engineer had access to limited data and limited time, and in many cases the test points chosen were based on a small sample set. Many calibration labs then use this procedure religiously, until the end of time. So, 100 calibrations later, or five or ten years down the road, the calibration lab has amassed ten times more data than the engineer ever had.

Why don’t we let the data speak for itself? We have in our hands a gold mine of data, data that tells us what needs to be checked and how often. With a little clever number crunching, we can perform not only interval analysis, but test-point-by-test-point interval analysis. We could test high-failure points at every calibration, test low-failure points every other calibration, and even drop some points that never fail at all. This is nothing new; Fluke did it with artifact calibration on the 5720s.
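The test-point-by-test-point idea above can be sketched in a few lines. The history data, failure-rate thresholds, and interval multipliers below are all illustrative assumptions for demonstration; a real lab would set them from its own reliability targets, not from this toy policy.

```python
from collections import defaultdict

# Hypothetical history: (test_point_id, passed) pairs pulled from
# past calibration records. Real data would span years of results.
history = [
    ("10V_DC", True), ("10V_DC", True), ("10V_DC", False), ("10V_DC", True),
    ("100V_DC", True), ("100V_DC", True), ("100V_DC", True), ("100V_DC", True),
    ("1kHz_AC", False), ("1kHz_AC", False), ("1kHz_AC", True), ("1kHz_AC", True),
]

def failure_rates(records):
    """Compute the observed failure rate for each test point."""
    counts = defaultdict(lambda: [0, 0])  # point -> [failures, total]
    for point, passed in records:
        counts[point][1] += 1
        if not passed:
            counts[point][0] += 1
    return {point: fails / total for point, (fails, total) in counts.items()}

def recommend_interval(rate, base_interval_months=12):
    """Toy tiering policy: thresholds are assumptions, not established rules."""
    if rate >= 0.25:
        return base_interval_months       # failure-prone: every calibration
    elif rate > 0.0:
        return base_interval_months * 2   # occasional: every other calibration
    else:
        return base_interval_months * 4   # never observed failing: stretch out

rates = failure_rates(history)
for point, rate in sorted(rates.items()):
    print(f"{point}: {rate:.0%} failure rate -> test every "
          f"{recommend_interval(rate)} months")
```

Even a simple tally like this makes visible which points earn their place in every calibration and which ones the accumulated data suggests could be checked less often.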