Data’s Identity Crisis

Social impact

Despite what revered magazine covers may say, data is not the new oil. Indeed, data is not a commodity in the traditional sense.

Sometime in the last decade, data became a buzzword. Data is now used as a euphemism for a wide range of things, many of which are unrelated to data assets. Terminology such as “data-driven” is used to lend a contemporary edge to any unsuspecting noun. CEOs, analysts, traders, commentators and others all claim that their solutions and ideas are “data-driven”. Although data has become synonymous with smart decision-making, few can articulate what it is.

One of the greatest challenges of the new economy is determining data’s value. It is here that we encounter the many complexities, irregularities and ambiguities that differentiate data so markedly from oil and make definitional issues critically important.

Each dataset has unique characteristics, some innate to the dataset itself and some relating to its source, its owner or its intended use. Trivially, datasets may differ markedly in terms of basic attributes like size, shape and purpose.

The size of a dataset, along with that ever-present, albeit slightly ageing, buzz phrase “Big Data”, is also rather misleading when it comes to value. What is more valuable: a record of the entire set of trades of every cryptocurrency on every exchange for the last 20 years, along with a real-time stream of new trades (with an expected delay measured in tiny fractions of a second), or the price of Bitcoin, as it would be a week from now, with absolute certainty? The former may be measured in gigabytes or terabytes. The latter is a single number. The former is realistic; the latter requires psychic powers. Realism aside, which is more valuable? A single number, the smallest datum possible, may well be more valuable than gigabytes, terabytes, petabytes or even zettabytes of data. The lesson here is that the value of data is not necessarily determined by its size.

Ownership of data is often poorly defined, or constrained by privacy and usage restrictions or complex ownership structures. Value is thus necessarily relative to such ownership.

A dataset does not have intrinsic value independent of whose data it is.

Related to the ownership issue, there are also questions of where data starts and ends. When valuing a dataset, are we valuing just the electronic data on defined storage media? Or are we also valuing data recorded in non-digitised media, or even tacit data? Are we considering only existing data, or a stream that is expected to deliver more data on a regular basis? Are we also valuing the bespoke hardware and software that enables this data to be collected and analysed?

What about the teams that enable and manage these processes?

At Aurum Data, we’ve spent several years developing models and methodologies for effective, evidence-based data valuation. This exhaustive research has been laborious but revealing.

Unlike other asset classes, data’s value can be fundamentally subjective. What is of significant value to one company can be utterly worthless to a peer, despite perceived similarities.

This subjectivity can also apply internally, meaning that simple replacement value alone may not adequately capture the commercial or operational ramifications of a data loss or breach.

In addition, data is only as good as its utilisation. Too many datasets sit unused in the metaphorical trophy cabinet – lauded by many but used effectively by none.

Data is more organic than many realise.

It often has unrecognised potential, both internal and external, that can only be realised with the right structures and management.

The team at Aurum Data believes valuation plays a critical role in that process.

Oil helped to shape the 20th century, and data is likely to do the same this century. However, it is important to recognise that this is where the similarities end.