The data ‘Cinderella’

Data is vital to digital ecosystems

The bad news

Data has long been the ‘Cinderella’ of corporate systems. Organisations invest willingly in IT system redundancy and high availability, but rarely do they invest commensurately in the data itself. Data is the lifeblood of any digital organisation. It is not only essential for corporate health but, if of sufficient quality, may also have the potential for monetisation in its own right. Data should therefore be at the forefront when considering risk management, cost-effectiveness, forward IT planning and business strategy.

Quite often the reverse is true. Much attention is given to data silos and to data resident in applications, but little is paid to truly data-centric architectures. As Steve Miller, formerly at Gresham Tech, pointed out:

“One fundamental barrier is that most banks and large enterprises rely on architectures that have evolved over time. In fact, ‘architecture’ may not even be the right term, as that implies a level of intent and design that can often be wholly absent. In a sense these landscapes are more like application ecosystems, consisting of a patchwork of inter-dependent systems that provide the underlying services the company relies upon to conduct its day-to-day business.”

In a data-negligent culture:

  • Data is of poor quality.
    • It will have been largely ignored: parked on a dusty drive somewhere, gathering encrustations and bloating from neglect. As long as applications run, why disturb it? Yet we all understand the relationship between poor-quality blood and bad health. The same is true of data in the digital organisation.
  • Data does not flow freely.
    • Data needs workflow, and the inevitable system bottlenecks and choke points hamper the free flow of data through the enterprise. We all understand the impact of such blockages on human systems: heart attacks. Why should data be any different? Yet there are still substantial amounts of hard coding or proprietary code around, instead of loosely coupled integration.
  • Data is not curated.
    • Data update routines are usually complex and awkward to implement. New data inputs represent a challenge as topologies struggle to cope with them because applications, rather than data, were the lead design consideration. Yet we all know that change is part of life and that data will change. This is most evident in the management and governance of customer data. If your customer data does not change, you don’t have a business.
  • Data is not motile.
    • For many years data was regarded as static, whereas we now have the concepts of data-at-rest, data-in-motion and data streams in analytics, where the processing is done on the fly. Workflow is now regarded as an essential part of making sure data is active and used, not just sitting in a repository somewhere. Data that is active is data that is creating or supporting value.
  • Let’s move the data to a data lake!
    • Simply moving data from wherever it sits into a data lake will not solve any problems unless the data is first understood. You will simply have a bigger problem to deal with and will probably have lost the data lineage and provenance signposts along the way. And if you are thinking of moving to the cloud… that would be like trying to thread a camel through the eye of a needle.

The consequences of data negligence are manifold:

  • Data opportunity costs.
    • The effort required to manage bad data is significant. It sucks in management time, and at an operational level it may equate to a full-time role. This is a resource that could be put to better use.
  • The opportunity cost of bad data increases.
    • As more and more data is generated – through the ever-growing plethora of apps, devices and governance – the impact (and inherent risk) of bad data increases, both in volume and in the number of systems affected.
  • Bad data is inefficient.
    • Those moving to cloud will perforce have to address bad data to make sure that cloud systems are populated with clean, efficient and accurate data – both to reduce cost and to ensure optimal performance.
  • Bad data disincentivises the workforce.
    • Even simple tasks take longer, generating internal pressure, demotivated staff and customer dissatisfaction.
  • Compliance penalties and reputational damage.
    • GDPR and similar regulations have served to highlight how many institutions have paid far less heed to their data assets than to their bricks-and-mortar ones. Data cleansing also has a cost, but before you can decide what is worth cleaning, you need to know what data you have and understand that data. If there has been poor stewardship and housekeeping, you may not be able to prioritise in this way. The financial cost of reputational damage can be significant, even crippling, with long-term consequences.
  • Data opportunities are lost.
    • Because an organisation is so busy managing bad data, it fails to see opportunities for data value or monetisation. The data may have value when added to someone else’s data – or there can be value through analytics, as long as the business is clear about use case potential. Analytics is not a magic bullet. You have to point the gun!

The good news

Retail organisations that have their data act together have a strategic lead in terms of customer footprint, loyalty and growth. This is because data, and not applications, was the starting point for strategy and systems design. Those designs relied on exposable data – from the edge to the centre – obviously secured where necessary. In financial markets, the emphasis was on superfast trading, where the spoils go to the fastest.

Data is growing at a huge rate, thanks to IoT, IoM, smart cities, smart motorways, smart homes, social media and a plethora of devices. So there is no shortage of data, just a scramble to stay ahead of the volume vs. performance curve, all the while retaining some understanding of the data that is there. Clearly the social media giants have this one sorted, but others are catching up, for example by realising that alternative data, when added to their existing base, may represent added value – if not for them, then as a monetisable commodity.

From a market perspective, investors have already realised that data has value – it is one of the reasons certain stocks and shares command a higher buy price than one might otherwise expect from unprofitable results. The organisations at the top of the Dow are data giants, as much as anything else.