Data integration

Integration adds value to data – the harmonisation and combination of various datasets provides enrichment and further insights from analysis, and keeps organisations aware of their data curation responsibilities.

Loose coupling

We favour SOA integration services, for instance to link Cloud to related platforms, repositories, applications (via APIs), other services or ecosystems.

  • We focus on the ‘how’ and ‘why’ of integration and for this reason favour loose coupling, since it reduces the need for recoding when the inevitable changes arise. Hard-coding or proprietary workarounds also tend to cost more in the long run, as new development teams attempt to understand, untangle and re-engineer the processes bound up in the code.
  • Regarding the ‘when’ and ‘where’, we also have information architecture expertise to help assess integration infrastructure: mitigating potential choke points; situating data landing areas appropriately for optimal processing performance; and maximising the curated flow of data through the system by alleviating timing issues that might otherwise constrain the organisation. Such bottlenecks can be very limiting, particularly if batch processing would otherwise have to run in discrete windows in the dead of night.
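
As a rough illustration of the loose coupling described above, the sketch below has the loading step depend only on a small contract rather than on any one system. All names here (RecordSource, CsvLikeSource, ApiLikeSource, load_customers) are hypothetical and chosen for illustration; they are not part of any particular product or library.

```python
from typing import Iterable, Protocol


class RecordSource(Protocol):
    """Hypothetical contract: anything that can yield records as dicts."""

    def read(self) -> Iterable[dict]: ...


class CsvLikeSource:
    """Illustrative source that parses comma-separated lines held in memory."""

    def __init__(self, header: list[str], lines: list[str]) -> None:
        self.header = header
        self.lines = lines

    def read(self) -> Iterable[dict]:
        for line in self.lines:
            yield dict(zip(self.header, line.split(",")))


class ApiLikeSource:
    """Illustrative source standing in for a service that returns dicts."""

    def __init__(self, payload: list[dict]) -> None:
        self.payload = payload

    def read(self) -> Iterable[dict]:
        yield from self.payload


def load_customers(source: RecordSource) -> list[str]:
    """Depends only on the RecordSource contract, not on any one system."""
    return sorted(record["name"] for record in source.read())


csv_source = CsvLikeSource(["id", "name"], ["1,Ada", "2,Grace"])
api_source = ApiLikeSource([{"id": "3", "name": "Edsger"}])

print(load_customers(csv_source))  # ['Ada', 'Grace']
print(load_customers(api_source))  # ['Edsger']
```

Swapping one source for another leaves load_customers untouched, which is the property that saves recoding when a connected system changes.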

Integration & metadata

Different types of data and different source applications require different integration approaches:

  • Moving or integrating data from an ERP/CRM system, for instance, requires an understanding of both the ‘source’ and ‘target’ systems, along with their metadata and data formats. Gaining an understanding of the datasets and data structures in order to map them to another system can be challenging without specialist tools. We use Safyr to speed up the process of showing the user which tables exist in the packaged application, how they are related and (in most packages) which application functions use which tables. The time savings from using Safyr are considerable.
  • Integration of spatial (locational) data for location intelligence, information and insight will require specialist assistance – you may have vector coordinates, raster objects or even time-series data, each with its own very particular aspects, above and beyond typical data integration routines.
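
Once the source and target structures are understood, the mapping between them can be kept as data rather than buried in code. The following is a minimal sketch under assumed names: the source columns (CUST_NO, CUST_NAME, CREATED), the target fields and the date format are all hypothetical, and it does not represent how Safyr or any specific package works.

```python
from datetime import datetime

# Hypothetical mapping: source column -> (target field, converter).
FIELD_MAP = {
    "CUST_NO": ("customer_id", int),
    "CUST_NAME": ("name", str.strip),
    "CREATED": ("created_on", lambda s: datetime.strptime(s, "%Y%m%d").date()),
}


def map_record(source_row: dict) -> dict:
    """Translate one source row into the target shape using FIELD_MAP."""
    target = {}
    for source_col, (target_field, convert) in FIELD_MAP.items():
        if source_col not in source_row:
            # Fail loudly rather than silently writing incomplete records.
            raise KeyError(f"missing source column: {source_col}")
        target[target_field] = convert(source_row[source_col])
    return target


row = {"CUST_NO": "00042", "CUST_NAME": "  Acme Ltd ", "CREATED": "20240131"}
mapped = map_record(row)
print(mapped)
```

Keeping the mapping in one table makes it easier to review against the source metadata and to change when either system changes.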

Integration & governance

Integration, in terms of both source and target systems, also has to be evaluated from a governance perspective: to establish whether, for instance, any sensitive personal data is in scope and whether it is being managed according to the required legislation (GDPR, DPA etc.); to ensure that destruction and ‘right to be forgotten’ flags are passed along the chain; and to ensure that data which has passed its ‘use by’ date is not perpetuated.
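
One way to picture flags travelling along the chain is a hand-off step that checks each record before it is forwarded. This is a simplified sketch only: the Record fields (erasure_requested, retain_until) and the forward function are hypothetical names, and real compliance handling would involve considerably more than this.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    subject_id: str
    payload: str
    erasure_requested: bool = False      # hypothetical 'right to be forgotten' flag
    retain_until: Optional[date] = None  # hypothetical 'use by' date


def forward(records: list[Record], today: date) -> tuple[list[Record], list[str]]:
    """Return the records safe to pass on and the subject ids to erase downstream."""
    to_send, to_erase = [], []
    for rec in records:
        if rec.erasure_requested:
            # Propagate the erasure request instead of the data itself.
            to_erase.append(rec.subject_id)
        elif rec.retain_until is not None and rec.retain_until < today:
            # Past its 'use by' date: do not perpetuate it in the target.
            continue
        else:
            to_send.append(rec)
    return to_send, to_erase


batch = [
    Record("s1", "order history"),
    Record("s2", "contact details", erasure_requested=True),
    Record("s3", "old survey", retain_until=date(2020, 1, 1)),
]
sent, erase_ids = forward(batch, today=date(2024, 6, 1))
print([r.subject_id for r in sent])  # ['s1']
print(erase_ids)                     # ['s2']
```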

The primacy of integration and governance was underlined by Zikopoulos et al. in Big Data Beyond The Hype, where the authors write:

“…too many people treat this topic as an afterthought—and that leads to security exposure, wasted resources, untrusted data, and more. We actually think that you should scope your Big Data architecture with integration and governance in mind from the very start.” (p.26)

This fits well with the ‘privacy by design’ dictum of GDPR, and it also moves data teams away from wrangling and janitorial work towards genuine stewardship and data science.