Metadata curation

Look after your metadata and it will look after you

Gartner Research stresses the future importance of active metadata. “By 2023, organizations utilizing active metadata, machine learning and data fabrics to dynamically connect and automate data management processes will reduce their time to data delivery, and impact on value by 30%.” Active metadata implies curated metadata: to be of any use it must be up to date and accurate, and trusted enough to serve as the basis for decision-making & analytics.

Active not passive

The increasing availability of tools such as metadatabases, metadata management, metadata inventory, enterprise metadata cataloguing and so forth demonstrates increasing awareness of the need for metadata curation. But such tools began life largely as passive administrative repositories where captured metadata was seen as an end in itself.

Programmatic, not manual

Going forward, most metadata can and should be added programmatically and automatically at the point of creation. Get into good habits now: it will pay dividends in the future.
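As an illustration of adding metadata at the point of creation, the sketch below writes a file and a JSON "sidecar" describing it in a single step. It is a minimal example under stated assumptions, not a recommended product design: the function name save_with_metadata, the .meta.json suffix and the field names are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_with_metadata(path: Path, content: bytes, creator: str, purpose: str) -> dict:
    """Write a file and a JSON sidecar describing it in the same step."""
    path.write_bytes(content)
    metadata = {
        # Facts the system can derive automatically.
        "file_name": path.name,
        "size_bytes": len(content),
        "sha256": hashlib.sha256(content).hexdigest(),
        "created_utc": datetime.now(timezone.utc).isoformat(),
        # Facts only the creating person or process can supply.
        "creator": creator,
        "purpose": purpose,
    }
    # Hypothetical convention: the sidecar sits next to the file it describes.
    sidecar = path.with_name(path.name + ".meta.json")
    sidecar.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return metadata
```

Because the sidecar is written by the same function that writes the file, no object can enter the estate through this route without at least a basic description.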

To remediate or not to remediate, that is the question

Remember that on average, 80% of the digital estate contains data which is not understood or even known – the same applies to its metadata. But remediation is not a straightforward issue.

Governance

If metadata is part of a system of record, then any proposed remediation, correction or insertion (to remedy absent metadata) will need careful consideration. Poor management of metadata can have a negative impact on upstream systems, especially compliance.

Workflows that create metadata for an object need to factor in:

update;
destruction;
preservation;
audit.

Metadata needs to facilitate timely discovery or protection of key information. Workflows must also ensure that this information is collected in the first place, and that any processes connected with it are logged for audit.
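The four workflow concerns listed above (update, destruction, preservation, audit) can be sketched as a small in-memory store in which every change is logged. This is illustrative only: the class MetadataStore, its method names and the rule that a preserved object cannot be destroyed are assumptions made for the example, not a description of any particular product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetadataStore:
    """In-memory store whose every change is appended to an audit log."""
    records: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _log(self, action: str, object_id: str, detail: dict) -> None:
        self.audit_log.append({
            "at_utc": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "object_id": object_id,
            "detail": detail,
        })

    def update(self, object_id: str, **tags: str) -> None:
        """Create or update the tags held for an object."""
        record = self.records.setdefault(object_id, {"tags": {}, "preserved": False})
        record["tags"].update(tags)
        self._log("update", object_id, dict(tags))

    def preserve(self, object_id: str) -> None:
        """Mark an existing object's metadata as protected from destruction."""
        self.records[object_id]["preserved"] = True
        self._log("preserve", object_id, {})

    def destroy(self, object_id: str) -> bool:
        """Remove an existing object's metadata unless it is preserved."""
        if self.records[object_id]["preserved"]:
            self._log("destroy_refused", object_id, {"reason": "preserved"})
            return False
        del self.records[object_id]
        self._log("destroy", object_id, {})
        return True
```

Refused actions are logged as well as successful ones, so the audit trail shows what was attempted and not only what changed.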

Use case

Understand why metadata is added to an object and the scenarios which would require the object’s metadata to be updated or removed. Understanding these helps to identify existing workflows which may need updating to cater for new requirements, and the conditions those workflows must evaluate.

Lifecycle

Workflows that manage the lifecycle of metadata or tags frequently contain functions to validate that an object has the correct tags assigned. These functions are often used to determine whether a task to add, update or remove a tag has succeeded, & are also useful for searching. But there may be a time lag before newly added valid items become visible in the system, so assumptions about consistency can lead to unexpected results or unhandled errors. Queries may return no results for very recently added objects because the creation workflow has not yet completed. Similar issues exist in metadata management where the system executing the CRUD operation is not the same as the system which responds to queries.
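One common way to cope with this kind of lag is to poll the query side a limited number of times before treating a missing tag as a failure. The sketch below assumes a caller-supplied fetch_tags function (hypothetical) that returns the object's current tags, or None while the object is not yet visible; the retry count and delays are arbitrary example values.

```python
import time
from typing import Callable, Mapping, Optional


def wait_for_tags(
    fetch_tags: Callable[[], Optional[Mapping[str, str]]],
    expected: Mapping[str, str],
    attempts: int = 5,
    delay_seconds: float = 0.5,
) -> bool:
    """Poll until the query side reports the expected tags, or give up."""
    for attempt in range(attempts):
        tags = fetch_tags()  # may be None while the object is not yet visible
        if tags is not None and all(tags.get(k) == v for k, v in expected.items()):
            return True
        if attempt < attempts - 1:
            # Exponential back-off: wait longer after each unsuccessful check.
            time.sleep(delay_seconds * (2 ** attempt))
    return False
```

A False result then means "not confirmed within the time allowed" and not necessarily "the tagging task failed", which is the distinction the validating workflow needs to handle.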

Comparative analysis

How do you assign appropriate metadata in an historic or archive environment where the existing metadata may be incomplete or meaningless, and where provenance & referential integrity have been lost? If, for instance, your organisation has grown through acquisition or consolidation, it is perfectly possible that the metadata in older acquired files could be completely random: the result of data input misunderstandings (e.g. whole sentences input as metadata), compounded by ‘munging’ (mashed up until no good) & poor database management!
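A first pass over inherited metadata can at least flag values that look like free text or junk, such as whole sentences entered in a tag field. The heuristics below are a rough sketch: the function name looks_suspicious and the thresholds (four words, sixty characters) are assumptions chosen for illustration and would need tuning against real data.

```python
import re


def looks_suspicious(value: str, max_words: int = 4, max_length: int = 60) -> list:
    """Return the reasons a metadata value looks like free text or junk."""
    reasons = []
    stripped = value.strip()
    if not stripped:
        return ["empty"]
    if len(stripped.split()) > max_words:
        reasons.append("too many words")
    if len(stripped) > max_length:
        reasons.append("too long")
    # Sentence-ending punctuation followed by more text suggests prose.
    if re.search(r"[.!?]\s+\S", stripped):
        reasons.append("sentence punctuation")
    if not any(ch.isalnum() for ch in stripped):
        reasons.append("no letters or digits")
    return reasons
```

An empty list means none of the heuristics fired; it does not mean the value is correct, only that it is not obviously malformed.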

Structured data

Curating structured data should be done through the lens of analysis or forward purpose, such as AI. Both benefit from canonical data & metadata, so metadata curation can be useful in identifying anomalies that could negatively impact downstream activity.

Unstructured data

Unstructured data is often content-rich, so you are more likely to be able to segment it according to file types or use cases (as these tend to be more specific), and therefore make a reasonable inference as to what the metadata should be.
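Segmenting by file type can be sketched with Python's standard mimetypes module, which guesses a MIME type from a file name. The mapping from MIME type to a content category below is hypothetical, as is the infer_metadata function; the point is only that inferred values should be marked as inferred so they can be reviewed later.

```python
import mimetypes
from pathlib import PurePath

# Hypothetical mapping from MIME type prefix to a default content category.
CATEGORY_BY_TYPE_PREFIX = {
    "image/": "image",
    "audio/": "audio",
    "video/": "video",
    "text/": "document",
    "application/pdf": "document",
}


def infer_metadata(file_name: str) -> dict:
    """Guess basic metadata from the file name alone."""
    mime_type, _encoding = mimetypes.guess_type(file_name)
    category = "unknown"
    if mime_type:
        for prefix, label in CATEGORY_BY_TYPE_PREFIX.items():
            if mime_type.startswith(prefix):
                category = label
                break
    return {
        "file_name": PurePath(file_name).name,
        "mime_type": mime_type,
        "category": category,
        "inferred": True,  # flag so a curator knows this was not human-supplied
    }
```

File extensions can be wrong or missing, so a guess of this kind is a starting point for curation, not a substitute for it.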

Winning ways

They say they have no ‘dark’ data at Google, and they have created comprehensive tools for understanding unstructured data, e.g. audio-visual searches. Search engines reap the rewards of understanding, knowing and finding data because the metadata exists, is curated and is continuously validated by users (e.g. the ‘report an error’ option in Google Maps):

executive support & recognition of the strategic import of metadata should be non-negotiable – it needs to be managed & governed by the business;
metadata needs dedicated review (quality control) and maintenance (continuous improvement), just like any physical asset or facility;
whatever the situation with pre-existing metadata, it is essential to get into good habits now, otherwise your metadata remediation issues will only worsen as data volumes and the number of data-generating IoT devices increase;
the main challenge, though, lies with your user population: to instil a culture of metadata input (full & quality input too!). Users who create web content will be used to this discipline, others will not;
new devices and software will generate metadata automatically, but users of back office or legacy systems will likely not be in the habit of inserting appropriate metadata, for instance when saving a document. Some of that metadata will be generated automatically, but the key metadata about the document’s relevance can only come reliably from its creator or editor;
metadata supplied by content creators may be subjective which raises the issue of quality & consistency. (You really don’t want to parse metadata if you can help it!) Within an organisation, content creators or assessors need to be on the same page so that their captured experience becomes organisational knowledge, not merely personal. There needs to be at least a nascent ontology and basic conventions about what tags should be applied;
in a similar vein, require content creators to routinely embed or update any ownership, copyright or usage entitlement metadata along with their creation. We’ve all seen historic metadata in Word documents… say no more;
there may be no easy way to generate metadata for physical items (except perhaps if these are being digitised) but this is also important as frequently pre-digital items have enormous value – otherwise they wouldn’t have been preserved! Scanning thousands of documents is unproductive, wasted effort if metadata is not created as part of that process;
make sure you are capturing all your metadata!
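
The point above about a nascent ontology and basic tagging conventions can be made concrete with a small controlled vocabulary that is checked whenever tags are captured. The sketch below is illustrative only: the tag names, the allowed values and the validate_tags function are all hypothetical, and a real vocabulary would be owned and maintained by the business.

```python
# Hypothetical controlled vocabulary: tag name -> allowed values (None = free text).
VOCABULARY = {
    "department": {"finance", "hr", "legal"},
    "document_type": {"contract", "invoice", "report"},
    "owner": None,
}
REQUIRED_TAGS = {"department", "document_type", "owner"}


def validate_tags(tags: dict) -> list:
    """Return a sorted list of problems; an empty list means the tags conform."""
    problems = []
    for name in REQUIRED_TAGS - set(tags):
        problems.append(f"missing required tag: {name}")
    for name, value in tags.items():
        if name not in VOCABULARY:
            problems.append(f"unknown tag: {name}")
            continue
        allowed = VOCABULARY[name]
        # Compare case-insensitively so "Finance" and "finance" are one value.
        normalised = str(value).strip().lower()
        if not normalised:
            problems.append(f"empty value for tag: {name}")
        elif allowed is not None and normalised not in allowed:
            problems.append(f"value not in vocabulary: {name}={value}")
    return sorted(problems)
```

Returning the problems instead of silently correcting them keeps the content creator in the loop, which supports the aim of turning personal tagging habits into organisational knowledge.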

INCORVUS

Incorvus Ltd.
Reg. Office:
5 Calico Row,
Plantation Wharf,
London, SW11 3YH.

Reg. in England & Wales.
Reg. Co. No. 5203890.

Reach out

Contact us
+44 (0)20 8538 9898
info@incorvus.com

© Incorvus Ltd. 2023 All rights reserved.