Data strategy: preparation

Preparation is the act of making the data ready for further use, taking into account the recommendations from the Discovery phase. It is largely concerned with data cleansing, but should also take forward the draft new data structures, models and taxonomies:

  • Data cleansing: typically a matter of improving data quality and purging unwanted data (but may include more proactive elements aimed at adding future value):
    • parsing;
    • standardisation of metadata according to agreed taxonomies;
    • de-duping;
    • blending and merging of sibling or peer records;
    • organisation;
    • enrichment (for onward publication);
    • tagging or annotation (for onward use).
  • Trial uploads: before migrating, the prepared data needs to be tested for its suitability for the new (Cloud) environment. These uploads use the clean trial data.
  • Rinse & repeat: repeat the data cleansing and testing process until satisfied that the data is ready for migration. Capture and report any anomalies, and monitor data quality and performance metrics.
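
As an illustration only, three of the cleansing steps listed above (standardisation against an agreed taxonomy, de-duping, and merging of sibling records) might be sketched as follows. The field names (id, title, category, owner) and the TAXONOMY mapping are hypothetical and stand in for whatever the Discovery phase actually agreed.

```python
# Hypothetical agreed taxonomy: maps free-text variants to the agreed term.
TAXONOMY = {
    "hr": "Human Resources",
    "human resources": "Human Resources",
    "fin": "Finance",
    "finance": "Finance",
}


def standardise(record):
    """Return a copy with strings trimmed and the category mapped to the taxonomy."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    category = cleaned.get("category")
    if isinstance(category, str):
        # Unknown categories are left unchanged so they can be reviewed by hand.
        cleaned["category"] = TAXONOMY.get(category.lower(), category)
    return cleaned


def merge(first, second):
    """Blend two sibling records: keep the first record's values, fill gaps from the second."""
    merged = dict(first)
    for key, value in second.items():
        if merged.get(key) in (None, ""):
            merged[key] = value
    return merged


def cleanse(records, key="id"):
    """Standardise every record, then de-duplicate on `key`, merging duplicates."""
    by_key = {}
    for record in map(standardise, records):
        k = record[key]
        by_key[k] = merge(by_key[k], record) if k in by_key else record
    return list(by_key.values())


raw = [
    {"id": 1, "title": " Pay policy ", "category": "HR", "owner": ""},
    {"id": 1, "title": "Pay policy", "category": "hr", "owner": "J. Smith"},
    {"id": 2, "title": "Budget", "category": "fin", "owner": None},
]
clean = cleanse(raw)
```

Here the two records sharing id 1 collapse into one, with the empty owner filled from its sibling. A real exercise would need matching rules agreed with the data owners rather than a simple key comparison.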

The outcome should be data that is usable, useful, accurate, validated, fully described, complete and meaningful – clean, trial data ready for upload during the migration phase.
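
One possible way to decide when the cleanse-and-test cycle can stop is to capture anomalies and a simple quality metric on each pass. The sketch below assumes a hypothetical set of required fields and a hypothetical readiness threshold; neither comes from the strategy itself.

```python
# Hypothetical required fields; the real list would come from the agreed data model.
REQUIRED_FIELDS = ("id", "title", "category")


def quality_report(records, required=REQUIRED_FIELDS):
    """Capture anomalies (missing fields, duplicate ids) and a clean-record ratio."""
    anomalies = []
    seen_ids = set()
    for index, record in enumerate(records):
        missing = [f for f in required if record.get(f) in (None, "")]
        if missing:
            anomalies.append((index, "missing: " + ", ".join(missing)))
        record_id = record.get("id")
        if record_id is not None and record_id in seen_ids:
            anomalies.append((index, "duplicate id"))
        seen_ids.add(record_id)
    total = len(records)
    flagged = len({index for index, _ in anomalies})
    clean_ratio = 1.0 if total == 0 else (total - flagged) / total
    return {"total": total, "clean_ratio": clean_ratio, "anomalies": anomalies}


def ready_for_migration(report, threshold=1.0):
    """Assumed rule: ready when the share of clean records meets the threshold."""
    return report["clean_ratio"] >= threshold


trial = [
    {"id": 1, "title": "Pay policy", "category": "Human Resources"},
    {"id": 1, "title": "Pay policy (copy)", "category": "Human Resources"},
    {"id": 2, "title": "", "category": "Finance"},
]
report = quality_report(trial)
```

With this trial set the report flags the duplicate id and the empty title, so ready_for_migration(report) is False and another cleansing pass is needed.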