Validating data & its dimensions
Data validation tests the accuracy, consistency (of format and standards), quality and integrity of the data (and its associated metadata) for onward information engineering, data provisioning and curation. Note that even ‘reliable’ data still needs to be validated. Although it is not their primary purpose, validation techniques can expose data anomalies that point to hitherto unnoticed business issues.
Validating data may involve assessing its formats and standards against notional data models:
- legacy data is particularly vulnerable to ‘bad habits’ acquired in previous years;
- even though data should be agnostic, technological advances render certain formats obsolete in the context of non-native applications.
This may prompt further consideration of the data model, especially if some data transformation is required, e.g. between differing spatial data formats. Data validation can be carried out with purpose-written scripts or automated using software tools.
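As an illustration of script-based validation, the following is a minimal sketch in Python that checks records against a notional data model. The field names (`site_id`, `survey_date`, `depth_m`), the rules attached to them and the sample records are all hypothetical, chosen only to show checks for completeness, format and plausibility; a real data model would define its own.

```python
from datetime import datetime


def is_ymd_date(value):
    """True if the value parses as a year-month-day date string."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except (TypeError, ValueError):
        return False


def is_positive_number(value):
    """True for ints or floats greater than zero (booleans excluded)."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and value > 0
    )


# Hypothetical notional data model: field name -> check the value must pass.
MODEL = {
    "site_id": lambda v: isinstance(v, str) and v.strip() != "",
    "survey_date": is_ymd_date,
    "depth_m": is_positive_number,
}


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field, check in MODEL.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not check(record[field]):
            problems.append(f"{field}: invalid value {record[field]!r}")
    return problems


def validate_dataset(records):
    """Map row index -> problems, listing only rows that fail a check."""
    report = {}
    for index, record in enumerate(records):
        problems = validate_record(record)
        if problems:
            report[index] = problems
    return report


# Made-up sample data: one clean row, one legacy date format, one incomplete row.
SAMPLE = [
    {"site_id": "A1", "survey_date": "2021-06-30", "depth_m": 12.5},
    {"site_id": "A2", "survey_date": "30/06/2021", "depth_m": 8.0},
    {"site_id": "", "survey_date": "2021-07-01"},
]

if __name__ == "__main__":
    for row, problems in validate_dataset(SAMPLE).items():
        print(f"row {row}: {'; '.join(problems)}")
```

The second sample row stands in for the kind of legacy ‘bad habit’ described above: a date stored in a format the notional model no longer accepts. The script reports the anomaly rather than silently converting the value, so the decision to transform the data or revise the model stays with the data owner.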