Computer projects that involve migrating data
from one system (source) to another (target) are notoriously high
risk. In many instances the failures are obvious, where systems simply
fail to work at all. Other failures are, however, less obvious.
There are a number of "ticking time bombs" where the failure takes
the form of the data being corrupted as part of the process,
but where the corruption is yet to be noticed. The longer these
errors go unnoticed, the more severe the consequences for the operations
concerned. The net effect of such failures very much depends on
the organisation, department and operations concerned. For a commercial
organisation the impact will be noticed in the profits, whereas
such failures with NHS patient data might have far more serious
or even fatal implications.
How Migration Projects Go
Wrong - Terminology
Information
The basis for decision making, or the business meaning
of the data. Information is held in a computer system
as "data" or "data values".
Data
The physical representation of the information stored
in a computer. To minimise storage, information is often
held in an extremely complicated coded format. Clinical
coding is one example, where a single code might represent
many words or definitions.
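
The distinction between a stored data value and the information it represents can be illustrated with a minimal sketch. The code table, the codes "A1" and "A2", their descriptions and the function name decode are all invented for illustration; they are assumptions and do not come from any real clinical coding scheme.

```python
# Hypothetical code table: these codes and descriptions are invented
# for illustration and are not taken from any real coding scheme.
CODE_TABLE = {
    "A1": "Patient reports allergy to penicillin",
    "A2": "Patient reports no known allergies",
}


def decode(code):
    """Return the information (meaning) that a stored data value represents."""
    try:
        return CODE_TABLE[code]
    except KeyError:
        # A value with no entry in the table carries no recoverable meaning.
        raise ValueError(f"Unknown code: {code!r}") from None


stored_value = "A1"             # the data held in the system
meaning = decode(stored_value)  # the information it represents
print(stored_value, "->", meaning)
```

The two characters held in storage are the data; the sentence they stand for is the information. A value that cannot be decoded is rejected rather than guessed at.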
How and Why Data Migration
Projects Fail
People often notice and refer to data corruption when
they are presented with a word that makes no obvious sense,
for example where symbols or serious misspellings are seen
in data fields when the input data was known to be accurate.
Serious data corruption is usually a lot less obvious.
In the case of migrated data, where data values may
need to be changed to fit new system constraints,
data corruption refers not to the data value but to
the information that the data value is supposed to represent.
Just as in spoken languages, the most subtle error in
translation can completely change the meaning of a phrase,
often with the most profound of consequences.
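
A small sketch may help show how a migration can corrupt the information while every data value still looks valid. The status codes, their meanings and both value maps below are hypothetical, invented purely for illustration.

```python
# All codes and meanings here are hypothetical.
SOURCE_MEANING = {"1": "allergy recorded", "2": "no known allergies"}
TARGET_MEANING = {"Y": "allergy recorded", "N": "no known allergies"}

CORRECT_MAP = {"1": "Y", "2": "N"}
FAULTY_MAP = {"1": "N", "2": "Y"}  # valid target values, inverted meaning


def values_valid(value_map):
    """True if every mapped value is one the target system accepts."""
    return all(tgt in TARGET_MEANING for tgt in value_map.values())


def meaning_preserved(value_map):
    """True only if each source value maps to a target value with the same meaning."""
    return all(
        SOURCE_MEANING[src] == TARGET_MEANING.get(tgt)
        for src, tgt in value_map.items()
    )


for name, value_map in (("correct", CORRECT_MAP), ("faulty", FAULTY_MAP)):
    print(name, values_valid(value_map), meaning_preserved(value_map))
```

The faulty map passes a check on the values alone, because "Y" and "N" are both acceptable to the target. It fails only when the meanings are compared, which is the kind of corruption that tends to go unnoticed.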
Data migration is most common when replacing or upgrading
established (or source) systems. In such cases it
is common that the data values will need to be changed to
comply with the constraints of the new (or target) system.
The process for such migrations has thus been named Extract
(extracting the data from the source system), Transform (the
transformation or translation of data from one form to another)
and Load (the loading of the transformed data into the new
system). The IT industry uses the term ETL when referring to
this process. Where there are high volumes of data to be
migrated, specialist ETL tools may be used to help automate
the process. Whilst these might speed up the process, they add
an additional risk: any "bugs" in the tools used are likely
to result in the migrated data being corrupted.
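
The three ETL steps, together with an independent check on the result, can be sketched roughly as follows. This is a minimal illustration and not how any particular ETL tool works: the records, the status mapping and the function names (extract, transform, load, reconcile) are all assumptions made for the example, and the "systems" are plain in-memory Python structures.

```python
# Hypothetical records and mapping, invented for illustration.
SOURCE_ROWS = [
    {"id": 1, "status": "1"},
    {"id": 2, "status": "2"},
]
STATUS_MAP = {"1": "Y", "2": "N"}  # source value -> target value
REVERSE_MAP = {tgt: src for src, tgt in STATUS_MAP.items()}


def extract(rows):
    """Extract: copy rows out of the source system (here, a list)."""
    return [dict(row) for row in rows]


def transform(rows):
    """Transform: translate values for the target; fail on unmapped values."""
    out = []
    for row in rows:
        if row["status"] not in STATUS_MAP:
            raise ValueError(f"No mapping for status {row['status']!r}")
        out.append({"id": row["id"], "status": STATUS_MAP[row["status"]]})
    return out


def load(rows, target):
    """Load: write transformed rows into the target (here, a dict keyed by id)."""
    for row in rows:
        target[row["id"]] = row
    return target


def reconcile(source_rows, target):
    """Return ids whose loaded value is missing or does not translate back."""
    mismatches = []
    for row in source_rows:
        loaded = target.get(row["id"])
        if loaded is None or REVERSE_MAP.get(loaded["status"]) != row["status"]:
            mismatches.append(row["id"])
    return mismatches


target_system = load(transform(extract(SOURCE_ROWS)), {})
problems = reconcile(SOURCE_ROWS, target_system)
print("Mismatched ids:", problems)
```

The reconcile step compares the loaded records back against the source, so a record that is dropped or altered along the way shows up as a mismatch. Because it reuses the same mapping as the transform step, it would not detect a mapping that is itself wrong; that requires a separate check on the meaning of the values.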