Data Integrity and Compliance
This reading illustrates the importance of data integrity using an example of a global
company’s data. Definitions of terms that are relevant to data integrity will be
provided at the end.
A strong analysis depends on the integrity of the data. If the data you're using is
compromised in any way, your analysis won't be as strong as it should be. Data
integrity is the accuracy, completeness, consistency, and trustworthiness of data
throughout its lifecycle. That might sound like a lot of qualities for the data to live up
to. But trust me, it's worth it to check for them all before proceeding with your
analysis. Otherwise, your analysis could be wrong, not because you did something
wrong, but because the data you were working with was wrong to begin with.
When data integrity is low, it can cause anything from the loss of a single pixel in an
image to an incorrect medical decision. In some cases, one missing piece can make all
of your data useless.
Data integrity can be compromised in lots of different ways. There's a chance data
can be compromised every time it's replicated, transferred, or manipulated in any
way.
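One common way to check whether a copy of your data survived a transfer unchanged is to compare checksums of the original and the copy. The sketch below is only an illustration: the sample data and the helper names sha256_of and copy_is_intact are made up for this example, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def copy_is_intact(source: bytes, copy: bytes) -> bool:
    """Return True when the copy has the same checksum as the source."""
    return sha256_of(source) == sha256_of(copy)


# Hypothetical sample data: an original export and two received copies.
original = b"order_id,amount\n1001,250.00\n1002,99.50\n"
good_copy = b"order_id,amount\n1001,250.00\n1002,99.50\n"
bad_copy = b"order_id,amount\n1001,250.00\n1002,99.05\n"  # two digits swapped

print(copy_is_intact(original, good_copy))  # True
print(copy_is_intact(original, bad_copy))   # False
```

A matching checksum only tells you the copy is identical to the source. It does not tell you the source itself was accurate or complete.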
In a lot of companies, the data warehouse or data engineering team takes care of
ensuring data integrity. Checking data integrity is a vital step in processing your data
to get it ready for analysis, whether you or someone else at your company is doing it.
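Calendar dates are represented in a lot of different short forms. Depending on where
you live, a different format might be used. For example, 12/10/2020 means December
10, 2020 in the United States (month first), but October 12, 2020 in many other
countries (day first).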
A good analysis depends on the integrity of the data, and data integrity usually
depends on using a common format. So it is important to double-check how dates
are formatted to make sure what you think is December 10, 2020 isn’t really October
12, 2020, and vice versa.
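To see how easily this mix-up happens, here is a small sketch that reads the same short date two ways. It assumes the raw value is the string 12/10/2020, and the helper to_iso is a hypothetical name used only for this illustration. Converting everything to one unambiguous format (here, ISO 8601 year-month-day) is one possible common format, not the only one.

```python
from datetime import datetime

raw = "12/10/2020"  # ambiguous short form

# Read the same text with the month first, then with the day first.
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()

print(as_month_first.isoformat())  # 2020-12-10 (December 10, 2020)
print(as_day_first.isoformat())    # 2020-10-12 (October 12, 2020)


def to_iso(raw_date: str, source_format: str) -> str:
    """Convert a date string in a known source format to YYYY-MM-DD."""
    return datetime.strptime(raw_date, source_format).date().isoformat()
```

The code cannot tell which reading is correct. You have to know which format the source system used, then pass that format in explicitly, for example to_iso("12/10/2020", "%d/%m/%Y").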
Conclusion
Fortunately, with a standard date format and compliance by all people and systems
that work with the data, data integrity can be maintained. But no matter where your
data comes from, always be sure to check that it is valid, complete, and clean before
you begin any analysis.
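A pre-analysis check can be as simple as looping over your records and listing what is wrong with each one. The sketch below covers only two of the checks mentioned above, completeness (no missing fields) and validity (dates in the agreed format). The field names, the sample records, and the function find_problems are all hypothetical, and the agreed date format is assumed to be ISO 8601 (YYYY-MM-DD).

```python
from datetime import date

# Hypothetical list of fields every record is expected to have.
REQUIRED_FIELDS = ("order_id", "order_date", "amount")


def find_problems(record: dict) -> list:
    """Return a list of human-readable problems found in one record."""
    problems = []

    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")

    # Validity: the date must parse as an ISO 8601 date.
    order_date = record.get("order_date")
    if order_date:
        try:
            date.fromisoformat(str(order_date))
        except ValueError:
            problems.append("order_date is not a valid ISO 8601 date")

    return problems


# Hypothetical sample records.
records = [
    {"order_id": "1001", "order_date": "2020-12-10", "amount": "250.00"},
    {"order_id": "1002", "order_date": "12/10/2020", "amount": "99.50"},
    {"order_id": "1003", "order_date": "2020-10-12", "amount": ""},
]

for record in records:
    print(record["order_id"], find_problems(record))
```

An empty list means the record passed both checks. Real datasets usually need more checks than this, such as duplicates or out-of-range values.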
As you progress in your data journey, you'll come across many types of data
constraints (or criteria that determine validity). The table below offers definitions and
examples of data constraint terms you might come across.