Regulatory compliance refers to the process by which individuals, organizations, or businesses adhere to the laws, rules, regulations, and standards set forth by government bodies or industry-specific authorities. The goal of regulatory compliance is to ensure that entities operate within the legal and ethical boundaries of their industry and jurisdiction, thereby promoting transparency, accountability, and the protection of various interests, such as consumer rights, data privacy, environmental sustainability, and financial stability.
Put simply, regulatory compliance means following the law. But it can be difficult to navigate the morass of regulations and determine which specific requirements apply to your industry and type of business. And as regulations expand, organizations need to deploy better controls to ensure quality data and properly protected database systems. Many industry and governmental regulations, such as Sarbanes-Oxley, HIPAA, PCI DSS, and GDPR, drive the need to improve data protection, management, and governance.
Implementing data management systems to ensure data quality should be a large part of regulatory compliance. And it should start with metadata management.
What is Metadata?
Metadata describes and defines data. It provides documentation so that data can be understood and more readily consumed by your organization. Metadata answers the who, what, when, where, why, and how questions for users of the data.
Consumers of data must be able to put their data in context before the data becomes useful as information. Metadata makes data useful by embellishing it with details such as data type, length, and a textual description, as well as other characteristics of the data. So, for example, metadata allows the user to know that the customer number is a five-digit numeric field, whereas the data itself might be 13024. That number, without context, doesn’t really mean much at all.
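To make the customer number example concrete, here is a minimal sketch of a metadata record and a check that uses it. The class name, field names, and the description text are illustrative assumptions, not a standard metadata model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    """Hypothetical metadata record describing one data element."""
    name: str
    data_type: str      # e.g. "numeric" or "text"
    length: int         # expected number of characters
    description: str    # textual definition for consumers of the data


def conforms(value: str, meta: FieldMetadata) -> bool:
    """Return True if a raw value matches the type and length in its metadata."""
    if len(value) != meta.length:
        return False
    if meta.data_type == "numeric":
        # Accept only ASCII digits for a numeric field.
        return value.isascii() and value.isdigit()
    return True


# Metadata for the customer number from the example above (description is assumed).
customer_number = FieldMetadata(
    name="customer_number",
    data_type="numeric",
    length=5,
    description="Identifier assigned to each customer",
)

print(conforms("13024", customer_number))  # five ASCII digits
print(conforms("1302A", customer_number))  # contains a non-digit
```

With the metadata in hand, 13024 is no longer a bare number: it is a value that can be checked, described, and categorized.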
Furthermore, appropriate data definitions are required in order to apply the controls for compliance to the correct data. A control is a measure or process put in place to manage risk, whether to comply with the requirements of regulations or to meet the requirements of the business. It is the responsibility of organizations to understand which regulations apply to their business and then to put in place the proper controls to assure compliance with the mandates of those regulations.
With this in mind, how can we ensure that the proper controls are in place for each type of data we are statutorily required to protect? We need up-to-date and accurate metadata to place all of our data into proper categories for determining which regulations apply. For example, PCI DSS applies to payment card transactions, HIPAA applies to healthcare data, and so on. Some data will be subject to multiple regulations and some data will not be regulated at all. But without proper metadata definitions, it is impossible to apply the right compliance controls to the right data. What will inevitably occur is one of two things: either all controls are placed on all data, which causes additional work and perhaps performance problems, or some (or all) data is not protected as required by regulations.
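The categorization described above can be sketched as a lookup from metadata to regulations to controls. Everything in this sketch is a hypothetical placeholder: the column names, the category labels, and especially the control names, which are illustrative only and are not a summary of what PCI DSS or HIPAA actually require.

```python
# Hypothetical catalog: column name -> data categories recorded in its metadata.
CATALOG = {
    "card_number": {"payment_card"},
    "diagnosis_code": {"health"},
    "clinic_card_payment": {"payment_card", "health"},  # falls under two categories
    "newsletter_opt_in": set(),                         # not regulated
}

# Which regulation covers which category (pairings taken from the text above).
REGULATIONS_BY_CATEGORY = {
    "payment_card": "PCI DSS",
    "health": "HIPAA",
}

# Placeholder control names, for illustration only.
CONTROLS_BY_REGULATION = {
    "PCI DSS": {"control_a", "control_b"},
    "HIPAA": {"control_b", "control_c"},
}


def required_controls(column: str) -> set:
    """Return the union of controls for every regulation covering a column."""
    if column not in CATALOG:
        # Without metadata we cannot tell which regulations apply.
        raise KeyError(f"no metadata recorded for column {column!r}")
    controls = set()
    for category in CATALOG[column]:
        regulation = REGULATIONS_BY_CATEGORY[category]
        controls |= CONTROLS_BY_REGULATION[regulation]
    return controls


for column in CATALOG:
    print(column, sorted(required_controls(column)))
```

A column with no metadata raises an error rather than silently receiving all controls or none, which are the two failure modes described above.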
Data Quality
Another aspect of compliance is to ensure that data, once accurately defined, is itself accurate. Imposing regulatory controls on the wrong data does no good at all. This raises the question “How good is your data quality?”
Poor data quality can cost the typical company between 15% and 25% of revenue, according to data quality expert Thomas C. Redman. According to software marketing and technology expert Hollis Tibbetts, “Incorrect, inconsistent, fraudulent, and redundant data cost the U.S. economy over $3 trillion a year.”
The cost of poor data quality notwithstanding, high-quality data is crucial for complying with regulations. Think about it. If the data is not accurate, how can you be sure that the proper controls are being applied to the right pieces of data to comply with the appropriate regulations?
But what can we do about poor-quality data? Data quality is a business responsibility, but IT systems can assist by instituting technology controls. The first step should be to build properly constructed databases with well-defined data types. Building constraints into the database, including referential integrity, can improve overall data quality. Additional constraints should be defined in the database as appropriate to enforce uniqueness and to restrict data values to valid ranges, using check constraints and triggers.
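As a rough illustration of these database-level controls, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and the assumption that a five-digit customer number falls between 10000 and 99999 are all hypothetical; a production DBMS would use its own DDL dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (
    customer_number INTEGER PRIMARY KEY
        CHECK (customer_number BETWEEN 10000 AND 99999),  -- assumed value range
    email TEXT NOT NULL UNIQUE                             -- uniqueness constraint
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_number INTEGER NOT NULL
        REFERENCES customer (customer_number),             -- referential integrity
    amount REAL NOT NULL CHECK (amount >= 0)               -- check constraint
);
""")


def try_insert(sql: str, params: tuple) -> bool:
    """Attempt an insert; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ADD_CUSTOMER = "INSERT INTO customer (customer_number, email) VALUES (?, ?)"
ADD_ORDER = "INSERT INTO orders (order_id, customer_number, amount) VALUES (?, ?, ?)"

valid_customer = try_insert(ADD_CUSTOMER, (13024, "a@example.com"))
out_of_range = try_insert(ADD_CUSTOMER, (130, "b@example.com"))      # fails CHECK
duplicate_email = try_insert(ADD_CUSTOMER, (13025, "a@example.com")) # fails UNIQUE
orphan_order = try_insert(ADD_ORDER, (1, 99999, 10.0))               # fails foreign key
negative_amount = try_insert(ADD_ORDER, (2, 13024, -5.0))            # fails CHECK
valid_order = try_insert(ADD_ORDER, (3, 13024, 25.0))
```

Because the rules live in the database itself, every application that writes to these tables is held to the same definitions.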
Another technology tactic that can be deployed to improve data quality is data profiling. Data profiling is the process of examining the existing data in the database and collecting statistics and other information about that data. With data profiling, you can discover the quality, characteristics, and potential problems of your data. Using the statistics collected by the data profiling solution, business analysts can undertake projects to clean up problematic data in the database.
Data profiling can dramatically reduce the time and resources required to find problematic data. Furthermore, it allows business analysts and data stewards to have more control of the maintenance and management of enterprise data.
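The kind of statistics a profiling pass collects can be sketched in a few lines. This is a simplified illustration, not a description of any particular profiling product; the function name, the choice of statistics, and the sample values are assumptions.

```python
from collections import Counter


def profile_column(values: list) -> dict:
    """Collect simple statistics for one column; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        # Values that occur more than once, useful for a supposedly unique column.
        "duplicated_values": sorted(v for v, n in counts.items() if n > 1),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical customer numbers pulled from an existing table.
sample = [13024, 13025, None, 13024, 130]
profile = profile_column(sample)

for statistic, value in profile.items():
    print(f"{statistic}: {value}")
```

Even this small profile surfaces three candidate problems for a data steward to review: a missing value, a duplicated customer number, and a minimum (130) that is not five digits long.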
The Bottom Line
It is a reliable maxim that you first must understand a thing before you try to change or control it. Failing to do so will usually result in a mess. So, it should make sense that you first must understand what data you collect and manage before you attempt to control it, especially with regard to regulatory compliance, where there can be stiff penalties for noncompliance.
Inevitably, this means that metadata management must be a crucial first step for any organization because understanding your data is a prerequisite for implementing appropriate controls for regulatory compliance… not to mention that it helps your organization to better conduct business and serve its customers.