Why has data anonymization in the pharma industry become an important topic? Technology has significantly reshaped the clinical trials landscape, and clinical trial Sponsors need to cope with both data privacy and transparency requirements. We will look at how these transparency obligations have created a need for patient data anonymization and how technology supports compliance. In May 2018, the General Data Protection Regulation (GDPR) came into effect in the European Union, strengthening the protection of personal and sensitive data. Technology is helping companies adapt to these changes and comply with the new legislation.
What is the business need for data anonymization?
The Clinical Trial Regulation 536/2014 is set to come into effect in 2019 and looks to create an environment that is favorable to conducting clinical trials in the EU while ensuring the highest standards of safety for participants and increased transparency of trial information.
In order to fulfill transparency, disclosure and research requests while safeguarding the privacy of individuals, clinical trial Sponsors need a reliable method of anonymizing data. The EMA lays out policy guidelines for the anonymization of clinical trial data:
- Policy 0070 Phase I: within 60 days of a marketing authorization decision, the Clinical Study Report must be made available in a form that removes any risk of a subject's identity being disclosed.
- Policy 0070 Phase II: pertains to the publishing of individual patient data. This will make it mandatory for Sponsors to respond to reasonable requests from the public to access the data, not just the results, collected during clinical trials.
Data anonymization can be broken down into three techniques:
- De-identification: removing or re-coding health information that could identify an individual, such as patient identifiers, free-text verbatim terms, dates, or unusual height and weight values. Generalization groups data points into broader categories, while suppression removes them altogether. Global generalization models are preferred because they preserve data utility in data analysis and mining applications. Randomization alters the veracity of the data in order to remove the strong link between the data and the individual.
- Anonymization: destroying all links between the de-identified datasets and the original datasets.
- Redaction: removal of text and images from a document in order to preserve confidentiality. This is a necessary step if data is not anonymized at the patient level.
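The de-identification techniques above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production method: the field names (`patient_id`, `age`, `visit_date`, `verbatim`), the 10-year age bands, and the 30-day date-shift window are all hypothetical choices made for the example.

```python
import random
from datetime import date, timedelta


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def deidentify(records, seed=0):
    """Re-code IDs, generalize ages, shift dates, and suppress free text.

    Returns the de-identified records plus the re-coding map, which is
    the link back to the original data.
    """
    # Seeded only so the example is reproducible; not a secure source.
    rng = random.Random(seed)
    recode = {}   # original patient id -> new code
    offsets = {}  # original patient id -> date shift in days
    out = []
    for rec in records:
        pid = rec["patient_id"]
        if pid not in recode:
            recode[pid] = f"P{len(recode) + 1:04d}"
            # Randomization: one offset per patient keeps the intervals
            # between that patient's visits intact.
            offsets[pid] = rng.randint(-30, 30)
        out.append({
            "patient_id": recode[pid],
            "age_band": generalize_age(rec["age"]),
            "visit_date": rec["visit_date"] + timedelta(days=offsets[pid]),
            # Suppression: the free-text 'verbatim' field is not copied.
        })
    return out, recode


records = [
    {"patient_id": "SITE01-0007", "age": 47,
     "visit_date": date(2018, 3, 1), "verbatim": "seen by Dr. Rossi"},
    {"patient_id": "SITE01-0007", "age": 47,
     "visit_date": date(2018, 3, 15), "verbatim": "follow-up"},
]
deidentified, recode_map = deidentify(records, seed=1)
```

The returned `recode_map` is exactly the kind of link that the anonymization step would destroy once de-identification is complete.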
How can companies determine the level of redaction to employ? The redaction of documents should be planned following a risk analysis approach, and a risk analysis plan should document the decisions made about the level of redaction to implement. Companies should ask themselves whether they have a risk-based approach to Protected Personal Data (PPD) and which key documents need to be considered. At this point it is beneficial for companies to consult experts in the field for a complete risk assessment.
It is important to keep in mind the trade-off between data privacy and data usability. A risk assessment helps us understand what level of anonymization needs to be applied in order to preserve as much data as possible while protecting personal data. Once data has been anonymized, the risk of re-identification can be measured by using statistical techniques to compare a sample of the anonymized data to the original data.
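One common way to put a number on re-identification risk is to count how many records share the same combination of indirectly identifying values (quasi-identifiers). The sketch below uses that k-anonymity-style measure as a stand-in; it is not necessarily the statistical technique referred to above, and the function name and the choice of `age_band` and `sex` as quasi-identifiers are assumptions made for illustration.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Estimate re-identification risk from equivalence class sizes.

    An equivalence class is the set of records that share the same values
    for every quasi-identifier. A record in a class of size n can be
    singled out with probability 1/n.
    """
    if not records:
        raise ValueError("records must not be empty")
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    k = min(class_sizes.values())
    return {
        "k": k,                                   # smallest class size
        "max_risk": 1 / k,                        # worst-case record
        "avg_risk": len(class_sizes) / len(keys)  # mean of 1/n over records
    }


sample = [
    {"age_band": "40-49", "sex": "F"},
    {"age_band": "40-49", "sex": "F"},
    {"age_band": "40-49", "sex": "F"},
    {"age_band": "50-59", "sex": "M"},
]
risk = reidentification_risk(sample, ["age_band", "sex"])
```

In this sample the single 50-59 male record is unique, so the maximum risk is 1.0. A result like that would suggest applying more generalization or suppressing the record before release.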
The Data Anonymization Process
The anonymization process itself requires a functional specification that defines the anonymization to be conducted; this specification is then turned into a machine-readable metadata file. That file is used along with the datasets to de-identify the data and then destroy the links with the original data to produce anonymized data. Finally, a report is produced that documents what was done and the results of any risk measurement.
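The flow described above (specification, metadata file, de-identification, link destruction, report) can be sketched as follows. The JSON layout, the action names (`recode`, `generalize`, `suppress`, `keep`), and the field names are hypothetical and chosen only for this example; they do not describe any particular tool.

```python
import json

# Hypothetical machine-readable metadata derived from the specification.
SPEC_JSON = """
{
  "patient_id": {"action": "recode"},
  "age":        {"action": "generalize", "width": 10},
  "verbatim":   {"action": "suppress"},
  "sex":        {"action": "keep"}
}
"""


def anonymize(records, spec):
    """Apply the metadata rules, destroy the ID link, and build a report."""
    link = {}  # original id -> new id; exists only during processing
    out = []
    for rec in records:
        new = {}
        # Fields that are not listed in the spec are dropped by default.
        for field, rule in spec.items():
            action = rule["action"]
            value = rec[field]
            if action == "keep":
                new[field] = value
            elif action == "recode":
                new[field] = link.setdefault(value, f"ID{len(link) + 1:04d}")
            elif action == "generalize":
                low = (value // rule["width"]) * rule["width"]
                new[field] = f"{low}-{low + rule['width'] - 1}"
            elif action == "suppress":
                continue
            else:
                raise ValueError(f"unknown action: {action}")
        out.append(new)
    report = {
        "records": len(out),
        "subjects": len(link),
        "actions": {field: rule["action"] for field, rule in spec.items()},
    }
    link.clear()  # destroy the link between new and original identifiers
    return out, report


spec = json.loads(SPEC_JSON)
source = [
    {"patient_id": "SITE01-0007", "age": 47, "sex": "F", "verbatim": "note"},
    {"patient_id": "SITE01-0007", "age": 47, "sex": "F", "verbatim": "note 2"},
]
anonymized, report = anonymize(source, spec)
```

Keeping the rules in a metadata file rather than in code means the same specification can be reviewed, versioned, and reapplied, and the report can state exactly which rule was applied to each field.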
Companies need a strategy to ensure results and individual patient data can be shared in a way that protects personal data. A risk-based approach allows us to apply the correct level of anonymization in order to preserve data utility while protecting personal data.
Why Pharma needs to be concerned with GDPR
Let’s turn now to GDPR. Adopted in April 2016, the regulation took effect on 25 May 2018. It aims to give citizens control of their personal data and to simplify the regulatory environment for international business by unifying regulation within the EU.
Major requirements that concern pharma companies include:
- Identifying personal and sensitive data
- Accountability: maintaining an inventory of personal data processing activities
- Embedding Privacy by Design into systems and processes
- Deletion and the right to be forgotten
- Mandatory notification to a Supervisory Authority within 72 hours for certain types of breaches
- Considering international, cross-border transfers and the steps that need to be taken
- Assigning a Data Protection Officer (DPO)
- Fines of up to 4% of global annual turnover or €20 million, whichever is higher, for non-compliance
Data controllers must be able to demonstrate compliance with data protection principles.
Pharma companies should draw on the latest technologies and on the expertise and qualifications of service providers to comply with these regulations. Service providers can support risk assessments as well as data anonymization and redaction workflows. Companies should also ask service providers what procedures they have in place for data privacy, such as Privacy Shield and ISO 27001:2014 compliance for an Information Security Management System.
CROS NT has a data anonymization workflow process and tool in place to support companies with data anonymization and redaction. We are ISO 27001:2014 certified for our IT systems and processes and work with technology providers that have implemented data privacy measures for data collection and transfers.