A growing number of security and data privacy experts are warning that proposed NHS Digital plans to scrape medical data on 55 million patients in England into a new database create unacceptable levels of security risk.
The plan was officially announced earlier in May, and of particular note is that patients have only until 23 June 2021 to opt out of the scheme by filling out a paper-based form and handing it to their GP. If they do not do so, their data will become part of the data store, and they will not be able to remove it, although they will be able to stop data yet to be generated from being added.
The General Practice Data for Planning and Research (GPDPR) database will contain swathes of sensitive personally identifiable information (PII), which will be pseudonymized and include data on diagnoses, symptoms, observations, test results, medications, allergies, immunizations, referrals, recalls and appointments. It will also include information on physical, mental, and sexual health, data on gender, ethnicity, and sexual orientation, and staff who have treated patients.
It is proposed that the data store be shared by multiple bodies, including academic and commercial organizations such as pharmaceutical companies in the interests of research and forward health planning, to analyze inequalities in healthcare provision, and to research the long-term impact of Covid-19 on the population.
David Sygula, a senior cyber security analyst at CybelAngel, conceded that taken at face value, the plans provided some “strong benefits” from the perspective of an academic researcher, and agreed that – as NHS Digital hopes – an initiative such as GPDPR could be valuable in controlling the magnitude of the pandemic’s impact on the UK.
“However,” he added, “data collection on this scale is creating a new set of risks for individuals, where their personal health information is exposed to third-party data breaches.
“The extent of the unsecured database problem is growing. It is not simply an NHS issue, but the NHS’s third, fourth, or further removed parties too, and how they will ensure the data is securely handled by all suppliers involved. These security policies and processes absolutely need to be planned well in advance and details shared with both third parties and individuals.”
Sygula recommended several mechanisms that might usefully be put in place – such as the complete anonymization, not pseudonymization, of data – on the basis that a leak of data from the system is practically inevitable.
“Security researchers, attackers, and rogue states have all put in place processes to identify unsecured databases and will rapidly find leaked information,” he said. “That is the default assumption we should start with. It is about making sure patients are not personally exposed in case of a breach while setting up the appropriate monitoring tools to look for exposed data among the supply chain.”
Timelines too short?
Beyond the risk from third-party breaches and cybercriminals tempted by valuable personal data, IntSights chief compliance officer Chris Strand said that in his view, NHS Digital had failed to give people long enough to assess their individual risk position and opt out if desired.
“The opt-out plan could introduce complexities for some people who aren’t actively involved in how their data is used or who understand the implications of how their data may be used for research,” he said. “In less than a month, how can they ensure that every individual included had an adequate opportunity to be informed on the data use and also had a chance to understand the implications of their data being used by third parties?
“I would be concerned about the legality of proving that people had a fair opportunity to opt out of the ‘data collection’. Challenges could be presented after the database is released to those who want to use it for research.
“Having dealt with the process of ensuring data use is disclosed to data owners, there may be legal consequences as it could be difficult to prove that all the individuals included in the database had an adequate opportunity to opt out of its use, especially given the nature of the sensitive data involved in this database.”
History repeating itself
Keystone Law technology and data partner Vanessa Barnett was also among those who pointed out risks. She said previous data-sharing health initiatives, such as an arrangement between the Royal Free Hospital NHS Trust and Google DeepMind, had been ruled non-compliant with the UK’s Data Protection Act (DPA) by the Information Commissioner’s Office (ICO).
“This is one of those times where one of the less famous bits of the GDPR [General Data Protection Regulation] comes to mind – that the processing of personal data should be designed to serve mankind,” she said. “The right to protection of personal data is not absolute; it must be considered in relation to its function in society and be balanced against other fundamental rights, in accordance with the principle of proportionality.
“This processing of health data could quite rightly serve mankind – but it all depends on what data, who it is given to, and what they do with it.” In the Royal Free-DeepMind case, the ICO found shortcomings in how patient records were shared, notably that patients would not have reasonably expected their data to be shared. The Trust should have been more transparent over its intentions.
“To me, this new mass sharing proposed by the NHS could well be history repeating itself,” said Barnett. “Most people would not expect their GP records to be shared in this way, have no awareness of it, and will not opt out because they had no understanding.
“It is noteworthy to see that the data will be pseudonymized rather than anonymized – so it is possible to reverse-engineer the identity of the patients in some circumstances. Suppose the created data lake is genuinely for research, analyzing healthcare inequalities, and research for serious illness. What is the reason this cannot be done on a true anonymized basis?”
Barnett warned that while using personal data in this way was not in itself illegal, failure to put in the necessary legwork to enable the data subjects – the general public – to understand what is happening and to have a “real and proper” opportunity to withdraw consent could ultimately prove a breach of some of the more administrative aspects of the DPA.
What NHS Digital says
According to outgoing NHS Digital CEO Sarah Wilkinson, GP data is particularly valuable to the health services because of the volume of illnesses treated in primary care. “We want to ensure that this data is made available for use in planning NHS services and clinical research,” she said.
But Wilkinson did acknowledge that it was critical that this was done in a way that ensured patient confidentiality and trust were prioritized and uncompromised.
“We have therefore designed technical systems and processes which incorporate pseudonymization at source, encryption in transit and in situ, and rigorous controls around access to data to ensure the appropriate use,” she said. “We also seek to be as transparent as possible in how we manage this data so that the quality of our services are constantly subject to external scrutiny.”
NHS Digital says it has consulted with patient and privacy groups, clinicians, and technology experts, as well as multiple other bodies, including the British Medical Association (BMA), the Royal College of GPs (RCGP), and the National Data Guardian (NDG) on the GPDPR system.
Arjun Dhillon, Caldicott guardian and clinical director at NHS Digital, said: “This dataset has been designed with the interests of patients at its heart.
“By reducing the burden of data collection from general practice, together with simpler data flows, increased security and greater transparency, I am confident as NHS Digital’s Caldicott guardian that the new system will protect the confidentiality of patients’ information and make sure that it is used properly for the benefit of the health and care of all.”
NHS Digital’s GPDPR transparency notice, including further details of how the data will be used and by whom, and information on how to opt out, is available from NHS Digital.