FAQs about DataLoch
Some common questions answered.
DataLoch is the first Health and Social Care project established under a ten-year, multi-sector Data-Driven Innovation (DDI) programme funded by a substantial Edinburgh and South East Scotland (ESES) City Region Deal. This programme’s ambition is to establish the region as the data capital of Europe and help its organisations and citizens benefit from the data revolution.
DataLoch has been developed in partnership with NHS Lothian, and its strategic plan has been approved by the East Region Health Boards (Lothian, Borders and Fife) and by the region’s six Integration Joint Boards.
The aim of the DataLoch project is to develop a secure repository of all health and social care data for the ESES region to help find solutions to the major health and social care challenges we all face. This will require an efficient and safe approach to storing, defining, linking and providing access to data from across the region to inform research and service improvements.
Finalised in August 2018, the Edinburgh and South East Scotland City Region Deal is a UK and Scottish Government-led investment in the region designed to accelerate productivity and inclusive growth through the funding of infrastructure, skills and innovation.
The UK and Scottish Governments, and regional partners, are investing £1.3bn over 10 years in transport, housing, culture, skills, employability, and innovation. The regional partners include three NHS Boards (Lothian, Borders and Fife), six local authorities (City of Edinburgh, Midlothian, East Lothian, West Lothian, Fife and the Scottish Borders), plus regional universities and colleges.
The University of Edinburgh and Heriot-Watt University are co-ordinating the delivery of the Data-Driven Innovation component, which was included in the Deal in recognition of the region’s strengths in technology and data science; the growing importance of the data economy for everyone; and the need to tackle the gap in digital skills.
DataLoch is a Data-Driven Innovation-funded initiative.
In common with the rest of the UK, the Edinburgh and South East Scotland region is facing a number of major health and social care challenges. These include, but are not limited to, an ageing population; increasing numbers of people living with long-term conditions; delays in hospital discharge; and ever-increasing costs of medicines. Meeting such challenges requires innovative solutions.
Bringing together all key data for the region’s individuals, including social, primary and secondary care data, will allow a holistic, data-driven approach to the prevention, treatment, and provision of health and care services. It will also provide deeper insights into both individual and community experiences and use of services, helping to inform the future development of better and more efficient care.
The recent specific challenge of COVID-19 has highlighted, and accelerated, the need to have defined and linked data to inform service management and mitigate impact.
DataLoch will bring together the routine data currently collected as part of people’s day-to-day interactions with health and social care services. This will include details of activities such as the types of services being used, visits to hospitals or GPs, treatments and medicines received, and outcomes and test results. This “raw” data will then be cleaned, defined and linked within DataLoch, with data extracts prepared for specific projects.
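For readers who want a concrete picture of what "cleaned, defined and linked" can mean in practice, the following is a minimal illustrative sketch only. It does not describe DataLoch's actual methods or systems. The field names, the sample records, the secret key and the helper functions (`pseudonymise`, `clean`, `link_and_extract`) are all hypothetical, and the keyed hash is just one common way of replacing an identifier with a pseudonym.

```python
import hashlib
import hmac

# Hypothetical secret held only by the data team; not a real DataLoch value.
SECRET_KEY = b"example-key-held-by-the-data-team"


def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a keyed hash so it cannot be read directly."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def clean(record: dict) -> dict:
    """Trim whitespace from string values and normalise the identifier."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    cleaned["patient_id"] = cleaned["patient_id"].upper()
    return cleaned


def link_and_extract(hospital_visits, gp_prescriptions, fields):
    """Link two sources on patient_id and return a de-identified extract."""
    linked = {}
    for source in (hospital_visits, gp_prescriptions):
        for record in map(clean, source):
            # Records sharing an identifier are merged into one linked record.
            linked.setdefault(record["patient_id"], {}).update(record)

    extract = []
    for patient_id, merged in linked.items():
        # The extract carries a pseudonym and only the requested fields.
        row = {"pseudonym": pseudonymise(patient_id)}
        row.update({f: merged.get(f) for f in fields})
        extract.append(row)
    return extract


# Made-up sample records, for illustration only.
hospital_visits = [{"patient_id": " a123 ", "admission": "2020-04-02"}]
gp_prescriptions = [{"patient_id": "A123", "medicine": "salbutamol"}]

extract = link_and_extract(hospital_visits, gp_prescriptions, ["admission", "medicine"])
```

In this sketch the project extract contains only the fields requested plus a pseudonym, so the original identifier never leaves the linking step.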
DataLoch is exploring whether it should hold free-text information. This involves assessing both the value that can be derived from free text and, crucially, whether Natural Language Processing and other technologies can de-identify it to a robust standard. If these assessments prove positive, DataLoch will look to interrogate free text so that key de-identified components of it can be coded to support other pre-coded data, e.g. diagnosis criteria. Datasets containing these de-identified codes would then be made available to specific approved projects.
As outlined in the General Data Protection Regulation 2016, the legal basis for processing this data is:
6(1)(e) – processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.
[Justification for this purpose is for a confidential service for and with instruction from data controllers only]
9(2)(j) – processing is necessary for archiving purposes in the public interest, or scientific and historical research purposes or statistical purposes in accordance with Article 89(1).
[Justification for this purpose is for research under supervision of Caldicott Guardian(s) and anonymous for requestors]
A dedicated team will produce and manage the datasets held within DataLoch. Within this team, data scientists, with specialist training and access rights, will access the “raw” information used by DataLoch to produce clean, defined and linked datasets.
Following an approved application process, other users, such as accredited researchers and health and social care service managers, will be able to access specific datasets to help inform their approved research or service improvement work.
Any person wishing to access any of the datasets held within DataLoch will have to follow an approved application process. This will require applicants to meet a number of key governance criteria to ensure their purpose and interest are both legitimate (e.g. they are employed by a relevant public body or approved organisation) and appropriate (e.g. they are requesting access for approved scientific research or service improvement projects).
Each project will go through additional scrutiny to ensure the request is proportionate and in the public interest. Once approved, the DataLoch team will provide access to the relevant information through an appropriately secure portal. This means that approved project owners will access and analyse the data requested within a secure area.
In addition to these internal DataLoch processes, a separate governance panel will be established to provide additional scrutiny and guidance for applications which raise any potential additional risk. This panel will also maintain oversight, including audit, of DataLoch’s processes to ensure they are being properly followed at all times.
During its initial set-up, DataLoch will only use data controlled by NHS Lothian and housed within NHS Lothian’s Research Safe Haven. This infrastructure is one of several Safe Havens across Scotland that are already dedicated to safeguarding NHS information and are required to meet the best practice national standards for access and information security.
Once testing of this set-up is complete, DataLoch will move to the Edinburgh International Data Facility, using data from the wider region. This infrastructure, when complete, aims to set the highest possible standard for information security and access governance. The combination of secure infrastructure and governance policy and processes is designed to protect the data from attack or unauthorised use.
Within the current phase, i.e. exploratory work embedded solely within the NHS Lothian Research Safe Haven, a Data Protection Impact Assessment has been drafted and will be modified as the programme progresses in consultation with data controllers. Privacy assessments will also be carried out as appropriate within subsequent phases.
As part of its development phase, the project is establishing a communication strategy which includes a website and a wider communication campaign. In addition, we are establishing a Public Reference Group to be involved in informing the development journey with data controllers.
There is no patient opt-out, and this is in line with the legal basis for processing.
In response to the third Caldicott review, the Scottish Government’s Chief Medical Officer set out a detailed argument broadly rejecting that review’s recommendations to introduce an opt-out for individual de-identified data being used in research and service management. The key reasoning within that response aligns with the strategic rationale for establishing DataLoch, for example:
“offering consent opt-outs on data processing, essential for delivering high quality care and core task, is disruptive, costly and can reduce quality and equity of care”
“opt-outs are not distributed randomly in the population, they are a sign of specific health needs not being addressed or lack of trust ... the resultant bias provides poorer quality information for shared decision making”
In line with this current policy, there are no plans to include an individual opt-out facility for DataLoch in its aim to provide de-identified data to inform research and the better management and provision of services. However, the ongoing work on transparent communication and the embedding of a public reference group within the development acknowledges the need to be as open as possible with the public about how, and for what purpose, their de-identified data is being used.
DataLoch is being developed within a phased programme. The current first phase is due to run until August 2020 and is exploring and deciding on the best options to access, store, define and link key datasets. The scope of this phase has recently been increased to accommodate urgent work for NHS Lothian to define, link and make available key data on the management of COVID-19. The first dataset from this work was made available on 30 April 2020.
The second phase is planned to run until autumn 2021 and will involve linking increased numbers of datasets and running selected test projects to test and finalise processes and infrastructure. The intention is to complete these, and launch a functioning service for real projects, in late 2022.
In March 2020, NHS Lothian and the University of Edinburgh asked the DataLoch team to help in the production of a dedicated COVID-19 linked dataset to support immediate hospital-based service management and to provide a data asset for active and anticipated regional and national research into the outbreak. The DataLoch Leadership Team agreed to extend the scope of Phase 1 to accommodate this work on 1 April 2020.
DataLoch, working in collaboration with clinicians on NHS Lothian data within the NHS Lothian Safe Haven, has subsequently built a linked COVID-19 dataset which was released for use, via a controlled application process, on 30 April 2020.