Preserving, sharing, and disposing of your research data
Information on how and where to preserve research data, and how to write a data access statement for research publications.
Understand the requirements
Many funders increasingly require long-term data preservation and, where appropriate, sharing. You need to decide what data to preserve, where to preserve it, and to what extent it can be shared. Equally, it is sometimes necessary to safely dispose of your research data. Find out what your funder and UWE Bristol say about data preservation and sharing.
Decide where and how to preserve your research data
Digital repositories keep your data safe and readable over the long term. They also allow you to make your research data easily accessible to a wider audience, if appropriate.
Locate a research data repository
There is a range of subject-based repositories available, and UWE Bristol has its own Research Data Repository.
Data is usually deposited by the data generator or owner, or a designated person. The data repository takes responsibility for preserving and sharing the data according to any restrictions.
Some funders recommend or suggest data repositories.
| Funder | Recommended repository or guidance |
|---|---|
| ESRC | UK Data Service |
| NERC | NERC data centres |
| Wellcome Trust | Guidance on data repositories |
| BBSRC | Guidance on data repositories |
| EPSRC | Please contact firstname.lastname@example.org for advice. |
re3data is a global registry of research data repositories across academic disciplines. You can use it to find a repository for the long-term storage of, and access to, research data.
It is recommended that, where possible, data is deposited with an existing, recognised data repository. The UWE Bristol Research Data Repository can be used where no suitable national or subject data repositories exist. If you want to use this service, please contact the Library's Research Support Team via email@example.com.
Assess the suitability of a data repository
There are a number of things to consider when selecting a suitable repository to archive and publish your research data:
- What type of data does the repository accept, and what is its subject focus?
- Does the data repository already have a good reputation in this field, or is it recommended or required by your funder or journal?
- Will the repository provide enough metadata to enable your data to be discovered and cited by other researchers?
- Will the repository issue your data with a persistent identifier, such as a Digital Object Identifier (DOI) or an accession number, that you can include in your data access statement? A search for archives in re3data allows you to tick a box restricting results to those that provide persistent identifiers.
- Are access restrictions or embargoes permitted? Will the archive ensure that confidential or personal data are secured if that is required?
- Do the data repository's terms and conditions fit with the University's intellectual property advice as defined in the Code of Good Research Conduct? Researchers should not use data repositories that require any transfer of rights without having this authorised by a senior member of the University.
- What licences are available and do they comply with the University's Research Data Policy?
- Is the archive established and well funded, so that you can rely on it still preserving your data in 10 years' time, or even longer?
- If for whatever reason the data repository ceases to exist, is it possible to reclaim research data and place it elsewhere?
The Digital Curation Centre has produced a helpful checklist for evaluating data repositories.
Working with a data repository
Deposit processes vary depending on the research data repository, so you will have to consult the relevant repository's web pages or contact them directly to find out the detail. This should be done as close to the start of a project as possible.
Some repositories allow free deposit and access, while others charge a fee to maintain, share, and access data. Any relevant costs should be established at the start of the project, or ideally before the project starts, so that the necessary funding for data preservation can be secured.
As far as possible, digital research data should be preserved in formats which do not require specialist software to read them. The easiest way to ensure long-term accessibility of research data is to store it in standard, non-proprietary file formats, for example CSV files instead of Microsoft Excel workbooks. Some useful examples of formats for long-term preservation are provided by the UK Data Service.
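As an illustration of saving tabular data in a non-proprietary format, the following is a minimal sketch using only Python's standard `csv` module. The records, column names, and the `write_csv` helper are hypothetical placeholders, not part of any UWE Bristol or repository tooling.

```python
import csv
import io

# Hypothetical sample records standing in for a spreadsheet of results.
rows = [
    {"participant_id": "P001", "age_group": "18-24", "score": 42},
    {"participant_id": "P002", "age_group": "25-34", "score": 37},
]


def write_csv(records, handle):
    """Write a list of dicts to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)


# An in-memory buffer is used here so the sketch has no side effects.
# For a real file, use: open("results.csv", "w", newline="", encoding="utf-8")
buffer = io.StringIO()
write_csv(rows, buffer)
csv_text = buffer.getvalue()
```

Plain-text CSV can be opened by almost any software, but it does not carry formulas, formatting, or multiple sheets, so anything of that kind that matters should be described in the accompanying documentation.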
Research data repositories will have their own requirements about how research data is presented for preservation and accessibility. You should contact your data repository to see what is required as early on in the project as possible, especially if data is to be collected and used in one format, and converted to another for preservation.
All research data repositories will require some form of documentation or metadata to describe the research data, but the required format will differ between repositories.
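To give a sense of what descriptive metadata can look like, here is a minimal sketch of a metadata record saved as JSON alongside a dataset. Every field name and value is a hypothetical placeholder, as are `REQUIRED_FIELDS` and `missing_fields`; your chosen repository will define its own schema, which should take precedence.

```python
import json

# All field names and values are hypothetical placeholders;
# check the metadata schema of your chosen repository.
metadata = {
    "title": "Example survey dataset",
    "creators": ["Example Researcher"],
    "description": "Anonymised survey responses collected for an example project.",
    "file_format": "CSV",
    "licence": "to be agreed with the repository",
    "access_level": "open",
}

# Assumed minimum set of fields for this sketch only.
REQUIRED_FIELDS = ("title", "creators", "description", "licence")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


# Serialise the record so it can be stored next to the data files.
metadata_json = json.dumps(metadata, indent=2, ensure_ascii=False)
```

Checking for missing fields before deposit is a small safeguard; it does not replace the repository's own validation.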
Your options at UWE Bristol
UWE Bristol Research Data Repository
If data cannot be placed in a national or subject data repository, it might be possible to use the UWE Bristol Research Data Repository. This can accept metadata relating to datasets and, in some cases where the data can be shared without restriction, the datasets themselves. Video guidance is available, but if you would like to use this service, please first contact the Library's Research Support Team at firstname.lastname@example.org.
UWE Bristol Library, RBI, and IT Services have worked together to provide long-term, restricted-access storage for research data which cannot be openly shared. This service is provided by Arkivum, and access is managed by the Library's Research Support Team. Please contact email@example.com for more information.
Information on how to conduct data appraisal including deciding what to keep, share or discard.
Research funders increasingly set requirements for long-term research data preservation and, sometimes, sharing. Even if your research was not funded by an external body, you should still give careful consideration to the long-term value of your data.
It is, however, not practical, desirable or, in some cases, permissible to preserve and share all research data. Deciding which data to preserve, what level of access to allow (if any), and which data to dispose of can be difficult, especially where there are ethical or commercial considerations.
The following information will help you to decide what data to keep, and the extent to which it can be shared.
Deciding what research data to preserve
As a starting point, give some thought to these six questions:
- Does your funder or the University need you to keep this data and/or make it available for a certain amount of time?
- Are you required to keep this data by law, regulation, or professional standards?
- Does this data constitute the 'vital records' of a project, organisation, or consortium, and therefore need to be retained for a defined period?
- Do you have legal and intellectual property rights to keep and re-use the data? If not, can it be negotiated?
- Does sufficient documentation and descriptive information (metadata) exist to explain how to find the data or record, wherever it ends up being stored?
- If you need to pay to store the data, can you afford it?
Based on guidance from the Digital Curation Centre, NERC and Bristol University, this guide and checklist offers advice to help you evaluate the long term value of your research data.
Deciding access levels to research data
Deciding levels of access to data can be difficult if there are commercial, ethical, or sensitivity considerations. It is vital that any data which is preserved or shared aligns with the initial informed consent, contract, or agreements that were in place at the start of a project, unless subsequent permissions have been gained.
Any personal or sensitive data which is to be preserved or shared must also adhere to the legal requirements of the General Data Protection Regulation (GDPR).
It is the UWE Bristol project manager's responsibility to decide if data needs to be retained, made available for re-use, or securely disposed of. UWE Bristol's research data security guidance provides the institutional-level framework to assist with the decision-making process.
If research data is to be retained, it is possible to do so under three main levels of access, which should be discussed with the appropriate data repository:
- Metadata and data can be deposited with a data repository, and made available for re-use under licence.
- Metadata is openly available, but data can only be accessed by bona fide researchers, under certain prescribed conditions, and for particular uses under licence.
- Metadata is openly available, but the research data is not available for re-use.
Deciding what research data to discard
It is not always possible or desirable to preserve all research data. If a data management plan was created at the start of a project, it should be clear which data is to be retained (because of regulatory or funder requirements, for instance) and which should be discarded.
Reasons for discarding research data could include:
- regulatory frameworks which require the discarding of data
- data not being core to a project, or not underpinning a research output
- data being no longer useful
- making valuable data easier to find
- making effective use of available storage for data with a long-term value.
The University's Waste and Resources Team should be contacted for guidance on how to securely dispose of hard copy, electronic, and confidential research data.
Get your data ready for preservation and sharing
Once it has been established that research data is suitable for preservation, there are some actions that need to be undertaken in order to get it ready for deposit with a data repository. The Getting ready to deposit your data checklist (PDF) provides some pointers to navigate the process.
Write a data access statement
Data access statements are used in publications to describe where data directly supporting the published paper can be found, and under what circumstances they can be accessed. Statements are required by many funders, as part of their data management and open-access policies.
Some journals provide a section for a data access statement. Where this is not the case, you should still include a statement in your manuscript.
A data access statement should include:
- the name of the data repository where the data is held, and any persistent identifiers (for example a DOI) for the dataset
- any ethical or commercial reasons why the data is not openly available
- instructions on how to request data that is not openly available
- any specific terms of re-use.
It is not sufficient to suggest that interested parties contact the author for access to data.
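The elements listed above can be assembled programmatically, which may help when the same statement is reused across several outputs. The following is a minimal sketch; the function `data_access_statement`, its parameter names, and the wording it produces are illustrative assumptions rather than prescribed text, and the bracketed values are placeholders to be filled in.

```python
def data_access_statement(repository, identifier, restriction=None,
                          how_to_request=None, reuse_terms=None):
    """Assemble a data access statement from its component parts."""
    if restriction is None:
        # Openly available data: name the repository and identifier.
        parts = [f"All data supporting this paper are available from "
                 f"{repository} at {identifier}."]
    else:
        # Restricted data: give the reason and where to find out more.
        parts = [f"Because of {restriction}, supporting data are not openly "
                 f"available. Further information is held by {repository} "
                 f"at {identifier}."]
    if how_to_request:
        parts.append(f"To request access: {how_to_request}.")
    if reuse_terms:
        parts.append(f"Terms of re-use: {reuse_terms}.")
    return " ".join(parts)


# Placeholders in square brackets stand in for real values.
statement = data_access_statement(
    repository="[name of data repository]",
    identifier="[persistent identifier]",
    reuse_terms="[licence]",
)
```

Whatever wording is generated should still be checked against the requirements of your funder and the journal.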
Sample data access statements
"All data cited in this paper are available from [name of data repository, and persistent identifier]"
"All data supporting this paper are provided as supplementary information accompanying this paper."
Secondary use of data
"This paper was based on data already available from [insert location and any persistent identifier, for example a DOI]"
"This paper was based on existing data obtained under licence. Details of how the data were obtained can be found at UWE Bristol Data Repository at [insert URL]"
"Because of [ethical, commercial, other] sensitivity issues, supporting data is not openly available. Further information about the data, and conditions for access, can be found at the UWE Bristol Data Repository at [insert URL]"
"Supporting data will be available from [insert data repository and persistent identifier] after a six-month embargo period, to allow for commercialisation of findings"
"Because of confidentiality agreements, supporting data can only be made available to researchers on acceptance of a non-disclosure agreement. Details of how to request access are available from UWE Bristol Data Repository at [insert URL]."
No new data
"No new data were created during this research."
If the underlying data is held in a variety of locations, it might be appropriate to cite each dataset individually, including a persistent identifier, and direct readers to the references, for example:
"This paper is supported by multiple datasets which are available at locations cited in the reference section."