Prepare your data for the DataVault
Advice on preparing to use the DataVault
Things to do before you deposit data in the Edinburgh DataVault
At the outset of your project (or earliest opportunity)
- Sharing: Evaluate whether at least some of your data would be better shared on Edinburgh DataShare. If you want your data to be publicly accessible without restriction, DataShare is the more suitable service. If you simply want to hold the data back until you have published your analysis, you can use DataShare's temporary embargo feature.
- Personal data: Personal data (i.e. data which identifies human subjects), whether sensitive or not, should not be placed in the DataVault. You should evaluate whether you need to anonymise your data to completely remove all personal data, or pseudonymise your data (replacing personal data and storing participant IDs on a separate system). Contact the Research Data Support team at email@example.com for further assistance.
- Cost: Include the cost of using the DataVault in your grant application.
You or your School will be billed for your DataVault usage at the rate displayed in the "Storage Charges" table on the Research Services' "Charges" page. You must ensure in advance that funds will be available to meet these costs. At the time of writing, charges are calculated pro rata to the nearest penny; for example, a deposit of more than 100 GB but less than one terabyte is charged in proportion to the quoted per-terabyte rate. Please check the "Charges" page for the current rate. Estimate generously the amount of data you are likely to need to vault by the end of your project, so that you set aside sufficient funds. You will find it helpful to compose and maintain a Data Management Plan. N.B. your Storage Manager billing contact (e.g. your School) will be billed after you deposit the data, so you will need to make that deposit before your grant expires if the bill is to be met from your grant funds.
- Storage: Consider whether you may need to request a greater amount of storage than the default limit of 2 terabytes. You may deposit up to 2 TB of data in the DataVault without the need to request permission (although you will be billed in the usual way regardless of the size).
- Structure: Arrange your files into appropriate subsets if necessary.
You will be able to create multiple deposits in an 'archive' of the vault (except in the interim service). Each deposit can contain multiple files and can have its own DOI (created via the associated Pure Dataset record). So you'll need to identify whether you have some subsets of data which you want to be able to cite or to which you want to be able to grant access independently. You may find it easiest to group your selected files together in one folder for each deposit, but you don't need to do this, because the system will allow you to select files from more than one location for a particular deposit.
- Signposting: Make sure you have structured and labelled your data consistently, so that future users can easily navigate and understand the data. For example, a single deposit may be assigned one DOI (via Pure) and cited as a single entity. How many separate deposits / DOIs should you create, in order to make it easy for others to cite, navigate and re-use that data?
- Accessibility: Consider whether you could enhance the sustainability and/or accessibility of your data, particularly by using a standard format and/or by selecting file formats which do not depend on proprietary software. The Research Data Service provides advice on digital preservation.
- Documentation: Prepare your documentation. Consider whether you have provided sufficient documentation to allow others to re-use the data: Have you spelled out acronyms and explained the labels of your variables and values? Have you included sufficient information on the research methodology and procedures used?
- Time: Schedule plenty of time to get your data into the DataVault when the time comes. Deposits are queued over multiple days so that server resources can be managed efficiently. You therefore need to have created the Pure record and submitted the DataVault deposit (and, if you are using the Interim DataVault service, placed your data on DataStore; see below) several days before you anticipate that anyone, for example an external user, will need to access the data.
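As a rough aid to the Cost and Storage points above, the following Python sketch totals the size of a folder, checks it against the 2 TB default limit, and estimates a pro-rata charge rounded to the nearest penny. It is an illustration only, not an official calculator. The function names, the decimal definition of a terabyte (10^12 bytes), the half-up rounding and the example rate are all assumptions; take the real rate from the Research Services' "Charges" page. The sketch does not model any minimum size threshold or exemption that the Charges page may apply.

```python
from decimal import Decimal, ROUND_HALF_UP
from pathlib import Path

# Assumption: one terabyte is counted as 10**12 bytes (decimal TB).
BYTES_PER_TB = 10**12
# Default limit stated in the guidance: 2 TB without requesting permission.
DEFAULT_LIMIT_TB = 2


def folder_size_bytes(folder):
    """Return the total size in bytes of all regular files under folder."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def estimate_charge(size_bytes, rate_per_tb):
    """Return a pro-rata charge for size_bytes, rounded to the nearest penny.

    rate_per_tb is the per-terabyte rate (string or Decimal). Half-up
    rounding is an assumption about how "nearest penny" is applied.
    """
    size_tb = Decimal(size_bytes) / Decimal(BYTES_PER_TB)
    charge = size_tb * Decimal(str(rate_per_tb))
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def needs_storage_request(size_bytes):
    """Return True if size_bytes exceeds the 2 TB default limit."""
    return size_bytes > DEFAULT_LIMIT_TB * BYTES_PER_TB


if __name__ == "__main__":
    # Hypothetical placeholder rate; replace with the rate on the Charges page.
    example_rate = Decimal("100.00")
    # Hypothetical folder: the current directory stands in for your data folder.
    size = folder_size_bytes(".")
    print(f"Total size: {size} bytes")
    print(f"Estimated charge: {estimate_charge(size, example_rate)}")
    print(f"Storage request needed: {needs_storage_request(size)}")
```

Run the script against the folder you expect to vault, then add a generous margin for data you have not yet collected before entering a figure in your grant application.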
When you are ready to make a deposit
- Consider whether your data are at the appropriate stage for depositing in the DataVault: If your data may need to be accessed again, but are unlikely to change because the analysis is complete, and therefore you no longer need to store them in your active data store, then it is time to archive the data in Edinburgh DataShare, the DataVault or another appropriate data repository. The "RDS Flowchart" on the "About the Research Data Service" page may assist you in this process. N.B. you or your School will be billed after you deposit the data, in accordance with the Research Services' "Charges" page; therefore you'll need to make that deposit before your grant expires unless it is below the threshold indicated on the Charges page.
- Create a Pure record describing your deposit, listing all the data creators and funders, and the contact person and any Data Manager as appropriate, and linking it to a Project and any associated publications. To maximise discoverability, add keywords and make sure your description is both detailed and clear. You will be required to specify the ID of this Pure record when you make your DataVault deposit.
- Check whether you have access to the DataVault. You should be able to log in with EASE. By default, PIs (Principal Investigators) or Co-PIs will ordinarily be recognised as a 'Data Owner' by the system. In the case of the Interim DataVault service, you will need:
  - Storage Admin rights over a DataStore Storage Area, which will be used to vault and retrieve your data. To check this, go to the Storage Manager, log in with EASE and look under the heading "My Storage Areas": these are the areas over which you have Storage Admin rights. If you don't yet have Storage Admin rights for any area, please ask your PI or your School Data Manager about this. N.B. you will need to place the files you wish to vault in a folder on DataStore within an area for which you have Storage Admin rights, in order to be able to vault and retrieve your data.
  - DataVault depositor status. Go to the Storage Manager and look for the "Create DataVault Deposit" button under the "Actions" heading alongside the Storage Area in question. The button will only appear if you have been granted access.
- You are ready. Go ahead and deposit your data! If you are unable to log in or to complete the actions you need to, please contact your School Data Officer (they will be the 'Data Manager' for your archive) or the Information Services Helpline. Do not delete the original copies of your data until you receive confirmation that the deposit has been made *and* backed up by DataVault.
- If you require any training or support in relation to preparing your data or using the DataVault, please contact the Research Data Service.