Defining research data
Research data, unlike other types of information, is collected, observed, or created for the purpose of analysis to produce original research results.
Classification of research data
Research data can be generated for different purposes and through different processes (Research Information Network classification):
- Observational: data captured in real time, usually irreplaceable. For example, sensor data, survey data, sample data, neuroimages.
- Experimental: data from lab equipment, often reproducible, but can be expensive. For example, gene sequences, chromatograms, toroid magnetic field data.
- Simulation: data generated from test models where model and metadata are more important than output data. For example, climate models, economic models.
- Derived or compiled: data that is reproducible, but expensive to reproduce. For example, text and data mining, compiled databases, 3D models.
- Reference or canonical: a (static or organic) conglomeration or collection of smaller (peer-reviewed) datasets, most probably published and curated. For example, gene sequence databanks, chemical structures, or spatial data portals.
Research data formats
Research data comes in many formats:
- Text - flat text files, Word, Portable Document Format (PDF), Rich Text Format (RTF), Extensible Markup Language (XML).
- Numerical - Statistical Package for the Social Sciences (SPSS), Stata, Excel.
- Multimedia - JPEG, TIFF, DICOM, MPEG, QuickTime.
- Models - 3D, statistical.
- Software - Java, C.
- Discipline specific - Flexible Image Transport System (FITS) in astronomy, Crystallographic Information File (CIF) in chemistry.
- Instrument specific - Olympus Confocal Microscope Data Format, Carl Zeiss Digital Microscopic Image Format (ZVI).
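As a rough illustration of how the format categories above might be applied when taking inventory of a project's files, the following Python sketch maps file extensions to categories. The extension lists and the function name `categorise` are assumptions made for this example; they are not part of the original classification and are far from exhaustive.

```python
from pathlib import Path

# Illustrative mapping only: these extensions are assumptions chosen for the
# example and do not cover every format in each category.
FORMAT_CATEGORIES = {
    "text": {".txt", ".docx", ".pdf", ".rtf", ".xml"},
    "numerical": {".sav", ".dta", ".xlsx"},  # SPSS, Stata, Excel
    "multimedia": {".jpg", ".jpeg", ".tif", ".tiff", ".dcm", ".mpg", ".mpeg", ".mov"},
    "software": {".java", ".c"},
    "discipline specific": {".fits", ".cif"},
    "instrument specific": {".zvi"},
}


def categorise(filename: str) -> str:
    """Return the format category for a file name, or "unknown"."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in FORMAT_CATEGORIES.items():
        if suffix in extensions:
            return category
    return "unknown"


if __name__ == "__main__":
    for name in ["survey.SAV", "scan.dcm", "galaxy.fits", "notes"]:
        print(name, "->", categorise(name))
```

An extension alone is only a hint: an XML file, for instance, may hold text, numerical data, or a model, so a real inventory would also record what the file contains.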
Research data (from both traditional and electronic research) may include all of the following:
- Documents (text, Word), spreadsheets
- Laboratory notebooks, field notebooks, diaries
- Questionnaires, transcripts, codebooks
- Audiotapes, videotapes
- Photographs, films
- Test responses
- Slides, artefacts, specimens, samples
- Collection of digital objects acquired and generated during the process of research
- Data files
- Database contents (video, audio, text, images)
- Models, algorithms, scripts
- Contents of an application (input, output, log files for analysis software, simulation software, schemas)
- Methodologies and workflows
- Standard operating procedures and protocols
The following research records may also be important to manage during and beyond the life of a project:
- Correspondence (electronic mail and paper-based correspondence)
- Project files
- Grant applications
- Ethics applications
- Technical reports
- Research reports
- Master lists
- Signed consent forms
This article was published on Aug 28, 2009