Research Data Management

Documentation & Metadata

Metadata

Metadata is information about a dataset. It is typically created to help potential users understand how the data were created and other important factors that cannot be determined by looking at the data alone. Various organizations have created metadata standards that guide data creators in providing key metadata and standardize how metadata is written within a given field of research. For example, if you are working with sequencing data, in many cases you will be required to submit the data to the Sequence Read Archive. Providing metadata and other documentation about your data allows users to better understand, access, and reuse the data.

When to Record Metadata

Many fields are developing standards for what metadata to collect across different data types. Whenever possible, consult community standards before you begin collecting research data. It is easiest and most efficient to record metadata during the research process, while the data are still in active use. Doing so also helps ensure that the metadata record is complete and accurate.

How to Record Metadata

Metadata and additional documentation should be recorded throughout the research process to ensure accuracy and comprehensiveness. They should be stored and published alongside the research data in an accessible and reliable format appropriate for the type of study or data. Your metadata will likely come from several sources during your research:

  • Technical Metadata: Generated by the research instruments and software used.
  • Additional Metadata: Most of the remaining metadata will be recorded manually. Consider using an existing schema or template to make the process standardized and easier.
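
As a rough illustration of combining the two sources above, the following Python sketch merges instrument-generated values with manually entered fields into a single record that can be saved next to the data. The field names (title, creator, description) and the example values are assumptions made for illustration only; they do not come from any particular metadata standard, so substitute the fields your discipline's schema requires.

```python
import json
from datetime import datetime, timezone

# Hypothetical minimal set of manual fields; a real schema defines its own.
REQUIRED_MANUAL_FIELDS = {"title", "creator", "description"}


def build_metadata_record(technical: dict, manual: dict) -> dict:
    """Merge instrument-generated and manually entered metadata into one record."""
    missing = REQUIRED_MANUAL_FIELDS - manual.keys()
    if missing:
        raise ValueError(f"missing manual metadata fields: {sorted(missing)}")
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "technical": dict(technical),   # e.g. exported by an instrument or software
        "descriptive": dict(manual),    # e.g. typed in by the researcher
    }


record = build_metadata_record(
    technical={"instrument": "example-spectrometer", "software_version": "1.0"},
    manual={
        "title": "Example dataset",
        "creator": "A. Researcher",
        "description": "Illustrative record only.",
    },
)

# Serialize to JSON so the record can be stored alongside the data files.
metadata_json = json.dumps(record, indent=2, sort_keys=True)
print(metadata_json)
```

Checking for required fields at the moment of recording is one way to catch gaps while the details are still fresh, rather than at publication time.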

Metadata Standards & Guidance

Different fields and repositories have different metadata standards and schemas. Explore these and, where possible, use one or more established metadata standards or schemas that are widely used within your discipline. If you are storing your data in a repository, you must also comply with its metadata requirements. To find a metadata schema for your discipline or data type, consult a metadata directory.

Log

A log is a document that records the actions taken to either collect data or analyze a dataset with specific software.
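
For analyses run as scripts, one possible way to keep such a log is to have the script write a timestamped line for each action. The sketch below uses Python's standard logging module; the logger name, the toy values, and the in-memory stream (standing in for a log file) are assumptions chosen to keep the example self-contained.

```python
import io
import logging

# An in-memory stream stands in for a log file in this sketch;
# logging.FileHandler("analysis.log") would write to disk instead.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("analysis_log")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep entries out of the root logger's output

# Each analysis step records what was done to the data.
values = [4.0, 5.5, None, 7.0]
logger.info("Loaded %d raw values", len(values))

cleaned = [v for v in values if v is not None]
logger.info("Dropped %d missing values", len(values) - len(cleaned))

mean = sum(cleaned) / len(cleaned)
logger.info("Computed mean = %.2f", mean)

log_text = log_stream.getvalue()
print(log_text)
```

Software with a graphical interface often has its own log or history export, in which case saving that output alongside the data serves the same purpose.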

Codebook

A codebook is a document that lists each code used in a research project and the meaning assigned to it. The Inter-University Consortium for Political and Social Research (ICPSR) has developed a comprehensive Guide to Codebooks.
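
To make the idea concrete, a codebook can be thought of as a lookup table from coded values to their meanings. The Python sketch below assumes a single hypothetical survey variable, employment_status, with made-up codes; it is not drawn from the ICPSR guide or from any real study.

```python
# Hypothetical codebook: one variable, its label, and the meaning of each code.
CODEBOOK = {
    "employment_status": {
        "label": "Respondent's employment status",
        "codes": {1: "Employed", 2: "Unemployed", 9: "No response"},
    },
}


def decode(variable: str, value: int) -> str:
    """Return the documented meaning of a coded value."""
    codes = CODEBOOK[variable]["codes"]
    if value not in codes:
        raise KeyError(f"code {value!r} is not documented for {variable!r}")
    return codes[value]


responses = [1, 2, 9, 1]  # coded values as they might appear in a data file
decoded = [decode("employment_status", v) for v in responses]
print(decoded)
```

Raising an error for an undocumented code, rather than passing it through silently, surfaces values the codebook does not yet explain.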

readme File

A readme file describes the files in a collection, gives more information about a given file, or describes a piece of software or an analysis script. A good description of the elements of a readme file and a downloadable template can be found in Cornell University’s Guide to Writing “readme”-style Metadata.
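
As a minimal sketch of the first use described above, the following Python function assembles a plain-text readme that lists each file in a collection with a one-line description. The layout, the function name build_readme, and the example file names are assumptions for illustration; this is not the Cornell template, which covers more elements.

```python
def build_readme(title: str, contact: str, files: dict) -> str:
    """Assemble a plain-text readme listing each file and what it contains."""
    lines = [
        title,
        "=" * len(title),
        "",
        f"Contact: {contact}",
        "",
        "Files in this collection:",
    ]
    # Sort by file name so the listing is stable between runs.
    for name, description in sorted(files.items()):
        lines.append(f"  {name}: {description}")
    return "\n".join(lines) + "\n"


readme_text = build_readme(
    title="Example Dataset",
    contact="A. Researcher",
    files={
        "results.csv": "Cleaned measurements, one row per sample.",
        "analysis.py": "Script that produces the summary statistics.",
    },
)
print(readme_text)
```

The returned string can be written to a file named readme.txt in the same folder as the data it describes.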