
Research Data Management

Data Privacy & Confidentiality

 

To improve research reproducibility, a growing number of health science funders and publishers ask researchers to share the de-identified data underlying their research. One example is the updated NIH Data Management and Sharing Policy, which expects researchers to "maximize the appropriate sharing of scientific data." Although investigators are expected to maximize data sharing, the policy recognizes that they must balance that expectation against privacy considerations, and that there may be justifiable limitations to sharing scientific data. Plan for data sharing early so that you can meet funder requirements while protecting participant privacy. The UC Davis IRB determines whether there are adequate provisions to protect the privacy of subjects and to maintain the confidentiality of identifiable data at each stage of the research, from recruitment through maintenance of the data.

Informed Consent

Informed consent is one of the founding principles of research ethics. Its intent is that human participants enter research voluntarily, with full information about what taking part means for them, and that they give consent before the research begins. It’s best to plan for sharing your data from the very start of your research project, including how you will obtain informed consent for storing and sharing research data for future use.

"Consumers viewed consent as the most important privacy protection. The central role of consent may reflect the value placed by consumers on preserving autonomy and the ability to choose whether and how their personal data are used."

Gupta R, Iyengar R, Sharma M, et al. Consumer Views on Privacy Protections and Sharing of Personal Digital Health Information. JAMA Netw Open. 2023;6(3):e231305. doi:10.1001/jamanetworkopen.2023.1305

De-Identification

In general, scientific data derived from human research participants, including qualitative data, should be adequately de-identified before sharing to protect research participants, maintain privacy, and mitigate risk, especially for vulnerable or marginalized groups.
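As a rough illustration of one early step in de-identification, the Python sketch below drops direct identifier fields and replaces each participant ID with a random study code. The field names (`name`, `email`, `phone`, `address`, `participant_id`) and the `deidentify` helper are hypothetical and not taken from any particular dataset or policy. Removing direct identifiers alone does not guarantee that data are adequately de-identified; indirect identifiers and free text usually need separate review.

```python
import secrets

# Hypothetical field names; adjust to match the dataset at hand.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def deidentify(records, id_field="participant_id"):
    """Return (shareable_records, key).

    shareable_records: copies of the input rows without direct identifiers
    and with the original ID replaced by a random study code.
    key: mapping from original ID to study code.
    """
    key = {}
    used_codes = set()
    shareable = []
    for record in records:
        original_id = record[id_field]
        if original_id not in key:
            # Random codes carry no information about the participant.
            code = "P-" + secrets.token_hex(4)
            while code in used_codes:  # avoid the rare duplicate code
                code = "P-" + secrets.token_hex(4)
            used_codes.add(code)
            key[original_id] = code
        cleaned = {
            field: value
            for field, value in record.items()
            if field not in DIRECT_IDENTIFIERS and field != id_field
        }
        cleaned["study_code"] = key[original_id]
        shareable.append(cleaned)
    return shareable, key
```

The returned key links study codes back to participants, so it would need to be stored separately from the shared data under restricted access, or destroyed if re-linking is not required.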

Controlled Access

Certain studies (e.g., qualitative or mixed-methods projects) may generate scientific data that are challenging to de-identify, or that still pose privacy risks after de-identification because they contain information that allows inferences about a research participant’s identity. Examples of data that may need special protections to ensure participant privacy include imaging data, rich clinical/phenotypic data, transcripts from focus groups or in-depth interviews, ethnographic observations, audio recordings of deliberative community-based engagements, and social media posts. In these instances, a repository with a controlled-access mechanism may be the most appropriate option for protecting study participants.

Data Collection

  • Limit information collected to only what is necessary to address the research question(s)
  • Avoid collecting superfluous identifiable information (including electronic identifiers) unless it is necessary for your research
    • For example, Qualtrics collects IP addresses by default, and some IRBs and international standards consider IP addresses to be personally identifiable information. Settings can be changed to "Anonymize Response" to prevent Qualtrics from collecting IP addresses.
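
If identifiers such as IP addresses were already collected before the setting was changed, they can be stripped from an exported file before the data are stored or shared. The Python sketch below is one possible approach using only the standard library. The column names in `IDENTIFIER_COLUMNS` are assumptions about what a survey export might contain and should be checked against the header row of the actual file; `strip_identifier_columns` is a hypothetical helper, not part of any survey platform.

```python
import csv
import io

# Assumed column names; verify against the header row of the real export.
IDENTIFIER_COLUMNS = {
    "IPAddress",
    "LocationLatitude",
    "LocationLongitude",
    "RecipientEmail",
}


def strip_identifier_columns(src, dst, drop=IDENTIFIER_COLUMNS):
    """Copy CSV rows from src to dst, omitting the columns named in drop.

    src and dst are text file objects. Returns the list of columns kept.
    """
    reader = csv.DictReader(src)
    kept = [name for name in (reader.fieldnames or []) if name not in drop]
    # extrasaction="ignore" silently skips the dropped columns in each row.
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return kept


# Small in-memory example; 192.0.2.10 is a reserved documentation address.
example_in = io.StringIO("ResponseId,IPAddress,Q1\nR_1,192.0.2.10,Yes\n")
example_out = io.StringIO()
kept_columns = strip_identifier_columns(example_in, example_out)
```

With real files, the same function could be called on objects opened with `open(path, newline="")`. Not collecting the identifiers in the first place remains the safer option.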