
Plucking and Re-planting ORCIDs in Data Repository Datasets: Readme Harvesting for Metadata Improvement

Research output: Contribution to conference › Poster › peer-review

Abstract

Persistent identifiers (PIDs) for research data authors are increasingly important, and their absence has been identified as a major gap in the metadata of data repositories, including the Data Repository for the University of Minnesota (DRUM). Open Researcher and Contributor IDs (ORCIDs) are already strongly recommended as unique author IDs in the scholarly community and are collected in the DRUM repository through text-based Readme file templates. An increasing number of journals and funding bodies require ORCIDs from submitters, including additional US federal agencies as of May 2025. In DRUM, ORCIDs are present in the well-structured, text-based Readme files but are not incorporated into the DSpace-based system metadata, which prevents automation and reuse of the data in processes such as minting DOIs with DataCite. This metadata improvement project seeks to harvest the ORCID information in these Readme files and transform it into a format that will enable better integration with the repository metadata structure and the global scholarly infrastructure at large.

Tools and resources used in this project include the DataCite API, the DSpace API, OpenRefine, and Python scripts to collect, analyze, and transform local metadata. The workflow will be documented so that it can be automated and repeated as part of a strategic metadata improvement project.
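As an illustration of the harvesting step described above, the following is a minimal Python sketch of how ORCID iDs might be pulled from Readme text and checked before reuse. It is not the project's actual script: the function names `harvest_orcids` and `orcid_checksum_ok` and the sample Readme layout are hypothetical. The check-digit rule (ISO 7064 MOD 11-2) follows ORCID's published identifier structure.

```python
import re

# ORCID iDs are 16 characters in four hyphenated groups; the last may be "X".
ORCID_PATTERN = re.compile(r"\b(\d{4}-\d{4}-\d{4}-\d{3}[\dX])\b")


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character with ISO 7064 MOD 11-2."""
    chars = orcid.replace("-", "")
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected


def harvest_orcids(readme_text: str) -> list[str]:
    """Return unique, checksum-valid ORCID iDs in order of first appearance."""
    found = []
    for match in ORCID_PATTERN.finditer(readme_text):
        orcid = match.group(1)
        if orcid_checksum_ok(orcid) and orcid not in found:
            found.append(orcid)
    return found


# Hypothetical Readme excerpt; real templates may label fields differently.
sample_readme = (
    "Author: Jane Doe\n"
    "ORCID: https://orcid.org/0000-0002-1825-0097\n"
    "Co-author ORCID: 0000-0002-1694-233X\n"
    "Mistyped ORCID: 0000-0002-1825-0098\n"
)
harvested = harvest_orcids(sample_readme)
```

Validating the check digit before writing values into repository metadata helps keep mistyped identifiers out of downstream records; rejected matches could be logged for manual review rather than silently dropped.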
Original language: English (US)
State: Published - Jun 16 2025
Event: Open Repositories 2025: Twenty Years of Progress, a Future of Possibilities - University of Chicago, Chicago, United States
Duration: Jun 15 2025 to Jun 18 2025
Conference number: 20
https://or2025.openrepositories.org/

Conference

Conference: Open Repositories 2025
Abbreviated title: OR2025
Country/Territory: United States
City: Chicago
Period: 6/15/25 to 6/18/25

Bibliographical note

Title of poster was changed to "Transplanting ORCIDs in Data Repository Datasets: Readme Harvesting for Metadata Improvement"

Keywords

  • data repositories
  • metadata management
  • persistent identifiers
  • ORCIDs
  • institutional repositories
