
Metadata Gone Wrong – What to Fix and Why It Matters Workshop Update

During the Metadata Gone Wrong – What to Fix and Why It Matters Workshop, held on 24 April 2025, speakers discussed recurring problems with scholarly metadata.

The biggest takeaway was the need to identify pathways and introduce solutions to fix errors or omissions upstream and build infrastructure to minimize the introduction of metadata errors going forward.

Discover our four expert panelists’ talks about metadata capture. Links to the individual presentations and videos can be found below and on the event page.

Howard Ratner, CHORUS Executive Director, opened the discussion with the areas of metadata that CHORUS monitors and has found to be problematic. Metadata such as funder IDs and names, license metadata (start dates, URLs), ORCID IDs, author affiliations, and grant IDs are often incorrect, incomplete, or missing. When metadata is mismatched, missing, or incorrect, assumptions may be made. Howard also noted additional metadata elements that need attention:

  • ROR IDs are not widely implemented for affiliations and even less so for funders
  • Publisher attributions are often not updated on transferred publications
    (for example, the publisher name should be updated, not just the member ID)
  • Publication dates are missing for many records. Is that really possible?

Learn more about CHORUS at www.chorusaccess.org or contact Howard at hratner@chorusaccess.org.

Yvonne Campfens of OA Switchboard (www.oaswitchboard.org) emphasized that it is essential to provide accurate attribution not only to the authors but also to the institutions and funders, so that content is discoverable. Doing so downstream (after publication) wastes time and money; think of it as a “hit and miss” strategy.

OA Switchboard Workflow

Relying on automated matching applications that use partial metadata, or trying to fix metadata downstream without otherwise improving data quality, increases the risk of false positives, which lead to inaccuracies, and false negatives, which lead to incompleteness. If you want to report to research funders and have insight into what you are publishing, you cannot accept inaccuracy or incompleteness. If you look at the state of research from the publisher’s point of view, the real solution is fixing it upstream.

Contact Yvonne at yvonne.campfens@oaswitchboard.org for more information about the work OA Switchboard is doing with metadata. 

Presentation
Video

Ted Habermann, Co-founder and CEO of Metadata Game Changers (metadatagamechangers.com), highlighted opportunities for improvement and shared report results from Crossref (www.crossref.org/06members/53status.html).

Ted also mentioned that Crossref has reported that it is updating millions of records every month, so there is a lot of activity and plenty of opportunity for improvement. Only 10% of the elements in the DataCite schema are mandatory, because the schema was originally designed for citations; the remaining 90% support a large number of other important use cases. When thinking about FAIR data, it’s important to consider the use cases of interoperability, reusability, and accessibility.

For organizations that manage repositories, Ted suggests implementing continuous improvement strategies and, during each iteration, finding ways to measure and identify areas of improvement.

Metadata Game Changers

You can find more from Ted and the analysis he has performed on his blog at metadatagamechangers.com/blog or contact Ted at ted@metadatagamechangers.com

Presentation
Video

John Chodacki, UC Curation Center Director, California Digital Library, introduced COMET (cometadata.org), a community-led initiative to establish a curation model that enables a wide range of stakeholders to collectively enrich PID metadata. Bringing stakeholders together to develop and produce high-quality metadata requires a commitment to invest in system and process improvements. “We can change the current situation by building a shared system to collect, validate, and include the community’s enrichment work in PID metadata,” said John.

COMET Workflow

Discover more about COMET at cometadata.org or contact the COMET team at info.cometadata@gmail.com

Presentation
Video

Mitchell Bakos, Director of Product Development and Solutions Architecture, American Chemical Society, and member of the STM Article 4 Task & Finish Group (stm-assoc.org/what-we-do/strategic-areas/standards-technology/article-4-tfg/), spoke about machine-readable solutions that can be self-protected at the item (metadata) level. Soft binding is a method of linking metadata and rights declarations to digital content without embedding them directly into the content file. Instead, metadata and rights information are stored separately and cryptographically linked to the content. Registries that promote soft binding provide simple and practical web protocols to ease the discovery of licensing policies.
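
To make the idea of soft binding more concrete, here is a minimal sketch in Python. It is not the STM group's design or any registry's actual protocol. It assumes an in-memory dictionary standing in for a registry, a plain SHA-256 hash as the link between content and its rights declaration, and hypothetical function names (register_rights, lookup_rights). Real soft-binding schemes may use fingerprints or watermarks that survive reformatting, which an exact hash does not.

```python
import hashlib
from typing import Dict, Optional

# Hypothetical in-memory registry: content fingerprint -> rights declaration.
# A real registry would be a web service; this dict is only a stand-in.
_registry: Dict[str, Dict[str, str]] = {}


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest used here as the binding key."""
    return hashlib.sha256(content).hexdigest()


def register_rights(content: bytes, declaration: Dict[str, str]) -> str:
    """Store a rights declaration separately from the content, keyed by its hash."""
    key = fingerprint(content)
    _registry[key] = dict(declaration)  # copy so later caller edits do not leak in
    return key


def lookup_rights(content: bytes) -> Optional[Dict[str, str]]:
    """Recompute the hash of the content and look up its declaration, if any."""
    return _registry.get(fingerprint(content))
```

The content itself is never modified: anyone holding a copy can recompute the hash and ask the registry for the matching declaration. With an exact hash, even a one-byte change to the content produces a different key, so this sketch only covers byte-identical copies.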

STM Article 4 WG Image

Discover more about what the STM Article 4 Task & Finish Group is doing at stm-assoc.org/what-we-do/strategic-areas/standards-technology/article-4-tfg/ or contact Mitchell at M_Bakos@acs.org

Presentation
Video

The workshop then moved into an engaging discussion with the audience to dive deeper into how scholarly metadata could be improved to help its stakeholders.

The attendees discussed what risks there were to having poor scholarly metadata: 

  • a user might find something but not understand the meaning or purpose of the object 
  • legal issues 
  • survival 
  • common sense 
  • hygiene 
  • and misidentification of researchers, funders, institutions, etc.

The attendees discussed what potential incentives there might be for metadata improvement: 

  • less labor/lower costs 
  • managing editorial integrity 
  • better reporting to stakeholders 
  • better internal information management 
  • improved directions on what internal metadata should/could be made public
  • and solving the use case of: “How can I find all outputs that have been created by [researcher] / [institution] / [funder] / [lab]?”

The workshop focus then narrowed to which specific metadata need help:

  • authors, author/contributor affiliations
  • funding information
  • licensing information
  • references
  • source/venue
  • publication dates 
  • language 
  • work / resource type
  • and titles.

The session ended with some ideas about metadata quality metrics / scores and sharing metadata quality success stories, so that stakeholders could learn from each other. 
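
As one illustration of what a simple metadata quality score could look like, the sketch below computes a completeness score over the metadata areas listed above. It is not a metric proposed at the workshop. The field names, the flat dictionary record layout, and the equal weighting of fields are all assumptions made for the example, and the score measures only whether a field is filled in, not whether its value is correct.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical field names, loosely following the areas attendees listed.
CHECKED_FIELDS = (
    "title", "authors", "affiliations", "funding", "license",
    "references", "venue", "publication_date", "language", "resource_type",
)


def is_present(value: Any) -> bool:
    """Treat None, blank strings, and empty collections as missing."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) > 0
    return True


def completeness(record: Dict[str, Any]) -> Tuple[float, List[str]]:
    """Return (share of checked fields that are filled in, names of missing fields)."""
    missing = [name for name in CHECKED_FIELDS if not is_present(record.get(name))]
    score = (len(CHECKED_FIELDS) - len(missing)) / len(CHECKED_FIELDS)
    return score, missing
```

Returning the list of missing fields alongside the score keeps the number actionable: a repository could track the score over time and use the list to decide what to fix first.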

Q&A/Panel Discussion

Next Steps

CHORUS is planning a follow-up workshop to outline actionable strategies for advocating, implementing, and sustaining metadata improvements with global metadata standards in mind. The plan is to have publishers and industry leaders outline changes they have made, or are planning to make, to resolve metadata quality issues and ensure the completeness and accuracy of metadata. Join our newsletter to be notified of future events.

Links to the speaker presentations and recordings can also be found on the event page.
