The workshop focused on the need for everyone in the scholarly ecosystem to improve metadata on authors, author contributions, author affiliations, funding information, and references.
Participants were asked to share their thoughts on the role of AI in metadata quality. The responses showed caution alongside optimism, and both were discussed further during the whiteboard exercise.
Key points raised included the rise in co-authorship to a projected average of eight authors per publication by 2030, the need for standardized author contributions using ORCID IDs, and the challenges of AI-generated references. Scott Dineen from Optica, Parth Agrawal from Cactus, and Usha Biradar from Molecular Connections highlighted issues with AI-generated metadata and the importance of early detection. Chelsea Dinsmore from the University of Florida and Liz Agee from USDOE/OSTI discussed the need for accurate author affiliations and funder identifiers, emphasizing the role of authors in providing complete metadata. The speakers also covered the use of AI for metadata enhancement and the importance of upstream data collection. The discussion touched on inappropriate reference usage and the need for early integrity checks, and it highlighted the role of AI in improving the accuracy of citation context. It also underscored the shared responsibilities within the research ecosystem and urged seamless integration of integrity checks.
Mark Doyle (workshop moderator) acknowledged how much downstream data enhancement and enrichment is being done by funders, libraries, and others, but noted that there are no mechanisms for feeding those improvements back upstream to help improve metadata. Everyone looks to the publisher and author to do it all upfront (which is of course ideal), but a broader approach may be needed.
After brief presentations from the speakers, workshop participants were asked to identify pathways and propose solutions for fixing errors or omissions upstream, and for building infrastructure that minimizes the introduction of metadata errors going forward.
Specific calls to action were:
- Follow up with CHORUS on opportunities to continue contributing ideas and feedback on improving metadata quality.
- Investigate the RAID initiative (https://www.raid.org/) as a way to capture project-level metadata that could help streamline author/affiliation information.
- Explore the Fairlyz platform (https://fairlyz.lifetimeomics.com/about-fairlyz/), which uses AI-based quality control to improve metadata within data repositories.
There is more work to be done to ensure that the necessary data and tools are available and that processes are in place to make progress on the challenges of metadata quality, especially in the era of AI. One idea is to convene specialized working groups to flesh out these ideas and develop concrete plans for which tools, processes, and best practices can be integrated into general workflows.
Special thanks were given to the event sponsors: AIP Publishing, Springer Nature, American Physical Society, and STM. CHORUS could not do what it does without their support.
Links to the recording and presentations can be found on the event page.

















































