Can an AI catch its own mistakes?
What if language models could critique themselves and refine their answers to complex questions without human help?
Our latest work, accepted at BioASQ @ CLEF 2025, puts this to the test using cutting-edge LLMs in a high-stakes professional search context.
Authors: Samy Ateia & Udo Kruschwitz
We explore how current reasoning and non-reasoning Large Language Models (LLMs) such as Gemini-Flash 2.0, o3-mini, o4-mini and DeepSeek-R1 can generate, evaluate, and refine their own outputs to support domain-specific professional search, particularly in biomedical question answering (QA).
Our study tests an LLM self-feedback mechanism in a Retrieval-Augmented Generation (RAG) pipeline, asking:
Can LLMs effectively critique themselves?
Can this improve performance on expert tasks like those in BioASQ?
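For readers who want a concrete picture of such a self-feedback loop, here is a minimal Python sketch. It is an illustration under stated assumptions, not the pipeline used in the study: the function `answer_with_self_feedback`, the prompt wording, the `rounds` parameter, and the stand-in model `stub_llm` are all hypothetical. Any real LLM client could be wrapped as a function from prompt string to completion string and passed in instead.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def answer_with_self_feedback(
    question: str, snippets: list[str], llm: LLM, rounds: int = 1
) -> str:
    """Draft an answer from retrieved snippets, then let the same model
    critique and revise it for a fixed number of rounds (illustrative only)."""
    context = "\n".join(snippets)
    base = f"Context:\n{context}\n\nQuestion: {question}\n"

    # Step 1: generate a first draft grounded in the retrieved snippets.
    draft = llm(base + "Answer:")

    for _ in range(rounds):
        # Step 2: ask the model to critique its own draft.
        critique = llm(base + f"Draft answer: {draft}\nCritique the draft:")
        # Step 3: ask the model to revise the draft using that critique.
        draft = llm(
            base
            + f"Draft answer: {draft}\nCritique: {critique}\nRevised answer:"
        )
    return draft


def stub_llm(prompt: str) -> str:
    """Stand-in model for demonstration; returns canned text per prompt type."""
    if prompt.endswith("Critique the draft:"):
        return "Name the snippet that supports the answer."
    if prompt.endswith("Revised answer:"):
        return "yes (supported by snippet 1)"
    return "yes"


if __name__ == "__main__":
    print(answer_with_self_feedback(
        "Is gene X associated with disease Y?",  # made-up example question
        ["Snippet 1: placeholder text."],         # made-up example snippet
        stub_llm,
    ))
```

Setting `rounds=0` returns the unrevised draft, which gives a simple baseline for comparing answers with and without self-feedback.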
Findings suggest that performance varies across models and task types.
Our study informs future research aiming to understand when we should trust self-correcting AI to work on its own and when expert human input still matters most in complex professional search tasks.
Find the pre-print version of our study here: https://arxiv.org/abs/2508.05366
We are looking forward to insightful discussions at CLEF 2025 and advancing the conversation around transparency, user involvement, and AI-supported expert search. See you in Madrid!
#ProfessionalSearch
#LargeLanguageModels #LLMs #AI #NLP
#RetrievalAugmentedGeneration #RAG
#SelfFeedback #SelfCorrection #QueryExpansion
#PhDResearch #AIResearch #BiomedicalResearch
#BioASQ #CLEF2025
#ResearchSuccess #ResearchPaperAccepted
#InformationScienceRegensburg #StayInformed
Information/Contacts
CLEF 2025 is hosted by UNED University in Madrid, Spain, from September 9 to 12, 2025. Find more information on CLEF 2025, the 16th "Conference and Labs of the Evaluation Forum", here: https://clef2025.clef-initiative.eu/index.php
Find more information on the thirteenth BioASQ Workshop here: https://www.bioasq.org/workshop2025
About Samy Ateia
About Udo Kruschwitz
