LLMs are reshaping how we evaluate ranked lists and conversations, and even how we simulate users. At #LLM4Eval @ #SIGIR2025, we’re spotlighting 3 special themes: 🧠 Beyond query-doc 🗣️ Simulating users 🧪 Synthesizing evaluation data 👇 Curious? Read on & submit by April 23! #LLMs #IR
🧵 Theme 1: 🧠 Evaluation beyond query-doc: How can LLMs evaluate full sessions, ranked lists, or multi-turn conversations? We welcome work that goes beyond traditional qrels and explores richer evaluation units.