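Hello,
While on the IITP artificial intelligence dispatch program at the University of Toronto, I am also working on an industry project.
Today I am sharing a paper recommended by the person in charge of the LG Toronto Agent AI Project that I am working on.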

https://arxiv.org/abs/2401.16788
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
Despite the utility of Large Language Models (LLMs) across a wide range of tasks and scenarios, developing a method for reliably evaluating LLMs across varied contexts continues to be challenging. Modern evaluation approaches often use LLMs to assess respo
Submission history
Tue, 30 Jan 2024 07:03:32 UTC