Assessing the Proficiency of LLMs with Various Tasks and Evaluators

Kim, Tong Min; Lee, Youngrong; Kim, Chansik; Ko, Taehoon

doi:10.3233/SHTI240473

Pages

552 - 553

DOI

10.3233/SHTI240473

Category

Research Article

Series

Studies in Health Technology and Informatics

Ebook

Volume 316: Digital Health and Informatics Innovations for Sustainable Health Care Systems

Abstract

Previous studies have typically given Large Language Models (LLMs) only one or two tasks and relied on a small number of evaluators from a single domain to judge the LLMs' answers. We assessed the proficiency of four LLMs on eight tasks, with 17 evaluators from diverse domains rating the 32 results, demonstrating the importance of using varied tasks and evaluators when assessing LLMs.
