ChatEval

View Conversations

Model responses for a given dataset can be viewed and compared against those of other models. First, select an evaluation dataset, then add the model you wish to compare (you can compare multiple models at once). Human and automatic evaluations can be viewed in the same way, or as a leaderboard (per metric).
