Human Label Variation as Stable Signal: Learning Annotator-Specific Explanation Behavior via Cross-Annotator Preference Optimization
Beiduo Chen, Pingjun Hong, Ziyun Zhang, Benjamin Roth, Anna Korhonen, Barbara Plank
Key claim
Large language models can learn annotator-specific reasoning.
This study explores how large language models can learn individual annotators' reasoning from their free-text explanations. The key finding is that individual patterns are weak at the level of a single annotation but become detectable once annotations are aggregated, and that the proposed method, cross-annotator preference optimization (CAPO), significantly improves the model's ability to mimic annotator-specific behavior.
In plain English
The authors show that large language models (LLMs) can learn the distinct reasoning styles of different annotators by analyzing their free-text explanations. Unlike previous methods that focused solely on the labels annotators assign, this research shows that modeling the reasoning behind those labels can lead to better model performance. The authors introduce a technique called cross-annotator preference optimization (CAPO), which helps the model mimic an individual annotator by contrasting that annotator's responses with other valid annotations. This approach improves the model's ability to generate explanations that reflect a specific annotator's preferences and also enhances the overall quality of the annotations. Builders should care because this method could lead to more accurate and context-aware AI systems that better understand human reasoning, making them more effective in real-world applications.
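As a rough illustration of the contrast described above, the sketch below builds preference pairs in which a target annotator's own label and explanation are "chosen" and another annotator's annotation of the same item is "rejected". This is one assumed reading of the summary, not the paper's confirmed procedure: the `Annotation` fields, the function name `build_cross_annotator_pairs`, and the choice to pair within the same item are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    # Hypothetical record layout; the paper's actual data schema may differ.
    item_id: str
    annotator: str
    label: str
    explanation: str


def build_cross_annotator_pairs(annotations, target_annotator):
    """Pair the target annotator's annotations with other annotators' ones.

    Assumption: "chosen" is the target annotator's own annotation and
    "rejected" is a different annotator's annotation of the same item.
    """
    by_item = defaultdict(list)
    for ann in annotations:
        by_item[ann.item_id].append(ann)

    pairs = []
    for item_id, group in by_item.items():
        own = [a for a in group if a.annotator == target_annotator]
        others = [a for a in group if a.annotator != target_annotator]
        for chosen in own:
            for rejected in others:
                pairs.append(
                    {"item_id": item_id, "chosen": chosen, "rejected": rejected}
                )
    return pairs
```

Pairs of this shape could then be passed to a preference-based training objective. Which objective CAPO uses is not stated in this summary, so none is assumed here.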
The paper introduces a new method for learning annotator-specific label-explanation behavior, extending the understanding of human label variation.
The experiments are well-supported with comparisons to baselines and human validation, demonstrating solid evidence for the claims made.
Deep reliability assessment
The methodology supports the claim that annotator-specific label-explanation behavior can be learned and modeled, but the claimed generalization to larger annotator pools and different data regimes is not backed by empirical evidence.
Reproducibility
Yes. The paper states that all code, prompts, evaluation scripts, and model-training configurations are available at https://github.com/mainlp/CAPO.
Discussion questions
- How does the model's performance change with a larger and more diverse pool of annotators?
- What are the practical implications for using CAPO in real-world annotation tasks?
- What evidence would contradict the claim that CAPO improves aggregation-aware imitation over SFT?
Key figure
Figure 1 shows the variation in label agreement and label proportions across annotators, indicating annotator-specific structure.
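The two quantities named in the figure description can be computed from raw annotations as in the minimal sketch below. It assumes a hypothetical input layout, a mapping from annotator to a mapping from item to label, and it uses plain percentage agreement over shared items, which may differ from the agreement measure used in the paper.

```python
from collections import Counter
from itertools import combinations


def label_proportions(labels_by_annotator):
    """Return, per annotator, the share of their annotations given each label."""
    proportions = {}
    for annotator, item_labels in labels_by_annotator.items():
        counts = Counter(item_labels.values())
        total = sum(counts.values())
        proportions[annotator] = {
            label: n / total for label, n in counts.items()
        }
    return proportions


def pairwise_agreement(labels_by_annotator):
    """Return, per annotator pair, the fraction of shared items labeled alike."""
    agreement = {}
    for a, b in combinations(sorted(labels_by_annotator), 2):
        shared = labels_by_annotator[a].keys() & labels_by_annotator[b].keys()
        if not shared:
            continue  # no overlapping items, so agreement is undefined
        matches = sum(
            labels_by_annotator[a][i] == labels_by_annotator[b][i]
            for i in shared
        )
        agreement[(a, b)] = matches / len(shared)
    return agreement
```

Annotators whose label proportions or agreement rates differ consistently from the rest would show the kind of annotator-specific structure the figure is said to indicate.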
