Training language models to follow instructions with human feedback

Long Ouyang, et al.

2022-03-17
rlhf, alignment

Abstract

This paper introduces and evaluates a method for training language models to follow instructions using human feedback, and reports empirical results that helped shape subsequent work on RLHF and alignment.