NEW SAVANNA: Training GPT3 to be more responsive

Thursday, January 27, 2022

We've trained GPT-3 to be more aligned with what humans want: The new InstructGPT models are better at following human intent than a 100x larger model, while also improving safety and truthfulness. https://t.co/rKNpCDAMb2
— OpenAI (@OpenAI) January 27, 2022

We’ve used basically the same technique (which we call RLHF) in the past for text summarization (https://t.co/nrJjX62SsV). “All we’re doing” here is applying it to a much broader range of language tasks that people use GPT-3 for in the API
— Ryan Lowe (@ryan_t_lowe) January 27, 2022
