We've used reinforcement learning from human feedback to train language models for summarization. The resulting models produce better summaries than 10x larger models trained only with supervised learning: https://t.co/Sk31d1CnTu— OpenAI (@OpenAI) September 4, 2020
No comments:
Post a Comment