NEW SAVANNA: Copy suppression head in GPT-2 [interpretability]

Friday, October 13, 2023

What’s copy suppression? Without head L10H7, GPT-2 naively follows "All's fair in love and" with " love". L10H7 detects the " love" prediction, attends to the previous " love" token, suppressing the token, resulting in the correct prediction " war". More examples: pic.twitter.com/NsIg3dSvpf
— Callum McDougall (@calsmcdougall) October 13, 2023
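
To make the described effect concrete, here is a toy sketch in Python. It is not how GPT-2 or head L10H7 is implemented: the real head works through attention inside the model, while this sketch only imitates the input-output behavior the tweet describes. The function name `suppress_copies`, the `penalty` and `top_k` parameters, and all logit values are hypothetical and chosen purely for illustration.

```python
def suppress_copies(logits, context_tokens, top_k=1, penalty=5.0):
    """Toy copy suppression (illustrative assumption, not GPT-2's mechanism).

    If a token among the top-k current predictions already appears in the
    context, subtract a fixed penalty from its logit.
    """
    # Tokens the model is currently predicting most strongly.
    top = sorted(logits, key=logits.get, reverse=True)[:top_k]
    adjusted = dict(logits)
    for tok in top:
        if tok in context_tokens:
            adjusted[tok] -= penalty
    return adjusted


# Prompt from the tweet, split into made-up tokens.
context = ["All", "'s", " fair", " in", " love", " and"]

# Invented logits standing in for the model's output without the head.
naive_logits = {" love": 4.0, " war": 3.5, " peace": 1.0}

adjusted = suppress_copies(naive_logits, context)
prediction = max(adjusted, key=adjusted.get)
print(prediction)
```

With these invented numbers, the unadjusted top prediction is " love"; after the penalty is applied to the repeated token, " war" becomes the top prediction, mirroring the example in the tweet.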
