What’s copy suppression? Without head L10H7, GPT-2 naively follows "All's fair in love and" with " love". L10H7 detects the " love" prediction, attends to the previous " love" token, suppressing the token, resulting in the correct prediction " war". More examples: pic.twitter.com/NsIg3dSvpf
— Callum McDougall (@calsmcdougall) October 13, 2023
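
As a rough illustration of the idea in the tweet, here is a toy sketch in Python. It is not how L10H7 is implemented inside GPT-2: the function name `copy_suppress`, the `strength` parameter, and the logit values are all hypothetical, and a real attention head works on vectors in the residual stream rather than on a dictionary of tokens. The sketch only shows the logic described above: find the token the model is about to predict, check whether it already appears in the context, and if so push its logit down.

```python
def copy_suppress(logits, context_tokens, strength=3.0, top_k=1):
    """Toy copy suppression (illustrative only, not the real GPT-2 head).

    logits: dict mapping candidate next tokens to made-up scores.
    context_tokens: tokens that have already appeared in the prompt.
    strength: hypothetical amount subtracted from a suppressed logit.
    """
    # Tokens the "model" is currently most confident about.
    top = sorted(logits, key=logits.get, reverse=True)[:top_k]
    adjusted = dict(logits)
    for tok in top:
        # Suppress a confident prediction only if it copies an earlier token.
        if tok in context_tokens:
            adjusted[tok] -= strength
    return adjusted


context = ["All", "'s", " fair", " in", " love", " and"]
# Made-up numbers standing in for the prediction without the suppression head.
naive_logits = {" love": 5.0, " war": 4.0, " peace": 2.0}

naive_prediction = max(naive_logits, key=naive_logits.get)
adjusted = copy_suppress(naive_logits, context)
prediction = max(adjusted, key=adjusted.get)

print("without suppression:", repr(naive_prediction))
print("with suppression:   ", repr(prediction))
```

With these invented numbers, the naive top token is " love"; after the suppression step its score drops below that of " war", mirroring the example in the tweet.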
Friday, October 13, 2023
Copy suppression head in GPT-2 [interpretability]