If you (like me) have wondered what the feed-forward layers in transformer models are actually doing, this is a pretty interesting paper on that topic: https://t.co/cqs1OksVR5 pic.twitter.com/BiplVDxS3e
— Karl Higley (@karlhigley) April 30, 2022
While I'm finding it tough sledding – my fault – this is a very interesting article, well worth your time. I'm beginning to think we're going to figure out what's going on inside these engines.