New work w/@ziv_ravid @kchonyc @leavittron @nsaphra: We break the steepest MLM loss drop into *2* phase changes: first in internal grammatical structure, then external capabilities. Big implications for emergence, simplicity bias, and interpretability! 🧵 https://t.co/15KcwwK7nP pic.twitter.com/GRFvaQ3DnA
— Angelica Chen (@_angie_chen) September 15, 2023
Read the whole thread and check out the paper.