📢 New research on mechanistic architecture design and scaling laws.
— Michael Poli (@MichaelPoli6) March 28, 2024
- We perform the largest scaling laws analysis (500+ models, up to 7B) of beyond-Transformer architectures to date
- For the first time, we show that architecture performance on a set of isolated token… pic.twitter.com/khJAXnvwWA