Sunday, April 7, 2024

Visualizing Attention: A Transformer's Heart

This is the second of three videos from 3Blue1Brown about how transformers work. Here's the first.

Timestamps:
0:00 - Recap on embeddings
1:39 - Motivating examples
4:29 - The attention pattern
11:08 - Masking
12:42 - Context size
13:10 - Values
15:44 - Counting parameters
18:21 - Cross-attention
19:19 - Multiple heads
22:16 - The output matrix
23:19 - Going deeper
24:54 - Ending
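
If you'd like to follow along with the "attention pattern" and "masking" chapters in code, here is a minimal sketch of single-head scaled dot-product attention with a causal mask, written in plain NumPy. This is not the video's code; the shapes, weight matrices, and names (causal_attention, W_q, W_k, W_v, d_head) are illustrative choices of my own.

```python
import numpy as np

def causal_attention(x, W_q, W_k, W_v):
    """One attention head over a sequence of token embeddings.

    x:        (seq_len, d_model) token embeddings
    W_q, W_k: (d_model, d_head)  query/key projections
    W_v:      (d_model, d_head)  value projection (down-projection only, for brevity)
    """
    q = x @ W_q                     # queries, shape (seq_len, d_head)
    k = x @ W_k                     # keys,    shape (seq_len, d_head)
    v = x @ W_v                     # values,  shape (seq_len, d_head)

    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)   # the attention pattern, before softmax

    # Masking: each position may only attend to itself and earlier positions.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)

    # Softmax over each row (numerically stabilized).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    return weights @ v              # weighted sum of values, (seq_len, d_head)

# Toy usage: 5 tokens with 8-dimensional embeddings, head dimension 4.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = causal_attention(x, W_q, W_k, W_v)
print(out.shape)  # (5, 4)
```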

You might also want to look at this post, which collects three videos in which 3Blue1Brown uses Fourier series to trace out fairly elaborate line drawings.
