Documenting my learning of AI Safety.

The Residual Stream

Tags: mechanistic interpretability, residual stream, linear probes

This post discusses the residual stream in transformer models and builds toward two applications of it in mechanistic interpretability: Logit Lens (nostalgebraist, 2020) a…