2y ago

Mapping the Mind of a Large Language Model

www.anthropic.com

We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade larg...

I often see a lot of people with outdated understanding of modern LLMs.

This is probably the best interpretability research to date, by the leading interpretability research team.

It's worth a read if you want a peek behind the curtain on modern models.