Towards Data-centric Interpretability with Sparse Autoencoders
Nick Jiang*, Lily Sun*, Lewis Smith, Neel Nanda
Vision Transformers Don’t Need Trained Registers
Nick Jiang*, Amil Dravid*, Alexei Efros, Yossi Gandelsman
preprint
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang*, Anish Kachinthaya*, Suzie Petryk, Yossi Gandelsman
ICLR 2025