lewington committed on
Commit
242c1e5
1 Parent(s): ebffa7b

add references

Browse files
Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -130,4 +130,8 @@ The outcomes are plotted below. Active Feature Proportion is the proportion of f
  All layers were trained across all 257 image patches. Below we provide plots demonstrating the reconstruction MSE for each token (other than the CLS token) as training progressed. It seems that throughout training the outer tokens are easier to reconstruct than those in the middle, presumably because the central tokens capture more important information (i.e. foreground objects) and are therefore more information rich.
  
  ![](./media/layer_22_training_outputs.png)
- ![](./media/layer_22_individually_scaled.png)
+ ![](./media/layer_22_individually_scaled.png)
+ 
+ ## References
+ 
+ We draw heavily from prior Visual Sparse Autoencoder research by [Hugo Fry](https://www.lesswrong.com/posts/bCtbuWraqYTDtuARg/towards-multimodal-interpretability-learning-sparse-2) and [Gytis Daujotas](https://www.lesswrong.com/posts/iYFuZo9BMvr6GgMs5/case-study-interpreting-manipulating-and-controlling-clip). We also rely on autointerpretability research from [Anthropic Circuits Updates](https://transformer-circuits.pub/2024/august-update/index.html), and take the TopKSAE architecture and training methodology from [Scaling and Evaluating Sparse Autoencoders](https://cdn.openai.com/papers/sparse-autoencoders.pdf).
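For context on the TopKSAE architecture cited above: the core idea is to keep only the k largest encoder activations per example and zero the rest, enforcing exact sparsity without an L1 penalty. Below is a minimal numpy sketch of that forward pass under our own illustrative assumptions (function name, weight shapes, and the plain ReLU-then-TopK ordering are ours, not the repo's actual code or the paper's full training recipe).

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """One forward pass of a TopK sparse autoencoder (illustrative sketch).

    Encode the input, apply ReLU, keep only the k largest activations
    per example, and decode back to the input space.
    """
    pre = x @ W_enc + b_enc              # (batch, n_features) pre-activations
    acts = np.maximum(pre, 0.0)          # ReLU
    # Indices of everything except the k largest activations per row.
    drop = np.argpartition(acts, -k, axis=-1)[..., :-k]
    sparse = acts.copy()
    np.put_along_axis(sparse, drop, 0.0, axis=-1)  # zero the non-top-k slots
    recon = sparse @ W_dec + b_dec       # reconstruction of the input
    return sparse, recon
```

Training then minimizes the reconstruction MSE between `recon` and `x`; because at most k features are active per example, no separate sparsity loss term is needed.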