Interactive Layerscope's itch.io page
Results
Criteria | Rank | Score* | Raw Score |
Generality | #1 | 4.333 | 4.333 |
Mechanistic interpretability | #3 | 4.333 | 4.333 |
Novelty | #7 | 3.333 | 3.333 |
ML Safety | #8 | 3.000 | 3.000 |
Reproducibility | #9 | 3.667 | 3.667 |
Ranked from 3 ratings. Score is adjusted from raw score by the median number of ratings per game in the jam.
Judge feedback
Judge feedback is anonymous and shown in a random order.
- Interesting project! Sadly, I looked at this after the tool timed out. This would have been significantly improved by including the Colab! I'd encourage you to make a public version of it. I broadly buy that this tool is a cool thing to exist, and that looking at all neurons in a layer is a useful view, though obviously this would be a cooler tool with clear and compelling use cases! One question I have is about the speed and usability of the tool: my experience is that Plotly is pretty slow with massive plots. On the other hand, being able to hover and see the top neuron indices is pretty useful. I agree with the take that neurons that strongly activate sparsely are more interesting than neurons that activate everywhere. Another thing worth tracking is that some neurons activate much more than others in general, and maybe this would be improved by normalising by that? My main suggestions for improvement would be a public version, and just exploring a bunch more and seeing if you find anything interesting! One idea would be to pick some phenomena that you want to investigate and stare at neurons for that. -Neel
- This is a great example of creating a live application that can be utilized for mechanistic interpretability work and it's nice to see its use case in the report as well.
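The normalisation suggested in the first piece of feedback can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the activations for one layer are already available as a NumPy array of shape `[tokens, neurons]`, and the function names, the max-absolute-value scaling, and the 0.5 threshold are all hypothetical choices made for the example.

```python
import numpy as np

def normalize_per_neuron(acts, eps=1e-8):
    """Scale each neuron (column) by its own largest absolute activation.

    Assumption: acts has shape [tokens, neurons]. After scaling, a neuron
    that is always large no longer dominates one that is usually quiet.
    """
    scale = np.abs(acts).max(axis=0, keepdims=True)
    return acts / (scale + eps)

def firing_fraction(acts, threshold=0.5):
    """Fraction of tokens where each normalized neuron exceeds the threshold.

    A low value means the neuron fires on few tokens (sparse activation).
    """
    return (normalize_per_neuron(acts) > threshold).mean(axis=0)

def top_neurons_per_token(acts, k=3):
    """Indices of the k most active neurons for each token, strongest first."""
    order = np.argsort(acts, axis=1)   # ascending per row
    return order[:, -k:][:, ::-1]      # keep last k, then reverse

# Toy data: neuron 1 is large everywhere, neurons 0 and 2 spike on one token.
acts = np.array([
    [0.1, 5.0, 0.0],
    [0.2, 4.0, 0.0],
    [0.9, 6.0, 3.0],
])
print(firing_fraction(acts))
print(top_neurons_per_token(acts, k=2))
```

On the raw values, neuron 1 tops every token; after per-neuron scaling it is the one that fires on every token, while neurons 0 and 2 stand out as firing on only one.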
What are the full names of your participants?
Víctor Levoso, Alejandro González, Chris Lonsberry
What is your team name?
We can read the matrix (actually we can't)
What is your and your team's career stage?
Multiple
Does anyone from your team want to work towards publishing this work later?
Maybe
Where are you participating from?
Online
Comments
https://colab.research.google.com/drive/1F5SoDy1JPvZe5lf6dnGHSkB4C6vPIFcW?usp=sh...
Dunno why we didn't end up linking the Colab.
Here it is anyway in case anyone is interested.
I might fix some things and make a pull request to add it to TransformerLens later if Neel thinks that's a good idea.
Maybe modifying the original Neuroscope so it has a selector to switch between showing layers and showing a specific neuron would be a better idea, though.
Also, about speed: it's not that bad.
It still takes a moment to load in Colab, which can be annoying, but it was still usable. (And for whatever reason it was near instant when Alex ran the notebook locally.)
That said, we were trying with gpt small; maybe it's worse with bigger models.