Trafo Mech Int on the web!'s itch.io page
Results
| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Reproducibility | #1 | 4.400 | 4.400 |
| Generality | #3 | 3.400 | 3.400 |
| ML Safety | #7 | 3.200 | 3.200 |
| Mechanistic interpretability | #8 | 3.800 | 3.800 |
| Novelty | #12 | 2.600 | 2.600 |
Ranked from 5 ratings. Score is adjusted from raw score by the median number of ratings per game in the jam.
Judge feedback
Judge feedback is anonymous and shown in a random order.
- I love it. Very good hackathon vibes, and a fun tool. Would be nicer if "no effect" looked white on the heat map rather than dark blue (and even better if "full effect recovered" was 1 or -1). No idea how to compare it to others re a prize, since it's not really research per se, but I'm happy it exists and hope you enjoyed making it!
- This is an excellent project and I appreciate the effort put into creating an easy-to-use and accessible web app for creating standard mechanistic interpretability plots. The fact that the code is available on GitHub for reproducibility makes it even more valuable. While the current version is limited, I believe that with some further development and hosting on a stronger server, it has the potential to become an even more powerful tool for researchers and practitioners alike. Keep up the great work! Also check out the less user-friendly http://interp-tools.redwoodresearch.org/.
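The first judge's heat map suggestion can be illustrated with a minimal sketch. It is not taken from the project's code; it assumes each cell holds a raw score that can be compared against a "clean" and a "corrupted" baseline, and the helper names `normalize_effect` and `symmetric_limits` are hypothetical. The idea is to rescale scores so that 0 means no effect and 1 means the full effect is recovered, then pick color limits that are symmetric around zero.

```python
def normalize_effect(patched, clean, corrupted):
    """Rescale a raw score so 0.0 means no effect and 1.0 means full recovery."""
    span = clean - corrupted
    if span == 0:
        raise ValueError("clean and corrupted baselines are identical")
    return (patched - corrupted) / span


def symmetric_limits(values):
    """Return (vmin, vmax) centred on zero for a 2D list of normalized scores."""
    largest = max(abs(v) for row in values for v in row)
    largest = max(largest, 1.0)  # keep full recovery (+/-1) inside the color range
    return -largest, largest


# Made-up numbers purely for illustration (assumed, not from the project).
clean_score, corrupted_score = 4.0, 0.0
raw = [[0.0, 2.0], [4.0, -1.0]]
scores = [
    [normalize_effect(v, clean_score, corrupted_score) for v in row]
    for row in raw
]
vmin, vmax = symmetric_limits(scores)
```

Passing these `vmin` and `vmax` values to a plotting call together with a diverging colormap (for example matplotlib's `imshow` with `cmap="RdBu"`) places zero at the white midpoint of the color scale, so "no effect" cells appear white.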
What are the full names of your participants?
Stefan Heimersheim, Jonathan Ng
What is your team name?
Abbreviation shills
Does anyone from your team want to work towards publishing this work later?
No