A jam submission

Interpreting Catastrophic Failure Modes in OpenAI’s Whisper

Submitted by Lawrencium103 — 15 minutes, 47 seconds before the deadline

Results

Criteria          Rank   Score*   Raw Score
Novelty           #1     3.889    3.889
ML Safety         #2     3.222    3.222
Generality        #4     3.222    3.222
Interpretability  #5     3.556    3.556
Reproducibility   #11    3.667    3.667

Ranked from 9 ratings. Score is adjusted from raw score by the median number of ratings per game in the jam.

Judge feedback

Judge feedback is anonymous.

  • Cool work! I'm pleasantly surprised that the logit lens works here, and that you can remove so many encoder + decoder layers, and interesting choice of problem. And cool use of PySvelte! My guess is that this failure comes from induction heads which notice and respond to repeated patterns, so brief hiccups turn into robust repeated sequences. Looking at which heads are most key in this behaviour would feel interesting to me. Misc point - I believe that GPT-3 can also get caught repeating the same word (probs downstream of induction heads)

Where are you participating from?
London, UK

What are the names of your team members?
Edward Rees, John Hughes, Ellena Reid

What are the email addresses of all your team members?
edward.r.rees@gmail.com

Comments

Nice! It’s great to see steps towards interpretability of multimodal models.

Link to attention score experiments with Whisper and our analysis so far when the model hallucinates (provided in report too): https://github.com/erees1/alignment-jam/blob/main/Whisper_Attention.ipynb 

Very cool to see someone pulling apart such a new and interesting NN.

Here's the link to reproduce the logit lens experiments with Whisper (we didn't have time to put it in our write up): https://github.com/McHughes288/alignment-jam/blob/main/logit_lens_whisper.ipynb
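
For readers unfamiliar with the logit lens mentioned in the judge feedback, here is a minimal sketch of the idea on toy data. It is not the code from the linked notebook: the function names, the tiny hand-written `unembed` matrix, and the `hidden_states` values are all made up for illustration, and the layer norm here has no learned scale or bias. The idea is to take the intermediate hidden state after each layer, apply the final normalisation and the unembedding matrix as if that layer were the last, and read off which token the model would predict at that depth.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalise a vector to zero mean and unit variance (no learned parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_lens(hidden_states, unembed):
    """Project each layer's hidden state straight to vocabulary space.

    hidden_states: one d_model-length vector per layer (a single position).
    unembed: vocab_size x d_model matrix, given as a list of rows.
    Returns a list of (top_token_id, probability), one entry per layer.
    """
    results = []
    for h in hidden_states:
        normed = layer_norm(h)
        logits = [sum(w * v for w, v in zip(row, normed)) for row in unembed]
        probs = softmax(logits)
        top = max(range(len(probs)), key=probs.__getitem__)
        results.append((top, probs[top]))
    return results


# Toy setup (hypothetical values): 3 layers, d_model = 4, vocabulary of 3 tokens.
unembed = [
    [1.0, 0.0, 0.0, -1.0],   # token 0
    [0.0, 1.0, -1.0, 0.0],   # token 1
    [-1.0, 0.0, 0.0, 1.0],   # token 2
]
hidden_states = [
    [0.1, 0.5, -0.5, -0.1],  # early layer: leans towards token 1
    [0.5, 0.3, -0.3, -0.5],  # middle layer: has switched to token 0
    [2.0, 0.0, 0.0, -2.0],   # final layer: confidently token 0
]

lens = logit_lens(hidden_states, unembed)
top_tokens = [tok for tok, _ in lens]

for layer, (tok, prob) in enumerate(lens):
    print(f"layer {layer}: top token {tok} (p = {prob:.3f})")
```

In a real model the hidden states would come from forward hooks on each decoder block, and the projection would use the model's own final layer norm and unembedding weights. Plotting the top token per layer then shows at what depth a prediction (or a hallucinated repetition) first appears.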