A jam submission

This Is Fine(-tuning): A benchmark testing LLMs robustness against bad fine-tuning data

Submitted by JanWehner — 18 minutes, 6 seconds before the deadline

Results

Criteria         Rank  Score*  Raw Score
Safety           #1    3.667   3.667
Benchmark        #2    3.333   3.333
Generality       #4    2.333   2.333
Novelty          #4    2.667   2.667
Reproducibility  #5    2.667   2.667

Ranked from 3 ratings. Score is adjusted from raw score by the median number of ratings per game in the jam.

Judge feedback

Judge feedback is anonymous.

  • An interesting approach to measuring adversarial data impacts! It’s probably hard to generalize this without creating a new benchmark per task but thinking more about the general direction of performance falloff is very encouraged.

Where did you participate?
Delft

What are the full names of the participants?
Jan Wehner, Joep Storm, Tijmen van Graft, Jaouad Hidayat

What is your team name?
Alignment Avengers
