Fifteen entries were submitted between 2023-01-20 16:00:00 and 2023-01-23 03:15:00. A total of 52 ratings were given to all 15 entries (100.0%) between 2023-01-23 03:15:00 and 2023-01-25 14:00:00. The average number of ratings per entry was 3.5.
Criteria | Rank | Score* | Raw Score |
ML Safety | #1 | 3.750 | 3.750 |
Novelty | #1 | 4.500 | 4.500 |
Generality | #5 | 3.000 | 3.000 |
Mechanistic interpretability | #5 | 4.250 | 4.250 |
Reproducibility | #5 | 4.000 | 4.000 |
Criteria | Rank | Score* | Raw Score |
ML Safety | #2 | 3.674 | 4.500 |
Reproducibility | #13 | 2.041 | 2.500 |
Generality | #14 | 1.633 | 2.000 |
Mechanistic interpretability | #14 | 1.225 | 1.500 |
Novelty | #14 | 2.041 | 2.500 |
Criteria | Rank | Score* | Raw Score |
Novelty | #1 | 4.500 | 4.500 |
Generality | #2 | 4.000 | 4.000 |
ML Safety | #3 | 3.500 | 3.500 |
Mechanistic interpretability | #7 | 4.000 | 4.000 |
Reproducibility | #10 | 3.500 | 3.500 |
Criteria | Rank | Score* | Raw Score |
Mechanistic interpretability | #1 | 4.571 | 4.571 |
Judge's choice | #2 | n/a | n/a |
Generality | #4 | 3.286 | 3.286 |
ML Safety | #4 | 3.429 | 3.429 |
Reproducibility | #4 | 4.143 | 4.143 |
Novelty | #8 | 3.000 | 3.000 |
Criteria | Rank | Score* | Raw Score |
ML Safety | #5 | 3.333 | 3.333 |
Generality | #5 | 3.000 | 3.000 |
Novelty | #11 | 2.667 | 2.667 |
Mechanistic interpretability | #11 | 3.333 | 3.333 |
Reproducibility | #14 | 2.000 | 2.000 |
Criteria | Rank | Score* | Raw Score |
Judge's choice | #4 | n/a | n/a |
Novelty | #4 | 3.750 | 3.750 |
Reproducibility | #5 | 4.000 | 4.000 |
Generality | #5 | 3.000 | 3.000 |
ML Safety | #6 | 3.250 | 3.250 |
Mechanistic interpretability | #9 | 3.750 | 3.750 |
Criteria | Rank | Score* | Raw Score |
Reproducibility | #1 | 4.400 | 4.400 |
Generality | #3 | 3.400 | 3.400 |
ML Safety | #7 | 3.200 | 3.200 |
Mechanistic interpretability | #8 | 3.800 | 3.800 |
Novelty | #12 | 2.600 | 2.600 |
Criteria | Rank | Score* | Raw Score |
Generality | #1 | 4.333 | 4.333 |
Mechanistic interpretability | #3 | 4.333 | 4.333 |
Novelty | #7 | 3.333 | 3.333 |
ML Safety | #8 | 3.000 | 3.000 |
Reproducibility | #9 | 3.667 | 3.667 |
Criteria | Rank | Score* | Raw Score |
Novelty | #5 | 3.674 | 4.500 |
Reproducibility | #8 | 3.674 | 4.500 |
Generality | #9 | 2.858 | 3.500 |
ML Safety | #9 | 2.858 | 3.500 |
Mechanistic interpretability | #10 | 3.674 | 4.500 |
Criteria | Rank | Score* | Raw Score |
ML Safety | #9 | 2.858 | 3.500 |
Reproducibility | #11 | 2.858 | 3.500 |
Generality | #13 | 2.449 | 3.000 |
Mechanistic interpretability | #13 | 2.449 | 3.000 |
Novelty | #13 | 2.449 | 3.000 |
Criteria | Rank | Score* | Raw Score |
Judge's choice | #1 | n/a | n/a |
Reproducibility | #1 | 4.400 | 4.400 |
Mechanistic interpretability | #2 | 4.400 | 4.400 |
Novelty | #3 | 4.200 | 4.200 |
Generality | #11 | 2.800 | 2.800 |
ML Safety | #11 | 2.800 | 2.800 |
Criteria | Rank | Score* | Raw Score |
Judge's choice | #3 | n/a | n/a |
Reproducibility | #3 | 4.250 | 4.250 |
Mechanistic interpretability | #5 | 4.250 | 4.250 |
Generality | #5 | 3.000 | 3.000 |
Novelty | #8 | 3.000 | 3.000 |
ML Safety | #12 | 2.750 | 2.750 |
Criteria | Rank | Score* | Raw Score |
Mechanistic interpretability | #3 | 4.333 | 4.333 |
Reproducibility | #5 | 4.000 | 4.000 |
Novelty | #6 | 3.667 | 3.667 |
Generality | #12 | 2.667 | 2.667 |
ML Safety | #13 | 2.667 | 2.667 |
Criteria | Rank | Score* | Raw Score |
Reproducibility | #12 | 2.449 | 3.000 |
Generality | #14 | 1.633 | 2.000 |
ML Safety | #14 | 1.633 | 2.000 |
Mechanistic interpretability | #15 | 0.816 | 1.000 |
Novelty | #15 | 0.816 | 1.000 |
Criteria | Rank | Score* | Raw Score |
Generality | #9 | 2.858 | 3.500 |
Novelty | #10 | 2.858 | 3.500 |
Mechanistic interpretability | #12 | 2.858 | 3.500 |
ML Safety | #14 | 1.633 | 2.000 |
Reproducibility | #15 | 1.633 | 2.000 |
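
In several of the tables above the Score* column is lower than the Raw Score column, always by the same factor of about 0.8165 (for example 4.500 becomes 3.674). This is consistent with the adjustment that itch.io describes for jam rankings, where a raw score is scaled by the square root of the ratio between the ratings an entry received and the jam's median ratings per entry, capped at 1. The sketch below assumes that formula applies here. The values of 2 ratings received and a median of 3 are assumptions inferred from the ratio; they are not stated in the results themselves, and the function and parameter names are hypothetical.

```python
import math


def adjusted_score(raw_score: float, ratings_received: int, median_ratings: int) -> float:
    """Scale a raw score down when an entry has fewer ratings than the jam median.

    Assumed formula: raw * sqrt(min(median, received) / median).
    Entries at or above the median number of ratings keep their raw score.
    """
    factor = math.sqrt(min(median_ratings, ratings_received) / median_ratings)
    return raw_score * factor


# Assumed inputs: 2 ratings received, jam median of 3 ratings per entry.
for raw in (4.5, 3.5, 2.5, 1.0):
    print(raw, "->", round(adjusted_score(raw, 2, 3), 3))

# An entry with 4 ratings (at or above the assumed median) is left unchanged.
print(3.75, "->", round(adjusted_score(3.75, 4, 3), 3))
```

Under these assumptions the function reproduces the adjusted values listed above, such as 4.500 to 3.674 and 1.000 to 0.816, while entries whose two score columns already match are left as they are.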