15 entries were submitted between 2023-01-20 16:00:00 and 2023-01-23 03:15:00. 52 ratings were given to 15 entries (100.0%) between 2023-01-23 03:15:00 and 2023-01-25 14:00:00. The average number of ratings per entry was 3.5 and the median was 3.

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Mechanistic interpretability | #1 | 4.571 | 4.571 |
| Judge's choice | #2 | n/a | n/a |
| Generality | #4 | 3.286 | 3.286 |
| ML Safety | #4 | 3.429 | 3.429 |
| Reproducibility | #4 | 4.143 | 4.143 |
| Novelty | #8 | 3.000 | 3.000 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Judge's choice | #1 | n/a | n/a |
| Reproducibility | #1 | 4.400 | 4.400 |
| Mechanistic interpretability | #2 | 4.400 | 4.400 |
| Novelty | #3 | 4.200 | 4.200 |
| Generality | #11 | 2.800 | 2.800 |
| ML Safety | #11 | 2.800 | 2.800 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Generality | #1 | 4.333 | 4.333 |
| Mechanistic interpretability | #3 | 4.333 | 4.333 |
| Novelty | #7 | 3.333 | 3.333 |
| ML Safety | #8 | 3.000 | 3.000 |
| Reproducibility | #9 | 3.667 | 3.667 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Mechanistic interpretability | #3 | 4.333 | 4.333 |
| Reproducibility | #5 | 4.000 | 4.000 |
| Novelty | #6 | 3.667 | 3.667 |
| Generality | #12 | 2.667 | 2.667 |
| ML Safety | #13 | 2.667 | 2.667 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| ML Safety | #1 | 3.750 | 3.750 |
| Novelty | #1 | 4.500 | 4.500 |
| Generality | #5 | 3.000 | 3.000 |
| Mechanistic interpretability | #5 | 4.250 | 4.250 |
| Reproducibility | #5 | 4.000 | 4.000 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Judge's choice | #3 | n/a | n/a |
| Reproducibility | #3 | 4.250 | 4.250 |
| Mechanistic interpretability | #5 | 4.250 | 4.250 |
| Generality | #5 | 3.000 | 3.000 |
| Novelty | #8 | 3.000 | 3.000 |
| ML Safety | #12 | 2.750 | 2.750 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Novelty | #1 | 4.500 | 4.500 |
| Generality | #2 | 4.000 | 4.000 |
| ML Safety | #3 | 3.500 | 3.500 |
| Mechanistic interpretability | #7 | 4.000 | 4.000 |
| Reproducibility | #10 | 3.500 | 3.500 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Reproducibility | #1 | 4.400 | 4.400 |
| Generality | #3 | 3.400 | 3.400 |
| ML Safety | #7 | 3.200 | 3.200 |
| Mechanistic interpretability | #8 | 3.800 | 3.800 |
| Novelty | #12 | 2.600 | 2.600 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Judge's choice | #4 | n/a | n/a |
| Novelty | #4 | 3.750 | 3.750 |
| Reproducibility | #5 | 4.000 | 4.000 |
| Generality | #5 | 3.000 | 3.000 |
| ML Safety | #6 | 3.250 | 3.250 |
| Mechanistic interpretability | #9 | 3.750 | 3.750 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Novelty | #5 | 3.674 | 4.500 |
| Reproducibility | #8 | 3.674 | 4.500 |
| Generality | #9 | 2.858 | 3.500 |
| ML Safety | #9 | 2.858 | 3.500 |
| Mechanistic interpretability | #10 | 3.674 | 4.500 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| ML Safety | #5 | 3.333 | 3.333 |
| Generality | #5 | 3.000 | 3.000 |
| Novelty | #11 | 2.667 | 2.667 |
| Mechanistic interpretability | #11 | 3.333 | 3.333 |
| Reproducibility | #14 | 2.000 | 2.000 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Generality | #9 | 2.858 | 3.500 |
| Novelty | #10 | 2.858 | 3.500 |
| Mechanistic interpretability | #12 | 2.858 | 3.500 |
| ML Safety | #14 | 1.633 | 2.000 |
| Reproducibility | #15 | 1.633 | 2.000 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| ML Safety | #9 | 2.858 | 3.500 |
| Reproducibility | #11 | 2.858 | 3.500 |
| Generality | #13 | 2.449 | 3.000 |
| Mechanistic interpretability | #13 | 2.449 | 3.000 |
| Novelty | #13 | 2.449 | 3.000 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| ML Safety | #2 | 3.674 | 4.500 |
| Reproducibility | #13 | 2.041 | 2.500 |
| Generality | #14 | 1.633 | 2.000 |
| Mechanistic interpretability | #14 | 1.225 | 1.500 |
| Novelty | #14 | 2.041 | 2.500 |

| Criteria | Rank | Score* | Raw Score |
| --- | --- | --- | --- |
| Reproducibility | #12 | 2.449 | 3.000 |
| Generality | #14 | 1.633 | 2.000 |
| ML Safety | #14 | 1.633 | 2.000 |
| Mechanistic interpretability | #15 | 0.816 | 1.000 |
| Novelty | #15 | 0.816 | 1.000 |
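
In every table where Score* differs from Raw Score, the ratio between the two is about 0.816, which is what a factor of sqrt(2/3) would give. The following is a minimal sketch of one rule that reproduces those rows, assuming the adjustment is raw × sqrt(min(n, median) / median), where n is the number of ratings an entry received. The function name `adjusted_score`, its parameters, the value n = 2, and a median of 3 are all assumptions made for illustration; none of them is stated in the results above.

```python
import math


def adjusted_score(raw_score: float, num_ratings: int, median_ratings: float) -> float:
    """Scale a raw score down when an entry has fewer ratings than the median.

    Assumed rule, not confirmed by the results page:
        adjusted = raw * sqrt(min(n, median) / median)
    """
    factor = math.sqrt(min(num_ratings, median_ratings) / median_ratings)
    return round(raw_score * factor, 3)


# Raw scores copied from the tables above.
# num_ratings=2 and median_ratings=3 are assumptions chosen to match the data.
for raw in (4.5, 3.5, 3.0, 2.5, 2.0, 1.5, 1.0):
    print(f"{raw:.3f} -> {adjusted_score(raw, num_ratings=2, median_ratings=3):.3f}")
```

Under these assumptions, an entry with at least as many ratings as the median keeps its raw score, which matches the tables where both columns are identical.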