Thanks, but I can't really take credit for the audio since I didn't create it. For the sculpture's control, the idea is to keep the statue's orientation facing forward at 0 degrees so the movement is tied to that particular direction. This is so I can easily recreate them with code. (TBH, though, I'm too lazy to deal with the transform basis when the camera is rotated.)
As for comparing 2D images, all images in-game are just color data stored in 64 array spaces to represent a 16x16 grid. First, I divide the array spaces into a 4x4 arrangement of matrices (each consisting of 4x4 spaces), which I then compare sequentially with the original artwork, also divided into 4x4 matrices. Each matrix must pass a 0.7 (70%) similarity threshold before its similarity percentage is accepted. If it does not pass the threshold, the matrix is immediately rejected. Finally, the similarity scores of all accepted matrices are averaged to get the accuracy for the image.
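
Here is a rough Python sketch of that comparison, not the actual in-game code. It makes a few assumptions that go beyond what is described above: each image is a flat list with one color entry per cell of the 16x16 grid (256 entries, row by row), the similarity of a matrix is the fraction of its cells whose colors match exactly, and the final average is taken over the accepted matrices only (returning 0.0 if none are accepted). The names `block_similarity` and `image_accuracy` are made up for illustration.

```python
GRID = 16        # the image is GRID x GRID cells
BLOCK = 4        # each matrix (block) is BLOCK x BLOCK cells
THRESHOLD = 0.7  # minimum similarity for a block to be accepted


def block_similarity(image, original, block_row, block_col):
    """Fraction of cells in one block whose colors match exactly (assumed metric)."""
    matches = 0
    for dy in range(BLOCK):
        for dx in range(BLOCK):
            # Convert (row, column) inside the block to an index in the flat list.
            index = (block_row * BLOCK + dy) * GRID + (block_col * BLOCK + dx)
            if image[index] == original[index]:
                matches += 1
    return matches / (BLOCK * BLOCK)


def image_accuracy(image, original):
    """Average the similarity of every block that passes the threshold."""
    accepted = []
    for block_row in range(GRID // BLOCK):
        for block_col in range(GRID // BLOCK):
            score = block_similarity(image, original, block_row, block_col)
            if score >= THRESHOLD:
                accepted.append(score)  # blocks below the threshold are rejected
    if not accepted:
        return 0.0  # assumption: no accepted blocks means zero accuracy
    return sum(accepted) / len(accepted)
```

Note that with this averaging, a rejected block does not lower the result; it is simply left out. If rejected blocks should count as zero instead, divide by the total number of blocks (16) rather than `len(accepted)`.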