Thanks for your feedback!
The clue was indeed the rhythm you had to mimic.
The eye was asking you to shoot, so the input to use was an arrow key, which represents the action of shooting.
I thought that even if it wasn't clear, people would try the arrow keys since they're the only action inputs, but I was wrong.