Very interesting, thank you!
When I made a musical rhythm game for the last jam, I just built an array of notes with their lengths and pauses (also converted from MIDI), then walked the list and spawned a visual note cue 1 second before emitting the actual sound. The visual notes had timers attached to them, and the input was checked against those timers to see whether it fell within the allowed interval.
So it looks the same, but instead of checking against music playback, it checks against the timers attached to the visual cues. Although I didn't do the classic "lines" and used notes that scale up as they appear :)
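Roughly, that approach could be sketched like this in plain Python. This is not the actual jam code and it is not tied to any engine: the names `Note`, `build_chart`, `cues_to_spawn` and `judge` are made up for illustration, and the 0.15 s tolerance is an assumed value. Only the 1 second lead time comes from the description above.

```python
from dataclasses import dataclass

LEAD_TIME = 1.0   # seconds the visual cue appears before the note sounds
TOLERANCE = 0.15  # assumed hit window in seconds (illustrative value)


@dataclass
class Note:
    start: float   # time in seconds at which the note should sound
    length: float  # note duration in seconds


def build_chart(events, offset=LEAD_TIME):
    """Turn (length, pause_after) pairs into notes with absolute start times.

    The chart is shifted by `offset` so the first cue can appear at time 0.
    """
    chart, t = [], offset
    for length, pause in events:
        chart.append(Note(start=t, length=length))
        t += length + pause
    return chart


def cues_to_spawn(chart, prev_time, now):
    """Notes whose visual cue becomes due during the frame [prev_time, now)."""
    return [n for n in chart if prev_time <= n.start - LEAD_TIME < now]


def judge(note, press_time, tolerance=TOLERANCE):
    """True if the press lands inside the allowed window around the note."""
    return abs(press_time - note.start) <= tolerance
```

Each frame the game would call `cues_to_spawn(chart, prev_time, now)` with its own clock, and call `judge` when a key is pressed. Whether `now` comes from a timer or from the audio playback position is exactly the choice being discussed here; the sketch works the same either way.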
Not sure though whether the timer runs on the physics tick or on something else - I haven't gotten that deep into the engine yet. But for some reason I thought that querying a timer is "faster" than querying a music file. Although once the file is loaded, there is probably no difference?