This tool is incredible, but I have noticed some quirks. I've been using it to create subtitles for media that doesn't have any, and the issue I've run into is subtitle timing. Sometimes a line of dialogue appears on screen before it is actually spoken. And when there's a non-speech caption like [Music] or [Screams], it stays on screen indefinitely until the next line of dialogue. I'm not sure whether the tool is overcompensating on the timing because of the transcription model. I've been aiming for maximum accuracy.
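
In case it helps to show what I mean about the lingering captions, here is a rough post-processing sketch of the behaviour I'd expect. It is not part of the tool: the cue format `(start, end, text)`, the function name `cap_non_speech_cues`, and the 3-second cap are all my own assumptions, and it does nothing about dialogue that shows up too early.

```python
import re

# Hypothetical cue format (an assumption, not the tool's own):
# (start_seconds, end_seconds, text)
# Matches captions that consist only of a bracketed tag, e.g. "[Music]".
NON_SPEECH = re.compile(r"^\s*[\[(][^\])]*[\])]\s*$")

def cap_non_speech_cues(cues, max_seconds=3.0):
    """Shorten bracketed non-speech cues that stay on screen too long."""
    capped = []
    for start, end, text in cues:
        # Only touch cues like [Music]; leave spoken dialogue untouched.
        if NON_SPEECH.match(text) and end - start > max_seconds:
            end = start + max_seconds
        capped.append((start, end, text))
    return capped

cues = [
    (0.0, 45.0, "[Music]"),               # lingers until the next line
    (45.0, 48.5, "Where have you been?"),
]
print(cap_non_speech_cues(cues))
```

With something like this, the [Music] cue would end a few seconds after it starts instead of hanging around until the next line of dialogue.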