Love the app! It works fast, and the results are remarkably good for the processing speed. I normally use whisperx with pyannote for segmentation and diarization, but that process is not well suited to rapidly transcribing video for low-latency distribution. My primary use case is scanning video for obscene language prior to general distribution. The drag-and-drop pipeline with the post-processing cleanup grid has cut my work time in half! Quick question: is there any way to inject into the translation pipeline so I could add speaker diarization?