Initial Release!
Posted August 01, 2025 by fynflood
#release
This is the first public release, so feedback is welcome and encouraged.
Core Features
- Offline Audio Transcription: Converts session audio (.mp3, .wav, etc.) into a text transcript using a local instance of whisper.cpp. Your audio data remains private.
- Hardware Acceleration: Can utilize NVIDIA (CUDA) or AMD (Vulkan) GPUs for significantly faster processing. A standard CPU fallback is also included.
- Per-Campaign Custom Dictionaries: You can create profiles for different campaigns and provide lists of unique names, locations, and game terms. This context is fed to the model to improve transcription accuracy.
- Optional AI Summarization: After a transcript is generated, you can use your own Google Gemini or OpenAI API key to create summaries. The application supports managing multiple, differently prompted summaries for a single session (e.g., a DM-focused version and a player-focused recap).
- Session Management: Transcripts and summaries are automatically saved and organized by campaign and session number for later review.
- Audio Recorder: Includes a basic tool to record audio from a microphone or directly from system audio output (for capturing online sessions).
- Export Options: Export content as .txt files, send summaries directly to a Discord webhook, or create new notes in an Obsidian vault.
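As a rough illustration of the per-campaign dictionary idea above, one plausible way to feed names and terms to a Whisper model is to join them into a short prompt string. This is only a sketch: the function name `build_prompt`, the `max_chars` cap, and the example terms are hypothetical and not taken from the application. The cap is an arbitrary illustrative value, chosen because Whisper models only take a limited amount of prompt text into account.

```python
def build_prompt(terms, max_chars=800):
    """Join unique campaign terms into one comma-separated prompt string.

    Hypothetical helper: duplicates (case-insensitive) and blank entries
    are dropped, and terms are added in order until max_chars is reached.
    """
    seen = set()
    kept = []
    length = 0
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()
        if not cleaned or key in seen:
            continue
        # Account for the ", " separator between entries.
        extra = len(cleaned) + (2 if kept else 0)
        if length + extra > max_chars:
            break
        seen.add(key)
        kept.append(cleaned)
        length += extra
    return ", ".join(kept)


# Example terms for an imaginary campaign profile.
campaign_terms = ["Eldoria", " Captain Vex ", "eldoria", "", "Moonfall Keep"]
prompt = build_prompt(campaign_terms)
```

Putting the most important names first means they survive if the list is longer than the cap.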
Technical Overview
The application is built with Python using the customtkinter library for the user interface.
Audio processing and file I/O for chunking long recordings are handled through direct, silent calls to ffmpeg and ffprobe to ensure stability and prevent console window pop-ups.
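A minimal sketch of what "silent" ffmpeg and ffprobe calls can look like in Python, assuming the standard `subprocess` module is used. The helper names, the 600-second chunk length, and the output pattern are illustrative assumptions rather than the application's actual code. On Windows, `subprocess.CREATE_NO_WINDOW` is the flag that keeps a console window from flashing up when a GUI program launches a command-line tool.

```python
import subprocess
import sys


def silent_kwargs():
    """Keyword arguments for subprocess.run that keep the call quiet."""
    kwargs = {
        "stdin": subprocess.DEVNULL,
        "stdout": subprocess.PIPE,
        "stderr": subprocess.PIPE,
        "text": True,
    }
    if sys.platform == "win32":
        # Only defined on Windows: suppresses the console window.
        kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
    return kwargs


def probe_duration_cmd(path, ffprobe="ffprobe"):
    """ffprobe command that prints only the container duration in seconds."""
    return [
        ffprobe, "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def chunk_cmd(path, out_pattern, seconds=600, ffmpeg="ffmpeg"):
    """ffmpeg command that splits audio into fixed-length 16 kHz mono WAV chunks."""
    return [
        ffmpeg, "-hide_banner", "-loglevel", "error", "-y",
        "-i", path,
        "-f", "segment", "-segment_time", str(seconds),
        "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le",
        out_pattern,
    ]


def get_duration(path):
    """Run ffprobe silently and return the duration as a float."""
    result = subprocess.run(probe_duration_cmd(path), check=True, **silent_kwargs())
    return float(result.stdout.strip())
```

A call such as `subprocess.run(chunk_cmd("session.mp3", "chunk_%03d.wav"), check=True, **silent_kwargs())` would then write `chunk_000.wav`, `chunk_001.wav`, and so on. The chunks are resampled to 16 kHz mono because that is the WAV input whisper.cpp expects.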
Transcription is handled by whisper.cpp, which ships in two builds: a custom-compiled version for AMD GPUs and an out-of-the-box version for NVIDIA GPUs.
All required executables are bundled with the application.
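To make the multi-build setup concrete, here is a hedged sketch of how an application might pick between bundled whisper.cpp executables and assemble a command line. The directory layout, the `BACKENDS` table, and both function names are hypothetical, since the post does not give the real file names. The flags `-m`, `-f`, `-otxt`, and `--prompt` are options of the whisper.cpp command-line tool, but whether the application uses these exact options is an assumption.

```python
from pathlib import Path

# Hypothetical locations of the bundled executables.
BACKENDS = {
    "cuda": Path("bin/cuda/whisper-cli"),      # NVIDIA build
    "vulkan": Path("bin/vulkan/whisper-cli"),  # AMD build
    "cpu": Path("bin/cpu/whisper-cli"),        # fallback build
}


def pick_backend(gpu_vendor):
    """Map a GPU description string to a backend key, defaulting to CPU."""
    vendor = (gpu_vendor or "").lower()
    if "nvidia" in vendor:
        return "cuda"
    if "amd" in vendor or "radeon" in vendor:
        return "vulkan"
    return "cpu"


def whisper_cmd(backend, model, wav, prompt=""):
    """Build the argument list for a whisper.cpp transcription run."""
    cmd = [str(BACKENDS[backend]), "-m", model, "-f", wav, "-otxt"]
    if prompt:
        # Optional initial prompt, e.g. campaign-specific names and terms.
        cmd += ["--prompt", prompt]
    return cmd


backend = pick_backend("NVIDIA GeForce RTX 3060")
command = whisper_cmd(backend, "models/ggml-base.bin", "chunk_000.wav", "Eldoria, Captain Vex")
```

Keeping the choice in a single lookup table means the CPU fallback is used automatically whenever no known GPU vendor is detected.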