Last year, I came across the StreamDiffusion project on GitHub, and I wondered if it would be possible to integrate it into a game engine — specifically Unreal Engine — so that artists could dynamically prompt a style filter applied to the entire game view. It was an exciting idea: real-time, AI-driven stylization controlled by text prompts.
The goal was to create a full-screen post-process filter that could change the visual style of the game in real time based on user input.
The setup was built on Unreal Engine 5.4, with StreamDiffusion’s latest version (though the project had been inactive since late 2023).
I aimed for a playable prototype, not a production-ready feature. In practice, it barely reached 1 FPS, and the visual quality was far from what StreamDiffusion is capable of outside Unreal. The project was ultimately aborted, but not without learning a lot along the way.
Here's what the prototype could look like with a functional pipeline.
To bridge Unreal and StreamDiffusion, I packaged the Python project into an executable and handled the communication between the two via pipes. Unreal sent frames to StreamDiffusion, and the generated images were sent back and rendered via a Render Target attached to the main camera.
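On the Unreal side, that bridge boiled down to launching the packaged executable and holding on to two pipes, one per direction. Here is a minimal sketch of the setup; the executable name is a placeholder, and the FPlatformProcess calls are written from memory, so the exact signatures should be checked against your engine version.

```cpp
#include "CoreMinimal.h"
#include "HAL/PlatformProcess.h"

// Owns the external StreamDiffusion process and the two stdio pipes.
struct FStreamDiffusionBridge
{
    FProcHandle Proc;
    void* StdOutRead  = nullptr;  // engine reads generated images here
    void* StdOutWrite = nullptr;  // handed to the child as its stdout
    void* StdInRead   = nullptr;  // handed to the child as its stdin
    void* StdInWrite  = nullptr;  // engine writes captured frames here

    bool Launch(const FString& ExePath) // e.g. a packaged "StreamDiffusionServer.exe" (placeholder name)
    {
        // One pipe per direction. The 'true' marks the write end as local to the engine
        // so only the read end is inherited by the child (behavior may vary by engine version).
        FPlatformProcess::CreatePipe(StdOutRead, StdOutWrite);
        FPlatformProcess::CreatePipe(StdInRead, StdInWrite, /*bWritePipeLocal*/ true);

        Proc = FPlatformProcess::CreateProc(
            *ExePath, TEXT(""),
            /*bLaunchDetached*/ false, /*bLaunchHidden*/ true, /*bLaunchReallyHidden*/ true,
            /*OutProcessID*/ nullptr, /*PriorityModifier*/ 0, /*WorkingDirectory*/ nullptr,
            /*PipeWriteChild (child stdout)*/ StdOutWrite,
            /*PipeReadChild  (child stdin)*/  StdInRead);

        return Proc.IsValid();
    }

    void Shutdown()
    {
        if (Proc.IsValid())
        {
            FPlatformProcess::TerminateProc(Proc);
            FPlatformProcess::CloseProc(Proc);
        }
        FPlatformProcess::ClosePipe(StdOutRead, StdOutWrite);
        FPlatformProcess::ClosePipe(StdInRead, StdInWrite);
    }
};
```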
Everything ran locally on an RTX 4080 Super with 64 GB of RAM, but hardware wasn’t the bottleneck.
The real issue was architectural.
The main difficulty was launching and communicating with the Python process at runtime. Handling asynchronous inputs and outputs through pipes worked, but the latency was significant and the synchronization was fragile.
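Most of that fragility came down to framing: a pipe is just a byte stream, so without a convention for where one image ends and the next begins, the two sides drift out of sync as soon as a read returns a partial payload. The sketch below shows the kind of length-prefixed exchange I mean; the 4-byte size header is my own convention rather than anything StreamDiffusion defines, and the pipe calls are again written from memory.

```cpp
// Prefix each encoded frame with its size so the other side knows how much to read.
bool SendFrame(void* StdInWrite, const TArray<uint8>& EncodedFrame)
{
    const uint32 Size = EncodedFrame.Num();
    int32 BytesWritten = 0;
    return FPlatformProcess::WritePipe(StdInWrite, reinterpret_cast<const uint8*>(&Size), sizeof(Size), &BytesWritten)
        && FPlatformProcess::WritePipe(StdInWrite, EncodedFrame.GetData(), EncodedFrame.Num(), &BytesWritten);
}

// Called every tick: accumulate whatever bytes are available and only return
// once a complete image has arrived, so the game thread never blocks on the pipe.
bool PollResult(void* StdOutRead, TArray<uint8>& PendingBytes, TArray<uint8>& OutImage)
{
    TArray<uint8> Chunk;
    if (FPlatformProcess::ReadPipeToArray(StdOutRead, Chunk))
    {
        PendingBytes.Append(Chunk);
    }

    if (PendingBytes.Num() < (int32)sizeof(uint32))
    {
        return false; // size header not here yet
    }

    uint32 Expected = 0;
    FMemory::Memcpy(&Expected, PendingBytes.GetData(), sizeof(uint32));
    if (PendingBytes.Num() < (int32)(sizeof(uint32) + Expected))
    {
        return false; // image still streaming in
    }

    OutImage.Append(PendingBytes.GetData() + sizeof(uint32), (int32)Expected);
    PendingBytes.RemoveAt(0, (int32)(sizeof(uint32) + Expected));
    return true;
}
```

Even with framing in place, every frame still had to cross the pipe twice and wait for the diffusion pass before anything reached the Render Target.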
Performance was, unsurprisingly, terrible.
The whole pipeline was too heavy:
Unreal capturing frames
Sending them to the external process through pipes
Waiting for StreamDiffusion’s output
Reinjecting and displaying the result
At less than 1 FPS, it was more of a slideshow than an effect. The visual results were also underwhelming, mostly because the input frames weren’t optimized for the diffusion process and because I hadn’t found the right parameters for my specific application.
Packaging the Python project into an executable was a poor choice.
A better approach would have been to embed Python directly in the C++ side, using the CPython API (<Python.h>) or a more modern binding library such as pybind11, keeping everything within a single multithreaded process.
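For reference, the embedded route would look roughly like the sketch below, using the CPython API directly. The "sd_bridge" module and its "generate" function are hypothetical stand-ins for whatever a StreamDiffusion wrapper would expose; the point is only that frames stay in-process instead of being copied across a pipe.

```cpp
#define PY_SSIZE_T_CLEAN  // required so "y#" takes Py_ssize_t lengths
#include <Python.h>
#include <vector>

// Minimal embedded-Python generator: import a wrapper module once,
// then call its generate() with raw frame bytes and get stylized bytes back.
class FEmbeddedGenerator
{
public:
    bool Init()
    {
        Py_Initialize();
        Module = PyImport_ImportModule("sd_bridge");  // hypothetical wrapper module
        return Module != nullptr;
    }

    std::vector<char> Generate(const char* Pixels, Py_ssize_t NumBytes)
    {
        std::vector<char> Result;
        // Equivalent to calling: sd_bridge.generate(pixels: bytes) -> bytes
        PyObject* Out = PyObject_CallMethod(Module, "generate", "y#", Pixels, NumBytes);
        if (Out)
        {
            char* Buffer = nullptr;
            Py_ssize_t Length = 0;
            if (PyBytes_AsStringAndSize(Out, &Buffer, &Length) == 0)
            {
                Result.assign(Buffer, Buffer + Length);
            }
            Py_DECREF(Out);
        }
        return Result;
    }

    void Shutdown()
    {
        Py_XDECREF(Module);
        Py_FinalizeEx();
    }

private:
    PyObject* Module = nullptr;
};
```

Shipping a Python runtime inside an Unreal module and keeping GIL acquisition off the game thread bring their own problems, but at least the per-frame pipe copies and process synchronization disappear.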
If I were to start again, I would:
Use a more recent real-time image generation project, better optimized for low-latency tasks (if such a project exists).
Rework the input/output pipeline, possibly modifying the render texture before feeding it to the generator.
Experiment with ControlNet by sending rig or pose data to improve image consistency.
The pipeline worked, technically. But it was far from usable in real time, and visually disappointing. Still, it proved the concept: Unreal can, in theory, communicate with a live image generation process and re-render its results on the fly.
This was purely experimental, but a valuable reminder that bridging two complex ecosystems (Unreal and AI generation) requires architectural thinking first, not just brute-force connections.
The project didn’t meet its performance goals, but it offered deep insight into Unreal’s rendering pipeline and the challenges of integrating live AI generation.