Hey!
I can dig into this later, but I remember the Arabic support script doing something like this to get ligatures working.
My RTL support script: https://pastebin.com/h9fyzJcq
Arabic support for Unity by Konash: https://github.com/Konash/arabic-support-unity
STM Emoji Support: https://pastebin.com/qr9me3GA
Essentially, the Unity inspector and other parts of Unity just fail when it comes to ligatures, languages with connected scripts, RTL text, and emoji.
I think if you go through the source code for Konash's Arabic support script and my Emoji support script, you'll be able to find a way to take the original string and send it to STM in a way that renders correctly. It would have to skip anything related to Unity's inspector, though, so a replacement script like the one you wrote, or something that fixes these problems on the fly, is probably the way to go.
Check out STM's "PreParse" event too. It's meant for exactly this kind of thing: taking the text sent to STM and modifying it before STM actually processes it (my Emoji script shows how to do this). That will let you have this as a component you can put on any STM object you want to have ligature support, or any other custom processing code you can imagine.
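To make the pre-parse idea concrete, here is a rough, language-neutral sketch in Python. It is not STM's actual API: `TextContainer`, `TextRenderer`, `pre_parse_handlers`, `apply_ligatures`, and the `LIGATURES` table are all made-up names for illustration. The only point is the shape of the pattern: handlers get the raw string in a mutable container and rewrite it before the renderer does anything with it. The Unicode ligature code points (U+FB00 to U+FB04) are real, but whether a given font has glyphs for them is a separate question.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical substitution table: letter sequences -> Unicode ligature characters.
LIGATURES = {
    "ffi": "\uFB03",
    "ffl": "\uFB04",
    "ff": "\uFB00",
    "fi": "\uFB01",
    "fl": "\uFB02",
}


@dataclass
class TextContainer:
    """Mutable wrapper so handlers can rewrite the text in place."""
    text: str


@dataclass
class TextRenderer:
    """Stand-in for a text component that exposes a pre-parse hook (assumed design)."""
    pre_parse_handlers: List[Callable[[TextContainer], None]] = field(default_factory=list)

    def set_text(self, raw: str) -> str:
        container = TextContainer(raw)
        # Every handler sees (and may modify) the text before any real processing.
        for handler in self.pre_parse_handlers:
            handler(container)
        # A real renderer would parse tags and build a mesh here; this sketch
        # just returns the text that would be handed to that step.
        return container.text


def apply_ligatures(container: TextContainer) -> None:
    """Replace letter sequences with ligature characters, longest sequence first."""
    text = container.text
    for seq in sorted(LIGATURES, key=len, reverse=True):
        text = text.replace(seq, LIGATURES[seq])
    container.text = text


if __name__ == "__main__":
    renderer = TextRenderer()
    renderer.pre_parse_handlers.append(apply_ligatures)
    print(renderer.set_text("office file"))
```

Because the handler is just something you attach to the renderer, the same idea carries over to a component you drop on whichever text objects need it, and you can chain more handlers (RTL reordering, emoji replacement) the same way.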