hey, i've worked with articulated sound systems like this before (you can check https://dyscomorph.itch.io/meditesselation-o and https://soundcloud.com/givehmthaphreudianslep/gathering-post) and wouldn't mind helping out with this.
my frame 1 recommendation is that the normal enemy should operate like this: 1234, 223, on the beat, perhaps with a droning note (i can help source notes depending on how production flows; i'm guessing it's in between a midi system and a soundtrack system). on the first 1234 they can attack towards the player, and on 223 they fall back. the slightly larger enemy can keep the current pattern, and a good 'musical' spawn is one starting on the second loop of the normal enemy's pattern (so 1234 223, 1234 223, 1 unique spawn).
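rough sketch of that enemy loop in python, just to show the counting. the names here (`normal_enemy_action`, `larger_enemy_spawns`) are made up for illustration and not from the game's code. i'm assuming "1234, 223" means 4 attack beats followed by 3 fall-back beats (a 7-beat loop), and that the unique spawn lands on the first beat after two full loops, which is how i read the count above.

```python
ATTACK_BEATS = 4      # the "1234" count (assumed: one action step per beat)
FALLBACK_BEATS = 3    # the "223" count
LOOP_BEATS = ATTACK_BEATS + FALLBACK_BEATS  # 7-beat loop under this reading


def normal_enemy_action(beat: int) -> str:
    """Return what the normal enemy does on a 0-based beat index."""
    position = beat % LOOP_BEATS
    return "attack" if position < ATTACK_BEATS else "fall_back"


def larger_enemy_spawns(beat: int) -> bool:
    """True on the first beat after two full loops (assumed spawn point)."""
    return beat == 2 * LOOP_BEATS


if __name__ == "__main__":
    # Print two loops plus the spawn beat so the pattern is easy to eyeball.
    for beat in range(2 * LOOP_BEATS + 1):
        spawn = " + unique spawn" if larger_enemy_spawns(beat) else ""
        print(beat, normal_enemy_action(beat) + spawn)
```

if the spawn should instead land at the start of the second loop, swapping `2 * LOOP_BEATS` for `LOOP_BEATS` is the only change.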
my recommendation is to keep patterns on a consistent tempo, though we can have enemies or periods that are up-tempo. guns should also fire on beat, with their own patterns: a revolver could be 123456 plus a 2-beat reload, or a pistol 12345678 plus a 1-measure reload (for balance? maybe too punishing). we can build up from there; implementing is not always easy (or my strong suit lmao) smh
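here's a minimal sketch of the gun idea, again in python and again with made-up names (`GunPattern`, `beat_at`), so treat it as a starting point and not how the game actually does it. assumptions: one shot per beat, a measure is 4 beats, and the current beat comes from elapsed time and a bpm value, so an up-tempo section is just a higher bpm.

```python
from dataclasses import dataclass

BEATS_PER_MEASURE = 4  # assumption: 4 beats to a measure


@dataclass(frozen=True)
class GunPattern:
    shots: int         # beats on which the gun may fire, one shot per beat
    reload_beats: int  # beats spent reloading after the last shot

    @property
    def cycle(self) -> int:
        return self.shots + self.reload_beats

    def can_fire(self, beat: int) -> bool:
        """True if the gun may fire on this 0-based beat index."""
        return beat % self.cycle < self.shots


# The two examples from the post: 6 shots + 2-beat reload, 8 shots + 1 measure.
REVOLVER = GunPattern(shots=6, reload_beats=2)
PISTOL = GunPattern(shots=8, reload_beats=BEATS_PER_MEASURE)


def beat_at(seconds: float, bpm: float) -> int:
    """Convert elapsed seconds to a 0-based beat index at a fixed tempo."""
    return int(seconds * bpm / 60)


if __name__ == "__main__":
    for beat in range(REVOLVER.cycle):
        print(beat, "fire" if REVOLVER.can_fire(beat) else "reload")
```

under these numbers the revolver cycles every 8 beats and the pistol every 12, so the pistol spends a third of its time reloading, which is where the "maybe too punishing" worry shows up.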
this is my discord: "theshovelier". if you have a different communication go-to, lmk (sorry, thought this was a feeler for a composer lol, imma let it ride)