If i am not wrong you can try to create a Candi AI system (i never used, but i suppose it similar to character tavern but with chat like approach instead of narrative?) adressing several things. 1) A chat like interface, not like chatGPT but something like messenger/whattsapp or things like that.
2) have the bot promptet to reply like in a conversation. Things like: 'Hi, how was your day?' instead of: *I enter the kitchen and saw you there, drinking a coffee as the steam caress your face and the sun paint glowing lines in the background.* "Hi, how was your day?" *bla bla bla*
3) A Diffusion model (like SDXL or SD1.5) with a character lora and well configured previously to create images. For example if you request the character some n****
4) Perhaps you may also need a reasoning model (Deepseek or Qwen models) even small parameters models tend to give good replies.
I don't know, those are just a few ideas. I made a prototype for a system that allows LLMs to control NPC's in a 3D game. It's called 'Coppersons' Is not the same as Candy.AI but maybe it can give you an idea/approach/whatever? it can be used in both 2D and 3D applications.