
Hi, me again. So I've been running the GitHub versions, and in 0.0.228 I'm honestly not seeing any difference between running default Ollama and Ollama with the following environment variables:


OLLAMA_FLASH_ATTENTION=1,OLLAMA_KV_CACHE_TYPE=q8_0,OLLAMA_NO_KV_OFFLOAD=1


or


OLLAMA_FLASH_ATTENTION=1,OLLAMA_KV_CACHE_TYPE=q4_0,OLLAMA_NO_KV_OFFLOAD=1

Is that normal behavior, or just Ollama being its usual shitty self...


Also, SD WebUI Forge is kinda dead. WebUI is still in active development, as is ComfyUI, but the original Forge is no longer maintained. I think there's a fork, but yeah... for all the improvements it brought, the guy that made it moved on. I think he's the guy that made ControlNet and is now working on video.

> honestly not seeing any difference between running default Ollama and Ollama with the following environment variables:
Hmm, I wonder if maybe the env vars are not being picked up? Tbh I wasn't quite sure how to test it. If you have a good idea then lmk!
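One idea I might try, and this is just a sketch that assumes a recent Ollama build where the /api/ps endpoint reports a size_vram field: load the same model with the same context length under each setting, then ask the server how much memory it thinks the loaded model is using. If OLLAMA_KV_CACHE_TYPE is actually being picked up, the reported footprint should shrink going from the default cache to q8_0 to q4_0.

    import json
    import urllib.request

    # Rough check: after starting `ollama serve` with a given OLLAMA_KV_CACHE_TYPE
    # and loading a model, ask the server what it reports for the loaded model.
    # Compare the numbers between runs with different cache settings.
    def report_loaded_models(host="http://localhost:11434"):
        with urllib.request.urlopen(f"{host}/api/ps") as resp:
            data = json.load(resp)
        for m in data.get("models", []):
            total_gb = m["size"] / 1e9
            vram_gb = m.get("size_vram", 0) / 1e9
            print(f"{m['name']}: {total_gb:.2f} GB total, {vram_gb:.2f} GB in VRAM")

    if __name__ == "__main__":
        report_loaded_models()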

> Also, SD WebUI Forge is kinda dead.

Yeah, that's true. Working on adding ComfyUI next! Any other local image gen tools you think I should add?

Sorry, I'm not really an Ollama user; it's just worse than LM Studio if you're not using it as a server. That being said, I just compared the speed and the RAM and VRAM footprint, and they seem to be the same regardless of which KV cache quant I choose. That's not really how these things are supposed to work. I don't know any other way to test it, though. I mainly use GUIs and stay as far away from the terminal as I can. Actually, installing and uninstalling HammerAI's Flatpak is the most usage my terminal has seen in months. :)

As for other inference engines for text-to-image models, Automatic1111's SD WebUI is still in active development, but it's not the best. Forge has a fork, but it's far from the same quality as the original, since the original Forge was made by a literal genius with a PhD and all. ComfyUI is the preferred inference engine for most desktop and laptop image generation since it's the most efficient. The easiest way to install any of them on a PC would be Stability Matrix, if you want to do some testing.

So, I've been playing around with the 230 betas (I think I'm at 233 now), and I really think you should be made aware that diffusion t2i models really don't like to generate images smaller than their default size. That's 512x512 for SD 1.5 models and 1024x1024 for SDXL and FLUX. That doesn't mean you can't diverge a fair bit, but 256x256 really is not a good image size, and that seems to be what the in-chat image generation defaults to.

Oo, thank you so much! I bumped the values all up to 1024. Does it work better for you now? Or should I intelligently choose 512 or 1024 based on the type of image generation model?

Well, since you ask so nicely, here are my notes and observations after playing with a few offline models:

1. You don't get any info about what kind of checkpoint a local diffusion model is, so unless you want to keep a database of all released models and checkpoints, my suggestion for offline usage would be NO. I would still change the 1024x1024 default you use now, though.

2. Since you DO know what your online checkpoints are, that's a YES for them. Use 512x512/512x768/768x512 as sane defaults for the SD 1.5 models and 1024x1024/1024x1536/1536x1024 for the rest (see the size sketch after these notes). The bigger the canvas, the higher the quality of the image. A lot of SD 1.5 and SDXL base models still use hi-res fix because they generate at lower resolutions.

3. I also recommend you change the in-chat default from euler with the normal scheduler to dpm++ 2m with the beta scheduler. dpm++ 2m is as fast as euler step-wise, but it produces better results and is guaranteed to be available everywhere. If you don't want to change the scheduler separately (not all interfaces can), keep it at normal, but beta tends to work better with pretty much all text2img models out there, unlike, say, Karras, which literally doesn't work in most non-SD-based models.

4. I also recommend using square 1:1 image generation for character images and 2:3/3:2 for in-chat generation.

5. I would also like to point out that for on-personal-hardware generation, your default negative prompts do not work. You might have NovelAI embeds running for the online checkpoints, but they're not there in core ComfyUI. When you generate something, HammerAI only sends 'person' for the negative, for example, instead of the full list of negative prompts, namely 'distorted face, unnatural pose, harsh shadows, poor lighting, over-saturated skin tones, extra digits'. My suggestion would be to go to Civitai and look at what the standard positive and negative prompts are for each model type you use, and replace the negative prompts you're currently using with a set of defaults for each: SD 1.5, SDXL, Pony, NoobAI, Illustrious, Flux, Imagen, etc. Once a model is selected, after we put in our prompt (and, if the model supports it, a negative one), HammerAI would automatically send the usual positives and negatives in the format:

model/checkpoint-specific positive prompts BREAK the actual prompt we put in or the AI generated for in-chat generation
model/checkpoint-specific negative prompts (if applicable) BREAK anything we ourselves put in the negative column
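I'm not a programmer, so treat this as a rough sketch rather than real code, but roughly what I mean is something like the following, where the per-family default strings are just placeholders and the real ones should be taken from what's standard on Civitai for each checkpoint family:

    # Sketch of how the final prompt strings could be assembled. The per-family
    # defaults below are illustrative placeholders, NOT vetted prompts; the real
    # ones should come from what's standard on Civitai for each checkpoint family.
    FAMILY_DEFAULTS = {
        "sd15": {
            "positive": "masterpiece, best quality",
            "negative": "distorted face, unnatural pose, harsh shadows, "
                        "poor lighting, over-saturated skin tones, extra digits",
        },
        "sdxl": {
            "positive": "high quality, detailed",
            "negative": "lowres, bad anatomy, watermark",
        },
        # ... Pony, NoobAI, Illustrious, Flux, etc.
    }

    def build_prompts(family, user_positive, user_negative="", supports_negative=True):
        defaults = FAMILY_DEFAULTS.get(family, {"positive": "", "negative": ""})
        positive = f"{defaults['positive']} BREAK {user_positive}"
        negative = f"{defaults['negative']} BREAK {user_negative}" if supports_negative else ""
        return positive, negative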

This WILL improve image quality a lot from where it is now; even if you stick with just 1024x1024 for in-chat generation, image quality will improve for most models. Note: some of the newer models might not need any model-specific prompts, but older SD 1.5 and SDXL base models do.

That being said, 128, 256 and 512 tend to be the stable divisors for image sizes: 256 is preferred over 128, and 512 over the other two. Preferred in the sense that the models were trained on images limited to resolutions divisible by them, and the VAEs were trained on sizes based on them as well. It doesn't matter what aspect ratio or size the images originally had; diffusion models were trained on a very limited resized/cropped subset of them, and the best results come from resolutions that are perfectly divisible by one or all of those numbers. I think there's a new Chinese model that can produce images of more arbitrary sizes without any degradation, but I also think it's a gigantic model, so, you know. Note: this is for image generation; image editing has a bit more leeway, but you're only doing generation, so this is what matters to you.
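Again just a sketch from a non-programmer, with the family names, default sizes and the snapping step as placeholder assumptions, but picking the size could look something like:

    # Sketch: per-family default sizes plus the divisibility rule above.
    # Family names, defaults and the snapping step are assumptions to illustrate
    # the idea, not tested values.
    SIZE_DEFAULTS = {
        # family: (square, portrait, landscape)
        "sd15": ((512, 512), (512, 768), (768, 512)),
        "sdxl": ((1024, 1024), (1024, 1536), (1536, 1024)),
        "flux": ((1024, 1024), (1024, 1536), (1536, 1024)),
    }

    def default_size(family, kind="chat"):
        # 1:1 for character portraits, 2:3 portrait for in-chat images (see point 4).
        square, portrait, _landscape = SIZE_DEFAULTS.get(family, SIZE_DEFAULTS["sdxl"])
        return square if kind == "character" else portrait

    def snap(value, step=128, minimum=512):
        # Round a requested dimension down to a multiple of `step`, never below `minimum`.
        return max(minimum, (value // step) * step)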


ALSO... Mistral NeMo base models are bad at generating prompts from chat... they're old, and most diffusion models came out after them. I'd recommend looking into some newer small models to see if you can't find something better at summing up the chat and creating a prompt for a specific image generator. NeMo base models currently produce more of a summary of the chat so far, with a lot of caveats like including dialogue or complex multi-scene descriptions in the prompts they create for image generation. And NeMo models also tend to be the largest that 8GB of VRAM can fully load, not that any bigger model will be much better if it's a finetune of a really old model or hasn't been trained on how to structure a t2i prompt for certain models. I really don't know what to suggest for this, since I don't know what you're telling the model to do when I click the 'image' button in chat, but I can see the prompt it makes and sends for generation in ComfyUI. You might be able to improve this by changing how you ask for an image-generation prompt to be created.
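I can only sketch what I mean, and the wording and model name below are just illustrations of the constraint style (not what you actually send), but keeping the instruction narrow seems to be what stops small models from dumping dialogue into the image prompt:

    import json
    import urllib.request

    # Illustrative instruction for turning the chat so far into a text2img prompt.
    # The wording is an example of the constraint style, not HammerAI's actual prompt.
    INSTRUCTION = (
        "Describe the current scene as a single comma-separated list of visual tags "
        "for an image generator. Describe only one moment and one scene. "
        "Do not include dialogue, speaker names, or story text. "
        "Cover: subject, pose, clothing, setting, lighting, camera angle, art style."
    )

    def chat_to_image_prompt(chat_summary, model="mistral-nemo",
                             host="http://localhost:11434"):
        payload = json.dumps({
            "model": model,
            "prompt": f"{INSTRUCTION}\n\nChat so far:\n{chat_summary}",
            "stream": False,
        }).encode()
        req = urllib.request.Request(f"{host}/api/generate", data=payload,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"].strip()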

Thank you so much! This is incredibly useful. I will implement these suggestions. Curious, any chance you're also an engineer and would like to get paid to come help work on HammerAI image gen? Would love someone with your expertise to help out directly!

While I appreciate the offer, I'm not an engineer and would hate to scam someone who's giving us this wonderful program for free. As for how I became so knowledgeable: well, I just decided at the start of the year that AI wasn't going anywhere, so I might as well start experimenting with it, and since I didn't have a 4090 to play with but didn't want to be dependent on someone else's server, I had to learn a few things.


It took some time before I discovered that flash attention doesn't work with all models, but when it does, it won't degrade the response at all while reducing the VRAM/RAM needed; or that if you can't fit both the KV cache and the model into memory, it's better to leave the KV cache in RAM and fit just the model into VRAM. A while to learn that ComfyUI was the best and fastest way to generate images on consumer-grade hardware, a while to find out the limits of diffusion models and checkpoints, etc. And none of this info is in any one place. I'm sure that for people who work at Google or Meta it's all very clear, but if you're just now getting into AI... the learning curve, especially for consumer-grade hardware, is steep. Especially if, like me, one is NOT an engineer and has to rely on programs like LM Studio, Stability Matrix or HammerAI. But I'm more than willing to share the little I have learned with you. Cloud models will always be superior to locally run models, but local offers privacy and, in the case of RP, spiciness. Plus, I want to see how hard I can push my hardware.


That being said, if you want a killer feature given the limited context you offer online, try to see if you can get a model to generate new intros for existing characters. If you go to Chub you'll see most intros aren't alternatives but continuations of chats from the previous intro. If you can get HammerAI to take the character sheet and the base intro and generate a new intro, you would have surpassed the core context-window limitation most of your competitors currently face, and made HammerAI's characters second only to the large-context-window behemoths like OpenAI or Grok.
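Mechanically, I imagine it's just a matter of feeding both pieces into a single instruction; a rough sketch, with the wording purely illustrative:

    # Sketch: build a prompt that asks a model for an alternative intro from the
    # character sheet plus the base intro. Wording is illustrative only.
    def build_new_intro_prompt(character_sheet, base_intro):
        return (
            "You are writing an alternative opening scene for a roleplay character.\n\n"
            f"Character sheet:\n{character_sheet}\n\n"
            f"Original intro (use it for tone and voice only, do not continue it):\n{base_intro}\n\n"
            "Write a brand-new opening scene in the same voice, set in a different "
            "situation, that does not reference events from the original intro."
        )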


Thank you so much! This is really so incredibly useful. I just changed the default samplers, and will look at the rest. 

If you ever do learn to code I'd be happy to hire you! Alternatively, I'd happily pay you to help improve the prompts in the app and how I set up different parts of image generation? You wouldn't need to code, we could just be on a call and look through different parts of the app? You have so much more expertise than me here, and I think it could really help out all the HammerAI users! Feel free to DM on Discord (hammer_ai) if you're interested?