
AlucardNoir

26 Posts · 1 Topic
A member registered Nov 07, 2019

Recent community posts

While I appreciate the offer, I have to decline - I'd hate to scam someone who's giving us this wonderful program for free. As for how I became so knowledgeable: well, I just decided at the start of the year that AI wasn't going anywhere, so I might as well start experimenting with it, and since I didn't have a 4090 to play with but didn't want to be dependent on someone else's server, I had to learn a few things.


It took some time before I discovered that Flash Attention doesn't work with all models, but when it does it won't degrade the responses at all while reducing the VRAM/RAM needed; or that if you can't fit both the KV cache and the model into VRAM, it's better to leave the KV cache in RAM and fit just the model into VRAM. A while to learn that ComfyUI was the best and fastest way to generate images on consumer-grade hardware, a while to find the limits of diffusion models and checkpoints, etc. And none of this info is in any one place. I'm sure that for people who work at Google or Meta it's all very clear, but if you're just now getting into AI... the learning curve, especially on consumer-grade hardware, is steep. Especially if, like me, one is NOT an engineer and has to rely on programs like LM Studio, Stability Matrix or HammerAI. But I'm more than willing to share the little I have learned with you. Cloud models will always be superior to locally run models, but local offers privacy and, in the case of RP, spiciness. Plus, I want to see how hard I can push my hardware.
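In case it's useful, here's roughly what those two knobs look like if you drive llama.cpp directly. A minimal sketch assuming llama-cpp-python (which is what LM Studio and Ollama sit on top of anyway); the model filename is made up:

from llama_cpp import Llama

llm = Llama(
    model_path="some-12b.Q4_K_S.gguf",  # made-up filename, any local GGUF
    n_gpu_layers=-1,    # put every model layer in VRAM
    flash_attn=True,    # free VRAM/RAM savings, IF the model supports it
    offload_kqv=False,  # keep the KV cache in RAM; the model alone fills VRAM
    n_ctx=8192,
)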


That being said, if you want a killer feature given the limited context you offer online, see if you can get a model to generate new intros for existing characters. If you go to Chub you'll see most intros aren't alternatives but continuations of chats from the previous intro. If you can get HammerAI to take the character sheet and the base intro and generate a new intro, you will have sidestepped the core context-window limitation most of your competitors currently face, and made HammerAI's characters second only to the large-context-window behemoths like OpenAI or Grok. (A rough prompt sketch follows.)
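Something like this is what I have in mind. Purely hypothetical - I don't know what HammerAI sends to the model internally:

INTRO_PROMPT = """You are {char_name}. Below are your character sheet and the
greeting you normally open with.

Character sheet:
{character_sheet}

Original greeting:
{base_intro}

Write a brand-new opening scene for a fresh chat. It must stand alone: do not
continue or reference the original greeting, keep the same personality and
setting, and end on a hook the user can respond to."""

def build_intro_prompt(char_name, character_sheet, base_intro):
    # All three inputs already live in the character card, so no chat history is needed.
    return INTRO_PROMPT.format(char_name=char_name,
                               character_sheet=character_sheet,
                               base_intro=base_intro)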

Well, since you asked so nicely, here are my notes and observations after playing with a few offline models:

1. You don't get any info about what kind of checkpoint a local diffusion model is, so unless you want to maintain a database of all released models and checkpoints for offline usage, my suggestion would be NO. Though I'd change the 1024x1024 default you use now.

2. Since you DO know what your online checkpoints are, that's a YES for them. Use 512x512/512x768/768x512 as sane defaults for the SD 1.5 models and 1024x1024/1024x1536/1536x1024 for the rest. The bigger the canvas, the higher the quality of the image. A lot of SD 1.5 and SDXL base models still use hi-res fix because they generate at lower resolutions.

3. I also recommend you change from euler normal to dpm++ 2m beta as the in-chat default for generation (see the workflow sketch after point 5). dpmpp_2m is as fast as euler step-for-step, but it produces better results and is guaranteed to be available everywhere. If you don't want to change the sampler and scheduler separately, because not all interfaces can do that, keep the scheduler at normal; but beta tends to work better with pretty much all text2img models out there, unlike, say, Karras, which literally doesn't work in most non-SD-based models.

4. I also recommend you use square 1:1 image generation for the character image and 2:3/3:2 for in-chat generation.

5. I would also like to point out that for on-personal-hardware generation, your default negative prompts do not work. You might have NovelAI embeds running for the online checkpoints, but they're not there in core ComfyUI. When you generate something, HammerAI only sends, for example, 'person' for the negative instead of sending the full list of negative prompts, namely 'distorted face, unnatural pose, harsh shadows, poor lighting, over-saturated skin tones, extra digits'. My suggestion would be to go to civitai, look at what the standard positive and negative prompts are for each model type you use, and replace the negative prompts you're currently using with a set of defaults for each: SD 1.5, SDXL, Pony, NoobAI, Illustrious, Flux, Imagen, etc. Once a model is selected, and after we put in our prompt (and, if the model supports one, a negative prompt), HammerAI would automatically send the usual positives and negatives in the format:

model/checkpoint-specific positives BREAK the actual prompt we put in (or the one the AI generated for in-chat generation)
model/checkpoint-specific negatives (if applicable) BREAK anything we ourselves put in the negative column
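Sketched as a ComfyUI API-style fragment, that would look something like this. The per-family default strings are made-up placeholders; 'dpmpp_2m' and 'beta' are the actual ComfyUI sampler/scheduler names from point 3:

SDXL_POSITIVE = "masterpiece, best quality"          # hypothetical per-family default
SDXL_NEGATIVE = "lowres, bad anatomy, extra digits"  # hypothetical per-family default

def assemble(user_prompt, user_negative=""):
    # Per-family defaults first, then whatever the user (or the chat model) wrote.
    positive = f"{SDXL_POSITIVE} BREAK {user_prompt}"
    negative = f"{SDXL_NEGATIVE} BREAK {user_negative}" if user_negative else SDXL_NEGATIVE
    return positive, negative

ksampler_fragment = {
    "class_type": "KSampler",
    "inputs": {
        "sampler_name": "dpmpp_2m",  # instead of euler
        "scheduler": "beta",         # instead of normal
        "steps": 25,
        "cfg": 7.0,
        # model/positive/negative/latent_image connections omitted for brevity
    },
}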

This WILL improve image quality over what it is now by a lot; even if you stick with just 1024x1024 for in-chat generation, image quality will improve for most models. Note: some of the newer models might not need any model-specific prompts, but older SD 1.5 and SDXL base models do. That being said, 128, 256 and 512 tend to be the stable divisors for image sizes - 256 preferred over 128, 512 preferred over the other two. Preferred in the sense that the models were trained on images limited to resolutions divisible by them, and the VAEs were trained on sizes based on them as well. It doesn't matter what aspect ratio or size the images originally had; diffusion models were trained on a very limited resized/cropped subset of them, and the best results are obtained by using resolutions that are perfectly divisible by one or all of those numbers (a tiny helper below shows the snapping). I think there's a new Chinese model that can produce images of more arbitrary sizes without any degradation, but I also think it's a gigantic model, so, you know. Note that this is for image generation; image editing has a bit more leeway, but you're only doing generation, so only the former matters to you.
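What "perfectly divisible" means in practice, as a trivial helper. The divisor values are the ones from above; the function itself is just my illustration:

def snap(value, divisor=256):
    # Round a requested dimension to the nearest multiple of the divisor
    # (256 preferred over 128, 512 preferred over both).
    return max(divisor, round(value / divisor) * divisor)

print(snap(1000), snap(1500))  # -> 1024 1536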


ALSO... Mistral NeMo base models are bad at generating prompts from chat... they're old, and most diffusion models came after them. I'd recommend looking into some newer small models to see if you can't find something better at summarizing the chat and creating a prompt for a specific image generator. NeMo-based models currently produce more of a summary of the chat thus far, with a lot of caveats, like including dialogue or complex multi-scene descriptions in the prompts they create for image generation. And NeMo models also tend to be the largest that 8GB of VRAM can fully load; not that any bigger model will be that much better if it's a finetune of a really old model or hasn't been trained on how to structure a t2i prompt for certain models. I really don't know what to suggest here, since I don't know what you're telling the model to do when I click the 'image' button in chat, but I can see the prompt it makes and sends for generation in ComfyUI. You might be able to improve this by changing how you ask for an image-generation prompt to be created (a rough instruction sketch below).
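For instance, an instruction along these lines would rule out the dialogue and multi-scene failure modes. Hypothetical wording - again, I don't know what HammerAI actually sends:

T2I_INSTRUCTION = """Describe ONLY the current visual moment of the scene as a
comma-separated list of image tags: subject, pose, clothing, expression,
setting, lighting. One scene only. No dialogue, no story summary, no events
from earlier in the chat. At most 40 tags."""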

So, I've been playing around with the 230 betas (I think I'm at 233 now), and I really think you should be made aware that diffusion t2i models really don't like to generate images smaller than their default size: 512x512 for SD 1.5 models and 1024x1024 for SDXL and Flux. That doesn't mean you can't diverge a fair bit, but 256x256 really is not a good image size, and that seems to be what the in-chat image generation wants to default to (a tiny clamp sketch below).
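The floor I'm describing, as another trivial helper. The family names and values mirror what I said above; the function is my illustration, not HammerAI code:

BASE_SIZE = {"sd15": 512, "sdxl": 1024, "flux": 1024}

def clamp_to_base(width, height, family):
    # Never hand the sampler a canvas below the model's trained base size.
    base = BASE_SIZE[family]
    return max(width, base), max(height, base)

print(clamp_to_base(256, 256, "sdxl"))  # -> (1024, 1024)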

Sorry, not really an Ollama user. It's just worse than LM Studio if you're not using it as a server. That being said, I just compared the speed and the RAM and VRAM footprint, and they seem to be the same regardless of which KV cache quant I choose. That's not really how these things are supposed to work. Dunno any other way to test it, though. I mainly use GUIs and stay as far away from the terminal as I can. Actually, installing and uninstalling HammerAI's Flatpak is the most usage my terminal has seen in months. :)

As for other inference engines for t2i models: Automatic1111's SD WebUI is still in active development, but it's not the best. Forge has a fork, but it's far from the same quality as the original, since the original Forge was made by a literal genius with a PhD and all. ComfyUI is the preferred inference engine for most desktop and laptop image generation, since it's the most efficient. The easiest way to install any of them on a PC would be Stability Matrix, if you want to do some testing.

Hi, me again. So I've been running the GitHub versions, and in 0.0.228 I'm honestly not seeing any changes between running default Ollama and Ollama with the following environment variables:


OLLAMA_FLASH_ATTENTION=1,OLLAMA_KV_CACHE_TYPE=q8_0,OLLAMA_NO_KV_OFFLOAD=1


or


OLLAMA_FLASH_ATTENTION=1,OLLAMA_KV_CACHE_TYPE=q4_0,OLLAMA_NO_KV_OFFLOAD=1

Is that normal behavior, or just Ollama being its usual shitty self?...


Also, SD WebUI Forge is kinda dead. WebUI is still in active development, as is ComfyUI, but the original Forge is no longer maintained. I think there's a fork, but yeah... for all the improvements it brought, the guy who made it moved on. I think he's the guy who made ControlNet and is now working on video.

For the desktop version: basically, there are two ways to use AI - one where the KV cache slowly builds with each new prompt and reply, and one where the user sends the entire conversation back to the AI to be reprocessed with each new prompt to get a new response. On desktop it's faster to use the KV cache than it is to reprocess the entire conversation again and again. Thing is, the KV cache can live separately from the rest of the model. If the offload-to-VRAM option is used and there is enough VRAM, it's always faster; BUT if there isn't enough VRAM for the desired KV cache size, then part of the model and part of the KV cache are in VRAM, the rest are in RAM, and this is always slower.

If you can fit everything in VRAM you're at, say, 21 tokens per second; with only the model in VRAM and the KV cache in RAM you'd be at around 15; and with part of the model and part of the KV cache in VRAM and the rest in RAM you could go as low as 10 or even 5 tokens per second. So if you can't fit everything in VRAM, it's always preferable to load only the model into VRAM and leave the KV cache in RAM (decision sketch below). For the website, since you're only using a 4k context window, as long as everything fits in VRAM I wouldn't touch it - if it ain't broke, don't fix it and whatnot. But on desktop, letting us keep the KV cache in RAM only, or offload it to VRAM, can significantly increase performance.
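The rule of thumb above, written out. The sizes are made up, and the token/s figures are from my own hardware, not universal constants:

def placement(model_gb, kv_gb, vram_gb):
    if model_gb + kv_gb <= vram_gb:
        return "model + KV cache in VRAM"        # fastest, ~21 tok/s for me
    if model_gb <= vram_gb:
        return "model in VRAM, KV cache in RAM"  # ~15 tok/s
    return "model split across VRAM and RAM"     # worst case, 10 or even 5 tok/s

print(placement(model_gb=7.0, kv_gb=2.5, vram_gb=8.0))  # model fits, cache doesn't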


As for recommended models: I'd say move the Nous models to Hermes 3 (non-thinking), look into the ArliAI RPMax v1.3 series of models (4 models at 4 sizes, based on 3 different bases: Llama 3.1, Qwen 2.5 and Mistral NeMo), and the latest LatitudeGames models. I'm using Wayfarer 12B for RP and Muse 12B for story writing (both LatitudeGames models), but they have larger models too - and again, all open source and on Hugging Face. DreamGen is also doing interesting stuff, but their older stuff is, well, older, and the new model - Lucid - is still in beta and fairly bad at following instructions.

But yeah, try Wayfarer; at least for me it's significantly superior to the Drummer Rocinante you have as a default option. I get actual RP responses from it, while Rocinante 12B just wants to continue my own posts 90% of the time. Also, I'd probably remove the thinking models from the default options. Honestly, most people are not going to have the kind of hardware to run them at high enough speeds to make the thinking steps worth it - at least not on desktop. Especially the smaller models, which even unquantized can still catch themselves in an infinite thinking loop.

Overall, I'd try to find finetunes and test them if I were you. What I recommended is what I tested and found to be an improvement over what came before. I'd stay away from merges, ablated and uncensored models; just try to find RP and story finetunes that are open source and on Hugging Face to test. Also, and you did not hear this from me, try IBM's Granite 3.3 8B model... for a model designed for office work, and instruction-trained to be harmless and safe, boy does it follow NSFW instructions well. And I do mean NSFW. And it's Apache 2.0 :)


As for IQ quants, they can offer quality similar to K quants at smaller sizes - but they are only similarly fast under ROCm and CUDA, with significant slowdowns under Vulkan and CPU. I know Ollama supports them, though I don't think you can download IQ quants from their site directly. An IQ4_XS should be very similar to a Q4_K_S in output - within margin of error for RP and story purposes - but substantially smaller.
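That tradeoff as a trivial picker, if it helps. The backend names are illustrative:

def pick_quant(backend):
    # IQ quants need extra decode work that CUDA/ROCm absorb; Vulkan/CPU don't.
    if backend in ("cuda", "rocm"):
        return "IQ4_XS"  # near Q4_K_S quality, substantially smaller file
    return "Q4_K_S"      # on Vulkan/CPU, stick to K quants for speed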

Yes and no. On the one hand, when the models run, they do run better; on the other hand, it's still impossible to run the model set as the character model, and I have to use the custom model option. The good news is that it does run faster, and that, at least with custom models, I haven't encountered the issues I had before where it just wouldn't run.


BTW, did you put the Linux AMD ROCm support in just for me, as your one known Linux user, or is HammerAI actually detecting that I have an AMD AI-capable CPU with an iGPU besides the Nvidia GPU it's using right now? Because if the latter, that's actually impressive - ROCm support is so spotty on Linux it might as well not be there. The 780M from AMD is a lot weaker than the 4060, so I don't think it will see much usage, but I might try bigger models just to see how it behaves, if Hammer can actually use the AMD iGPU natively.

PS. Please add a few newer RP models. Some of your competitors have finetunes under open source licenses: ArliAI, LatitudeGames, DreamGen. Please add a few newer NeMo finetunes. Also, and this is just up to you, consider IQ quants, and don't offload the KV cache to VRAM. IQ quants can make 8GB fully enough to 100% offload most non-Gemma 12B models, as long as one doesn't also try to offload the KV cache to VRAM. That's in case you're not doing this already.


Anyways, cheers and thanks for the new Ollama update, it did in fact help.

I have to say, after I played with it for a while, it's rather disappointing. The usage of Ollama makes the performance far worse than it could otherwise be, and the default models never work correctly. Sometimes they work, sometimes they just decide to give you nothing, or a letter, or 2-3 words, and then stop generating. Nothing but closing the app can save the performance at that point. The available models are also far from the best RP models out there, with many being neither RP nor story models. The website is decent, but the promise of a desktop application is not fulfilled by this app. It also doesn't help that the Ollama backend is slower, both to first token and to generate, than alternatives like KoboldCpp and LM Studio. Overall, a disappointing program that would probably have potential were it not so fundamentally based around Ollama.


But this is just after a week of random usage; I'm sure other people who have been using it for longer can give both more constructive criticism and more pertinent reviews. Better than SillyTavern, on account of being actually designed for desktop/laptop use as opposed to server usage, but that's about it.

Not really, they work. I would probably prefer a more LM Studio-like approach, where we can select whether to run local models via CUDA, Vulkan or CPU, but no, the application works.

That being said, just out of curiosity, aren't you running the website on Linux servers? I mean, it's probably a Docker container, but still, I'd be actually shocked if you told me you're running the website on Windows servers.

Stupid question: why provide a Flatpak instead of an AppImage if you're not going to use Flathub?

From Solus here, and I can confirm the same grey screen bug. Tried it on a laptop: Solus (Linux kernel 5.14), i5-9300H, Nvidia GTX 1650 (with the Nvidia driver), on an SSD.

I'd like to disagree with you on one point. PC games weren't always this update-prone.

30 years ago, most games, PC or otherwise, did not have updates.

25 years ago some PC games would sometimes get an update.

20 years ago, because of increased internet speeds, updates became the norm for a lot of PC titles, but most games were still finished at release.

15 years ago, updates became a thing on console, not because of higher internet speeds but because of the ubiquity of broadband.

10 years ago, or around that time, is when the shit hit the fan. Not only were microtransactions booming thanks to a 2009 Horse Armour game, always-online DRM was becoming the norm, and day-one patches were starting to become the norm on PC and console. And that's all ignoring Kickstarter and Steam Greenlight.

5 years ago is when indies went to Patreon/early access, and AAA started going all-in on microtransactions. Not a year has passed since then in which we did not see yet another AAA studio saying they'd shift their focus towards lootboxes and microtransactions.

The good news is indies are blooming. The bad news is that they're doing it either on Kickstarter/Indiegogo, on Patreon, or via early access. The really bad news is that for every honest dev we get a dozen who either quit or are just scamming people.


And with porn being treated as a second-class citizen in most stores, online or otherwise, and with the state of the game industry being where it is, it's no wonder a lot of devs have no idea how to make a good game, or the proper way to develop anything, let alone a game. Games like Deviant Anomalies are rare, in that they're porn games and good games at the same time. But considering how few porn games ever get finished, regardless of how much money gets poured into them on Patreon, and how some of the 1.0 games play... well, let's say I'm hoping for the best but not expecting much.

I actually deleted my F95 account after a spat with someone on there regarding the topic of such games being released episodically. For some (most?) devs, episodes are nothing more than internal chapters, and every "new" episode is just the latest version of the game. For others, an episode is just that: a standalone short game that continues from where their previous game left off. The 1.0 debate is the same. For some (again, most), 1.0 means final release, but for some devs 1.0 is just an arbitrary point that means nothing. That's why I was concerned. I like the direction of this game, but if it hits 1.0 with this little content, I'm out. The same goes for games whose devs suddenly decide to split the game into an old version and a new version - see My New Family for an example of that.


As for AAA games doing shit like this... well, that's our fault, frankly. If we keep preordering, buying games before reviews are out, and not asking for refunds for newly released games that don't work, in the hope that they might one day work, the only people to blame are us, not the corporate execs overworking developers and deciding to release games before they're ready.

Normally I'd be with you on that, but some devs think the number after 0.9 is 1.0, regardless of the state of the game. Just look at Lord King as an example of someone doing just that fairly recently.

Decent game and decent update, though I think calling this 0.4.3 is a bit too optimistic. This seems to be more of a sandbox/visual novel combo than a pure VN, and I doubt anybody, the author included, will be satisfied with it ending this soon and with not enough cases for us to solve. Just my 2 cents.

She was fired two days before her birthday. It's on the back of the photo... now if you haven't found the photo yet...

I really wish you had gone with 0.10 as opposed to 1.0.

Rereading my comment, I'd like to clarify something. When I say this series showcases your growth, I mean that literally. The last game of the series is significantly better. Even if the models used are the same old ones, the way they are posed, the way they're used, the diversity of backgrounds and models, and the writing all see a significant improvement. You did in fact become a better game developer over the course of this series. I know some of your newer stuff has better models, though simpler stories. Here's hoping that once you get the hang of the newer, better-looking assets, you try your hand at another story like you did in this series.

Interesting series, though considering the quality of the assets, the images are a bit too large - and the games a bit too large - for what we're playing. That being said, you do seem to be one of the few adult game creators who actually finishes his games... even if they're short enough to mostly count as demos.

Should you ever decide to move on to bigger, better things, as it were, remember this series, because while the art and story might not be great - hell, sometimes they fall beneath mediocre - not only does this series showcase your growth as an artist, but as an author and storyteller. It would be interesting to see you tackle this subject a second time, but this time as just one game with a few choices.

Were it not for the - in my opinion - way too CG-looking demons, this would be one of the best games of its kind out there. It might just be me, but the demon CGIs just take me out of it. Otherwise it's a 10/10. Probably the best game of its kind on itch.io.

Game version: Version 34

OS: Linux, distro Solus, DE Budgie.

Bug description: Title pretty much says it all. Game opens, then crashes, taking the DE with it, forcing it to restart.

Way to replicate: Downloaded, unzipped, checked to see if the executable had permission to run - it did. Tried to open it and it crashed to desktop, taking Budgie with it. Tried again and it forced a log-off, since the system couldn't recover.

The second attempt was via terminal. I got one or two dozen errors, with a final warning that one of the files it had loaded - object something-or-other - hadn't been successfully unloaded and was still in memory. A second attempt to open it took the entire system with it.


Unfortunately, I erased the game before thinking of filing this, and since the game doesn't work to begin with, I'm in no mood to redownload it just to make a copy of the error messages. Sorry, but I won't be any more help.

I dunno if I agree with the statement "Solid graphics quality." from the game's description, but I do know one thing: this game is funny. The dialogue is actually captivating and funny. It's not perfect, but it's a step or two above other games in the same genre. I can't believe I'm writing this, but this is actually one of those games I'd recommend not so much for the porn as for the dialogue.

I'm not complaining that the incest was removed; I'm complaining about how illogical and unreadable the text is now that the incest has been removed. This is a visual novel with just a few choices - though they tend to be major choices. Because of the way the incest was bruteforced out of the game, the text now makes no goddamn sense.

Because in a lot of games incest isn't the main theme, and you can choose not to engage with that part of the game.

The incest removal is so clunky it's actually in LOL territory. The game is almost unplayable because of the aforementioned clunkiness. While I'm not a fan of incest, I dunno if this was that great of a move - at least not in the brute-force way it was done. I take it it was imposed by Patreon or itch.io rules?

On the one hand, I pity the dev; he's going to have one hell of a time rewriting the game's dialogue thus far. On the other, I really don't care for incest, so in the long run this will only make the game more playable for people like me.