Yeah, I tried that and got a CUDA out of memory error. Is there a workaround? I have more than enough shared memory.
Tried 150 iterations at 256x256 with one prompt. It keeps outputting black screens. What could I be doing wrong?