I have found llama-fit sometimes just selects a way too conservative load, leaving VRAM to spare.