> I've tried using qwen and deepseek but they can't even output documents

What agent harness did you use? Usually, "write_file" and "shell_exec" (or similar) are two of the first tools you add to an agent harness, after read_file/list_files. If it doesn't have those tools, I'm unsure you could even call it an agent harness in the first place.
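
To make the tool names above concrete, here is a minimal sketch of what those four tools might look like in a Python harness. The `TOOLS` registry, the `dispatch` function, and the `{"name": ..., "arguments": {...}}` call shape are assumptions for illustration, not the API of any particular harness.

```python
import subprocess
from pathlib import Path


def read_file(path: str) -> str:
    """Return the text content of a file."""
    return Path(path).read_text(encoding="utf-8")


def list_files(directory: str) -> list[str]:
    """Return the sorted names of the entries in a directory."""
    return sorted(p.name for p in Path(directory).iterdir())


def write_file(path: str, content: str) -> str:
    """Write text to a file, creating parent directories as needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {path}"


def shell_exec(command: str, timeout: int = 30) -> str:
    """Run a shell command and return its combined stdout and stderr."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr


# Hypothetical registry: the harness looks up the tool name the model asked for.
TOOLS = {
    "read_file": read_file,
    "list_files": list_files,
    "write_file": write_file,
    "shell_exec": shell_exec,
}


def dispatch(tool_call: dict) -> str:
    """Execute one tool call shaped like {"name": ..., "arguments": {...}}."""
    tool = TOOLS.get(tool_call["name"])
    if tool is None:
        return f"error: unknown tool {tool_call['name']!r}"
    return str(tool(**tool_call["arguments"]))
```

A real harness would also describe each tool to the model (name, parameters, purpose) and feed the returned string back as the tool result; that part depends on the model API and is left out here.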

by jxmesth 17 hours ago

Sorry for the confusion, I was actually talking about their web-based chat. Since most of my work is governance and docs, I just use their web chats, and they just refuse to output proper documents like Claude or ChatGPT do.

by embedding-shape 17 hours ago

Aha... Well, I let Codex (Claude Code would work too) manage/troubleshoot .xlsx files, and it seems to handle them just fine (it tends to un-archive them and browse the resulting XML files without issues). I've seen it do similar stuff for .app and .docx files too, so maybe give that a try with other harnesses/models; they might get it :)
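
The un-archiving trick works because .xlsx and .docx files are ZIP archives of XML parts. As a rough sketch of what the agent is effectively doing, using only Python's standard `zipfile` module (the helper names `list_xml_parts` and `read_part` are made up for this example):

```python
import zipfile


def list_xml_parts(archive_path: str) -> list[str]:
    """Return the names of the XML parts inside a .xlsx/.docx archive."""
    with zipfile.ZipFile(archive_path) as archive:
        return [name for name in archive.namelist() if name.endswith(".xml")]


def read_part(archive_path: str, part_name: str) -> str:
    """Return one part of the archive decoded as UTF-8 text."""
    with zipfile.ZipFile(archive_path) as archive:
        return archive.read(part_name).decode("utf-8")
```

For a typical workbook you would expect parts such as `xl/workbook.xml`, and for a Word document `word/document.xml`; the exact set varies by file.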

by jxmesth 6 hours ago

Yeah, it's just way easier to do via the web/mobile app but I'll give using it via the CLI a try. Thanks :)

by noduerme 15 hours ago

You're not giving an AI command line access to your work computer? How do you expect to keep up? /s

by dymk 15 hours ago

You give it command line access in a VM...

by ycui1986 11 hours ago

I give it access on real Ubuntu, no VM, no Docker. As long as I don't ask it to organize files, it behaves. It hasn't screwed me so far.

by dymk 11 hours ago

Godspeed

by koen_hendriks 14 hours ago

You mean a VM like the ones where a sandbox-escaping 0day gets found every year at Pwn2Own?

by enneff 13 hours ago

Presumably you’re also using a browser to view this web page. There have also been vulnerabilities in that. You have to draw a line somewhere.

by andai 13 hours ago

I run mine as a separate unprivileged user. (No VM.) Am I pwned?

by dymk 11 hours ago

Maybe, but the sort of 0days you're talking about aren't exploited in any meaningful way for almost all developers.

by arcanemachiner 9 hours ago

"Seatbelts don't save the life of everyone who gets into an accident, so why bother wearing one?"

by chillfox 11 hours ago

You can make a harness fully functional with just the "shell_exec" tool if you give it access to a Linux/Unix environment + the Playwright CLI.
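
As a sketch of why one shell tool can be enough: reading, listing, and writing files can all be expressed as shell commands. This assumes a POSIX shell with `cat`, `ls`, and `printf` available; the `*_via_shell` helper names are invented for illustration, and browser automation is not shown.

```python
import shlex
import subprocess


def shell_exec(command: str, timeout: int = 30) -> str:
    """Run a shell command and return its combined stdout and stderr."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    return result.stdout + result.stderr


def read_via_shell(path: str) -> str:
    """Read a file by asking the shell to cat it."""
    return shell_exec(f"cat {shlex.quote(path)}")


def list_via_shell(directory: str) -> str:
    """List a directory, one entry per line."""
    return shell_exec(f"ls -1 {shlex.quote(directory)}")


def write_via_shell(path: str, content: str) -> str:
    """Write text to a file using printf and a redirect."""
    return shell_exec(f"printf %s {shlex.quote(content)} > {shlex.quote(path)}")
```

In practice the model would compose these commands itself; the helpers only show that nothing beyond `shell_exec` is strictly required. Dedicated file tools are still easier to restrict and audit than a raw shell.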

by ecocentrik 18 hours ago

When was the last time you used Qwen models? Their 3.5 and 3.6 models are excellent with tool calling.

by jxmesth 17 hours ago

I gave it a try a few weeks ago tbh, I'll give it another shot tho. I mainly use their web chats since that's easier, and previously Qwen, DeepSeek, and Kimi were all unable to output proper docx files or use skills.

by ecocentrik 17 hours ago

Try loading the models up in a coding harness like Claude Code. There are a few docx skills listed on Vercel's skill index.

https://skills.sh/tfriedel/claude-office-skills/docx

by ycui1986 9 hours ago

Outputting docx files does not have much to do with model capability. It is about whether tool calling has been configured.
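
To illustrate the point, here is a minimal sketch of a docx-writing tool a harness could expose: the model only supplies paragraph text, and ordinary code builds the file. It uses just the standard library and writes the three parts a bare WordprocessingML package needs. The function name `write_docx` is an assumption, and the output has not been checked against every reader.

```python
import zipfile
from xml.sax.saxutils import escape

CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    "</Types>"
)

ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    "</Relationships>"
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def write_docx(path: str, paragraphs: list[str]) -> None:
    """Write a bare-bones .docx with one plain-text run per paragraph."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", ROOT_RELS)
        archive.writestr("word/document.xml", document)
```

A call such as `write_docx("notes.docx", ["Title", "First paragraph."])` produces a file with no styles or formatting; anything richer is usually easier with a dedicated library or a docx skill.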

by sscaryterry 17 hours ago

You can use GLM-5.1 with Claude Code directly. I use ccs with GLM-5.1 set up as plan, but it goes via API key.

by zrn900 2 hours ago

You can just use Cline in VSCode to get most of the tooling you need. It works with all models, including Xiaomi's new Mimo with its 1M context window and blazing-fast speed. It's much cheaper than Claude's biggest plan, with much, much more quota.

by NobleLie 13 hours ago

Yep Claude Code CLI does A LOT (which is now confirmed even more)

by jwitthuhn 18 hours ago

I've been using qwen-code (the software, not to be confused with Qwen Code the service or Qwen Coder the model) which is a fork of gemini-cli and the tool use with Qwen models at least has been great.

by ycui1986 11 hours ago

qwen3.5 and qwen3.6 are both good at tool calling.

by estimator7292 16 hours ago

You can use both Codex and the Claude CLI with local models. I used Codex with Gemma4 and it did pretty well. I did get one weird session where the model got confused and couldn't decide which tools actually existed in its inventory, but usually it could use tools just fine.