It has a JSON mode that you need to enable in settings. After that you can write a simple Python script to interact with it, or have the agent use `curl` and `jq`.

It's at the bottom of this page: https://docs.searxng.org/admin/settings/settings_search.html
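
For anyone who wants the Python route, here is a minimal sketch of such a script. It assumes the `json` format is already enabled and that the instance listens on `http://localhost:8888` (an assumed address, adjust to yours); `build_search_url` and `search` are illustrative helper names, not part of SearXNG.

```python
import json
import urllib.parse
import urllib.request

# Assumption: SearXNG is reachable here; change to your own host and port.
SEARXNG_URL = "http://localhost:8888"


def build_search_url(query: str, base_url: str = SEARXNG_URL) -> str:
    """Return a /search URL that asks SearXNG for JSON instead of HTML."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def search(query: str, base_url: str = SEARXNG_URL, timeout: float = 10.0) -> list[dict]:
    """Fetch results for a query.

    urllib raises HTTPError on an error status; a 403 here usually means
    the json format has not been enabled in settings.
    """
    with urllib.request.urlopen(build_search_url(query, base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    # Each result is a dict; "title", "url" and "content" are the usual keys.
    return payload.get("results", [])
```

Calling `search("llama.cpp mcp")` should give back a list of result dicts. The `curl` equivalent is roughly `curl 'http://localhost:8888/search?q=test&format=json' | jq '.results[].url'`.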

reply
I am also interested in what a full local AI stack with web search and other tools looks like. As far as I can tell, SearXNG does not embed an MCP server, so it can't be called directly from llama-server, for example. Open WebUI does have an integration for SearXNG and other providers, but the results I got weren't particularly impressive.
reply
I use SearXNG through Onyx, both for regular search and for Onyx's Deep Research mode. I also use https://github.com/ihor-sokoliuk/mcp-searxng to add search to coding agents. Haven't really had many issues with it.
reply
are you running a quant?

i have a friend with a 4080 who wants to experiment with local models, and those cards should be similar enough. can you give any more detail about your setup? ty!

reply
Yep -

`gemma4-26b-a4b-it-qat.gguf`

https://huggingface.co/lmstudio-community/gemma-4-26B-A4B-it...

It is really great to use. As the poster above mentioned, my setup with SearXNG is the following, all through `llama.cpp`, which has a built-in webui with an MCP client:

* SearXNG in Docker — enable the JSON API (`search.formats: [html, json]`; off by default).

* `searxng-mcp` (FastMCP, native streamable-HTTP): `TRANSPORT=streamable-http HOST=127.0.0.1 PORT=8100 SEARXNG_URL=http://localhost:8888 uvx --from searxng-mcp --with fastmcp searxng-mcp`

* `llama-server` with `--webui-mcp-proxy`, then add the server in the webui.

Some gotchas:

* `searxng-mcp` forgets to declare its own dep → `--with fastmcp`.

* Endpoint is `/mcp`, not the `/searxng-mcp/mcp` the docs claim.

* `--webui-mcp-proxy` only enables the CORS proxy; each MCP server entry still needs its "Use llama-server proxy" checkbox ticked, or the browser fetches direct and CORS-fails.

* Terminal clients (OpenCode etc.) skip the proxy — point them straight at `:8100/mcp`.
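
If you are unsure which path your server actually exposes, a quick probe helps. This is a rough sketch that sends a standard MCP `initialize` request over streamable HTTP. The URL mirrors the host, port and path from the setup above; the `2025-03-26` protocol version and the helper names `build_initialize_request` and `probe` are my own assumptions, not something taken from `searxng-mcp`.

```python
import json
import urllib.request

# Assumption: host, port and path match the setup described above.
MCP_URL = "http://127.0.0.1:8100/mcp"


def build_initialize_request(url: str = MCP_URL) -> urllib.request.Request:
    """Build the JSON-RPC `initialize` POST that opens an MCP session."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed; use what your client speaks
            "capabilities": {},
            "clientInfo": {"name": "endpoint-probe", "version": "0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def probe(url: str = MCP_URL, timeout: float = 5.0) -> int:
    """Return the HTTP status; urllib raises HTTPError on 4xx/5xx such as 404."""
    with urllib.request.urlopen(build_initialize_request(url), timeout=timeout) as resp:
        return resp.status
```

With the server running, `probe()` should return 200 for the right path, while a wrong path raises `urllib.error.HTTPError` (typically a 404).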

A couple interesting tidbits:

* There are temporal issues with search-related tool calls: the model trips out. Results from 2026 read to it as a "future-dated hallucination" because it doesn't know the current date. There's an additional `--tools get_datetime` function that lets it ground itself in the real date.

* Snippets-only is enough for most "what's current" questions and keeps context tiny.
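
To make both tidbits concrete, here is a small sketch of snippets-only context with a date line on top. It is not how `get_datetime` or `searxng-mcp` work internally; it is a hand-rolled alternative for scripts that call the JSON API directly. `format_snippets` is a hypothetical helper, and the `title`, `url` and `content` keys are assumed to follow SearXNG's JSON results.

```python
from datetime import datetime, timezone


def format_snippets(results, now=None, max_results=5, max_chars=300):
    """Render a compact, dated context block from SearXNG-style result dicts."""
    now = now or datetime.now(timezone.utc)
    # Stating the date up front lets the model judge how recent results are.
    lines = [f"Current date (UTC): {now:%Y-%m-%d}"]
    for i, r in enumerate(results[:max_results], start=1):
        # Collapse whitespace and truncate so each snippet stays small.
        snippet = " ".join((r.get("content") or "").split())[:max_chars]
        lines.append(f"{i}. {r.get('title', '')} ({r.get('url', '')})")
        if snippet:
            lines.append(f"   {snippet}")
    return "\n".join(lines)
```

Capping both the number of results and the snippet length keeps the context roughly bounded no matter what the search returns.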

Let me know if you have any questions!

reply
Here is my write-up on my local model setup: https://www.zarl.dev/posts/hal-by-any-other-name. I also have https://zarldev.github.io/zarlmono/ as my local-first coding agent.