> Sure, not related to DSPy though, and completely tablestakes.

I agree, but you'd be surprised how many people will argue against static typing with a straight face. It's happened to me on at least three occasions, and each time the usual suspects were trotted out: "it's quicker", "you should have tests to validate anyhow", "YOLO polymorphism is amazing", "Google writes Python so it's OK", etc.

It must be cultural, as it always seems to be a specific subset of Python and ECMAScript devs making these arguments. I'm glad that type hints and TypeScript are gaining traction, as I fall firmly on the other side of this debate. The proliferation of LLM coding workflows has likely accelerated adoption, since types provide such valuable local context to the models.

reply
> not sure why the whole article assumes the only language in the world is Python

https://github.com/ax-llm/ax (if you're in the TypeScript world)

reply
>the whole article assumes the only language in the world is Python.

This was my take as well.

My company recently started using DSPy, but you know what? We had to stand up an entirely new repo in Python for it, because the vast majority of our code is not Python.

reply
I think this is an important point! I am actually a big fan of doing what works in the language(s) you're already using.

For example: I don't use DSPy at work! And I'm working in a primarily dotnet stack, so we definitely don't use DSPy... But still, I see the same patterns seeping through that I think are important to understand.

And then there's the question of "how do we implement these patterns idiomatically and ergonomically in our codebase/language?"

reply
Out of curiosity, what are you finding success with in dotnet land? My observation is that it's not clear when Semantic Kernel is recommended versus one of the multiple other newly branded MSFT creations.
reply
Agent Framework + middleware + source generation is the way to go.

Agent Framework made middleware much easier to work with.

Source generation makes it possible to build "strongly typed prompts"[0]

Middleware makes it possible to substitute those at runtime if necessary.

[0] https://github.com/CharlieDigital/SKPromptGenerator/tree/mai...
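As a rough, library-agnostic analogue of the "strongly typed prompt" plus "swappable middleware" ideas above (this is not the C# source generator; it's a hypothetical Python sketch, and all names here are made up):

```python
from dataclasses import dataclass
from string import Template
from typing import Callable

# Hypothetical sketch: the prompt's inputs are declared as typed fields,
# and the LLM call is a swappable seam (the middleware role described above).

@dataclass
class SummarizePrompt:
    text: str
    max_words: int
    _template = Template("Summarize in at most $max_words words:\n$text")

    def render(self) -> str:
        return self._template.substitute(text=self.text, max_words=self.max_words)

def run(prompt: SummarizePrompt, llm: Callable[[str], str]) -> str:
    # `llm` can be substituted at runtime (a real client, a cheaper model, a stub in tests).
    return llm(prompt.render())

# Usage with a stub in place of a real model call:
stub = lambda p: p.splitlines()[0]
print(run(SummarizePrompt(text="hello world", max_words=5), stub))
# → Summarize in at most 5 words:
```

The point is only that typed fields catch missing/mistyped prompt inputs at build time, and the call seam lets you swap implementations without touching call sites.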

reply
We have been using Agent Framework. I've also been eyeing LlmTornado. Personally, I find it hard in dotnet as a whole to build the kind of abstractions I want in order to make implementing AI stuff ergonomic.

I've been fiddling around with many prototypes to try to figure out the right way to do this, but it feels challenging; I'm not yet familiar enough with how to do this ergonomically and idiomatically in dotnet haha

reply
Why did you do that instead of using Liquid templates?
reply
I think all of these things are table stakes, yet I see them implemented/supported poorly across many companies. All I'm saying is that there are some important patterns here, and it makes sense to go into building AI systems understanding them (whether or not you use DSPy) :)
reply
I can say that for 10 years I have been looking at general-purpose frameworks like DSPy (I even wrote one at work), and they tend to be pretty bad, especially the one I wrote.

I agree with all the points they list, but I fear that if I looked closely at the code and how they did it, I wouldn't stop cringing until I looked away. Frameworks like this tend to point out ten concerns you should be worried about but aren't, and they make users learn a lot of new stuff to bend their work around the framework; yet users rarely come away with a clear understanding of what the concerns are, or where exactly the framework's value comes from.

That is, if you are trying to sell something, you can do a lot better with something crazy and one-third-baked like OpenClaw, which will make your local Apple Store sell out of minis, than with anything that rationally explains "you are going to have to invent all the stuff in this framework that looks like incomprehensible bloat to you right now." I mean, it is rational, it is true, but I can say empirically, as a person who sells things, that it doesn't sell; in fact, if you wanted me to make a magic charm that looks like it would sell things but makes sure you don't sell anything, it would be that.

reply
yeah, the point I want to get across is less "you should use DSPy" and more "understand DSPy, so you are intentionally implementing the capabilities you need"

Implementations are always going to be messy; still, I feel like not all of the messiness is essential. A lot of it is accidental :)

reply
DSPy's advertising aside, IMHO it is a library only for optimizing an existing workflow/prompt, not for the use cases described there. Similar to how I would not write "production" code with sklearn :)

They themselves are turning into wrapper code for other libraries (e.g. the LLM abstraction, which litellm handles for them).

Can also add:

Option 3: Use instructor + litellm (probably Pydantic AI too, but I have not tried that yet)

Edit: As others pointed out, their optimization algorithms are very good (GEPA is great and lets you easily visualize/track the changes it makes to the prompt)

reply
The sklearn comparison, to me, mirrors the insane amount of engineering that exists/existed to bring Jupyter notebooks to something more prod-worthy and reproducible. There's always going to be re-engineering of these things; you don't need to use the same tools for all use cases.
reply
Hmm, not quite what I meant. Sklearn has its place in every ML toolbox; I'll use it to experiment and train my model. For deployment, however, I can e.g. just grab the model's weights and run them with numpy in production, without the heavy dependencies that sklearn adds.
reply
In my experience, the behavioral variation between models and providers is large enough that the "one-line swap" idea only holds for the simplest cases. I agree the prompt lifecycle is the same as code, though. The compromise I've landed on is text templates checked in with the rest of the code (Handlebars, but it doesn't really matter), with some structure enforced by a wrapper that takes as inputs the template name + context data + output schema + target model, and internally papers over the behavioral differences I'm OK with ignoring.

I'm curious what other practitioners are doing.
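For what it's worth, the wrapper described above might be sketched like this (all names hypothetical; `string.Template` stands in for Handlebars to keep the example stdlib-only, and in practice the template sources would live in files checked in with the code):

```python
import json
from string import Template

# Templates versioned alongside the code, keyed by name.
TEMPLATES = {
    "summarize": "Summarize for $audience:\n$text",
}

def render_prompt(name: str, context: dict, output_schema: dict) -> str:
    prompt = Template(TEMPLATES[name]).substitute(context)
    # Appending the schema keeps structured-output handling uniform across
    # templates; per-model quirks would be papered over behind the call site.
    return prompt + "\n\nRespond with JSON matching:\n" + json.dumps(output_schema)

print(render_prompt(
    "summarize",
    {"audience": "executives", "text": "Q3 numbers..."},
    {"type": "object", "properties": {"summary": {"type": "string"}}},
))
```

The `call(name, context, output_schema, model)` layer would then dispatch to the target provider, which is where the model-specific differences get absorbed.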

reply
Model testing and swapping is one of the pleasant surprises people really appreciate DSPy for.

You're right: prompts are overfit to models. You can't just change the provider or target and know that you're giving it a fair shake. But if you have eval data and have been using a prompt optimizer with DSPy, you can try models with the one-line change followed by rerunning the prompt optimizer.

Dropbox just published a case study where they talk about this:

> At the same time, this experiment reinforced another benefit of the approach: iteration speed. Although gemma-3-12b was ultimately too weak for our highest-quality production judge paths, DSPy allowed us to reach that conclusion quickly and with measurable evidence. Instead of prolonged debate or manual trial and error, we could test the model directly against our evaluation framework and make a confident decision.

https://dropbox.tech/machine-learning/optimizing-dropbox-das...

reply
It's not just about fitting prompts to models; it's things like how web search works, how structured outputs are handled, various knobs like the level of reasoning effort, etc. I don't think the DSPy approach is bad, but it doesn't really solve those issues.
reply
funnily enough, the model switching is mostly thanks to litellm, which DSPy wraps.
reply