I love hearing this.

My story: mostly business analytics (2005-2022), sales engineering, sales (both at the same tech startup), and now running a solo consulting business.

I also really liked sales. Updating a CRM, not so much. But sales allowed me to spend my day talking with people about problems. No two days the same, and lots of focus on finding different/better ways to communicate.

In what industries did these roles happen? Same industry/domain or have you changed that as well?

reply
Domain is all government, but the tech is different across each of them.

I love talking too, part of why I think pre-sales is a lot of fun. And I actually love my CRM work from a data perspective, since my background is in synthesizing data and optimization. Once I turned my sales process into a network optimization problem, it became extremely interesting to me, and keeping the data current became imperative.

reply
Sales is amazing, but if your company's sales people often require engineering to build POCs, or always have to sell custom solutions, it wastes a lot of resources and usually indicates the company is losing product-market fit.
reply
That is true. My current work is in bespoke environments with mainly non-technical buyers who have been burned in the past. Our POCs are pretty minor lifts to build credibility and have worked extremely well.

If you're working in SaaS or commodity products and have to run POCs a lot, you're totally correct.

reply
Can you share one such puzzle?
reply
I am a solutions engineer, mostly on the traditional ML side of things, but with good knowledge of K8s/GKE. The most fun I had last year was helping a customer serve their models at scale. They thought it was cost prohibitive (500k inferences/second with a hard requirement of 7ms at p99), so they were serving from a cache that was both lossy (the combinatorial explosion of features meant full coverage would have needed exabytes of RAM) and prone to staleness.

We focused on serving first. After their data scientists trained a new PyTorch model (a small one, roughly 50k parameters), we compiled it to ONNX (since the model is small, CPU inference is actually faster), grafted the preprocessing layers onto the model so that inference never leaves the ONNX C++ runtime (avoiding Python entirely), and deployed it to GKE. An 8-core node with AMD Genoa CPUs managed 25k inferences per second. After some fiddling with NUMA affinity, GKE DNS replication, Triton LRU caches, and a few other things, we hit 30k inferences per second. Scaled up to their traffic, it would cost them a few thousand dollars per month, which is less than their original cache approach.

Now they are working on continuous learning so that they can roll out new models (it is a very adversarial line of business and the models go stale in O(hours)). For that part I only helped them design the system, no hands-on work. It was a super fun engagement, TBH.

reply