Thanks! Yeah, the single-species focus does a lot of the work. Under the hood it's not one big model - there's a cannabis verification gate, then routing into disease vs pest vs deficiency, then narrower classifiers from there. Each one has a simpler job so accuracy stays high.
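
To make the cascade concrete, here is a minimal sketch of the gate, router, and specialist flow. It is an illustration under assumptions, not the actual implementation: the `diagnose` function, the `(label, confidence)` return shape, and the `min_conf` threshold are all hypothetical, and each classifier is just a callable standing in for a trained model.

```python
from typing import Callable, Dict, Tuple

# Hypothetical interface: a classifier maps an image to (label, confidence).
Classifier = Callable[[object], Tuple[str, float]]


def diagnose(
    image: object,
    gate: Classifier,
    router: Classifier,
    specialists: Dict[str, Classifier],
    min_conf: float = 0.8,  # assumed threshold, not from the thread
) -> str:
    # Stage 1: species verification gate. Anything else is rejected early.
    label, conf = gate(image)
    if label != "cannabis" or conf < min_conf:
        return "rejected: not confidently the target species"

    # Stage 2: coarse routing into disease / pest / deficiency.
    category, conf = router(image)
    if conf < min_conf or category not in specialists:
        return "uncertain: could not route"

    # Stage 3: a narrow classifier that only handles its own category.
    finding, conf = specialists[category](image)
    if conf < min_conf:
        return f"uncertain: {category}"
    return f"{category}: {finding}"


# Usage with stub classifiers in place of real models.
stub_gate = lambda img: ("cannabis", 0.99)
stub_router = lambda img: ("deficiency", 0.90)
stub_specialists = {"deficiency": lambda img: ("nitrogen", 0.95)}
result = diagnose(None, stub_gate, stub_router, stub_specialists)
```

Returning an explicit "uncertain" at each stage is one design choice among several; it keeps a low-confidence routing decision from being passed down to a specialist that was never trained on that kind of image.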

Early on the photography thing was a real problem. Training data was mostly decent shots, then inference would come in as some blurry phone photo under purple LEDs.

The result was confident misclassifications. The fix wasn't clever - just more data that looks like how people actually take photos of their plants. Messy, badly lit, half the leaf out of frame. Once there was enough of that in the training set the models stopped caring about white balance. About 1.1 million augmented images now and light temperature just isn't a factor. No color card needed.
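
For anyone wondering what "augmented" can look like in practice, here is a rough sketch of two transforms that mimic the problems described above: a random color cast (as under grow LEDs) and a random crop (subject partly out of frame). This is an assumption-laden illustration using NumPy, not the actual pipeline; the `augment` function and its gain and crop ranges are made up for the example.

```python
import numpy as np


def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply a random crop and a random per-channel color cast.

    `img` is assumed to be an H x W x 3 float array with values in [0, 1].
    The ranges below are illustrative guesses, not tuned values.
    """
    h, w = img.shape[:2]

    # Random crop: keep between 50% and 100% of each dimension,
    # so the leaf is sometimes partly out of frame.
    frac = rng.uniform(0.5, 1.0)
    ch, cw = max(1, int(h * frac)), max(1, int(w * frac))
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    out = img[top:top + ch, left:left + cw]

    # Random color cast: scale R, G, B independently to imitate
    # off-white lighting, then clip back into the valid range.
    gains = rng.uniform(0.6, 1.4, size=3)
    return np.clip(out * gains, 0.0, 1.0)


# Usage: a flat grey test image stands in for a real photo.
rng = np.random.default_rng(0)
photo = np.full((8, 10, 3), 0.5)
augmented = augment(photo, rng)
```

Real pipelines usually add blur, noise, and exposure changes as well, and real phone photos still matter more than synthetic variation.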

For tissue culture - I'd bet the multi-species part is what's killing you. I'd pick the single highest-value species, collect a probably-uncomfortable amount of well-labeled data for just that one, and see if things change. Right now you might not be able to tell what's a data problem vs a fundamental limitation, because the generalization overhead masks both.

> there's a cannabis verification gate, then routing into disease vs pest vs deficiency, then narrower classifiers from there. Each one has a simpler job so accuracy stays high.

That never occurred to me. That's a great insight.

> I'd pick the single highest-value species, collect a probably-uncomfortable amount of well-labeled data for just that one

I think you're right. If I want to move forward with it I think it's the only feasible way to validate a proof of concept. Generalizing can't produce a useful tool at my scale.

Thank you! I think this was a helpful nudge. Narrow classifiers could make some things a lot easier. Do you know of any reading materials about routing like this? Is it just programmatic decision tree stuff, or is there something more clever I'm unaware of?

Glad it helps. As for narrow classifiers, it's decision tree logic as you say, and it's better worked out by trial and error than by over-engineering and theory. Cleverness comes from your own experience :)