Why should connecting small models to big models result in higher output quality than just running the big models without the small models?