Three-tuple: (original image, text edit instruction, final image).
Easy to patch for editing models, anyway. Maybe not for text-to-image models.
It probably comes up more than you think. Storyboarding, product placement, model images, etc.
It's not critical in the short term, but it'll wind up on their backlog for sure.
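
For what it's worth, here is a rough sketch of how one of those (original image, text edit instruction, final image) examples could be stored. The class name, field names, file paths, and sample instruction are all made up for illustration; they are not from any real dataset or library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditExample:
    """One training example for an instruction-based image editing model.

    Hypothetical layout: images are referenced by file path.
    """
    original_image: str  # path to the image before the edit
    instruction: str     # text edit instruction
    final_image: str     # path to the image after the edit

    def as_tuple(self) -> tuple[str, str, str]:
        # Return the example in (original, instruction, final) order.
        return (self.original_image, self.instruction, self.final_image)


# Placeholder values, purely illustrative.
example = EditExample(
    original_image="before.png",
    instruction="replace the mug with a vase",
    final_image="after.png",
)
```

A text-to-image model has no original image slot to fill, which is roughly why the same patch may not carry over.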