So if you can move complexity over to the controller you can spend 100:1 ratio in unit cost. So you get to make the memory dies very dumb by e.g. feeding a source synchronous sampling clock that's centered on writes and edge aligned on reads leaving the controller to have a DLL master/slave setup to center the clock at each data group of a channel and only retain a minimal integer PLL in the dies themselves.