I don't know the AMD DRAM controller in depth, but this detail could explain some of your anomalies. If this were an academic paper, the findings would be borderline invalid. You might want to remove the extra module and run the benchmarks again.
At the very least, once the tests become big enough to place some data in both partitions, the bandwidth of each partition will start to matter.