upvote
Even with the exact same dataset and architecture, ML results aren't perfectly replicable due to random weight initialisation, training data order, and non-deterministic GPU operations. I've trained identical networks on identical data and gotten different final weights and performance metrics.

This doesn't mean the model only works on that specific dataset - it means ML training is inherently stochastic. The question isn't 'can you get identical results' but 'can you get comparable performance on similar data distributions.

reply
Then researchers should re-train their models a couple times, and if they can't get consistent results, figure out why. This doesn't even mean they must throw out the work: a paper "here's why our replications failed" followed by "here's how to eliminate the failure" or "here's why our study is wrong" is useful for future experiments and deserves publication.
reply
As per my previous comment - we are discussing stochastic systems.

By definition, they involve variance that cannot be explained or eliminated through simple repetition. Demanding a 'deterministic' explanation for stochastic noise is a category error; it's like asking a meteorologist to explain why a specific raindrop fell an inch to the left during a storm replication.

reply