Disagree slightly with this: pronouncing the tones individually and getting to the point where you can be understood isn't too hard (well, still hard), but combining them when speaking more quickly is more challenging, especially if you want it to flow nicely and to add emphasis while maintaining the tones. Not that it's mandatory if you just want to understand and be understood; it depends on one's goals.
It's a common misconception that it's enough just to learn the tones and move on, and it's very hard to find teachers who are able to help with more advanced pronunciation.
This is an interesting observation. Another one that I sometimes mention to friends who haven't had occasion to learn Chinese is that in this language, speaking, reading and writing are actually 3 separate components. You can read characters without knowing how to write them properly or even remembering them entirely. Lots of my Taiwanese acquaintances forget how to write certain characters, because nowadays most of the text they write is typed in bopomofo on their phones. Bopomofo represents sounds, so basically knowing how an expression sounds and being able to read the character (pick it from a set of given characters for the chosen sound) is enough to "write" it.
You can get used to the tones in a relatively short amount of time. If you are in an immersive environment for a month or two, you will end up wondering how it is that anyone can't hear the tones.
In contrast, there is simply no way to memorize thousands of words in that timeframe.
Most of it is passively paying attention. It should not be a struggle; it's one of those things where the more you struggle and overintellectualize, the less time you spend paying attention and letting your hearing do the work it evolved to do.
The other thing is that this whole emphasis on accents is misdirected. Teachers do not place this excessive emphasis on accents; it is people who want to sound "authentic", which is not a very wise goal of language learning in the first place.
I do think that learning music can help a little, especially a sonically complex instrument such as the violin.
(caveat: I'm way oversimplifying on my Saturday afternoon, but those are my tentative views on this that I would try to argue for.)
I've seen people struggle to pronounce a word when I explicitly tell them what tones it contains, but then pronounce it perfectly when I ask them to just imitate me.
But I disagree about accents. One of the major flaws in most foreign language education, in my opinion, is that pronunciation is not emphasized heavily enough at the beginning. Being able to pronounce the basic sounds correctly has a huge impact on how native speakers perceive your language skills, even if you're not very advanced in the language.
That's true, but it counsels against trying to develop better pronunciation early.
If you sound like a native despite having just started to learn the language, people will naturally conclude that you are mentally retarded.
(1) It doesn't get any more difficult to fix your accent. But most people won't, because there's virtually no benefit to doing it.
This is related to
(2) Once you learn to speak a language, you're not at any risk of people thinking that you can't speak it, even if you speak with a strong accent.