Additional points: CUDA is polyglot, and some people do care about writing their kernels in something other than C++, C, or Fortran, without going through code generation.

NVIDIA is acknowledging Python adoption with cuTile and MLIR support for Python, which allows the same flexibility as C++ by using Python directly, even for kernels.

They seem to be supportive of having similar capabilities for Julia as well.

Then there is the IDE and graphical debugger integration, and the library ecosystem, which now also has Python variants.

As someone who only follows GPGPU on the side, due to my interest in graphics programming, I find it hard to understand how AMD and Intel keep failing to grasp what CUDA, the whole ecosystem, is actually about.

Like, just take the schedule of a random GTC conference: how much of it can I reproduce on oneAPI or ROCm as of today?

reply
> Working with distributions packagers and integrating with them does not cost much... This would currently give you a competitive advantage over Nvidia..

Packaging is actually a huge amount of effort if you try to package for all distros.

So the common long-standing convention is to use a "vendored software" approach. You design everything to install into /opt/foo/, and you provide a simple install script that installs everything from one (or several) giant zips/tarballs. It's very old and dumb, but it works quite well. It's easy to support from a company's perspective: just run your dumb installer on a couple of distros once in a while. Don't depend on distro-specific paths; use basic autodetection to locate and load libraries/dependencies.
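To make the "basic autodetection" part concrete, here is a minimal sketch in Python. The `/opt/foo` prefix comes from the comment above; the `FOO_HOME` environment variable, the `lib`/`lib64` layout, and both function names are assumptions made up for illustration, not any vendor's actual loader.

```python
import ctypes
import os
from pathlib import Path


def find_library(name, prefix="/opt/foo", env_var="FOO_HOME"):
    """Return the path to lib<name>.so under the install tree, or None.

    Search order (an assumed convention): the directory named by the
    hypothetical FOO_HOME variable first, then the default prefix.
    """
    roots = []
    if os.environ.get(env_var):
        roots.append(Path(os.environ[env_var]))
    roots.append(Path(prefix))
    for root in roots:
        # Check both common library directory names instead of
        # hard-coding a distro-specific path.
        for sub in ("lib", "lib64"):
            candidate = root / sub / f"lib{name}.so"
            if candidate.is_file():
                return candidate
    return None


def load_library(name, **kwargs):
    """Locate and dlopen the library, failing with a readable error."""
    path = find_library(name, **kwargs)
    if path is None:
        raise FileNotFoundError(f"lib{name}.so not found in the install tree")
    return ctypes.CDLL(str(path))
```

The point is only that nothing here depends on where a particular distro puts things, which is what makes the tree easy to repackage later.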

Once you do that, it is actually easier for distros to package your software for you. They make one basic package that runs the installer, then they carve up the resulting files into sub-packages based on path. Then they just iterate on that over time as bugs come in (as users try to install just package X.a, which really needs files from X.b).
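The "carve up by path" step is roughly the following, sketched in Python. The rule table, the `foo-*` package names, and the `carve` function are hypothetical; real distros express this in their own packaging formats rather than in a script like this.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical mapping from a subdirectory of the install tree to a
# sub-package suffix. Real packaging rules are distro-specific.
RULES = {
    "bin": "tools",
    "lib": "libs",
    "lib64": "libs",
    "include": "dev",
    "share/doc": "doc",
}


def carve(files, prefix="/opt/foo", name="foo"):
    """Group installed file paths into sub-packages based on their path."""
    packages = defaultdict(list)
    for f in sorted(files):
        rel = PurePosixPath(f).relative_to(prefix).as_posix()
        suffix = "misc"  # anything that matches no rule lands here
        for subdir, pkg in RULES.items():
            if rel == subdir or rel.startswith(subdir + "/"):
                suffix = pkg
                break
        packages[f"{name}-{suffix}"].append(f)
    return dict(packages)
```

The bug reports mentioned above are then mostly a matter of moving a file from one group to another, or adding a dependency between two groups.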

But you need to hire people with expertise in the open source world to know all this, and most companies don't. Maybe there's just not a lot of us left out there. Or, more likely, they just don't understand that wider support + easier use = more adoption.

reply
> Packaging is actually a huge amount of effort if you try to package for all distros.

That's the neat part: you do not have to package for all the distros.

Just make your components easy to decouple, with pkg-config files provided (ideally) and a proper configuration mechanism.
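For anyone who has not shipped one, a pkg-config file is just a small text file describing where a component lives and how to link against it. Below is a rough Python sketch that renders one; the `render_pc` helper and the `foo` component are invented for illustration, and a real project would normally generate this from its build system instead.

```python
def render_pc(name, version, description, prefix="/usr"):
    """Render a minimal pkg-config (.pc) file for a single library.

    The ${prefix}, ${libdir} and ${includedir} references are expanded by
    pkg-config itself, so packagers can relocate the component by
    changing only the first line.
    """
    lines = [
        f"prefix={prefix}",
        "libdir=${prefix}/lib",
        "includedir=${prefix}/include",
        "",
        f"Name: {name}",
        f"Description: {description}",
        f"Version: {version}",
        "Libs: -L${libdir} -l" + name,
        "Cflags: -I${includedir}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_pc("foo", "1.2.3", "Hypothetical compute runtime"))
```

With a file like this installed per component, downstream builds can ask `pkg-config --cflags --libs foo` instead of guessing paths.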

No bundles, no hidden downloads, no messy, tangled vendored scripts.

Then it is easy: you can just provide the packages for your main targets (e.g., Ubuntu and Red Hat, typically) and the communities of the other distros will take care of the rest.

reply
> Supporting only server-grade hardware and ignoring laptop/consumer-grade GPUs/APUs for ROCm was a terrible strategic mistake. A lot of developers experiment first and foremost on their personal laptop and scale to expensive, professional-grade hardware later.

NVIDIA is making the same mistake today by deprioritizing the release of consumer-grade GPUs with high VRAM in favour of focusing on server markets.

They already have a huge moat, so it's not as crippling for them to do so, but I think it presents an interesting opportunity for AMD to pick up the slack.

reply
There actually isn't any locking involved. I can take a new, officially unsupported version of ROCm and just use it with my 7900 XT despite my card not being officially supported and it works. It's just that AMD doesn't feel that they need to invest the resources to run their test suite against my card and bless it as officially supported. And maybe if I was doing something other than running PyTorch I'd run into bugs. But it's just laziness, not malice.
reply
I used to be able to run ROCm on my officially unsupported 7840U. Bought the laptop assuming it would continue to work.

Then in a random Linux kernel update they changed the GPU driver. Trying to run ROCm now hard-crashes the GPU, requiring a restart. People in the community figured out which patch introduced the problem, but years later... still no fix or revert. You know, because it's officially unsupported.

So "Just use HSA_OVERRIDE_GFX_VERSION" is not a solution. You may buy hardware based on that today, and be left holding the bag tomorrow.
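For readers who have not seen it, the override is just an environment variable that has to be set before the ROCm runtime loads. A minimal Python sketch of launching a workload with it follows; the helper names are made up, and the `"11.0.0"` value in the usage comment is an assumption (a value often quoted for RDNA3 parts), not something AMD documents as supported for this hardware.

```python
import os
import subprocess
import sys


def env_with_gfx_override(version, base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set.

    The caller's environment is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = version
    return env


def run_with_override(cmd, version):
    """Run a command in a child process with the override applied."""
    return subprocess.run(cmd, env=env_with_gfx_override(version), check=False)


# Example usage (assumed value and a hypothetical script name):
#   run_with_override([sys.executable, "train.py"], "11.0.0")
```

Nothing in this makes the configuration supported: it only tells the runtime to treat the GPU as a different target, so a driver change can still break it in exactly the way described above.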

reply
This is a very unprofessional attitude. There is no space for laziness in business.
reply