ProRes decodes faster than realtime single-threaded on a decade-old CPU too
It doesn't make sense. It's very different from, say, a video game, where a texture is loaded into VRAM once and then yes, all the work really is done on the GPU. A video has CPU IO on every frame, so you're still doing a ton of CPU work. I don't know why people are talking about power efficiency: in a pro editing context your CPU will be very, very busy with these IO threads, including and especially in ffmpeg with hardware encoding/decoding, no less. It looks nothing like the video-game workload this stack was designed for.
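You can see that per-frame CPU traffic directly in a "hardware" pipeline. A sketch, assuming a box with ffmpeg built with CUDA support (the input filename is a placeholder):

```shell
# Decode on the GPU (NVDEC), but pull every decoded frame back to
# system RAM via hwdownload -- the per-frame CPU copy/IO described above.
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
       -vf "hwdownload,format=nv12" -f null -
```

Even with `-f null -` (no encode, no output file), the CPU is busy demuxing the container, feeding compressed packets to the decoder, and receiving each uncompressed frame back over PCIe.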
This is valid for a single stream, but the equation changes when you're trying to squeeze the highest number of simultaneous streams out of the least CapEx possible. Sure, you still have to transfer the data back to system memory just before you send it over WebRTC/HTTP/whatever, but you unlock a lot of capacity by using all the rest of the silicon as much as you can. Being able to use a budget/midrange GPU instead of a high-end, ultra-high-core-count CPU could make a big difference to a business with the right use case.
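Back-of-the-envelope, with made-up numbers purely to illustrate the CapEx argument (real per-stream throughput depends entirely on codec, resolution, preset, and hardware):

```shell
# All figures are hypothetical assumptions for illustration,
# not benchmarks of any real CPU or GPU.
cores=32
streams_per_core=2            # assumed x264 1080p30 encodes per core
nvenc_streams=30              # assumed sessions a midrange NVENC block sustains

cpu_only=$(( cores * streams_per_core ))
combined=$(( cpu_only + nvenc_streams ))
echo "CPU-only box:  ${cpu_only} streams"
echo "CPU + one GPU: ${combined} streams for one midrange card more"
```

The point is the ratio, not the absolute numbers: if one midrange card adds a meaningful fraction of a whole server's capacity, the per-stream cost drops accordingly.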
That said, TFA doesn't seem to be targeting that kind of high stream density use-case either. I don't think e.g. Frigate NVR users are going to switch to any of the mentioned technologies in this blog post.
I haven't actually looked into this, but it might not be outside the realm of possibility. You're already generating the frame on the GPU; if you can also encode it there (whether with NVENC or Vulkan doesn't matter), you could then DMA the encoded bitstream to the NIC, using the CPU only to process the packet headers, assuming that can't also be handled by the GPU/NIC.
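The encode half of that pipeline already works today in ffmpeg: keep decode and encode both on the GPU so uncompressed frames never round-trip through system RAM, and the CPU only shuffles compressed packets. A sketch, again assuming CUDA-enabled ffmpeg and placeholder filenames (the direct GPU-to-NIC DMA step would be extra plumbing, e.g. GPUDirect-style, and isn't shown):

```shell
# NVDEC decode -> NVENC encode, frames staying in VRAM the whole way.
# The CPU handles demux/mux and compressed packets, not raw frames.
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
       -c:v h264_nvenc -b:v 6M output.mp4
```

Compressed packets are orders of magnitude smaller than raw frames, so even without the NIC trick this already removes most of the per-frame PCIe and memory traffic.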