> I would hope that the system knows precisely what is using every single byte of physical and virtual memory.

Of course the system knows what is using every page. The difficulty is really in how to account for pages that are backed by disk.

If you count all of those as free, that's not accurate. If you count all of those as used, that's not accurate either. Additionally, FreeBSD (at least) doesn't have separate queues for disk backed pages, so there's not really a good way to know how much of your active (or inactive) memory is disk backed.

An additional caveat is that measuring active/inactive has costs. In the past, FreeBSD wouldn't really do the work for that until it needed to... I know some stuff changed, but I don't remember where it ended up; it wasn't great when it bulk-marked a ton of pages as inactive and then the active ones would fault back in.

reply
This is only really a problem if you accept overcommit as a force of nature that can't be changed or tweaked (you can still do address space reservation without needing overcommit).

If you don't, it becomes rather easy (and strict commit accounting is done on Linux, for example, even if it isn't used in some cases).

Memory-mapped files can be entirely recreated from the disk, so there's no need to charge for them. Anonymous pages (whether private or shareable) have to be charged. Shareable memory is the harder one to charge. (A mapping that is used by only one process can get charged as private commit.) Both kinds of anonymous pages are charged even if they currently sit in a swap file or whatnot.

reply
> If you count all of those as free, that’s not accurate.

Why not? It depends on what you’re measuring. Physical memory? They count as free. Virtual memory? They count as used.

The ambiguities only arise when we stop making that distinction very clear.

reply
> It depends on what you’re measuring.

What I want to know is do I have enough physical memory for what I'm running.

Sure, I can drop disk backed pages and recreate them as needed, but when it happens too often, there goes performance.

reply
Yep.

Thrashing is a well-known issue that can occur with swap, but it can also happen with page cache or memory-mapped files. Indeed, not having swap enabled can make things worse, as private pages that haven't been used in hours cannot be swapped out to keep the important files cached or memory mapped.

Realistically, for measuring physical memory sufficiency, you care about memory/data of any type (even files) that will be in use in the upcoming time period, and about ensuring that a sufficient percentage of it can be held in physical memory to avoid thrashing.

This is hard! It's technically impossible to know in the general case (halting problem), and all methods of trying to approximate it involve trade-offs.

reply
The thing is, it's easy to define free, unused memory. But a lot of the used memory is your system caching stuff, and that memory would be freed if you needed more than what's actually free. So you can see you have 1g of free memory out of your 4g, but then you allocate 3g and the system does it without breaking a sweat, and you'd be confused. So you have to go and dig for what those caches are and report that they're effectively free too.
reply
Instantly reclaimable disk caches should count as available, and they do.

This isn’t hard. The OS should just expose a counter for available memory instead of having applications understand every type of memory reservation.

edit:

Linux does this, but it has its own share of issues with memory counters. The “cached” memory includes tmpfs and ramfs for seemingly no reason.

reply
> The “cached” memory includes tmpfs and ramfs for seemingly no reason.

If you're curious why that is by the way, it's because that's actually how these are implemented (tmpfs/ramfs is just a mount to a filesystem where the files never get marked clean[1])

[1]: https://www.kernel.org/doc/Documentation/filesystems/ramfs-r...

reply
That’s clever. Makes for terrible UX, though.

AFAIK the only way for you to figure out how much of your disks is actually cached involves enumerating all tmpfs and ramfs mounts, summing their sizes, and subtracting the sum from the cache size reported by the kernel.

reply
Well, what's the alternative? Invent a new type of memory reservations specifically to account for tmpfs/ramfs mounts? That'd violate your own stated desired goal of

> The OS should just expose a counter for available memory instead of having applications understand every type of memory reservation.

reply
I don’t see how it contradicts my goal. I want memory counters that abstract away kernel behavior that the application has no business accounting for.
reply
The applications, frankly, have no business accounting for any memory at all outside of what they use themselves via sbrk/mmap.
reply
I think a memory pressure indicator is useful. For example, if you’re writing a garbage collector, you can choose to hold off from returning pages back to the kernel when the system is under low pressure, and do this more often under high pressure.
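As a rough sketch of what that could look like on Linux: the kernel's pressure stall information (PSI) file `/proc/pressure/memory` reports `some` and `full` lines with `avg10`, `avg60`, `avg300` and `total` fields. The helper names and the 10% threshold below are made up for illustration, not a recommendation, and PSI is only one possible pressure signal.

```python
def parse_pressure(text):
    """Parse /proc/pressure/memory-style text into {"some": {...}, "full": {...}}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, pairs = parts[0], parts[1:]
        stats = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            stats[key] = float(value)
        result[kind] = stats
    return result


def should_return_pages(pressure, threshold=10.0):
    """Hypothetical GC policy: hand pages back to the kernel only when the
    10-second 'some' stall average reaches the (arbitrary) threshold."""
    return pressure.get("some", {}).get("avg10", 0.0) >= threshold


def read_pressure(path="/proc/pressure/memory"):
    """Read the live PSI file; it exists only on kernels built with PSI enabled."""
    with open(path) as f:
        return parse_pressure(f.read())
```

A collector could call `should_return_pages(read_pressure())` at the end of a cycle and keep freed pages around when it returns `False`.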
reply
That's very, very marginally useful. If your application recently ran GC when the system was under low pressure and freed a bunch of pages (and still holds on to them), and then some other application suddenly acquires a lot of memory, your application won't free those pages until the next GC happens. Which most likely won't be until you've used up all those freed pages, and at that point a lot of swapping will already have happened.

Explicitly set per-application upper bounds à la GOMEMLIMIT and Java's -Xmx, together with cgroup's enforcement, seem to be much more useful in practice.

Or there was a proposal for some sort of SIGLOWEM signal that would be sent to all processes in the system (with a default disposition of "do nothing") and would allow applications to release some of their non-vital memory holdings: also a much timelier notification.

reply
Ostensibly you could subtract "Shmem"[1] in /proc/meminfo from the cached value... maybe?

Do agree it's not the best UX and utilities should probably do a better job at showing that

[1]: https://man7.org/linux/man-pages/man5/proc_meminfo.5.html
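A minimal sketch of that subtraction, assuming `Cached` and `Shmem` are both present and reported in kB as described in the man page. The function names are made up for illustration, and whether `Cached - Shmem` is really the number you want is exactly the open question here.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.
    Most fields are in kB; a few (e.g. HugePages_Total) are plain counts."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def file_cache_kb(fields):
    """Page cache minus shared memory/tmpfs, in kB (an approximation)."""
    return fields["Cached"] - fields["Shmem"]


def read_meminfo(path="/proc/meminfo"):
    """Read the live file on a Linux system."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

On a Linux box, `file_cache_kb(read_meminfo())` gives the estimate without having to enumerate tmpfs mounts.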

reply
That metric would give you a number of bytes which can be used for pages not backed by files, but it won't give you actual memory usage statistics:

It won't count executable pages and memory-mapped file use as "used" memory, so your system might display gigabytes "free" when it's starving, with executables stalling while their code pages are paged in from disk.

It's just less useful than what's displayed now. "Everyone is doing it wrong" is usually a signal that you're missing something.

reply
Linux MemAvailable from /proc/meminfo is just an estimation calculated as an arbitrary percentage (50%) of free and potentially reclaimable memory.

You can't determine how much of the memory can actually be reclaimed under memory pressure until you try to reclaim it.
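To make the "50%" concrete, here is a simplified sketch of the kind of heuristic involved, as I understand it: start from free memory, then add file-backed page cache and reclaimable slab, each discounted by the smaller of half its size and the low watermark. This is an approximation written from memory, not the kernel's exact code; it ignores reserved pages, and `wmark_low_kb` is a parameter you would have to derive yourself (for example from `/proc/zoneinfo`).

```python
def estimate_available_kb(free_kb, active_file_kb, inactive_file_kb,
                          slab_reclaimable_kb, wmark_low_kb):
    """Rough MemAvailable-style estimate in kB (simplified, see caveats above)."""
    available = free_kb

    # File-backed page cache: assume at least half of it, or everything above
    # the low watermark, can be dropped without pushing the system into swap.
    pagecache = active_file_kb + inactive_file_kb
    pagecache -= min(pagecache // 2, wmark_low_kb)
    available += pagecache

    # Reclaimable slab gets the same discount.
    reclaimable = slab_reclaimable_kb
    reclaimable -= min(reclaimable // 2, wmark_low_kb)
    available += reclaimable

    return max(available, 0)
```

Nothing in this calculation checks whether those pages are actually reclaimable right now, which is the point: it is a guess made before any reclaim is attempted.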

reply
> You can't determine how much of the memory can actually be reclaimed under memory pressure until you try to reclaim it.

Yes, and I’m arguing that we should be able to, by having the kernel keep track.

reply
You will be surprised by how inaccurate memory measurements are.
reply
Not after reading this article. :-)
reply