Of course the system knows what is using every page. The difficulty is really in how to account for pages that are backed by disk.
If you count all of those as free, that's not accurate. If you count all of those as used, that's not accurate either. Additionally, FreeBSD (at least) doesn't have separate queues for disk backed pages, so there's not really a good way to know how much of your active (or inactive) memory is disk backed.
As an additional caveat, measuring active/inactive has costs. In the past, FreeBSD wouldn't really do the work for that until it needed to... I know some stuff changed, but I don't remember where it ended up; it wasn't great when it bulk marked a ton of pages as inactive and then the active ones would fault back in.
If you don't, it becomes rather easy (and strict commit accounting is done for example on Linux even if it isn't used in some cases)
Memory mapped files can be entirely recreated from the disk, so there is no need to charge for them. Anonymous pages (whether private or shareable) have to be charged. Shareable memory is the harder one to charge. (The case where a mapping is used by only one process can get charged as private commit.) These two previous cases are charged even if the pages are in a swap file or whatnot.
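To make those charging rules concrete, here is a toy model in Python. It is only a sketch of the rules as stated in this comment, not what any particular kernel does; the `Mapping` type, the `shared_id` field and `commit_charge` are names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """Hypothetical description of one mapping in some process."""
    name: str
    size: int                        # bytes
    file_backed: bool                # contents can be re-read from a file on disk
    shared_id: Optional[str] = None  # same id = same shared anonymous segment


def commit_charge(mappings):
    """Total commit charge in bytes under the simplified rules above."""
    total = 0
    seen_shared = set()
    for m in mappings:
        if m.file_backed:
            # Recreatable from disk, so nothing is charged.
            continue
        if m.shared_id is not None:
            # Shared anonymous memory is charged once, however many
            # processes map it.
            if m.shared_id in seen_shared:
                continue
            seen_shared.add(m.shared_id)
        # Private anonymous memory, or the first sighting of a shared segment.
        total += m.size
    return total


example = [
    Mapping("libc.so text", 2_000_000, file_backed=True),
    Mapping("heap", 1_000_000, file_backed=False),
    Mapping("shm in process A", 500_000, file_backed=False, shared_id="seg1"),
    Mapping("shm in process B", 500_000, file_backed=False, shared_id="seg1"),
]
print(commit_charge(example))  # 1500000
```

The hard part in a real system is the bookkeeping behind `shared_id`: deciding who pays for a segment that several processes attach to and detach from over time.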
Why not? It depends on what you’re measuring. Physical memory? They count as free. Virtual memory? They count as used.
The ambiguities only arise when we stop making that distinction very clear.
What I want to know is do I have enough physical memory for what I'm running.
Sure, I can drop disk backed pages and recreate them as needed, but when it happens too often, there goes performance.
Thrashing is a well-known issue that can occur with swap, but it can also happen with the page cache or memory mapped files. Indeed, not having swap enabled can make things worse, as private pages that haven't been used in hours cannot be swapped out to keep the important files cached or memory mapped.
Realistically, for measuring physical memory sufficiency, you care about memory/data of any type (even files) that will be used in the upcoming time period, and ensuring that a sufficient percentage of it can be held in physical memory to avoid thrashing.
This is hard! Technically impossible to know for the general case (halting problem), and all methods of trying to approximate it involve trade-offs.
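A tiny simulation shows how sharp the cliff is. This is a sketch assuming plain LRU replacement (real kernels use approximations of it) and a made-up cyclic access pattern; `count_faults` is just an illustrative name.

```python
from collections import OrderedDict


def count_faults(accesses, frames):
    """Count page faults for an access sequence under LRU with `frames` slots."""
    resident = OrderedDict()  # page -> True, least recently used first
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if len(resident) >= frames:
            resident.popitem(last=False)  # evict the least recently used page
        resident[page] = True
    return faults


# A working set of 5 pages, touched in a loop 100 times (500 accesses).
loop = list(range(5)) * 100

print(count_faults(loop, frames=5))  # 5: only the initial cold faults
print(count_faults(loop, frames=4))  # 500: every single access faults
```

Being one frame short of the working set turns 5 faults into 500 here, and it makes no difference to the fault count whether the evicted pages were anonymous or file backed.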
This isn’t hard. The OS should just expose a counter for available memory instead of having applications understand every type of memory reservation.
edit:
Linux does this, but it has its own share of issues with memory counters. The “cached” memory includes tmpfs and ramfs for seemingly no reason.
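For reference, the counter in question is the `MemAvailable` field of `/proc/meminfo`. A minimal sketch of reading it in Python; `parse_meminfo` and `read_available_bytes` are names I made up, and the sample text is invented rather than taken from a real machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value}.

    Values reported with a "kB" unit are converted to bytes; unitless
    fields (such as HugePages_Total) are kept as plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue
        value = int(parts[0])
        if len(parts) > 1 and parts[1] == "kB":
            value *= 1024
        fields[name.strip()] = value
    return fields


def read_available_bytes(path="/proc/meminfo"):
    """Return the kernel's MemAvailable estimate in bytes (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemAvailable"]


# Invented sample, so the parsing can be tried on any OS.
SAMPLE = """MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:    8000000 kB
Cached:          6000000 kB
Shmem:            500000 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(info["MemAvailable"] // (1024 * 1024), "MiB available")
```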
If you're curious why that is by the way, it's because that's actually how these are implemented (tmpfs/ramfs is just a mount to a filesystem where the files never get marked clean[1])
[1]: https://www.kernel.org/doc/Documentation/filesystems/ramfs-r...
AFAIK the only way for you to figure out how much of your disks is actually cached involves enumerating all tmpfs and ramfs mounts, summing their sizes, and subtracting the sum from the cache size reported by the kernel.
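Roughly, that procedure looks like the sketch below. Assumptions: the function names are mine, mount points are taken from `/proc/mounts`-style text, and usage comes from `os.statvfs`. As far as I know ramfs may not report usage through statvfs at all, and a tmpfs that is bind mounted in several places would be counted more than once, so treat the result as an estimate.

```python
import os

RAM_FILESYSTEMS = {"tmpfs", "ramfs"}


def ram_mount_points(mounts_text):
    """Return mount points of tmpfs/ramfs entries in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()  # device, mount point, fs type, options, ...
        if len(fields) >= 3 and fields[2] in RAM_FILESYSTEMS:
            points.append(fields[1])
    return points


def used_bytes(mount_point):
    """Bytes in use on a mounted filesystem, according to statvfs (Unix only)."""
    st = os.statvfs(mount_point)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def disk_cache_bytes(cached_bytes, ram_fs_used):
    """Subtract RAM-backed filesystem usage from the reported cache size."""
    return max(0, cached_bytes - sum(ram_fs_used))


# Invented sample in /proc/mounts format.
SAMPLE_MOUNTS = """tmpfs /run tmpfs rw,nosuid,nodev 0 0
/dev/sda1 / ext4 rw,relatime 0 0
none /mnt/scratch ramfs rw 0 0
"""

print(ram_mount_points(SAMPLE_MOUNTS))       # ['/run', '/mnt/scratch']
print(disk_cache_bytes(1000, [200, 300]))    # 500
```

On a real machine you would feed `used_bytes(p)` for each mount point into `disk_cache_bytes`, together with the `Cached` value from `/proc/meminfo` converted to bytes.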
> The OS should just expose a counter for available memory instead of having applications understand every type of memory reservation.
Explicitly set per-application upper bounds à la GOMEMLIMIT and Java's -Xmx, together with cgroup's enforcement, seem to be much more useful in practice.
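As a sketch of how these fit together: read the cgroup v2 limit (the `memory.max` file holds either `max` or a byte count) and derive runtime settings from it. The `runtime_limits` helper and the 80% headroom factor are my own assumptions, not a recommendation from any of these projects. Keep in mind that `-Xmx` only bounds the Java heap and `GOMEMLIMIT` is a soft limit, so some headroom below the cgroup limit is needed either way.

```python
from typing import Optional


def parse_memory_max(text) -> Optional[int]:
    """Parse the contents of a cgroup v2 memory.max file.

    Returns the limit in bytes, or None when the file says "max" (no limit).
    """
    value = text.strip()
    return None if value == "max" else int(value)


def runtime_limits(cgroup_limit_bytes, headroom=0.8):
    """Derive per-runtime settings from a cgroup limit.

    `headroom` is a made-up safety factor that leaves room for memory the
    runtime's own limit does not cover (stacks, native allocations, ...).
    """
    budget_mib = int(cgroup_limit_bytes * headroom) // (1024 * 1024)
    return {
        "GOMEMLIMIT": f"{budget_mib}MiB",  # value for the Go environment variable
        "java_flag": f"-Xmx{budget_mib}m",  # JVM maximum heap size flag
    }


limit = parse_memory_max("1073741824\n")  # a 1 GiB cgroup limit
if limit is not None:
    print(runtime_limits(limit))
    # {'GOMEMLIMIT': '819MiB', 'java_flag': '-Xmx819m'}
```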
Or there was a proposal for some sort of SIGLOWEM signal that would be sent to all processes in the system (with default disposition of "do nothing") that'd allow applications to release some of their non-vital memory holdings; that would also be a much timelier notification.
Do agree it's not the best UX and utilities should probably do a better job at showing that
[1]: https://man7.org/linux/man-pages/man5/proc_meminfo.5.html
It won't count executable pages and memory-mapped file use as "used" memory, so your system might display gigabytes "free" when it's starving, with executables getting paused while their code pages are paged in from disk.
It's just less useful than what's displayed now. "Everyone is doing it wrong" is usually a signal that you're missing something.
You can't determine how much of the memory can actually be reclaimed under memory pressure until you try to reclaim it.
Yes, and I’m arguing that we should be able to, by having the kernel keep track.