undefined

points

[-]

The stack does grow down though no matter what, in the sense that the pushing decrements the stack pointer. You can represent this as "up" in your diagram, but I don't think this makes it any easier conceptually because by analogy to a simple push/pop on an array, you'd naively expect higher addresses to contain more recent stack contents.

The core of the issue is that the direction stack growth differs from "usual" memory access patterns which usually allocate from lower to higher addresses (consider array access, or how strings are laid out in memory. And little-endian systems are the majority)

But if we're going with visualization options I prefer to visualize it horizontally, with lower addresses on left. This has a natural correspondence with how you access an array or lay out strings in memory.

by bignerd_952 days ago|

parent|

[-]

Please try to draw, step by step, a process where lower addresses are at the top and higher addresses are at the bottom. You’ll see that this makes everything much easier to understand.

Do not confuse this with push and pop on an abstract stack data structure. That is not the same as the process stack. On a real process stack, newer data is stored at LOWER addresses. In fact, every push decrements the stack pointer (the address is decreased).

If you want an example, think about how a string is placed and accessed on the stack. First, the stack pointer is decremented to reserve space (so in my diagram this “moves up” visually). Then the string can be read byte by byte by incrementing an index from the lower address toward the higher address. This is exactly like reading a book: left to right, top to bottom. If you flip memory upside down, everything becomes unnatural to understand: you would have to read the string from the bottom to the top.

Try decompiling a program with Ghidra. Open the disassembly view and look at the addresses on the left. Lower addresses are shown at the top. Higher addresses are shown at the bottom. In my diagram this matches perfectly. Everything is consistent and you never have to mentally flip the memory layout.

Years of practice led me to this, not just theory.

by amitprasad2 days ago|

prev|

[-]

I think I got stuck in the same rut that I learned address space in whilst writing that diagram. I would tend to agree with you that your model makes much more sense to the student.

Related: In notation, one thing that I used to struggle with is how addresses (e.g. 0xAB_CD) actually have the bit representation of [0xCD, 0xAB]. Wonder if there's a common way to address that?

by bignerd_952 days ago|

parent|

[-]

If you're referring to little-endianness, it means the CPU stores multi-byte values in memory with the least significant byte first (at the lowest address).

This convention started on early Intel chips and was kept for backward compatibility. It also has a practical benefit: it makes basic arithmetic and type widening cheaper in hardware. The "low" part of the value is always at the base address, so the CPU can load 8 bits, then 16 bits, then 32 bits, etc. starting from the same address without extra offset math.

So when you say an address like 0xABCD shows up in memory as [0xCD, 0xAB] byte-by-byte, that's not the address being "reversed". That's just the little-endian in-memory layout of that numeric value.

There are also big-endian architectures, where the most significant byte is stored at the lowest address. That matches how humans usually write numbers (0xABCD in memory as [0xAB, 0xCD]). But most mainstream desktop/server CPUs today are little-endian, so you mostly see the little-endian view.

by amitprasad2 days ago|

parent|

[-]

Not so much the confusion of what little endian is, but how we tend to represent it in notation. Of course this confusion was back when I was first learning things in high school, but I imagine I’m not alone in it

by bignerd_952 days ago|

parent|

prev|

[-]

Yes, I reached the same conclusions the hard way while exploiting memory corruption bugs. Once I understood how misleading these representations can be, everything finally became clear.

About the address notation you're describing, I'm not sure I fully get the problem. Can you spell out the question with a concrete example?

This is what the address space of a real bash process looks like on my machine:

$ cat /proc/$(pidof bash)/maps

5e6e8fd0f000-5e6e8fd3f000 r--p 00000000 fc:00 3539412 /usr/bin/bash

5e6e8fd3f000-5e6e8fe2e000 r-xp 00030000 fc:00 3539412 /usr/bin/bash

5e6e8fe2e000-5e6e8fe63000 r--p 0011f000 fc:00 3539412 /usr/bin/bash

5e6e8fe63000-5e6e8fe67000 r--p 00154000 fc:00 3539412 /usr/bin/bash

5e6e8fe67000-5e6e8fe70000 rw-p 00158000 fc:00 3539412 /usr/bin/bash

5e6e8fe70000-5e6e8fe7b000 rw-p 00000000 00:00 0

5e6e94891000-5e6e94a1e000 rw-p 00000000 00:00 0 [heap]

7ec3d1400000-7ec3d16eb000 r--p 00000000 fc:00 3550901 /usr/lib/locale/locale-archive

7ec3d1800000-7ec3d1828000 r--p 00000000 fc:00 3548995 /usr/lib/x86_64-linux-gnu/libc.so.6

7ec3d1828000-7ec3d19b0000 r-xp 00028000 fc:00 3548995 /usr/lib/x86_64-linux-gnu/libc.so.6

7ec3d19b0000-7ec3d19ff000 r--p 001b0000 fc:00 3548995 /usr/lib/x86_64-linux-gnu/libc.so.6

7ec3d19ff000-7ec3d1a03000 r--p 001fe000 fc:00 3548995 /usr/lib/x86_64-linux-gnu/libc.so.6

7ec3d1a03000-7ec3d1a05000 rw-p 00202000 fc:00 3548995 /usr/lib/x86_64-linux-gnu/libc.so.6

7ec3d1a05000-7ec3d1a12000 rw-p 00000000 00:00 0

7ec3d1a2b000-7ec3d1a84000 r--p 00000000 fc:00 3549063 /usr/lib/locale/C.utf8/LC_CTYPE

7ec3d1a84000-7ec3d1a85000 r--p 00000000 fc:00 3549069 /usr/lib/locale/C.utf8/LC_NUMERIC

7ec3d1a85000-7ec3d1a86000 r--p 00000000 fc:00 3549072 /usr/lib/locale/C.utf8/LC_TIME

7ec3d1a86000-7ec3d1a87000 r--p 00000000 fc:00 3549062 /usr/lib/locale/C.utf8/LC_COLLATE

7ec3d1a87000-7ec3d1a88000 r--p 00000000 fc:00 3549067 /usr/lib/locale/C.utf8/LC_MONETARY

7ec3d1a88000-7ec3d1a89000 r--p 00000000 fc:00 3549066 /usr/lib/locale/C.utf8/LC_MESSAGES/SYS_LC_MESSAGES

7ec3d1a89000-7ec3d1a8a000 r--p 00000000 fc:00 3549070 /usr/lib/locale/C.utf8/LC_PAPER

7ec3d1a8a000-7ec3d1a8b000 r--p 00000000 fc:00 3549068 /usr/lib/locale/C.utf8/LC_NAME

7ec3d1a8b000-7ec3d1a8c000 r--p 00000000 fc:00 3549061 /usr/lib/locale/C.utf8/LC_ADDRESS

7ec3d1a8c000-7ec3d1a8d000 r--p 00000000 fc:00 3549071 /usr/lib/locale/C.utf8/LC_TELEPHONE

7ec3d1a8d000-7ec3d1a90000 rw-p 00000000 00:00 0

7ec3d1a90000-7ec3d1a9e000 r--p 00000000 fc:00 3551411 /usr/lib/x86_64-linux-gnu/libtinfo.so.6.4

7ec3d1a9e000-7ec3d1ab1000 r-xp 0000e000 fc:00 3551411 /usr/lib/x86_64-linux-gnu/libtinfo.so.6.4

7ec3d1ab1000-7ec3d1abf000 r--p 00021000 fc:00 3551411 /usr/lib/x86_64-linux-gnu/libtinfo.so.6.4

7ec3d1abf000-7ec3d1ac3000 r--p 0002e000 fc:00 3551411 /usr/lib/x86_64-linux-gnu/libtinfo.so.6.4

7ec3d1ac3000-7ec3d1ac4000 rw-p 00032000 fc:00 3551411 /usr/lib/x86_64-linux-gnu/libtinfo.so.6.4

7ec3d1ac4000-7ec3d1ac5000 r--p 00000000 fc:00 3549065 /usr/lib/locale/C.utf8/LC_MEASUREMENT

7ec3d1ac5000-7ec3d1ac6000 r--p 00000000 fc:00 3549064 /usr/lib/locale/C.utf8/LC_IDENTIFICATION

7ec3d1ac6000-7ec3d1acd000 r--s 00000000 fc:00 3548984 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache

7ec3d1acd000-7ec3d1acf000 rw-p 00000000 00:00 0

7ec3d1acf000-7ec3d1ad0000 r--p 00000000 fc:00 3548992 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

7ec3d1ad0000-7ec3d1afb000 r-xp 00001000 fc:00 3548992 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

7ec3d1afb000-7ec3d1b05000 r--p 0002c000 fc:00 3548992 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

7ec3d1b05000-7ec3d1b07000 r--p 00036000 fc:00 3548992 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

7ec3d1b07000-7ec3d1b09000 rw-p 00038000 fc:00 3548992 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2

7ffd266f8000-7ffd26719000 rw-p 00000000 00:00 0 [stack]

7ffd2678a000-7ffd2678e000 r--p 00000000 00:00 0 [vvar]

7ffd2678e000-7ffd26790000 r-xp 00000000 00:00 0 [vdso]

ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]

___

Each line is a memory mapping. The first field is the start address. The second field is the end address. So an entry like

7ffd266f8000-7ffd26719000

means "this mapping covers virtual addresses from 0x7ffd266f8000 up to 0x7ffd26719000."

The addresses are always increasing:

- left to right: within a single line you go from lower address to higher address

- top to bottom: as you go down the list you also go to higher and higher addresses

Exactly like reading a book: left to right and then top to bottom.

by 17186274402 days ago|

parent|

[-]

The issue amitprasad is pointing out is when you read addresses byte-wise and you determine that they are in little-endian.

by 17186274402 days ago|

prev|

[-]

That's how stacks on my desk grow and how everything grows in reality. I wouldn't numerate stacked things on my desk from the top, since this constantly changes. You also wouldn't name the first branch of a tree (the plant) to be the top-most one.

In your example "the stack grows down", seems to be wrong in the image.

by bignerd_952 days ago|

parent|

[-]

Thanks! I tried to rewrite the final sentence

by 17186274402 days ago|

parent|

[-]

Yeah, but does that really help? The phrases "growing down/up" still exist and now you defined them to mean the opposite. This issue still didn't go away, since heap and stack still grow in different directions. Can't you just start drawing from the bottom of the blackboard, and it will be obvious? Coordinate systems also typically work that way.

by bignerd_952 days ago|

parent|

[-]

Yes, I draw the heap starting at the top of the board and the stack starting at the bottom of the board and grow them toward each other. That works fine in a one-off explanation.

The problem is that most textbooks draw the opposite, so the student leaves my lecture, opens a book or a slide deck, and now “down” means a different thing.

It gets worse when they get curious and look at a real process with /proc/<pid>/maps. Linux prints mappings from low address to high address as you scroll down (which matches my representation). That is literally reversed from the usual textbook diagram. Students notice and ask why the book is “wrong.”

So I've learned I have to explicitly call this out as notation.

Same story as in electronics class still teaching conventional current flow (positive to negative), even though electrons move the other way (negative to positive). Source: https://www.allaboutcircuits.com/textbook/direct-current/chp.... Historical convention, and then pedagogy has to patch it forever.

by ofalkaed2 days ago|

parent|

prev|

[-]

Starting at the bottom of the blackboard would be backwards from how it prints in the terminal when you cat /proc/<pid>/maps.

by 17186274401 days ago|

parent|

[-]

The way it's printed in the terminal is honestly backwards to me. This seams to only come from the scroll direction of the terminal, and because this is not a drawing, but a simple list. Every other tool like a debugger show it in the opposite direction and in all illustrations I have read it's that way too.

by bignerd_951 days ago|

parent|

[-]

So you use debuggers. Good. Then you can confirm that the program counter is incremented after each instruction, and that you read assembly from top to bottom. That means smaller addresses are at the top and larger addresses are at the bottom. This matches my layout, and it also matches what you see in the terminal in /proc/<pid>/maps.

by 17186274401 days ago|

parent|

[-]

Incrementation direction is orthogonal to up/down. When I disassemble a program, the addresses are all over the place and not ordered.

> That means smaller addresses are at the top and larger addresses are at the bottom.

No, my mental model is the exact opposite and this matches the jargon out there.

> also matches what you see in the terminal in /proc/<pid>/maps

I think of this as a sorted list, not as a display or description of a model.

When I drive on a road, I think of think of things on the road near me to have lower addresses in my coordinate system and things further away as having higher addresses. When I write a list of things I see, this will be from left-to-right then from top-to-bottom on a sheet of paper, because it is a list that follows the writing directions of my language/script. When I look at a traffic sign things nearer to me will be at the bottom and things far away at the top, because that's the agreed-upon mental model of a road. When I look at my navi, things near me will be in the center and things far away from me at the edge of the display.

When I write down points in the first sector of the coordinate system, I might order things according to the x-coordinate ascending top-to-bottom. That doesn't mean I would draw the axis inverted.

The correspondence of physical addresses to position is entirely non-linear and also three dimensional so there is no natural top and bottom especially when we are talking about virtual addresses.

When I get taught a new concept I want to get to know the model everyone uses. I will not like a teacher, that tells me a different ordering which is different from how everyone else does it, because this the output of some random command on some random OS, which actually shows a list, not a graph of a memory model. (Sorry that's harsh, of course I still appreciate didactic simplification.)

Maybe the issue is that you consider the stack to be so important to determine the model of the whole process space. When I would draw a stack on it's own, I DO draw it from top-to-bottom. But when I draw a whole process space, I do not, because everything else is mapped/allocated from bottom-to-top. When you invert the direction of the mental model, yes the stack now grows from bottom-to-top. But no, the other things are now allocated from top-to-bottom instead. This are more things: the text, libraries, mmap'd files, and most-used thing: the heap are all allocated inverted now. And the most important thing, the index for all that: the addresses now are allocated from top-to-bottom.

by krackers1 days ago|

parent|

[-]

Yeah this is sort of the same objection I had in https://news.ycombinator.com/item?id=45709016

Although thinking about it more, the fact that address indexing is now from top-to-bottom is actually consistent with how I imagine indexing for Nx1 array (or equivalently, how you index matrices in math).

Thinking about it more, I don't think there is any real convention, even diagrams are split. You are going to have to learn to "flip" things either way, the same way matrix indexing differs from cartesian plane indexing. Same way different graphical systems have different conventions for where the origin is. People colloquially say things like "high mem", and you'll have to translate this to your mental model.

It's why I suggested a horizontal, left-to-right visualization. I think everyone would agree that lower addresses on the left, higher addresses on right.

Also could you elaborate more on what you mean by debuggers showing it the opposite direction? If you do `info proc mappings` in gdb it is also lower addresses at top. This might be debugger specific though.