Parameterized types in C using the new tag compatibility rule

(nullprogram.com)

The recent #def #enddef proposal[1] would eliminate the need for backslashes to define readable macros, making this pattern much more pleasant. Fingers crossed for its inclusion in C2Y!

[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3531.txt
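For anyone who hasn't run into it, here is a minimal sketch of the backslash-continued style being discussed. The macro name `DEFINE_SLICE` and the generated names are made up for illustration; they are not taken from the article or from the proposal.

```c
#include <stddef.h>

/* Hypothetical macro: stamps out a slice type and a sum function for T.
 * Every line of the body except the last needs a trailing backslash. */
#define DEFINE_SLICE(T, name)                        \
    typedef struct {                                 \
        T        *data;                              \
        ptrdiff_t len;                               \
    } name;                                          \
                                                     \
    static T name##_sum(name s)                      \
    {                                                \
        T total = 0;                                 \
        for (ptrdiff_t i = 0; i < s.len; i++) {      \
            total += s.data[i];                      \
        }                                            \
        return total;                                \
    }

/* Manual instantiation, one line per element type. */
DEFINE_SLICE(int, int_slice)
DEFINE_SLICE(double, double_slice)
```

Forgetting a single backslash ends the macro early, which is the readability cost the proposal is aimed at.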

reply
I really don't think the backslashes are that annoying? Seems unnecessary to complicate the spec with stuff like this.
reply
While long defs might be nice, even back in ANSI C89 you could get rid of the backslash pattern (or the need to run cc -E output through GNU indent or similar) by "flipping the script" and defining whole files "parameterized" by their macro environment, like https://github.com/c-blake/bst or https://github.com/glouw/ctl/

Add a namespacing macro and you have a whole generics system, unlike that in TFA.

So, it might add more value to have the C std add an `#include "file.c" name1=val1 name2=val2` preprocessor syntax where name1, name2 would be on a "stack" and be popped after processing the file. This would let you do types/functions/whatever "generic modules" with manual instantiation which kind of fits with C (manual management of memory, bounds checking, etc.) but preprocessor-assisted "macro scoping" for nested generics. Perhaps an idea to play with in your slimcc fork?

reply
Not personally interested in this hack, but https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3037.pdf means struct foo {} defined multiple times with the same fields in the same TU now refers to the same thing instead of to UB and that is a good bugfix.
reply
I think this is an interesting change. Even though I (as someone who has loved C for 30+ years and uses it daily in a professional capacity) don't immediately see a lot of use-cases, I'm sure they can be found, as the author demonstrates. Cool, and a good post!
reply
> don't immediately see a lot of use-cases, I'm sure they can be found

How about making a parameter optional when it isn't a pointer? I don't touch C often, but curiously enough I was just met with this case while making a change in i3:

  -Con *con_get_fullscreen_con(Con *con, fullscreen_mode_t fullscreen_mode);
  +Con *con_get_fullscreen_con(Con *con, fullscreen_mode_t fullscreen_mode, fullscreen_layer_t fullscreen_layer, bool any_layer);
I added that `fullscreen_layer` param, which is an enum, and later found that I needed to make it optional. Maybe there's a better convention for cases like this, but I couldn't think of anything better than adding that `any_layer` boolean to decide whether to use or ignore `fullscreen_layer`. If `fullscreen_layer` were a pointer, it could just be set to `NULL`, but it's not. If the definition allowed it without much duplication, I'd split the function into 2, one with the fullscreen_layer param and another without. Again, not an option. I don't want to pollute the enum for the concern of a single function either. If this were Haskell, the convention here would be to wrap the `fullscreen_layer_t` in a Maybe (a parameterized type).

Extending on the example of this post to write a Maybe parameterized type in C, it works:

  #include <stdio.h>

  /* Needs C23: bool is a keyword and same-tag structs are compatible.
     The tag embeds T so each instantiation gets its own type;
     T must therefore be a single token (e.g. unsigned). */
  #define MAYBE(T) struct maybe_##T { bool is_just; T x; }
  #define JUST(T, t) ((MAYBE(T)){.is_just = true, .x = (t)})
  #define NOTHING(T) ((MAYBE(T)){.is_just = false})

  #define FROM_MAYBE(y, m) ((m).is_just ? (m).x : (y))

  void print_maybe(MAYBE(unsigned) m) {
    if (m.is_just) {
      printf("%u\n", m.x);
    }
  }

  int main(void) {
    print_maybe(NOTHING(unsigned));
    print_maybe(JUST(unsigned, 3));
    return 0;
  }
I'm not going to use this, because it's kind of unconventional in C and it's just a single case in what I'm working on. However, if this problem were a more common occurrence, something like this might be nice to use to avoid having multiple parameters to pass around when one could be enough.
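For comparison, a rough sketch of the same idea without relying on the C23 rule: declare one named wrapper type up front and pass that instead of the extra boolean. `layer_t`, `maybe_layer_t` and the helper names are hypothetical stand-ins, not i3's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for i3's fullscreen_layer_t; the real enum differs. */
typedef enum { LAYER_NORMAL, LAYER_OVERLAY } layer_t;

/* Declared once, so no reliance on C23 tag compatibility. */
typedef struct {
    bool    is_just;
    layer_t x;
} maybe_layer_t;

static inline maybe_layer_t just_layer(layer_t l)
{
    return (maybe_layer_t){.is_just = true, .x = l};
}

static inline maybe_layer_t no_layer(void)
{
    return (maybe_layer_t){.is_just = false};
}

/* True when the filter is absent or equals the actual layer. */
static inline bool layer_matches(maybe_layer_t filter, layer_t actual)
{
    return !filter.is_just || filter.x == actual;
}
```

A caller would pass `no_layer()` where it previously passed `any_layer = true`, and `just_layer(some_layer)` otherwise. The cost is one typedef per wrapped type, which is what the macro version avoids.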
reply
Slightly off-topic, why is he using ptrdiff_t (instead of size_t) for the cap & len types?
reply
From one of his other blogposts. "Guidelines for computing sizes and subscripts"

  Never mix unsigned and signed operands. Prefer signed. If you need to convert an operand, see (2).
https://nullprogram.com/blog/2024/05/24/

https://www.youtube.com/watch?v=wvtFGa6XJDU
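A small sketch of the kind of code where the "prefer signed" guideline pays off, assuming lengths are carried as `ptrdiff_t`. The function name is made up for illustration.

```c
#include <stddef.h>

/* Returns the index of the last element equal to v, or -1 if absent. */
ptrdiff_t last_index_of(const int *a, ptrdiff_t len, int v)
{
    /* With a signed index, len == 0 starts at -1 and the loop body never runs.
     * With size_t, len - 1 would wrap to SIZE_MAX and i >= 0 would always hold. */
    for (ptrdiff_t i = len - 1; i >= 0; i--) {
        if (a[i] == v) {
            return i;
        }
    }
    return -1;
}
```

The signed version also gets a natural "not found" value for free, whereas an unsigned return type needs a sentinel or an out-parameter.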

reply
I still don't understand how these arguments make sense for new code. Naturally, sizes should be unsigned because they represent values which cannot be negative. If you do pointer/size arithmetic, the only solution to avoid overflows is to overflow-check and range-check before computation.

You cannot even check the sign of a signed size after the fact to detect an overflow, because signed overflow is undefined!

The remaining argument, from what I can tell, is that comparisons between signed and unsigned sizes are bug-prone. There is, however, a dedicated warning to resolve this instantly.

It makes sense that you should be able to assign a pointer to a size. If the size is signed, this cannot be done due to its smaller capacity.

Given this, I can't understand the justification. I'm currently using unsigned sizes. If you have anything contradicting, please comment :^)
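To make the "check before computing" point concrete, here is a minimal sketch of pre-checked addition for both flavours. The function names are invented for this example; only the `<stdint.h>` limit macros are standard.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned: wraparound is defined, but check first so *out is never a wrapped value. */
bool add_size(size_t a, size_t b, size_t *out)
{
    if (b > SIZE_MAX - a) {
        return false;
    }
    *out = a + b;
    return true;
}

/* Signed: the check must come before the addition, since overflow itself is undefined. */
bool add_ptrdiff(ptrdiff_t a, ptrdiff_t b, ptrdiff_t *out)
{
    if ((b > 0 && a > PTRDIFF_MAX - b) || (b < 0 && a < PTRDIFF_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

Either way the guard is written in terms of the operands, never in terms of the result, so the choice of signedness does not remove the need for it.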

reply
C offers a different solution to the problem in Annex K of the standard. It provides a type `rsize_t`, which like `size_t` is unsigned, but where `RSIZE_MAX` is recommended to be `SIZE_MAX >> 1`. You perform bounds checking as `<= RSIZE_MAX` to ensure that a value used for indexing is not in the range that would be considered negative if converted to a signed integer: a negative value converted to `size_t` lands above `RSIZE_MAX` and fails the check.

IMO, this is a better approach than using signed types for indexing, but AFAIK, it's not included in GCC/glibc or gnulib. It's an optional extension and you're supposed to define `__STDC_WANT_LIB_EXT1__` to use it.
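Since Annex K is rarely available, here is a rough sketch of the same check done by hand. `MY_RSIZE_MAX` and `index_ok` are hypothetical names; the value simply follows the recommendation quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Annex K's RSIZE_MAX, using the recommended value. */
#define MY_RSIZE_MAX (SIZE_MAX >> 1)

/* Accepts i only if it is a plausible size and a valid index into len elements. */
bool index_ok(size_t i, size_t len)
{
    return i <= MY_RSIZE_MAX && len <= MY_RSIZE_MAX && i < len;
}
```

A caller that accidentally passes `(size_t)-1` gets a rejection instead of an index near the top of the address space.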

reply
I don't know either.

int somearray[10];

new_ptr = somearray + signed_value;

or

element = somearray[signedvalue];

this seems almost criminal to how my brain does logic/C code.

The only thing i could think of is this:

int *p = somearray + 10; p[-1] // refers to somearray[9] ??

if i'd see my CPU execute that i'd want it to please stop. I'd want my compiler to shout at me like a little child, and be mean until i do better.

-Wall -Wextra -Wpedantic <-- that should flag i think any of these weird practices.

As you stated tho, i'd be keen to learn why i am wrong!
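For what it's worth, a small sketch of why a signed subscript is well-defined C rather than something a compiler would warn about: `p[i]` means `*(p + i)`, so a negative `i` is fine as long as the result stays inside the same array. The function names are invented for this example.

```c
#include <stddef.h>

/* Sums the neighbours of *p. The caller must guarantee that p points at an
 * element that has a valid element on each side. */
int neighbour_sum(const int *p)
{
    /* p[-1] is defined as *(p - 1): fine while it stays inside the array. */
    return p[-1] + p[1];
}

/* Signed offset from a position in an array; offset may be negative when
 * base points into the middle of a larger array. */
const int *at_offset(const int *base, ptrdiff_t offset)
{
    return base + offset;
}
```

Calling `neighbour_sum(&somearray[5])` reads elements 4 and 6. What is undefined is stepping outside the array, and that is equally undefined with an unsigned offset.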

reply
> It makes sense that you should be able to assign a pointer to a size. If the size is signed, this cannot be done due to its smaller capacity.

Why?

By the definition of ptrdiff_t, ISTM the size of any object allocated by malloc cannot be out of bounds of ptrdiff_t, so I'm not sure how you can have a useful size_t that uses the sign bit?
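As far as I know the standard itself does not spell out this limit, so code that wants to rely on it can enforce it at the allocation site. A minimal sketch, with `alloc_bounded` being a hypothetical wrapper name:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: refuses requests whose size would not fit in ptrdiff_t,
 * so that subtracting any two pointers into the object cannot overflow. */
void *alloc_bounded(size_t n)
{
    if (n > (size_t)PTRDIFF_MAX) {
        return NULL;
    }
    return malloc(n);
}
```

With every object capped this way, lengths and pointer differences both fit in `ptrdiff_t` and the upper half of the `size_t` range goes unused.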

reply
Stroustrup believes that signed should be preferred to unsigned even for values that can’t be less than zero: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p14...
reply
Skeeto and Stroustrup are a bit confused about valid index types. They prefer signed, which will lead to overflows on negative values, but has the advantage of using only half of the valid range, so there's more heap for the rest. Very confused.
reply
It seems as though this makes it impossible to do the new-type paradigm in C23? If Goose and Beaver differ only in their name, C now thinks they're the same type, so too bad, we can tell a Beaver to fly even though we deliberately required a Goose?
reply
"Tag compatibility" means that the name has to be the same. The issue the proposal is trying to address is that "struct Goose { float weight; }" and "struct Goose { float weight; }" are different types if declared in different locations of the same translation unit, but the same if declared in different translation units. With tag compatibility, they would always be treated as being the same.

"struct Goose { float weight; }" and "struct Beaver { float weight; }" would remain incompatible, as would "struct { float weight; }" and "struct { float weight; }" (since they're declared without tags.)

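A quick sketch of the part that does not change: differently tagged structs stay distinct types, so the new-type pattern keeps working. This uses C11 `_Generic` to observe the distinction; `IS_GOOSE` and `fly` are made-up names, and the body of `fly` is a placeholder.

```c
struct Goose  { float weight; };
struct Beaver { float weight; };

/* Evaluates to 1 only for expressions of type struct Goose. */
#define IS_GOOSE(x) _Generic((x), struct Goose: 1, default: 0)

/* Accepts only a Goose; passing a struct Beaver is a compile-time error. */
float fly(struct Goose g)
{
    return g.weight * 2.0f; /* arbitrary placeholder computation */
}
```

Given `struct Beaver b`, `IS_GOOSE(b)` is 0 and `fly(b)` is rejected by the compiler, with or without the C23 rule.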
reply
Ah, thanks, that makes sense.
reply
i fear this will make sloppy code compile OK more often.
reply
Dear God I hope nobody is committing unreviewed LLM output in C codebases.
reply
Can you give an example?
reply