It does say bit-fields of size zero are allowed (that forces the next bit-field to start in a new allocation unit), and also, curiously, says that, in C, “whether int bit-fields that are not explicitly signed or unsigned are signed or unsigned is implementation-defined.”
True. And that still doesn't mean the behaviour is undefined; it just means the construct is not useful except on two's-complement implementations. Which, even before C23, was all of them: the C23 change to require two's complement wasn't really meant to invalidate existing implementations, it was meant to reflect the reality that there were no other implementations worth considering. See https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm; the only non-two's-complement implementation still in use that was found was very much a legacy-only thing kept for backwards compatibility, whose users were not expected to want a modern C compiler anyway.
Even better, adding constructor and destructor for RAII in C. GCC already has this via attributes: __constructor__ and __destructor__ run code before and after main(), and __cleanup__ gives scope-based cleanup inside other functions too. Let's add it to C directly?
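For anyone who hasn't seen those extensions, here is a minimal sketch of what they look like in GCC/Clang today (free_charp, before_main and after_main are just illustrative names):

    #include <stdio.h>
    #include <stdlib.h>

    /* cleanup handler: the compiler passes a pointer to the annotated variable */
    static void free_charp(char **p)
    {
        free(*p);
    }

    __attribute__((constructor))
    static void before_main(void) { puts("runs before main()"); }

    __attribute__((destructor))
    static void after_main(void) { puts("runs after main()"); }

    int main(void)
    {
        /* freed automatically when buf goes out of scope */
        __attribute__((cleanup(free_charp))) char *buf = malloc(64);
        (void)buf;
        return 0;
    }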
I'm in WG14, and I made a passionate argument against defer. C is all about simplicity and clarity over convenience. Defer is an invisible jump; it's much better to use a goto, which clearly denotes a change in flow. In C everything should be as clear and as explicit as possible. Nothing should happen without the user explicitly saying so. This means more typing, and that's fine; there are plenty of languages that offer loads of syntax sugar if that's what users want, but only C offers clarity.
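For contrast, the explicit-goto cleanup idiom being defended looks roughly like this (a generic sketch, not code from any WG14 paper):

    #include <stdio.h>
    #include <stdlib.h>

    /* every change of flow is a visible goto to a named cleanup label */
    int process(const char *path)
    {
        int rc = -1;

        FILE *f = fopen(path, "rb");
        if (!f)
            goto done;

        char *buf = malloc(4096);
        if (!buf)
            goto close_file;

        /* ... do the actual work with f and buf ... */
        rc = 0;

        free(buf);
    close_file:
        fclose(f);
    done:
        return rc;
    }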
Thank you so much for your work! Your comment made me happy and warm inside, as if there's still some order in this crazy world. As if there's still a few sane and robust things that we can attach ourselves to, knowing that they will not break.
How can we support/encourage you? (in case you need it)
Did you get rid of longjmp() too? longjmp() is far more perilous for similar reasons and purposes.
Defer is useful for code clarity in many situations IMHO, and it's easy enough for a project or code QA tool to ban its use for those that don't like it.
If C is simple and clear, why is it fraught with pitfalls? While C code can be nice and clear to read, that doesn't mean there's less mental burden on the developer to keep track of everything going on.
I would argue that there is less mental burden, if you stay within some simple limits. C has a lot of complexity if you push its limits, mainly because it is so old and the limits have been pushed in ways they haven't in other languages. C has to run on very strange platforms, has many, many implementations, and has so much code depending on it. This makes it very hard to maintain. If two large C compilers do things slightly differently and very important software depends on those behaviours, then it's very hard to make a "clean" fix without breaking lots of software.
The argument against "Why don't they just fix X?" usually comes down to: it would break millions of lines of code, make every C tutorial/book obsolete and force millions of C programmers to learn new things, not to mention that if we broke the ABI, we would break almost every other language, since they depend on C's stable ABI. Breaking C would literally cost tens if not hundreds of billions of dollars for the industry.
Look at the move between Python 2 and 3. The cost of breaking backwards compatibility has probably been astronomical, and there is way less Python code being maintained than C code.
C operates on a scale that is almost unfathomable. A 1% performance degradation in C will have a measurable impact on the world's energy use and CO2 emissions, so the little things really matter.
C programmers claim some things that are simply not true:
1) simplicity - C is not a simple language; it is a language lacking advanced features. That does not make it simple to use, though it might make it easier to write an incomplete, non-performant C compiler than a compiler for better languages
2) performance - naive/straightforward/clear C implementations are not the most performant, and most C programs are not highly optimised for performance. C compilers have had so much optimisation work done on them that they can actually generate decently performant code, but if better languages received the same level of effort that C has, they would be able to achieve better performance than C with smaller and safer code
3) clarity - C is not clear to read once structs and pointers are used, and especially once it is optimised for performance...and then there is the preprocessor...which is a whole different language that is required to make any non-hello-world program even possible
If I have to provide examples for any of the above you are either not a C programmer or you are an "advanced" C programmer that doesn't have to deal with large code bases and instead deals with small pieces and someone else takes care of the rest.
I mean, just try to figure out what long int is on Linux without compiling a program and tell me that C is a simple and clear language...and if you are still clinging to that lie, then tell me what a struct looks like in memory so I can interface with it from another language...or even another C compiler...because C is simple and clear, right?
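To make that complaint concrete: both answers depend on the target ABI rather than on the language itself, so the only reliable way to find out is to ask the compiler, e.g. with something like this (struct s is just an illustration):

    #include <stdio.h>
    #include <stddef.h>

    struct s {
        char  c;
        long  l;   /* typically 8 bytes on Linux x86-64 (LP64), 4 on Windows (LLP64) */
        short h;
    };

    int main(void)
    {
        printf("sizeof(long) = %zu\n", sizeof(long));
        printf("sizeof(struct s) = %zu, offsetof(l) = %zu, offsetof(h) = %zu\n",
               sizeof(struct s), offsetof(struct s, l), offsetof(struct s, h));
        return 0;
    }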
Not the OP, but you are conflating two very different ideas under a singular definition.
C is "simple" in the same way DNA is "simple". There is a small set of very straightforward rules, but that set provides immense flexibility. But it in no way means that any resulting object will not be complicated. Perhaps a better word than "simple" would be "non-complex", making the distinction between "complex" vs "complicated" systems.
By contrast, the features you refer to would increase the "complexness" of the language (I won't even dare use the grammatically correct word "complexity" here, lest we deviate into yet another trap of conflating definitions).
Which, I would agree with OP is, for better or worse, probably far outside C's goals as a language.
This has nothing to do with C, and everything with its intended problem domain.
You can get into pointer and memory errors in C++, Rust and Ada. All of them low-level system languages. Sure, those errors might be harder to produce, but not impossible, and definitely easy enough to still trip you up.
I have programmed in all of those languages except Rust (just don't like it). At least in C you pretty much know WHY (not necessarily where in the code) things went south, without consulting a thousand-page specification or having to remember the myriad of language-feature interactions that could have triggered those problems.
Moreover, C being small, it's a good on/off language. Try doing a code review for a C++/Rust/Ada code base which uses features heavily after not having touched the language for a year. I bet it is not as easy as C.
You know, some things in life are just hard. And low-level programming is one of those things. C is only honest about this.
> Nothing should happen without the user explicitly saying so.
Umm, why is typing something like defer (and documenting what it does) different than typing something like goto (and documenting what it does)?
Frankly, not being comfortable with adding safe and efficient abstractions to a language will result in the death of the language (both for spoken languages and also for programming languages).
> Umm, why is typing something like defer (and documenting what it does) different than typing something like goto (and documenting what it does)?
Because goto explicitly declares a change of control flow, whereas defer causes implicit control flow in code that does not explicitly declare it. That's exactly what the OP was trying to avoid, and it's a meaningful difference regardless of whether you care about it.
Because the control flow happens implicitly at the end of the block, not explicitly as with goto, loop, if-statements, etc. A deferred statement can also have arbitrary time and space complexity, so it's not like a variable going out of scope at the end of a block which is constant time. It furthermore requires invocation of all deferred statements while unwinding the stack upon calling exit(). See the details here:
> Because the control flow happens implicitly at the end of the block
Why is this different than any loop in C? or function without a return? Wouldn’t it be more explicit and very minimal overhead to instead always use a goto at the end of a looping block?
Even though C allows while loops (as a mistake - all control flow should be explicitly defined at the end of the block), why aren't you - as a best practice - using an explicit goto and if block instead of while / for?
> Why is this different than any loop in C? or function without a return?
Because it's non-local control flow. Did you just ignore my whole comment, where I pointed out cases like exit() unwinding the stack, invoking deferred block in each caller up the call chain back to the root? Does for/while do that?
I really did read it, I just really believe that your biases are clouding your critical thought about this.
> invoking deferred block in each caller up the call chain back to the root
Yes, if the caller writes N defers in the same function or across M functions, then the code will run N defers.
Why is this any different than nesting N loops, in the same function or across M functions? It’ll still run N end of block control flow statements without them being “explicit”.
At the end of the day “explicit” typically is really just a replacement for saying “familiar with the previous documentation, don’t change things”.
Anyways, that’s all I can put into this conversation. Please just try to consider that you’re conflating familiarity with simplicity and leverage System 2 to really ask yourself if what you’re favoring is valuable to the group or just the NIMBY.
I honestly can't summarize the extra runtime complexity any more clearly than I already have. The original poster was very clear about keeping the runtime complexity down for various well-motivated reasons. They are correct that if you want more runtime support for high-level features, then C is perhaps not the language for you.
> even better, adding constructor and destructor for RAII
Constructors are a nuisance, and destructors, while great, require significant additional semantics to avoid double destruction.
C++ uses unconditional destructors thanks to ubiquitous copying and non-destructive moves, but that requires all objects to always be in a destructible state.
Rust instead relies on lexical drop flags inserted by the compiler if objects are conditionally moved.
They're more convenient and composable than defer/cleanup/scope, but they also have more language impact.
Obviously the trade off is defer pushes that issue onto the developer.
I suspect `defer` is the mainstream right tradeoff between the implicit nightmarish semantics of C++ and the formalism of Rust.
I think the Rust view of C++ will turn out to be an over-reaction. The issue wasn't a failure to make the implicit explicitly provable, it was simply to make it explicit at all.
> I suspect `defer` is the mainstream right tradeoff
It’s not. It’s a conveniently simple language addition which foists all the issues and edge cases onto the language user. Defer is a limited subset which is more verbose and more error-prone.
However like “context managers” (e.g. try blocks, using, with, unwind-protect, bracket) it also works acceptably in languages where ownership concepts are non-existent or non-encoded, whereas destructors / drops require strict object lifetimes.
> The issue wasn't a failure to make the implicit explicitly provable, it was simply to make it explicit at all.
What, pray tell, would be the use of making borrows “explicit” then ignoring them entirely? Useless busy work?
There are three "dialects" of rust: borrow-checked, unsafe, and copy-everything.
The existence of these three "dialects" goes to my point that the borrow checker isn't The Ideal solution to the problem of high-performance, safe, static computing.
Some of this data may be difficult to collect, because some project leads evaluate Rust, discover that the entire ecosystem is built on top of libraries that endlessly thrash a global allocator with synchronization primitives in it, and move on.
A sane defer-like mechanism would be fine with me, but please no destructors or (worse) constructors; they are much too inflexible because they bind 'behaviour' to types. Data should be 'stupid' and not come with its own hardwired behaviour, otherwise this becomes a design-rabbit-hole that eventually ends up in C++.
While I agree that destructors are unsuitable for C, because for them to be useful you'd have to throw out and redo the entire standard library, so they can't be usefully retrofitted into the language, however
> they are much too inflexible because they bind 'behaviour' to types
1. that is the baseline you want: data is not an amorphous and meaningless blob; the default for a file or connection is that you close it, the default for allocated memory is to release it, etc…
2. that also allows much more easily composing such types and semantics, without having to do so by hand
3. and it is much more resilient to future evolution: if an item goes from not having a destructor to having one… you probably don't care, but if it goes from not having a cleanup function (or having a no-op one which you could get away with forgetting to call) to having a non-trivial cleanup function, you now have a bug lurking
4. destructors also make conditional cleanups… just work. The repetitive and fiddly mess is handled by the computer, which is kind of the point of computers, rather than you having to remember every time that the resource must be cleaned up on the error path but not on the non-error path; defers easily trigger double-free situations. This also, again, makes code more resilient to future evolution (and the associated mistakes)
5. furthermore destructors trivially allow emulating defers, just create a type which does nothing and calls a hook on drop
> Data should be 'stupid' and not come with its own hardwired behaviour
That’s certainly a great take if you want the job security of having to fix security issues forever.
> ...the default for a file or connection is that you close it...
Something like a file or connection is already not just 'plain old data' though.
In my mind, 'data' are the pixels in a texture (while the texture itself is an 'object'), or the vertices and indices in a 3D mesh (while the 'mesh' is an object), or the data that's read from or written to files (but not the 'file object' itself).
C is all about data manipulation (the 'pixels', 'vertices' and 'indices'), less about managing 'objects'.
For objects, constructors and destructors may be useful, but there are plenty cases in C++ where they are not. Textures and meshes in a 3D API are actually a good example of where destructors are quite useless: when the 'owner' of the CPU-side texture object is done with the object, that doesn't mean the GPU is done with the data that's 'owned' by the CPU-side object. Traditional destructors can be used, of course, by delaying the destruction of the CPU-side texture object until the GPU is done with the data, but why keep the texture object around when only the pixel data is needed by the GPU, not the actual texture object?
A 'defer mechanism' is just the right sweet spot for a C like language IMHO.
> Something like a file or connection is already not just 'plain old data' though.
OK? But now you need to… have two different mechanisms, when one can do both?
> In my mind, 'data' are the pixels in a texture (while the texture itself is an 'object'), or the vertices and indices in a 3D mesh (while the 'mesh' is an object), or the data that's read from or written to files (but not the 'file object' itself).
This “data” does not have a destructor, and thus is not a concern.
> C is all about data manipulation (the 'pixels', 'vertices' and 'indices'), less about managing 'objects'.
I think that would be news to every C program I’ve written.
> there are plenty cases in C++ where they are not
In which case you can just not have them, and not care.
> A 'defer mechanism' is just the right sweet spot for a C like language IMHO.
If by “a C like language” you mean “a language that refuses to take any complexity off of the user’s back”, then sure.
That link goes directly to the relevant section in the manual.
It includes a cc wrapper called `cedrocc` so you don’t need intermediate files, and it works hard to produce clean code for the generated parts. The rest is not modified at all. The goal is to be useful even if you only use it once to generate that repetitive code.
The pre-processor `cedro` depends only on the C standard library, and `cedrocc` depends on POSIX.
Uh, not to be That Guy, but what did you expect? Just compare K&R [1] and Stroustrup (any edition) [2] next to each other, and you will get a pretty strong hint of C being a smaller language. That's kind of the point, or it used to feel like it was anyhow.
Being a small language is not an excuse. Scheme is a small language, still has dynamic-wind (since R5RS, so no spring chicken). Smalltalk is a tiny language, but has BlockClosure#ensure:.
You can have basic safety and QOL features without making the language a monstrous beast. Hell, you can cut old garbage like K&R declarations or digraphs to make room. You can even remove iso646 and most of string.h as a gimme.
I mean, it's not that I didn't expect changes or didn't expect the language to be smaller - it's just that there are definitely things that I would want to bring with me over to C. One thing I really like is RAII - that I can ensure that things are cleaned up in a known, once-defined fashion and I don't have to worry about it everywhere I use a given object. I also generally like using early returns, which is somewhat more complicated with C, as I may need to have more cleanup code around. It can be somewhat mitigated by coding in a more functional style and passing the necessary parameters into a function, so I can have a different function just doing allocation and deallocation. But still, it's more verbose.
`defer` would to some degree solve that issue.
Similarly, I've been missing nullptr, just for the expressiveness. I like that C23 now includes it :)
Depending on your definition of “using”, you may not actually be able to, at least for now. If CPPReference's table[1] is to be believed, there are still important features that are either missing from GCC and Clang or implemented “partially”, which may mean different things for different features and compilers.
In my personal opinion, unless you're doing the project just for fun, it seems better to stick to C11/C17, at least for the next few years.
Default values for structs, only taking effect when static or used with designated initializers, would be good enough imo, since you can always pass a struct to a function.
So you could do foo({.bar=1}) and the rest would be default initialized.
#include <stdio.h>
#include <stdlib.h>

struct params {
    int a;
    void *p;
};

void
f (struct params p)
{
    printf ("p.a = %d, p.p = %p\n", p.a, p.p);
}

int
main (void)
{
    f ((struct params){.a = 42});
    f ((struct params){.p = f});
    f ((struct params){});

    exit (EXIT_SUCCESS);
}
Even though these are stack allocated, the missing fields are initialized to 0/NULL. The last case (no parameters) is new in C23. In C99 you had to use (struct params){0} to initialize it. https://en.cppreference.com/w/c/language/compound_literal
Yes, but they're default-initialized to 0 in this case, whereas it would be nice if there was some way they could be initialized to default, potentially non-zero values.
No, that's just creating an item 'blub' of struct 'bla_t'; it doesn't help with default values that the compiler could fill in when using designated init instead of zeroes. To borrow from your example:
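Presumably something along these lines, using the same fantasy-syntax convention as elsewhere in the thread (the defaults are written directly into the struct declaration; bla_t and its fields are taken from the example being discussed):

    /* NOT REAL CODE, FANTASY SYNTAX: default values in the struct declaration */
    struct bla_t {
        int a = 23;
        const char *hello = "Hello World!";
    };

    struct bla_t blub = { .a = 42 };  /* .hello not mentioned */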
The missing designated init item blub.hello would now be initialized to "Hello World!" by the compiler by looking up the default value in the struct declaration. Currently, missing designated init items are set to zero. And it would work in any other place where a bla_t is created:
struct bla_t blob = {};
This would initialize blob to its default state of blob.a = 23 and blob.hello = "Hello World!".
That's a fair point of course, but when you pass 'blub' into a library function, this function needs to fill in the zeroes with default values anyway, and that's just as opaque, and it causes extra runtime overhead which wouldn't be there if the compiler already knows the default values.
...because C99 allows designated initializers to show up multiple times, but that's all a bit too much macro magic for my taste, I'd really prefer the defaults in the struct declaration.
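For anyone following along, the macro trick being referred to presumably looks something like this; later designated initializers override earlier ones, which is valid C99 but draws a -Woverride-init warning from GCC (BLA_DEFAULTS is a made-up name):

    /* defaults collected in a macro, expanded before the user's overrides */
    #define BLA_DEFAULTS  .a = 23, .hello = "Hello World!"

    struct bla_t {
        int a;
        const char *hello;
    };

    /* .a = 42 overrides the default; .hello stays "Hello World!" */
    struct bla_t blub = { BLA_DEFAULTS, .a = 42 };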
The point is that foo() now needs to check every struct item for being zero and use the default value instead. If the default values would be listed in the struct declaration, the compiler could fill those in when tmp is created, without any additional runtime overhead.
It would also free up zero as being an actual value instead of standing for 'default value'.
If you want your structs to be zero initialized, then you just wouldn't declare default values in the struct declaration, and everything would work as before.
Also, once you initialize structs with values somewhere in the code (for instance with designated init), there's a high chance that the compiler will put a copy into the data section anyway, which is then memcpy'ed into the runtime struct (it depends on the compiler and compile options).
Wouldn't that be just as hard as for any other function? The name of a function (a bit like the name of an array) evaluates to basically a function pointer value, there is little difference between the two calls here:
#include <stdio.h>

int foo(int a, int b)
{
    return a + b;
}

int main(void)
{
    printf("Direct: %d\n", foo(1, 2));

    int (*ptr)(int, int) = foo;
    printf("Indirect: %d\n", ptr(1, 2));

    return 0;
}
of course default arguments, if added, would have to be part of the function pointer type as well, making the above:
int foo(int a = 1, int b = 2) // NOT REAL CODE, FANTASY SYNTAX
{
    return a + b;
}

int main(void)
{
    printf("Direct default: %d\n", foo());

    int (*ptr)(int a = 1, int b = 2) = foo; // NOT REAL CODE, FANTASY SYNTAX
    printf("Indirect default: %d\n", ptr());

    return 0;
}
Unnamed function arguments would look silly (`int (*ptr)(int = 1, int = 2)`?), but I would be radical then and only support default argument values for named arguments, probably.
Edit: fixed a typo in the code, changed in-code comment.
I see the point. Out of curiosity, I thought up some approaches:
- disallow making function pointers of functions with default params
- require explicitly passing all params when used as a function pointer
- generate a trampoline function that supplies the default params and use that as the function pointer (sketched below)
- somehow include the default param values as part of the function pointer type
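The third option can already be written by hand in today's C, which gives a feel for what a compiler would have to generate (foo_defaults is a made-up name):

    int foo(int a, int b)
    {
        return a + b;
    }

    /* hand-written trampoline standing in for "int foo(int a = 1, int b = 2)" */
    static int foo_defaults(void)
    {
        return foo(1, 2);
    }

    int main(void)
    {
        int (*ptr)(void) = foo_defaults;  /* the pointer type only sees the no-argument signature */
        return ptr();
    }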
> Where the compiler puts on stack the default value, whenever the function call doesn't include it.
So if a dynamically linked library uses default values, and you make use of them, and the dynamically linked library decides to change its default values (e.g. a crypto library switches to more secure defaults), you don’t get the update until you recompile your own code.
That doesn’t sound great.
Isn’t it C# which uses this strategy, of embedding the defaults in the caller?
PS: I don’t think C is defined in terms of stack, and modern calling conventions use registers for at least the first few arguments.
That's IIRC how C++ also does this, so... yeah, it's not great but lots of things are not great about C and/or C++ and the standard advice is "yeah, don't do that if that hurts".
This only makes sense when it comes with named parameters which can be provided in any order, not the half-assed default-parameter implementation from C++.
Unix/Posix time doesn’t include leap seconds (it’s 86400 seconds per day, always, per definition [0]), whereas UTC, taken as a count of actual seconds, does include leap seconds. So you are right that converting between them requires additional data. However, few applications care about converting between Unix time and that notion of UTC, but instead care about converting between Unix time and calendar dates and times of day and time zones, which is easier with Unix time than with a UTC seconds count. So I think you’re mixing things up here.
It would be more accurate to say that Unix time "accounts for" leap seconds by not accounting for them at all, but rather switching into temporal displacement ambiguous repeat timestamp la la land for an entire second before returning to an unambiguous encoding of UTC.
Leap seconds are not counted in Unix time. Otherwise days with leap seconds would be counted as having 86401 seconds instead of 86400 seconds. This is not the case. In the example you cite, the Unix time resets to 915148800 after the leap second (1999-01-01T00:00:00), so the leap second isn't counted. Unix time differs from the true number of seconds since the epoch by the number of leap seconds (plus the UTC-TAI time difference between 1970 and 1972, before leap seconds were introduced).
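For concreteness, the usual illustration of that leap second looks like this (the exact behaviour during the leap second varies by system; many repeat or freeze the value):

    UTC                     Unix time
    1998-12-31T23:59:59     915148799
    1998-12-31T23:59:60     915148800   (leap second)
    1999-01-01T00:00:00     915148800   (value repeats)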
Most likely. They could add leap minutes, hours, or days instead. But they'll probably set UTC as a constant offset from TAI, and let the few uses that care about accurate solar time use UT1.
If you want something tied to Earth's rotation, UT1 is the way to go. If you don't care, TAI is good. UTC is in a weird compromise position, where it (traditionally) tried to be within 1s of UT1 but otherwise tick at the same rate as TAI. That compromise turned out not to be what anyone needs, so future uses will probably pick between UT1 and TAI/GPS/Unix/some other fixed offset from TAI.
The next version of C should redefine C as a strict subset of C++. That is, at any given moment, a particular revision of C should be a subset of a particular revision of C++. Each new version of C++ would then cause a revision of C to be released.
If anything, C++ could have decided to become a strict superset of C - they wouldn't even have to change too much and could do it without giving up backwards compat.
C would have to change significantly to become a proper subset of C++, and pretty much no C program would still compile (because of int *c = malloc(sizeof(int));).
Even if you're OK with breaking every call to malloc(), as the other poster said, plenty of C code has variables named things that clash with C++ reserved words, such as new. And while it's possible for programming languages to simply not have reserved words, I think C++ implementation developers and programmers would rather avoid the kinds of things clever C++ programmers would come up with were those guide rails to be taken off.
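Both problems fit into a couple of lines that compile as C but not as C++ (a deliberately contrived sketch):

    #include <stdlib.h>

    int main(void)
    {
        int *new = malloc(sizeof *new);  /* C: fine. C++: 'new' is a keyword, and void*
                                            doesn't implicitly convert to int* */
        free(new);
        return 0;
    }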
> Identifiers can be PL/I keywords or programmer-defined names. Because PL/I can determine from the context if an identifier is a keyword, you can use any identifier as a programmer-defined name. There are no reserved words in PL/I. However, using some keywords, for example, IF or THEN, as variable names might make a program needlessly hard to understand.
Please let's not bring all the design warts of C++ into C (instead only port the actually good ideas over, after they've been proven to be actually good ideas).
It would make more sense if C++ would reverse direction and become a superset of C (like ObjC chose to do from the start), but for C++ it's much too late now.
Can't be done: it would break backwards compatibility. E.g., `struct foo` and `foo` are the same type in C++ automatically (as if `typedef struct foo foo`), but not in C.
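A small illustration of that difference (the C++-only line is left commented out so the snippet stays valid C):

    struct foo { int x; };

    struct foo a;             /* valid in both C and C++ */
    /* foo b; */              /* C++ only: the struct tag is automatically a type name; an error in C */
    typedef struct foo foo;   /* the C idiom that C++ effectively gives you for free */
    foo c;                    /* now valid in C too */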
I write C every now and then and couldn't care less about C++. So the question is why? What would be the benefit? Do you really think that everyone who uses C uses C++ as well?
If you had told me that you were making use of templates, I would have believed it.
By the way, the C projects I used to work on 20 years ago took an hour to compile, and it wasn't worse than that thanks to ClearMake's sharing of object files.
In any case, even if templates make C++ slower to compile, I would rather have slower builds than more opportunities to keep security researchers busy.
I wonder why they’re disallowing _BitInt(1). It would be a signed integer type with the possible values -1 and 0.