
C is a relatively low level language, but it is not assembly language.

The difference is clear. Assembly language programs specify sequences of CPU instructions. C programs specify runtime behavior.


The C standard, since 1989, has said that attempting to modify the array object corresponding to a string literal has undefined behavior. Whether it "works" or not is not the issue.

The problem is that it's currently legal to pass a string literal to a function expecting a (non-const) pointer-to-char argument. As long as the function doesn't try to write through the pointer, there's no undefined behavior. (If the function does try to write through the pointer, the behavior is undefined, but no compile-time diagnostic is required.) If a future version of C made string literals const, such a program would become invalid (a constraint violation requiring a diagnostic). Such code was common in pre-ANSI C, before const was introduced to the language.

The following is currently valid C. The corresponding C++ code would be invalid. The proposal would make it invalid in C, with the cost of breaking some existing code, and the advantage of catching certain errors at compile time.

    #include <stdio.h>

    void print_message(char *message) {
        puts(message);
        // *message = '\0'; // would have undefined behavior
    }

    int main(void) {
        print_message("hello");
    }


> Whether it "works" or not is not the issue.

Of course it is. It doesn't work on anything modern, so portable code that actually runs in the real world and has to work can't have relied on it for a long time.

Your example is not code any competent C programmer would ever write, IMHO. Every proficient C programmer I've ever worked with used "const char *" for string literals, and called out anybody who didn't in review.

Old code already needs special flags to build with modern compilers: I think the benefit of doing this outweighs the cost of editing some makefiles.


A conforming implementation could make string literals modifiable, and (obviously non-portable) code could rely on that. I don't know whether any current compilers do so. I suspect not.

Apart from that, it's not about actually modifying string literals. It's about currently valid (but admittedly sloppy) code that uses a non-const pointer to point to a string literal. It's easy to write such code in a way that a modern conforming C compiler will not warn about.

That kind of code is the reason that this proposed change is not just an obvious no-brainer, and the author is doing research to find out how much of an issue it really is.

As it happens, I think that the next C standard should make string literals const. Any code that depends on the current behavior can still be compiled with C23 or earlier compilers, or with a non-conforming option, or by ignoring non-fatal warnings. And of course any such code can be fixed, but that's not necessarily trivial; making the source code changes can be a very small part of the process.

Any change that can break existing valid code should be approached with caution to determine whether it's worth the cost. And if the answer is yes, that's great.


> That kind of code is the reason that this proposed change is not just an obvious no-brainer

I don't understand your point here: I disagree this is "obvious", and I don't think I've said anything to imply that?

> And of course any such code can be fixed, but that's not necessarily trivial; making the source code changes can be a very small part of the process

In many cases, it's so trivial you can write code to patch the code. Often, the resulting stripped binary will be identical, so you can prove it isn't even necessary to test the result! If decision makers can be made to understand that, you can route around most of the corporate process that makes this sort of thing hard.

I've spent a lot of time fixing horrible old proprietary code to use const because I think it's important: most of the time, it's very easy. I don't deny there are rat's nests that require a lot of refactoring to unwind, but in my personal experience that's the exception rather than the rule.

It will be vanishingly rare that code will need to be modified in a way that actually changes its runtime behavior to tolerate the proposed change.


My point is that the risk of breaking existing code is the only reason not to apply this change to the standard.

My point is also that that's a valid reason to proceed carefully before making the change.

Even if the required source code changes are trivial or automatable, there will still be some variable amount of work required to deploy the changes. For a small program or library, maybe you can just rebuild and deploy. But for some projects, any change requires going through a full round of review, testing, recertification, and so on. For an update to code that controls a medical device or a nuclear reactor, for example, changing the code is the easy part.

I support the proposed change. I also support performing all due diligence before imposing it on all future implementations and C software.


> But for some projects, any change requires going through a full round of review, testing, recertification, and so on.

If the new binary is literally identical to the last one which was passed validation, absolutely zero additional testing is required. It is a waste of resources to retest an identical binary (assuming everything else can be held constant of course, which obviously can't always be the case).

Actually sending our hypothetical refactoring to production would itself be a waste of resources anyway, since the binary is identical... you just skip it, wait for the next real change, and then proceed as usual.

All processes have exceptions, the "binary identical output" is an easy one if your leadership chain is capable of understanding it.

And to be clear, "binary" here could absolutely mean "entire firmware image". The era of reproducible builds is upon us, and it is glorious.


Sure if the new binary is bitwise identical to the old one, there's no need to release it.

But ...

"The era of reproducible builds is upon us"

What about old code built with old toolchains? And what about organizational policies that require a full round of testing for any update? How hard do you think it would be to change such policies?

No doubt there's some software that could easily be modified, recompiled, and released. My point is that there is also some software that can't.

And yes, in those cases the likely solution is to leave the code alone and continue to build it with the old toolchain.

The point is that the proposed change will break existing valid code, and that has a non-zero cost. I support Jens Gustedt's effort to find out just what that cost is before imposing the change. (And again, I hope the change does go into the next edition of the standard.)


The most current SQLite amalgamation (3.49.1) shows ~70 warnings when compiled with -Wwrite-strings.

But maybe 70 warnings in 250k LoC is OK for your standards of proficiency.


Surely you agree that is a problem that ought to be fixed in that code?

70 warnings really doesn't sound that bad to fix. Most are probably trivial. I'm sure a few aren't.

If nobody is around to fix it, that's what legacy flags are for.


Different crew vehicles have different capabilities. Dragon is similar to Soyuz, Apollo, et al. It doesn't have a lot of maneuverability on the way down -- and if it lands in the ocean, it doesn't need it. (Soyuz lands on land; again, if it's off target it's no big deal.)

I think there were designs for a version of Dragon that could do propulsive landings, but that was abandoned.

Assuming they get Starship working reliably, it will do precise landings, probably with a tower catch, eventually with a crew on board.


Soyuz is single use: if the capsule gets banged up a bit on a hard earth landing, it's not a problem (well, assuming the people inside survive).

Dragons are reusable, and Fram2 was in fact the fourth mission of the same vehicle. So they need to be treated with kid gloves.


The study is based on 263 galaxies.

It should be fairly easy to determine the rotation direction of any (spiral) galaxy we can see, based on reasonable assumptions about the relationship between rotation and the configuration of the spiral arms. There should be thousands or millions of visible galaxies for which this could be determined (out of the estimated 2 trillion galaxies in the observable universe). Perhaps I'm missing something, but why bother reporting a result from such a tiny sample?

It should also be possible to derive more detailed information than just clockwise vs. counter-clockwise. The rotation of a galaxy defines a direction (the galaxy's rotational north pole) and a point on the surface of an imaginary sphere. This could be determined by the galaxy's apparent rotational direction, its orientation, and its position in the sky. It would be interesting to see a plot of those points. In principle, they should be random. (If the points spell out "Go stick your head in a pig", I'll be very sorry that Douglas Adams didn't live to see it.)


The author is the project editor for the ISO C standard.

(And I hardly think that speculating about the motivation for the author's chosen nickname is constructive.)


Great! I'm a programmer. And I've sure spent too much time on C++isms.

> (And I hardly think that speculating about the motivation for the author's chosen nickname is constructive.)

Nope! Gets right to it. This is really building C++ (but this time how I want). It adds work for every C programmer who has to check off a whole bunch of small tasks to keep a codebase living for many years.


"ThePhD" stands for "The Phantom Derpstorm", though.


Guys is it C++ when I don’t accidentally leak memory


Reminds me of the rats we used to have in our garage, that had built nests from our dryer lint.

Our dryer lint was largely cat hair.


My gf has a small, very sterile, scarcely decorated living room. A mouse chose to live under the fake fireplace. The funny part is that she has 7 cats. When they aren't looking, he steals their cat food and drinks their water.

I also saw a cat hunting tiny mice on a tiny strip of grass in the middle of a busy intersection. Apparently it's a great spot for a nest, surrounded by large roaring vehicles that kill everything in their way.


Mounted on a coconut.


I deal with the main/master brouhaha by using a script I wrote that determines the name of the appropriate branch:

    #!/bin/bash
    
    git remote show origin | sed -n '/^ *HEAD branch: */s///p'
It's in my `$HOME/bin` as `git-master`, symlinked as `git-main`.

    git switch $(git master)
(`git foo` finds and executes a `git-foo` command anywhere in $PATH, a handy feature if you want to implement your own extensions.)

(This is of course irrelevant to the topic of the top-level post.)


Ah neat! Might have to steal that. Let's just hope nobody decides that "origin" is racist too...


Leap seconds should be replaced by large rockets mounted on the equator. Adjust the planet, not the clock.


It'd be not-so-funny if there was a miscalculation and the Earth was slowed down or sped up too much. There's a story about the end of times and the Antichrist (Dajjal) in the Muslim traditions where this sort of thing actually happens. It is said that the "first day of the Antichrist will be like a year, the second day like a month, and third like a week", which many take literally, i.e. a cosmic event which actually slows down the Earth's rotation, eventually reversing course such that the sun rises from the West (the final sign of the end of humanity).


Neither C nor POSIX requires time_t to be signed.

The Open Group Base Specifications Issue 7, 2018 edition says that "time_t shall be an integer type". Issue 8, 2024 edition says "time_t shall be an integer type with a width of at least 64 bits".

C merely says that time_t is a "real type capable of representing times". A "real type", as C defines the term, can be either integer or floating-point. It doesn't specify how time_t represents times; for example, a conforming implementation could represent 2024-12-27 02:17:31 UTC as 0x20241227021731.

It's been suggested that time_t should be unsigned so a 32-bit integer can represent times after 2038 (at the cost of not being able to represent times before 1970). Fortunately this did not catch on, and with the current POSIX requiring 64 bits, it wouldn't make much sense.

But the relevant standards don't forbid an unsigned time_t.


Apparently both Pelles C for Windows and VAX/VMS use a 32-bit unsigned time_t.

