In the really simple cases (like accessing a float as a long, or similar), you'r...

DSMan195276 · on Feb 14, 2017

> I have a pointer to a struct b - can I cast it to a pointer to struct a and still dereference 'variant'?

I think it is a bit of a gray area, but personally I've always held the opinion/understanding of no, that is invalid. The C standard does make one point fairly clear: strict-aliasing revolves around the idea that an object can only be considered to be one type of data (Or a `char` array, the only exception). Your example would be invalid for the reason that you can't treat an object of `struct b` as though it is a `struct a` - the fact that they share the same preamble doesn't change that. To be clear with what I'm saying: `struct b` can alias an `int`. `struct a` can also alias an `int`. But `struct b` can't alias a `struct a`, and because of this an `int` accessed through a `struct a` can't be accessed through a `struct b`.

That said, in general I find this to usually be a fixable problem, which also has (IMO) a cleaner implementation:

    struct head {
        int variant;
    };

    struct a {
        struct head h;
    };

    struct b {
        struct head h;
        long data;
    };

Now you can take a pointer to a `struct b` object and treat it like a pointer to a `struct head` object (Because it is a `struct head` object). You could do the same thing with objects of type `struct a`. So now you can cast both of them to `struct head` and examine the `variant` of both. Then later you could cast it back to the proper type.
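A minimal sketch of that pattern, assuming hypothetical tag values `VARIANT_A` and `VARIANT_B` and a hypothetical helper `payload_of` (none of these names come from the original question):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tag values, only for illustration. */
enum { VARIANT_A = 1, VARIANT_B = 2 };

struct head {
    int variant;
};

struct a {
    struct head h;
};

struct b {
    struct head h;
    long data;
};

/* Inspect the shared header, then cast back to the real type.
 * Returns the payload of a struct b, or 0 for anything else. */
long payload_of(struct head *h)
{
    assert(h != NULL);
    if (h->variant == VARIANT_B) {
        /* The object really is a struct b, and a pointer to a struct's
         * first member may be converted back to the enclosing struct. */
        struct b *pb = (struct b *)h;
        return pb->data;
    }
    return 0;
}

long demo_payload(void)
{
    struct b obj = { { VARIANT_B }, 42L };
    return payload_of(&obj.h);
}
```

Every access here goes through either `struct head` or the type the object was actually declared as, which is the point of the layout.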

This approach to aggregate types is heavily used in the Linux Kernel and other places (Including most of my own code). The `container_of` macro makes it even nicer to use (Though the legality of the `container_of` macro is an interesting debate...).
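For reference, a simplified sketch of the idea behind `container_of`. This is not the kernel's exact definition (which adds type checking via compiler extensions), and `struct c` and `data_from_head` are hypothetical names. It shows the case where the header is not the first member, which is where the legality debate comes in:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct head {
    int variant;
};

/* Hypothetical type where the header is NOT the first member. */
struct c {
    long data;
    struct head h;
};

long data_from_head(struct head *h)
{
    assert(h != NULL);
    /* Assumes h really points at the h member of a struct c. */
    struct c *pc = container_of(h, struct c, h);
    return pc->data;
}

long demo_container_of(void)
{
    struct c obj = { 7L, { 3 } };
    return data_from_head(&obj.h);
}
```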

> The BSD socket API was built on exactly this kind of type punning.

Kinda. It's actually surprising how close it comes to skirting this issue (And it does skirt it), but I believe it's actually possible to use BSD sockets without ever violating the strict-aliasing rule (Though of course, there are ways of using it which would arguably violate the rule). In most cases for BSD sockets, strict-aliasing is more likely going to be broken in the kernel, not your code.

To note though, the strict-aliasing rule only applies to dereferencing pointers. You can cast pointers back and forth all day, you just have to ensure that when you're done you're treating the object as the type you originally declared it. Thus, if you pass a `struct sockaddr_in` to `bind` and cast it to a `struct sockaddr`, the strict-aliasing rule isn't violated because you never dereferenced the cast pointer.

Going along with that, as long as you correctly declare your `struct sockaddr`s from the beginning you won't have any strict-aliasing woes. The only situation where this could technically be a problem is `accept` and `recvfrom`, since they are the only functions that give a `struct sockaddr` back. But assuming you already know what address-family the socket is using, you can declare the correct `struct sockaddr` for that family from the start, cast it and pass it to `accept` or `recvfrom`, and then use it as your originally declared type without breaking strict-aliasing.
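A rough sketch of what that looks like for an `AF_INET` UDP socket that sends a datagram to itself. The function name `recv_from_self` is made up, and `getsockname` is only there to discover the port the kernel picked (it is called with the same cast-only pattern). The user code never dereferences a `struct sockaddr *`:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the sender address read back is 127.0.0.1, 0 if not,
 * and -1 on any socket error. */
int recv_from_self(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Declared as the real type from the start. */
    struct sockaddr_in local;
    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    local.sin_port = 0; /* let the kernel choose a port */

    socklen_t len = sizeof local;
    /* The casts only change the pointer type; nothing here reads
     * through a struct sockaddr. */
    if (bind(fd, (struct sockaddr *)&local, sizeof local) < 0 ||
        getsockname(fd, (struct sockaddr *)&local, &len) < 0 ||
        sendto(fd, "x", 1, 0, (struct sockaddr *)&local, sizeof local) != 1) {
        close(fd);
        return -1;
    }
    assert(len == sizeof local);

    struct sockaddr_in peer;
    socklen_t peer_len = sizeof peer;
    char buf[1];
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &peer_len);
    close(fd);
    if (n != 1)
        return -1;

    /* peer is used as the type it was declared with. */
    return peer.sin_family == AF_INET &&
           peer.sin_addr.s_addr == htonl(INADDR_LOOPBACK);
}
```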

Of course, it's also worth keeping in mind that the BSD sockets interface came before C89. You definitely wouldn't design it the same way if you were to do it today.

caf · on Feb 15, 2017

> Kinda. It's actually surprising how close it comes to skirting this issue (And it does skirt it), but I believe it's actually possible to use BSD sockets without ever violating the strict-aliasing rule (Though of course, there are ways of using it which would arguably violate the rule). In most cases for BSD sockets, strict-aliasing is more likely going to be broken in the kernel, not your code.

Well, firstly it's pretty unsatisfying to hear that yes, this API contravenes strict aliasing restrictions, but only on the library side! - essentially that it is impossible to implement the sockets C API in C.

That aside, this still excludes long-standing examples like embedding a pointer to struct sockaddr in your client struct, which points to either a sockaddr_in, sockaddr_in6 or sockaddr_un depending on where that client connected from (well, you can still do it, but now you can't examine the sockaddr's sa_family member to see what type the address really is - you need to have a redundant, duplicate copy of that field in the client struct itself).

The situation is similar with sockaddr_storage. The whole point of that type is to allow you to stash either AF_INET or AF_INET6 addresses in the same object and then examine the ss_family field to see what it really is - the text in POSIX says:

  The <sys/socket.h> header shall define the sockaddr_storage structure, which shall be:

    Large enough to accommodate all supported protocol-specific address structures

    Aligned at an appropriate boundary so that pointers to it can be cast as pointers to protocol-specific address structures and used to access the fields of those structures without alignment problems

( http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys... )

> Of course, it's also worth keeping in mind that the BSD sockets interface came before C89. You definitely wouldn't design it the same way if you were to do it today.

Well, the aforementioned sockaddr_storage came about after C89.

And wasn't C89 supposed to be about codifying existing practice, anyway?

DSMan195276 · on Feb 15, 2017

> Well, firstly it's pretty unsatisfying to hear that yes, this API contravenes strict aliasing restrictions, but only on the library side! - essentially that it is impossible to implement the sockets C API in C.

Then it'd also be pretty unsatisfying to hear that it's impossible to write an OS kernel in standard C too - it requires compiler extensions for lots of various details. It's impossible to write a standard C library in nothing but standard C as well. I would absolutely agree, however, that the fact that you have to use an extension to get past strict-aliasing is unfortunate (It'd be nice if in a future C they add something like the `may_alias` extension into the standard). But you can do it, and OS code is definitely one where extensions are going to be rampant anyway. For example, the Linux Kernel disables strict-aliasing completely.

> That aside, this still excludes long-standing examples like embedding a pointer to struct sockaddr in your client struct, which points to either a sockaddr_in, sockaddr_in6 or sockaddr_un depending on where that client connected from (well, you can still do it, but now you can't examine the sockaddr's sa_family member to see what type the address really is - you need to have a redundant, duplicate copy of that field in the client struct itself).

Strictly speaking, that's not true. You can examine it directly, but it requires casting the `struct sockaddr * ` to a `sa_family_t * ` instead. This is valid because regardless of what type of object it really is, it must start with a `sa_family_t` entry, so you are treating it as the correct type (And by that notion, `sa_family_t` is allowed to alias any of the `struct sockaddr` types). Besides that, you could also use a `union` to combine all the possible `sockaddr`s that you're going to handle together with a `sockaddr_storage`. Then you can do things like normal and simply access the `sockaddr` through the correct union member.
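A sketch of the `union` variant, with a made-up type name `any_sockaddr` and helper `port_of`. It assumes the family field sits at the same offset in every member (true on common platforms, though as discussed in this thread POSIX does not obviously promise it), and it leans on reading a union member other than the one last written:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical union covering every address type we intend to handle. */
union any_sockaddr {
    struct sockaddr         sa;
    struct sockaddr_in      in;
    struct sockaddr_in6     in6;
    struct sockaddr_un      un;
    struct sockaddr_storage ss;
};

/* Port in host byte order, or -1 for families that have no port. */
int port_of(const union any_sockaddr *u)
{
    assert(u != NULL);
    switch (u->sa.sa_family) {
    case AF_INET:
        return ntohs(u->in.sin_port);
    case AF_INET6:
        return ntohs(u->in6.sin6_port);
    default:
        return -1;
    }
}

int demo_port_of(void)
{
    union any_sockaddr u;
    memset(&u, 0, sizeof u);
    u.in.sin_family = AF_INET;
    u.in.sin_port = htons(8080);
    return port_of(&u);
}
```

You would pass `&u.sa` (with `sizeof u` as the length) to the socket calls and then pick the member that matches the family.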

> Well, the aforementioned sockaddr_storage came about after C89.

That it did. However, that was more about making the current API work with larger addresses (namely ipv6) not to fix the API. There really wasn't any other way to do it.

> And wasn't C89 supposed to be about codifying existing practice, anyway?

Some compilers implemented the strict-aliasing optimization (And did lots of other strange things), some did not. The C standards committee chose to go with strict-aliasing since it provides a lot of optimization opportunities.

All that said, I'm not saying things are perfect by any means, as our conversation here shows. I don't think things are nearly as bad as people tend to think, however. Generally speaking, unless you're doing something a little bit shady it's possible to avoid any strict-aliasing issues, and if it's really not possible there's generally a way to simply "turn it off", albeit that may result in a little less portable code - though if you're breaking strict-aliasing the portability of your code is already a bit suspect to begin with.

caf · on Feb 15, 2017

> Strictly speaking, that's not true. You can examine it directly, but it requires casting the `struct sockaddr * ` to a `sa_family_t * ` instead. This is valid because regardless of what type of object it really is, it must start with a `sa_family_t` entry,...

No, that isn't guaranteed by POSIX - it has to have an sa_family_t member, but it doesn't have to be the first one.

I also think it's problematic that the use explicitly contemplated by POSIX is considered ill-formed C.

> Besides that, you could also use a `union` to combine all the possible `sockaddr`s that you're going to handle together with a `sockaddr_storage`. Then you can do things like normal and simply access the `sockaddr` through the correct union member.

I do not think this is that easy when you include sockaddr_un into the mix, because of the way that sockaddr doesn't include the full size of its path member. This is, in fact, the point that I throw up my hands and use -fno-strict-aliasing because the fact that pointer provenance, rather than just value and type, is important together with the fact that it's not actually clear whether you've correctly laundered the pointer through a union or not, makes it all too... grey.

> Some compilers implemented the strict-aliasing optimization (And did lots of other strange things), some did not.

C compilers existed in 1989 that would assume different structure types with common initial sequences couldn't alias? With the "common initial sequence" carve-out in §3.3.2.3? I am sceptical...

DSMan195276 · on Feb 16, 2017

I'm gonna try to address everything I can in this reply. Sorry it has taken so long.

> C compilers existed in 1989 that would assume different structure types with common initial sequences couldn't alias? With the "common initial sequence" carve-out in §3.3.2.3? I am sceptical...

That's not quite what I was talking about. What some compilers would do is not generate extra reads when you had a `long * ` and a `int * ` in the same scope - with the idea being that those two cannot point to the same data, and thus it is not necessary to reread the data from the `int * ` when you write to the `long * `. Compilers have now taken it a slight step further - an `int` that belongs to a `struct a` and an `int` that belongs to a `struct b` can't alias - but it is really not much different from the original idea (And hence why it is legal). What the standard really describes is that objects have a single defined type and that accessing objects through a type other than their original type is invalid, which fits with what compiler writers have taken to doing. That said, I would not be opposed to the standard simply making that legal. While avoidable in most cases, it does cause problems in some instances (BSD sockets being a very notable example), and I'd wager it only brings marginal optimizations (for which `restrict` already provides a solution).
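To make the `long * ` / `int * ` case concrete, here is a small sketch (the function names are made up). Whether a given compiler actually skips the reload depends on the compiler and optimization level; the point is only that it is allowed to:

```c
#include <assert.h>
#include <stddef.h>

/* Because an int and a long are different types, the compiler may assume
 * the store through l cannot change *i, and so may return the constant 1
 * without rereading *i from memory. */
int store_both(int *i, long *l)
{
    assert(i != NULL && l != NULL);
    *i = 1;
    *l = 2;
    return *i;
}

/* Well-defined use: the two pointers refer to distinct objects. */
int demo_store_both(void)
{
    int x = 0;
    long y = 0;
    return store_both(&x, &y) == 1 && y == 2;
}
```

Calling `store_both` with both pointers aimed at the same storage is exactly the kind of access the rule makes undefined.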

> I do not think this is that easy when you include sockaddr_un into the mix, because of the way that sockaddr doesn't include the full size of its path member. This is, in fact, the point that I throw up my hands and use -fno-strict-aliasing because the fact that pointer provenance, rather than just value and type, is important together with the fact that it's not actually clear whether you've correctly laundered the pointer through a union or not, makes it all too... grey.

Technically, you could use a `char` array for the `sockaddr_un`, and then just cast it to the right type. That's legal because `char` can alias. That said I'm fairly sure that `sockaddr_un` has a defined size - it doesn't use a FAM in implementation, it's just that the length of its path member can vary. The POSIX standard isn't as clear as can be, but notes that it is left undefined only for the reason that different Unixes use different max lengths, and it says that it's typically somewhere in the range of 92 to 108. That along with the typical usage of `sockaddr_un` implies to me that it is perfectly fine to declare one, it just doesn't have a guaranteed length. Used in a `union` it should be fine. (All that said, I think what you've said also shows another current issue with C - there should be a way to statically declare a `struct` that has a FAM at the end by providing the length for the FAM. There's no way to do this currently except using a `char` array and casting, which is not an acceptable solution IMO).

On that note though, the entire issue here could actually be largely resolved by simply adding the `may_alias` gcc attribute to the definition of `struct sockaddr` (And `struct sockaddr_storage`). It would declare that a `sockaddr` can alias any other type you want, and thus would make it legal to read from a `sockaddr` and then cast it to another type for use - removing the need for the `union` BS and all the other various hacks to get around this issue. Obviously that's not standard C, but I think it makes a pretty good argument that adding something like `may_alias` to the standard would be a very good addition.
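As a rough illustration of what that would look like, here is a sketch using hypothetical stand-in types (`generic_addr`, `inet4_addr`) rather than the real system headers, which you can't annotate yourself. `may_alias` is a GCC extension (Clang accepts it too), so this is not standard C:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sockaddr. The attribute tells the
 * compiler that accesses through this type may alias any other type. */
struct generic_addr {
    unsigned short family;
} __attribute__((__may_alias__));

/* Hypothetical stand-in for a protocol-specific address type. */
struct inet4_addr {
    unsigned short family;
    unsigned short port;
    unsigned int   addr;
};

/* Reads the family through the generic type, whatever the object really is. */
unsigned short family_of(const struct generic_addr *g)
{
    assert(g != NULL);
    return g->family;
}

unsigned short demo_family_of(void)
{
    struct inet4_addr a = { 2, 80, 0 };
    return family_of((const struct generic_addr *)&a);
}
```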

And that touches on the larger problem with strict-aliasing that I see - there's no way to avoid it. We have `restrict` which ironically allows us to avoid the aliasing problem for pointers to which strict-aliasing doesn't apply, but we have no way to tell the compiler two pointers (or types) can alias when it thinks they can't. `may_alias` is one solution, but really any solution that fixes that problem would be extremely welcome in my book. I think the standards writers currently consider `union` to be the solution, but IMO that's simply not sufficient.

> No, that isn't guaranteed by POSIX - it has to have an sa_family_t member, but it doesn't have to be the first one.
>
> I also think it's problematic that the use explicitly contemplated by POSIX is considered ill-formed C.

As long as all the `sa_family_t` members in all of the various `sockaddr` types overlap then you could make it work (If they don't overlap I feel like that would create lots of other issues). Obviously though this is a pretty clumsy solution.

And I would agree - I wouldn't say it's anybody's particular fault that we've hit this particular point (Though you could argue that compiler writers jumped the gun on this one), but it is an issue worth addressing. I do think it's possible to use it correctly through the usage of a few different techniques, but 1. most programs already written don't do that, and 2. like I said before, you shouldn't have to go through a million hoops (that aren't even mentioned) to use the interface correctly.