Hacker News
Nil in Go is typed in theory and sort of untyped in practice (utcc.utoronto.ca)
127 points by goranmoomin on March 31, 2021 | 68 comments



For some context, I've been programming in Go for about six years now, and I've literally never even thought about this.

In general, you really haven't got much reason to be worried about something's "interface type", because if you're really worried about something's "type" (which can happen) you are almost certainly worried about its "concrete type". The only interesting thing you can ask about the interface type directly would be "Does the value inside this interface implement this interface?", and the answer is statically yes, the compiler assured that, and the interface type is statically assured at runtime, so it's not a very interesting question. Even if fmt added a way to "see" the interface type passed in, I can't imagine what it would be useful for, because it could only ever be:

     var x SomeInterfaceType

     // some number of lines of code

     fmt.Printf("%I\n", x)
which would statically print "SomeInterfaceType".

So, I mean, it's not like anything said in that blog post is false necessarily, but if you're collecting reasons to hate on Go's nil handling, there's no real justification in adding this to your list. I'd also add that I answer newbie questions a lot on Reddit, and nothing even remotely like this has ever come up there, either.


I programmed in Go for three years, and got bitten by this at least twice.

Being able to determine if a value that implements an interface is nil or not would really be nice.


That's actually not what this blog post was talking about. That's a separate issue. My commentary can be found at https://news.ycombinator.com/item?id=26636227 , and see the grandchild as well.


How does nil implement an interface?


A pointer to a type that implements an interface can be nil, but a function that takes said pointer as an interface type will not be able to distinguish between a pointer that points to an allocated object and a nil one.
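
A minimal sketch of the situation described above, assuming a hypothetical helper `isNilReader` that does the obvious `== nil` check on an `io.Reader` parameter (`nilBuffer` is also made up for illustration). The check only sees whether the interface value itself is nil, not whether the pointer stored inside it is:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// isNilReader is a hypothetical helper: it reports whether the interface
// value itself is nil. It cannot see whether a pointer stored inside the
// interface is nil.
func isNilReader(r io.Reader) bool {
	return r == nil
}

// nilBuffer returns a nil pointer of a concrete type that implements io.Reader.
func nilBuffer() *bytes.Buffer { return nil }

func main() {
	var empty io.Reader          // nil interface: no dynamic type, no value
	liveBuf := new(bytes.Buffer) // non-nil pointer

	fmt.Println(isNilReader(empty))       // true
	fmt.Println(isNilReader(nilBuffer())) // false: the interface holds a (*bytes.Buffer)(nil)
	fmt.Println(isNilReader(liveBuf))     // false
}
```

From inside `isNilReader`, the second and third calls look the same: both are non-nil interface values.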


Attempting to call a method on a nil interface value produces a runtime panic. Otherwise, it is the type that implements an interface, and it's possible that the type works even if the specific value of the type is nil. For instance, you could have a 'myIntPointer' type that is a pointer to an int, and implement a String() method for myIntPointer, and at that point a nil myIntPointer still has a String() method and implements the fmt.Stringer interface (although it may get a runtime panic if you actually call the method).
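
A rough sketch of that idea. One caveat: Go does not allow methods to be declared on a named type whose underlying type is a pointer, so this sketch assumes a hypothetical named type `myInt` and declares `String()` on `*myInt` instead of using a literal `myIntPointer`. The method checks for a nil receiver, so it does not panic:

```go
package main

import "fmt"

// myInt is a hypothetical named type. The method is declared on *myInt
// because Go rejects methods on a named pointer type.
type myInt int

// String works for a nil receiver because it checks before dereferencing.
func (p *myInt) String() string {
	if p == nil {
		return "<nil myInt>"
	}
	return fmt.Sprintf("myInt(%d)", int(*p))
}

func main() {
	var p *myInt           // nil pointer
	var s fmt.Stringer = p // s is a non-nil interface holding a nil *myInt

	fmt.Println(s == nil)   // false
	fmt.Println(s.String()) // <nil myInt>

	v := myInt(7)
	fmt.Println((&v).String()) // myInt(7)
}
```

Without the `p == nil` check, the dereference `*p` would panic at runtime for a nil receiver, which is the "may get a runtime panic" case mentioned above.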


I agree that you won't think about this too often. However, if you're trying to use the reflect package to get type information about variables, you may end up having to think about it (and like me, you may end up feeling very confused!).
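
For illustration, a small sketch of what the reflect package reports in these cases. The helper names `dynamicType`, `readerType` and `nilBuffer` are made up for this example. `reflect.TypeOf` takes an `interface{}`, so it only ever sees the dynamic type and returns nil for a nil interface value; getting at an interface type itself needs the pointer-then-`Elem()` idiom:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"reflect"
)

// dynamicType returns the name of the dynamic type stored in r,
// or "<nil>" when r is a nil interface value.
func dynamicType(r io.Reader) string {
	t := reflect.TypeOf(r) // the static type io.Reader is not visible here
	if t == nil {
		return "<nil>"
	}
	return t.String()
}

// readerType obtains the interface type itself via a pointer to it.
func readerType() string {
	return reflect.TypeOf((*io.Reader)(nil)).Elem().String()
}

func nilBuffer() *bytes.Buffer { return nil }

func main() {
	fmt.Println(dynamicType(nil))         // <nil>
	fmt.Println(dynamicType(nilBuffer())) // *bytes.Buffer
	fmt.Println(readerType())             // io.Reader
}
```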


The predeclared identifier nil is a constant, like 0 or "".

Like 0 or "", nil is untyped. Go has both typed and untyped constants. 0 also is untyped, which is why you can write x==0 regardless of whether x is an int32 or an int64.

Unlike 0 or "", nil does not have a default type. The default type of a constant is the type that it is implicitly converted to when a type is required. For example, "x := 0" declares a new value x with the value 0, but 0 is untyped. Untyped integer constants have a default type of int, so x is given the type int.

Since nil does not have a default type, it's an error to write "x := nil".

You can declare a typed constant with the value 0 or nil. int32(0) is a constant with type int32 and the value 0. (*int32)(nil) is a constant with the type pointer-to-int32 and the value nil.
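
A short sketch of the default-type rules described above, using a hypothetical `typeName` helper built on fmt's `%T` verb to show which type each declaration ends up with:

```go
package main

import "fmt"

// typeName returns the dynamic type of v as printed by the %T verb.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	x := 0             // untyped constant 0 takes its default type, int
	var y int64 = 0    // the same constant, typed by the declaration
	f := 1.5           // untyped floating-point constants default to float64
	s := ""            // untyped string constants default to string
	p := (*int32)(nil) // nil has no default type, so the type is spelled out
	// z := nil        // does not compile: nil has no default type

	fmt.Println(typeName(x), typeName(y), typeName(f), typeName(s), typeName(p))
	// int int64 float64 string *int32

	fmt.Println(y == 0, p == nil) // true true
}
```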

Variables of an interface type can hold any value that satisfies the interface. For example, a value of type io.Reader can hold any value that has a Read method with the appropriate signature.

The common confusion about nil occurs because you can compare an interface value to a non-interface value.

  var b *bytes.Buffer       // b is a variable with type *bytes.Buffer
  var r io.Reader           // r is a variable with type io.Reader
  r = b                     // r now holds a value with type *bytes.Buffer and value nil
  b == nil                  // true: b is nil
  r == b                    // true: r and b are equal
  r == (io.Reader)(nil)     // false: r is not a nil io.Reader
  r == (*bytes.Buffer)(nil) // true: r contains a nil *bytes.Buffer
  r == nil                  // false: r is not a nil io.Reader
 
The confusing part is that last line. How can r not be nil, when it contains a nil value? And the reason is hopefully apparent from the previous two lines: It depends on the type of "nil".

Whether you find this a wart in the language or not, this is not a case of nil being either typed in theory or untyped in practice. The predeclared identifier "nil" is an untyped constant.


The Go specification is careful to not call 'nil' a constant, and in fact at one point specifically says that it isn't ("Conversions", in the section on converting constants into typed constants, which actually uses '(*int)(nil)' as an example of something that is not a typed constant). Also, although it wasn't clear in the entry, the original article that it was a reaction to talked specifically about 'nil variables' (ie, variables with the value of nil).

(Even the concept of 'the value of nil' is tricky in Go; I believe the specification only talks about things being comparable to nil or allowing nil to be assigned to them. The specification really goes to a lot of work to not treat nil as a value, exactly. I suspect that the Go spec authors really did not want a rerun of the C idea that the NULL pointer is '0' and has an all-zero value and so on.)

(I'm the author of the linked-to entry.)


> Like 0 or "", nil is untyped. Go has both typed and untyped constants... Unlike 0 or "", nil does not have a default type.

So it feels like you're trying to draw some distinction here that is like, "look, nil isn't really that special - there's groups A and B, and the core confusion is people think nil is in group A while the truth is it's in group B." And it's true, Go has both typed and untyped constants.

But as far as I know, nil is the only untyped constant without default type. So "group B" is still just nil, and nil remains a sui generis source of confusion in Go.


Yes, but giving nil a default type would be way more confusing.


Go already kind of does though, when you use it in a type switch nil has type nil. For binding purposes, `interface{}` also seems reasonable. (Or only half-joking, if I wanted maximum pragmatism for minimal clarity, nil's default type should be error.)
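
To make the type-switch remark concrete, here is a small sketch with a hypothetical `classify` function. The `case nil` arm matches only a nil interface value; a nil pointer stored in the interface still matches its pointer type:

```go
package main

import "fmt"

// classify reports which type-switch case matches v.
func classify(v interface{}) string {
	switch v.(type) {
	case nil:
		return "nil interface"
	case *int:
		return "*int"
	default:
		return "something else"
	}
}

func main() {
	var p *int
	var err error

	fmt.Println(classify(nil)) // nil interface
	fmt.Println(classify(err)) // nil interface: a nil error converts to a nil interface{}
	fmt.Println(classify(p))   // *int: the nil pointer still carries its type
}
```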

It seems the root problem is that interfaces and concrete values occupy the same semantic positions. But that's way too hard a problem to fix; if I had a magic wand to update all existing code I'd probably just have `== nil` on an interface check for both kinds and try to rescue transitivity.


At the risk of inviting a flame-war, it's distressing to me that a language this recent, with such a purported focus on simplicity, has design-problems this fundamental


I have a hard time classifying this as a "design problem" when it has zero practical implications. I've never encountered a problem that can be traced back to this, nor ever helped anyone else with a problem that can be tracked back to this. If it is a "design problem" it is of the weakest possible class. I can't imagine what would "fix" it that wouldn't be a larger design problem of its own.

The whole "confused about nil interface vs. interface containing nil pointer of some type" issue, while I would contend it has a root cause of improperly importing the idea from C++ that nil is automatically invalid (when in fact it is perfectly valid in Go to call methods on nil) and thus confusing "nil" with "invalid", is at least a cause of real bugs and real misunderstandings.


Sorry, you won't get much agreement from me on this. There's only two kinds of languages, the kind with fundamental design problems and the kind people don't use.

If you are looking for the perfect language may I suggest Forth on a Z80? Otherwise there's going to be a pile of shit hiding somewhere.


I disagree. Issues like this on really core language primitives usually crop up when either:

1) The language lacked hindsight in its domain because there wasn't a ton of prior art (C)

2) The language didn't have enough foresight put into it (JavaScript)

3) The language's modern usage has drifted significantly far from its original usage (Java)

Python is an example of a language people use that doesn't suffer from #1 (thanks to Perl and friends), doesn't really suffer from #2 (at least, Python 3) and doesn't (yet) suffer too badly from #3. It has plenty of flaws and limitations, but not in its foundation. The core language primitives are rock-solid.

#1 doesn't apply to Go, and it's not old enough for #3 to apply. So that seems to only leave #2.

(I should add that this doesn't make a language completely invalid or useless. I use (and even like!) JS despite its flaws; in fact I prefer it to Python overall, mostly due to the ecosystem. It's just disappointing to see issues of type #2, because they seem so avoidable.)


I think Python suffers from #3, as they've been adding new (controversial) syntax like typing, the := operator, and pattern matching. These were intended to help write real-world code at the cost of simplicity and applicability as a teaching language.


It's starting to get there, yeah, though I think any sufficiently-popular language eventually gets there at a certain age


Eh, parsimony is nice to have, but at the end of the day, programming language design is just plain old engineering and engineers make tradeoffs to ship functional code that isn’t too terrible.


It's pointless discussing this. There is no perfect language and every programming language I've ever seen has design problems at one level or another. I know enough of CL, Pascal, Modula, Ada, Rust, Nim, etc., to know their weaknesses, too.

Your C example is also not convincing. Lisp and Algol were already strong contenders at the time. People chose C because they needed a fast and close to metal language to create their own operating system for mainframes. It was a hack just like your category 2. Still, it's a great language.


> There is no perfect language

> It's pointless discussing this

I don't see how the first implies the second. This is a programming forum; it's good to discuss and debate the specifics of these things.

C may not have been the best example of the category, but I think the category stands. Rust for example, much as I love it, is exploring uncharted territory. I wouldn't be surprised if 10 years from now we have ownership-based languages that are much easier to use, simply because lessons will have been learned by that point. Arguably Java's OOP fundamentals are not as good as C#'s because Java had to explore that territory first (setting aside technicalities about what the term "OOP" really means). Etc.


The problem is that decades of experience have shown that only a few programmers can discuss the advantages and disadvantages of programming languages for certain purposes in a halfway objective manner, and since few of them implement their own languages, even the good discussions often remain fruitless.

I'd recommend Reddit's r/ProgrammingLanguages, as well as Scott's Programming Language Pragmatics, as reading instead. It's not fully up-to-date but I found the comparisons of features and their trade-offs in it fascinating.


Yes, that sounds like a flame war invitation. Issues with nil play no role in practice.


Huh, so equality isn't transitive in Go. Yikes.


In many languages, equality is often intransitive when you consider implicit conversions. For example:

   type T int

   const a = 0
   var b T = a
   var c interface{} = b
   fmt.Println(a == b, b == c, c == a) // true true false
https://play.golang.org/p/YZ-0Ym5lJ7R


Eh, languages end up with this all the time with type conversions.

In JS:

    false == undefined => false
    false == null => false
    undefined == null => true


First, your comment makes no sense:

    a = 1
    b = 2
    c = 1
    a == b => false
    b == c => false
    a == c => true
Is that surprising? Equality is transitive, inequality is not.

Second, if you have to reach for old javascript "features" (or god forbid PHP) to defend a language you're in a bad, bad place.

Third, the broken equality is not something that's usually praised about javascript. In fact the first rule of javascript equality is that there is only one situation in which `==` is the correct operator: checking that something is null (because that avoids having to differentiate between null and undefined, papering over one javascript wart using another). In every other situation you want "strict" (===) equality.


I managed to get my team to use `lodash.isNil` because we had gotten bitten by this crap too many times. And `lodash.isEmpty` for strings/arrays/etc.


What is absurd about this statement from a logical point of view?

  A != B
  A != C
  B == C
...looks like a valid logical statement, i.e. A = 1, B = 2, C = 2:

  1 != 2
  1 != 2
  2 == 2


What about A=1, B=2, C=3?

   A != B    1 != 2
   A != C    1 != 3
   B == C    2 == 3


If you write `y = x + 1` and say what about `y = 42` and `x = 0`, `42 = 0 + 1` doesn't mean math/equation is absurd, does it?


That's not an example of nontransitivity. You’ve only shown that some nontruthy values are unequal by == and some are equal by ==. Nontransitivity would be a == b, b == c, c != a. But you have the much more expected b != a, b != c, c == a.

JS loose equality is intransitive, though; the standard example being:

  '0' == 0; // true
  0 == ''; // true
  '0' == ''; // false


r==b and b==nil but r!=nil seems like the kind of thing that people give JS a lot of crap over.


Can you please give a more concrete example that is valid JS but is absurd logically?


This talk lists a bunch of them https://www.destroyallsoftware.com/talks/wat

The first one he talks about is that [] + [] gives you "" (an empty string). I don't know what universe that makes sense in. [] + {} gives you {} back. Meanwhile {} + [] gives you 0. Yes, the order in which you add types that can't logically be added ends up giving you different incoherent answers. {} + {} gives you NaN.

One of the other sibling comments said that it was mostly implicit conversions, which I think is the underlying cause of most of these.

That's not to totally hate on Javascript, but the language could probably use fewer implicit conversions and more type errors. I struggle to think of a situation in which those outcomes are desirable, and it can cause extremely hard to follow stack traces. A JSON API returns a list of objects instead of a list of strings and now your types are screwed up in hard-to-follow ways.


So FYI most of those are misleading. When he's typing into a terminal and it does {}+{}, it's interpreting it as an empty code block, and then a unary plus on an empty object, similar to:

{ }; +{}

If you put parentheses around it, or assign it to a value and then print that value then {}+{} isn't empty string.


I don't actually subscribe to the "JS is a bad language" meme (and maybe the meme has died off a bit, partially thanks to typescript), but implicit coercion is one thing that people cite, with behavior exactly like that nontransitive Golang case.

Also see this minitalk (which actually gets some things wrong because of confusion on how the terminal interpreter behaves but is still amusing): https://www.destroyallsoftware.com/talks/wat


I think the classic example is [] == ![] evaluates to true. It’s not absurd logically, just the operators act weirdly/non-standardly.


Honestly that's more of a weirdness of the '==' operator, which is kind of deprecated in modern JS. On the other hand I fully agree that non-strict definitions of equality are bound to generate weird cases around nullish values.


Yeah, that one is funny

  > [] == []
  false
  > [] == ![]
  true


However typescript says:

  This condition will always return 'false' since the types 'never[]' and 'boolean' have no overlap.
...for `[] == ![]`. Which is ironic because it does return true, not false. However at least it flags an error!

I wonder how much of the weirdness is wiped out by linters/ts/etc.


> How can r not be nil, when it contains a nil value?

In the same way a slice can contain one or more nil values without itself _being_ nil. A gift box containing nothing is not itself nothing.

> And the reason is hopefully apparent from the previous two lines: It depends on the type of "nil".

Well, no. "r == nil" tests whether r itself is nil (i.e. the zero value of that interface type) which it is not. It contains b and is thus no longer nil. The type doesn't play a (large) role here. This misconception might be due to the fact that lots of types can be nil: slices, maps, channels, functions and (unfortunately) interfaces.

Nobody is astonished that a slice containing any sort of nils is not nil itself; that a buffered channel containing nils is not nil itself; and that the constant function `func()*int { return nil }` is not nil itself either. Only for interfaces does this pose a problem. If the zero value of interfaces weren't called "nil" but, let's say, "zilch", then it would be evident that r != zilch.
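
A minimal sketch of that comparison, with a hypothetical `containers` function that builds each kind of "box holding nil" mentioned above and checks whether the box itself is nil:

```go
package main

import "fmt"

// constNil is the constant function from the comment: it always returns nil.
func constNil() *int { return nil }

// containers reports, for each kind of container, whether the container
// itself compares equal to nil while holding (or returning) nil.
func containers() (sliceIsNil, chanIsNil, funcIsNil, ifaceIsNil bool) {
	s := []*int{nil, nil}           // slice holding two nil pointers
	ch := make(chan *int, 1)        // buffered channel
	ch <- nil                       // ...holding one nil pointer
	f := constNil                   // function value that returns nil
	var i interface{} = (*int)(nil) // interface holding a nil pointer

	return s == nil, ch == nil, f == nil, i == nil
}

func main() {
	fmt.Println(containers()) // false false false false
}
```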


> 0 also is untyped, which is why you can write x==0 regardless of whether x is an int32 or an int64...

> Since nil does not have a default type, it's an error to write "x := nil".

But not an error to write "x := 0"?


Because 0 has a "default type" of int. So when the compiler goes to determine the type of x, it sees it's set to 0, which the compiler assumes to be an int.

I've found thinking about nil too much in Go hurts or creates weird loops in my brain, so I just tend to worry about it when I'm dealing with interfaces or pointers. Everything else I'm looking at zero-values.


I'd much prefer it if in golang, the type of nil was some "nilType" for which != and == is defined for all other types. Or, define some other operation, like isNil()

Then determining if something is nil could be done without thinking about it at all. nil would be nil would be nil. (Also, it would be illegal to put methods on nilType, of course.)

(Yes, I make my living writing go, and this would make my life easier. And no, in that case, don't allow x := nil.)
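
Something like the isNil() operation wished for above can be approximated today with reflection. The `isNil` helper below is hypothetical (not a standard library function), and a sketch only: it treats a value as nil if it is a nil interface or holds a nil pointer, map, slice, channel, function or interface.

```go
package main

import (
	"fmt"
	"reflect"
)

// isNil is a hypothetical helper: it reports whether v is a nil interface
// or holds a nil value of a kind that can be nil.
func isNil(v interface{}) bool {
	if v == nil {
		return true
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.Chan, reflect.Func, reflect.Interface,
		reflect.Map, reflect.Ptr, reflect.Slice:
		return rv.IsNil() // IsNil panics for other kinds, hence the switch
	}
	return false
}

func main() {
	var p *int
	var m map[string]int
	x := 1

	fmt.Println(isNil(nil)) // true
	fmt.Println(isNil(p))   // true
	fmt.Println(isNil(m))   // true
	fmt.Println(isNil(&x))  // false
	fmt.Println(isNil(0))   // false
}
```

The trade-off is that this runs at runtime through reflect rather than being checked by the compiler, which is part of why a built-in operation would be nicer.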


No, because as they note constants can have a default type[0]. For integer constants, that's `int`. Meaning if a literal integer constant is not otherwise typed, it will default to `int`.

[0] https://blog.golang.org/constants#TOC_5.


The next paragraph they wrote explains this: "Unlike 0 or "", nil does not have a default type. The default type of a constant is the type that it is implicitly converted to when a type is required. For example, "x := 0" declares a new value x with the value 0, but 0 is untyped. Untyped integer constants have a default type of int, so x is given the type int."


exceptionally clear explanation, thank you.


I'm afraid this rings slightly hollow for me.

We can likewise claim about C that "0 in C is typed in theory and sort of untyped in practice" because constant expressions of integer type that evaluate to 0 can serve as null pointers of any type:

   char *p = 0;        // no cast required
   if (p == 0) ...     // no cast required
   time_t t = time(0); // fine
   char *q = 1;        // error; requires (char *) 1.
We can't think of this 0 as an object; it's just a source code token that provides a null pointer of the right type, which is statically obvious from the use.

The only time it's a problem is in unsafe interfaces:

   int old_style_string_fun();
   int variadic_fun(char *first, ...);

   old_style_string_fun(0);
   variadic_fun("first", 0);
Here, the wrong type is inferred; the 0 just behaves like the ordinary constant of type int. A null pointer to char was intended, which requires that (char *) 0 be used.

None of this challenges the fact that we can have a variable of type int, whose value is 0.


C is an ~untyped language so this comparison isn't very helpful.


Though a popular meme, it is false. C has an extensive static type system, requiring programs to be checked and diagnosed. This is quite helpful in catching bugs and refactoring and all that. That type system doesn't encode everything that some people would like, such as preventing a program from retaining and using a pointer to a datum that no longer exists. It allows implicit conversions that some people would like to be explicit (e.g. integer to floating-point and vice versa), but in those situations, the static type of every operand, and of the result, is known and the compiler chooses the conversion accordingly.


Addendum:

C is untyped in linkage: the combination of multiple translated units into a single program. All of the type checking in C translation happens within a single translation unit. The language standard makes no requirements regarding the retention of type information in the translated code.

An identifier declared as "double foo" in one translation unit at the file scope, as an external name, could be misused in another translation unit via "extern int foo". Similar reasoning applies to function arguments and structure definitions. We can easily write a program in which two translation units have a completely different idea about what is "struct foo", such that one creates a struct foo and passes a pointer to that into the other.

This could happen by accident, at least in the development environment even if the declaration is written in one place (a header file) and everything else includes it. How? Quite simply, the build system could have broken dependencies, and fail to recompile the second translation unit when a declaration has changed.

In practice, adherence to program organization conventions, additional diagnostics available in compilers, and reliable compiler-driven dependency generation fill this gap, though imperfectly. Plus related CI, QA and delivery practices, like never giving a build to testers or customers that was an incremental build from a developer's system; every officially tested build must be a clean rebuild.

There are related issues in the maintenance of libraries, which give rise to situations in which dependent code simply will not and cannot be recompiled, so binary compatibility must be maintained.


I'm not sure why it would be significantly better to have reliably "typed nil" over "untyped nil". The concept has no real definition to begin with: what matters about sound typing are the static guarantees a type brings, or by association, even if this is not really the same concept, the "dynamic guarantees" about the behavior of objects of some "dynamic type" (that is, the variable used as a handle to the object has no guaranteed static type, but in most cases the objects themselves have a type). Acting on a "type" of nil values seems to make very little sense. Maybe there are obscure constructs where this would be handy, but at first glance I would consider it probably indicative of a dirty hack, and recommend looking for alternate solutions.


In Smalltalk, nil was a constant that contained the only instance of the class UndefinedObject.

What some people did was to put methods on UndefinedObject to implement a "NullObject" pattern for their domain objects. Since everything is default initialized to nil, everything would just work.

The big downside of this was beginners putting methods on UndefinedObject that they should not have. There's a thin line of conceptual understanding between a bit-bucket that hides errors from you, and a useful NullObject. Some people would call this a "dirty hack" for that reason.


It's probably not useful in Go's type system. It would be useful if there were union types, though. I sometimes end up using pointers for nullable fields because the zero values are significant in whatever domain I'm working in. I either have to make it a pointer, or I have to implement a new special zero value, or I have to make a struct that can keep track of whether a value has been set or not.

It'd be nice to be able to do

    type MyStruct struct {
      ID int || nil
    }
And have the compiler understand that the ID has to be either an int or nil, and to give me errors if I try to use it as an int without making sure it isn't nil.
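
For comparison, a sketch of the two workarounds mentioned above as they can be written in Go today. `OptionalInt`, `Some`, `Score` and `describe` are hypothetical names for illustration; the struct-with-flag shape resembles `sql.NullInt64` from database/sql. Unlike the wished-for `int || nil`, nothing here makes the compiler enforce the check:

```go
package main

import "fmt"

// OptionalInt is a hypothetical wrapper that tracks whether a value was set.
type OptionalInt struct {
	Value int
	Valid bool
}

// Some returns an OptionalInt that has been set to v.
func Some(v int) OptionalInt { return OptionalInt{Value: v, Valid: true} }

// Get returns the value and whether it was set, like the "comma ok" idiom.
func (o OptionalInt) Get() (int, bool) { return o.Value, o.Valid }

type MyStruct struct {
	ID    OptionalInt // struct-with-flag approach
	Score *int        // pointer approach: nil means "not set"
}

// describe formats both fields, distinguishing "unset" from a zero value.
func describe(m MyStruct) string {
	id := "ID unset"
	if v, ok := m.ID.Get(); ok {
		id = fmt.Sprintf("ID=%d", v)
	}
	score := "Score unset"
	if m.Score != nil {
		score = fmt.Sprintf("Score=%d", *m.Score)
	}
	return id + ", " + score
}

func main() {
	zero := 0
	fmt.Println(describe(MyStruct{}))                          // ID unset, Score unset
	fmt.Println(describe(MyStruct{ID: Some(0), Score: &zero})) // ID=0, Score=0
}
```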


Much discussion about nil in Go just yesterday:

https://news.ycombinator.com/item?id=26635529


It is very confusing that equality in JavaScript is not transitive: https://imgur.com/5pFXFbR.

Apparently equality in Go is no longer transitive either...


JavaScript is fine so long as you stick to === rather than ==. At work, == is completely banned and this is enforced by CI.


I think I'd need to see an example of that to be sure. What's talked about here doesn't really lead to that, because in these examples, values are getting passed through a function call boundary.

There's a lot of rules in Go for what == does, and maybe there's something non-transitive in there, but I'd definitely need to see someone prove it on https://play.golang.org/ to be sure. Offhand I can't think of how to make one.


There is a nontransitive example heading this very comment section:

    var b *bytes.Buffer
    var r io.Reader = b
    fmt.Printf("r == nil: %v; b == nil: %v; r == b: %v\n", r == nil, b == nil, r == b)

    r == nil: false; b == nil: true; r == b: true


It's not confusing if you understand that r is a pointer that points to something and b is that pointer, which is nil. So of course r (the variable containing a pointer) isn't nil, b (the pointer) is nil, and r is the pointer b.


Pointers don't normally compare equal to the thing they're pointing to.

If it's acting as a proxy, the specific way that some things get proxied and other things don't is confusing.


Agree, r == b in this example is confusing because in all other instances, the interface{} is a kind of box type. That box type is implicitly unwrapped in the r == b comparison. Although more verbose, I’d be happy with having to do something like this for comparison:

    b2, ok := r.(*bytes.Buffer)
    equal := ok && b2 == b
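
A runnable version of that explicit-unwrap comparison, as a sketch. `sameBuffer` and `nilBuffer` are hypothetical names. In `r == b` the pointer b is implicitly converted to io.Reader before comparing; the helper instead asserts r's dynamic type first:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// sameBuffer compares r with b by first asserting that r holds a *bytes.Buffer.
func sameBuffer(r io.Reader, b *bytes.Buffer) bool {
	b2, ok := r.(*bytes.Buffer)
	return ok && b2 == b
}

func nilBuffer() *bytes.Buffer { return nil }

func main() {
	b := nilBuffer()
	var r io.Reader = b

	fmt.Println(r == b)             // true: b is implicitly converted to io.Reader
	fmt.Println(sameBuffer(r, b))   // true: same dynamic type, same pointer
	fmt.Println(sameBuffer(nil, b)) // false: a nil interface holds no *bytes.Buffer
}
```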


laughs in C


Thank you, I missed that.


> It is very confusing that equality in JavaScript is not transitive: https://imgur.com/5pFXFbR.

There are 2 types of equality in JS, loose (==) and 'strict' (===). It's fine because JS is a dynamic, weakly typed language.

Go isn't supposed to be.


'nil' in Go is just an identifier to represent the zero values of some kinds of types. A bare nil is always untyped, and it is the only untyped value that does not have a default type.

Zero values behave the same as non-zero values in many aspects.
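
A brief sketch of what "behave the same" can look like for nil slices and maps; the function name `nilBehaviour` is made up for this example:

```go
package main

import "fmt"

// nilBehaviour exercises a nil slice and a nil map the same way one would
// use non-nil ones.
func nilBehaviour() (sliceLen int, mapValue int, mapOK bool, appended []int) {
	var s []int          // nil slice
	var m map[string]int // nil map

	sliceLen = len(s)        // len works on a nil slice
	mapValue, mapOK = m["k"] // reading a nil map yields the zero value and false
	appended = append(s, 1)  // append allocates a new backing array as needed
	return
}

func main() {
	fmt.Println(nilBehaviour()) // 0 0 false [1]
}
```

The "in many aspects" qualifier matters: writing to a nil map, for example, panics at runtime.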

Further reading: nils in Go https://go101.org/article/nil.html

The fact that several kinds of types share the same zero value literal representation really confuses many new gophers, but this is hardly a problem once you get familiar with the design.

There are proposals to remove the new-gopher confusion.

* https://github.com/golang/go/issues/22729 use different zero identifiers for different kinds of type

* I remember that there is a proposal which proposes using `null` for non-interface zero values while still using `nil` for interfaces. I couldn't find this issue now.


This reminds me of Rich Hickey's "Maybe Not" talk



