Data Race Patterns in Go

bumper_crop · on June 10, 2022

Sorry to say, but these hit close to home for me. A lot of the synchronization paradigms in Go are easy to misuse, but lead the author into thinking it's okay. the WaitGroup one is particularly poignant for me, since the race detector doesn't catch it.

I'll add one other data race goof: atomic.Value. Look at the implementation. Unlike pretty much every other language I've seen, atomic.Value isn't really atomic, since the concrete type can't ever change after being set. This stems from fact that interfaces are two words rather than one, and they can't be (hardware) atomically set. To fix it, Go just documents "hey, don't do that", and then panics if you do.

Groxx · on June 10, 2022

The lack of generics has forced all Go concurrency to be intrusive (i.e. implemented by the person using literally any concurrency), and yeah. It's horrifyingly error-prone in my experience. It means everyone needs to be an expert, and lol, everyone is not an expert.

Generics might save us from the simple, mechanical flaws. Expect to see `Locker<T>` and `Atomic<T>` types cropping up. And unbounded buffered thread-safe queues backing channels. Etc. I'm very, very much looking forward to it.

--- edited to rant more ---

I also really wonder where all these "go makes concurrency a first-class concept" claims come from, because I see it quite a few places, and I feel like it's making some very strong implied claims that absolutely do not exist.

Go has channels and select. That's neat. But on the other hand it has threads... but no thread handles. It has implicit capturing of closures. It has ambiguous value vs pointer semantics. It (style- and ergonomic-wise) encourages field references, which have no way to enforce mutexes or atomics. It has had crippled lock APIs that effectively force use of channels for... I don't know, philosophical reasons?

Go is abnormally dangerous when it comes to concurrency IMO. The race detector does an amazing job helping you discover it, but it's very easy to not use it or not take full advantage of it (i.e. non-parallel tests), and few run their production services with the race detector enabled. Because if they did, it would crash all the time, because there are an absurd amount of races in nearly all of the popular libraries (and in common use of those libraries, because concurrency is not a first-class citizen and you can't tell when it's happening / when it shouldn't happen).

josefx · on June 11, 2022

> I also really wonder where all these "go makes concurrency a first-class concept" claims come from,

Given that some of the main architects behind Go had K&R C as background I wouldn't be surprised if "first-class" just meant that the language defines both a memory model and primitives for threading. C had neither until it basically adopted both from C++11.

btschaegg · on June 11, 2022

I mean, this explanation makes sense, but after thinking about it a bit, I don't think those are even relevant.

I'd wager if one removed all notions of concurrency from Rust and only left in the `Send` and `Sync` traits (along with borrows, of course), it seems like Rust would still warrant such statements way more.

OTOH, saying this about Go due to "memory model and threading primitives" sounds a little bit like describing C++ as a language with "first class functions" because there's `operator()`…

throwaway894345 · on June 11, 2022

That’s certainly how I interpreted it. It never occurred to me to think it meant “thread handles” specifically.

Groxx · on June 11, 2022

I meant it more in that "first-class" tends to mean "this is a thing that is represented in the language / type system".

Go has first-class functions, because you can make a `var fn func() string` field/variable/argument/etc that holds a reference to a func that returns a string.

Go does not have first-class types, because you can't reference or store a type directly. You can use reflection to pass a reflected thing representing a type, but not the type itself. Generics muddies this somewhat, but I'll argue that falls under "generics", not "first-class types". In contrast, Java has both generics and first-class types, because you can pass `SomeClass` itself as an argument.

---

Node.js arguably has first-class concurrency. It has async/await: you do not have concurrency without those keywords. If they exist, you have potential concurrency. If they do not, you do not. (there may be exceptions here for true thread use, and JS runtimes vary, but you get the idea)

Rust has async/await now, and also has Send/Sync, which gives it a very strong claim to "first-class concurrency".

Go's concurrency constructs have no representation in the type system. They're totally invisible. Channels and select are mostly used with concurrency, but they do not define concurrency, and can be (and are) used synchronously as well.

`go` is a keyword, but I don't see how that's any different than `new Thread(fn)`... except that the Thread has a better claim to first-class-ness, because it returns a value that represents the concurrently-executing thread. If you have a thread reference, you know that concurrency exists. The reverse is not true though.

lmm · on June 21, 2022

I would say the notion of something being first-class is that you can manipulate it in the same way as a regular value in the language; in particular, you can use an expression that evaluates to such a thing in all the same ways that you can use a built-in version of that thing. Certainly Java does not have first-class types: you can pass a value that is sort of a representation a flattened form of a type, but you can't use that kind of value in a "new" expression or as a function return type.

AIUI Rust's async/await still has quite a lot of special case support. IMO concurrency is only really first-class in languages like Haskell where you can manipulate async actions the same way as a user-defined type and implement concurrency-related operations in plain old code.

throwaway894345 · on June 12, 2022

I mean, Go clearly has goroutines as a “first class” concept for some value of “first class”, and (as they are a sort of thread) goroutines are concurrency. This is to say, I don’t think your claim about “must have async/await in order to have first class concurrency” is correct in any formal sense (maybe you’re defining “first class concurrency” as requiring async/await rather than asserting that this is what first-class concurrency means to programmers generally?). I agree though that goroutines have no representation in the type system, but that’s because they aren’t values, so one wouldn’t expect them to have a type or a type system representation. Yet they are very much part of the Go runtime and not a library or a syscall or similar.

Groxx · on June 12, 2022

By that description though, Go also has first class types. And that kind of makes the distinction meaningless because essentially every programming language has types.

There might be room to claim first-class support for green threads? But if so it's a very weak "first class" since all you can do is start them.

throwaway894345 · on June 12, 2022

Yes, I think in the general sense of “first-class X”, “first-class types” would refer to any language with a concept of “type” (except perhaps certain dynamic languages where types are built from language primitives, but I’m not so sure about that case). I’m inclined to say that “first-class types” overrides that general formulation to mean “reified types” specifically. It’s also possible that I’m mistaken and the general formulation of “first-class X” always means exactly “reified X” (although I think reflection is a form of reification, and thus Go would have “first-class types” despite your above distinction between first-class types and reflected types), in which case Go doesn’t have “first-class goroutines”.

hexxagone · on June 11, 2022

Go does not have threads but something like "tasks". The fact that no thread handle is exposed allows for transparently moving these tasks across threads if the scheduler decides so.

"go makes concurrency a first-class concept" I think it usually refers to goroutines being built in the language.

"Go is abnormally dangerous when it comes to concurrency IMO". Personnally, it has not been my experience with Go concurrency. However I have hit some issues when trying to ocrhestrate tasks via channels and ended up resorting to atomics to do the job.

saghm · on June 11, 2022

> Go does not have threads but something like "tasks". The fact that no thread handle is exposed allows for transparently moving these tasks across threads if the scheduler decides so.

This doesn't stop there being "task handles" then, though? I think the point GP was making is that something that in most languages would be simple methods on a handle like "wait for this task to finish" or "stop this task" instead need to be done manually in Go with channels (or potentially `Context` in the latter case, although that was a later addition to the standard library). It doesn't really matter whether you call it a thread or a task; either way, it would be nice to get some return value from spawning some background operation and being able to use it to directly interact with it. I agree with GP that it does seem like an odd omission, since I haven't really heard any actual practical explanation for it.

Groxx · on June 11, 2022

Context for cancellation and replacing thread-local variables (or indeed any way to observe your "current" thread) is one of the things I like tbh. Though Context has abysmal performance implications.

But yeah, I want a goroutine handle with a "Wait()" method. Ideally also returning the results. Like most languages. It'd eliminate a ton of manual mutex and channel use that doesn't need to exist.

---

Re thread vs tasks: that's an implementation detail. You write threaded code and it runs in multiple threads with thread-like memory behavior. In all in-Go observable ways it's identical to threads, and it could be changed to use real hardware threads tomorrow and none of the semantics would change at all. Even cgo would stay the same.

Go has (green) threads. Being more specific is relevant for runtime implementation spelunking and performance details, but not otherwise.

saghm · on June 11, 2022

Yeah, I generally think of the word "thread" as referring to OS threads and/or "green" threads depending on context (and in this case I thought it was clear what you were referring to!), but since the person who responded to you made the distinction, I figured I'd use their terminology when explaining what I thought you were saying.

Groxx · on June 11, 2022

I was just leveraging your already-top reply to reply to both of you, sorry about that :) I should've just done two comments. I think you and I are on the same page here.

I think the main reason it doesn't exist is that go had no generics. It'd need to be another custom-generic type (Future[T] basically), and it would make it harder to pass around, just like channels. But since channels are generally intrusively-added, they aren't part of the return signature, so they avoid that generic-return issue. E.g. every "worker pool" accepts a `func()` and callers need to coordinate return values via channels, instead of needing to return a `func[T]()` reference which they have been unable to do until recently (to some degree at least).

Though they probably could've just said "use a Future[interface{}]", like they did for every other generic collection type.

Plus it'd take some of the emphasis off channels, and they seem to really not want to do that. If they were focused on usability instead of channels and select, they'd let us park on multiple mutexes just like channels, just like the runtime does internally a lot to implement all this... but no. Imagine a world where you could `select { case mut.Lock(): ...}`...

saghm · on June 13, 2022

> I think the main reason it doesn't exist is that go had no generics. It'd need to be another custom-generic type (Future[T] basically), and it would make it harder to pass around, just like channels. But since channels are generally intrusively-added, they aren't part of the return signature, so they avoid that generic-return issue.

That's a good point I hadn't thought of! Naively I wan to say they could just "implicitly" make anything returned from a `go func` be passed to a channel and then have `go func` return a channel, but that would require doing a bit of type inference as well as deciding semantics for whether it's possible to get multiple values out of that return channel. It honestly seems like there are some interesting ideas here (e.g. having multiple yields out of a go routine that then get sent to an "output" channel, making a sort of generator-like thing, but I guess I'm not super surprised that Go didn't choose to go this route.

mmarq · on June 11, 2022

My (rather horrid) pattern to address this problem is to wrap the goroutine in a function that returns a channel receiver. When the goroutine ends it sends something to the channel and whatever called it can await the result or completion using the receiver.

randomswede · on June 13, 2022

I have, on occasion, used a similar pattern, but instead of sending something, I simply close the channel (usually with a "defer close(c)" at the beginning of the function/closure that encompasses the main code of the goroutine's work).

That way, if I end up having multiple waiters, they will all be able to proceed.

silisili · on June 11, 2022

I've always thought it would be nice if the go command returned an ID. Doing so would also be completely backwards compatible, of course. Then add a library or few builtins to do things on that ID, at minimum maybe kill it, perhaps get status of it, etc. Maybe not full blown actor model, but having nothing feels powerless.

hnlmorg · on June 11, 2022

Go has OS threads and “green threads” (named “go processes”). You create green threads via the go keyword and the Go runtime assigns that to an OS thread. You can have many go processes to a single OS thread and typically have a maximum of 1 OS thread per CPU core (though that is configurable).

The GP is correct that you cannot manage go processes from outside of that green thread. With (for example) POSIX threads, which still leaves a lot to be desired, you can at least manage the thread from other threads.

Go definitely has some rough edges around threading. The idea is you’re supposed to use channels for everything but in my experience channels have so many edge cases for subtle ways to completely lock up your application that it’s often easier to fallback to the classic mutex-style idioms.

I do really like the go keyword, it’s handy. But I have a background in POSIX threads so probably find concurrency in Go easier than most yet even I have to concede that Go under-delivered on its concurrency promises.

Thaxll · on June 11, 2022

"because there are an absurd amount of races in nearly all of the popular libraries"

This is fud, I ran the race detector with a lot of popular lib and I never found issues like that.

But since you're claiming there are issues everywhere, do you have examples?

tialaramex · on June 11, 2022

I'd say there's an excellent chance your assumptions about types you got from "popular libraries" is more conservative and that's why you never detected any issues.

For example take the JSON decoder. If you have several tasks which can use some data from a JSON blob in parallel, is it OK if they all just share the same JSON decoder?

If you're horrified because this seems obviously like a bad idea, that'll be why you didn't find any trouble. In some other systems your programs would be needlessly slow and clunky as a result, but in Go your assumptions were appropriate.

It seems Groxx expects in this case that either the JSON decoder would work fine used this way, or, the documentation would highlight that you can't do this. Go chooses neither.

Here's Brad Fitzpatrick:

"The assumption when unstated is that things are not safe for concurrent use, that zero values are not usable, that implementations implement interfaces faithfully, and that only one return values is non-zero and meaningful."

These are some pretty important assumptions, or to look at it another way, potential foot guns.

bombela · on June 11, 2022

Replying here to the two siblings comments being confused about the decoder example.

What tialaramex is saying, is that if you have a stream of JSON values, you create a JSON decoder over it. Then every time you call the decode() method, you get the next decoded JSON value.

Then you want to process the JSON values concurrently.

Rephrased, the question was what would happen if you were to have every concurrent task call the decode() method whenever it wants a new value to work on?

It would probably be a data race cluster fuck. But you might find this type of mistakes everywhere in Go. I myself fought things like that in many libraries.

One such occurrence I recall was in the Google Cloud Pub Sub client library. It basically did something similar to this example. Trying to offer concurrency over a stream of messages. It would fail very rarely. And pretty much always passe the race detector. It wasn't fun to debug.

yencabulator · on June 11, 2022

A json.Decoder holds on to a single io.Reader, using it concurrently to decode multiple things is just plain old absurd. How would that even work?

https://pkg.go.dev/encoding/json#NewDecoder

masklinn · on June 13, 2022

> How would that even work?

The same way it works to do so sequentially?

Thaxll · on June 11, 2022

Why would you share a decoder? It makes no sense since you need to decode just once.

masklinn · on June 13, 2022

You can decode multiple values from a stream.

In a language which focuses on concurrency correctness, the decoder would either be thread-safe (in which case you could use it as an input queue) or not be usable from multiple threads (in which case you’d clearly have to create the queue yourself).

nindalf · on June 11, 2022

On atomics in Go, the beta for Go 1.19 was released an hour ago (https://groups.google.com/g/golang-announce/c/SNruPJUSFz0?pl...).

> The sync/atomic package defines new atomic types Bool, Int32, Int64, Uint32, Uint64, Uintptr, and Pointer. These types hide the underlying values so that all accesses are forced to use the atomic APIs. Pointer also avoids the need to convert to unsafe.Pointer at call sites. Int64 and Uint64 are automatically aligned to 64-bit boundaries in structs and allocated data, even on 32-bit systems.

Go 1.19 is expected to release in August.

morelisp · on June 10, 2022

> atomic.Value isn't really atomic, since the concrete type can't ever change after being set.

How does this mean it's non-atomic? As far as I know you can still never Load() a partial Store(). (Also, even if it was possible, this would never be a good idea...)

bumper_crop · on June 10, 2022

That's why I opened with "Look at the implementation". Go is unable to store the type and the pointer at the same time, so it warps what "atomic" means. Pretty much every other language has atomic mean "one of these will win, one will lose". Go says "one will win, one will panic and destroy the goroutine.

In fact, it's even worse than that. If the Store() caller goes to sleep between setting the type and storing the pointer, it causes every Goroutine that calls Load() to block. They can't make forward progress if the store caller hangs.

yencabulator · on June 11, 2022

> If the Store() caller goes to sleep between setting the type and storing the pointer, it causes every Goroutine that calls Load() to block.

Where does this go to sleep: https://cs.opensource.google/go/go/+/refs/tags/go1.18.3:src/...

It looks like a CAS busy loop with preemption disabled, to me.

bumper_crop · on June 21, 2022

Make sure to read between the lines. It only looks like a busy loop. Remember, the OS can pause and preempt your thread at any time. This is a real and likely event.

mbnull · on July 2, 2022

By reading the lines and not between them, you could read these two lines: runtime_procPin() and runtime_procUnpin(). With explicit comments that these pause preemption.

tapirl · on June 11, 2022

> Pretty much every other language has atomic mean "one of these will win, one will lose"

Could you elaborate how "much every other language" implement it?

fastest963 · on June 10, 2022

This is why all the examples call Store immediately with a zero value of the type.

bumper_crop · on June 10, 2022

https://go.dev/play/p/xolc9oPwA0C

Interfaces don't have a zero type, which means that we can't have an atomic.Value which stores Shape. Atomic Value would be much easier to reason about if it had store semantics similar to a regular `var foo Shape = ...`. One of the other comment threads talked about generics helping this, so maybe there is hope.

yencabulator · on June 11, 2022

Parent means

    var bestShape atomic.Value
    bestShape.Store((*Circle)(nil))

morelisp · on June 11, 2022

Which will store it as a *Circle, and only allow more *Circles, not Shapes. That part of GP’s claim is correct.

It just had nothing to do with atomicity; it means something specific, not just “I like the failure mode.”

yencabulator · on June 12, 2022

That's pretty easy to workaround:

    type shapeContainer struct { Shape }

The usual way to use atomic.Value is by writing strongly-typed wrappers anyway, so that doesn't affect your codebase beyond about 3 lines.

morelisp · on June 11, 2022

Abend is a fairly normal and in many ways best way to "lose" in a race. It's fine, it's atomic.

avgcorrection · on June 11, 2022

This is Atomic*

* Just don’t be an idiot. Worse is better.

smasher164 · on June 10, 2022

This definitely matches my experience using Go at my previous organization.

1. Closures and concurrency really don't mix well. The loop variable capture in particular is very pernicious. There's an open issue to change this behavior in the language: https://github.com/golang/go/issues/20733.

2. Yep. I've seen this problem in our codebase. I've grown to just be very deliberate with data that needs to be shared. Put it all in a struct that's passed around by its pointer.

3. This issue is caught fairly easily by the race detector. Using a sync.Map or a lock around a map is pretty easy to communicate with other Go devs.

4. This should be documented better, but the convention around structs that should not be passed around by value is to embed a noCopy field inside. https://github.com/golang/go/issues/8005#issuecomment-190753... This will get caught by go vet, since it'll treat it like a Locker.

5 & 6. Go makes it pretty easy to do ad-hoc concurrency as you see fit. This makes it possible for people to just create channels, waitgroups, and goroutines willy-nilly. It's really important to design upfront how you're gonna do an operation concurrently, especially because there aren't many guardrails. I'd suggest that many newcomers stick with x/sync.ErrGroup (which forces you to use its Go method, and can now set a cap on the # of goroutines), and use a *sync.Mutex inside a struct in 99% of cases.

7. Didn't encounter this that often, but sharing a bunch of state between (sub)tests should already be a red flag. Either there's something global that you initialized at the very beginning (like opening a connection), or that state should be scoped and passed down to that individual test, so it can't really infect everything around it.

0xFACEFEED · on June 11, 2022

> 3. This issue is caught fairly easily by the race detector. Using a sync.Map or a lock around a map is pretty easy to communicate with other Go devs.

I run my Go development server with the -race flag as a default. If it affects performance I'll turn it off but that's very rare in practice. Unfortunately a lot of applications don't run tests against their HTTP endpoints (like only internal library stuff) which is bad bad bad, but the -race flag at least helps mitigate.

To anyone reading who cares:

1) Always run your tests with the -race flag!

2) Always write tests for your HTTP handling code too!

3) Run your dev server with -race for a week and see what happens.

This will hard crash your Go program and there is nothing you can do about it. You can't recover(). Go vet will not catch anything. The -race flag will!

  package main

  import "time"

  func main() {
   m := map[int]int{}
   go poop(m)
   go poop(m)
   time.Sleep(5 * time.Second)
  }

  func poop(m map[int]int) {
   for i := 0; i < 1e10; i++ {
    m[i] = i
   }
  }

shp0ngle · on June 11, 2022

The first link notes that they don’t plan to change the semantics of the range loop, but they might dis-allow referencing it

which is… uhh ok maybe?

But go is really trying to keep backwards compatibility so they won’t just change this

masklinn · on June 13, 2022

Both changes break BC though. I see the underlying divergence of opinion as one of signalling: changing loop semantics would silently change code behaviour, even if it should usually be for the better there’s a bit of an 1172, whereas making the loop variable non-reference-able noisily breaks code which is almost guaranteed to be incorrect.

jchw · on June 10, 2022

This is pretty cool. 50 million lines of code is quite a large corpus to work off of.

I'm surprised by some of them. For example, go vet nominally catches misuses of mutexes, so it's surprising that even a few of those slipped through. I wonder if those situations are a bit more complicated than the example.

Obviously, the ideal outcome is that static analysis can help eliminate as many issues as possible, by restricting the language, discouraging bad patterns, or giving the programmer more tools to detect bugs. gVisor, for example, has a really interesting tool called checklocks:

https://github.com/google/gvisor/tree/master/tools/checklock...

While it definitely has some caveats, ideas like these should help Go programs achieve a greater degree of robustness. Obviously, this class of error would be effectively prevented by borrow checking, but I suppose if you want programming language tradeoffs more tilted towards robustness, Rust already has a lot of that covered.

likeabbas · on June 10, 2022

> I suppose if you want programming language tradeoffs more tilted towards robustness, Rust already has a lot of that covered

Does anyone not want robustness of their language to cover their mistakes?

setr · on June 10, 2022

For free? Of course not.

At cost? … depends on what you’re charging me, and how much I’m getting

likeabbas · on June 10, 2022

Good point. Right now I kind of see modern programming as two fold:

1) Loosely typed to get you what you want faster, but with some mistakes, and 2) Strongly typed that forces you to try harder, but ultimately better

I'm usually happier with the latter. I find I become far more frustrated when I try to write python than I do something like Rust just because I know when I write Python that I will have mistakes I'll have to fix in prod, vs when I write Rust I won't have those mistakes (although it'll take me longer to get something to prod)

jchw · on June 11, 2022

I greatly prefer Rust to Python, but slightly prefer Go to Rust in many situations. Python even with MyPy just feels sloppy, whereas Go does offer decently strong typing at least. As for Go vs Rust, the answer is in the ecosystem, and I blame that on the difficulty of maintaining high quality Rust crates. I know not everyone agrees with this explanation, but I think Rust being complicated makes it hard to offer zero compromises libraries for it that fit the spirit of the language. Every single piece of the ecosystem has its little issues. Today I found out the most prominent websocket library in Rust does not support permessage-deflate yet. Not a huge deal, but it does happen to be an issue for a use case I had in mind. The last few times I wrote stuff in Rust, I ran into similar problems:

- Sqlx is cool; it’s similar to sqlc but with proc macros. Unfortunately, it requires and connects to PostgreSQL… during compilation.

- The GIF crate works, but it’s kind of slow. It seems that image processing may be a little difficult to optimize safely. Also, alarmingly, I ran into a bug that broke the output in only in production builds… (only in higher opt levels.)

- Sometimes, the dependency trees start to look like npm. Understandably, the standard library for Rust is not quite as big as Go.

Go has a lot of great stuff in the ecosystem. There’s the standards library, with great crypto implementations, implementations of various common file formats and markup languages, etc. there’s a pretty good library for parsing raw packet capture, gVisor has a robust TCP stack and a mutex lock checker, there’s tons of linting tools, and golangci-lint ties a lot of them together, sqlc.dev and buf are useful for writing services, and there’s plenty of native Go client libraries for databases, queues, etc. whereas for Rust it seems you’re more likely to be stuck with C bindings, or occasionally worse.

It’s great that Rust is so good at integrating with C code; it may even be better for consuming C libraries than C. However, it is a damn shame that for a lot of stuff in production environments, you are still going to need to fall back to C today.

Rust feels like the perfect language to go nuts in and build the future. It has some issues (my big two are async being kind of stinky and no placement new) but by and large, nobody, even I, a big Go zealot, would deny that Rust feels like the future. If I had to pick a language for a project where I had unlimited time and budget to do it right, it’s Rust every single day.

That said… Today, at least for writing services and command line tools, Go feels like a good tradeoff for people with deadlines and decent standards. I use Go at work, and I have been doing so for almost a decade now. Do I spend time on problems that are wholly preventable by better language design? Absolutely. Does Go save me time by being simple as hell, having a rich ecosystem and compiling very quickly? Also yes.

I do hope to have more projects where I can make effective use of Rust. I have one I’ve been trying to get going for a while, where Rust’s attention to detail and robustness would be amazing to have, and it’s in an environment where Go would not work well. That said, it always proves to be at least a little challenging it seems.

dthul · on June 11, 2022

OT: sqlx tip: you can compile without a DB connection by either using the non-macro version of the queries (at the cost of losing compile time query checking) or you "bake" the query checks using sqlx-cli which produces files that you can check into your repo and allow the macros to compile offline. In that case you only need to re-connect to the DB when the SQL queries change.

armitron · on June 11, 2022

There are no gradients of strong typing. Either a language is strongly typed or not. Python is strongly typed.

For someone with such strong opinions, you seem to lack certain fundamental knowledge which I suggest you rectify ASAP as it will definitely improve your programming skill.

jchw · on June 11, 2022

The phrase “strongly typed” does not have one single agreed upon definitions. I’ve known that for like over a decade, since it comes up extremely often when someone is upset about people not liking x language and they have to jump to semantic debates instead of debating about real problems. I assume you’re not doing that… so I can clear things up: instead of “strongly typed”, substitute in “statically typed”.

edit: Also,

> There are no gradients of strong typing

that's flatly untrue. There certainly is a notion of 'stronger' vs 'weaker'. I don't know where you got this idea from.

eru · on June 13, 2022

Huh? What do you mean by 'strongly' typed in this case?

In common parlance, there are definitely gradients.

For example compared to most mainstream languages, Haskell is strongly typed. But compared to Agda, Haskell's types are pretty weak.

For example, typically Haskell programs don't use types to enforce that you can't divide by zero. In Agda it's relatively easy to enforce that.

armitron · on June 11, 2022

Except, Python is strongly typed.

ohazi · on June 11, 2022

A dig against Rust I sometimes hear is "Oh, data race freedom isn't such a big deal, if you really need it, a garbage collected language like Java will give you that guarantee."

So now I'm hearing that Go, a garbage collected language, doesn't guarantee data race freedom? I guess it's garbage collected but not "managed" by a runtime or something?

Why go to all that effort to get off of C++ just to stop 30% short? These are C-like concurrency bugs, and you still have to use C-like "if not nil" error handling.

Why do people keep adopting this language? Where's the appeal?

sa46 · on June 11, 2022

A data race and garbage collection are unrelated. A data race occurs when:

> two or more threads in a single process access the same memory location concurrently, and at least one of the accesses is for writing.

Rust provides compile time protection against data races with the borrow checker. Go provides good but imperfect runtime detection of data races with the race detector. Like most things in engineering, either approach requires a trade off involving language complexity, safety, compile time speed, runtime speed, and tooling.

oconnor663 · on June 11, 2022

They're unrelated in theory, but in practice a lot of garbage collected languages do try to turn data races into defined behavior. Java requires the JVM to implement some defined semantics for data races, though I think they're still considered terribly confusing in practice. Python prevents data races with the GIL, and JS prevents them by either not having threads at all or not letting them share memory. I think Go is actually somewhat unique among modern, GC'd languages in that data races in Go are true UB (albeit with lots of best-effort checks).

tialaramex · on June 11, 2022

Java promises that any variables touched by a data race are still valid, and your program still runs but it offers no guarantees about what value those variables have, so the signed integer you're using to count stuff up from zero might be -16 now, which is astonishing, but your program definitely won't suddenly branch into a re-format disk routine for no reason as it would be allowed to do in C or C++

Go has different rules depending on whether you race a primitive (like int) or some data structure, such as a slice, which has moving parts inside. If you race a data structure you're screwed immediately, this is always Undefined Behaviour. But if you race a primitive, Go says the primitive's representation is now nonsense, and so you're fine if you don't look at it. If you do look at it, and all possible representations are valid (e.g. int in Go is just some bits, all possible bit values are ints, whereas bool not so much) you're still fine but Go makes no promises about what the value is, otherwise that's Undefined Behaviour again.

I don't think Go is really unique here. Java put a lot of work in to deliver the guarantees it has, and since they turned out to be inadequate to reason about programs which don't exhibit Sequential Consistency that was work wasted. Most languages which don't have the data race problem simply don't have concurrency which is, well it's not cheating but it makes them irrelevant. C has "Sequential Consistency" under this constraint too.

kaba0 · on June 11, 2022

> so the signed integer you're using to count stuff up from zero might be -16 now, which is astonishing

Actually, if it is an int, it is guaranteed to not be any number not explicitly set to (java has no-out-of-thin-air guarantees for 32-bit primitives). In practice on every modern implementation it is true of 64-bit primitives as well.

So the prototypical data race condition of incrementing a primitive counter from n threads can loose counts, but will never have any value outside the 0..TRUE_COUNT range.

tialaramex · on June 15, 2022

Ooh, I did not know this. Do you happen to know where the "no-out-of-thin-air" guarantee is for the 32-bit primitives? Presumably in the Memory model docs somewhere?

kaba0 · on June 15, 2022

I quickly glanced at the spec, but didn’t find it. But I didn’t make it up though, I remember reading it on the jvm mailing list and I found it here described by Brian Goetz himself: https://openjdk.org/projects/valhalla/design-notes/state-of-...

tialaramex · on June 15, 2022

Thanks for that, it makes sense but it's good to know somebody specifically thought about this rather than "Eh, it seems to work".

layer8 · on June 11, 2022

Java code actually consciously tolerates data races for performance reasons, the prototypical example being the implementation of String#hashCode() (racy single-check idiom).

Santosh83 · on June 11, 2022

Memory safe, excellent tooling, excellent base std library, no manual memory management, static binaries, trivial cross compilation, trivial concurrent programming etc. These advantages are still not inconsiderable over C, although not C++. Yes, however, it is not as "safe" as Rust. But downside is C++ & Rust are harder to design for upfront.

masklinn · on June 11, 2022

> Memory safe [...] trivial concurrent programming

As this post demonstrates, data races are trivial and common in Go.

Because many of the core types (interface, slice, map) are non-thread-safe multiword structures, they also break memory safety: https://blog.stalkr.net/2015/04/golang-data-races-to-break-m...

throwaway894345 · on June 11, 2022

I’ve been programming in Go for a decade at this point, and I’m sure I’ve probably run into a data race before, but for the life of me, I can’t think of any specific instances. I’m not sure they’re as common as you think.

Far more often, I’ll run into race conditions in some service (multiple processes touching some network state concurrently), but this happens as often in Go as in Rust or any other language.

mawfig · on June 11, 2022

I suspect if you're not particularly looking for data races, you probably won't recognize their effects when these bugs occur. There is a very large set of C and C++ apps which don't run ASan or UBSan and have a long tail of bugs that are closed as "can't repro" or "probably fixed by x" that are actually the result of UB.

kaba0 · on June 11, 2022

> trivial concurrent programming

If it is chock-full of race conditions, it is not trivial.

beltsazar · on June 11, 2022

> "Oh, data race freedom isn't such a big deal, if you really need it, a garbage collected language like Java will give you that guarantee."

Java programs aren't guaranteed to be free of data race. Java spec guarantees that if that happens, there will be no undefined behavior (like in C++).

ohazi · on June 11, 2022

Hm, maybe I was misremembering. Managed languages like Java do give you memory safety, but I guess data race freedom isn't actually guaranteed.

Now that I think about it, this must be the case, right? You have to get `synchronized` right in Java or else you won't get what you expect.

fulafel · on June 11, 2022

There's a specified Java memory model (JMM) and in the case the programmer shooting themself in the foot with concurrency by eg buggy synchronization, it provides just enough guarantees to protect the integrity of the state of the runtime itself.

(I didn't find this integirty of runtime specified in the JMM spec, hopefully it's in the other specs).

In the JMM terminology, the "you're in the clear" term is "well-formed execution". If you break the rules, you're not in "well-formed execution" land any more, and things may fly out of your orifices, but a specific type of C/C++ style dragon won't maybe fly out of your nose.

So there's a weak kind of memory safety, your app data in may still be garbled, possibly in an attacker-controlled way, but the attacker probably won't get remote code execution.

oconnor663 · on June 11, 2022

There's a tricky distinction here. I'm pretty sure Java does provide data race freedom, in the specific sense that a data race is "Undefined Behavior caused by a write overlapping with another read or write". The Java standard says that the JVM isn't allowed to trigger this sort of undefined behavior. (Maybe some people say it's technically still a data race? I'm not sure of the right formal definition, but anyway the important thing is that in Java the UB doesn't happen here.) However, what happens when you do that can still be extremely tricky, and I think the Java compiler is still allowed to reorder reads and writes in ways that'll be extremely confusing if you have code that looks like a data race. It won't give an attacker arbitrary code execution, but it's very likely still a bug.

tialaramex · on June 11, 2022

Data Race Freedom is, unsurprisingly, Freedom from Data Races. A Data Race is any time when there's concurrent modification of a memory value, on modern hardware with multiple simultaneous execution contexts those modifications could in some sense happen at the same moment.

[NB: Data Races are a subset of Race Conditions. Race Conditions are sometimes just a fact about the world and you need to write programs that cope with this, but they are not necessarily Data Races, if you copy all the files from folder A to folder B, and then delete folder A, somebody meanwhile adding a file to folder A which you then delete despite not having copied it would be a Race Condition, but it is not a Data Race. ]

The reason you want Data Race Freedom is that it's easy for a programming language to offer Sequential Consistency if you have Data Race Freedom, this guarantee is called SC/DRF.

Why do we want Sequential Consistency? Sequential Consistency is when programs behave as if stuff happened in some sequence. The disk reader gets a block from disk and then the encryptor applies AES/GCM to the block and then the network writer sends the encrypted block to the client. It turns out humans value this very much when trying to reason about any non-trivial program. Get rid of Sequential Consistency and the programmers are just confused and can't solve bugs.

So, we want SC/DRF and in most languages you get that by being very careful to obey the rules to avoid Data Races. If you screw up, you don't have Sequential Consistency. In most languages you lose more than that (in C or C++ you immediately have Undefined Behaviour, game over, all bets are off), but even just losing Sequential Consistency is very bad news.

Safe Rust promises DRF and thus SC. So instead of being very careful you can just write safe Rust.

jonathanstrange · on June 11, 2022

AFAIK, you still have to be very careful since data races based on data dependencies can never be excluded in general, that is theoretically not possible. What you get is a guarantee that your program is not in an undefined state. There are still plenty of ways to shoot yourself in the foot with incorrect synchronization.

tialaramex · on June 11, 2022

> AFAIK, you still have to be very careful since data races based on data dependencies can never be excluded in general

Hmm. Maybe I don't understand what you're getting at here. It seems like you're suggesting something like a[b] = x could race in safe Rust because we don't know b in advance and maybe it ends up being the same in two threads ?

But Rust's borrow checker won't allow both threads to have the same mutable array a so this is ruled out. You're going to have to either give them immutable references to a, which then can't be modified and so there's no data race, or else they need different arrays.

This is boringly easy to get right in theory, Rust just has to do a lot of work to make it usable while still delivering excellent runtime performance.

eru · on June 13, 2022

> AFAIK, you still have to be very careful since data races based on data dependencies can never be excluded in general, that is theoretically not possible.

Well, you could always require the programmer to supply a proof that the program is gonna be fine, before you compile anything.

(That means your programming language won't be Turing complete, but you can still code up anything you want in practice. Including Turing machines.)

The likes of Agda and Coq work in this way.

oconnor663 · on June 11, 2022

> A Data Race is any time when there's concurrent modification of a memory value

I do want to nail down the terminology, so help me with this scenario: Two simultaneous relaxed atomic writes to the same variable from different threads. To my understanding, this is not a data race (since this is allowed, while data races are never allowed), but it is concurrent. Do I have that right?

tialaramex · on June 12, 2022

Well spotted. This is arguably a hole, albeit a deliberate one. In practice the main reason people do this is collecting some sort of metric, whose exact value is unimportant and which anyway isn't contemplated by the machine.

If your program tries to actually act on this data then yeah, you have successfully made your own life unnecessarily exciting and debugging your program may be difficult. I think it's fair to say you've only yourself to blame though since you had to explicitly choose this.

oconnor663 · on June 12, 2022

As an interesting example of doing something meaningful with relaxed arithmetic, Arc uses a relaxed fetch_add to increment the refcount: https://doc.rust-lang.org/src/alloc/sync.rs.html#1331-1343. Decrementing, however, uses acquire-release. Apparently shared_ptr in C++ is similar.

tialaramex · on June 13, 2022

Tried to follow this link on my phone and the browser blew up, so I'll look when I'm home. Doubtless in both cases (Rust and C++) people who are much smarter than me have reasoned that it's correct and perhaps if I read the link I'll agree. But if you just asked me off the cuff my guess would have been this wasn't safe and so they should be Acquire-Release.

ackfoobar · on June 11, 2022

> there will be no undefined behavior

and no out-of-thin-air values.

WuxiFingerHold · on June 12, 2022

> Why do people keep adopting this language? Where's the appeal?

There're many aspects to consider when evaluating a tool. To me, Go has one of the best overall packages:

- std lib - tooling - performance - concurrency - relatively easy to get devs - reliable - mature

Also, Go has no substantial drawback. I personally consider an external runtime a drawback, for example.

I also use Rust personally. This discussion shows the value of Rust in terms of correctness. But for my professional projects Rust lacks the ecosystem guarantees that Go has with its great and useful standard lib. Looking at the Cargo dependencies of a mid size Rust web service is scary and reminds me of NPM. A large fraction of essential libs are maintained by a single person. Or unfortunately unmaintained. Rust with Go's std lib would be truly great.

Groxx · on June 11, 2022

If someone is claiming the garbage collection means freedom from data races, they are unambiguously wrong.

Garbage collectors solve double-free bugs and usually memory leaks due to cyclic references.

eru · on June 13, 2022

Yes. Though garbage collectors can have an indirect influence on the design of the language, that makes it easier to handle data races.

(As an example, image how much simpler Rust would be, if they went with garbage collection. Or how much more machinery Haskell would need, if they went with Rust's memory management strategies.)

Groxx · on June 13, 2022

I'm not sure what Rust would gain from a garbage collector - it'd still need all of lifetimes for instance, because ownership is the necessary piece for preventing races.

crabbygrabby · on June 11, 2022

Go has some appeal in general. It's super easy to stand up webby glue appy things in go, and it has a solid cloud ecosystem, maybe the best cloud ecosystem.

That said, people ragging on rust pushing that trope are basically just making stuff up to hate on it. Anyone who looks into the language and views programming languages as tools and understands these issues gets why someone might use rust.

But yea, it's ironic... Especially seeing how many times I've seen smart colleagues get go concurrency wrong.

throwaway894345 · on June 11, 2022

I’ve been dabbling with Rust on and off since about 2014, I was actually surprised at how difficult multithreading still is in the language. It’s neat that it stops you from doing things that could be incorrect, but I couldn’t get anything to compile in the first place, even with mutexes guarding the shared memory (it’s been ~6+ months since I last tried and I’ve forgotten the details except that I ended up reverting to a single-threaded implementation).

hintymad · on June 10, 2022

> and contains approximately 2,100 unique Go services (and growing).

A side topic: this is really not something to be proud of. There used to be more people than quantity of work in Uber and engineers fought for credits by building bogus decomposed services, and the sheer number of services seems indicate it's still so.

jeremy_wiebe · on June 11, 2022

That number really store out to me too. I’d be very curious how they decide what becomes a separate service.

I’d also be curious if the 50m lines of code included generated code.

shp0ngle · on June 11, 2022

I did not work in Uber, but a similar company, and... it's a very political thing, usually.

Taking an existing service and making it into 2 new microservices is a "thing you did". Suddenly, you have "impact" and can claim the new service as "yours". Everyone wants to be a king of their little kingdom.

crabbygrabby · on June 11, 2022

You are me and I am you, I'll always be with you

jordanbeiber · on June 11, 2022

If they still use cadence/temporal[0] extensively this kind of blurs the concept of technical ”services”.

We’ve started to use it (temporal) a bit for general automations, and it’s pretty great. Monorepo with a lot of different activities (“microservices”) makes sense.

The activites are orchestrated in workflows (much like DDD “sagas”) and scheduled via temporal. This gives awesome introspection and observability.

[0] https://temporal.io/

DominoTree · on June 11, 2022

Seems like Rob Pike and co may have failed

"The key point here is our programmers... They’re not capable of understanding a brilliant language... So, the language that we give them has to be easy for them to understand"

laerus · on June 11, 2022

Yep, if Google wasn't behind Go the language would have already been history like so many other half baked technologies. It won't be long untill people will talk about Golang like they do about JavaScript.

icedchai · on June 11, 2022

They said the same thing about JavaScript in the 90's. If Netscape wasn't behind it... Over 25 years later, it's still going strong, even though the entire HTML/CSS/JS model of "app development" is and always was half baked.

crabbygrabby · on June 11, 2022

JavaScript is nutty but let's be honest, js is one of the most widely used programming languages. For better and for worse.

I don't think go lang will die out because it does get some things right. Unfortunately, there's still a bit of things going wrong.

Groxx · on June 11, 2022

Honestly, I expect Go to be lumped together with JavaScript in terms of "weird but still in use" languages in the long run.

There are so many surprising footguns and unsafe patterns that it really stands out as a risky language to me. But it has Google's (implied) backing and it works well enough to be used, and the performance is very good in general.

By this point it can probably survive for quite a long time on momentum alone. Which makes it a moderately-safe-to-use-in-a-business language.

philosopher1234 · on June 11, 2022

What’s the relevance of the quote? It makes you feel insulted?

eru · on June 13, 2022

The relevance is that the submitted article shows that you need to be rather smart to avoid Go's pitfalls.

aaronbwebber · on June 10, 2022

At least some of these would be caught by running your tests with race detection on? I haven't read the whole article yet but as soon as I read the loop variable one I was pretty sure I have written code with that exact bug and had it caught by tests...

https://go.dev/doc/articles/race_detector

Edit: at the _end_ of the post, they mention that this is the second of two blog posts talking about this, and in the first post they explain that they caught these by deploying the default race detector and why they haven't been running it as part of CI (tl;dr it's slower and more resource-expensive and they had a large backlog).

https://eng.uber.com/dynamic-data-race-detection-in-go-code/

Klasiaster · on June 10, 2022

My favorite example is the IP address type which is an alias for a slice of bytes (type IP []byte). Thus, it gets passed by reference instead of by value and you easily end up working on the same data even if you didn't plan to. This will just be a logical bug but there are data structures in Go which result in memory corruption and introduce the risk of (remote) code execution vulnerabilities.

armitron · on June 11, 2022

> Thus, it gets passed by reference instead of by value

Everything in Go is passed by value, including slices.

Go has no reference types, but a lot of people think it does, and that’s a problem. An example of the much bigger problem of low barrier to entry programming, where lots of folks write code but have no deep understanding of the tools they use.

If there’s something that Go proves, it’s that one can’t make a language “idiot proof”. Rust has the same problem in terms of folks creating mess after mess with it, except it’s higher barrier to entry and gets a better caliber of people using it.

tialaramex · on June 11, 2022

You can't make a language "idiot proof" but by changing how we think about the problem we can make a huge difference.

The trick is, what programs do we even want to exist? There's no need to be able to write all the programs you didn't want. In Rust, such programs get consigned to unsafe, which means that yes, sometimes to do general purpose programming (and especially e.g. in Rust's own stdlib) you must use unsafe Rust. But it already means we can constrain "idiots" (or more reasonably, new programmers) to safe Rust and rule out all those problems that aren't in the reduced domain of safe Rust.

You can go much further than Rust. WUFFS isn't a general purpose language at all. While a Rust compiler written entirely in Rust isn't a priority it'll likely happen sooner or later, but a WUFFS compiler written in WUFFS is nonsense, WUFFS doesn't even have strings. WUFFS is for, well, Wrangling Untrusted File Formats Safely, hence the name. Notice not files just the file format. WUFFS has no idea what a file is, no file APIs, since it doesn't know what strings are it couldn't easily name files anyway. But inside its domain WUFFS gets to be 100% safe while also being faster than code you'd actually write in other languages.

Take buffer overflow buffer[n]. In a language like C++ direct access isn't bounds checked and so overflows are common when n is too large, too dangerous. OK, in a language like (safe) Rust this access is bounds checked, now the overflow is prevented when n is too large but the bounds check cost CPU cycles, a little slower.

WUFFS doesn't do either, in WUFFS that variable n was used to index into buffer therefore n is constrained to be 0 <= n < buffer size. If the compiler can see any way that n might exceed this constraint your program does not compile. As a result at runtime there's no overflow and no bounds checking.

A complete idiot's WUFFS GIF decoder might be wrong - it could report spurious decoding errors, it could decode a blue dog as a pink roller skate, render images upside down or even decode JPEG instead of GIF - whatever, but it can't escape the limits of WUFFS itself. It can't go off piste and send your password database to a remote HTTP server or delete all your logs, or send spam emails or run some machine code it found inside the supposed GIF file.

eru · on June 13, 2022

For anyone curious: https://github.com/google/wuffs

dwrodri · on June 11, 2022

Is it just me, or is Golang's concurrency a very double edged sword? My exposure to goroutines and channels was mind-blowing, but I still really struggle reading through Go code and understanding what's going on in memory. The way that Go "abstracts away" ownership feels more like it's hiding important details rather than unnecessary minutia.

Here's a simple question that's stumped me for some time: if multiple go routines are popping values out of a channel, does the channel need a mutex? Why do the "fan-out, fan-in?" examples in the "Pipelines and Cancellation" post on the Go blog not require mutex locks? Link here: https://go.dev/blog/pipelines

Stuff like that, along with the ambiguity of intializing stuff by value vs using make, the memory semantics of some of the primitives (slices, channels, etc). None of it was like "of course". If something is a reference, I'd rather the language tell me it's a reference. Maybe I'm still too new to the language.

bombela · on June 11, 2022

I am pretty sure the channels are safe to use with multiple producers and multiple consumers (mpmc). But somehow I cannot easily find any official doc clarifying that.

Go doesn't do anything to help you with memory safety around concurrency. And the design of the language is also not helping you avoid logical bugs.

After using Rust, all other imperative languages feel like using an angle grinder with a wood saw blade and no guard. Sure you can do really good if you are careful. But thing will go sideways remarkably quickly. And with the constant urgency of shipping for yesterday. It makes sense most programs look like the aftermath of The Boys show.

Groxx · on June 11, 2022

They are MPMC except for closing (which panics if anything writes after close), yes.

eurasiantiger · on June 11, 2022

”Our Go monorepo consists of about 50 million lines of code (and growing) and contains approximately 2,100 unique Go services (and growing).”

What the hell is that company doing?

Try to imagine an ERD or DFD of their day-to-day operations. 2,100 unique services…

dubswithus · on June 10, 2022

> We developed a system to detect data races at Uber using a dynamic data race detection technique. This system, over a period of six months, detected about 2,000 data races in our Go code base, of which our developers already fixed ~1,100 data races.

This isn't open source, correct?

aaronbwebber · on June 10, 2022

Yes it is, it's part of the standard go toolchain as described in the first blog post in the series: https://eng.uber.com/dynamic-data-race-detection-in-go-code/

freyr · on June 11, 2022

> Uber has adopted Golang (Go for short)

Uber has adopted Go (Golang for long)

travisd · on June 10, 2022

Worth noting that some of these can be detected statically -- and some are detected by go vet (e.g., passing a sync.Mutex by value). I don't think it detects the wg.Add bug, but that seems relatively straightforward(†) to add a check for.

(†famous last words, I know)

kjksf · on June 10, 2022

staticcheck has a check for wg.Add misuse (https://staticcheck.io/docs/checks, https://staticcheck.io/docs/checks#SA2000)

jasonhansel · on June 11, 2022

I think the root cause of a lot of these data races is that Go has no way of marking variables/fields/pointers/etc. as immutable or constant. This makes it easy to lose track of shared mutable state.

It's not just data races--it's also logical races, which are near-impossible to detect or prevent without something like transactional memory.

eru · on June 13, 2022

Yes, missing immutability is a big part of the problem in Go.

Given that Go eschewed generics for the longest time, I can sort of see why they left out immutability markers:

To keep your sanity, you'd want some functions to take (and return!) both mutable and immutable data, as the situation requires. But some other functions should only take mutable data or only immutable data.

Thus ideally you'd need some kind of 'generic' (im-)mutability handling in your type system.

(Rust's borrow checker is basically one way to really deal with this (im-)mutability genericity.

For example, a function to do binary search on a sorted array doesn't change the array; thus it could take either a mutable or immutable version. But if the array changes (via another thread) while the function is running, then you might get into trouble.)

baalimago · on June 11, 2022

Passing sync.Mutex and sync.WaitGroup as value is especially irritating considering that the context.Context is meant to be passed as such

sharno · on June 11, 2022

Go picked the concurrency ideas of Erlang but then ignored the main safeguard that makes Erlang's concurrency fearless: Immutability.

And if you feel that Erlang's lack of type safety is an issue, then Gleam has you covered.

masklinn · on June 13, 2022

> Go picked the concurrency ideas of Erlang but then ignored the main safeguard that makes Erlang's concurrency fearless: Immutability.

Not even immutability, isolation.

Though obviously immutability makes things less weird, the real gain in terms of concurrency is that you can’t touch any data other than your own (process’s), and for the most part erlang doesn’t “cheat” either: aside from binaries, terms are actually copied over when sent, each process having its own heap.

Sequential erlang could be a procedural language based around mutability and it wouldn’t much alter its reliability guarantees (assuming binaries remain immutable, or become COW).

eru · on June 13, 2022

Erlang was fun to use!

But even with Erlang, concurrency is hard. Any single process's data is immutable, but if you split a process in twain, the resulting union can behave as if it had mutable state.

And let's not forget about ETS (term storage), which is basically a mutable hash table that you often have to use to get anything done.

In any case, I agree that Go did _not_ improve on Erlang.

thallavajhula · on June 10, 2022

>We developed a system to detect data races at Uber using a dynamic data race detection technique.

By system do you mean a process or a tool that detects these?

ryanschneider · on June 11, 2022

What’s up with the totally broken syntax highlighting in this post, at least on iOS? 2100 micro services and not one of them is a valid syntax highlighter for blog posts.

Edit: oh I see it highlights red and underlines every keyword. I find that incredibly distracting, so much so I assumed their highlighter was broken, but also just realized they are screenshots.

fmakunbound · on June 11, 2022

> contains approximately 2,100 unique Go services (and growing)

How is that possible and what do they do???

Tomis02 · on June 11, 2022

Sadly, people get rewarded for what they built (even if makes everyone's lives harder in the long run) than for exercising restraint and saying "no".

erik_seaberg · on June 11, 2022

There’s a perception that big tech pays longtime employees less than they’re willing to offer new candidates, making promotion or job hopping the best ways to earn the current market rate. And if you copy Google’s promo process, you get promo-driven development, because there aren’t enough projects that actually need that level of complexity.

eru · on June 13, 2022

And sadly, Google isn't even the worst. At least they seemed to allocate quite a few resources for turning off old systems and decommissioning them.

At many other enterprises, old systems never properly die.

It's just as hard, or often harder, to migrate off the last few uses of a system compared to launching a new system. But while you can get promoted for launching a great-enough new system in almost any organisation, good luck getting promoted for your heroic efforts in shutting down obsolete systems.

(I guess it's technically possible. Just unlikely in most places.)

JamesSwift · on June 10, 2022

Really good article, and gave me several ideas to track down some gremlins that have been bugging me in a go codebase recently.

rapiz · on June 11, 2022

Don't take me seriously: I heard people both saying Golang is a better C/Golang is a worse C

gigatexal · on June 11, 2022

In the closure example does declaring a new variable and setting its value to the iterative or the thing being passed in, does that mitigate the pass by reference issue?

yencabulator · on June 11, 2022

Yes.

    for i := 0; i < 10; i++ {
        i := i
        ...
    }

https://go.dev/play/p/P7TunJCL7RS

gigatexal · on June 11, 2022

Thank you!

The example is brilliant. Thanks.

fellellor · on June 11, 2022

https://youtu.be/-bCkha6U70o?t=787

oconnor663 · on June 11, 2022

Is there a mistake in Figure 2? It looks like myResults is captured, but the capture is never used?

smw1218 · on June 14, 2022

I wrote a deep analysis/reaction to this post:

https://medium.com/@scott_white/concurrency-in-go-is-not-mag...

tl;dr Go doesn't magically solve data races and blaming the language itself isn't well supported by the examples/data.

stevefan1999 · on June 11, 2022

can you guys put it up as a SonarQube rules?

metadat · on June 10, 2022

> 2. Slices are confusing types that create subtle and hard-to-diagnose data races

The "Slices" example is just nasty! Like, this is just damning for Go's promise of "_relatively_ easy and carefree concurrency".

Think about it for a second or two,

>> The reference to the slice was resized in the middle of an append operation from another async routine.

What exactly happens in these cases? How can I trust myself, as a fallible human being, to reason about such cases when I'm trying to efficiently roll up a list of results. :-/

Compared to every other remotely mainstream language, perhaps even C++, these are extremely subtle and sharp.. nigh, razor sharp edges. Yuck.

One big takeaway is this harsh realization: Golang guarantees are scant more than what is offered by pure, relatively naïve and unadulterated BASH shell programming. I still will use it, but with newfound fear.

As a multi-hundred-kloc-authoring-gopher: I love Go, and this article is killing me inside. Go appears extremely sloppy at the edges of the envelope and language boundaries, moreso than even I had ever realized prior to now.

Full-disclosure: I am disgusted by the company that is Uber, but I'm grateful to the talented folks who've cast a light on this cesspool region of Golang. Thank you!

p.s. inane aside: I never would've guessed that in 2022, Java would start looking more and more appealing in new ways. Until now I've been more or less "all-in" on Go for years.

hintymad · on June 10, 2022

> I never would've guessed that in 2022, Java would start looking more and more appealing in new ways.

I don't quite understand the hatred (to the point of shouting "using Java? Over my dead body), especially in startups, towards Java. I mean, it's a language, big deal. Java's ecosystem more than enough offsets whatever inefficiencies in the language itself, at least for building many of the internal CRUD services. Besides, people like Martin Thompson shows us how to build low-latency applications with ease too. Libraries like JCTools beat the shit out of many new languages when it comes to concurrency for productivity, performance, and reliability. How many engineers in startups claim that they hate Elasticsearch because "Java sucks"? Yet how many can really build a platform as versatile as ES or a Lucene replacement with economical advantages? How many people in startups openly despise Spark or Flink and set out to build a replace because "Java is slow and ugly". Yeah, I've seen a few. And a payment company insists that Rust is the best language because "GC is inefficient and ugly", even though they are still in the phase of product iteration and all their services simply wrap around payment gateways? What's the point?

Disclaimer: I use Go in work. It's not like I have skin in the game for speaking about Java.

jasonhansel · on June 11, 2022

I actually think the Java ecosystem is part of why people dislike it. Java seems to attract a lot of extremely heavyweight frameworks (like Spring) that are too complex to fully understand and too heavyweight to make sense for most projects.

vips7L · on June 12, 2022

So use helidon, micronaut, javalin, or spark if you want something small, but I suspect in any real application you’ll just end up recreating half of spring. That’s what my company did and it’s not near the quality of anything in Spring.

pkolaczk · on June 11, 2022

> how to build low-latency applications with ease too

That's a bit of a stretch. Surely, you can build low-latency apps, but I'd be very careful with the "with ease" bit. Low-latency Java often means zero heap allocations, aggressive object avoidance / reuse, heavy use of primitive types everywhere, so it is very much low-level like C, only with no tools that even plain old C offers, e.g. no true stack-allocated structs and no pointers. And forget about all the high-level zero-cost abstractions that C++ and Rust offer.

hintymad · on June 11, 2022

Fair point. The "with ease" part has also to do with Java's ecosystem. For instance, Martin Thompson used to teach people how to write a single-producer-multi-consumer queue. In a matter of hours, people can achieve 100M+ reads and writes on a 2014 MacBook Pro (I understand that throughput is different from latency, but given the fixed number of CPUs in this case, the latency of such implementation is also phenomenal). Better yet, Java folks have libraries like JCTools, so they don't event have to spend that few hours to get even higher performance.

My litmus test is how fast one can implement functionalities of the data structures/algorithms in the book The Art of Multiprocessor Programming in production quality. It looks chic languages like Rust are not there yet.

pkolaczk · on June 12, 2022

Don't forget "chic" languages like Rust have the whole C ecosystem at their disposal.

Having done Java for many years and recently also done Rust, I'm not very convinced one ecosystem is richer than the other, when we talk about high performance computing. I've already hit a few things that are present in Rust I wished to have in Java. Generally I find the multithreading/concurrency libraries available in Rust very good.

hintymad · on June 12, 2022

Definitely not everything or HPC. I was talking about building services that don’t need granular management of memory.

christophilus · on June 11, 2022

For me, Java and MySQL kind of died* when they became an Oracle thing. I just don’t want to go near anything that Oracle touches.

The other thing is that I tend to write little programs where simple deployment on a low-resource machine is desirable.

Go can handle that. Java kind of does the job with Graal now.

The JVM is incredible, though, and I love Clojure. I’m hoping that Loom + Graal helps to kickstart more competition in the “concurrent, parallel, simple to deploy” space.

* Died to me; obviously they’re both alive and well in the broad world.

kaba0 · on June 11, 2022

> For me, Java and MySQL kind of died* when they became an Oracle thing. I just don’t want to go near anything that Oracle touches.

Come on, that’s a cheap reason (for Java, for open source db I would also go with postgre but for different reasons). Java is one of the very few languages with a full specification (not “whatever our compiler does, that’s the spec”), it has plenty of fully independent full implementations that even pass one of the most detailed test suites for complete spec-compliance, and the platform is so so much ingrained in the biggest corporations that any one of the following companies could easily, single-handedly finance the future of Java if anything were to happen: Apple, Google, Microsoft, Amazon, Alibaba.

And for all the bad things one can without doubt throw at Oracle, they are surprisingly good at shepherding the language and platform. It has been growing in a very good direction with fast update cycle, it has state-of-the art research and development going on, and with Loom on the near horizon and Valhalla on the slightly further horizon I would say Java has one of the brightest futures ahead. Like, Valhalla would bring automagically a huge performance improvement for free, and Java is very competitive in performance as is.

sudarshnachakra · on June 13, 2022

Agree that Java is pretty good with records / sealed types / loom, but one nice thing about the Oracle Java team is they do not add half baked features (primarily since they have the last mover advantage) - for (e.g.) Valhalla will have value types, but they'll be immutable so they can be freely copied and used. Loom will have structured concurrency on debut, which IMHO makes vthreads manageable.

But I've my own apprehensions about loom which actually breaks synchronized blocks (by pinning the carrier thread), and are used extensively in legacy libraries and even in the more recent ones (like opentelemetry java sdk).

vips7L · on June 12, 2022

Oracle has done great things for Java. The renewed investment over the past few years are bringing amazing things.

tomcam · on June 11, 2022

I like how this was downvoted with no comments. Stay classy, HN

voidfunc · on June 11, 2022

Written a lot of Java, Python, and Go in my career... every single time I see someone take a hardline stance against Java it's always because they had one particular bad experience with it 15-20 years ago and couldn't bend it to their will like Python or Lisp. Or they fought Maven, or some other ancillary tool. Or they rail on the generics and yet the use-cases they come up with for true reified generics are generally niche.

Java's got problems. The biggest one is the framework laden ecosystem and that some of the frameworks are all or nothing. But the language and runtime are rock solid. I don't get the hate.

icedchai · on June 11, 2022

Mostly Java, Python, and a bit of Go here, also. If you're not sure what language to develop a back end service in, you'll rarely go wrong by picking Java. The JVM absolutely is rock solid. The number of libraries and frameworks available is amazing. If you stay way from heavy weight frameworks and use something leaner like Spring Boot or Dropwizard, you'll be fine.

jchw · on June 10, 2022

Slices may just be one of the best and worst parts of Go. They're cumbersome, their behavior sometimes feels 'inexplicable,' and even as an experienced developer you are likely to eventually fallen into one of the traps where your 'obvious' code isn't so obvious.

That said... when programming in programming languages without a slice type, I always want to have one. And though it's confusing at times, the design does actually make sense; without a doubt, it's hard to think of how you would improve on the actual underlying design.

I really wish that Go's container types were persistent immutable or some-such. It wouldn't solve everything, but it feels to me like if they could've managed to do that, it would've been a lot easier to reason about.

masklinn · on June 10, 2022

> And though it's confusing at times, the design does actually make sense; without a doubt, it's hard to think of how you would improve on the actual underlying design.

Go slices are absolutely the worst type in Go, because out of laziness they serve as both slices and vectors rather than have a separate, independent, and opaque vector types.

This schizophrenia is the source of most if not all their traps and issues.

> I really wish that Go's container types were persistent immutable or some-such.

That would go against everything Go holds dear, since it's allergic to immutability and provides no support whatsoever for it (aside from simple information hiding).

jchw · on June 11, 2022

Think you are overstating its importance, but I do agree the language’s biggest pitfalls are easily right here. That said, if you start from first principles and force every feature and construct to be justified ruthlessly, it’s easier to see how they got there. Constness (as a type concept) and immutability are one of those things that can explode into surprising complexity for the language and compiler.

In retrospect, it may have been worth the pain. Maybe in the distant future, Go will have it. For now, if you want a more sophisticated language, options exist, with all the tradeoffs that will entail.

oconnor663 · on June 11, 2022

> it's hard to think of how you would improve on the actual underlying design.

I'm biased as a Rust fan in general, but I think Rust pretty much nails this. Rust distinguishes between a borrowing view of variable length (a slice, spelled &[T]) and an owned allocation of variable length (usually a Vec<T>). Go uses the same type for both, which makes the language smaller, but it leads to confusion about who's pointing to what when a slice is resized.

jchw · on June 11, 2022

I do pretty much agree there, but also, I'm not sure that solves a whole lot of problems in the frame of Go. Since Go lacks ownership or constness as a concept, it would be weird if there weren't convenience functions for e.g. appending to a slice, because it's always possible for that to be done; if the language didn't do it, the end users could certainly write the function themselves. I think they would've needed to expand the language in order to make meaningful improvements to slices.

knorker · on June 10, 2022

Indeed. Probably the most common data structure ever used, list of stuff, Go managed to make subtle and full of surprises. A knife without a handle.

This makes the stated reason for the delay of generics hard to understand. They didn't wait to get list/vector/array/slice right.

jasonwatkinspdx · on June 11, 2022

> it's hard to think of how you would improve on the actual underlying design.

I think ranges are part of D's design they got right, and I think a similar abstraction would be in line with golang's general design ethos, GC design, etc, other than perhaps some folks might pattern match it as "this is like STL therefor bad burn it with fire etc" without actually thinking about it in detail.

kosherhurricane · on June 10, 2022

>>> The reference to the slice was resized in the middle of an append operation from another async routine.

> What exactly happens in these cases?

Go's append looks like this:

mySlice = append(mySlice, newItem)

To me, this makes it very clear that 1) mySlice pointer can now point to someplace entirely different in memory, and 2) there maybe new allocation.

I write both Java and Go. For personal projects, I always choose go.

Groxx · on June 10, 2022

The append pattern also implies the opposite of reality, in that it also (usually!) mutates mySlice. Which is the source of one of the two(?) possible races in that piece of code.

sidlls · on June 11, 2022

I'm no fan of go, although I think it's better than many other languages for services, but the argument here is against the label "append", not the operation. It's a poor name for the operation, but the documentation is quite clear about what's going on. I'd argue that understanding the keywords and builtins of a language is the bare minimum an engineer should do before he starts writing anything in it.

erik_seaberg · on June 11, 2022

  s2 := append(s1, x)
  s3 := append(s1, y)

shouldn’t be allowed, because what it’s likely to do is not what anyone meant. In a pass-by-value language, passing a slice or map by value should copy it, append should be a method that returns void, and passing a pointer should be the way to share state and avoid copies.

sethammons · on June 11, 2022

What would you expect that code to do?

I'd expect to have two different slices, s2 and s3, to contain all the same elements aside from the last.

[a, b, c, x] and [a, b, c, y] and s1 remains [a, b, c]

Groxx · on June 11, 2022

When printing s1, s2, then s3:

It does exactly that, yes: https://go.dev/play/p/rs2FeK_QUjs

    [a b c]
    [a b c x]
    [a b c y]

But maybe it doesn't: https://go.dev/play/p/Na-eL0sOV9e

    [a b c]
    [a b c y]  <- this is now "y"
    [a b c y]

So... maybe they share the same backing array? Lets try setting s2[0] to "z" after appending with the original code: https://go.dev/play/p/mAB-gUb0shB

    [a b c]
    [z b c x]
    [a b c y]

Apparently not. But also apparently yes? https://go.dev/play/p/k1ciGzyS2gc

    [z b c]    <- this changed too
    [z b c y]
    [z b c y]

Let's try appending just one more item before redoing ^ that example, where they all shared the same data: https://go.dev/play/p/5JneXHMeUjx

    [a b c]
    [z b c x x2]
    [a b c y y2]

Notice that in all of these examples, I haven't explicitly declared a length or capacity. There's nothing "funny looking" or clearly intentionally allowing these different behaviors, it's just simple, very-common slice use.

.... so yeah. This is a source of a number of hard-to-track-down bugs.

Jcowell · on June 12, 2022

Is there a bug or issue on this in their GitHub cause wtf is this ?

Groxx · on June 12, 2022

This is expected behavior. It's just an easy thing to screw up by accident.

sethammons · on June 11, 2022

Amazing, very interesting

meowface · on June 11, 2022

Yes, this majorly tripped me up when working on my first big Go project. Spent days hunting for a non-deterministic data corruption issue which was caused by this. It's definitely my fault for not fully reading the documentation and not realizing that append may (and often does) mutate the slice, but I was indeed misled by the `x = append(x, ...)` syntax into assuming it only works off of a copy without modifying the original.

yencabulator · on June 11, 2022

Go's append is pretty much C's realloc, and behaves very much the same; the pointer you get back may or may not be the passed-in pointer.

Also,

> If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.

https://go.dev/ref/spec#Appending_and_copying_slices

throwaway894345 · on June 10, 2022

> What exactly happens in these cases? How can I trust myself, as a fallible human being, to reason about such cases when I'm trying to efficiently roll up a list of results. :-/

For me: minimize shared mutable data. If I really can’t get rid of some shared mutable data, I mutex it or use atomics or similar. This works very well—I almost never run into data races this way, but it is a discipline rather than a technical control, so you might have to deal with coworkers who lack this particular discipline.

metadat · on June 10, 2022

Absolutely, the disappointing part is that as code authors, we need to constantly remember about various (otherwise appealing and even encouraged by the language syntax and control constructs) footguns and "never approach such areas" of (totally valid) syntax.

Reminds me of programming in Javascript (it's extreme example, but the similarity is there).

throwaway894345 · on June 10, 2022

Yeah, it’s a bit disappointing. It doesn’t bother me too much, but it could be improved by a linter which could help you find shared mutable state. Without a concept of “const” (for complex types, anyway), I’m not sure how feasible such a linter would be.

morelisp · on June 10, 2022

Without disagreeing that it's an enormous footgun, one good way to avoid such slice issues is to use the uncommon `a[x:y:z]` form to ensure the slice can't grow. As we're starting to write a lot of generic slice functions with 1.18, we're using this form in almost all of them which may add elements.

masklinn · on June 10, 2022

> one good way to avoid such slice issues is to use the uncommon `a[x:y:z]` form to ensure the slice can't grow.

Do you mean you always use `a[x:y:y]` in order to ensure there is no extra capacity and any append will have to copy the slice?

Is append guaranteed to create a new slice (and copy over the data) if the parameter is at capacity? Because if it could realloc internally then I don't think this trick is safe.

morelisp · on June 10, 2022

> Because if it could realloc internally then I don't think this trick is safe.

Slices are 3 word values of (ptr, len, cap). They cannot be "realloced internally", changing any of those three things requires creating a new slice.

masklinn · on June 10, 2022

Of course but the new slice could be (ptr, len+1, cap+x) because realloc() was able to expand the buffer in-place. Which yields essentially the same behaviour as an append call with leftover capacity.

But I guess realloc is a libc function, and Go probably goes for mmap directly and would implement its own allocator, and so might not do that. Unless / until they decide to add support for it.

morelisp · on June 11, 2022

If you start worrying about "what if <C concern> underlying the Go runtime happens?" you'll find a lot worse than realloc. Luckily, in the absence of runtime bugs, you don't have to think about it.

masklinn · on June 11, 2022

It's not a concern about C, it's a concern about the underlying possibility expressed in C terms: sizeclass arenas are useful to an allocator, that means slack, which means the opportunity for in-place allocation resizing.

You advocated the use of a very specific behaviour of `append` as a DID and possibly a correctness requirement of programs.

My worry is about whether this behaviour is a hard specification of the Go language, or just an implementation detail of the primary Go implementation. And how programs applying your recommendation would handle such behaviour changing.

morelisp · on June 11, 2022

Sorry, this is nonsensically mixed up. Even if it reallocs in place, which it may be free to do, the language semantics still guarantee only one observer gets that extended space. Otherwise you would need to worry about this even when dealing with immutable structures like concatenating strings.

masklinn · on June 10, 2022

> The "Slices" example is just nasty!

I've got to say I'm not entirely clear on what they talk about specifically.

Is it simply that the `results` inside the goroutine will be desync'd from `myResults` (and so the call to myAppend will interact oddly with additional manipulations of results), or is it that the copy can be made mid-update, and `result` itself could be incoherent?

Philip-J-Fry · on June 10, 2022

So a slice consists of a pointer to a backing array, a length and a capacity. If you don't use a pointer and pass this slice around you will copy it.

This is problematic because even though you copy it, you're still pointing at the same backing array.

Therefore, a backing array with data like [1,2,3,4,5] could be pointed at by 2 slice headers (slice metadata) looking like

A: {len: 2, cap: 10} [1,2] B: {len:5, cap: 10} [1,2,3,4,5]

So any append operations on slice A will mess up the data in that backing array.

Now, sometimes your append will resize the slice, in which case the data is copied and a slice with a new larger backing array is returned. If this was happening concurrently then you'd lose the data in racing appends.

If the append doesn't need to resize the slice, then you'll overwrite the data in the backing array. And so you'll corrupt the data in the slice.

Here's an example I threw together: https://go.dev/play/p/qRUKUwIf3vx

Although the code in the post doesn't actually look like it has an issue. Their tooling just flagged it up as it potentially has an issue if the copy was actually used in the function. But the `safeAppend` function targets the correct slice each time.

masklinn · on June 10, 2022

I’m a bit doubtful as what you talk about is definitely a slice issue but it’s already an issue in completely sequential code if you reuse appended-to slices.

So while it’s also an issue in concurrent code, it’s really no more so.

Philip-J-Fry · on June 10, 2022

It is an issue in sequential code because as you say, that's just how slices work. But if you're always using the same variable you'll never encounter it because that slice can't change between you reading that variable and writing to it.

Once concurrency is introduced you can now read from the same variable, but another goroutine may have written to the same slice in the meantime. That's why you must protect the read and writes and synchronise them.

It's fundamentally just a race condition issue with unprotected reads. But people often overlook it in the case of slices because they think they're just taking a reference to the slice, which is safe to do concurrently IF slices were reference types. But they're not, they are copied.

masklinn · on June 11, 2022

> Once concurrency is introduced you can now read from the same variable, but another goroutine may have written to the same slice in the meantime. That's why you must protect the read and writes and synchronise them.

But, again, this can easily occur in sequential code as well: you call a function passing it the slice, it mutates the slice internally, it doesn't document that, or maybe the documentation is even wrong, you now hit this issue.

aaronbee · on June 10, 2022

I believe they made a mistake with that example. It doesn't look unsafe to me because the myResults sliced passed to the goroutine is not used. Or perhaps the racy part was left out of their snippet.

Below is what might be what they have meant. This code snippet is racy because an unsafe read of myResults is done to pass it to the goroutine and then that version of myResults is passed to safeAppend:

  func ProcessAll(uuids []string) {
    var myResults []string
    var mutex sync.Mutex
    safeAppend := func(results []string, res string) {
      mutex.Lock()
      myResults = append(myResults, res)
      mutex.Unlock()
    }

    for _, uuid := range uuids {
      go func(id string, results []string) {
        res := Foo(id)
        safeAppend(myResults, id)
      }(uuid, myResults) # <<< unsafe read of myResults
    } 
  }

EDIT: Formatting and clarity

collinvandyck76 · on June 11, 2022

Yea I think your read on the intent here is correct. I re-read that part five times wondering what I wasn't getting.

symfoniq · on June 10, 2022

I have the same question.

They talk about the "meta fields" of a slice. Is the problem that these "meta fields" (e.g. slice length and capacity) are passed by value, and that by copying them, they can get out of sync between coroutines?