This is true. We have in the past eliminated sentinel errors and gotten measurab...

sethammons · on June 1, 2024

I insist on wrapping all errors each time and only removing that when performance testing shows it to be a bottleneck. A top concern of my systems is debuggability, which includes descriptive, wrapped errors with structured logging (this is a superpower for system development, and I am surprised when folks don't give love to structured logs and detailed, reproducible errors).

I want organizational velocity in the general case. If wrapping an error is in a hot path and shows up in metrics, yeah, remove the wrapping. Otherwise, wrap the error.
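For readers following along, here is a minimal sketch of the kind of wrapping being described, assuming the standard `fmt.Errorf` with `%w`. The names `readConfig`, `loadService`, and `isMissingFile` are made up for illustration and do not come from anyone's codebase.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readConfig is a hypothetical helper: it wraps the underlying error with
// the operation and path, using %w so the cause stays inspectable.
func readConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config %q: %w", path, err)
	}
	return data, nil
}

// loadService adds one more layer of context on the way up the stack.
func loadService(name, path string) error {
	if _, err := readConfig(path); err != nil {
		return fmt.Errorf("load service %s: %w", name, err)
	}
	return nil
}

// isMissingFile reports whether the wrapped chain ends in fs.ErrNotExist.
func isMissingFile(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	err := loadService("billing", "does-not-exist.json")
	fmt.Println(err)
	fmt.Println("missing file:", isMissingFile(err))
}
```

Each layer adds a short prefix to the message, and `errors.Is` can still see the original cause through both wraps. Each wrap also allocates a new error value, which is the cost that could show up on a hot path.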

What is your argument against that? It would seem you find the compute savings of non-wrapped errors outweigh developer time and customer impact. If that is not what you are saying, please correct me.

My creds are using Go since 1.2 and writing massively scaled systems processing billions of events daily for hundreds of thousands of users with 4 to 5 9s of uptime across dozens of services maintained by hundreds of developers earning the company hundreds of millions of dollars.

azurelake · on June 1, 2024

My argument against wrapping for backend services is:

1. I think that it is preferable to handle the error where it happened instead of at the top of the stack. For a backend service, there's really only three things you want to do with an error: log it, maybe bump some metrics, and return an error code and ID to the client. You have a lot more information available (including a stack trace if desired) if you handle it at this point.

2. By wrapping the error up the call stack, you're building an ad hoc stack trace. Performance-wise, this is (probably, haven't measured) a lot better than an actual stack trace, but as you said yourself, the top concern is debuggability and developer velocity.

3. Wrapping an error doesn't provide just a stack though, you can add values to the error! Except...what does that really buy you vs. just adding the values to your structured logging system going down the stack vs. doing it on the way back up in an ad-hoc way? Those wrapped error values are a lot more difficult to work with in Grafana vs. searching based on fields.

4. If I have a stack trace, structured log fields, and a correlation ID, I personally don't get any value out of messages like "could not open file", as I can just use the stack trace to go look at exactly what the line of code is doing. You could argue that with good enough wrapping, looking at the code wouldn't even be necessary, but I think that's pretty rare in practice. It also seems like a lot of extra work just to save a minute of loading up the code in an IDE.

5. As mentioned in 1), what the client gets is just an error code and trace ID anyways. In fact, we actively don't want the wrapped context to be sent back to the client since it can be a security concern. If that's the case, we need to remove it and log it anyways. Why not just log the information in the first place?
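The approach in points 1 through 5 could be sketched roughly as follows, assuming Go 1.21+ and the standard `log/slog` package. The `clientError` type, the `openReport` and `demo` functions, the `REPORT_UNAVAILABLE` code, and the field names are all hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log/slog"
	"os"
)

// clientError is a hypothetical opaque error: all the caller (and eventually
// the client) sees is an error code and a trace ID.
type clientError struct {
	Code    string
	TraceID string
}

func (e *clientError) Error() string {
	return fmt.Sprintf("%s (trace %s)", e.Code, e.TraceID)
}

// openReport logs the failure where it happens, with structured fields,
// and returns only the opaque error up the stack.
func openReport(ctx context.Context, log *slog.Logger, traceID, path string) error {
	f, err := os.Open(path)
	if err != nil {
		log.ErrorContext(ctx, "open report failed",
			slog.String("trace_id", traceID),
			slog.String("path", path),
			slog.Any("cause", err),
		)
		return &clientError{Code: "REPORT_UNAVAILABLE", TraceID: traceID}
	}
	return f.Close()
}

// demo runs openReport with an in-memory JSON logger and returns what the
// client would see alongside the log line that was written.
func demo(path string) (clientMsg, logLine string) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	if err := openReport(context.Background(), logger, "abc123", path); err != nil {
		clientMsg = err.Error()
	}
	return clientMsg, buf.String()
}

func main() {
	clientMsg, logLine := demo("does-not-exist.csv")
	fmt.Println("client sees:", clientMsg)
	fmt.Print("log line:    ", logLine)
}
```

The path and the underlying cause end up as searchable log fields keyed by the trace ID, and nothing internal leaks into the value returned to the client.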

Anyways, curious to hear your thoughts. I used to advocate for wrapping errors, FWIW.

zachmu · on June 1, 2024

My main argument against this practice isn't performance, it is that it makes error handling more difficult to write, review, and maintain. Treating errors as opaque and passing them up the stack in the general case is automatic and trivial to get right. Wrapping them is not.

I agree with your point about debugging, but I have a different idea of how best to achieve it. Rather than wrapping an error at every stack layer, just take a stack trace when the error is created. This works great as long as... you don't design the system to require sentinel errors. Treating errors as rare, exceptional events rather than normal values used for control flow changes how you approach them.
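A minimal sketch of capturing the stack once at creation, assuming `runtime/debug.Stack` from the standard library. The `stackError` type and the `newError`, `stackOf`, `parse`, and `handle` functions are hypothetical names, not an existing library's API.

```go
package main

import (
	"errors"
	"fmt"
	"runtime/debug"
)

// stackError is a hypothetical error type that records the stack once,
// at the point where the error is created.
type stackError struct {
	msg   string
	stack []byte
}

func (e *stackError) Error() string { return e.msg }

// newError captures the calling goroutine's stack at the failure site.
func newError(msg string) error {
	return &stackError{msg: msg, stack: debug.Stack()}
}

// stackOf returns the recorded stack if any error in the chain carries one.
func stackOf(err error) []byte {
	var se *stackError
	if errors.As(err, &se) {
		return se.stack
	}
	return nil
}

func parse(input string) error {
	if input == "" {
		return newError("empty input")
	}
	return nil
}

// handle passes the error up untouched; no per-layer wrapping is needed.
func handle(input string) error {
	return parse(input)
}

func main() {
	if err := handle(""); err != nil {
		fmt.Printf("error: %v\n%s", err, stackOf(err))
	}
}
```

The intermediate layers stay a plain `return err`. The trade-off is that `debug.Stack` is comparatively expensive, which is why this fits errors that are rare events rather than routine control flow.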

kiitos · on June 1, 2024

> Treating errors as opaque and passing them up the stack in the general case...

...is directly contradictory to one of the most fundamental assertions of the language, which is that errors are values -- https://go.dev/blog/errors-are-values -- and therefore "can [and should] be programmed".
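As an illustration of what "programming" with error values can look like, here is a sketch in the spirit of the `errWriter` pattern from the linked post. The surrounding names (`failAfter`, `errDiskFull`, `writeAll`) are invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errWriter remembers the first write error and turns later writes into no-ops.
type errWriter struct {
	w   io.Writer
	err error
}

func (ew *errWriter) write(p []byte) {
	if ew.err != nil {
		return
	}
	_, ew.err = ew.w.Write(p)
}

// errDiskFull and failAfter are hypothetical: a writer that accepts n writes
// and then fails, used only to exercise the pattern.
var errDiskFull = errors.New("disk full")

type failAfter struct{ n int }

func (f *failAfter) Write(p []byte) (int, error) {
	if f.n == 0 {
		return 0, errDiskFull
	}
	f.n--
	return len(p), nil
}

// writeAll writes every chunk and checks the error once, at the end.
func writeAll(w io.Writer, chunks ...string) error {
	ew := &errWriter{w: w}
	for _, c := range chunks {
		ew.write([]byte(c))
	}
	return ew.err
}

func main() {
	fmt.Println(writeAll(&failAfter{n: 1}, "a", "b", "c")) // prints "disk full"
	fmt.Println(writeAll(io.Discard, "a", "b", "c"))       // prints "<nil>"
}
```

The error is stored in a struct field and consulted by ordinary code, so the repeated `if err != nil` checks collapse into one without any special language support.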

zachmu · on June 2, 2024

This article has a hilariously defensive tone.

The reality is that the vast majority of error handling in Go is to do one of two things:

1) pass it up the stack
2) wrap it and pass it up the stack

The fact that you must do this explicitly in all cases is a failure of the language. Many people have pointed this out, but the Go team and elite members in the community are very dedicated to the myth that every error is precious and special and must be handled in a one-off manner.

kiitos · on June 2, 2024

"This article" is an explanation of a property of the language, written by one of its authors. It's not a position piece, it's just an additional bit of documentation.

I mean, your position is totally valid, no argument. But it's definitely not some kind of objective fact (I certainly don't agree). And it's essentially an objection to fundamental properties of the language as it exists. Whether or not those properties represent a failure of the language is a question for the philosophers, but regardless, your code needs to respond to things as they are, not as you wish they were :)