I've never understood the appeal of these "define struct-like-object" libraries (in any language; I've never understood using the standard library's "Struct" in Ruby). My preferred solution for addressing complexity is also to decompose the codebase into small, understandable, single-purpose objects, but so few of them end up being simple value objects like Point3D. Total ordering and value equality make sense for value objects but not much else, so it really doesn't improve understandability or maintenance that much. And concerns like validation I would never want to put in a library for object construction. In web forms where there are a limited subset of rules I always want to treat the same way, sure, but objects have much more complicated relationships with their dependencies that I don't see much value in validating them with libraries.
Overall, I really don't see the appeal. It makes the already simple cases simpler (was that Point3D implementation really that bad?) and does nothing for the more complicated cases which make up the majority of object relationships.
Ignore all of the validation aspects. In Python, you have tuples, `(x, y, z)`, then you have namedtuples, and then attrs/dataclasses/pydantic-style shorthand classes.
These are useful even if only due to the "I can take the three related pieces of information I have and stick them next to each other". That is, if I have some object I'm modelling and it has more than a single attribute (a user with a name and age, or an event with a timestamp and message and optional error code), I have a nice way to model them.
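To make that progression concrete, here is a quick sketch (the `Event` fields mirror the example above; the names are purely illustrative):

```python
from collections import namedtuple
from dataclasses import dataclass
from typing import Optional

# Bare tuple: positions only, no names attached.
p1 = (1.0, 2.0, 3.0)

# namedtuple: named fields, but still a tuple underneath.
Point3 = namedtuple("Point3", ["x", "y", "z"])
p2 = Point3(1.0, 2.0, 3.0)
assert p2.x == 1.0 and p2 == (1.0, 2.0, 3.0)

# dataclass: named, typed fields on a real class.
@dataclass
class Event:
    timestamp: float
    message: str
    error_code: Optional[int] = None  # the "optional error code" case

e = Event(12.5, "boot")
assert e.error_code is None
```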
Then, the important thing is that these are still classes, so you can start with
@dataclass
class User:
    name: str
    age: int
and have that evolve over time to
@dataclass
class User:
    name: str
    age: int
    ...
    permissions: PermissionSet

    @property
    def location(self):
        # send off an rpc, or query the database for some complex thing.
        ...
and since it's still just a class, it'll still work. It absolutely makes modelling the more complex cases easier too.
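A minimal sketch of that "it's still just a class" point (the names here are illustrative, not from any real codebase): instantiation, ordinary methods, and the generated equality all coexist.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    age: int

    def greeting(self) -> str:
        # Hand-written methods sit alongside the generated
        # __init__ / __repr__ / __eq__ with no conflict.
        return f"Hello, {self.name}"

u = User(name="Ada", age=36)
assert isinstance(u, User)           # an ordinary class and instance
assert u.greeting() == "Hello, Ada"  # methods work as usual
assert u == User("Ada", 36)          # generated structural equality
```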
Note that that "location" property should be a method instead of a property, to signal that it does something potentially complex and slow. Making it a property practically guarantees that someone will use it in a loop without much second thought, and that's how you get N+1 queries.
Fair point! One of the various `@cached_property` decorators might fix this, depending on the precise use case, but yeah, this is an important consideration when defining your API.
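For the stdlib flavor, a sketch with `functools.cached_property` (the body of `location` is a hypothetical stand-in for the RPC/database call; note this relies on the instance having a `__dict__`, so it won't work with `slots=True`):

```python
from dataclasses import dataclass
from functools import cached_property

@dataclass
class User:
    name: str

    @cached_property
    def location(self) -> str:
        # Imagine the RPC or database query here. cached_property runs
        # this body once per instance and stores the result in __dict__,
        # so repeated access (e.g. in a loop) doesn't re-trigger it.
        return f"somewhere-for-{self.name}"

u = User("Ada")
first = u.location   # triggers the lookup
second = u.location  # served from the per-instance cache
assert first == second == "somewhere-for-Ada"
```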
well, one appeal is that you don't have to write constructors; that's already enough of a win for me. Then you get sane eq, and sane str, and already you remove 90% of the boilerplate
I really, genuinely don't get the appeal. I don't follow the "less code = better" ideology so maybe that's a contributor but I really don't see how this:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
is any worse than this:
@dataclass
class Person:
    name: str
    age: int
I'm not writing an eq method or a repr method in most cases, so it just doesn't add much for the cost.
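For what it's worth, here is a side-by-side sketch of what the decorator version actually adds over the hand-written one (both classes as in the comment above):

```python
from dataclasses import dataclass

class PlainPerson:
    def __init__(self, name, age):
        self.name = name
        self.age = age

@dataclass
class Person:
    name: str
    age: int

# The vanilla class falls back to identity equality and an opaque repr.
assert PlainPerson("Ada", 36) != PlainPerson("Ada", 36)

# The dataclass gets structural equality and a readable repr for free.
assert Person("Ada", 36) == Person("Ada", 36)
assert repr(Person("Ada", 36)) == "Person(name='Ada', age=36)"
```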
The point is that for data-bag style classes, you end up writing a lot more boilerplate than that if you use them across a project. Validators (type or content), nullable vs not, read-only, etc.
The minimal trivial case doesn't look much different, but if you stacked up 10 data classes with read-only fields vs. bare class implementations with private members plus properties to implement read-only, you would start to see a bigger lift from attrs, as there would be a bunch of boring duplicated logic.
(Or not - if your usecases are all trivial then of course don’t use the library for more complex usecases. But hopefully you can see why this gets complex in some codebases, and why some would reach for a framework.)
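To make the read-only comparison concrete, a sketch with the stdlib (attrs has the equivalent via its frozen option); the hand-rolled version repeats the pattern per field, the declarative one is a single flag:

```python
from dataclasses import dataclass

# Hand-rolled read-only: private member + property, repeated per field.
class ManualPoint:
    def __init__(self, x: float, y: float):
        self._x = x
        self._y = y

    @property
    def x(self) -> float:
        return self._x

    @property
    def y(self) -> float:
        return self._y

# Declarative read-only: one flag covers every field.
@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
try:
    p.x = 99.0  # attempted mutation
except AttributeError:
    pass  # frozen dataclasses raise FrozenInstanceError (an AttributeError)
assert p.x == 1.0
```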
The advantage of dataclasses is that they’re hard to mess up. They define all the methods you need to have an ergonomic idiomatic class that is essentially a tuple with some methods attached and have enough knobs to encompass basically all “normal” uses of classes.
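A few of those knobs, sketched with stdlib options (the `Version` class is just an illustration):

```python
from dataclasses import dataclass, field, astuple

@dataclass(order=True)
class Version:
    major: int
    minor: int
    tags: list = field(default_factory=list)  # safe mutable default

# order=True generates <, <=, >, >= comparing fields like a tuple.
assert Version(1, 2) < Version(1, 10)

# "essentially a tuple with some methods attached":
assert astuple(Version(1, 2)) == (1, 2, [])
```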
It’s a pretty good abstraction that doesn’t feel half as magic as it is.
Given that code is for people, I've never found a certain amount of idiomatic boilerplate a problem. The desire to remove it all, or magicify it away (e.g. Django), has always made me do a bit of an internal eye roll.
To start with, the non-`@dataclass` version here doesn't tell you what types `name` and `age` are (interesting that it's an int, I would have guessed float!). So right off the bat, not only have you had to type every name 3 times, you've also provided me with less information.
> I'm not writing an eq method or a repr method in most cases, so it just doesn't add much for the cost.
That's part of the appeal. With vanilla classes, `__repr__`, `__eq__`, `__hash__` et al. are each an independent, complex choice that you have to intentionally make every time. It's a lot of cognitive overhead. If you ignore it, the class might be fit for purpose for your immediate needs, but later, when debugging, inspecting logs, etc., you will frequently have to incrementally add these features to your data structures, often in a haphazard way. Quick, what are the invariants you have to verify to ensure that your `__eq__`, `__ne__`, `__gt__`, `__le__`, `__lt__`, `__ge__` and `__hash__` methods are compatible with each other? How do you verify that an object is correctly usable as a hash key? The testing burden for all of this stuff is massive if you want to do it correctly, so most libraries that try to eventually add all these methods after the fact for easier debugging and REPL usage usually end up screwing it up in a few places and having a nasty backwards-compatibility mess to clean up.
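To that point, `@dataclass` gets the eq/hash contract right by default, including refusing to make instances hashable when that would be unsafe (a sketch; the field names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Key:
    host: str
    port: int

a = Key("localhost", 8080)
b = Key("localhost", 8080)

# The contract: objects that compare equal must have equal hashes.
assert a == b
assert hash(a) == hash(b)

# So they behave correctly as dict/set keys.
seen = {a: "first"}
assert seen[b] == "first"

# Correct by default: a mutable dataclass with generated __eq__ is
# deliberately unhashable, so it can't be silently misused as a key.
@dataclass
class Mutable:
    x: int

try:
    hash(Mutable(1))
except TypeError:
    pass  # unhashable, by design
```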
With `attrs`, not only do you get this stuff "for free" in a convenient way, you also get it implemented in a way which is very consistent, which is correct by default, and which also provides an API that allows you to do things like enumerate fields on your value types, serialize them in ways that are much more reliable and predictable than e.g. Pickle, emit schemas for interoperation with other programming languages, automatically provide documentation, provide type hints for IDEs, etc.
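The field-enumeration and serialization points hold for stdlib dataclasses too (`attrs` offers the same via `attrs.fields` and `attrs.asdict`); a sketch with the stdlib, using illustrative field names:

```python
from dataclasses import dataclass, fields, asdict
import json

@dataclass
class Event:
    timestamp: float
    message: str

# Enumerate fields programmatically: names and declared types.
assert [f.name for f in fields(Event)] == ["timestamp", "message"]

# Predictable, pickle-free serialization via a plain dict.
e = Event(12.5, "boot")
payload = json.dumps(asdict(e))
assert json.loads(payload) == {"timestamp": 12.5, "message": "boot"}
```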
Fundamentally attrs is far less code for far more correct and useful behavior.
I understand repr for debugging (though imo it's a deficiency of the language that custom objects don't have a repr which lists their attributes), but eq is a property of the domain itself; two objects are only equal if it makes sense in the domain logic for them to be equal, and in many cases that equality is more or less complicated than attribute equality.
> though imo it's a deficiency of the language that custom objects don't have a repr which lists their attributes
It makes perfect sense that attributes be implementation details by default, and `@dataclass` is one of the ways to say they're not.
> eq is a property of the domain itself; two objects are only equal if it makes sense in the domain logic for them to be equal, and in many cases that equality is more or less complicated than attribute equality.
dataclass is intended for data holders, for which structural equality is an excellent default. If you need a more bespoke business object, then you probably should not use a dataclass.
I was merely noting that dataclasses are mostly intended for data holder objects (hence data classes), and thus defaulting to structural equality makes perfect sense, even ignoring it being overridable or disableable.
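Both the default and the opt-out are one flag away, for what it's worth (a sketch):

```python
from dataclasses import dataclass

@dataclass
class Record:           # structural equality by default
    id: int

@dataclass(eq=False)
class Entity:           # opted out: falls back to identity comparison
    id: int

assert Record(1) == Record(1)   # equal field values => equal
assert Entity(1) != Entity(1)   # distinct instances => unequal
```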
This was in reply to this objection:
> eq is a property of the domain itself; two objects are only equal if it makes sense in the domain logic for them to be equal, and in many cases that equality is more or less complicated than attribute equality.