Python formatting is egregious though, with hopelessly long lines and no sign of...

freyir · on Nov 22, 2018

> Python formatting is egregious though, with hopelessly long lines

This is the result of writing hopelessly long lines of code. Python doesn't force you to write long lines.

I've done it myself, many times, but gradually made efforts to avoid this. I'll move deeply indented code into a new function or split up a long expression with a temporary variable. If I run into a particularly hard-to-format portion of code, it's usually a sign that my code could be better.
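A minimal sketch of both techniques. The data, the field names, and the `is_billable` helper are made up for illustration and do not come from any real codebase:

```python
# Hypothetical data; the field names are invented for illustration.
orders = [
    {"id": 1, "total": 120.0, "paid": True, "country": "DE"},
    {"id": 2, "total": 15.0, "paid": False, "country": "US"},
    {"id": 3, "total": 80.0, "paid": True, "country": "US"},
]

# Before: everything crammed into one long line.
long_line = [o["id"] for o in orders if o["paid"] and o["total"] > 50 and o["country"] in ("DE", "FR", "US")]


def is_billable(order):
    """Hypothetical helper: the long condition, split with temporary variables."""
    large_enough = order["total"] > 50
    supported_country = order["country"] in ("DE", "FR", "US")
    return order["paid"] and large_enough and supported_country


# After: the same result, with every line well under 80 columns.
short_lines = [o["id"] for o in orders if is_billable(o)]
```

Naming the intermediate steps also documents what the condition means, which is often the real gain.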

goldfeld · on Nov 22, 2018

But list/dict comprehension puts the programmer in bad shoes, because anything interesting/powerful done with it gets dangerously long. This and other features where non-trivial statements or calls are written seem poorly thought-out from a formatting perspective (never mind 120 columns being the style goal for the language and common IDEs, already longish for terminal vim and Emacs). Adding to it for the other commenter (vaylian) who asked for example code: pretty much any codebase I've been delving into will have at least one ugly line per non-trivial class implemented (some undergo TWO virtual linebreaks!! Maybe OK for mouse-clickers but vimmers recoil and shriek). It just doesn't play nice. I won't go into how I dislike the class system, which feels bolted-on (to a script language) too, because then that'd be a bit unfair and off-topic.

int_19h · on Nov 22, 2018

Comprehensions are very easy to format spanned across multiple lines - since they're always enclosed in some kind of brackets, you can split them over multiple lines in a readable fashion:

   ys = [x + 1
         for x in xs
         if x > 0]

or:

   ys = [
      x + 1
      for x in xs
      if x > 0
   ]

kbenson · on Nov 22, 2018

My biggest issue with list comprehensions is that while it's a useful and fairly readable format for a single list comprehension, either little attention was given to how it would function when taking a comprehension of a comprehension, or they thought the syntax would be so cumbersome people wouldn't do so (ha!).

When compared to a set of functional style mapping and filtering functions, it quickly becomes much less readable as your needs become anything more than trivial. My main issue is that focus for the important aspect of what is being accomplished swings back and forth from the front to the back of the statement multiple times, especially if nested and with conditionals. e.g.

  [(x,y) for x in range(10) for y in range(10) if y > 5 if x < 6]

or, in a similar formatting to what you show:

  [
    (x,y)
    for x in range(10)
      for y in range(10)
      if y > 5
    if x < 6
  ]

In both cases, the accurate reading requires scanning back and forth from beginning to end of statement (whether vertically or horizontally) because the conditionals always postfix the rest of it (and this example is not as complex as it could be). For loops would likely have the conditionals preceding everything, making it obvious, and a set of filtering statements. The functional style also allows for a fairly straightforward reading of what's going on:

  toTen.filter(y=>y>5).flatMap(y=> toTen.filter(x=>x<6).map(x=>[x,y]) );

or:

  toTen.filter(y=>y>5).flatMap(y =>
    toTen.filter(x=>x<6).map(x=>
      [x,y]
    )
  );

Of course the functional style does require at least some minimal knowledge of some concepts often extraneous to novice programmers, so I understand why that wasn't chosen in Python's case. I just wish they had put conditionals in the same positional flow as the rest of the statement.

duckerude · on Nov 23, 2018

That's not quite right. The conditionals are executed in order in the innermost loop. For instance:

  >>> [(x,y) for x in range(2) for y in range(2) if print(x, y) is None if print(x, y) is None]
  0 0
  0 0
  0 1
  0 1
  1 0
  1 0
  1 1
  1 1
  [(0, 0), (0, 1), (1, 0), (1, 1)]

Both conditionals have access to x and y.

So the list comprehension is equivalent to this:

  [(x,y) for x in range(10) for y in range(10) if y > 5 and x < 6]

And could more clearly be formatted like this:

  [
    (x,y)
    for x in range(10)
      for y in range(10)
        if y > 5
          if x < 6
  ]

And would look something like this in a functional style, perhaps:

  toTen.flatMap(x =>
    toTen.filter(y => x<6 && y>5).map(y=>
      [x,y]
    )
  );
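The ordering rule described above can be checked by expanding the comprehension into explicit loops. This is only a sketch; the variable names (`comprehension`, `expanded`, `combined`) are chosen here for illustration:

```python
# The comprehension from the thread, with both conditions at the end.
comprehension = [(x, y) for x in range(10) for y in range(10) if y > 5 if x < 6]

# Expansion: clauses nest left to right, so both `if`s sit inside both loops.
expanded = []
for x in range(10):
    for y in range(10):
        if y > 5:
            if x < 6:
                expanded.append((x, y))

# Chained `if` clauses behave like a single condition joined with `and`.
combined = [(x, y) for x in range(10) for y in range(10) if y > 5 and x < 6]
```

All three build the same list, since each clause simply nests inside the one to its left.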

kbenson · on Nov 23, 2018

Ah, thanks for the clarification. Python is not a language I use often (but of course is a language seen often).

I actually see what's going on a bit clearer now, as I looked closer and found that the correct way to write what I was originally trying to express is actually:

   [(x,y) for x in range(10) if x < 6 for y in range(10) if y > 5]

which could be formatted as:

  [
    (x,y)
    for x in range(10)
      if x < 6
        for y in range(10)
          if y > 5
  ]

Which is actually much closer to the functional style's flow, and is correctly eliminating iterations earlier in the loop (which is an important consideration).
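The effect of filtering `x` before the inner loop can be measured with a small counting sketch. The `counted` generator and the `calls` dictionary are hypothetical helpers added only to make the iteration count visible:

```python
calls = {"late": 0, "early": 0}


def counted(label):
    """Yield 0..9 while counting how many values are produced."""
    for value in range(10):
        calls[label] += 1
        yield value


# Both conditions at the end: the inner loop runs for every x.
late = [(x, y) for x in range(10) for y in counted("late") if y > 5 if x < 6]

# x filtered first: the inner loop is skipped entirely when x >= 6.
early = [(x, y) for x in range(10) if x < 6 for y in counted("early") if y > 5]
```

Both forms produce the same pairs, but the second one steps through the inner range only for the six values of `x` that pass the filter.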

I was confused because I had seen examples where multiple if clauses were added to the right side, one per loop level (as I showed), and that makes it look like they are operating on the different levels, when in reality they are working on the innermost loop, like you showed.

I'll retract most of my complaints then. There's still some question in my mind as to how you would usefully mutate items of the loop and use them in other levels of the loop without recomputing them again, but that might just be my unfamiliarity with the construct.
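One common way to reuse a computed value inside a comprehension is to bind it with an extra `for` clause over a one-element list; Python 3.8 and later also allow an assignment expression (`:=`). A hedged sketch, where `expensive` is a hypothetical stand-in for a costly computation:

```python
call_count = 0


def expensive(x):
    """Stand-in for a costly computation (hypothetical); counts its calls."""
    global call_count
    call_count += 1
    return x * x


# Naive: expensive(x) runs again for every item that passes the filter.
naive = [expensive(x) for x in range(5) if expensive(x) > 3]
naive_calls = call_count

# Bind the result once by looping over a one-element list.
call_count = 0
bound = [sq for x in range(5) for sq in [expensive(x)] if sq > 3]
bound_calls = call_count

# Python 3.8+ alternative: an assignment expression in the condition.
call_count = 0
walrus = [sq for x in range(5) if (sq := expensive(x)) > 3]
walrus_calls = call_count
```

The bound name is then available to every clause to its right, including further nested loops.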

int_19h · on Nov 23, 2018

I agree. The most readable syntax for sequence comprehensions that I know of is C# LINQ and XQuery FLWOR, and it's no coincidence that they put the projection clause ("select" in C#, "return" in XQuery) last - it follows the overall flow better.

speedplane · on Nov 22, 2018

Both of the examples you give are far less readable than an equivalent for loop with an if statement. Programmers use comprehensions because there is an understanding that they are more efficient and the compiler/interpreter can better optimize them.

duckerude · on Nov 22, 2018

I use comprehensions because they're self-contained. Instead of a loop with a few statements that introduces new variables into the scope, there's a single expression with fewer loose ends. I think using (non-generator) comprehensions for efficiency is usually the wrong reason.

I find the split comprehension more readable than the equivalent for loop. That's mostly because I'm used to that style, of course, but it's also because it's more constrained. Comprehensions have a very limited grammar, but there are multiple ways to write a for loop that builds a list.

BerislavLopac · on Nov 22, 2018

They seem less readable to you because this is not what you're used to. While I agree that complex, one-line comprehensions, possibly with more comprehensions nested inside, can be quite unreadable even for experienced developers, I find that the given examples, with clearly separated expression-loop-condition structure, tend to be quite easy to read and understand, even for junior developers.

speedplane · on Nov 24, 2018

I’m quite comfortable with list comprehensions, I’ve been writing them for decades. I like them because they are concise, efficient, and more clearly tell the compiler/interpreter their intent.

That said, they are almost always harder to understand than an equivalent for loop with an if statement. A list comprehension by its very nature groups a number of actions into a single expression, it’s harder to break up the parts.

Even an experienced developer who sees them all the time and can understand one in a second, would probably take a half second to understand the equivalent for/if statements.

BerislavLopac · on Nov 25, 2018

It all depends on how you approach them. If you see comprehensions as a different form of a for loop, I would tend to agree; but if instead you see them as a different form of a map+filter combination, they suddenly become much more clear.
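To make the map+filter reading concrete, here is a small sketch using the `x + 1 ... if x > 0` comprehension from earlier in the thread. The sample list `xs` is invented for illustration:

```python
xs = [3, -1, 0, 7, -4]

# Comprehension form: projection first, then source, then condition.
ys_comprehension = [x + 1 for x in xs if x > 0]

# The same thing spelled as filter-then-map with the built-ins.
ys_map_filter = list(map(lambda x: x + 1, filter(lambda x: x > 0, xs)))
```

The `if` clause plays the role of `filter` and the leading expression plays the role of `map`.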

speedplane · on Dec 2, 2018

I agree. If you see comprehensions as map+filter, they make more sense, but it's also exactly that reason that makes them more complex.

List comprehensions are basically a watered down gateway drug to functional programming. Everyone who has ever taken a functional programming course loves it (rightly so), and often tries to find places to use it. Comprehensions do quite a bit in a single expression, it's easy to see that inch towards functional programming.

However, junior developers haven't taken a functional programming course. They learn to program instructions or statements line-by-line. They are told (not entirely accurately) that every clock tick, the processor moves forward one unit at a time. Your mind starts to imagine the processor in that way, executing a line and going on to the next. Do this, then move forward, then do that. This is procedural programming.

A list comprehension doesn't quite fit that model, because it does quite a bit in a single line (generally map + filter, sometimes reduce). They teach you that units of complexity can generally be broken down line-by-line.

Of course list comprehensions can be formatted to multiple lines, but it is intrinsically something quite different. A list comprehension is not a statement (e.g., var foo = b + 5), it's an expression (['b' for x in y if x < 1]) and a pretty complex expression at that.

Junior software developers are taught about statements, going line-by-line. They aren't taught about functional programming or complex expressions. I love python comprehensions, but I wish they were presented in a way that was as easy to understand as a for loop with an if statement.

Senior developers wouldn't care, but it would open up a giant world to junior devs.

int_19h · on Nov 23, 2018

What's the difference between a for-loop with an if-statement in it and the comprehension above? Even syntactically, they are almost perfectly matched on tokens. Except with a comprehension, you immediately know that not only it consumes a sequence, but it also produces one (whereas a loop could really do anything).
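For comparison, a sketch of the two forms side by side; the sample list `xs` and the result names are invented for illustration:

```python
xs = [3, -1, 0, 7, -4]

# Statement form: a loop, a condition, and an append.
ys_loop = []
for x in xs:
    if x > 0:
        ys_loop.append(x + 1)

# Expression form: the same tokens, rearranged, and the result is a list
# by construction rather than by convention.
ys_comp = [
    x + 1
    for x in xs
    if x > 0
]
```

The loop needs an accumulator that the reader has to track, while the comprehension states up front that it builds a new list.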

speedplane · on Dec 2, 2018

See my other answer to a similar question above, but basically for-loops and comprehensions are different due to their "learn-ability" and "read-ability". For-loops are easy for novice programmers to understand and functionality is broken up line-by-line as the processor consumes instructions. Comprehensions are an expression, and do several things in a single line. Novice programmers don't understand them, and I do believe that even for expert developers, they are harder to read than an equivalent for-if loop.

goldfeld · on Nov 22, 2018

To me it's similar to having to break up a long line in the middle of a non-important function call, because parens is where it's at. Feels like hooking into the middle of nowhere-don't-care just because that's the impractical rule Python follows, since indentation is part of syntax.

tragomaskhalos · on Nov 22, 2018

There is another Python feature that saves you from this trap however - the nested function. I have lots of code that looks like this:

    def txform_item(x):
      <maybe many lines, as complex as needed, + can see vars in outer scope>
    # right below, for optimal locality of ref for human reader
    new_list = [ txform_item(item) for item in old_list ]
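A runnable sketch of the same pattern. The outer function `apply_discount` and its `rate` parameter are hypothetical, added only so the nested helper has an outer-scope variable to read:

```python
def apply_discount(old_list, rate):
    """Hypothetical outer function; `rate` is visible to the nested helper."""

    def txform_item(x):
        # As many lines as needed; `rate` comes from the enclosing scope.
        discounted = x * (1 - rate)
        return round(discounted, 2)

    # Right below the helper, so the reader sees both together.
    new_list = [txform_item(item) for item in old_list]
    return new_list


prices = apply_discount([10.0, 20.0], 0.5)
```

The comprehension itself stays on one short line no matter how involved the transformation becomes.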

vaylian · on Nov 22, 2018

I follow the 80 columns rule as well. Do you have an example of such unkempt code that you can share?

int_19h · on Nov 22, 2018

Long lines come most often in conditions of if- and while-blocks, mostly because there's no way to split them over multiple lines that isn't visually hideous.

But a condition can be easily split if it is assigned to a variable, and that variable then tested. And naming said variable well can make a comment explaining the condition redundant.
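A sketch of that approach, with invented data (`user`, `document`) and a hypothetical variable name `can_edit`:

```python
# Hypothetical data, invented for illustration.
user = {"active": True, "role": "editor", "failed_logins": 1}
document = {"locked": False, "owner": "alice"}

# The condition is split inside parentheses and given a descriptive name.
can_edit = (
    user["active"]
    and user["role"] in ("editor", "admin")
    and user["failed_logins"] < 3
    and not document["locked"]
)

# The `if` line itself stays short and reads like a sentence.
if can_edit:
    status = "editing allowed"
else:
    status = "read only"
```

The name `can_edit` carries the explanation that would otherwise go into a comment above a long `if` line.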

goldfeld · on Nov 22, 2018

Yes, that's a great example, especially with deep nesting which forced indentation makes worse. But the solution feels like working against the grain, when a language should be working for human readability, not for workarounds to make a scripting language featureful.

int_19h · on Nov 23, 2018

One would argue that forcing long conditions to be split up, and intermediate steps named, is rather encouraging readability. ~

jononor · on Nov 22, 2018

Yes, prefer a named variable if the logic gets long. If it is really long, consider introducing a function to calculate the predicate.

growtofill · on Nov 22, 2018

Doesn't Python have an autoformatting tool like gofmt or Prettier?

eirki · on Nov 22, 2018

Black (The uncompromising Python code formatter) is all the rage atm.

https://github.com/ambv/black

sametmax · on Nov 22, 2018

Yes. Most of the projects now use black https://github.com/ambv/black

quietbritishjim · on Nov 22, 2018

The problem that Black had last time I saw it on Hacker News is that it only inserts whitespace. Unlike braces-and-semicolon languages, that is sometimes not enough to format code well. For example, given the following line:

    x[1][4] = a[2] + b[4]

Black will format like either of these:

    x[1][
        4] = a[2] + b[4]
    
    x[1][4] = a[
        2] + b[4]

But not like either of these, unless you insert the brackets yourself:

    x[1][4] = (
        a[2] + b[4])
    
    x[1][4] = (a[2] 
        + b[4])

Maybe this sounds like just one little problem but I think it's a fundamental flaw. I have seen the result of a Python formatter (not Black but had the same problem) applied to a couple of files and it's a total mess. I'd take inconsistent column widths over that any day. I asked the creator of black about it and he was pretty dismissive.

Rotareti · on Nov 22, 2018

I use yapf and in such cases backslashes work fine:

    x[1][4] = \
        a[2] + b[4]

dhuramas · on Nov 22, 2018

Yapf- https://github.com/google/yapf

baq · on Nov 22, 2018

there's autopep8 and a few other tools but none are considered idiomatic.