Because Java’s if does not return a value,
you cannot say:
return if (x < 0) "-" else "+";
This would have been a nice place to make an analogy to the question mark operator, since a Java programmer would probably be familiar with it and it allows you to write:
return x < 0 ? "-" : "+";
If you're going to reference other languages, `?` is an operator on its own in Swift. Calling `?:` by the name 'question mark operator' is only naming half of the thing.
> if does not have a return value in a language like Java. In other words, it is not an expression, but a statement. Because everything in Clojure is an expression, there is no equivalent construct to Java’s if in it.
Anyone know the language design rationale behind the way Java does it? It seems much easier to make everything an expression. I've always disliked that part of JavaScript and being forced to use ternary operators to do a single-line return statement for a conditional. Same with case statements.
If your control structures are expressions then they can lead to code that the maintainers of the language might not want to promote. Remember that language design is not just about being as flexible as possible. If it were we would have stopped with Lisp. Languages are also designed with readability, portability, and simplicity in mind.
Imagine if you could write this (nonsense) code in Java:
boolean a = (while (b) { if (c) { b=d;} else { b=e; }});
While it is efficient, some may balk at how “implicit” it is.
Of course I'd say this as a lisper, but as long as the return values of everything are intuitive and well defined, I don't see an inherent problem with such a form.
Indeed, if anything, if the intention is to communicate that the Boolean value is the result of some process that runs through a while loop, then explicitly saying that in the assignment seems to me to almost be the best way to do it :\
ymmv.
Now deeply nested while/assignment/ifs, that's a bad-pattern, but I'm not inherently convinced that's an expression/statement problem.
> Anyone know the language design rationale behind the way Java does it?
Because C did it, because Algol did it, because Fortran did it, because that is how assembly code works. Expressions require code to be compiled to something that uses a temporary value stack (or equivalent, like what some continuation-passing style compilers do by allocating values from garbage-collected memory), which, along with subroutines, was an exotic technology well into the 1960s.
Clojure seems to have replaced the "p"[0] suffix in predicates with "?", the former being an old Lisp convention. Interesting choice, I'm mildly miffed they didn't go with the old convention.
Question mark is a Scheme tradition. Clojure is a new lisp, drawing inspiration from many sources.
One of Rich Hickey's goals, stated in many ways and in many of his presentations, was to design a modern lisp not bound by design decisions in old ones. That's why Clojure is not built on or directly based on any specific lisp.
He has made a big point about moving on from lisp-isms such as `car` and `cdr`, which are based on tradition and specific archaic hardware implementation, to modern and within-language-consistent things like `first` and `rest`, which are both self-descriptive and implemented for anything that conforms to the seq protocol, as opposed to `car` and `cdr`, which are tied to cons cells.
I do not recall anything specific that he's said about the choice of a question mark vs a "p", but I can imagine a similar argument that a question mark is more self-descriptive than "p". "p" only makes sense if you know lisp traditions.
"p" makes sense if you know that it stands for "predicate". It's not nearly as arcane as car and cdr.
Also can we finally settle on a name for these things now? If car and cdr are too arcane we can do away with them (though I like being able to do caddadadr) but why do we need "first" and "rest" rather than the already established "head" and "tail"?
I particularly dislike "rest" since it's a relative term, i.e. in normal speech saying that you're going to do something with "the rest" of the list means something different depending on how much of the list you already used.
In Clojure, first and rest apply to sequences, which are a logical list abstraction. They apply to lists, but they can also be used on sequential views of indexed vectors, maps, sets, database result sets, files in a dir, lines in a file, and an open world of "things that can be seen in some order". head and tail I suspect are much more tied to linked lists and data structure than the more plain first and rest.
I might add that breaking the naming away (from traditions) also helps breaking the mindset away (from "that's how we always did it").
This allows for unified abstract views on things normally distinguished from each other by words, syntax, conventions, and mindset. This allows stuff like pattern matching and generic functional code, like the mighty monad with its transformer.
Learning functional programming, incl. higher-kinded types (dependent types and such), isn't easy for someone used to e.g. Java, C, C++, Python, Go, and others I can't judge off the top of my head.
If you ask yourself why said functional style is beneficial, remember the popularity and typical confusion of the Monad posts/articles. You might not quite understand / "get" it, but they keep coming up and the authors seem intelligent. So you (might) just have a case of the Blub Paradox[0]. You need to understand it to grok its usefulness/desirability.
You can treat any collection or data structure that implements the sequence protocol as a lazy sequence with
(seq collection)
But you don't have to. If you choose to, you're not tied to a representation of mutable cons cells. There are good reasons to want to iterate over values in a map or a set, while still wanting the lookup characteristics of those collections, rather than a linked list.
This is hardly a controversial stance. Python, for example, supports iteration over many of its data structures and it is recommended to implement the dunder methods for iteration on any class that might reasonably be the basis of an iteration.
All clojure does is agree that iteration over a lazy sequence is a valuable concept. It implements this as an abstract protocol, rather than as a concrete implementation on linked lists.
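A minimal Python sketch of the dunder-method convention mentioned above (the Playlist class and its contents are invented for illustration):

    class Playlist:
        def __init__(self, tracks):
            self._tracks = list(tracks)
        def __iter__(self):
            # Implementing __iter__ is all it takes: for-loops, list(), sorted(),
            # etc. now accept a Playlist like any other iterable.
            return iter(self._tracks)

    for track in Playlist(["intro", "verse", "chorus"]):
        print(track)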
> "p" makes sense if you know that it stands for "predicate". It's not nearly as arcane as car and cdr.
We are in agreement here. It seems to me, though, that it is an easy case to make that a question mark makes sense if you know English. Surely this is a convention less dependent upon specific knowledge.
Either way, Clojure is its own language and is not beholden to the tradition of any other language. What is most important is that it chooses a convention and implements it consistently. It is not meant to be Common Lisp nor is it meant to be Scheme, nor Dylan, nor Shen, nor Femtolisp, nor any other implementation. It is meant to be Clojure and what that means is defined by Rich Hickey.
Regarding `first`, `rest`, `head`, and `tail`, I don't think the argument based on English usage is the strongest. `head` and `tail` are established in some other programming languages, almost always as operations upon a linked list. Those two words in English refer to body parts on some animals, hardly an obvious analog to generic collections. `first` and `rest` are not unheard of in other programming languages and in English do have a natural connection with the idea of a collection. Again, Clojure is meant only to be what Rich Hickey designed it to be, not a faithful reproduction of any other programming language.
Although Scheme's question mark for predicates has had influence in other languages -- besides Clojure, it shows up in Ruby, for example. And for head/tail versus first/rest, I don't really mind either one, but head/tail is pretty strange if you haven't seen it already -- it makes the programmer imagine that their data is some sort of animal that needs to be beheaded (or is it supposed to be the head and tail of a coin?).
I don't like this since I prefer using adjectives for conversions/casts, i.e. I'd like (int x) to mean "x considered as an integer". If you mix the two conventions (like you seem to suggest with your last paragraph) then it just becomes ambiguous.
The trouble is that `if(value)` or `value && ..` ends up getting used as an idiomatic shorthand for "if value is present" even in languages (such as javascript and python) where 0 is falsey. Because most of the time it works, so people do it. And then inevitably get bugs when the value happens to be 0.
This can get increasingly hard to reason about the more datatypes you add which interpret 0-ish values as falsey -- as there's always the temptation to do so, for 'consistency' with integer truthiness, e.g. https://lwn.net/Articles/590299/ about the Python behaviour of time values at midnight being falsey.
If `null` and `false` are the only falsey values (ala clojure, ruby, elixir etc), that's a rule that's really easy to internalise and reason about, so you never have to worry about things like whether your data type might be falsey at midnight (and also `if(values)` is actually a correct idiom).
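For concreteness, a small Python sketch of that failure mode (function and variable names invented):

    def describe(count=None):
        if count:                      # intended as "if a count was provided"
            return "got %d items" % count
        return "no count given"

    print(describe())     # "no count given" -- fine
    print(describe(5))    # "got 5 items"    -- fine
    print(describe(0))    # "no count given" -- bug: 0 is a real value, but falsey

    def describe_explicit(count=None):
        if count is not None:          # the explicit presence check avoids the trap
            return "got %d items" % count
        return "no count given"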
Here's another prime Python example. The response object in the requests library has a truthiness equivalent to request success. Can easily trip you up with `res and res.content`.
From the Python (2.x) docs, the values considered false are: "None, False,
* zero of any numeric type, for example, 0, 0L, 0.0, 0j.
* any empty sequence, for example, '', (), [].
* any empty mapping, for example, {}.
* instances of user-defined classes, if the class defines a __nonzero__() or __len__() method, when that method returns the integer zero or bool value False."
This really does not strike me as being simple.
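For what it's worth, here is roughly what that last rule looks like in practice; this is a Python 3 sketch (where __nonzero__ has been renamed __bool__), and the Basket class is invented:

    class Basket:
        def __init__(self, items):
            self.items = list(items)
        def __len__(self):
            # With no __bool__ defined, truthiness falls back to __len__.
            return len(self.items)

    print(bool(Basket([])))        # False -- an empty basket is falsey
    print(bool(Basket(["egg"])))   # True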
I personally don't find these shortcuts to be worthwhile. The best approach is that boolean is a separate type, true is true, false is false, and any attempt to use any other type as the predicate of a control flow statement is a (preferably compile-time) error. Writing `if x != 0` is not a large burden.
> I personally don't find these shortcuts to be worthwhile
They are, because they are the difference between driving with the parking brake on or not. Other languages don't have this, so that's why people might not see the value in it at first.
> Writing `if x != 0` is not a large burden
If you're comparing what can only be an int value, then it's not a burden.
When your variable can assume multiple values, explicitly comparing it with multiple possibilities is a burden:
Examples:
- Your variable is optional and/or is a sequence that might be empty
- You're using .get() in a dictionary
One of the signs someone is unfamiliar with Python is doing multiple comparisons when only `if x:` would suffice
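For example, a small sketch (the dict and key are invented):

    config = {}                   # imagine parsed user settings
    tags = config.get("tags")     # None here; could just as well be [] if present but empty
    if tags:                      # one idiomatic check covers "missing" and "empty"
        print("first tag:", tags[0])
    else:
        print("no tags configured")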
I've worked in a lot of different languages, from ones that work the way I prefer all the way to Python-style languages where anything is a predicate and there are a bunch of semi-arbitrary rules about what qualifies as "false." This isn't a lack of familiarity talking.
Regarding your examples, having an optional sequence is probably not the correct move anyway. Is there actually a semantic difference between no sequence and an empty sequence? If not, the variable should be non-optional. If so, then glossing over those differences by writing `if x` to implicitly check both conditions is unclear. Not sure what you're referring to with get() in a dictionary.
I understand that good Python style is considered to be one where you take advantage of the language's notion of truthiness, but I disagree with its whole approach.
There are Python idioms for exactness, and idioms for truthiness. When you're in a situation where testing truthiness would be ambiguous, 'if x is None:' or 'if x is True:' is completely unambiguous.
Most code doesn't need that and the truthiness nature of various data types is helpful in writing clear code.
> Is there actually a semantic difference between no sequence and an empty sequence?
It really depends on your function, but you might want to have the function behave differently if given an empty list rather than no list. One very common case for doing that is when you want to facilitate unit testing of a function.
> Not sure what you're referring to with get() in a dictionary.
A very common case, in which you don't care if the element is not present or it is a "False" element.
Behaving differently on an empty list than on no list is a huge smell. That's really weird and unexpected behavior. Doing it for testing is even weirder.
If you're fetching an element from a dictionary and want not-present to be the same as some false value, fetch with a default value (in Python, pass a second parameter to get()) that you want to see when there's nothing present. Or just extract the value and check for nil or empty separately, nothing wrong with being explicit.
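A sketch of that explicit style (dict and key invented):

    settings = {}
    retries = settings.get("retries", 0)   # missing key -> 0, no truthiness test needed
    print(retries)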
> Python's rule is simple, it's JS that came up with confusing rules.
I think Perl was there first: in addition to 0 and "" being falsy in Perl, "0" is also falsy. Then there is this nifty Perl value "0 but true": if you stick it into a condition, it will evaluate to true, but if you do math with it, it behaves like the number zero.
Suppose a fantasy language has an ergonomic syntax to check existence. It returns a boolean. Additionally, implicit conversions to boolean for objects in the language are randomized for each runtime.
Isn't such a weirdo language still preferable to implicit conversions, even when compared to a language where "null" and "false" are the only falsey values? Because even in those languages the new user must internalize the simple rule and learn from context how it works and why it can be used with impunity. Whereas with my weirdo language the syntax explicitly conveys to all classes of user what is happening in the code.
Furthermore, I'd bet that even in your preferred languages there are less readable idioms that ninjas can leverage in the implicit conversions. In my weirdo language the randomized conversions would thwart the ninjas and keep the entire class of users free from their unreadable tyranny.
Due to exactly this issue, an increasing number of languages have an operator just for "if exists and not null": the null coalescing operator[0].
What "exists" means depends on the language; it can be as loose as checking if the variable has ever been set (PHP), but generally has nothing to do with truthiness/falsiness status.
I'm not sure 0 being falsy is ever a good idea, it's just an assembly-ism that became a C-ism and spread from there. Zero is not a "special enough" value in the ring of integers. In general, these days I prefer as few implicit coercions as possible.
In many Lisps, nil is the empty list literal, but interestingly not in Clojure, where it just maps to JVM null, and vectors are the most commonly used data type anyway. However, it is idiomatic for APIs to accept (but not return!) nil as if it were an empty sequence or collection, unlike in most non-Lisp languages.
It's not just an assemblyism, the near equivalence between 0 and false, and 1 and true is the thing that Boole wrote about for which Boolean values are named. I would argue that it's a math thing, not an assembly thing.
That’s true enough, but in programming languages it’s usually zero=false and any non-zero=true, at least in argument position. That’s more difficult to justify mathematically.
0 + 0 = 0, 0 + [1] = [1], [1] + [1] = [1] (so addition behaves like OR)
0 * 0 = 0, 0 * [1] = 0, [1] * [1] = [1] (and multiplication behaves like AND)
where [1] stands for "any nonzero number". (please let me know if i missed anything!)
so natural numbers (quotiented by an equivalence relation) seem to work as a model for boolean arithmetic. and idk if we need more mathematical justification than "it works fine"
EDIT: changed "integers" to "natural numbers", because over the integers two nonzero numbers can sum to zero (e.g. 1 + (-1) = 0), which would break the OR reading.
It's a good idea in languages that are both a: statically typed (so a variable that can contain 0 is not also able to contain False) and b: weakly typed (so integers can be implicitly treated as boolean at all). Whether it's a good idea for a language to have both those properties is debatable, but if "if(statically_a_int)" is valid at all, "is zero" is one of the only sensible interpretations ("is negative" being the other, but that doesn't generalize to unsigned types well).
But the fact that some types have a reasonable default value doesn’t make the values intrinsically equivalent to Boolean false. Plus, implicit default initialization is itself not necessarily a desirable feature. Especially not the kind that just default initializes to an all-zeroes bit pattern.
default initialization is a desirable feature in most circumstances, and I would argue most languages should be opt-out rather than opt-in.
There are really only two reasons I can think of for not wanting default initialization.
1. performance
2. to catch mistakes on the edges of your system.
A good example of the second is converting incoming data to enumerations. If you're not explicitly checking for it, you can accidentally default initialize to a valid enumeration. This is why I always set the first enumeration to 1 for any language that allows me to. If someone makes that mistake, the missing data causes an error rather than silently doing the wrong thing.
But I can't really think of any reason outside of those two cases where you wouldn't want default initialization.
And my point was that one of the reasons 0 is chosen as false because it's consistent with respect to default initialization of booleans. I realize that isn't the original reason, but today it's a good reason for having the behavior.
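A Python sketch of the enum point above (names invented): starting the members at 1 means a zero coming from missing or default-initialized data fails loudly instead of mapping to a valid member.

    from enum import Enum

    class Status(Enum):
        ACTIVE = 1      # deliberately no member with value 0
        INACTIVE = 2

    incoming = 0        # e.g. a field the upstream system never set
    try:
        status = Status(incoming)
    except ValueError:
        print("missing/uninitialized status, refusing to guess")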
0 and [] are real and meaningful values that have different meanings than nil/null (in what I assume you're referring to as Python) or (in JavaScript) undefined. They should be treated as such.
(That JS coerces the empty string to false is also bad.)
There are both null and undefined in JavaScript. Additionally, an empty array coerces to true because it's an object and objects are always truthy.
Personally I'd prefer that not even null/undefined coerced to false, or could otherwise be used in a boolean comparison, as for me they still signify different meanings and even in type-safe languages can lead to subtle bugs. I really dislike coercion, having principally worked with JavaScript/TypeScript.
All of the comments here seem to be addressing the issue of 0 being falsy, and I agree with them - 0 is a perfectly 'useful' value that is no different from 1 in many use cases. But no one seems to be addressing the advantage of the empty list/array/collection being treated as falsy, which in my opinion is significant.
The major advantage of empty collections being falsy is that you can kill two birds with one stone - it enables you to reduce your use of nil/null/None dramatically, since empty collections will fail a very idiomatic "if items" check just as well as null. But at the same time, you don't have to depend on users (library or otherwise) being careful to provide an empty list rather than null when they mean 'no value' - if they pass a null then your code will take the exact same shortcut with the exact same simple, idiomatic test. Whereas the alternative is needing to do both the null check AND the empty check any place you cannot trust the users of your function/library/whatever to pass real empty collections rather than nulls, which not only becomes tedious but is quite easy for new programmers to fail to do consistently.
In my own experience, the number of cases where an empty collection implies that data is effectively 'missing' is far greater than the number of cases where an empty list being provided would be semantically different from a null - in both cases "there is nothing there". And being freed to then create and pass around empty lists means that my own code is no longer littered with the necessary guards against Nullness in cases where I am the consumer of my own functions and want to be able to write simple list comprehensions without any guards.
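A sketch of the "two birds" point in Python (function name invented): one idiomatic check stands in for both the null check and the empty check.

    def summarize(items):
        if not items:                  # handles a caller passing None *or* []
            return "nothing to report"
        return "%d item(s)" % len(items)

    print(summarize(None))    # "nothing to report"
    print(summarize([]))      # "nothing to report"
    print(summarize([1, 2]))  # "2 item(s)"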
In Clojure this is somewhat less of an issue because, as others have mentioned, it's idiomatic to pass `nil` around in the place of an empty collection, and so it's _generally_ safe to do so where in another language like Python or Javascript the function you're calling might not expect it. But in my opinion there was no real advantage to establishing this convention when it could just as easily have been the other way around, and there would therefore be somewhat less opportunity for inconsistent or inexperienced programmers causing NullPointerExceptions.
0 does not indicate absence of value. Think of temperature - 0 is a valid temperature and is actually not equal to 0 when converted to Fahrenheit. It becomes incorrect to check presence of temperature by just relying on Boolean coercion, you have to check for not-None(in python) explicitly.
Generally the more special values, that are treated as false, you have in a language - the more complex and confusing your programs become. Just my opinion.
Nitpick: an actual zero (i.e., in Kelvin or Rankine) is not a valid temperature but a lower bound that temperature can't actually reach (like -infinity for real numbers).
0 being falsy makes no sense outside of the C family of programming languages. But there 0 is overloaded both as a number and pointer (it's a bit more complicated though, as the value of the null pointer doesn't need to be zero but assigning 0 to a pointer will assign the null pointer).
In languages with more elaborate type systems, you usually have the pointer and number representation separated.
0 has no special meaning in regards to boolean logic but it has the special meaning of the identity/zero element in an additive group. Likewise, it makes sense to have "[1] + [] => [1]" in your programming language.
But making zero elements evaluate to falsy is problematic. "" would be falsy as well then.
A bare truthiness check can be asking several different questions:
* Does it have a defined value (true) or not (false)?
* Is it non-zero (true) or not non-zero (false)?
* Does it have non-zero length (true) or not non-zero length (false)?
I think using zero as a magic number is less consistent than not using any magic numbers at all. If you want to check the length, or whether something equals zero, just check for that, not for whether it's "truthy".
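In Python terms, each of those questions has its own explicit spelling, e.g. (sample values invented):

    x, xs = 0, []
    if x is not None:      # does it have a defined value?
        print("x is defined")
    if x != 0:             # is it non-zero?
        print("x is non-zero")
    if len(xs) > 0:        # does it have non-zero length?
        print("xs is non-empty")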
Yeah, I don't get it. Your solution is better in nearly every way. The if statement should be doing a type check, not a value check. And it is redundant since the math will do it.
def my_calc(x, scale = None):
    if not scale:
        scale = 1
    return (x + 4) * scale
where the (x + 4) bit stands for some interesting calculation whose result is being scaled: this code makes it unclear whether or not scale == 0 is intentionally conflated with scale == None or whether this is a programming error. It would be better to do this, if this is intentional:
def my_calc(x, scale = None):
    if scale == 0 or scale is None:
        scale = 1
    return (x + 4) * scale
Or, if it's unintentional, one should have written:
def my_calc(x, scale = None):
    if scale is None:
        scale = 1
    return (x + 4) * scale
Generally speaking we're creating contrived examples here - in all fairness, I can only remember seeing this once in the last 5 years of writing business logic in Python. Usually language purists would prefer us to be explicit in our languages. And type safety is the best thing since sliced bread.
But effectively that's why Python has taken off. Types do matter, but for most things they don't. Lots of good software has been written using Python.
I'm not sure Python would do that today if it weren't for backwards compatibility reasons. Python's True and False are late additions to the language; before that, 0 and 1 served their functions.
I think this is one of the symptoms of the changing priority of simplicity in Python language design: at some point the 0/1 design decision, originally made for simplicity, became dated enough to change despite it being a big change and requiring backwards compatibility hacks to keep 0/1 working.
For what it's worth, I am generally in agreement with you, so you're not alone.
I do think it would've been better to have 0, specifically, not be falsy - unlike the other cases, 0 is not 'the absence of any actual data' but rather a special case of the data. I assume Python stuck with tradition on this one mostly because it is so tightly tied to C and UNIX-style scripting that changing it would've been confusing for most early users.
Do assembly languages even have a concept of truthy/falsy? The ones I'm familiar with always require an explicit predicate, such as "jump if zero" or "jump if not zero."
I'm also used to 0 being false. A possible upside to this is that you can use multiplication and addition as boolean AND and OR, which might not seem that useful, but I remember that BASIC on a ti99/4a had neither AND nor OR, but boolean expressions evaluated to 0 or 1, so you could use this trick.
Instead of
if x = 5 and y = 7 then goto 10
use
if (x = 5) * (y = 7) then goto 10
Good to know when you're 11 years old typing in some program from a magazine meant for another computer that had AND and OR.
It sounds like that was only useful because the language actually had no Boolean type (hence no operations for Boolean types). It doesn't really demonstrate that there is value to a "falsey" 0 in a language that already has a perfectly good value for false.