Lisps are like tiling window managers... Some have a taste for them. Most don't.
Unfortunately some of Lisp's cool features are hard to implement without its simple syntax. Maybe the lesson is that we should find a way to have syntax-independent programming languages! Just as we expect any successful language to have at least a couple of implementations before being taken seriously, maybe we should expect a language to have at least a couple of alternative syntaxes and instant, perfect translation between them (with comments and code style preserved), so I can use syntaxA and work on the same codebase with my colleague using syntaxB. Then we'll be sure any metaprogramming or code intelligence tools don't suck, because they'll be damn well forced to work not on the program text but on the AST (which will have to be standardized), as they should...
Lisps wouldn't even seem so cool anymore if we could get our act together and build languages with (a) standardized AST representations and (b) multiple syntaxes... we've already imported most Lisp features into modern languages anyway, and standardized ASTs would make macros both trivial and manageable...
Imagine something like Go and APL compiling down to the same AST, and imagine two programmers working on the same project with different syntaxes, as you describe. The Go user one day finds a function called "life" that contains several hundred lines of code, which is clearly bad style and should be broken up, so he goes to talk to his APL-using coworker who wrote the function. His coworker seems confused. "It doesn't seem too long to me," he says as he shows the APL function on his screen:
    life←{↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵}
I think what you proposed is a good idea, but it's just an idea; we'd need some new innovation to make it work.
Your thought experiment shows how hard it would be. If we could do that, we could translate machine code into readable high-level programs.
To put it the other way around, it reveals something about how choice of programming language affects not only how we write a program, but what we write, and even how we think about the problem.
Rich Hickey had a great comment a long time ago on the Software Engineering Radio podcast [0] that I think speaks to this. He was asked why he didn't just make the ideas in Clojure a library for Java.
Interviewer: "Wouldn't it have been possible or simpler to add this using some kind of library or framework to an existing language? In other words, why does this kind of stuff need language support as opposed to just a library?"
Rich: "In fact it is a library. It's a library written in Java that you can use from Java. The whole thing about a language is, a language is about what does it make idiomatic and easy. So for instance, you can use the same precisely the same reference types, and the STM, and the data structures of Clojure, all from Java... The lack of idioms and language support means using exactly the same constructs, the same underlining code, from Java, is extremely painful compared to Clojure where it is the natural idiom."
I thought about it some more, and I think it would go the other way too. APL has a lot of very terse array operations, but in Go you might find a mix of loops and if-statements. Translating those arbitrary combinations of loops and if-statements into APL might be very, very ugly, or even impossible.
Consider assembly or BASIC, which use a lot of goto statements. It can get very ugly trying to fit arbitrary gotos into the more structured loops and if-statements we're accustomed to.
Exactly. There's a kind of entropic arrow from higher-level to lower. Moreover the different high-level languages occupy different places in that same space. It isn't just that you can't easily translate between them. It's that they lead to different classes of system being created in the first place.
What should be possible, though, would be to distinguish a surface syntax from the underlying code: e.g. a language that can transparently convert to/from C-style curly-brace blocks, Pascal-style begin/end, and s-expressions. This could possibly be implemented as a bunch of git hooks that transform to a normalized representation (as, e.g., prettier does for JS) on push and to the programmer's preferred representation on pull.
I suspect you'd find that what look like surface differences actually go deeper into the meanings of programs, like a plant whose roots go deeper than one would expect.
There's a special level of Hell that involves streaming XML processing and COBOL. Since COBOL lacks dynamic memory allocation (speaking of COBOL 85 here) and you don't generally build things like linked lists or trees, there's a limited number of things you can conveniently do with streaming XML data. Many COBOL programmers are left with a feeling of despair, I wager.
The Sapir-Whorf hypothesis is certainly real in programming languages[1].
> Consider assembly or BASIC, which use a lot of goto statements. It can get very ugly trying to fit arbitrary gotos into the more structured loops and if-statements we're accustomed to.
There was a lot of work in the '70s around the equivalence of control flow between goto/conditional-branch/unconditional-branch programs and "while programs", which showed that there's actually a limited variety of control flow patterns and that while programs can be made equivalent (the Böhm-Jacopini structured program theorem is the classic result here). I think this is why break/continue loop constructs are present in most languages nowadays. Maybe it's a folk theorem; I don't really recall.
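The construction behind that result is simple enough to sketch: make the program counter an explicit value and dispatch on it inside a single loop. A minimal sketch in OCaml (the labels and the little countdown program are made up for illustration):

    (* Arbitrary goto-style control flow encoded as one while loop
       over an explicit program counter -- the classic construction. *)
    type label = L0 | L1 | L2 | Halt

    let run () =
      let pc = ref L0 in
      let x = ref 10 in
      while !pc <> Halt do
        match !pc with
        | L0 -> pc := L1                                 (* entry: goto L1 *)
        | L1 -> if !x = 0 then pc := Halt else pc := L2  (* conditional branch *)
        | L2 -> x := !x - 1; pc := L1                    (* unconditional jump *)
        | Halt -> ()
      done;
      !x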
I think that with a sufficiently advanced compiler, pretty much any page of Go code could compile down to a few lines of APL, albeit in a different style. I have no source for this view, though I can't imagine anything in Go being more terse than its APL equivalent.
That sounds like an imaginary non-problem. In such an extreme case, the APL developer would just have to put up with that line being broken into 10 lines, maybe, and configure his editor accordingly (say, to move through text as if there were no newlines, or under whatever other condition - if you use an "exceptional" syntax waaay different from all the others, it's your problem to hack your Emacs/Atom/whatever to handle it for your needs!). And btw, you could also have an autoformatter like gofmt that will break down that line of APL according to a rule like "should not exceed the maximum line length in any of the syntaxes", whether you like it or not - that might even avoid having the conversation altogether.
Even if one of the syntaxes is not text based, maybe even a "syntax that involves arranging blocks in 3D space with a VR headset" or whatever, people could still agree on sensible defaults.
The only issue I'd see might be with open source projects, where you could add a guideline like "please reformat your code so that it looks good in syntax X according to style guide X-42, or your pull request will not be accepted". Yeah, the conversations might become less democratic, with some "style dictator" needing to push "the right rules" down every contributor's throat to prevent endless bikeshedding, but it could work.
If we'd give up some of this crap "style democracy" and accept a few more authoritarian rules, we could instead enjoy multiple syntaxes easily. The ol' "more freedom under tyranny" paradox that people always seem to (like to) forget actually works, and it could work well for code too.
> Maybe the lesson is that we should find a way to have syntax-independent programming languages! Just as we expect any successful language to have at least a couple of implementations before being taken seriously, maybe we should expect a language to have at least a couple of alternative syntaxes and instant, perfect translation between them (with comments and code style preserved), so I can use syntaxA and work on the same codebase with my colleague using syntaxB.
Microsoft tried something very similar with C# vs VB.NET (in the early days of .NET there were even more experiments, like J#). There exist tools to convert quite seamlessly between C# and VB.NET code.
The result: except for some geeks, people decided that they prefer the C# syntax, and it seems to me that Microsoft is thus slowly trying to phase out VB.NET.
So it rather seems to me that people don't like it when there are multiple coexisting syntaxes, because even if good converters are available, the mere existence of multiple syntaxes is an inconvenience that wastes at least some productivity.
Well, the "right" way to do it would be to make the textual representation of the syntax a view-layer concern rather than the format in which the code is serialized. For example, you could store everything as s-expressions or some binary representation of an AST, and then have a lens that pretty-prints the code in the programmer's preferred syntax and reverses the programmer's edits back into the underlying serialization format.
I suspect this would work best for an image-based Smalltalk-like system, where the "source of truth" would be the version of the code persisted in the image, and the textual representation would be generated from that on demand.
A really interesting extension of this is to use Lisp-style symbols instead of strings to represent variables, which would make it possible to localize the functions and keywords of your language so that developers could collaborate, each in his/her preferred natural language.
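A toy sketch of the view-layer idea above, in OCaml (all names hypothetical): one AST type is the source of truth, and each surface syntax is just a pretty-printer over it. The hard direction, parsing the programmer's edits back into the tree, is omitted:

    (* One AST, two "lenses": a C-style view and an s-expression view. *)
    type expr =
      | Num of int
      | Var of string
      | Add of expr * expr
      | Let of string * expr * expr

    let rec to_c = function
      | Num n -> string_of_int n
      | Var x -> x
      | Add (a, b) -> Printf.sprintf "(%s + %s)" (to_c a) (to_c b)
      | Let (x, v, body) ->
          Printf.sprintf "{ int %s = %s; %s }" x (to_c v) (to_c body)

    let rec to_sexp = function
      | Num n -> string_of_int n
      | Var x -> x
      | Add (a, b) -> Printf.sprintf "(+ %s %s)" (to_sexp a) (to_sexp b)
      | Let (x, v, body) ->
          Printf.sprintf "(let ((%s %s)) %s)" x (to_sexp v) (to_sexp body)

    let () =
      let ast = Let ("x", Num 1, Add (Var "x", Num 2)) in
      print_endline (to_c ast);   (* { int x = 1; (x + 2) } *)
      print_endline (to_sexp ast) (* (let ((x 1)) (+ x 2)) *)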
> The result: except for some geeks, people decided that they prefer the C# syntax, and it seems to me that Microsoft is thus slowly trying to phase out VB.NET.
Not really, VB.NET enjoys much more MS love than F#.
Just check the tooling and documentation at MSDN, or the set of languages supported for UWP development.
> > The result: except for some geeks, people decided that they prefer the C# syntax, and it seems to me that Microsoft is thus slowly trying to phase out VB.NET.
> Not really, VB.NET enjoys much more MS love than F#.
This is surely true, but in my opinion F# differs from C#/VB.NET by more than "just being, for the most part, a different representation of the same code". So I would consider F# an independent programming language - which indeed seems to have fallen out of favor with MS.
> we've already imported most Lisp features into modern languages anyway
It's one thing to implement all kinds of features and another thing to integrate them well. You can bolt wings onto an existing ship, but it won't fly well.
> we've already imported most Lisp features into modern languages anyway
This makes it sound as if there are no modern lisps.
What's the big deal about using a lispy language? Do other languages really need to desperately try to reimplement features that come more naturally to lispy languages? Are parentheses (only the round ones) really that much of a deterrent, enough to motivate excessive work like that?
It seems to me that very few companies use Lisp-like languages in production. Even Haskell seems more common.
The big deal with using lispy languages is that companies don't want to use them -- it's a lot easier to hire developers who know other languages. So I'm stuck with Java whether I like it or not, and I appreciate any functional programming features that can be packed into new Java releases, even if they end up being a bit clunky.
This doesn't really seem to be a unique feature of "lispy" languages, because the overwhelming majority of languages do not have wide adoption in the market.
The reason why adoption concentrates on a few languages is mostly network effects rather than objective features of the language itself.
Maybe it's time for us to think about how to increase the adoption of languages, rather than bolting every feature onto Java just because there is a lot of Java code and a lot of Java devs. Otherwise we're going to be stuck in a never-ending mess of complication and legacy.
Yeah, Lisp actually used to benefit from network effects between the 60s and the 80s, before the AI winter struck and Lisp was sacrificed as a scapegoat by the industry (MCC and NASA JPL are probably the most well-publicized cases).
I think OCaml fits your description. ReasonML is an alternative syntax, and it is in fact implemented as simply a parser that generates the same AST as the OCaml parser, plugged into the rest of the OCaml compiler. This is particularly easy in OCaml because syntax extensions are implemented as functions of type AST -> AST, so that type is well documented and fairly stable.
I don't know of any way to translate between the two seamlessly; the translation itself would be pretty easy but preserving whitespace would be tricky.
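For a flavor of how close the two surfaces are, here's the same function in both syntaxes (both parse to the identical OCaml AST):

    (* OCaml syntax: *)
    let greet name = "Hello, " ^ name

    /* ReasonML syntax -- same AST, different surface: */
    let greet = name => "Hello, " ++ name;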
Yeah, I've always wanted to find the time to give OCaml another try after reading about ReasonML...
I'm kind of put off after having tried Haskell and now seeing OCaml as an "inferior Haskell", and after looking at F# and seeing it as a "better-integrated OCaml". Also, folks in ML and data science, the area I care most about now, seem to like F# better; there's not much mention of OCaml there.
I'd strongly object to calling OCaml an "inferior Haskell". OCaml has a much faster compiler, a great type system (sure, missing some of Haskell's more advanced type features, but adding modules/polymorphic variants), and it's strict.
Not my words, just the "general feel" that some people are projecting.
My experience was just that Haskell felt so elegant and conceptually simple... up until a point (like when you start needing to use lenses or think about monad combinators). Even the syntax was so nice and readable (I love the special monad syntax, the `when` function, etc.).
OCaml's designers seem not to have been very fond of the concepts of simplicity, simplification, reduction, elegance, improving by removing stuff, etc. - only of correctness/soundness. Sounds very... French :) Not a bad thing (I live in France now and like it). I just don't like it in code. I'm more of a "let's simplify shit more and more until maybe the problem goes away, completely, by itself, instead of actually starting to implement a solution from the get-go" person.
Oh, and yeah... I don't think I'd ever write a production system in a lazy language; I agree strictness is a feature. (Maybe I'm biased, but I just like being able to reason exactly about when some code runs, whether it has already run by the time a breakpoint is reached, and all that old-school stuff.)
I see. I think I have a visceral reaction whenever people express sentiments like that, partly because Haskell's marketing/branding has worked too well. IMO, Haskell's insistence on purity (which has been an excellent choice advertising-wise) has harmed other functional languages. "If you're gonna learn an FP language, why bother with a half-baked impure one?" is a sentiment I've seen thrown around multiple times.
This is somebody else's point that I read somewhere, but there's also a cognitive dissonance between 1. "Haskell is pure, so it's very easy to reason about anything" and 2. "imperative programming is just as easy in Haskell as in any other language if you use the do notation".
Could you give an example of OCaml's designers removing too much stuff? Are you talking about the lackluster stdlib here haha?
As for laziness, I agree. I like lazy semantics (I'd argue they might almost be strictly superior to strict semantics), but laziness sacrifices too many things. Being able to reason about resource/time composability, having stack traces, having a debugger, etc. is amazing.
To be honest, what gave me a "bitter taste" after looking at OCaml was not the lack of purity or the "which stdlib?" issue, but the lack of any kind of usable ad-hoc polymorphism.
I mean, having to write + or +. depending on the operand type instead of having something like a typeclass for nice polymorphic operators? It feels like an annoying straitjacket and syntactic noise. And I don't understand the tradeoff: since I see types a lot like "compiler-checked documentation that can also work as basic tests", I see no value in type inference for functions - I'd rather have only local type inference for variables, specify all function types manually (this helps you think better anyway), and instead be free to binge on polymorphic operators and functions/methods.
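For anyone who hasn't hit this, a two-line illustration of the complaint (this is stock OCaml behavior, not an assumption):

    let int_sum = 1 + 2          (* + is defined only on int *)
    let float_sum = 1.0 +. 2.0   (* floats need the separate +. operator *)
    (* let oops = 1.0 + 2.0 *)   (* rejected: + expects int, not float *)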
Haskell almost seemed to deliver that with its typeclasses. But then its laziness and monads and weird handling of records make it too weird for practical use, for me.
Is there an "OCaml with typeclasses or other form of polymorphism"? :)
As for the tradeoff, my view on types is a bit different from yours, I think. First, a quick note:
> specify all function types manually
This is usually encouraged in OCaml, in the form of a .mli file.
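For example, a trivial sketch (file names made up):

    (* area.mli -- the interface: every exported function gets an explicit type *)
    val area : width:int -> height:int -> int

    (* area.ml -- the implementation, which the compiler checks against the .mli *)
    let area ~width ~height = width * height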
I think types can be a lot more than "compiler-checked documentation that can also work as basic tests". They let you encode invariants, so the compiler can check much more than basic stuff. I thought this video had a good example: https://www.youtube.com/watch?v=-J8YyfrSwTk&t=20m10s
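One tiny, standard example of that invariant-encoding idea (mine, not from the video):

    (* A list type that cannot be empty, so taking its head
       needs no runtime check and can never fail. *)
    type 'a non_empty = Cons of 'a * 'a list

    let head (Cons (x, _)) = x
    let to_list (Cons (x, xs)) = x :: xs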
As Haskell shows, typeclasses are not ruled out by type inference. There is a reason they can't be added to OCaml, though, and it is functors. OCaml's functors are fundamentally incompatible with typeclasses.
I think one of the problems with getting new people into OCaml is that you feel the pain of not having typeclasses long before you understand the power of functors.
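For readers who haven't met them: a functor is a function at the module level, and the friction with typeclasses comes from the fact that you can instantiate a functor several times at the same type, whereas typeclass resolution wants one canonical instance per type. A minimal sketch (names made up; the pattern is the stdlib's Set.Make):

    module type ORDERED = sig
      type t
      val compare : t -> t -> int
    end

    (* A functor: a module parameterized by another module. *)
    module MakeMax (O : ORDERED) = struct
      let max a b = if O.compare a b >= 0 then a else b
    end

    (* Two instantiations at the same type with different behavior --
       exactly what typeclass coherence forbids. *)
    module Asc  = MakeMax (struct type t = int let compare = compare end)
    module Desc = MakeMax (struct type t = int let compare a b = compare b a end)

    let () = Printf.printf "%d %d\n" (Asc.max 3 5) (Desc.max 3 5)  (* prints: 5 3 *)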
The real differences between programming languages start cropping up in type systems, order of initialization in object hierarchies, whether operator overloading is permitted or not, whether operator overloading happens by global operator resolution, instance methods, per-type methods, or some mix of them all; you get the idea.
All the mechanics of how values are created, interact, and are torn down are described by types, whether the types are dynamic or static. It's the type systems that are the limit to language interoperability, not syntax.
I could go further: it is object systems that are the biggest problem. Program with inert data structures and pure functions of input to output, and interoperability is much more easily achieved. But if you expect to create an object whose behaviour is defined according to language A, and poke it into a function that interacts with it according to the idioms of language B, effectively smuggling A's semantics in via the polymorphic type, somebody is going to be surprised.
And semantics are what people care about, too. Syntax does have its bikeshedders, but semantics are what make people move or stay.
> Syntax does have its bikeshedders, but semantics are what make people move or stay.
There are at least two big exceptions to this rule:
- 1. Lisps - some people (some of them really smart, btw) just can't stand lispy syntax (I'd say it's most likely because they have the math/physics notations so deeply ingrained in their minds that they can't tolerate breaking away from "thinking in them")
- 2. Operator-overloading-heavy syntaxes - when writing any kind of sciency code, you'll always have a camp of people who want to use fifty 2-letter operators instead of functions everywhere, and a camp of people who want "zero tolerance for operator overloading". Both have their reasons, and there will be no way to "make peace" -- in a multi-syntax setting you'd just have one syntax that shows the `plus` function/method being used and another that renders it as the `+` operator.
I'm working (slowly) on a fairly flexible standard syntax and syntax tree. It's not going to be as simple as S-expressions or JSON, though. I need five kinds of lists to get a reasonable mainstream-ish syntax, and this seems a bit unwieldy if you're just using it for data.
I'm not sure what a standardized AST would really be. A standardized concrete syntax tree is doable, but each language is going to have its own statements and expressions (equivalent to special forms in Lisp). This is similar to how we can use JSON objects to represent all sorts of different types as key-value sequences with different sets of keys.
For an evolving language, the AST needs to change from one version to the next. When you add a new kind of statement or expression, the tree-walkers in the tools usually need to adapt. The tools aren't stable unless the language is stable.
- Blocks (curly brackets, with items separated by newlines or semicolons):

    { a b; c d }

- Dotted lists, where sometimes the dots can be omitted:

    foo.bar.baz
In combination, you can write something like:
    for x in [1, 2, 3] {
      foo.bar(x + 1, x * 2)
    }
This can be interpreted as 7 lists. The top-level list has five items, the last two of which are themselves lists. The last list is a block containing one item, which is a three-item dotted list. The argument list after "bar" is a two-item comma list containing two phrases.
I've implemented this, but I'm not satisfied with it; the corner cases are tricky to understand.
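If I'm reading the description right, the tree shape is roughly something like this (my guess, purely hypothetical, written as an OCaml type):

    type node =
      | Atom of string          (* identifiers, numbers, operators *)
      | Phrase of node list     (* juxtaposition: for x in ... *)
      | Brackets of node list   (* comma list in brackets: [1, 2, 3] *)
      | Parens of node list     (* comma list in parens: (x + 1, x * 2) *)
      | Block of node list      (* { ... }, newline/semicolon separated *)
      | Dotted of node list     (* foo.bar.baz *)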
My gut instinct is that translating between any two programs in different languages that compute exactly the same function will run into the halting problem somewhere. I feel like this is a problem where some actual computer science analysis, done before diving head first into coding a solution, could really pay off.
That's why you'd need a common, formally defined semantics underneath that all syntaxes would be forced to comply with. That sidesteps the halting problem (unless somebody invents a truly weird syntax with meta-meta-templates and context-dependent grammar or whatever - like the "look, I can use C++ templates to implement a compile-time Lisp interpreter" horror), but, yeah, inventing a practical way to enforce that formally defined semantics is a hard problem in itself, and waaaay above my level of comp-sci knowledge :)
By definition, a program in a Turing-complete language can be turned into a universal Turing machine. Moreover, Turing equivalence means that any Turing-complete language can be translated into any other. The most naive implementation of the GP's suggestion would involve creating an IR, then for every language a compiler into this IR, and a reverse compiler to get source back out of the IR. With this, you could trivially jump between languages in a mere two steps.
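As a type-level sketch of that naive scheme in OCaml (everything here is hypothetical):

    (* Every language supplies a compiler into a shared IR and a
       decompiler back out of it. *)
    module type LANGUAGE = sig
      type ir
      val compile : string -> ir     (* source text -> shared IR *)
      val decompile : ir -> string   (* shared IR -> source text *)
    end

    (* Translation between any two such languages is then two steps. *)
    let translate (type a)
        (module A : LANGUAGE with type ir = a)
        (module B : LANGUAGE with type ir = a)
        (src : string) : string =
      B.decompile (A.compile src)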
I've been thinking about this problem a lot lately: does there exist an isomorphism between high level languages, such that you can map their ASTs and type systems back and forth?
I imagine type systems will fail: Haskell Int -> C int -> Haskell Maybe Int.
Haskell's type system is more expressive, so information is lost in the translation to C (that the Haskell Int is non-null), and it thus becomes Maybe Int when you translate back to Haskell.
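The same asymmetry written down in OCaml, standing in for Haskell here (the assumption, as in the parent, is that the target language's values may be absent/null):

    (* In the source language, the type guarantees the value is present: *)
    let x : int = 42

    (* After a round trip through a language with nullable values, the
       best type you can reconstruct is the weaker, optional one: *)
    let x' : int option = Some 42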
Huh, this is an interesting idea, especially if you're willing to do saving only through the compiler: you could just save the AST, and then when you load it, your IDE feeds it to the compiler along with the output language, and you get it back in whatever you're trying to compile to. Harder to make this editor-agnostic, but the idea is fascinating.
I can't work on a codebase with a colleague, me writing/reading the code in Scala, him/her writing/reading the same code in Java or Kotlin. I can't code a project in Clojure, then hand it over to a team of programmers who see it and work on it as Java code.
JVM languages are too different from one another (and the "common language" underneath, JVM bytecode, is waaay too low-level to qualify; practically no one writes/reads it). Different syntaxes would need to share a common semantics underneath to allow seamless translation. (And yeah, apart from some academic experiments, I think we're far from syntax-independent semantics in any production programming language.)
To actually be able to treat the multiple JVM languages as syntaxes of the same language, with a "seamless" experience, you'd need code analysis tools of almost-superhuman intelligence. By that point we'd all be out of a job anyway ;)
I think the main reason that this wouldn’t work is that compilation isn’t perfectly reversible. Information is lost.
I think you will always have this problem with translation. It's analogous to two people working on a novel, one in French, the other in English, with neither knowing the other's language, expecting the word processor to come up with some lower representation that it can use to translate flawlessly back and forth.