Thank you for being clear on the front page about the goals of the project, and for providing a detailed overview as well.
However this project also badly needs a "hello world" example. Right now there is no way to get started except to start reading the docs front-to-back, which is more than I can tackle during a lunch break.
That said, I really like the idea of a language specifically for encoding automata and I'm excited to try it out.
Agreed. Although I think it needs a bit more than just "hello world".
I still haven't really figured out what this language is really about (based on the name and the headline it sounds to me like "a spreadsheet, except without the grid UI", which is something I'm still hoping to find/understand).
It might be that I just haven't read carefully enough, but I only have so many minutes of lunch break : )
This looks pretty cool, and the detailed descriptions are great, though I agree with others that an additional, shorter and more illustrated overview would also be nice.
Anyway, I came to complain about the syntax, knowing full well that tastes differ etc.
But look:
(round, parens, construct, sequences)
and hence for grouping they use braces:
a = b + {c * d}
and they also have sets built into the language! But as braces are already taken, those are constructed with square brackets:
[1, 2, 3, 1, 2, 3] = [1, 2, 3]
And there are range types as well, using angle brackets:
<1..31>
The departure from traditional math syntax in all of this is just a bit too weird. I'm not saying any other language's syntax is perfect, but really, not using braces for sets is strange. Why do this? Pretty much any other permutation of the different kinds of parentheses for the different constructs would seem more understandable to me.
If I had designed this language, I would have used [] for sequences, {} for sets, and () for grouping. Or maybe, to be closer to mathematical notation but farther from other programming languages, it would make even more sense to use <> for sequences and [] for range types.
"The reason Cell was designed to be a domain-specific language, and its compiler to be a code generator, should be obvious. Choosing the main language for your application has usually more to do with the tools, libraries and support that are available for it than it has with the language itself, and one generally has to be, by necessity, very conservative there. But a domain-specific language that integrates with your primary language, instead of replacing it, stands a much better chance of actually being useful, as it can be adopted gradually, and only when, and to the extent that, it provides a clear advantage. It's just another tool in your toolbox. "
That's a really interesting way of thinking about it.
In my opinion that's not a great reason for not trying Clojure (although it may be a great reason for not using Clojure). Then again, if all you do is native work, then Clojure is likely a bad fit in its current form.
I suppose a better phrase is, "I haven't had an interest in Clojure apart from the language itself, and dealing with the JVM and toolchain is something I find frustrating enough that I have chosen not to dive into Clojure yet."
My limited experience with the Java toolchain is that it's unnecessarily difficult to use outside of an IDE (classpaths, etc.), and using Clojure on top of it is even more difficult.
My experience is that most people who create tools for the Java ecosystem are quite content with having a bulky mess of tools, and consider IDEs (and nothing else) to be the gold-standard.
It's just something that I'd rather not contend with, even though Clojure (and Clojurescript, which suffers from the same problems) as a language sounds quite nice.
Outside of the initial installation (brew install or apt-get install or whatever you use), you really don't have to deal with the JVM ecosystem or classpaths unless you interop with Java stuff (in which case, you'd be using JVM anyway).
Leiningen (and boot) isolate you from all of that. I've been using Clojure since '09 and I don't remember the last time I even had to think about a classpath. I also use vim (and more recently, spacemacs) and not an IDE. It's not an issue in Clojure.
Although I agree with you on the JVM in general, and can see how it's a reasonable assumption, even if not actually true.
Personally I still love Racket and think that it is the LISP of the future. Racket's number one enemy is that it is fun, and so everyone likes to come up with their own solution. Here is one binding to LLVM from 6 years ago.
> Personally I still love Racket and think that it is the LISP of the future.
I agree. It seems quite likely that the next "hundred-year language" will either be racket itself, or live in racket's ecosystem. That, or something like red[0]
This is definitely a plus, but it still doesn't get you all the way: if my IDE/editor of choice doesn't support syntax highlighting, autocompletion, error highlighting, jump-to-definition, etc. for Cell itself, then it's still difficult to integrate into my workflow.
On first glance, Cell feels like a very serious effort and I wouldn’t be surprised if it has an impact.
The embedding model is a nice and thoughtful touch. I think a lot of imperative systems could benefit by converting their semantic cores into code in an engine that’s reactive, structurally typed, transactional, and functional-lite (Cell appears to not even have generalized higher order programming, a bold but defensible choice). The relational automata (actors?) are also quite interesting.
From the high-level overview, and from my own experience, I can imagine that there is a need for something like this.
But, at first glance, the web site doesn't address the following questions:
1) What problem does it solve that a conventional approach doesn't? More specifically, what's the ideal use case? Embedded apps / IoT? Protocol implementations? Something else?
1bis) You claim that the only significant project written in Cell is Cell itself, which is quite an achievement ("Bootstrapping"), but does it mean that the ideal use case for Cell is writing compilers?
2) Who are you? What are your long-term plans for this project?
3) Are there any alternatives? (Ragel, SMC, SCXML based tools...?) How do you compare to them?
The "implicit argument" feature seems really neat. I've grown really tired of needing to thread some random flag through all of creation to pass it to some dark corner of the code.
I agree, but with this system you'd still need to update all the "implicit" blocks (which surround every function which passes on the implicit argument) to add the new flag. I'd love to see something closer to full dynamic scope, where you don't even need the "implicit" blocks.
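For a concrete comparison, GHC Haskell's ImplicitParams extension gets part of the way there (this is Haskell, obviously, not anything Cell actually offers): the flag shows up in the signatures of intermediate callers (or is inferred if you leave the signatures off), but their bodies never have to thread it explicitly. A rough sketch:

    {-# LANGUAGE ImplicitParams #-}

    -- The "dark corner" that actually consumes the flag.
    renderDate :: (?useDashes :: Bool) => (Int, Int, Int) -> String
    renderDate (y, m, d) =
      let sep = if ?useDashes then "-" else "/"
      in  show y ++ sep ++ show m ++ sep ++ show d

    -- An intermediate caller: the constraint appears in its signature,
    -- but the body never mentions the flag.
    report :: (?useDashes :: Bool) => [(Int, Int, Int)] -> String
    report = unlines . map renderDate

    main :: IO ()
    main =
      let ?useDashes = True            -- bound once, at the outermost call site
      in  putStrLn (report [(2017, 7, 1), (2017, 12, 24)])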
It's about context and readability. The more implicitness you work into your language, the less you know about a block of code just by reading that block.
Sure, but that doesn't make features like this automatically bad -- it's a tradeoff between completely local understanding and extra specificity that may or may not be helpful. For example, most languages will implicitly choose a meaning for "+" (integer addition, floating-point addition, etc.) based on the types involved. That's usually a better choice than making the programmer write (and read) a more specific form like "Float.+" or "Int.+".
Oh definitely, I'm not disputing that. It's a spectrum, and different people will be more or less comfortable with different amounts of explicitness/implicitness. The grand-something-parent was talking about "implicit implicits" which... without some more context (heh), sounds like it might be a bit off the deep end.
Racket has a really interesting way of doing things. It has sort-of-global variables that can be set for only a limited time and are only seen by the setting thread.
Add a few examples of known / easy-to-grasp algorithms: sorting, processing of a stream / tree / graph, a concurrent interaction, some easy parsing, etc. A few examples on which your language shines, shows some of the key features that make it different and worth learning.
Also, aesthetics do play a role: if your language is excessively hard to parse, or your examples contain a lot of copy-paste, etc., it might be a legitimate turn-off, though it may be worth tolerating if the other features are great. E.g. something like computing a convex hull of 2D points in APL or J may look terrible to an untrained eye, but you can nevertheless notice that it's like 40-50 characters total.
J has an almost explicit goal of being concise; it is specifically a feature they're reaching for, and thus makes sense to promote in such a fashion.
But it's absurd to judge a language on its syntax, if the language is uninterested in syntax, unless it's almost unacceptably noisy or confounding. Looking into a language should ideally be based first on structural and semantic differences, and only then on syntactic ones.
And a sufficiently semantically different language will offer little with its code snippets, as they won't make sense without understanding the paradigm. Even with J, there's not much to take away from just a snippet other than that it's dense. If anything, it mostly serves to scare people away as it approaches line noise, particularly when you don't yet know why it can be successful in such a fashion.
Also, when all programming-language websites do and say the same things, a new one that replaces examples with long-form text, even though it requires my brain to actually think, grabs my attention in weird ways.
If you do an example less trivial than "Hello world", it also lets people see a concrete example of the semantics, rather than just getting a high-level description of them.
I'm just arguing that it's not good to have code samples slapped onto the front page of every language, and particularly that this behavior shouldn't be "expected and glorified"; not that samples shouldn't be readily and easily available (they generally should be! It's part of the summary of the language).
If you don't understand it, that doesn't necessarily mean that you are at fault. It could be poorly explained. The existence of others who appear to understand it means nothing. After all, gobbledygook science papers have been accepted to peer-reviewed journals on multiple occasions.
In this case, especially, you shouldn't feel bad if you have to look up terms. Both "relational" and "reactive" are clique-specific terms.
I wonder why this is flagged? At first glance it doesn't seem spammy or inappropriate (although it is a bit dense and confusing, but not as much as those Urbit posts).
I just wanted to quickly answer a few questions and explain why the language exists at all. The points below are pretty much incomplete and in random order, with no clear thread among them. I hope they're useful anyway:
Cell was designed first and foremost to combine functional programming and the relational model, with the former providing the computational capabilities and the latter the ability to easily encode, navigate and update complex information/state, with many different types of entities and relationships between them.
It also borrows other ideas from the database world: chief among them is the practice of removing all forms of redundancy from the data structures that are used to encode the state of an application. There are two main reasons for this: first of all, a data structure that contains little or no redundancy is much less likely to be left in an inconsistent state by a buggy update. Secondly, the code that updates the state of the application becomes a lot simpler, because there are fewer things to update.
Redundancy includes not only information that is duplicated, but also information that can be derived from other information that is already present in the dataset. Eliminating derived information is more difficult than eliminating duplicated information, of course, because in many cases there are performance issues. In order to address those, Cell is meant to include several forms of memoization, but that's a long-term effort.
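A toy illustration of what I mean by derived information (sketched in plain Haskell, not Cell):

    -- Redundant: the cached total duplicates information already present in
    -- the line items, so every update has to remember to keep it in sync.
    data OrderRedundant = OrderRedundant
      { lineItemsR  :: [(String, Int)]   -- (product, price in cents)
      , cachedTotal :: Int               -- derived; a buggy update can leave it stale
      }

    -- Non-redundant: the total is recomputed on demand, so no update can ever
    -- leave the two out of sync; memoization can recover the lost performance.
    newtype Order = Order { lineItems :: [(String, Int)] }

    total :: Order -> Int
    total = sum . map snd . lineItems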
A paper that came out more than a decade ago, "Out of the Tar Pit" (google it), explains in more detail the notion of "essential state" and the idea of combining functional programming and the relational model. I'm not connected in any way to the authors of that paper, and the particular flavor of the relational model implemented in Cell is very different from the one advocated there, but until the documentation on the website is complete, that's probably the best reference to those ideas I can point you to.
I believe the relational model is by far the best data model I know of, and that it's far superior to the records+pointers data model used by most programming languages, and by object-oriented languages in particular. I can't explain why in just a few lines, though. It's a part of the website/documentation I'm still working on, but which I hope to put out relatively soon. Note however that the relational model is not the same thing as SQL: in relational databases the relational model is saddled with a number of limitations that would not be acceptable for a programming language (like the fact that the types of the columns in a relation/table can only be chosen from a few predefined datatypes) and SQL itself is of course an abomination of a query language. But that's not the only way to do it, just like C++ is not the only way to do OOP.
Cell is not meant to be an alternative to spreadsheets, or to do computational biology (I'm actually starting to suspect the choice of name is a rather unfortunate one). It's not meant to write compilers either. Quite the opposite, actually: it's designed to build stateful systems. Of course, functional languages are in general a pretty good choice for writing compilers, and Cell does have a (semi-)functional subset, which was used to write the compiler.
The target application for Cell is actually a particular type of stateful system. Let's agree to call them "reactive systems", or maybe "event-driven systems", in what follows. These systems are usually at rest (meaning that there's no code being executed, no call stack, and no instruction pointer), and when they are, the only property they have is their state. They react to external events by changing their state, which is conceptually an atomic, all-or-nothing operation. Their new state after an event depends only on the state before the event and the event itself. They do not do any sort of I/O, and are totally oblivious to the world around them.
It's a very general concept that can be applied to many types of applications. An example is the Elm architecture: in order to define an application in Elm, you have to define the type of the state of your web application; an initial state for your application; the types of all the messages your application can receive; and a transition function that, given the current state of the application and a message, returns the new state of the application. There's actually more than this, but let's ignore that for now.
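Schematically, the shape is something like this (sketched in Haskell rather than Elm, just to make the description above concrete):

    -- State type, initial state, message type, and a transition function
    -- that computes the new state from the old state and the event.
    newtype Counter = Counter { count :: Int }

    initialState :: Counter
    initialState = Counter 0

    data Msg = Increment | Decrement | Reset

    update :: Msg -> Counter -> Counter
    update Increment (Counter n) = Counter (n + 1)
    update Decrement (Counter n) = Counter (n - 1)
    update Reset     _           = Counter 0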
Static automata work in a similar way. The two main differences are that you can make use of the relational model to define the type of the application state (as if you were defining a small database schema), and that the transition function, instead of returning the new state of the application, returns a set of atomic update operations (set/insert/delete/update) that are applied to the current state to produce the new one. This is similar to what you do to update the data contained in a database: you just send a set of atomic insert/delete/update commands that are executed by the DBMS.
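The same kind of thing in the style just described (again a Haskell sketch of the shape, not actual Cell syntax): the handler returns a list of atomic updates, and applying them to the current state is what produces the new one, as a single all-or-nothing step.

    import qualified Data.Set as Set

    -- State: a tiny "relation" of logged-in user ids.
    newtype Sessions = Sessions (Set.Set Int)

    data Event  = Login Int | Logout Int | LogoutAll
    data Change = Insert Int | Delete Int            -- atomic update operations

    -- The handler inspects the current state and the event, and returns the
    -- changes to apply; it never constructs the new state directly.
    handle :: Event -> Sessions -> [Change]
    handle (Login u)  _             = [Insert u]
    handle (Logout u) _             = [Delete u]
    handle LogoutAll  (Sessions us) = map Delete (Set.toList us)

    apply :: Sessions -> Change -> Sessions
    apply (Sessions us) (Insert u) = Sessions (Set.insert u us)
    apply (Sessions us) (Delete u) = Sessions (Set.delete u us)

    -- One event is one atomic step from state to state.
    step :: Event -> Sessions -> Sessions
    step e s = foldl apply s (handle e s)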
I'll stop here, this post is already too long. Maybe I'll post more tomorrow, if anyone is still around. Does this make things a bit clearer?
> Maybe I'll post more tomorrow, if anyone is still around.
I can only speak for myself but I will definitely be around to read another post if you write one.
The design is fascinating and I'm very excited to see a new language that combines functional, relational, and reactive elements.
One design choice that I'm particularly interested in is your choice to fix the set of possible values in a way that supports generic value introspection and orthogonal persistence. There are clear benefits but some costs include possible obstacles to abstraction / modularity and also extensibility. I'm wondering how you see these trade-offs.
(Replying to my own comment because I don't see a way to edit it.)
Another question I have is about recursion. Browsing through the code a little bit, I see loops and comprehensions are often used but I'm not sure I've seen a single recursive function. Is recursion something you've consciously tried to avoid and if so can you speak about the rationale on that topic?
I'm really curious: are there any plans to have an interpreter for this language? I'd love to be able to provide it in a sort of Python-notebook setting to help work with data/events. Imagine using it to create an 'if this then that' type of service where signals could be external API references.
I'm really excited to learn more about this language.
This is currently a preprocessor to C++. C++ doesn't need it; there is enough rope in C++ to make a reactive programming framework entirely in C++, and hang yourself with the rest.
Is this what everyone who creates insane spreadsheets should actually be using? I hope the author is just being too modest to come out and say "Excel killer".
I feel it’s more of a core building block for eventual Excel killers in various verticals.
It’s really cool, but it’s hard to imagine an Excel jockey reading this overview and getting enthused - it’s still heavy on CS concepts. Apps on top of this would perhaps offer conveniences that hide the engine’s full power to some extent in exchange for feature accessibility.
There is an opening colon on if/for/while, and a closing semicolon. This seems ergonomically the same as an open brace and a close brace, and needlessly different.
I can understand leaving off the closing delimiter for a Python-like syntax. But this doesn't seem to be a good compromise.
The language sounds awesome in many other respects, which I have yet to dive into. But yes unfortunately this is the first thing people see.
My philosophy is to have new syntax where there are new semantics, but old syntax for old semantics. And blocks seems like they have no new semantics, so it should just choose C/Java style or Python style.
It reminds me of when Rust was first released, and they used brk for break and cnt for continue. That was just silly, because they mean exactly the same things as they do in every other common language, with a needless syntactic difference. People program in more than one language simultaneously, and all these differences add up.
I must add that Haskell somehow manages to combine both, so you can use braces if needed, but normally people use indents, and neither leads to syntactic uncertainties.
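For example, these are the same function, once in layout style and once with explicit braces and semicolons; GHC accepts both:

    describe :: Int -> String
    describe n = let sign = if n < 0 then "negative" else "non-negative"
                     size = if abs n > 100 then "big" else "small"
                 in  sign ++ " and " ++ size

    describe' :: Int -> String
    describe' n = let { sign = if n < 0 then "negative" else "non-negative"
                      ; size = if abs n > 100 then "big" else "small"
                      } in sign ++ " and " ++ size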
Honestly that kind of extra feature was a bit of a hindrance when learning Haskell. Always felt like I was learning the wrong style no matter which style I went with.
Agreed. For people from a C, C++ or Java background, ";" on its own means an empty statement. Braces are much more common for this. Or do something like "endif" or "endfor", or just "end".
"Structural vs nominative debate aside, you might have noticed a major omission in the data model: closures"
It then goes on to explain that they would be hard to serialize.
Not so. If you look at how Ruby handles closure members when serializing with Marshal.dump, you will see that it just writes a memory address and doesn't actually store a call graph. So why not just omit it like Ruby does?
Is Ruby able to deserialize the closure into the same process that serialized it? In any case, it won't be able to do that in a different process; e.g. if the data is sent over the wire.
I think the approach of not allowing closures in data that may have to be serialized is a much better approach than just writing a memory address and leaving the recipient to deal with the problem.
A "simple rule" that's not programmatically enforced is just a convention, and conventions are too easily ignored. I'd be ok with a type-system based solution that distinguishes serializable and non-serializable types, but silently writing nondeserializable data is just asking for trouble down the road.
This language in particular is built around having value semantics for everything (e.g. writing to a database and reading back is guaranteed to result in an indistinguishable value); and that enables nice features like persisting the complete application state on disk, deterministic replay of logs and so on.
I think for the goals of this language, disallowing closures as data is a reasonable choice.
I stopped reading at "There's no such thing as user-defined equality". Defining equality, and other functions, is an intrinsic part of defining a type. Whatever Cell may be good for, it's obviously not for me.