Fascinating. I was just thinking about something like this today because I wanted to use generators. JavaScript is odd in that it's hard to make changes to the language and get browser vendors to support them. Fortunately, all JavaScript implementations so far are Turing complete, and so we get projects like this that let us keep moving forward, even while IE6 holds us back...
Narcissus is pretty cool - it's used as a testbed for new Harmony features. It relies more heavily on SpiderMonkey-specific features than Continuum does, though.
Caja and Narcissus are definitely comparable, but Continuum is unique in that it fully implements the ECMAScript object model, runtime, and standard library (except for a few things that are still in progress). I would say that Continuum is the only one that qualifies to be called a "JavaScript engine" in its own right.
Narcissus is a meta-circular interpreter that puts a thin layer over the host engine, and requires the host engine itself to have some ES6 features (it can only actually run in SpiderMonkey and V8 with the --harmony flag enabled).
Caja is more of a wrapper that protects access to capabilities than anything. In fact it specifically avoids fully parsing source code itself in order to be performant. Its goal is to sandbox code, not interpret it.
Continuum implements the ES6 Object Model and nearly fully implements the internal algorithms of the ~450 page ES6 specification (aside from a small handful of things which are a bit out of date or remain to be implemented). In fact, the only thing it does not implement itself is RegExp, which it currently provides by wrapping the host engine's functionality. It even fully implements the Date internal algorithms (https://github.com/Benvie/continuum/blob/gh-pages/engine/bui...).
Additionally, it self-hosts much of its own code. All of the ES6 standard library is itself written in ES6 (https://github.com/Benvie/continuum/tree/gh-pages/engine/bui...) and is executed in the virtual machine each time a realm is created. Roughly 25% of Continuum's code is written in ES6, while the other 75% is written in ES3 (I plan to reverse this ratio in time).
Also, Tachyon is a full JS engine built in JS. It's incomplete but has the ability to self-host and JIT compile machine code (all written in JS), so it gets bonus points to offset the fact that its implementation is incomplete: https://github.com/Tachyon-Team/Tachyon
If you look at the TODO list, it looks like they want to replicate PyPy's approach. I wonder, have they considered writing an rjavascript to rpython compiler and just using the pypy infrastructure as is?
I had considered this, but it would also require the other end of it. RPython is basically compiled to C which is compiled to machine code. The ultimate output format for this needs to be JavaScript, something like the form specified by https://github.com/dherman/asm.js. So we need JavaScript (ES6) coming in one end and asm.js coming out the other end. At this point the only remaining usable pieces from PyPy are the processing that happens in the middle. Admittedly this is a huge part of it and would still be very valuable to have access to, but there's probably a better approach to this than having to redo both ends of the pipeline, and losing the ability to self-host the compilation process in a JavaScript engine (browser).
Perhaps pairing PyPy with emscripten would be the ideal solution for this, but that would have to be a completely new project because nothing that's currently in Continuum would be useful for that (except probably the standard library that's written in ES6 and executed in the VM).
OK, I understand. I was under the impression that the purpose of the meta-interpreter was to create a virtual machine outside of the browser, but having re-read your README, I now understand this isn't the case.
Still, I would recommend trying to reuse PyPy's infrastructure, even if only for the short term. An immense amount of time and thought has gone into creating it that you would end up replicating otherwise.
Besides, PyPy doesn't just target C, it also targets the JVM and .NET, so in writing a new backend for JS, you wouldn't be working against the grain.
There's no need to have a restricted JavaScript. All you have to do is to write code in ES6 with the type annotations and the entire program will be fully type-inferable (though the inferring part may take a while because you probably need to do something like a Cartesian product for duck typing). Once you do that, you can do a lot of optimizations.
You need to do things like not dispatch to valueOf and toString for coercion. You basically just need to get rid of dynamic coercion dispatch, and do what you said (annotate types) and that's all that's really needed. But the ECMAScript runtime is pretty substantial, so having a minimized subset of that runtime would make it possible to self-host most of the higher-level interpreter in ES6-like code instead of having most of it implemented in ES3 (as it is now). This makes a significant difference in the verbosity of the code. Converted to ES6, I estimate the code would be half of what it is now (currently it contains ~23k lines of ES3 code and ~7k lines of ES6 code, so ~30k total).
Actually, you need to avoid a few more features too, like dynamically attaching properties to objects at run time and dynamically creating new classes... but the point is, you don't need to define a new language when all you have to do is to use only a subset.
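To make "dynamic coercion dispatch" concrete, here is a minimal sketch in plain JavaScript. The `sneaky` object and the `addNumbers` helper are hypothetical names made up for illustration, not anything from Continuum. It shows how `+` on an object calls user-defined `valueOf`, which is the behavior a restricted subset would rule out by only allowing operands whose types are already known.

```javascript
// Hypothetical object: using it in arithmetic runs user code via valueOf.
let calls = 0;
const sneaky = {
  valueOf() {
    calls += 1; // observable side effect of the coercion
    return 20;
  },
  toString() {
    return "not used by + when valueOf returns a primitive";
  }
};

// `+` converts the object to a primitive, which dispatches to valueOf first.
const dynamicSum = sneaky + 1;

// Restricted style (an assumption for illustration): operands must already
// be numbers, so no user-defined coercion hook can ever run.
function addNumbers(a, b) {
  if (typeof a !== "number" || typeof b !== "number") {
    throw new TypeError("addNumbers expects two numbers");
  }
  return a + b;
}

const staticSum = addNumbers(20, 1);
```

In the first case an optimizer cannot know what `+` will do without knowing what `valueOf` does; in the second the operation is always a plain numeric addition.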
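As a rough sketch of what "not dynamically attaching properties" could look like in practice: the class below declares every property in its constructor and then seals the instance. Using `Object.seal` to enforce this is only an assumption for illustration (a real subset would more likely be checked statically), and `Point` and `tryAttach` are hypothetical names.

```javascript
// Every property is declared up front, so the object's shape never changes.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    Object.seal(this); // existing properties stay writable, new ones are rejected
  }
}

// Attempts to attach a new property; returns true if it succeeded.
function tryAttach(obj) {
  "use strict"; // in strict mode, adding to a sealed object throws a TypeError
  try {
    obj.z = 3;
    return true;
  } catch (e) {
    return false;
  }
}

const p = new Point(1, 2);
p.x = 10;                      // allowed: the property already exists
const attached = tryAttach(p); // false: the shape is fixed
```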
Well RPython is just that: a restricted subset of Python (hence the name). Similarly, asm.js is a (small) subset of ECMAScript targeted at compile-to-js code generators. It's just about specifying what the subset contains, not inventing a new language.
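For a flavor of how a subset can carry type information without new syntax, here is a small sketch written in the asm.js style, where `x | 0` marks a value as a 32-bit integer. The module name `AddModule` is made up for illustration, and the `"use asm"` directive that a real asm.js module starts with is deliberately left out so this runs as ordinary JavaScript everywhere.

```javascript
// asm.js-style module (sketch). A real asm.js module would begin with the
// "use asm" directive; it is omitted here so this is just plain JavaScript.
function AddModule() {
  function add(a, b) {
    a = a | 0;          // parameter annotation: a is a 32-bit integer
    b = b | 0;          // parameter annotation: b is a 32-bit integer
    return (a + b) | 0; // return annotation: result is a 32-bit integer
  }
  return { add: add };
}

const mod = AddModule();
const small = mod.add(2, 3);            // 5
const wrapped = mod.add(2147483647, 1); // wraps around to -2147483648
```

Everything here is valid ECMAScript; the subset just constrains which constructs appear, so a compiler can infer types from the coercions.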
I'm gathering the goal of this is to be a polyfill, for older browsers, as ES6 comes online, allowing for a faster transition. If that's the case it seems to be well ahead of the curve...