Show HN: Python to C++14 transpiler

pjmlp · on Jan 18, 2016

It is called a compiler even if it outputs code in another language; transpiler is some neologism from JavaScript developers without a background in compiler design.

chrisseaton · on Jan 18, 2016

I'd defend the word 'transpiler' and I did my masters and PhD on language implementation, and that's where I work professionally now, so I'm not ignorant of compilers.

I think it implies a translation from one high level language to another (not that 'high level' is well defined either), with only desugaring and maybe type-checking - no real lowering or optimisations. That's a useful subset of compilers, so can have its own word I believe.

pjmlp · on Jan 18, 2016

So for you a C compiler that follows the initial workflow is a transpiler?

Or an Eiffel implementation, as another example.

chrisseaton · on Jan 18, 2016

I think a transpiler is a compiler that does a high-level to high-level translation with only a few simple transformations and almost no optimisations and produces relatively readable output.

If a C or Eiffel compiler meets that definition then yes. I'd still call it a compiler, but I would also call it a transpiler.

If you're offended by the term transpiler because all transpilers are also compilers, then I don't know why you wouldn't also be offended by the term compiler because all compilers are also 'programs', so let's just call everything a 'program'.

pjmlp · on Jan 18, 2016

Apparently I am the one wrong here, given the 1964 paper link provided by Ded7xSEoPKYNsDd.

Still the word sounds strange.

poizan42 · on Jan 18, 2016

Why are people giving this word so much flak? Yes it's a neologism, but it's a useful one. A decompiler is a compiler too; it just goes from a low level language to a high level one. Do you suggest that we stop using the word decompiler as well?

kitd · on Jan 18, 2016

'Transpiler' goes back long before the JavaScript/CoffeeScript world. Pascal/C and FORTRAN/C have a history of 'transpilation' products.

It's just short-hand for source-to-source compiler and IMHO has been used enough to warrant its inclusion in a programmer's dictionary.

pjmlp · on Jan 18, 2016

Yes they have, and we called them compilers.

Can you dig me a 70's paper with the transpiler word on it?

Ded7xSEoPKYNsDd · on Jan 18, 2016

There's this paper from 1964, shortly before the last paragraph.

http://comjnl.oxfordjournals.org/content/7/1/28.full.pdf+htm...

pjmlp · on Jan 18, 2016

Thanks, it seems I need to fix my stance on it.

srean · on Jan 18, 2016

> It is called a compiler even if it outputs code in another language

Indeed, and most compilers do exactly that. Can't help getting a little annoyed whenever I hear this 'transpiler' word. I guess the ship has sailed, oh well!

nimitkalra · on Jan 18, 2016

From what I've noticed, these "transpilers" output code that is readable (the code itself is written as though a human wrote it) whereas "compilers" output code that has been optimized and shows effects of name mangling in the code itself, etc. Just an observation.

I think it makes sense to use a different term for this "compiler"-esque behavior. For example, I might edit the JavaScript that CoffeeScript generates, whereas I wouldn't know how to modify the output of gcc.

pjmlp · on Jan 18, 2016

The output of gcc can be straight readable Assembly, once upon a time a normal language to write business applications in.

Also C compilers used to generate Assembly text files, which were then piped into the Assembler.

So no, transpiler doesn't make any sense.

kqr · on Jan 18, 2016

This is even better highlighted by something like the GHC compiler, which compiles Haskell to (ultimately) machine code, because it does this in several steps:

1. As a first step, it "transpiles" Haskell to the very similar high-level intermediate language called Core. I could read and write programs in Core without too much effort because it's still a very high-level language.

2. Then it "transpiles" the Core code to the not significantly lower level STG source code – human readable, but starts dealing in a little lower-level stuff and contains much fewer bells and whistles. I'd prefer not to work in STG, but I totally could if I had to.

3. After this, it again "transpiles" the STG source code to the C-- (C-minus-minus) intermediate language, which is decidedly low-level (slightly lower level than regular C code), but there's still a clear and relatively simple translation between STG and C--.

4. As a final step, it "transpiles" the C-- source code to assembly (a closer match than you may think, coming from C), which can be assembled into machine code.

At no point in this pipeline is there a significant magic translation being made – despite the fact that Haskell is probably one of the most high-level languages we know and machine code is as low-level as you get.

omaranto · on Jan 18, 2016

People believe some languages are "higher level" than others and call a tool that translates programs written in A to programs written in B a compiler if A is clearly higher level than B and a transpiler if B is of a level higher than or roughly equal to A.

You can certainly think this distinction doesn't merit using a different word, but you shouldn't think that people who use the word "transpile" use it as a synonym for "compiler".

lmm · on Jan 18, 2016

The distinction is meaningful and valuable. Language evolves. No need for the attacks.

morgenkaffee · on Jan 18, 2016

You are right. C++ is quite different to Python and so it should be called a compiler.

But it fits the project since I don't have any background or experience in compiler design.

tormeh · on Jan 18, 2016

Transpilers are a subcategory of compilers. It's just a more specific word.

jmgao · on Jan 18, 2016

Alternatively, translator, the word that CFront used.

Sharlin · on Jan 18, 2016

And "translation" is still the term of art for the compilation process at least in the C and C++ standards.

drvortex · on Jan 18, 2016

Strictly you are right. However, by the strict definition of a compiler, an interpreter is nothing but a sort-of "real time compiler".

But a compiler has come to mean a program that translates source-code into machine code.

By this new understanding, a transpiler is a combination of a compiler and a decompiler with different input and output languages.

So po-tay-toes, po-taa-toes. No one cares, we know what is meant by transpiler.

programmer_man · on Jan 18, 2016

The word transpiler was used before we had web browsers, much less javascript.

Point taken that it is still a compiler.

agumonkey · on Jan 18, 2016

or pretty printing

throwaway999888 · on Jan 18, 2016

I for one am just waiting for the cispiler.

Symmetry · on Jan 18, 2016

An optimizing cispiler sounds really interesting. Something that applies all the optimizations that the compiler would do when translating your C to machine code but showing you the results in C. Of course not all optimizations are representable in C (calling conventions for instance) but many are.

jmgao · on Jan 18, 2016

There are several examples of this in the Java world, such as proguard. They're mostly centered around code obfuscation, but they do some useful optimizations as well.

haberman · on Jan 18, 2016

Key sentence: "The goal is to showcase the power of C++14 templates and not to create a fully functional transpiler."

Viewed through that lens, this is a really novel and cool demonstration.

Joky · on Jan 18, 2016

This seems to me to be very close to what Pythran [0] is doing (since 2011), except that Pythran includes some type inference and bridge with Python code.

So viewed through that lens, I'm not really seeing the novelty right now?

[0]: http://github.com/serge-sans-paille/pythran/

anon4 · on Jan 18, 2016

There is also nuitka - http://nuitka.net/

Sharlin · on Jan 18, 2016

Should be noted that this only works on a "statically typeable" subset of Python where every variable has a de facto static type inferred at the first assignment. For instance, the following valid Python code would output invalid C++:

  var = []
  var = 2

morgenkaffee · on Jan 18, 2016

Yes you would have to program in a subset of python. But type annotations are not always needed.

For the array I do a little hackery. You can define the array without an initial value in the container and I can guess the value type.

  arr = []
  arr.append(1)

it will spit out

  std::vector<decltype(1)> arr{};
  arr.push_back(1);
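
A minimal sketch of why that output compiles, assuming the element type is taken from the first appended literal. The names `make_arr` and `check_make_arr` are made up for illustration and are not from the project:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// decltype(1) names the type of the literal 1, which is int.
static_assert(std::is_same<decltype(1), int>::value, "decltype(1) is int");

// Hypothetical wrapper around the two emitted lines.
inline std::vector<decltype(1)> make_arr() {
    std::vector<decltype(1)> arr{};  // same type as std::vector<int>
    arr.push_back(1);
    return arr;
}

// Small self-check of the result.
inline void check_make_arr() {
    const auto arr = make_arr();
    assert(arr.size() == 1);
    assert(arr[0] == 1);
}
```

Since the element type is fixed by that first literal, a later `arr.push_back(1.5)` would still compile but store the converted `int` value, so this only mirrors Python for homogeneous lists.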

exprx · on Jan 18, 2016

Well, you could always begin another scope with each assignment.

    auto var = // this type can't be inferred because it's not used
    {
      auto var = 2;
    }

I'm too exhausted to think why this may not be applicable.

marvy · on Jan 18, 2016

Python:

    x = 2
    if stringy: x = '2'
    print x+x # 4 or 22???

C++:

    int x = 2;
    if (stringy) std::string x = "2";
    std::cout << x + x; // 4 (string x is out of scope)
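
A compilable sketch of the C++ side, with explicit braces so the scope is visible. The function name `add_after_shadowing` is invented for this example:

```cpp
#include <cassert>
#include <string>

// Returns what "x + x" evaluates to after the conditional block.
inline int add_after_shadowing(bool stringy) {
    int x = 2;
    if (stringy) {
        // A new variable that shadows the outer x, but only inside these braces.
        std::string x = "2";
        (void)x;  // silence the unused-variable warning
    }
    return x + x;  // refers to the outer int x again
}

// Both branches give the same answer, unlike the Python version.
inline void check_shadowing() {
    assert(add_after_shadowing(false) == 4);
    assert(add_after_shadowing(true) == 4);
}
```

So opening a new scope per assignment keeps the C++ valid, but it changes the meaning whenever the reassignment sits inside a branch or loop.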

Fede_V · on Jan 18, 2016

There are a few similar projects: Shedskin, Nuitka, Pythran, etc. They are all pretty cool projects, and worth looking into just to learn new techniques.

srean · on Jan 18, 2016

... and unpython, spyke .... all very cool and with their own spin. What is a little worrisome is the rate at which these get abandoned and how many of them target some specific subset of Python. I particularly liked unpython's take.

seivan · on Jan 18, 2016

Would someone be kind enough to explain how the transpiler/compiler knows that num, of generic(?) type T1, can be used with the <= operator? Or is that something the user has to define themselves?

Wouldn't it be something like "T1 where T1 is Numeric"?

Thanks!

inglor · on Jan 18, 2016

This is how C++ templates work. They are a form of polymorphism through how you use the object and not based on its type.

If you pass a type that cannot be used with the <= operator it will error in compile time.

Being able to do this is part of why C++ templates are much more powerful than Java/C# generics and why they enable a different (and alternative) form of polymorphism to inheritance and explicit interfaces.
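
A small sketch of that behaviour, using made-up names (`at_most`, `NoCompare`) rather than anything from the project:

```cpp
#include <cassert>
#include <string>

// No constraint on T1 is declared anywhere; the body simply uses <=.
template <typename T1>
bool at_most(const T1& a, const T1& b) {
    return a <= b;
}

struct NoCompare {};  // deliberately has no operator<=

// Uncommenting the next line is a compile-time error, because
// instantiating at_most<NoCompare> needs an operator<= that does not exist:
// bool bad = at_most(NoCompare{}, NoCompare{});

inline void check_at_most() {
    assert(at_most(1, 2));                               // int
    assert(at_most(2.5, 2.5));                           // double
    assert(!at_most(std::string("b"), std::string("a")));  // std::string
}
```

The same template works for `int`, `double` and `std::string` only because each of those types happens to provide `<=`.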

seivan · on Jan 18, 2016

I see. In e.g. Swift, you'd have to define beforehand that T1 could be used with <= while here I guess it adds in the method for "each" T1 that could be used with it.

I assume this just duplicates the method for each type at compile time instead of trying to figure it out at runtime?

Damnit, now I wanna rewrite back to C++.

jschwartzi · on Jan 18, 2016

In C++ it just prints an error message and halts compilation if you try to use a type which doesn't support the semantics of the template function. It does not add functions which are not declared already.
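
A sketch of what "one copy per type" looks like, assuming a function-local static counter as a probe (`calls_for_type` is a made-up name). The compiler generates a separate function only for each type the template is actually called with, and each generated function gets its own static:

```cpp
#include <cassert>

// Each instantiation (int, double, ...) is a distinct function,
// so each one owns a separate `count` variable.
template <typename T>
int calls_for_type(const T&) {
    static int count = 0;
    return ++count;
}

inline void check_instantiations() {
    assert(calls_for_type(1) == 1);    // first call of the int version
    assert(calls_for_type(2) == 2);    // same int version, counter goes up
    assert(calls_for_type(3.0) == 1);  // double version has a fresh counter
}
```

All of this is resolved at compile time; nothing is looked up by type at runtime.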

aldanor · on Jan 18, 2016

That's what C++ concepts are for (maybe we'll see them in C++17).
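
In the meantime, a requirement like "T1 supports <=" can be approximated in C++14 with a detection trait and a `static_assert`. This is only a sketch; `has_less_equal`, `at_most_checked` and `Opaque` are names invented for the example:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// C++14-friendly stand-in for std::void_t (which is C++17).
template <typename... Ts> struct make_void { using type = void; };
template <typename... Ts> using void_t_14 = typename make_void<Ts...>::type;

// Primary template: assume T has no usable operator<=.
template <typename T, typename = void>
struct has_less_equal : std::false_type {};

// Chosen only when the expression `a <= b` is well-formed for const T&.
template <typename T>
struct has_less_equal<
    T, void_t_14<decltype(std::declval<const T&>() <= std::declval<const T&>())>>
    : std::true_type {};

template <typename T1>
bool at_most_checked(const T1& a, const T1& b) {
    // Fails early with a readable message instead of an error deep in the body.
    static_assert(has_less_equal<T1>::value, "T1 must support operator<=");
    return a <= b;
}

struct Opaque {};  // no operator<=

static_assert(has_less_equal<int>::value, "int supports <=");
static_assert(!has_less_equal<Opaque>::value, "Opaque does not support <=");

inline void check_constraint() {
    assert(at_most_checked(1, 2));
    assert(!at_most_checked(2.0, 1.0));
}
```

This does not change which types are accepted; it only states the requirement up front and improves the error message.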

morgenkaffee · on Jan 18, 2016

It is all magic of Clang or GCC. Once you call the template with T1 the C++ compiler will enforce that your passed type supports the <= operator.

techdragon · on Jan 18, 2016

Writing something like this for Python 2 is like throwing urine-filled water balloons at all the progressive developers working hard to get the Python community transitioned to Python 3.

Don't have enough reasons to stick with shitty old Python 2? Well then, here's another anchor for your boat!

Edit: The first pull request was for Python 3 support, hooray.

anc84 · on Jan 18, 2016

No need to phrase it so nastily.

marktangotango · on Jan 18, 2016

You say nasty, I say colorful. No need to characterize it so prejudicially.