Show HN: Python to C++14 transpiler (github.com/lukasmartinelli)
96 points by morgenkaffee on Jan 18, 2016 | 44 comments



It is called a compiler even if it outputs code in another language; transpiler is a neologism from JavaScript developers without a background in compiler design.


I'd defend the word 'transpiler'. I did my master's and PhD on language implementation, and that's where I work professionally now, so I'm not ignorant of compilers.

I think it implies a translation from one high-level language to another (not that 'high level' is well defined either), with only desugaring and maybe type-checking, and no real lowering or optimisations. That's a useful subset of compilers, so I believe it can have its own word.


So for you a C compiler that follows the initial workflow is a transpiler?

Or an Eiffel implementation, as another example.


I think a transpiler is a compiler that does a high-level to high-level translation with only a few simple transformations and almost no optimisations and produces relatively readable output.

If a C or Eiffel compiler meets that definition then yes. I'd still call it a compiler, but I would also call it a transpiler.

If you're offended by the term transpiler because all transpilers are also compilers, then I don't know why you wouldn't also be offended by the term compiler because all compilers are also 'programs', so let's just call everything a 'program'.


Apparently I am the one who is wrong here, given the 1964 paper link provided by Ded7xSEoPKYNsDd.

Still the word sounds strange.


Why are people giving this word so much flak? Yes, it's a neologism, but it's a useful one. A decompiler is a compiler too; it just goes from a low-level language to a high-level one. Do you suggest that we stop using the word decompiler as well?


'Transpiler' goes back long before the JavaScript/CoffeeScript world. Pascal/C and FORTRAN/C have a history of 'transpilation' products.

It's just short-hand for source-to-source compiler and IMHO has been used enough to warrant its inclusion in a programmer's dictionary.


Yes they have, and we called them compilers.

Can you dig up a 1970s paper with the word transpiler in it?


There's this paper from 1964; the word appears shortly before the last paragraph.

http://comjnl.oxfordjournals.org/content/7/1/28.full.pdf+htm...


Thanks, it seems I need to fix my stance on it.


> It is called a compiler even if it outputs code in another language

Indeed, and most compilers do exactly that. I can't help getting a little annoyed whenever I hear this 'transpiler' word. I guess the ship has sailed, oh well!


From what I've noticed, these "transpilers" output code that is readable (the code itself is written as though a human wrote it), whereas "compilers" output code that has been optimized and shows the effects of name mangling, etc. Just an observation.

I think it makes sense to use a different term for this "compiler"-esque behavior. For example, I might edit the JavaScript that CoffeeScript generates, whereas I wouldn't know how to modify the output of gcc.


The output of gcc can be straight readable Assembly, once upon a time a normal language to write business applications in.

Also C compilers used to generate Assembly text files, which were then piped into the Assembler.

So no, transpiler doesn't make any sense.


This is even better highlighted by something like the GHC compiler, which compiles Haskell to (ultimately) machine code, because it does this in several steps:

1. As a first step, it "transpiles" Haskell to the very similar high-level intermediate language called Core. I could read and write programs in Core without too much effort because it's still a very high-level language.

2. Then it "transpiles" the Core code to the not significantly lower level STG source code – human readable, but it starts dealing in slightly lower-level stuff and contains far fewer bells and whistles. I'd prefer not to work in STG, but I totally could if I had to.

3. After this, it again "transpiles" the STG source code to the C-- (C-minus-minus) intermediate language, which is decidedly low-level (slightly lower level than regular C code), but there's still a clear and relatively simple translation between STG and C--.

4. As a final step, it "transpiles" the C-- source code to assembly (a closer match than you may think, coming from C), which can be assembled into machine code.

At no point in this pipeline is there a significant magic translation being made – despite the fact that Haskell is probably one of the most high-level languages we know and machine code is as low-level as you get.


People believe some languages are "higher level" than others and call a tool that translates programs written in A to programs written in B a compiler if A is clearly higher level than B and a transpiler if B is of a level higher than or roughly equal to A.

You can certainly think this distinction doesn't merit using a different word, but you shouldn't think that people who use the word "transpile" use it as a synonym for "compiler".


The distinction is meaningful and valuable. Language evolves. No need for the attacks.


You are right. C++ is quite different to Python and so it should be called a compiler.

But it fits the project since I don't have any background or experience in compiler design.


Transpilers are a subcategory of compilers. It's just a more specific word.


Alternatively, translator, the word that CFront used.


And "translation" is still the term of art for the compilation process at least in the C and C++ standards.


Strictly you are right. However, by the strict definition of a compiler, an interpreter is nothing but a sort-of "real time compiler".

But a compiler has come to mean a program that translates source-code into machine code.

By this new understanding, a transpiler is a combination of a compiler and a decompiler with different input and output languages.

So po-tay-toes, po-taa-toes. No one cares, we know what is meant by transpiler.


The word transpiler was used before we had web browsers, much less JavaScript.

Point taken that it is still a compiler.


or pretty printing


I for one am just waiting for the cispiler.


An optimizing cispiler sounds really interesting. Something that applies all the optimizations that the compiler would do when translating your C to machine code but showing you the results in C. Of course not all optimizations are representable in C (calling conventions for instance) but many are.
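
The same-language idea can be tried on a small scale. The following is only a minimal sketch, in Python rather than C, of a source-to-source pass that folds integer constants and prints the result in the input language. The names `ConstantFolder` and `optimize` are made up for this illustration, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import operator

# Only a few integer operators are folded in this sketch.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace `<int> op <int>` with the computed integer literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        fold = FOLDABLE.get(type(node.op))
        if (fold is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(fold(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def optimize(source):
    """Return Python source with integer constant expressions folded."""
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(optimize("seconds = 60 * 60 * 24"))  # seconds = 86400
```

Expressions that involve names, such as `y = x + 1`, are left untouched because nothing about `x` is known at this stage.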


There are several examples of this in the Java world, such as ProGuard. They're mostly centered around code obfuscation, but they do some useful optimizations as well.


Key sentence: "The goal is to showcase the power of C++14 templates and not to create a fully functional transpiler."

Viewed through that lens, this is a really novel and cool demonstration.


This seems to me to be very close to what Pythran [0] has been doing since 2011, except that Pythran includes some type inference and a bridge to Python code.

So viewed through that lens, I'm not really seeing the novelty right now?

[0]: http://github.com/serge-sans-paille/pythran/


There is also Nuitka - http://nuitka.net/


It should be noted that this only works on a "statically typeable" subset of Python where every variable has a de facto static type inferred at the first assignment. For instance, the following valid Python code would output invalid C++:

  var = []
  var = 2
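
A checker for that restriction could look roughly like this. It is a minimal sketch that only understands literals and top-level assignments, and `literal_type` and `find_type_conflicts` are hypothetical helpers, not part of the project.

```python
import ast


def literal_type(node):
    """Return a coarse type name for a literal expression, or None if unknown."""
    if isinstance(node, ast.List):
        return "list"
    if isinstance(node, ast.Constant):
        return type(node.value).__name__
    return None


def find_type_conflicts(source):
    """Return (name, first_type, new_type, lineno) for each conflicting rebinding."""
    first_seen = {}
    conflicts = []
    for stmt in ast.parse(source).body:
        if not isinstance(stmt, ast.Assign):
            continue
        inferred = literal_type(stmt.value)
        if inferred is None:
            continue
        for target in stmt.targets:
            if not isinstance(target, ast.Name):
                continue
            # The first assignment fixes the type; later ones must agree with it.
            known = first_seen.setdefault(target.id, inferred)
            if known != inferred:
                conflicts.append((target.id, known, inferred, stmt.lineno))
    return conflicts


print(find_type_conflicts("var = []\nvar = 2\n"))
# [('var', 'list', 'int', 2)]
```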


Yes, you would have to program in a subset of Python. But type annotations are not always needed.

For the array I do a little hackery. You can define the array without an initial value in the container, and I guess the value type from what gets appended to it.

  arr = []
  arr.append(1)
it will spit out

  std::vector<decltype(1)> arr{};
  arr.push_back(1);
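
One way to get that output is to look ahead for the first `append` on the list. This is a minimal sketch and not the project's actual code: it assumes only empty-list assignments and `append` calls with integer literals, and `emit_cpp` and its helpers are hypothetical names.

```python
import ast


def is_empty_list_assign(stmt):
    """True for statements of the form `name = []`."""
    return (isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
            and isinstance(stmt.value, ast.List)
            and not stmt.value.elts)


def append_call(stmt):
    """Return (list_name, literal_text) for `name.append(<int literal>)`, else None."""
    if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
        return None
    call = stmt.value
    func = call.func
    if (isinstance(func, ast.Attribute) and func.attr == "append"
            and isinstance(func.value, ast.Name)
            and len(call.args) == 1
            and isinstance(call.args[0], ast.Constant)
            and type(call.args[0].value) is int):
        return func.value.id, repr(call.args[0].value)
    return None


def emit_cpp(source):
    """Translate the two supported statement kinds into lines of C++."""
    body = ast.parse(source).body
    lines = []
    for index, stmt in enumerate(body):
        if is_empty_list_assign(stmt):
            name = stmt.targets[0].id
            # Look ahead: the first value appended decides the element type.
            appended = [found[1] for found in map(append_call, body[index + 1:])
                        if found is not None and found[0] == name]
            if not appended:
                raise ValueError(f"cannot guess element type of {name!r}")
            lines.append(f"std::vector<decltype({appended[0]})> {name}{{}};")
        else:
            found = append_call(stmt)
            if found is None:
                raise NotImplementedError(ast.dump(stmt))
            lines.append(f"{found[0]}.push_back({found[1]});")
    return lines


print("\n".join(emit_cpp("arr = []\narr.append(1)\n")))
```

A list that is never appended to raises `ValueError` here, since there is nothing to hand to `decltype`.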


Well, you could always begin another scope with each assignment.

    auto var = // this type can't be inferred because it's not used
    {
      auto var = 2;
    }
I'm too exhausted to think why this may not be applicable.


Python:

    x = 2
    if stringy: x = '2'
    print x+x # 4 or 22???
C++:

    int x = 2;
    if (stringy) std::string x = "2"; // declares a new x that lives only inside the if statement
    std::cout << x + x; // 4 (the std::string x is out of scope)
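
On the Python side, an `if` does not open a new scope, so the assignment rebinds the same `x` and the result depends on `stringy` at runtime. A minimal Python 3 sketch, with `double` as a made-up wrapper so both branches can be run:

```python
def double(stringy):
    x = 2
    if stringy:
        x = "2"  # rebinds the same name; no new scope is created
    return x + x


print(double(False))  # 4
print(double(True))   # 22
```

So wrapping each assignment in its own C++ scope changes the meaning of the program whenever the rebound name is read after the block ends.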


There are a few similar projects: Shedskin, Nuitka, Pythran, etc. They are all pretty cool, and worth looking into just to learn new techniques.


... and unpython, spyke .... all very cool and with their own spin. What is a little worrisome is the rate at which these get abandoned and how many of them target some specific subset of Python. I particularly liked unpython's take.


Would someone be kind enough to explain how the transpiler/compiler knows that num, of the generic(?) type T1, can be used with the <= operator? Or is that something users have to define themselves?

Wouldn't it be something like "T1 where T1 is Numeric"?

Thanks!


This is how C++ templates work. They are a form of polymorphism through how you use the object and not based on its type.

If you pass a type that cannot be used with the <= operator, you get an error at compile time.

Being able to do this is part of why C++ templates are much more powerful than Java/C# generics and why they enable a different (and alternative) form of polymorphism to inheritance and explicit interfaces.
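
This is also why a transpiler does not need to know anything about `num`. As a minimal sketch, assuming every untyped Python parameter simply gets its own template type parameter (as the `T1` in the question suggests), the C++ signature can be generated without any constraint. `template_signature` is a hypothetical helper, not the project's code.

```python
import ast


def template_signature(source):
    """Build an unconstrained C++14 template signature for one Python function."""
    func = ast.parse(source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise ValueError("expected a function definition")
    params = [arg.arg for arg in func.args.args]
    type_names = [f"T{i}" for i in range(1, len(params) + 1)]
    typenames = ", ".join(f"typename {t}" for t in type_names)
    arguments = ", ".join(f"{t} {p}" for t, p in zip(type_names, params))
    # A function without parameters needs no template header at all.
    prefix = f"template <{typenames}> " if params else ""
    return f"{prefix}auto {func.name}({arguments})"


print(template_signature("def factorial(num):\n    return num\n"))
# template <typename T1> auto factorial(T1 num)
```

Nothing in the generated signature mentions `<=`. Whether `T1` supports it is only checked by the C++ compiler once the template is instantiated with a concrete type.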


I see. In Swift, for example, you'd have to define beforehand that T1 can be used with <=, while here I guess it adds the method for "each" T1 that can be used with it.

I assume this just duplicates the method for each type at compile time instead of trying to figure it out at runtime?

Damnit, now I wanna rewrite back to C++.


In C++ it just prints an error message and halts compilation if you try to use a type which doesn't support the semantics of the template function. It does not add functions which are not declared already.


That's what C++ concepts are for (maybe we'll see them in C++17).


It is all the magic of Clang or GCC. Once you call the template with T1, the C++ compiler enforces that the type you pass supports the <= operator.


Writing something like this for Python 2 is like throwing urine-filled water balloons at all the progressive developers working hard to get the Python community transitioned to Python 3.

Don't have enough reasons to stick with shitty old Python 2, well then here's another anchor for your boat!

Edit: The first pull request was for Python 3 support, hooray.


No need to phrase it so nastily.


You say nasty, I say colorful. No need to characterize it so prejudicially.



