Hacker News | caustic's comments

> This has some real promise. Congrats!

Thanks!

> Are you planning to parse PHP?

Sure, more languages are on the way! The problem with PHP, however, is that when it is used in an HTML page as a template language, you can't get much from it. On the other hand, when it is a simple class file with no HTML markup, things should work as with any other language.


When it's used as an HTML page with interspersed PHP, you just convert the HTML to strings. Look at the phc parser (http://phpcompiler.org). You can definitely get good static analysis info (I wrote the phc static analyser).
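
A minimal sketch of that idea in Python (my own illustration, not phc's actual implementation; a real lexer also has to handle strings containing "?>", heredocs, short tags, and so on):

    import re

    # Lower a PHP template to "pure" PHP by turning the literal HTML chunks
    # into echo statements, so a conventional parser sees only PHP code.
    def lower_template(source):
        statements = []
        pos = 0
        for m in re.finditer(r'<\?php(.*?)\?>', source, re.DOTALL):
            html = source[pos:m.start()]
            if html:
                statements.append('echo %r;' % html)   # HTML becomes a string literal
            statements.append(m.group(1).strip())      # PHP code is kept as-is
            pos = m.end()
        if source[pos:]:
            statements.append('echo %r;' % source[pos:])
        return '\n'.join(statements)

    print(lower_template('<ul><?php foreach ($xs as $x) { ?><li><?php echo $x; ?></li><?php } ?></ul>'))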


> It will probably be pretty hard to apply that to C (cpp) and Lisp (macros, and especially the programmable readers in some Lisps).

C is really hard to parse because of the preprocessor. We built a prototype parser for it, and it mostly works, but it's not yet ready for prime time.

Lisp must be a much simpler language to parse, though.

> scanning for API changes between two versions of a framework

Yes, we have this idea on the to-do list! Imagine comparing two different branches of development, or two different tagged releases, and seeing the high-level API changes between them.
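
As a toy sketch of what that report could look like, assuming the parsers have already reduced each release to a flat map of public signatures (the names below are made up):

    # {"qualified.name": "signature"} per tagged release, as a parser might emit.
    old_api = {
        "http.Client.get":  "get(url)",
        "http.Client.post": "post(url, body)",
    }
    new_api = {
        "http.Client.get":    "get(url, timeout=None)",   # signature changed
        "http.Client.delete": "delete(url)",              # added
    }

    added   = new_api.keys() - old_api.keys()
    removed = old_api.keys() - new_api.keys()
    changed = {k for k in old_api.keys() & new_api.keys() if old_api[k] != new_api[k]}

    for name in sorted(added):   print("added:  ", name)
    for name in sorted(removed): print("removed:", name)
    for name in sorted(changed): print("changed:", name, old_api[name], "->", new_api[name])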


> Question: For every language, are you writing a full-blown language parser to get the semantic information you need?

That is correct. The structural diff algorithm is generic: it operates on language-neutral feature trees, and those are built with the language-specific parsers.
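
Here is a rough guess at what a language-neutral feature tree and a generic structural diff over it could look like; this is only my sketch of the concept, not their actual data model:

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        kind: str                      # "class", "method", "field", ... (language-neutral)
        name: str
        children: list = field(default_factory=list)

    def diff(old, new, path=""):
        """Recursively compare two feature trees, matching children by (kind, name)."""
        here = f"{path}/{old.kind}:{old.name}"
        old_kids = {(c.kind, c.name): c for c in old.children}
        new_kids = {(c.kind, c.name): c for c in new.children}
        for kind, name in new_kids.keys() - old_kids.keys():
            yield ("added", f"{here}/{kind}:{name}")
        for kind, name in old_kids.keys() - new_kids.keys():
            yield ("removed", f"{here}/{kind}:{name}")
        for key in old_kids.keys() & new_kids.keys():
            yield from diff(old_kids[key], new_kids[key], here)

    # Two versions of the same file, already parsed into feature trees.
    v1 = Node("file", "User.java", [Node("class", "User", [Node("method", "getName")])])
    v2 = Node("file", "User.java", [Node("class", "User", [Node("method", "getName"),
                                                           Node("method", "getEmail")])])
    for change in diff(v1, v2):
        print(change)   # ('added', '/file:User.java/class:User/method:getEmail')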

> Can I hook in new parsers to add language support?

Writing a parser is not an easy job, but someday we hope to open an API for developers to write custom plugins.


This alone is awesome.

I've had the opportunity to write a small interpreter on the job before. Trust me: I know it is no small task to bite off something like this! Tip of the hat to you.

I'm currently stuck in the .NET world, and one of the things I've been working on lately is a program that fixes certain aspects of a medium-to-large code base, done via semantic parsing of the code base so that I know the changes are type-safe.

In C# there are open libraries like NRefactory, of course Mono itself, and in the Microsoft world they are working on their own compiler-as-a-service product named Roslyn. Do you think any of these efforts would help an effort such as yours?

I'm asking not so much for the C# stuff, but because I feel momentum building that could enable things like a live coding environment (my code is executing as soon as I write it), where the debugger is the same as my production runtime (not sort of the same, but really the same). I wish up-and-coming languages would tackle this stuff head-on today.


> Do you think any of these efforts would even help an effort such as your are doing?

If anyone else is going to make something similar for .NET, then yes.

But we are going to build our own custom C# language parser.


Yep, there's such a thing for C#, just check this: http://plasticscm.com/sm/index.html


Or apply Bayes' theorem to a real-world problem: http://yudkowsky.net/rational/bayes/
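
To make it concrete, here is Bayes' theorem applied to the kind of diagnostic-test problem that essay walks through (a quick sketch; the figures are the commonly quoted ones, from memory, so treat them as illustrative):

    # 1% of women have breast cancer, the test catches 80% of cancers,
    # and it gives a false positive 9.6% of the time.
    # Given a positive test, what is the probability of cancer?
    p_cancer            = 0.01
    p_pos_given_cancer  = 0.80
    p_pos_given_healthy = 0.096

    p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)
    p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos

    print(round(p_cancer_given_pos, 3))   # ~0.078 -- far lower than most people guess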



Wow, Scott Aaronson's blog must be really good to add it to a list that already includes it!


> Pathological love for Java and anything resembling Java.

And I think that's a good thing. Over the years I finally realized that it's not the programming language you use that matters, or that makes you look smarter; it's the kind of problem you are trying to solve by writing code. The field of computer science is far, far richer than the PL research subfield.

I mean, you may be writing another boring enterprise web application in Haskell, or solving an artificial intelligence problem in Visual Basic. I would prefer the latter to the former, although I hate VB. I know both situations are contrived; this is just a thought experiment to illustrate my point.

Don't get me wrong, I love "esoteric" programming languages. A few years ago I spent quite a lot of time playing with Haskell, Prolog, Lisp, etc., and I don't regret it. No, I don't use them on a daily basis, nor am I going to, but I learned a hell of a lot of cool new stuff. Most importantly, it taught me about new _paradigms_ of programming, which I really think every programmer should understand.

These days, however, I try to squeeze as much math and as many algorithms from different domains as I can into my poor stupid head, and I think the payoff from that will be much bigger for me.

PS

A few months ago, at a local functional programming meetup, some guys presented their Scala solution to the trivial problem of validating web forms in a rather trivial web application. Their solution employed a whole lot of functional machinery: functors, mappings, and other things whose names I forget. I tried to understand what they were doing, but they lost me after the fifth minute of the presentation. It took them maybe a week to write all that code. Do you see the irony?


For those who want to learn more about this kind of algorithm on strings, there is a great book named, unsurprisingly, "Algorithms on Strings" (http://www.amazon.com/Algorithms-Strings-Maxime-Crochemore/d...).

Although I must admit, jogojapan has written a really clear and thoughtful explanation of Ukkonen's algorithm on Stack Overflow, the best one I've ever read.


I've not read that textbook, but another excellent text is "Algorithms on Strings, Trees, and Sequences" by Dan Gusfield (http://www.amazon.ca/Algorithms-Strings-Trees-Sequences-Comp...). It has a strong bioinformatics leaning, so you learn all sorts of interesting near-real-world applications for the algorithms.

It starts off covering all the standard exact pattern matching algorithms, then moves on to suffix trees, then inexact matching, and finishes off with some advanced topics (that I have not read yet). Anyway, I'm really enjoying reading it and definitely recommend it.
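
To give a flavour of the "standard exact pattern matching" part, here is a short Knuth-Morris-Pratt sketch (one of the algorithms such books usually open with); just my own quick Python version:

    def kmp_search(text, pattern):
        """Return all indices where pattern occurs in text, in O(len(text) + len(pattern))."""
        # Failure function: length of the longest proper prefix of pattern[:i+1]
        # that is also a suffix of it.
        fail = [0] * len(pattern)
        k = 0
        for i in range(1, len(pattern)):
            while k and pattern[i] != pattern[k]:
                k = fail[k - 1]
            if pattern[i] == pattern[k]:
                k += 1
            fail[i] = k

        hits, k = [], 0
        for i, ch in enumerate(text):
            while k and ch != pattern[k]:
                k = fail[k - 1]
            if ch == pattern[k]:
                k += 1
            if k == len(pattern):          # full match ending at position i
                hits.append(i - k + 1)
                k = fail[k - 1]
        return hits

    print(kmp_search("abababca", "abab"))   # [0, 2]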


TiddlyWiki is a single self-contained, self-modifying HTML file that does not require any server whatsoever. Installing it is as simple as downloading the initial template file to your disk. Being a text file, it is searchable, mergeable, etc. Copy it to a USB stick or put it in a Dropbox folder and you've got your backups. But don't be fooled by the apparent simplicity: it is a full-blown wiki, with themes and plugins if you want them.


What happens if I have it open on two PCs at once, through the wonder of Dropbox? Does it cope?


No. I often have this problem just with two instances open accidentally, i.e. I left one minimized, forgot about it, then opened it again.


If you're using Chrome then keep it in a pinned tab. (Right click on tab, "Pin Tab")


Or Firefox. Firefox will also send you to the open tab if you try to go to a URL that's already open, instead of opening a second instance of the same page.


Last write wins. Although TiddlyWiki does detect when the file has changed on disk, there is not much you can do at that point. Before TiddlyWiki I used pmwiki on a server, and it has the ability to merge changes (via diff3). I miss that dearly in TiddlyWiki.
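
For what it's worth, that kind of three-way merge is easy to script around the same diff3 tool; a rough sketch, assuming GNU diff3 is installed and on the PATH:

    import os, subprocess, tempfile

    def merge3(mine, base, theirs):
        """Three-way merge of three text versions via `diff3 -m` (conflicts get markers)."""
        paths = []
        try:
            for text in (mine, base, theirs):
                f = tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt")
                f.write(text)
                f.close()
                paths.append(f.name)
            result = subprocess.run(["diff3", "-m", *paths], capture_output=True, text=True)
            return result.stdout
        finally:
            for p in paths:
                os.unlink(p)

    # Independent edits to different lines merge cleanly: a, B, C.
    print(merge3("a\nB\nc\n", "a\nb\nc\n", "a\nb\nC\n"))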


> Does it cope?

Probably not.


OK, it is a text file, a web page, but why such a complex interface? I find it very weird to have the content of paragraphs inserted just below the TOC on http://www.tiddlywiki.com/ (I am guessing that page showcases TiddlyWiki, right?)

Simple anchor links should work much better, and would not break the back button.


I put my TiddlyWiki files in a Dropbox folder. I think something as heavyweight as a git backend would be overkill for such a simple task.


I don't understand how git is a backend at all; I can manipulate all of the data (text files) without needing git at all. Compare with a wiki, which requires some sort of backend to work with any of its data.

When I use git in this sense, I'm talking mostly about version control and backup.


Heavyweight git backend? Care to expand on that?


You have to add, commit, and possibly merge just to add today's agenda or fix a simple typo. Compare that to how easy MediaWiki is.


I think I misinterpreted the parent; nonetheless: https://github.com/github/gollum


That's an interesting concept, but it seems they are not alone in this field. Just recently I stumbled upon a web page by Moshe Sipper, who seems to work on a very similar topic, "Darwinian Software Engineering":

    http://www.moshesipper.com/finch/
He makes a rather grandiose claim -- “We believe that in about fifty years' time it will be possible to program computers by means of evolution. Not merely possible but indeed prevalent.”

