Dulwich: Pure Python implementation of git

koenigdavidmj · on Aug 1, 2011

GitHub's hg-git plugin ( http://hg-git.github.com/ ) is built with this, and so manages to avoid a dependency on the git binaries.

dborowitz · on Aug 1, 2011

Incidentally, the second most frequent contributor to hg-git after schacon is Augie Fackler, one of the Google engineers who helped with Google Code's git implementation (and also a frequent hg hacker).

mvzink · on Aug 2, 2011

A few months ago, I was at a talk at Google Chicago with two of the original creators of Subversion (Ben Collins-Sussman and Brian Fitzpatrick). After profuse apologies, they said everyone in the room should switch from git to hg, and that people who still used git just didn't realize how much it sucks. Then I realized why Google released hg support before git support :P

dimatura · on Aug 2, 2011

What didn't they like about Git?

self · on Aug 2, 2011

At the time they evaluated hg and git, hg had better http protocol support.

thesz · on Aug 2, 2011

And that's all?

Security, speed, all of that doesn't outweigh http protocol support?

durin42 · on Aug 2, 2011

The http protocol was definitely the dealbreaker. That said, if you think hg is less secure than git you've bought into some FUD (the models are so similar it's almost silly, and the security quality is identical). The speed differences between hg and git aren't perceivable for the bulk of projects, even at sizes larger than most corporate repositories I've heard of. There are some operations that are faster for git (notably history rewriting), and others that are faster for hg (notably blame and per-file log). The two systems are very similar and just make some slightly different tradeoffs.

thesz · on Aug 2, 2011

As far as I can tell, all changesets in Git are hashed with SHA-1. The hash is also the ID for the changeset. You cannot change a changeset without modifying its SHA-1 sum. This design makes Git secure from tampering.

The IDs for Hg changesets are 48-bit numbers, like fb43b575b296. I do not think that this size is safe enough.

durin42 · on Aug 2, 2011

Mercurial prints the first 12 characters of the hexlified sha1 by default, but everything is recorded using the full sha1, and can be referenced as such. You can view the full sha1 in a number of ways; the easiest would be "hg log --limit 1 --debug".

koenigdavidmj · on Aug 2, 2011

Mercurial uses full-length SHA1 sums internally, same as git. It just prints the first 12 hex characters (48 bits) for user convenience, unless you happen to have two objects that share that prefix.
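
To make the point above concrete, here is a minimal Python sketch of content-addressed IDs and abbreviated display. It uses git's blob hashing (SHA-1 over a "blob <size>\0" header plus the content) because that format is simple and well documented; Mercurial computes its changeset hashes over different input, so treat this as an illustration of the idea, not of hg's internals. The helper names git_blob_id, short_id and resolve_prefix are made up for this example.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def short_id(full_id: str, length: int = 12) -> str:
    """Abbreviate a full hex ID for display (12 hex chars = 48 bits)."""
    return full_id[:length]


def resolve_prefix(prefix: str, known_ids: list[str]) -> str:
    """Map an abbreviated ID back to the one full ID it identifies."""
    # Hypothetical helper: a prefix is only usable if exactly one ID matches.
    matches = [full for full in known_ids if full.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError(f"{prefix!r} matches {len(matches)} IDs")
    return matches[0]


original = git_blob_id(b"hello world\n")
tampered = git_blob_id(b"hello world!\n")

print(original)            # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(short_id(original))  # 3b18e512dba7
print(original != tampered)  # any change to the content changes the ID
print(resolve_prefix(short_id(original), [original, tampered]) == original)
```

The short form is only a display and lookup convenience: the full 160-bit hash is what gets stored, and a prefix that matches more than one ID is rejected rather than guessed at.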

durin42 · on Aug 1, 2011

I'm also the actual maintainer of hg-git at this point, FWIW.

unshift · on Aug 1, 2011

if hg-git uses dulwich, and it's all pure python, how come i see it running git-index-pack amongst other git (what i assume are shell) commands?

durin42 · on Aug 1, 2011

Huh?

Are you pushing to a local repository? When pushing to a local repo, dulwich calls a local git binary. I don't see any calls to git-index-pack in either hg-git or dulwich in a cursory grep, which matches my memory.

unshift · on Aug 1, 2011

i'm pushing to github. reason i ask is that it segfaults -- see https://github.com/schacon/hg-git/issues/216

koenigdavidmj · on Aug 1, 2011

Something like that is not exactly core functionality. It could be a situation where hg-git runs purely in Python for everything that it 'needs' to do, and will optionally use the native commands for other git functionality.

simonw · on Aug 1, 2011

This is the library that Google used for Google Code's git support.

swombat · on Aug 1, 2011

Dulwich is a real town in the South of London, with a relatively large school called Dulwich College, where I studied. A bit spooked to see a library named after it...

brendn · on Aug 1, 2011

"Dulwich is the place where Mr. and Mrs. Git live in one of the Monty Python sketches." (http://pypi.python.org/pypi/dulwich)

arthurdenture · on Aug 1, 2011

Not only that, there's a bus that goes to Dulwich Library, which I was obligated to snap a picture of when I was visiting London.

glenjamin · on Aug 1, 2011

Does anyone know how the performance of this on Windows (maybe with PyPy?) compares to MSys Git or Cygwin?

If it's pretty good there should be some mileage in making a reasonable git client for windows based on this.

amethyst · on Aug 2, 2011

That's exactly what I was thinking of when seeing this. It could prove fruitful for getting a "Tortoise" interface for git that's both easy to maintain and doesn't rely on clunky bits like msysgit.

cincinnatus · on Aug 2, 2011

How about an IronPython-based implementation? Then it could be 'native' code.

baq · on Aug 2, 2011

msysgit was fast enough for me in a 250+kloc project.

it was certainly faster than dog-slow g++.

uriel · on Aug 1, 2011

What i would like to see is an Hg implementation in Go.

andrewflnr · on Aug 2, 2011

Why, exactly? It sounds like an oddly interesting idea to me, too, but I'm not sure why and I'd like to hear your reasons.

thristian · on Aug 2, 2011

The recently-announced fork of the Plan 9 operating system, 9front, keeps its source in a Mercurial repository, and includes a Go compiler in the base operating system (which makes sense, since the creators of Plan 9 are also pretty much the creators of Go). I guess a Mercurial port to Go would make life a lot easier there.

uriel · on Aug 2, 2011

Avoiding the overhead of Python's startup time alone would be nice. hg is impressively fast for being written in Python, but it could be much faster and more memory-efficient in a compiled language where you have control over memory layout.

And that is without going into the potential of perhaps taking advantage of concurrency and parallelization for some things.
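
For a rough feel of the startup overhead mentioned above, here is a small sketch that times how long it takes to launch the Python interpreter and exit immediately. It measures a bare interpreter, not hg itself, and the number includes process-spawn cost, so treat it as a lower bound on what any Python command-line tool pays per invocation. The function name interpreter_startup_seconds is made up for this example.

```python
import subprocess
import sys
import time


def interpreter_startup_seconds(runs: int = 5) -> float:
    """Return the best wall-clock time to start the interpreter and exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Launch the same interpreter that is running this script and
        # execute an empty program, so only startup and teardown are timed.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"startup: {interpreter_startup_seconds() * 1000:.1f} ms")
```

Taking the minimum over a few runs filters out noise from disk caching and other processes; the result varies a lot between machines and Python versions.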

pointyhat · on Aug 2, 2011

I'm actually considering doing this now that you've said it, if there would be enough interest, that is. There are very few complete "non-python" hg implementations.

durin42 · on Aug 2, 2011

I started tinkering with an implementation of revlog on the plane back from OSCON. If I get anywhere useful I'll post the sources somewhere.

pointyhat · on Aug 2, 2011

Would be interested in that. Can you post back here if you do stick them somewhere?

durin42 · on Aug 2, 2011

I'll try my best to remember!

sitkack · on Aug 1, 2011

I don't know why this makes me so happy.

morphir · on Aug 2, 2011

python is like china: making a shameless copy for themselves to use.