
This really speaks to the reliability of Git.

Are there any examples of projects with 1kk+ commits that use SVN, Mercurial, Perforce, or some other SCM?




Mercurial was used at Facebook afaik, and I would guess they ended up exceeding 1 million.


When I left, the diff number was in the 15 million range. Not all diffs are landed, but I would assume >60% are, so FB's repo is almost certainly above 10M commits
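
A quick back-of-the-envelope check of that estimate, as a rough sketch. The 15 million and 60% figures are the ones quoted above; the function name is made up for illustration, and the real landing rate is not known.

```python
# Rough estimate only: figures are taken from the comment above,
# and the helper name is hypothetical.
def estimated_commits(total_diffs: int, landed_fraction: float) -> int:
    """Estimate landed commits from a diff count and a landing rate."""
    return round(total_diffs * landed_fraction)


diffs = 15_000_000                      # "diff number was in the 15 million range"
low = estimated_commits(diffs, 0.60)    # commits if exactly 60% of diffs landed
needed = 10_000_000 / diffs             # landing rate required to reach 10M commits

print(low)
print(f"{needed:.1%}")
```

So 60% landed gives about 9M commits, and getting past 10M needs roughly two thirds of all diffs to land, which is still consistent with ">60%".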


Sheeeit. At some point, just easier to archive the thing and start with a fresh import.


But you want to have history.


And Google uses a hacked up Perforce.


Nothing hacked about it. They rewrote it completely, keeping just the interface for compatibility. Perforce scales very well but still has a single server at its core - at some point no matter how much money google threw at that machine (it used to be the beefiest single server they had), it just couldn't keep up.


Yes. Sorry, I was a bit sloppy in my expression.

It started out as vanilla, then hacked up, and then the re-write.

I think Facebook's mercurial was/is also a special edition? Or is everything they do upstreamed?


Apache had a single SVN repository for all projects in the past. That reached 1889412 commits.

https://svn.apache.org/viewvc


http://hg.mozilla.org/try appears to have over 3M commits, and probably in excess of 100k heads (effectively git branches, although I don't think git has any proper term for a commit with no children that isn't referred to by a branch).

Strictly speaking, it's not actually the main project repository (which has closer to 600k commits), but the repository that contains what is effectively all of the pull requests for the past several years (more specifically, all the changes you want to test in automation).

The closed-source monorepos of Google (perforce IIRC), Facebook (Mercurial), and Microsoft (Git) are all going to be far larger than any open-source repository, of which Linux is in the largest size class but not the largest (I believe Chromium's the largest open-source repo I've found).
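
For anyone unfamiliar with "heads" in the sense used above (commits with no children), here is a minimal sketch of how they could be found in a commit graph. The toy history and the `find_heads` helper are hypothetical, not Mercurial's or git's actual implementation.

```python
from typing import Dict, List, Set


def find_heads(parents: Dict[str, List[str]]) -> Set[str]:
    """Return the commits that no other commit lists as a parent."""
    has_child = {p for ps in parents.values() for p in ps}
    return set(parents) - has_child


# Hypothetical toy history: a <- b <- c, plus a side change d also based on b.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
heads = find_heads(history)
print(sorted(heads))
```

In this toy graph both `c` and `d` are heads, whether or not any branch name points at them.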


> although I don't think git has any proper term for a commit with no children that isn't referred to by a branch

I think this would be one case of a “detached head”.


Google mimics the Perforce command line. The backend is 100% proprietary.

Microsoft is based on Git, but with a lot of engineering on top of it: https://devblogs.microsoft.com/bharry/scaling-git-and-some-b...


Google announced they had 35 million commits to their monorepo, five years ago.


Do they have a quick response team to incarcerate newbies who commit binaries to their giant monster?


No, because piper doesn’t care how big your files are, and devs don’t ever need to pull or clone the repo locally.

When they were still using actual Perforce there was a team who would browbeat people who had more than a hundred clients. That is the only time I can remember running up against a limit of the SCM.


That's a non-issue for the scalable version control systems they use: Perforce, and now its in-house replacement (Piper).

Gamedev companies LOVE perforce because it scales, they keep game assets in it and those can be huge.


By "scalable" I assume you mean "centralized". As in the repository is hosted on a single machine, so you only have to worry about one machine meeting the hardware requirements for storing all that data. That scales better with repo size, but it scales worse with the size of your development team. I'm sure the Linux kernel has orders of magnitude more developers than a typical video game or engine.


> By "scalable" I assume you mean "centralized".

Yes, at scale it has to be. Google has hundreds of terabytes of data in their monorepo, you can't check it all out! Historically, centralized used to be the norm: previous popular VCS generations (CVS, Subversion) are all like that. DVCS (git & co) came into dominance only in the last decade or so.

> That scales better with repo size, but it scales worse with the size of your development team.

Google switched to Piper around 2014 when they had ~50k employees. The Perforce monorepo worked pretty well for them until then. Scaling a monorepo that far has certain costs: it takes a lot of investment in tooling to make it all work, and it needs dedicated teams. But it can be scaled. And it offers certain benefits that are very difficult to get in multi-repo setups; the ability to reason about and refactor code across the entire repo is the biggest one.

> I'm sure the Linux kernel has orders of magnitude more developers than a typical video game or engine

Linux kernel development is very different: it is decentralized across many different companies and ICs, hence the need for a DVCS like git. In a corporate environment like Google or gamedev, it is much easier to keep version control centralized and dedicate a team to maintaining it.


Right. I just wanted to point out that centralized and distributed version control are both designed to scale, just in different ways.


They have extensive code review.


Epic’s Unreal Perforce repo is >1.5 million at this point.


>1kk

That's going in the ol' geek toolbox



Oh, I know. That one is for sillies


Epic Games' p4 depot has well over 1mm changelists. Many of those numbers are taken up by developer changes that never get submitted, and many are automated merges though


Isn't OpenBSD at the 500k'ish mark using CVS?


I thought I read that Linux jettisons old history every few years for the sake of practicality, and that if you want the full history you have to look at special archive repos. Am I wrong? I wouldn't blame them; git is fast, but it's not that fast, and cloning becomes a bear after only a few hundred thousand commits (and I would be surprised if that's the only operation that scales poorly).


No they don't do that. Only happened once when moving to git: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


That link is not responding for me, but here's a mirror: https://github.com/torvalds/linux/commit/1da177e4c3f41524e8...



