What I don't understand is how they accomplish larger collaborative changes. The paper says:
"Almost all development occurs at the 'head' of the repository, not on branches."
Googler Rachel Potvin made an even stronger statement in her presentation about "The Motivation for a Monolithic Codebase" [1]:
"Branching for development at Google is exceedingly rare [..]"
In the related ACM paper she published with Josh Levenberg there is the statement that:
"Development on branches is unusual and not well supported at Google, though branches are typically used for releases."
In my world, when we have to make a bigger change, we create a branch and only merge it into the trunk when it is good enough to be integrated. The branch enables us to work on that change together.
I don't understand how they do this at Google. As far as I understand, in their model they either have to
- give up on collaboration and always have just a single developer work on a change.
- share code by other means.
- check in unfinished work to the trunk for collaboration and constantly break trunk.
Unfinished work is not typically checked into master (and master is certainly not regularly broken).
What is more common is that very large changes are checked in as a series of individually compatible changes, and often broken up across the repository (there are of course tools to help with this). It's relatively rare for multiple developers to work on a single changelist; it's much more common to break the work into separate changelists.
Haven't worked there for some years now, so I'm a bit rusty on some of the details.
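As a rough, hypothetical illustration of the "series of individually compatible changes" pattern mentioned above (a Python sketch, not Google's actual code or tooling), a wide-reaching rename might land like this:

```python
# Hypothetical example: renaming a widely used helper without a feature branch.
# Each "change" below would be its own small changelist that builds and passes
# tests on its own, so head never breaks.

# Change 1: introduce the new name as a thin wrapper; nothing else moves yet.
def compute_total(items):
    """New, preferred name."""
    return compute_sum(items)

def compute_sum(items):
    """Old name; kept until all callers have migrated."""
    return sum(items)

# Changes 2..N: callers are migrated from compute_sum() to compute_total(),
# a few files per changelist, possibly split across teams and directories.

# Final change: once no callers remain, compute_sum() is deleted.

if __name__ == "__main__":
    print(compute_total([1, 2, 3]))  # 6; behavior stays identical throughout the migration
```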
It is amazing to see how we declare something too obvious or natural to question. Like trunk-based development: so obvious that questioning it makes one a fool :-)
Git and its model were the best thing a few years back. Now, since Google is doing all its development on the main trunk/master, it must be correct and more intelligent.
Wouldn't it be the case that they went with what they had at the time and continue to use it because everyone is used to it and it still works? I'm not sure Google analysed whether branching was bad and then chose trunk-based development.
I cannot understand how a company that has a well-defined process built around branches is doing it wrong, or how that is so suboptimal. I guess it is a matter of processes and culture. None of the great companies are great because their source control strategy (or code) was excellent.
We developers always over-analyse everything and come up with excellent logic, and some of us are more gifted with words than others.
You'll have to remember that all of those big companies have their tools and processes customized for their scale.
Example:
Instead of branching you would just create a `changelist` (a commit, a set of changes to files) and work on that.
You can show it to your colleagues. You can build and test it. You can send the id to anyone to have a look at it, or test it themselves.
You can have multiple changelists depending on each other, without being committed.
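To illustrate the concept only (this is a hypothetical Python sketch, not any real internal tool's API): a changelist can be thought of as a small, shareable unit of pending work that can depend on another pending changelist.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Changelist:
    """Hypothetical model of a pending change; real tooling is far richer."""
    cl_id: int
    description: str
    files: dict[str, str]          # path -> proposed new contents
    depends_on: int | None = None  # another pending CL this one builds on
    reviewers: list[str] = field(default_factory=list)

# Teammate A uploads the base change; teammate B stacks a follow-up on top of it.
cl_1001 = Changelist(1001, "Add parser for the new config format",
                     {"config/parser.py": "..."})
cl_1002 = Changelist(1002, "Use the new parser in the frontend",
                     {"frontend/app.py": "..."},
                     depends_on=cl_1001.cl_id)

# Anyone with the id can review, build, or test the pending change
# before either of them is submitted to head.
print(cl_1002.cl_id, "depends on", cl_1002.depends_on)
```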
You can use git forks or whatever for development. This philosophy just says that you only push to production environment from one standard head trunk.
"Google use Perforce for their trunk (with additional tooling), and many (but not all) developers use Git on their local workstation to gain local-branching with an inhouse developed bridge for interop with Perforce.
"Branches & Merge Pain
"TL;DR: the same
"They don’t have merge pain, because as a rule developers are not merging to/from branches. At least up to the central repo’s server they are not. On workstations, developers may be merging to/from local branches, and rebasing when the push something that’s “done” back to the central repo.
"Release engineers might cherry-pick defect fixes from time to time, but regular developers are not merging (you should not count to-working-copy merges)"
I agree that feature flags can be a solution sometimes.
The presentation and the paper I linked to in my question discuss this, but they also mention large-scale refactorings, and this is where I don't see how feature flags can help.
For example: How do they untangle a wad of code that is large enough that it takes longer than a few days and more than a single developer to get the code back into a state that is acceptable for trunk?
The changes required for this kind of refactoring can be all over the place, regardless of any organizational boundaries in your code. I can't see how changes of this nature can be put behind feature flags.
There’s a wealth of reading available to you if you look up “trunk-based development” as a keyword. Likewise with “continuous integration” (the actual practice, not the build tooling). Jez Humble for instance has written extensively on this.
You develop features behind compile-time or runtime flags and keep them off until the feature is ready to ship. This is what chromium.org does, so that might be a more accessible way to see it in practice.
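As a minimal sketch of the runtime-flag approach (hypothetical flag and functions, not Chromium's actual flag machinery):

```python
# New behavior lands on trunk but stays dark behind a default-off flag.
import argparse

def old_checkout_flow(cart):
    return sum(item["price"] for item in cart)

def new_checkout_flow(cart):
    # Still-in-development logic; safe to commit because nothing enables it yet.
    return sum(item["price"] * (1 - item.get("discount", 0)) for item in cart)

def checkout(cart, enable_new_flow):
    return new_checkout_flow(cart) if enable_new_flow else old_checkout_flow(cart)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-new-checkout", action="store_true",
                        help="Off by default; flip once the feature is ready to ship.")
    args = parser.parse_args()
    print(checkout([{"price": 10.0, "discount": 0.1}], args.enable_new_checkout))
```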
You don't have to check in unfinished work; you can break the work into parts and have people work on independent parts, every one of which moves progress forward incrementally.
Well, you could actually call that "unfinished" because in the beginning the code doesn't accomplish the task, but progressively it will become more useful.
You can when you can but you can't when you can't.
I 100% agree with you that we should work this way whenever possible and we should work hard to keep our code in a state that lets us cleanly divide work.
In my experience it is not always possible to split up work that way. Think of untangling dependencies of a larger part of the code as an example.
Sometimes you think you can't because you either haven't learned the right tricks, or because it takes more effort, and so you opt to fork off your work into a separate long-standing branch in order to optimize your development velocity, at the expense of possibly surprising costs during the merge (especially if other people make the same choice).
Other times it's genuinely necessary to make a long-standing branch. In those cases, you just do it. Trunk-based development should not be a dogma, just a different default choice.
I only scanned through it, but it seems similar to the de facto way of doing things before distributed version control systems became popular (in the late 2000s?).
I think you misunderstood what that link is arguing for. It's basically GitHub flow with tagging of what you release; it just goes into a bit more detail, discusses alternatives, and suffers a bit from too much information.
The idea is you have a constantly usable master, and your branches should be short-lived so you don't hit a brick wall trying to get reviews and merges on your massive change sets.
Ultimately it means you want to test and review your change before it goes into master, as opposed to creating "production", "staging" and "develop" branches, which largely just kick the can down the road and are a different way to solve the "what's deployed where" issue.
None of the other replies try to explain specifics of how this works, so let me illustrate an example of two teams collaborating to add Feature X to the monorepo without branching:
1) Team A checks in their code to provide Feature X. Their code is not used anywhere in the codebase yet, however full unit test coverage exists for the public API; this is required for code review.
2) Team B checks in their code to turn on Feature X in their product, gated under a command-line flag which by default uses the old behavior.
3) Team B checks in an integration test that flips the flag and makes sure everything works as planned (steps 2 and 3 are sketched in code after this list).
4) If Team B requires changes to Feature X to get expected behavior, they communicate those changes to Team A and someone from either team (using available human resources) makes the changes.
5) Team B checks in a small change to flip the flag by default.
6) Team B monitors their product. If things go awry, only the very latest change is reverted and repeat (4).
7) Once stability is achieved, Team B checks in a change to remove the flag.
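A minimal sketch of steps 2 and 3, assuming a hypothetical flag mechanism and product code (not Google's actual flag or test infrastructure):

```python
import unittest

FLAGS = {"use_feature_x": False}  # step 2: checked in with the old behavior as default

def feature_x(data):
    return data.upper()           # Team A's new code path

def legacy_handler(data):
    return data                   # existing behavior, untouched

def handle_request(data):
    if FLAGS["use_feature_x"]:
        return feature_x(data)
    return legacy_handler(data)

class FeatureXIntegrationTest(unittest.TestCase):
    def test_feature_x_enabled(self):
        # Step 3: flip the flag in the test only; the production default stays off
        # until step 5, when a tiny change flips it for real.
        FLAGS["use_feature_x"] = True
        try:
            self.assertEqual(handle_request("abc"), "ABC")
        finally:
            FLAGS["use_feature_x"] = False

if __name__ == "__main__":
    unittest.main()
```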
I've been at FB for a few years now (similar style), and this model of a monolithic repo, no branching, and simply submitting 'diffs' (changesets / patches), which get merged directly into master after the 'diff' is accepted and you land it, seems much easier to me.
Maybe it's just because I got used to it, but now whenever I have to touch it I find the branch-based development confusing.
It might be a practical thing. I've heard from a Googler (a couple of years back) that getting changes in can take ages, and by the time the change lands, there's a good chance that there are merge conflicts, and the cycle starts over. Branches would make this even more painful.
Depends on the size of the change. Small changes are preferred in most trunk-based-dev companies.
FYI:
There are (multiple) tools in Facebook and Google which are an abstraction on top of their VCS.
(e.g. ones that feel more like git, where you can work on a stream of changes that depend on each other without actually pushing anything to head)
Can you clarify why branching helps collaboration? In other words, why is it harder to commit to trunk when you have several developers working on a feature?
Branching enables me to share unfinished work with my collaborators. Sometimes I don't want to commit to trunk yet but still share code and collaborate on a part of the code base.
If they're using Perforce, one can "Shelve" a CL (changelist, similar to a commit in git) to make it available for others to unshelve. This can be used as a workaround, albeit a limited one, to share work-in-progress stuff.
That's what I meant in my question with "share code by other means". It works but in my opinion it is a large pain and I can't believe people at Google work by sending patches back and forth.
It's not like we (I am a Googler) email patch files around. Everything is integrated into the system. You create a CL (change list), it automatically gets a number. People can review it, test it, or fork it (make a new CL using your CL as a starting point) as much as they want, all from that CL number.
Think of a CL like a Pull Request that has (and can only have) a single commit.
It's visible in code review UI, has a description, has tests run on it, it can be merged by other people and it can be referenced from anywhere. Eventually it's merged into the head or dropped.
Technically yes, but people don't use it or think about it this way. Changelists are supposed to be small, a couple of hundred changed lines at most. You don't develop a complete feature of thousands of lines in a single CL; that would be insanely hard to review. What happens is that work gets split into small chunks, and each one is submitted separately, not to a feature branch, but straight to head.
This sounds identical to our workflow with git for all practical purposes. 1) New story gets a branch. 2) Branch gets squashed and rebased on most recent master before PR. PRs are generally under 1,000 lines changed. 3) PR is merged to master after code review.
It's missing a lot of what people expect in git branching, like history within the branch and arbitrary digraphs for forking and merging.
If every branch was always merged back into head before doing anything else, always had its commits flattened into one, and someone forking off of your branch was basically opening it up, copying the changes to your clipboard, and pasting them into a new branch with no attribution or history, then sure.
No, it's more like a patch. With a branch, you drag all the dependent changes along with you, while with a patch you have only the actual change, plus information about which CL -- or PR, in GitHub terms -- it depends on.
With branches, if someone updates the branch you depend on, your work is based on stale stuff, and it can get ugly. Just try to do it on GitHub :-)
Thank you, I appreciate your effort to help me understand this better and our exchange helped me to make progress.
One thing I infer from your answer is that it seems that there is an established process and dedicated tooling for working with patches at Google. I think a lot of my pain with patches stems more from the lack of process and lack of an agreement on formats and standards in my environment than from the use of patches per se.
Where I still see an advantage of branches is that they facilitate documentation of what has been done, by whom, and when. All of that documentation is in the same place and form as the documentation of changes on the trunk: it is all in commit messages, whereas patches are only documented somewhere else, possibly in the email or IM used to send the patch. Even if most of the branch documentation does not survive on trunk when we squash the final merge, it is still there and easy to find as long as the branch doesn't get deleted. When I want to look up why I applied a certain patch, I'll have to dig through my messages. I think that makes it harder to work with patches than with branches.
This allows you to share work without (in Git terms) pushing to master. Branches in Perforce-like systems tend to be more heavyweight and permanent (IIRC you have to branch an entire path of files, it is not the same as the Git concept of "branch" which is just a commit that points to another parent commit).
You can think of the system as enabling you, in Git terms, to create pull requests without the creation of an underlying branch.
A "patch" in Google/Facebook/Twitter is the same as a commit.
It has a (mostly) descriptive commit message, references to bug tickets and might contain links to documentation, screenshots and mocks.
You basically work on a "patch" (changelist), get feedback from others and send it out for review at the end.
Before you can submit (commit) it, you'll have to sync to "head" (to have the latest changes) and run all tests.
^ most of this happens automatically, and as most changelists ("patches") are small, this happens very fast and async in the background.
FWIW coreboot is an open-source project that uses a similar style, where you need to upload your change to the review tool (https://review.coreboot.org/, which is using gerrit https://www.gerritcodereview.com/) and people comment and LGTM in there and then it gets committed to the master branch once everything looks good.
Your changes aren't sitting on your machine. They are hosted on a server or in a git fork. After test and code review, you merge to head before deploying to production or other people make follow-on changes.
This should always be Plan A but in my experience it is not always possible. Think of untangling dependencies between a large number of components as an example.
Even git or hg branches are horrible. Once you have multiple people working on the same codebase and touching the same files it is pretty horrid to manage. I know several companies not using branches because the merge conflict resolution takes too much time.
The PDF explicitly calls out the time consuming part:
"Almost all development occurs at the “head” of the repository, not on branches. This helps identify integration problems early and minimizes the amount of merging work needed. It also makes it much easier and faster to push out security fixes."
Rebased branches in git are nice (for one developer only, unfortunately); there is some pain of course, but when the rebase is performed often (once a day) it doesn't consume much time. The real pain begins when some huge commit is pushed to HEAD, but even that is manageable.
Anyway, I feel sad that so much effort was put into really nice VCS concepts and almost no one uses them in enterprise development.
Except it's now highly scalable, which takes care of branching performance. There is nothing technically making it difficult to branch, AFAIK. In fact, Rapid (grape) used it pretty heavily to track rollouts, if I remember correctly.
They are grouped by linking them to issues in the issue tracker. All commits will then get a link to the issue and the issue gets a link back to the commit. This way you can easily track and read the full context of old changes.
If 2+ people are working on the same file, which might result in a conflict, you can either:
- handle the conflict as soon as you merge your branches sometime in the future
or
- handle it when trying to commit your change to head
The only difference is whether you handle the conflict now or in the future.
Git is actually pretty good at automatically resolving conflicts within files; unless you edit the same lines, it’s easy. If you do edit the same lines, merging is pretty straightforward.
This whole conversation the last day or two on HN has been kind of nuts. Like everybody agrees you shouldn’t put all your code in a single file, right? Why not? It would let everyone see all of the source code in one place! But it would be huge and hard to avoid conflicts. So we split things into files. Then “trees”, etc...
Basically it sounds like Google's monorepo is really a bunch of repos glued together, with changes in one triggering changes in others. The difference, it seems, is that Google does not get to benefit from the things open-source developers like about git. It's like Google developed custom versions of GitHub, CircleCI, and other tools and is marketing that as a better solution (just build several billion-dollar solutions to manage your monorepo!).
And even after all that, Google has a bunch of separate repos for important open-source or secret work.
Both contributors created a pull request and submitted it. In the description, they both state that the new value should be the one they put in. How would you resolve this issue in a timely fashion, making sure you do not take down a service accidentally and do not slow down development too much? I intentionally gave you a very simple example, but if you want we can go into rolling out new features, fixing security bugs, and a lot more where such issues arise. And no, git will never be able to solve these issues.
I don't think HN is going nuts (except for a few zealots); these problems come from the nature of software development in general. We have seen how Google solves them (monorepo, custom CI/CD, etc.) and there are other companies solving them in different ways (maybe with a branching model, using GitHub). People are just sharing their experience here, along with the solutions they perceive based on that experience and their level of understanding.
What does Google do? You're working on a line of code and the trunk changes. Your local copy no longer aligns with it. You have a conflict.
Someone's changes get committed first. That's a business decision, not a code-tooling one. The second PR has to adjust. Same on both mono- and poly-repo, just using different words.
At least branches let you have the choice, which cannot be said for branchless.
"Almost all development occurs at the 'head' of the repository, not on branches."
Googler Rachel Potvin made an even stronger statement in her presentation about "The Motivation for a Monolithic Codebase" [1]:
"Branching for development at Google is exceedingly rare [..]"
In the related ACM paper she published with Josh Levenberg there is the statement that:
"Development on branches is unusual and not well supported at Google, though branches are typically used for releases."
I my world when we have to make a bigger change we create a branch and only merge it into the trunk when it is good enough to be integrated. The branch enables us to work on that change together. I don't understand how they do this at google. As far as I understand in their model they either have to
- give up on collaboration and always have just a single developer work on a change.
- share code by other means.
- check in unfinished work to the trunk for collaboration and constantly break trunk.
[1] https://youtu.be/W71BTkUbdqE?t=904
[2] https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...