Merge vs Rebase: Do They Produce the Same Result?
I get asked quite a lot whether I recommend a `merge`-based workflow, or one
where people `rebase` onto `master`. But to be quite honest, I couldn't
possibly care less. Your workflow is your workflow, after all; it's up
to your team to work in the way that's most productive for you.
For some teams that's merging, for some teams that's rebasing...
In the end, the code gets integrated and the end result is the same either
way, whether you merge or rebase it, right?
Right?
If you're a `rebase` fan, you've probably run into cases where you get
conflicts during a `rebase` that you wouldn't get during a `merge`. But
that's not very interesting... is there a case where `merge` and `rebase`
both finish and produce a result, but a different tree?
Is `git-merge` guaranteed to produce the same results as `git-rebase`?
No!
It's actually not a guarantee; in fact, you can create two branches that
`merge` differently than they `rebase`. If you want to think about this
on your own first, stop here before reading on. 🤔 The details follow.
You can follow along with this GitHub repository.
Imagine that you have two branches: one is `master`, and the other is
the unimaginatively named `branch` branch. They're both based off a
common ancestor, `0d7088f`. Further, imagine that `branch` has
two commits based off that common ancestor:
Ancestor 0d7088f | branch 3f3ca4f | branch 09d3ac4 |
---|---|---|
One | One | One |
Two | 2 | Two |
Three | Three | Three |
Four | Four | Four |
Five | Five | Five |
Six | Six | Six |
Seven | Seven | 7 |
Eight | Eight | Eight |
Finally, imagine that your `master` branch has a single commit based
off the common ancestor:
Ancestor 0d7088f | master f2e864b |
---|---|
One | One |
Two | 2 |
Three | Three |
Four | Four |
Five | Five |
Six | Six |
Seven | Seven |
Eight | Eight |
What happens when you try to `merge` or `rebase` these?
When Git merges two branches, it only looks at the tip commit in each
branch, and compares them to their common ancestor. It does not look at
any intermediate commits. In the above example, when we merge `branch`
into `master`, the algorithm looks at the changes made in `branch` by
comparing commit `09d3ac4` to the common ancestor commit `0d7088f`.
It also looks at the changes made in `master` by comparing commit
`f2e864b` to the common ancestor commit.
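To make "comparing a tip to the common ancestor" concrete, here is a rough Python sketch. It models each version of the file as a list of lines and compares them position by position, which is a simplification of what Git really does; the names `changed_lines`, `ANCESTOR`, `BRANCH_FIRST`, `BRANCH_TIP` and `MASTER_TIP` are made up for this illustration.

```python
# File contents from the example, one list entry per line.
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]      # 0d7088f
BRANCH_FIRST = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 3f3ca4f
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]        # 09d3ac4
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]      # f2e864b


def changed_lines(base, tip):
    """Return the 1-based numbers of the lines that differ between base and tip."""
    return [i + 1 for i, (b, t) in enumerate(zip(base, tip)) if b != t]


# What a merge "sees": each tip compared to the common ancestor.
print(changed_lines(ANCESTOR, BRANCH_TIP))    # [7]
print(changed_lines(ANCESTOR, MASTER_TIP))    # [2]

# The intermediate commit did touch line 2, but a merge never looks at it.
print(changed_lines(ANCESTOR, BRANCH_FIRST))  # [2]
```

From the tips alone, there is no trace that `branch` ever touched line 2.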
The merge algorithm walks through each line[1] in the common ancestor, comparing
it to the file in `branch` and the file in `master`. If the line is
unchanged in both branches, then there's no problem - that line is brought
into the merge result. In this example, line 1 is unchanged in both
branches, so line 1 of the merge result will be `One`.
If a line is changed in only one branch, then that change is brought
forward into the merge result. In this example, line 7 is changed only
in `branch`. So in the resulting merge, line 7 will have the contents from
`branch`, which is the digit `7`. Also, line 2 is changed only in
`master`, so in the merge result it will be the digit `2`.
Merge Result |
---|
One |
2 |
Three |
Four |
Five |
Six |
7 |
Eight |
Remember that merge only looks at the tip commits, so comparing the
common ancestor to `branch`, line two appears unchanged, since the
ancestor and the tip have identical contents on that line.
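The rules above can be sketched in a few lines of Python. This is a toy, line-by-line three-way merge under the same simplification the article uses (real Git works on hunks, not individual lines), and `three_way_merge` and the constant names are hypothetical helpers, not anything Git exposes.

```python
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 0d7088f
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # f2e864b
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]      # 09d3ac4


def three_way_merge(base, ours, theirs):
    """Merge line by line; raise ValueError if both sides changed a line differently."""
    merged = []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:
            merged.append(o)   # unchanged on both sides, or the same change on both
        elif o == b:
            merged.append(t)   # only "theirs" changed this line
        elif t == b:
            merged.append(o)   # only "ours" changed this line
        else:
            raise ValueError(f"conflict on line {number}")
    return merged


# Only the two tips and the common ancestor take part in the merge.
MERGE_RESULT = three_way_merge(ANCESTOR, MASTER_TIP, BRANCH_TIP)
print(MERGE_RESULT)
# ['One', '2', 'Three', 'Four', 'Five', 'Six', '7', 'Eight']
```

Note that commit `3f3ca4f` never appears as an input: the function has no way to know that `branch` once changed line 2.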
Rebase works a bit differently - instead of doing a three-way merge
between the tip commits on each branch, it tries to replay the commits
on one branch onto another. In the above example, if we want to
rebase `branch` onto `master`, then Git will create a patch for
each commit on `branch` and apply those patches onto `master`.[2]
When you rebase, Git will start from the tip of `master`, checking
out `f2e864b`. Then Git will apply the differences between the common
ancestor and the first commit on `branch`. In this example, the patch
between the common ancestor and that first commit changes line two from
`Two` to `2`. But that's already the value of the file in `master`. So
there's nothing to do, and the patch for `3f3ca4f` applies cleanly.
Then a patch for the second commit on the branch is applied: it changes
line two back to the text representation, and changes line seven to a
digit. So the rebase result is:
Rebase Result |
---|
One |
Two |
Three |
Four |
Five |
Six |
7 |
Eight |
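Here is the same replay as a small Python sketch. It is only a model of the behavior described above: patches are recorded per line rather than per hunk, a patch whose change is already present is treated as "nothing to do", and the helper names (`make_patch`, `apply_patch`, `rebase`) are invented for this example.

```python
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]      # 0d7088f
BRANCH_FIRST = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 3f3ca4f
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]        # 09d3ac4
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]      # f2e864b


def make_patch(parent, commit):
    """Record (line_index, old, new) for every line the commit changed."""
    return [(i, p, c) for i, (p, c) in enumerate(zip(parent, commit)) if p != c]


def apply_patch(target, patch):
    """Apply a patch to a copy of target; raise ValueError on a conflict."""
    result = list(target)
    for i, old, new in patch:
        if result[i] == new:
            continue               # the change is already there: nothing to do
        if result[i] != old:
            raise ValueError(f"conflict on line {i + 1}")
        result[i] = new
    return result


def rebase(onto, commits, base):
    """Replay each commit, as a patch against its parent, on top of onto."""
    result, parent = list(onto), base
    for commit in commits:
        result = apply_patch(result, make_patch(parent, commit))
        parent = commit
    return result


REBASE_RESULT = rebase(MASTER_TIP, [BRANCH_FIRST, BRANCH_TIP], ANCESTOR)
print(REBASE_RESULT)
# ['One', 'Two', 'Three', 'Four', 'Five', 'Six', '7', 'Eight']
```

Unlike the merge, every commit on `branch` is an input here, so the second commit's change to line 2 is applied on top of `master`.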
So on line two, `rebase` preserves the changes in `branch`
while `merge` preserves the changes in `master`.
Generally these sorts of changes will cause a conflict instead of
different results. It was key that in `branch` we changed the contents
of line 2 back to the contents in the common ancestor. That allowed the
merge engine to consider that the line in `branch` was unchanged.
Merge Result | Rebase Result |
---|---|
One | One |
2 | Two |
Three | Three |
Four | Four |
Five | Five |
Six | Six |
7 | 7 |
Eight | Eight |
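To see why the revert matters, here is a hedged Python sketch comparing the example with a hypothetical variant in which `branch` ends up with some other text on line 2 (`Deux` is invented for this illustration) instead of restoring the ancestor's text. It uses the same simplified per-line model as the rest of this post, and `conflicting_lines` is a made-up helper.

```python
ANCESTOR = ["One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # 0d7088f
MASTER_TIP = ["One", "2", "Three", "Four", "Five", "Six", "Seven", "Eight"]    # f2e864b
BRANCH_TIP = ["One", "Two", "Three", "Four", "Five", "Six", "7", "Eight"]      # 09d3ac4

# Hypothetical variant (not in the example repository): line 2 is not reverted.
BRANCH_TIP_NO_REVERT = ["One", "Deux", "Three", "Four", "Five", "Six", "7", "Eight"]


def conflicting_lines(base, ours, theirs):
    """Return 1-based numbers of lines that both sides changed, and changed differently."""
    return [
        i + 1
        for i, (b, o, t) in enumerate(zip(base, ours, theirs))
        if o != b and t != b and o != t
    ]


# With the revert, the tip of the branch matches the ancestor on line 2: no conflict.
print(conflicting_lines(ANCESTOR, MASTER_TIP, BRANCH_TIP))            # []

# Without the revert, both sides changed line 2 differently: a conflict.
print(conflicting_lines(ANCESTOR, MASTER_TIP, BRANCH_TIP_NO_REVERT))  # [2]
```

In the variant you would be asked to resolve line 2 by hand, so you would at least notice the disagreement; the silent difference only appears when the tip happens to match the ancestor.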
So... is this a problem?
It might seem concerning that this comes up when there was an apparent
revert of your changes. Logically, both `branch` and `master`
changed line two, but then `branch` changed it back. So
although this seems rather contrived, it's not that unlikely.
But whether you prefer a `merge` workflow or a `rebase` workflow, you
should be careful with your integration and follow good development
practices:
- Code review, ideally using pull requests, so that your team members have visibility into changes before they're integrated into `master`.
- Continuous integration builds and tests, as part of your integration workflow. Ideally, with build policies to ensure that builds succeed and tests pass.
So make sure to do proper code reviews, which keep this an interesting difference instead of an actual problem in your workflow.
[1] Strictly speaking, the merge engine doesn't actually look at lines, it looks at groups of lines, or "hunks". But it's easier to reason about individual lines for this example.
[2] By default, `rebase` will create and then apply patches, but when invoked with `git rebase --merge` it will `cherry-pick` the changes instead. (Since Git 2.26 the merge backend is the default, and `git rebase --apply` selects the patch-based behavior.) This uses the merge engine instead of patch application, but in this example, the results are the same.