When exactly does a Git merge conflict arise?
I’m using git to track changes to my LaTeX documents. I tend to keep feedback from co-authors in a separate branch and merge it in later. So far things seem to magically merge properly, but I would like to know when exactly a merge conflict occurs, so that I can obtain some real trust in the merging process (I would not like text to come out funky of course).
There are a number of questions on StackOverflow that seem to ask the same thing, but none of the answers get very specific. For example, this answer specifies that a conflict occurs if changes were made to the same region, but that makes me wonder what exactly those regions are. Is it just changes made to the same line, or is some context taken into account?
It’s on a line by line basis, and the answer is sort of both no and yes: context does matter, but the amount that it matters is tricky. It’s both more and less than you might think at first.
You might want to skim through this answer to a related question first, for background. I will now assume that we have
Next, we look at the two diffs:
If we see that all three versions of file F are different (so that F appears in both outputs, and the changes differ), we must then work, in essence, diff-hunk-by-diff-hunk. Where diff hunks overlap, but make different changes, Git declares a conflict. But—this seems to be your question—what, exactly, does “make different changes” mean?
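To make "diff hunk" a bit more concrete, here is a rough Python sketch that turns each side's changes into ranges of base-file lines. It uses `difflib` from the standard library rather than Git's own diff code, so the hunk boundaries may not always match what Git computes. The ten-line file and the helper name `hunks` are made up for illustration.

```python
from difflib import SequenceMatcher

def hunks(base, other):
    """Return (base_start, base_end, new_lines) for each change.

    base_start/base_end are 0-based, half-open indices into `base`.
    A pure insert has base_start == base_end; a pure delete has no new_lines.
    """
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, other[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Hypothetical ten-line base file.
base = ["line %d" % n for n in range(1, 11)]
# "Ours" inserts a new line in front of old line 7.
ours = base[:6] + ["added line"] + base[6:]
# "Theirs" deletes old line 7.
theirs = base[:6] + base[7:]

ours_hunks = hunks(base, ours)
theirs_hunks = hunks(base, theirs)
print(ours_hunks)    # [(6, 6, ['added line'])]
print(theirs_hunks)  # [(6, 7, [])]
```

Both hunks are expressed in base-file coordinates, which is what makes them comparable: the merge asks whether those base ranges collide.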
I think this is best shown by example. For instance:
Note that these changes both do and do not touch the same line, depending on how you look at it. I added a new line 7 (pushing old line 7 down to line 8), and I deleted the old line 7. These are, apparently, the “same” line. So:
Let’s abort this merge and consider the tip of branch
This time there was no conflict, even though both diff hunks touched the same general area. The second diff deleted a line that was not “touching” the added line, so Git considered this safe.
If you experiment more, in this same fashion, you will find exactly which seemingly-overlapping changes are combined successfully, and which result in a conflict. Obviously changes that directly overlap, e.g., where both delete original line 42 and insert a different new line 42, will conflict. But all changes are always represented as “delete some existing line(s), though maybe zero of them” followed by “add some new line(s), though maybe zero of them”. A change that alters part of a line (even one that changes, adds, or deletes just one word within it) deletes a nonzero number of existing lines and adds a nonzero number of new lines. A pure delete (of one or more complete lines) adds zero lines, and a pure insert deletes zero lines. In the end, it comes down to: “Did both ours-and-theirs changes touch the same line number?” The context becomes almost irrelevant, except that when deleting zero lines, or inserting zero lines, the context itself “is” the lines, in a sense: a pure insert has no deleted line of its own to compare, so it counts as touching the lines it sits between. (I’m not sure how much sense this claim makes, so if it’s incomprehensible, that’s my fault. 😉 )
(Remember also that if you are modifying the “merged so far” file as you work, you must use the original base-file’s line numbers when looking at whether a change touched “the same” lines. Since both “ours” and “theirs” have the same base version, that’s an easy short-cut we can use here.)
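The "did both sides touch the same base lines?" test can be sketched in a few lines of Python. This is a simplified model, not Git's actual merge code: it treats each change as a half-open range of base lines, calls two ranges colliding when they overlap or abut, and lets identical changes on both sides pass. The names `touches` and `conflicts` are my own, and the second example assumes the non-conflicting change deleted old line 8.

```python
def touches(a, b):
    """True if two half-open base ranges (start, end) overlap or abut."""
    return a[0] <= b[1] and b[0] <= a[1]

def conflicts(ours, theirs):
    """Return pairs of colliding hunks.

    Each hunk is (base_start, base_end, new_lines), with 0-based,
    half-open indices into the common base file.
    """
    found = []
    for o in ours:
        for t in theirs:
            # Identical changes on both sides are taken once, not flagged.
            if touches(o[:2], t[:2]) and o != t:
                found.append((o, t))
    return found

insert_before_7 = (6, 6, ["added line"])  # pure insert: deletes zero lines
delete_7 = (6, 7, [])                     # pure delete: adds zero lines
delete_8 = (7, 8, [])                     # assumed non-touching delete

print(conflicts([insert_before_7], [delete_7]))  # one colliding pair
print(conflicts([insert_before_7], [delete_8]))  # []
```

The zero-length range of the pure insert is why it collides with a delete of the line right next to it, even though no single base line appears in both changes.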
A three-way merge is not a patch
Note that this differs from applying a patch, which is done without a common base version to start from. In the case of a patch, the context is used much more heavily: the diff hunk header provides the location at which to start searching for the context, but since the patch might be applied to a different version of the file, the context allows us (and Git) to make the same change at a different line, as long as the context still matches.
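As a rough illustration of how context lets a patch land on a different line, here is a small Python sketch. It is not how `git apply` is implemented; it only shows the idea of starting at the line number from the hunk header and searching outward for a place where the context still matches. The function name `apply_hunk` and the sample file contents are invented.

```python
def apply_hunk(lines, hint, old, new):
    """Replace the block `old` with `new`, searching outward from `hint`.

    `old` holds the hunk's context and removed lines as they appear before
    the change; `new` holds the context and added lines. `hint` is the
    0-based line number taken from the hunk header.
    """
    n = len(old)
    # Try positions nearest the hinted line first.
    candidates = sorted(range(len(lines) - n + 1), key=lambda pos: abs(pos - hint))
    for pos in candidates:
        if lines[pos:pos + n] == old:
            return lines[:pos] + new + lines[pos + n:]
    raise ValueError("context not found; hunk does not apply")

# The patch was made against a file where "c" was at index 2 ...
old_block = ["c", "d", "e"]
new_block = ["c", "D", "e"]
# ... but the target file has two extra lines at the top.
target = ["x", "y", "a", "b", "c", "d", "e", "f"]

patched = apply_hunk(target, 2, old_block, new_block)
print(patched)  # ['x', 'y', 'a', 'b', 'c', 'D', 'e', 'f']
```

A three-way merge needs no such search, because both sides are diffed against the same base and so already agree on line numbers.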