Git is not a version control tool, right? It’s a graph-manipulation tool that you can use to support version control methodologies. So this is how I use git to practice version control.
I’m going to be very explicit throughout this, using long forms of git flags and commands, and avoiding many shortcuts that I actually use in the day-to-day. I’ll write a follow-up with those shortcuts if I get the chance.
If you take nothing else away from this, at least know that git pull is terrible and should be avoided.
My use of git is particular to the context in which I use it. That context is the green and lush valley between the mountains of GitHub and Heroku. GitHub provides a web-accessible record of the state of my development on a project, and Heroku provides a target I can deploy to with git.
The GitHub side is a bit more central to how I use git, so we’ll focus on that. Under most circumstances, my git-world looks like this:
github.com:Some/Project.git <----> github.com:me/Project.git
                                               ^
                                               |
                                               v
                                    localhost:me/Project.git
That is, there’s a project I’m contributing to (either as a primary collaborator, or just an interested citizen of the open-source world), and it’s on GitHub. There’s my fork of it on GitHub. And there’s the working copy on my local machine.
The general flow is like this:
- Changes I make locally get pushed to my fork.
- Changes in my fork get pull-requested to the original project.
- Changes in the original project get fetched-and-fast-forwarded in my local copy, then pushed to my fork.
To talk about how I do this, we’ll need to talk about the kinds of objects that I keep in my mental model of my git-world.
First, there are a few tools that are intrinsic to git and GitHub:
- pull requests
I augment these with some further categories that exist only in my head:
- upstream and origin remotes: The original project’s remote I always call upstream (this collides a bit with some other git terminology, but it’s not been confusing so far), and my fork I always call origin.
- read branches and write branches: Related to the point above. Some branches are local copies of information on upstream, and they are read-only: I never commit to them. Other branches are local copies of information on origin, and they are writable: all my commits go on these branches, and get pushed to origin. The read-only branches include master. I never commit on master, only on write branches, which make their way back into master eventually. This distinction is kinda crucial, as it helps me avoid merge bubbles and confusing history states.
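One way to check that a read branch has stayed read-only is to ask git whether everything on it is already contained in its remote counterpart. What follows is a minimal sketch under assumptions, not part of my actual workflow: the throwaway repositories, the branch names, and the is_clean_read_branch helper are all made up for illustration.

```shell
set -eu
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for the original project, with one commit on master.
git init --quiet upstream-project
git -C upstream-project symbolic-ref HEAD refs/heads/master
git -C upstream-project commit --quiet --allow-empty --message "initial commit"

# Local working copy whose remote is named "upstream"; master is a read branch.
git clone --quiet --origin upstream upstream-project local
cd local

# All commits go on a write branch, never on master.
git checkout --quiet -b some-feature-branch master
git commit --quiet --allow-empty --message "work on the feature"

# Hypothetical helper: a read branch is clean if every commit on it
# is already reachable from its remote counterpart.
is_clean_read_branch() {
    git merge-base --is-ancestor "$1" "$2"
}

if is_clean_read_branch master upstream/master; then
    master_state=clean
else
    master_state=dirty
fi
if is_clean_read_branch some-feature-branch upstream/master; then
    feature_state=clean
else
    feature_state=dirty
fi
echo "master: $master_state, some-feature-branch: $feature_state"
```

The write branch comes out "dirty" by design: it carries commits that upstream has not seen yet, which is exactly what a write branch is for.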
For the most part, all my write branches are based off of master, which is in turn tracking upstream/master. Every once in a while, I will have a branch based off of something else. For example, say that I am working on contributions to a feature branch a friend is working on. In that case, I add one more remote (beyond upstream and origin) to track their fork on GitHub. I update my remotes (see “The Commands” below), and I make a local read-branch that tracks the branch on their remote that I’m working on. I then make a write-branch based off of that local branch to work on.
All of this is in aid of one of my fundamental principles: updating tracking information and updating branch state should be clearly separated activities.
(As a side note, this is why I think git pull is toxic; it combines two operations, first a git fetch, which updates some of your remote tracking information, and then a git merge, which may be a fast-forward merge, but may as likely introduce a merge bubble, making the operation hard to cleanly reverse.)
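To make the separation concrete, here is a small sketch of the two steps run by hand against throwaway repositories (the repository names and the "accidental" commit are assumptions for the demo). The fetch only updates tracking information; the fast-forward-only merge then refuses to touch the branch when the histories have diverged, instead of quietly creating a merge bubble.

```shell
set -eu
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for the original project, plus a local copy with remote "upstream".
git init --quiet upstream-project
git -C upstream-project symbolic-ref HEAD refs/heads/master
git -C upstream-project commit --quiet --allow-empty --message "initial commit"
git clone --quiet --origin upstream upstream-project local

# Upstream moves on, and (by mistake) a commit lands on the local read branch.
git -C upstream-project commit --quiet --allow-empty --message "upstream work"
git -C local commit --quiet --allow-empty --message "accidental local commit"

cd local
before="$(git rev-parse master)"

# Step 1: update tracking information only. No local branch moves.
git fetch --quiet upstream

# Step 2: update branch state, but only if it is a pure fast-forward.
if git merge --ff-only upstream/master 2>/dev/null; then
    ff_result=fast-forwarded
else
    ff_result=refused
fi

after="$(git rev-parse master)"
echo "merge --ff-only: $ff_result"
```

Because the merge is refused outright, master still points at the same commit afterwards, and there is nothing to back out.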
When my friend has updated their branch on their fork (by merging my code, or by adding some of their own, or even by doing the impolite thing and rewriting history on that branch), I can update my remotes, see how different my local read-branch and their fork’s version of the branch are, make smart choices about what to do, and if all’s clear, hard update my read-branch to match their version. Then I can repeat that process with my branches off of it: I will have the ability to see the state of the differences cleanly, and not have to awkwardly back out via the reflog.
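The inspect-then-hard-update step might look something like the sketch below. Everything named here (the friend-fork repository, the friend remote, the feature branch) is hypothetical, and the history rewrite is simulated with an amended commit.

```shell
set -eu
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for the friend's fork, with their feature branch.
git init --quiet friend-fork
git -C friend-fork symbolic-ref HEAD refs/heads/feature
git -C friend-fork commit --quiet --allow-empty --message "base"
git -C friend-fork commit --quiet --allow-empty --message "first draft"

# Local copy: remote "friend", local read branch "feature" tracking friend/feature.
git clone --quiet --origin friend friend-fork local

# The friend does the impolite thing and rewrites history on their branch.
git -C friend-fork commit --quiet --amend --allow-empty --message "second draft"

cd local
git remote update --prune

# How different are the local read branch and the friend's version?
only_local="$(git rev-list --count friend/feature..feature)"
only_remote="$(git rev-list --count feature..friend/feature)"
echo "only local: $only_local, only on the friend's fork: $only_remote"
git log --oneline --graph feature friend/feature

# Nothing local is worth keeping on a read branch, so hard update it.
git checkout --quiet feature
git reset --quiet --hard friend/feature
```

The two counts are the "make smart choices" part: anything other than zero on the local side of a read branch deserves a closer look before the reset throws it away.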
OK, enough of me pontificating, you just want to know what git commands to run, and damn the torpedoes, right? Well, that way lies pain, so do take the time to understand what git is doing to the commit graph, but. Here’s what it looks like for me:
git clone git@github.com:wlonk/SomeProject.git
cd SomeProject
git remote add upstream git@github.com:TheOriginalFounder/SomeProject.git
git remote update --prune
At this point, I have two remotes (as per figure 1), and all local information about them is up-to-date.
git branch some-feature-branch master
Now I’ve made a branch that I’ll use as a write-branch; the authoritative copy of it is here on my local machine. It’s tracking my local master, too, which is, in turn, tracking the upstream’s master. If you want to automatically make your new branches track the branch you’re on when you start them, set git config branch.autosetupmerge always. See http://grimoire.ca/git/config for more good-but-nonstandard git configs.
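To see what that setting changes, here is a small sketch in a throwaway repository (the repository and the plain-branch name are assumptions for the demo). With git’s default of true, branching from a local branch records no tracking information; with always, the new branch tracks the local branch it started from.

```shell
set -eu
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init --quiet demo
cd demo
git symbolic-ref HEAD refs/heads/master
git commit --quiet --allow-empty --message "initial commit"

# Default behaviour: a branch started from a local branch gets no upstream.
git config branch.autosetupmerge true
git branch plain-branch master
plain_merge="$(git config --get branch.plain-branch.merge || echo none)"

# With "always", the new branch tracks the local branch it was created from.
git config branch.autosetupmerge always
git branch some-feature-branch master
tracked_remote="$(git config --get branch.some-feature-branch.remote)"
tracked_merge="$(git config --get branch.some-feature-branch.merge)"

echo "plain-branch merges from: $plain_merge"
echo "some-feature-branch merges from: $tracked_merge on remote '$tracked_remote'"
```

A remote of "." is how git records "this same repository", so the write branch ends up tracking the local master rather than anything on upstream or origin.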
git checkout some-feature-branch
And now I work on this branch! Work work work, commit commit commit. Hm, maybe I want to clean up my history. OK:
git rebase --interactive # Explaining how to use this is out of scope
git push origin
Oh, wait, there’s some upstream work I want to incorporate. It’s in the upstream’s master, and my PR to upstream hasn’t yet been merged.
git remote update --prune
git checkout master
git merge --ff-only upstream/master
If the above fails, stop, look around you, and then calmly make good choices.
git push origin master
Cool, now my local master and my fork’s master both look just like upstream’s master.
git checkout some-feature-branch
Let’s just replay the work in this branch onto the new master:
git rebase master
And we can deal with conflicts as they arise. And finally, we have to force-push (generally considered bad, so be careful!) to rewrite history on our remote:
git push --force origin
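If the "be careful!" part worries you, git also offers --force-with-lease, which only overwrites the remote branch if it still points where your remote-tracking information says it does. This is an alternative to the plain force-push above, not what I described, and the sketch below uses throwaway repositories and a made-up collaborator to show the lease being refused.

```shell
set -eu
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for my fork on GitHub.
git init --quiet --bare origin.git

# My local copy, with a write branch pushed to the fork.
git init --quiet local
cd local
git symbolic-ref HEAD refs/heads/master
git commit --quiet --allow-empty --message "initial commit"
git remote add origin ../origin.git
git checkout --quiet -b some-feature-branch
git commit --quiet --allow-empty --message "feature work"
git push --quiet origin some-feature-branch
cd ..

# Meanwhile, a hypothetical collaborator pushes to the same branch.
git clone --quiet --branch some-feature-branch origin.git other
git -C other commit --quiet --allow-empty --message "a collaborator's commit"
git -C other push --quiet origin some-feature-branch

# I rewrite my local history without having fetched their commit.
cd local
git commit --quiet --amend --allow-empty --message "feature work, reworded"

# The lease is stale, so git refuses instead of discarding their commit.
if git push --quiet --force-with-lease origin some-feature-branch 2>/dev/null; then
    lease_result=pushed
else
    lease_result=rejected
fi
echo "push --force-with-lease: $lease_result"
```

A plain --force would have succeeded here and silently thrown away the collaborator’s commit; after a rejection, the usual next step is to update remotes and look at what changed.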
And remember, if all else fails: