Transneptune

beyond the Kuiper Belt, over the sea

Archive for May, 2015

Git Shortcuts

Saturday, May 23rd, 2015

I talked about how I use git. Let me talk about how I actually use it.

I have an extensive [alias] section in my .gitconfig. Any sufficiently frequently used command gets abbreviated to two (or occasionally three) characters.

    st = status
    ci = commit
    co = checkout
    cob = checkout -b
    di = diff
    amend = commit --amend
    ane = commit --amend --no-edit
    aa = add --all
    ff = merge --ff-only
    rup = remote update --prune
    po = push origin
    b = branch -vv
    dc = diff --cached
    dh1 = diff HEAD~1

So in actual day-to-day usage I’ll type g st and g b just reflexively, while I’m pondering where the code is at (depending on whether I’m thinking about code to stage or feature branches), and g aa && g ci when I need to commit the current working tree verbatim, and g rup && g ff when I need to make a local read-only branch match its upstream.
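The post never shows how g is defined; presumably it is shell shorthand for git. Here is a minimal sketch under that assumption. It uses a throwaway repository and repo-local aliases so that trying it out cannot touch a real ~/.gitconfig.

```shell
# Assumption: "g" is shorthand for git; its definition is not shown in the post.
# A function is used because it also works in non-interactive shells.
g() { git "$@"; }

# Throwaway repository so the example leaves your real ~/.gitconfig alone.
scratch=$(mktemp -d)
git init --quiet "$scratch"
cd "$scratch"

# Same effect as "st = status" under [alias], but stored in this repo's config.
# Pass --global instead to write to ~/.gitconfig.
git config alias.st status
git config alias.b "branch -vv"

g st   # runs: git status
g b    # runs: git branch -vv
```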

Really, the key here is to pay attention to which commands you use frequently, and enshrine them in your aliases. Also, clean out your aliases sometimes if your behaviors change and you stop using certain commands.
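One rough way to find the commands worth aliasing is to count git subcommands in your shell history. The function name git_usage is made up for this sketch, and it assumes a plain history file with one command per line (bash's default); zsh's extended format would need its timestamp prefix stripped first.

```shell
# Count git subcommands in a shell history file, most frequent first.
# $1: path to the history file; $2: how many rows to show (default 10).
# Lines starting with "g" are counted too, assuming g is shorthand for git.
git_usage() {
    awk '($1 == "git" || $1 == "g") && NF >= 2 { print $2 }' "$1" \
        | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Usage: git_usage ~/.bash_history 15
```

Anything near the top of that list without an alias is a candidate; any alias that never shows up is a candidate for removal.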

Double-Entry Timekeeping

Saturday, May 16th, 2015

I can credit Ben Warren with the phrase, and the particular expression of the idea.

Like double-entry bookkeeping, you don’t just track one value, you track in and out from each account separately. In particular, you track the time you’ve allocated to tasks, and you track the time you’ve spent on tasks.

One thing I’ve noticed in some organizations is a reluctance among those who do-the-work (as opposed to manage-the-work, and yes, I know this is an unfair division, but I trust you know what I mean) to allow visibility into the second part: how time actually gets spent.

I think that this is often motivated by a reluctance to risk blame: if we make it clear how time was spent, “waste” will become visible and blame will be apportioned. But doing so costs us some very valuable information. If we can’t see how we actually spend time, we can’t figure out where the actual waste lies and remove it.

(An aside about waste: in knowledge work especially, time spent apparently not working is not the same as waste. There’s value in long-term sustainability and value in surprise solutions to be had by allowing people some space to muse. You can’t just sit and churn out code if you want that code to be full of good decisions. You need to pause, ponder, learn, as well. So when I talk about “waste”, I really mean priority-shifting, interruptions, vapid features, and things like that.)

But blame is almost always very wrong to ascribe anyway. It is a truth (very nearly) universally acknowledged that complex systems do not fail because of a single cause. System accidents are the norm, and result from many pieces interacting to produce an undesired effect. A complex system can typically handle a handful of failures, because each section of the system gets designed with the local failures in mind, but the interaction of those failures can cascade and cause larger effects that the system cannot tolerate. If you think that your system or your organization is not a complex system, you are probably wrong. So, usually, if something goes wrong, it is not the fault of one person. It is usually more accurate to ascribe the fault to the modes of interaction and the ways that failures are handled by components of the code or the organization.

How I git

Saturday, May 9th, 2015

Git is not a version control tool, right? It’s a graph-manipulation tool that you can use to support version control methodologies. So this is how I use git to practice version control.

I’m going to be very explicit throughout this, using long forms of git flags and commands, and avoiding many shortcuts that I actually use in the day-to-day. I’ll write a follow-up with those shortcuts if I get the chance.

If you take nothing else away from this, at least know that git pull is terrible and should be avoided.

The Setup

My use of git is particular to the context in which I use it. That context is the green and lush valley between the mountains of GitHub and Heroku. GitHub provides a web-accessible record of the state of my development on a project, and Heroku provides a target I can deploy to with git.

The GitHub side is a bit more central to how I use git, so we’ll focus on that. Under most circumstances, my git-world looks like this:

github.com:Some/Project.git <----> github.com:me/Project.git
                                        ^
                                        |
                                        v
                                   localhost:me/Project.git

That is, there’s a project I’m contributing to (either as a primary collaborator, or just an interested citizen of the open-source world), and it’s on GitHub. There’s my fork of it on GitHub. And there’s the working copy on my local machine.

The general flow is like this:

  1. Changes I make locally get pushed to my fork.
  2. Changes in my fork get pull-requested to the original project.
  3. Changes in the original project get fetched-and-fast-forwarded in my local copy, then pushed to my fork.

To talk about how I do this, we’ll need to talk about the kinds of objects that I keep in my mental model of my git-world.

The Tools

First, there are a few tools that are intrinsic to git and GitHub:

  • remotes
  • branches
  • pull requests

I augment these with some further categories that exist only in my head:

  • upstream remote and origin remote: The original project’s remote I always call upstream (this collides a bit with some other git terminology, but it’s not been confusing so far), and my fork I always call origin.
  • read branches and write branches: Related to the point above. Some branches are local copies of information on upstream, and they are read-only: I never commit to them. Other branches are local copies of information on origin, and they are writable: all my commits go on these branches, and get pushed to origin. The read-only branches include master. I never commit on master, only on write branches, which make their way back into master eventually. This distinction is kinda crucial, as it helps me avoid merge bubbles and confusing history states.

For the most part, all my write branches are based off of master, which is in turn tracking upstream/master. Every once in a while, I will have a branch based off of something else. For example, say that I am working on contributions to a feature branch a friend is working on. In that case, I add one more remote (beyond upstream and origin) to track their fork on GitHub. I update my remotes (see “The Commands” below), and I make a local read-branch that tracks the branch on their remote that I’m working on. I then make a write-branch based off of that local branch to work on.
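The friend's-branch setup described above might look roughly like this. The remote name friend and the branch names their-feature and my-contribution are hypothetical, and a local bare repository stands in for the friend's fork on GitHub so the sketch runs offline.

```shell
# Stand-in for the friend's fork: a local bare repository with one branch.
# (In real use this would be a GitHub URL; all names here are made up.)
work=$(mktemp -d)
git init --quiet --bare "$work/friend.git"
git init --quiet "$work/seed"
git -C "$work/seed" -c user.name=Friend -c user.email=friend@example.com \
    commit --quiet --allow-empty --message "Start their feature"
git -C "$work/seed" push --quiet "$work/friend.git" HEAD:refs/heads/their-feature

# Stand-in for my local working copy (normally an existing clone).
git init --quiet "$work/local"
cd "$work/local"

# One more remote, beyond upstream and origin, for the friend's fork.
git remote add friend "$work/friend.git"
git remote update --prune

# Read branch: a local copy of their branch that I never commit to.
git branch --track their-feature friend/their-feature

# Write branch: based off the read branch; this is where commits go.
git branch --track my-contribution their-feature
git checkout my-contribution
```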

All of this is in aid of one of my fundamental principles: updating tracking information and updating branch state should be clearly separated activities.

(As a side note, this is why I think git pull is toxic; it combines two operations, first a git fetch, which updates some of your remote tracking information, and then a git merge, which may be a fast-forward merge, but may as likely introduce a merge bubble, making the operation hard to cleanly reverse.)

When my friend has updated their branch on their fork (by merging my code, or by adding some of their own, or even by doing the impolite thing and rewriting history on that branch), I can update my remotes, see how different my local read-branch and their fork’s version of the branch are, make smart choices about what to do, and if all’s clear, hard update my read-branch to match their version. Then I can repeat that process with my branches off of it: I will have the ability to see the state of the differences cleanly, and not have to awkwardly back out via the reflog.
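A sketch of that update-inspect-then-move sequence, for the impolite case where the friend rewrote history. The names friend and their-feature are hypothetical, the helper function seed exists only to play the friend's part, and a local bare repository stands in for their fork.

```shell
# Setup: a stand-in for the friend's fork, and my clone with a read branch.
work=$(mktemp -d)
git init --quiet --bare "$work/friend.git"
git init --quiet "$work/seed"
seed() { git -C "$work/seed" -c user.name=Friend -c user.email=friend@example.com "$@"; }
seed commit --quiet --allow-empty --message "First version"
seed push --quiet "$work/friend.git" HEAD:refs/heads/their-feature
git clone --quiet --origin friend --branch their-feature "$work/friend.git" "$work/local"

# The friend rewrites history on their branch and force-pushes it.
seed commit --quiet --amend --allow-empty --message "Rewritten version"
seed push --quiet --force "$work/friend.git" HEAD:refs/heads/their-feature

cd "$work/local"

# Step 1: update tracking information only; no local branch moves.
git remote update --prune

# Step 2: look before leaping. Counts of commits only on my side / only on theirs,
# then the commits themselves ("<" is mine, ">" is theirs).
git rev-list --left-right --count their-feature...friend/their-feature
git --no-pager log --oneline --left-right their-feature...friend/their-feature

# Step 3: all clear, so make the read branch match their version exactly.
git checkout --quiet their-feature
git reset --hard friend/their-feature

# Write branches based on it can then be rebased onto the updated read branch.
```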

The Commands

OK, enough of me pontificating, you just want to know what git commands to run, and damn the torpedoes, right? Well, that way lies pain, so do take the time to understand what git is doing to the commit graph, but. Here’s what it looks like for me:

git clone git@github.com:wlonk/SomeProject.git
cd SomeProject
git remote add upstream git@github.com:TheOriginalFounder/SomeProject.git
git remote update --prune

At this point, I have two remotes (as per figure 1), and all local information about them is up-to-date.

git branch some-feature-branch master

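Now I’ve made a branch that I’ll use as a write-branch; the authoritative copy of it is here on my local machine. It’s tracking my local master, too, which is, in turn, tracking the upstream’s master. (Neither happens by default: git only sets up tracking automatically when a branch starts from a remote-tracking branch, and a freshly cloned master tracks origin/master until you run git branch --set-upstream-to=upstream/master master.) If you want to automatically make your new branches track the branch you’re on when you start them, set git config branch.autosetupmerge always. See http://grimoire.ca/git/config for more good-but-nonstandard git configs.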
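To see what branch.autosetupmerge always does to a new branch, here is a small sketch in a throwaway repository (the scratch directory and commit identity are placeholders, not part of the workflow above).

```shell
scratch=$(mktemp -d)
git init --quiet "$scratch"
cd "$scratch"
# Make sure the first branch is called master, whatever init.defaultBranch says.
git symbolic-ref HEAD refs/heads/master
git -c user.name=Me -c user.email=me@example.com \
    commit --quiet --allow-empty --message "Initial commit"

# Make new branches track their start point, even when it is a local branch.
git config branch.autosetupmerge always

git branch some-feature-branch master

# What the new branch tracks:
git config --get branch.some-feature-branch.remote   # prints "." (this repository)
git config --get branch.some-feature-branch.merge    # prints "refs/heads/master"
```

With that tracking in place, a bare git rebase on some-feature-branch knows to replay onto master.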

git checkout some-feature-branch

And now I work on this branch! Work work work, commit commit commit. Hm, maybe I want to clean up my history. OK:

git rebase --interactive  # Explaining how to use this is out of scope
git push origin

Oh, wait, there’s some upstream work I want to incorporate. It’s in the upstream’s master, and my PR to upstream hasn’t yet been merged.

git remote update --prune
git checkout master
git merge --ff-only upstream/master

If the above fails, stop, look around you, and then calmly make good choices.
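"Looking around" usually means finding out how master and upstream/master have diverged. The sketch below fakes that situation in throwaway repositories (a local bare repository stands in for upstream, and the commit messages are invented) and then shows two read-only commands for inspecting it.

```shell
# Setup: a local master and a stand-in upstream that share one commit.
work=$(mktemp -d)
git init --quiet --bare "$work/upstream.git"
git init --quiet "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name Me
git config user.email me@example.com
git remote add upstream "$work/upstream.git"
git commit --quiet --allow-empty --message "Shared history"
git push --quiet upstream master

# Simulate the problem: upstream moves on, and master also gets a stray local commit.
git checkout --quiet --detach
git commit --quiet --allow-empty --message "Upstream work"
git push --quiet upstream HEAD:refs/heads/master
git checkout --quiet master
git commit --quiet --allow-empty --message "Accidental commit on master"

git remote update --prune
git merge --ff-only upstream/master || echo "Not a fast-forward: stop and look around."

# How many commits are only on my master / only on upstream's master?
git rev-list --left-right --count master...upstream/master

# Which commits are they? ("<" is mine, ">" is upstream's)
git --no-pager log --oneline --graph --left-right master...upstream/master
```

A non-zero left-hand count means something was committed on a read-only branch; moving those commits to a write branch is one way out.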

git push origin master

Cool, now my local master and my fork’s master both look just like upstream’s master.

git checkout some-feature-branch

Let’s just replay the work in this branch onto the new master:

git rebase

And we can deal with conflicts as they arise. And finally, we have to force-push (generally considered bad, so be careful!) to rewrite history on our remote:

git push --force origin
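A more cautious variant, not the one used above, is git push --force-with-lease: it overwrites the remote branch only if that branch still points where your remote-tracking ref says it does, so it refuses if someone else has pushed since your last fetch. A sketch in throwaway repositories, with a local bare repository standing in for the fork:

```shell
work=$(mktemp -d)
git init --quiet --bare "$work/origin.git"     # stand-in for my fork on GitHub
git init --quiet "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/some-feature-branch
git config user.name Me
git config user.email me@example.com
git remote add origin "$work/origin.git"

git commit --quiet --allow-empty --message "Work in progress"
git push --quiet origin some-feature-branch

# Rewrite local history (here: an amend), so a plain push would be rejected.
git commit --quiet --amend --allow-empty --message "Work, tidied up"

# Overwrite the remote branch only if it is still where
# origin/some-feature-branch says it was.
git push --quiet --force-with-lease origin some-feature-branch
```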

And remember, if all else fails:

Call the reflog!
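For anyone who has not called it before, a minimal sketch of a reflog rescue in a throwaway repository (the commits and the variable names good and lost are invented for the example):

```shell
scratch=$(mktemp -d)
git init --quiet "$scratch"
cd "$scratch"
git config user.name Me
git config user.email me@example.com
git commit --quiet --allow-empty --message "Good state"
good=$(git rev-parse HEAD)
git commit --quiet --allow-empty --message "More work"
lost=$(git rev-parse HEAD)

# Oops: throw away the last commit.
git reset --quiet --hard "$good"

# The reflog still remembers everywhere HEAD has been.
git --no-pager reflog

# HEAD@{1} is where HEAD pointed before the reset; go back there.
git reset --quiet --hard 'HEAD@{1}'
```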