Transneptune

beyond the Kuiper Belt, over the sea

Archive for the ‘computers’ Category

Getting your docs on Read the Docs

Monday, February 26th, 2018

You’ve seen it before: software projects with lucid and lovely documentation, all available on subdomain.readthedocs.org. So, how do?

Well, the good folks at Read the Docs have made your life very easy. Assuming your documentation’s reStructuredText source is kept in git, and you are OK with pushing it to GitHub, you can get Read the Docs building your documentation for very-near free.

Once you’ve made a repository on GitHub with your docs in it, go to readthedocs.org and sign up or log in. Then, go to your dashboard and select “Import a Project”. Click the plus sign next to the project you want, and you’re done. Every time you push to GitHub, Read the Docs will rebuild your documentation.

But you want more than the basics, right? So let’s talk about what to do if your needs are a little left of center.

If your docs are more than just plain Sphinx, you might need some third-party packages to build them correctly. You may also need a specific Python version. To set either of these, you want a file called readthedocs.yml or .readthedocs.yml, where you can set some values. For this particular case, you probably want to set the python.version and requirements_file keys.

For example:

requirements_file: requirements.txt
python:
  version: 3.6

If your docs are part of a Python package, much of this will get autodetected from the setup.py, but if they’re not, it’s useful to explicitly set them.

I find that it’s very useful to explicitly set a more recent version of Sphinx. Also, if you want to use a custom theme, be sure to throw it in the requirements file.
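For example, a docs requirements file along these lines (the version pin and the theme here are just illustrative):

sphinx>=1.7
sphinx_rtd_theme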

Another common kind of customization you might want or need is a custom domain. Say you want your docs hosted at https://docs.example.com instead of at https://example-project.readthedocs.org. First, for simple HTTP, there are two steps:

  • Add a CNAME record in your DNS that points to the Read the Docs servers (there’s a sketch of what that looks like just after this list).
  • Add a Domain object in the Project Admin > Domains page for your project on Read the Docs.
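In zone-file terms, that CNAME record is shaped something like this; treat the target as a placeholder and point it at whatever hostname the Read the Docs documentation currently tells you to use:

docs.example.com.    IN    CNAME    readthedocs.io.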

For HTTPS, you need to set up some sort of SSL proxy. The full scope of this is outside what I can write here, but using a service like Cloudflare¹ to terminate the SSL connection and then proxy to Read the Docs is not a bad approach. If you want to do this by hand yourself, consider Let’s Encrypt as a signing authority for your SSL certificates.

If you want to learn more, you can read the Read the Docs docs.


  1. But not Cloudflare. I cannot endorse them. 

Let’s talk about Sphinx

Monday, September 11th, 2017

I’ve fielded a few questions lately from folks about Sphinx, reStructuredText, and documentation in the Python world and more broadly. So I thought I’d write up a bit of an intro to how I use and understand these tools, and how better documentation makes me a better developer.

Restructured understanding of reStructuredText

There’s a lot of cargo-cultish behavior and confusion around writing Python docs, and I think that a lot of this comes from seeing some Sphinx documentation, finding something curious in the source, and then not being able to learn what it is or where it comes from. The first source of confusion, I think, is between Sphinx and reStructuredText. Here’s the distinction: reStructuredText is a markup language (with a processor implemented in the docutils package), and Sphinx is a documentation builder (or compiler) that takes a bunch of reStructuredText files and makes them into a whole suite of documentation.

Since reStructuredText is in some ways the basis, let’s start by talking about it. It’s a fairly straightforward markup language. If you’ve used Markdown before, it may be just similar enough to cause confusion. Headings are indicated with a series of =, -, or ~ on the following line, paragraphs are broken with two newlines (a single newline within a paragraph is ignored), italic is indicated with surrounding *...*, bold with surrounding **...**, and fixed-width with surrounding ``...`` (note the double backticks; single backticks mean something different in reStructuredText). Lists have leading - or *. All broadly familiar.
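A tiny sample, just to make that concrete:

A Heading
=========

Some *italic* text, some **bold** text, and some ``fixed-width`` text.
A single newline like that one doesn't break the paragraph.

- a list item
- another list item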

(There are fancier things, like links and tables, but you can look those up yourself when you need them.)

Now, there are two things that reStructuredText supports that make it particularly useful as a basis for Sphinx: directives and roles. These are custom bits of markup that you can use to define special structures in the abstract representation of the document, and emit into the final compiled document in whatever way needed.

Roles are inline, and generally look like this: :role:`text`. The role is whatever the name of the custom role is (ref, class, index, or weirder things), and the text is whatever text gets wrapped in that role.

Directives are block-level, and look like this:

.. directive_name::
   :directive_arg1:
   :directive_arg2: value

   Some text that gets the directive applied to it.

They can have some keyword arguments (in the immediately following lines, with : surrounding) and a whole block of arbitrary paragraphs following.

There’s a small set of roles and directives included in reStructuredText itself, a larger set in Sphinx, and a vast, wide world of third-party ones. Some builtin ones include directives for sidebars, warnings, and topics, and roles for things like links.
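To make that concrete, here’s one of those builtin directives in action:

.. warning::

   Single backticks and double backticks do very different things here.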

The Sphinx’s riddle

So now we can see what’s in a pure reStructuredText file. What’s Sphinx give us? It gives us a whole passel of things, but most importantly it gives us a range of output formats, support for cross-references, and support for pulling documentation information from our Python source.

Sphinx starts with a file called conf.py in the root folder, which sets a number of values. Most crucially, it configures the location of the root reStructuredText file, and sets the theme and options used for the HTML, PDF, ePub, and other output. It also lets you specify extensions and third-party packages that enable more roles and directives.
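A minimal conf.py, just as a sketch (sphinx-quickstart will generate a much more complete one for you, and every name here is illustrative):

# conf.py
project = 'Example Project'
author = 'A. Documentarian'
master_doc = 'index'                  # the root reStructuredText file
extensions = ['sphinx.ext.autodoc']   # extensions add more roles and directives
html_theme = 'alabaster'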

You obviously don’t want your documentation to exist as one giant .rst file, though, right? Just like you wouldn’t want any webpage to be a single giant page. So you break your docs into many distinct .rst files, and cross-link them. The most crucial form of cross-reference is a .. toctree:: directive in the root .rst file, linking together and ordering all the other .rst documents. But you can also use the ref role to cross-link documents in the middle of a paragraph. For instance, make a reference target using this directive:

.. _some-ref:

Wotta Heading
=============

This is a section I want to refer to later.

And then in another document, link to it:

So, if you consider :ref:`some-ref`.

And that’ll render like this:

So, if you consider <a href="./some_file.html#wotta-heading">Wotta Heading</a>.

(There’s some magic in there, about how headings are turned into URL fragments, and how cross-linking finds the right file, but I hope the general idea is clear.)
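And for completeness, the toctree in that root .rst file looks something like this (the document names are made up):

.. toctree::
   :maxdepth: 2

   installation
   usage
   api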

Where do I go from here?

Start a documentation project! Or add docs to an existing project. Here’s a quick recipe:

pip install sphinx
sphinx-quickstart

Now you’ve got a conf.py and an index.rst. Write some docs, run make html, and open _build/html/index.html in your browser.

That’s all! You’ve got docs.

One more thing

Now, personally, I like using the dirhtml builder, instead of the html builder, because I like my URLs to end with a /, not a .html. But that makes it harder to view changes locally. So I made a little tool to serve the docs locally, and while it’s at it, to rebuild them as you change them. It’s called phix, and you can get it from PyPI. Assuming you’re running Python 3, just run pip install phix, and then run phix in the root of your documentation project. You can then see your docs at http://localhost:8000/ as you work on them. Enjoy!

Next steps

This is probably the first in a series. I’ve got a lot of other things I’d like to write about on this subject. I’ll link them here as I write them:

  • getting your docs on Read the Docs
  • building docs for different targets
  • writing custom Sphinx themes
  • writing Sphinx extensions
  • why you should try documentation-driven development

Craft and writing prose

Monday, March 28th, 2016

Prose is important; even if you’re writing for no one else, you’re writing for yourself in the future. The best developers I know write a lot of prose: documentation, commit messages, and comments. The value is not immediately visible, and it won’t make your tests pass, but it is part of keeping technical debt low; some tech debt comes from not knowing in detail what a piece of code does. Good docs, comments, and commit messages let you know at least what it’s supposed to do.

So, prose is a part of the craft of software development, but not just any prose: particularly technical prose, explanatory, lucid prose. And it’s not something we, as developers, are usually taught.

We’re, generally speaking, a group that’s pretty good at self-education, though. We pick up new languages, frameworks, protocols, and principles. So not only must we do this, but we can do this. It’s hard, because it’s big and fuzzy (there are no tests that can tell you if your prose is correct), but still, we can do it.

This post is just an exhortation, not a guide, though. The subject is too large to cover with a simple trick or a general principle. So, how do you begin to write better? First, write more and read more. Second, get your stuff edited.

Thanks to Ryan and Owen.

Programmers aren’t special

Monday, March 21st, 2016

Listen to craftspeople of all sorts. Learn from outside the bubble. We are not magical. What we do is not magical. It has some cool properties; so do other things. Learn.

(Yes, this means learn about how writers write. Yes, this means learn about how carpenters carpent. How psychologists psychologize. How baristas bar. How sailors sail. All of it.)

Embellish later

Monday, March 14th, 2016

So, @wholemilk said something I liked:

Yeah. But sometimes that’s hard. Why?

I am a big believer in the value of considering the emotional landscape of any labor, but in my day to day, that means mostly programming.

When you have something that’s working but still incomplete, it can be a big emotional effort to break it so you can start to move it towards the next stable point, the next point at which it’s working(-but-incomplete).

It’s like editing a document: partway through applying edits, you have broken sentences lying around, half-formed ideas not fully explicated, and the thing’s a mess. Sometimes that even shows through, if you fail to clean up after a change to the first half of a sentence leaves the last half incoherent.

It is similar with code.

Fundamentally, this is why source control matters. It is a tool to help with the emotional labor of breaking your work. If you know you can always go back to a stable point, swinging out into the void for a bit becomes exploratory, not all-or-nothing.

Code Folding

Monday, December 7th, 2015

Nick made a comment about code folding, and I promised him an essay. Here it is. It’s not very long.

I like code folding. When I know a mature project well, and the modules are kinda big, I want to open a file and not be distracted by seeing things I’m not working on.

I dislike code folding. When I don’t know a project well, I want to be able to see everything without having to drill down into it. When I do know it well, I want the modules to be small enough that there’s just one topic in each file, and so opening a file does the thing that opening a code fold is supposed to do.

So, the second case is ideal, but also idealistic—and we don’t live in an ideal world. Sometimes there are other pressures against breaking a project into a billionty well-named modules, too, and you have to live with the fact that a given module might have some variety of content.

Code folding, like window focus, should ideally follow my unknown innermost desires—the computer should always know just what I want without my having to tell it. Alas.

The Historiography of Git, or How I learned to stop caring and love writing history

Monday, November 30th, 2015

This post was occasioned by hearing about the experience of pairing with Nvie, and then talking with Owen about how he does similar things.

Git is notorious for allowing you to rewrite history, which rubs some people the wrong way, but which I and some others think is actually pretty neat and useful, if you don’t abuse it. From what I hear, when Nvie is busy coding, his commit messages are just about “foo” or “wip” or whatever, and it’s only after he reaches a good stopping point that he goes back, rewrites history to say something meaningful, and commits. As Owen said to me when we were talking about this, he does something similar, but has had a hard time explaining what utility he gets from it to others. So I promised I would try to explain it.

I don’t think the utility of this became clear to me until I started working on Hexes (which is dormant until I figure out a good way to write a testing framework for it—ideas welcome!). Every time I would take it out to work on it, I would bring it to a stable point, then commit. When I wanted to work on a new feature, I would necessarily break things that were working, for a while, and having good version control practices let me move forward with confidence, knowing I could roll back if I had to. (At some level, obviously, this is the purpose, or at least a purpose, of version control, but the realization hit me deeply on that project.)

But the next step was to be able to roll back not to just the last stable version before I started work on this feature, but to the quasi-stable points between the last and the next fully stable points, where I had expressed an idea but not worked out all the kinks in it. And so, clearly, that’s what other commits are for. So why not make them “real” commits?

Because real commits take too much time and energy. They pull you out of the thought process you are not yet done with and demand you shift from “writer” mode to “editor” mode. You just need a quick hand-hold to let yourself move forward, not a full belay anchor. (NB: I do not rock climb and have no idea what I’m talking about there.) You should be keeping your attention on the code and the architecture, and not the individual anchor points. On the flow.

But why do you then go back through and clean up your commits? Isn’t that a lot of work? Yes! But it also has a lot of value. It’s a communication tool, and even if you’re on a team of one, you will eventually be communicating with yourself in the future. If you don’t ever review your git history, and don’t think you ever will, well, I hope you’re wrong. And if you’re not on a team of one, know that anyone who looks at your PRs will probably step through each commit and try to see what you intended to do there. Leaving your commit history as a series of WIPs is like a craftsperson who keeps their bench an utter mess—it shows that you haven’t yet learned how to work on a team, and that you probably don’t understand the cost you incur yourself that way because you’ve become accustomed to it.
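Mechanically, the whole dance is something like this (a sketch; the branch name is invented, and what you do inside the interactive rebase is up to you):

git checkout -b some-feature master
git commit --all -m "wip"          # quick hand-holds while exploring
git commit --all -m "wip again"
# ...reach a stable point, then write the history:
git rebase --interactive master    # squash, reorder, and reword the wip commits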

This is actually writing history, not rewriting it. History is not a series of events, but the interpretation you impose on it after the fact. Imagine what a disaster it would be if “history” books were just enormous piles of primary sources with no analysis, synthesis, cross-referencing, or organization. To put it another way, when you rewrite your commit history, you are doing the job of a historian, making sense of and imposing structure on the events that happened in a given period.

Of course, before you do any of this, get real comfortable with git. Worrying about losing work is not worth the value this gives, but if you’re worrying about losing work with git, you are, and I say this with love, still using it at a novice level. You should never have a tool as central to your work as a version control tool make you worry about loss. Understand it more, that’s your next task.

Maybe this’ll at least make you pause and think for a moment about the value this approach can offer. I’m not at all sure it’s convincing, but I tried.

Thanks to Ben Warren and Owen Jacobson for input on this post, and Jonathan Chu for the initial inspiration. Ben says, of arguments about git:

I don’t understand most people when it comes to git. It is a time-travel system, the end. Quibbling over the rightness of time travel when you are using a time machine seems like missing the point.

Oh-my-zsh, virtualenvwrapper plugin, errors when you cd

Wednesday, June 24th, 2015

(Because a few people have been having this problem.)

Have you recently updated oh-my-zsh on your OS X install? Do you use the virtualenvwrapper plugin? Are you seeing “workon_cwd:6: command not found: realpath” whenever you try to cd?

There’s a simple fix: brew install coreutils. That will provide the realpath command.

Git Shortcuts

Saturday, May 23rd, 2015

I talked about how I use git. Let me talk about how I actually use it.

I have an extensive [alias] section in my .gitconfig. Any sufficiently frequently used command gets abbreviated to two (or occasionally three) characters.

    st = status
    ci = commit
    co = checkout
    cob = checkout -b
    di = diff
    amend = commit --amend
    ane = commit --amend --no-edit
    aa = add --all
    ff = merge --ff-only
    rup = remote update --prune
    po = push origin
    b = branch -vv
    dc = diff --cached
    dh1 = diff HEAD~1

So in actual day-to-day usage I’ll type g st and g b just reflexively, while I’m pondering where the code is at (depending on whether I’m thinking about code to stage or feature branches), and g aa && g ci when I need to commit the current working tree verbatim, and g rup && g ff when I need to make a local read-only branch match its upstream.
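(That leading g, by the way, assumes a one-letter shell alias along these lines in your shell config:)

# in ~/.zshrc or similar
alias g=git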

Really, the key here is to pay attention to which commands you use frequently, and enshrine them in your aliases. Also, clean out your aliases sometimes if your behaviors change and you stop using certain commands.
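If you’d rather not edit .gitconfig by hand, the same aliases can be added from the command line:

git config --global alias.st status
git config --global alias.cob "checkout -b"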

How I git

Saturday, May 9th, 2015

Git is not a version control tool, right? It’s a graph-manipulation tool that you can use to support version control methodologies. So this is how I use git to practice version control.

I’m going to be very explicit throughout this, using long forms of git flags and commands, and avoiding many shortcuts that I actually use in the day-to-day. I’ll write a follow-up with those shortcuts if I get the chance.

If you take nothing else away from this, at least know that git pull is terrible and should be avoided.

The Setup

My use of git is particular to the context in which I use it. That context is the green and lush valley between the mountains of GitHub and Heroku. GitHub provides a web-accessible record of the state of my development on a project, and Heroku provides a target I can deploy to with git.

The GitHub side is a bit more central to how I use git, so we’ll focus on that. Under most circumstances, my git-world looks like this:

github.com:Some/Project.git <----> github.com:me/Project.git
                                        ^
                                        |
                                        v
                                   localhost:me/Project.git

That is, there’s a project I’m contributing to (either as a primary collaborator, or just an interested citizen of the open-source world), and it’s on GitHub. There’s my fork of it on GitHub. And there’s the working copy on my local machine.

The general flow is like this:

  1. Changes I make locally get pushed to my fork.
  2. Changes in my fork get pull-requested to the original project.
  3. Changes in the original project get fetched-and-fast-forwarded in my local copy, then pushed to my fork.

To talk about how I do this, we’ll need to talk about the kinds of objects that I keep in my mental model of my git-world.

The Tools

First, there are a few tools that are intrinsic to git and GitHub:

  • remotes
  • branches
  • pull requests

I augment these with some further categories that exist only in my head:

  • upstream remote and origin remote: The original project’s remote I always call upstream (this collides a bit with some other git terminology, but it’s not been confusing so far), and my fork I always call origin.
  • read branches and write branches: Related to the point above. Some branches are local copies of information on upstream, and they are read-only: I never commit to them. Other branches are local copies of information on origin, and they are writable: all my commits go on these branches, and get pushed to origin. The read-only branches include master. I never commit on master, only on write branches, which make their way back into master eventually. This distinction is kinda crucial, as it helps me avoid merge bubbles and confusing history states.

For the most part, all my write branches are based off of master, which is in turn tracking upstream/master. Every once in a while, I will have a branch based off of something else. For example, say that I am working on contributions to a feature branch a friend is working on. In that case, I add one more remote (beyond upstream and origin) to track their fork on GitHub. I update my remotes (see “The Commands” below), and I make a local read-branch that tracks the branch on their remote that I’m working on. I then make a write-branch based off of that local branch to work on.
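Concretely, that setup is something like this (all the names here are invented):

git remote add friend git@github.com:friend/SomeProject.git
git remote update --prune
git branch their-feature friend/their-feature    # local read-branch tracking their branch
git branch my-additions their-feature            # write-branch based off of it
git checkout my-additions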

All of this is in aid of one of my fundamental principles: updating tracking information and updating branch state should be clearly separated activities.

(As a side note, this is why I think git pull is toxic; it combines two operations: first a git fetch, which updates some of your remote tracking information, and then a git merge, which may be a fast-forward merge, but may just as easily introduce a merge bubble, making the operation hard to cleanly reverse.)

When my friend has updated their branch on their fork (by merging my code, or by adding some of their own, or even by doing the impolite thing and rewriting history on that branch), I can update my remotes, see how different my local read-branch and their fork’s version of the branch are, make smart choices about what to do, and if all’s clear, hard update my read-branch to match their version. Then I can repeat that process with my branches off of it: I will have the ability to see the state of the differences cleanly, and not have to awkwardly back out via the reflog.

The Commands

OK, enough of me pontificating, you just want to know what git commands to run, and damn the torpedoes, right? Well, that way lies pain, so do take the time to understand what git is doing to the commit graph, but. Here’s what it looks like for me:

git clone git@github.com:wlonk/SomeProject.git
cd SomeProject
git remote add upstream git@github.com:TheOriginalFounder/SomeProject.git
git remote update --prune

At this point, I have two remotes (as in the diagram above), and all local information about them is up-to-date.

git branch some-feature-branch master

Now I’ve made a branch that I’ll use as a write-branch; the authoritative copy of it is here on my local machine. It’s tracking my local master, too, which is, in turn, tracking the upstream’s master. If you want to automatically make your new branches track the branch you’re on when you start them, set git config branch.autosetupmerge always. See http://grimoire.ca/git/config for more good-but-nonstandard git configs.

git checkout some-feature-branch

And now I work on this branch! Work work work, commit commit commit. Hm, maybe I want to clean up my history. OK:

git rebase --interactive  # Explaining how to use this is out of scope
git push origin

Oh, wait, there’s some upstream work I want to incorporate. It’s in the upstream’s master, and my PR to upstream hasn’t yet been merged.

git remote update --prune
git checkout master
git merge --ff-only upstream/master

If the above fails, stop, look around you, and then calmly make good choices.

git push origin master

Cool, now my local master and my fork’s master both look just like upstream’s master.

git checkout some-feature-branch

Let’s just replay the work in this branch onto the new master:

git rebase

And we can deal with conflicts as they arise. And finally, we have to force-push (generally considered bad, so be careful!) to rewrite history on our remote:

git push --force origin

And remember, if all else fails:

Call the reflog!