Making GitHub Art

The contribution heatmaps on GitHub profiles are interesting. Although they are intended to be passive data visualizations, they don’t have to be. Specifically, they can act as a 7xN-pixel, very slowly scrolling display. After realizing this, I decided I had to do something to shape the blank canvas that is my GitHub commit log.

“An artist is somebody who produces things that people don’t need to have.”
― Andy Warhol

The plan

Ostensibly, it should be pretty straightforward. The color of each cell of the heatmap is based on the number of commits made that day, so one just needs to automate the appropriate number of commits per day to get the desired shading. For simplicity, I decided to start by using the darkest shade possible to build some text.

The execution

And to be honest, it pretty much was that simple. The most difficult part was finding a Python library to automate the git commits. Many StackOverflow discussions essentially suggested rolling your own functions because it is relatively simple and flexible. Had I been building something I cared about more, that might have been the way to go, but I was determined not to spend more than a few minutes on this project and I didn’t need a lot of flexibility. I really wanted to find something off-the-shelf with good documentation.

Connecting to GitHub

I tried a few valiant entries into the Python/GitHub API space, but what some lacked in functionality the others lacked in documentation. Finally, I tried github3.py and found the right mix. Without too much trouble, I was able to automate connecting to GitHub and making commits. After a little research it looked like ~40 commits per day would be enough to keep the color scaling the way I wanted it.

There is a link to the GitHub repo at the end of this post. These are the main functions for connecting and committing to GitHub:

import csv
import time

from github3 import login

# Login helper
# Comma separated credentials are stored
# in the first row of auth.csv.
def github_login():
    with open('auth/auth.csv', newline='') as f:
        # Only the first row is read, as noted above
        user_name, password = next(csv.reader(f))

    session = login(user_name, password)

    return session

# The function that submits the commits.
# Each pass through the loop makes two commits
# (one file creation and one file deletion).
# The number of commits should be set to
# something quite a bit higher than your
# normal number of daily commits. Changing num_of_commits
# may also require changing sleep_time
# so that things still complete in a reasonable
# amount of time.
def do_typing(num_of_commits=30, sleep_time=20):

    me = github_login()

    repo = me.repository('your_github_username', 'GitHubTyper')

    for i in range(num_of_commits):
        # Create a file
        data = 'typing file'
        repo.create_file(path = 'files/dotfile.txt',
                         message = 'Add dot file',
                         content = data.encode('utf-8'))

        # Get the file reference for later use
        file_sha = repo.contents(path = 'files/dotfile.txt').sha

        # Delete the file
        repo.delete_file(path = 'files/dotfile.txt',
                         message = 'Delete dot file',
                         sha = file_sha)

        time.sleep(sleep_time)

Translating letters to a usable format

With a way to connect in hand, the code needed to know when to connect. Basically, I needed an on/off switch for every day represented on the heatmap. If the switch is on, the committing function should run, making the cell dark. If it is off, the committing function shouldn’t run, leaving the cell gray (or close to it, depending on what other commits are made that day).

Since we’re using the heatmap to display text, a matrix-based font seemed to make sense. If you’ve seen dot-matrix font styles, these will look familiar. Each position in the matrix corresponds to a day on the heatmap. I used values of ‘1’ and ‘0’ to indicate on and off days, respectively. (And technically these are lists, not matrices, but they are laid out like matrices to make them easier to create.)

As an example, here is the setup for the letter ‘A’:

letters_dict = {
'A' : [0,1,1,1,0,0,
       1,0,0,0,1,0,
       1,0,0,0,1,0,
       1,0,0,0,1,0,
       1,1,1,1,1,0,
       1,0,0,0,1,0,
       1,0,0,0,1,0],
...
}

These matrices are time-consuming to create, so I’ve only created the few that I needed. If you make more, feel free to send them along via a pull request.
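To make the on/off idea concrete, here is a minimal sketch of how a date could be mapped to a cell of a letter. It assumes each letter is 7 rows by 6 columns as in the ‘A’ example, that each heatmap column is one week starting on Sunday, and that letters run left to right. The names LETTERS, ROWS, COLS, and is_on are made up for illustration and are not necessarily what the repo uses.

```python
from datetime import date

# Assumed layout, taken from the 'A' example: 7 rows (days of the week)
# by 6 columns (5 columns of glyph plus 1 blank spacer column).
ROWS, COLS = 7, 6

# Hypothetical stand-in for letters_dict, holding only the 'A' shown above.
LETTERS = {
    'A': [0,1,1,1,0,0,
          1,0,0,0,1,0,
          1,0,0,0,1,0,
          1,0,0,0,1,0,
          1,1,1,1,1,0,
          1,0,0,0,1,0,
          1,0,0,0,1,0],
}

def is_on(message, start, today):
    """Return True if `today` lands on a lit cell of `message`.

    `start` is the Sunday on which the first column of the message begins.
    """
    offset = (today - start).days
    if offset < 0:
        return False
    # One heatmap column per week, one heatmap row per day of the week.
    week, weekday = divmod(offset, ROWS)
    letter_index, col = divmod(week, COLS)
    if letter_index >= len(message):
        return False  # the message has finished
    glyph = LETTERS[message[letter_index]]
    # The lists are stored row by row, so step down `weekday` rows first.
    return glyph[weekday * COLS + col] == 1
```

With this layout a single letter takes six weeks to "type", and the daily job only has to ask is_on() whether today is a commit day.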

Automating and scheduling the runs

Now that I had a way to do commits programmatically and something to commit, I needed a way to schedule the Python script to run at the appropriate time. The ultimate goal was to be able to tell the script what I wanted to do at the beginning and have it run unsupervised for a few weeks until it completed.

This was achieved with PythonAnywhere.com and a little bash script. PythonAnywhere is a Python-oriented hosting environment. Among many other things, it can be used to schedule Python scripts to run at certain times of the day. A free account allows one daily task and HTTP calls to a whitelist of domains. Fortunately, one task is all we need to run and GitHub.com is on the whitelist.

After uploading the Python code, I created a really simple bash script that calls the main Python script and is scheduled to run daily:

#!/bin/sh
python3.5 GitHubArt/main.py '$echo "Hi"' '2016-07-31'

And that’s all. The first parameter is the message to display on the GitHub heatmap, the second is the date on which to start typing. Since the GitHub heatmap starts with Sunday at the top, this date should also be a Sunday.
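Because a start date that is not a Sunday would shift every letter down the heatmap, it may be worth checking the date before doing anything else. The following is a small sketch of such a check, not the code from the repo; parse_start_date is a hypothetical helper name.

```python
from datetime import datetime

def parse_start_date(text):
    """Parse a 'YYYY-MM-DD' string and insist that it is a Sunday."""
    start = datetime.strptime(text, '%Y-%m-%d').date()
    # date.weekday() counts Monday as 0, so Sunday is 6.
    if start.weekday() != 6:
        raise ValueError('{} is not a Sunday'.format(text))
    return start
```

Calling parse_start_date('2016-07-31') returns the date unchanged, while a non-Sunday date raises a ValueError before any commits are made.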

Philosophical Implications

I took a very obvious approach for someone with no artistic talent – I am using this functionality to print out a *nix command. Christo would not be impressed. Honestly, though, the prospect of using this to make art seems really cool. Given that the intensity can differ for each cell in the heatmap, it is essentially as versatile as a grayscale palette. If I had an artistic bone in my body I might give it a shot. For now, I’ll just use text and appreciate my simple creations as a Buddhist would, for their intrinsic and ephemeral beauty.
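For anyone who does want to try shading, here is one way the grayscale idea might be sketched. It assumes five shade levels (0 for an empty cell up to 4 for the darkest) and a simple linear scale topping out at the ~40 commits mentioned earlier. Both the function name and the linear scale are my guesses for illustration; GitHub’s actual thresholds may well work differently.

```python
def commits_for_shade(level, max_commits=40, levels=5):
    """Map a shade level (0 = empty, levels - 1 = darkest) to a commit count.

    The linear scale is an assumption, not GitHub's documented behavior.
    """
    if not 0 <= level < levels:
        raise ValueError('shade level out of range')
    return round(max_commits * level / (levels - 1))
```

A picture would then be a matrix of shade levels rather than 1s and 0s, with each day’s commit count looked up from its cell.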

You can find all the project files here:
https://github.com/bryancshepherd/GitHubArt

Setting up Bluehost DNS for a GitHub Jekyll blog

Update 7/13/2017: I’ve since switched back to a WordPress-based site. Although I liked the clean, light feel of the Jekyll site, adding content took too much time. Since WordPress now handles LaTeX and Markdown pretty well and has many other benefits, I find I’m much more likely to keep it updated. I’m leaving this article for posterity.

When switching to a Jekyll/GitHub-based blog I found the available directions for setting up the DNS rather unhelpful. In Googling for additional information, many of the top search results contradicted each other or were out of date. Additionally, there was nothing specifically related to my hosting provider – Bluehost.

For posterity, here is how to set up Bluehost’s DNS settings to point to a GitHub blog served from the default user page repository, i.e., <username>.github.io. This information is current as of 5-27-2015.

Edit your CNAME file

Setting up the user page repository is out of scope here, so I’m assuming you have done that already. If not, there are a number of Google results that will help you with it.

First, decide whether you want the URL displayed in the address bar to include ‘www’ or not. This will determine what you need to change on the Bluehost side. There is a whole technical sidebar discussion and set of jargon that we’ll avoid here (and that I am not an expert in), but unless you know you want a certain version, you’re fine choosing based on preference.

Once you’ve decided, edit the CNAME file in your <username>.github.io repo to match your decision. If you don’t already have this file, you will need to create it. This file will have only one line, either:

www.yoururl.com

or:

yoururl.com

Edit your Bluehost DNS setting

After editing your CNAME file you need to change your DNS settings in Bluehost by following these steps:

  1. Log in to your Bluehost account
  2. In cPanel go to ‘DNS Zone editor’
  3. Select the appropriate domain from the drop down menu
  4. In the ‘Add DNS Record’ form do one of two things, depending on your CNAME edit (do not do both!)
  • If you used ‘www’ in your CNAME file
    1. Enter ‘www’ in the ‘Host Record’ field
    2. Select ‘CNAME’ from the ‘Type’ drop down
    3. Enter your GitHub user page repo name, e.g., <username>.github.io in the ‘Points To’ field
    4. Leave all other fields at their defaults and click ‘add record’
  • If you did not use ‘www’ in your CNAME file
    1. Under the ‘A (Host)’ DNS settings, delete any existing ‘@’ host records
    2. In the ‘Add DNS Record’ form, enter your URL, without ‘www’, in the ‘Host Record’ field, e.g., ‘yoururl.com’
    3. Leave (or select) ‘Type’ as ‘A’
    4. Enter the first GitHub IP address 192.30.252.153 in the ‘Points To’ field
    5. Leave all other fields at their defaults and click ‘add record’
    6. Repeat to add another GitHub IP address 192.30.252.154
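The two branches above can be summarized as data. This sketch only restates the steps in Python; it does not talk to Bluehost or GitHub, and dns_plan is a made-up helper name. The IP addresses are the two listed above and were current as of the date of this post.

```python
# The two GitHub IP addresses given in the steps above.
GITHUB_IPS = ['192.30.252.153', '192.30.252.154']

def dns_plan(domain, username, use_www):
    """Return (cname_file_line, [(host_record, type, points_to), ...])."""
    if use_www:
        # 'www' variant: a single CNAME record pointing at the user page.
        return ('www.' + domain,
                [('www', 'CNAME', username + '.github.io')])
    # Bare-domain variant: one A record per GitHub IP address.
    return (domain, [(domain, 'A', ip) for ip in GITHUB_IPS])
```

For example, dns_plan('yoururl.com', 'yourname', use_www=True) gives the CNAME file line ‘www.yoururl.com’ and a single CNAME record to add.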

After you’ve made one set of changes above, give them a few hours to propagate. You can check the status at What’s My DNS.
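If you would rather check from your own machine, a few lines of Python can do a rough version of the same thing for the bare-domain (A record) setup. This sketch only asks whatever DNS resolver your computer uses, so it says nothing about propagation elsewhere. resolves_to is a hypothetical helper, and the resolver argument exists only so the function can be exercised without a network connection.

```python
import socket

def resolves_to(hostname, expected_ips, resolver=socket.gethostbyname_ex):
    """Return True if every IPv4 address for hostname is in expected_ips."""
    try:
        # gethostbyname_ex returns (canonical name, aliases, IPv4 addresses).
        _name, _aliases, addresses = resolver(hostname)
    except socket.gaierror:
        return False  # the name does not resolve (yet)
    return bool(addresses) and set(addresses) <= set(expected_ips)
```

Something like resolves_to('yoururl.com', ['192.30.252.153', '192.30.252.154']) should start returning True once the new A records are visible to your resolver.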

Creating your personal, portable R code library with GitHub

Note: I sold ProgrammingR.com and its online assets in 2015. I’m posting this for posterity, but no longer control access to the GitHub account.

As I discussed in a previous post, I have a few helper functions I’ve created that I commonly use in my work. Until recently, I manually included these functions at the start of my R scripts by either the tried-and-true copy-and-paste method, or by extracting them from a local file with the source() function. The former approach has the benefit of keeping the helper code inextricably attached to the main script, but it adds a good bit of code to wade through. The latter approach keeps the code cleaner, but requires that whoever is running the code always has access to the sourced file and that it is always in the same relative path – and that makes sharing or moving code more difficult. The start of a recent project requiring me to share my helper function library prompted me to find a better solution.

The resulting approach takes advantage of GitHub Gists and R’s ability to source via a web-based location to enable you to create a personal, portable library of R functions for private use or to share.

The process is very straightforward. If you don’t have a GitHub account, getting one is the first step. After doing so, click on “Gist” in the menu bar to create a new Gist. (There are a couple of reasons for using a Gist instead of a formal repository. For one, repositories cannot be made private unless you have a paid GitHub membership. Gists, however, can be created as “Secret” and therefore available only to people who have the exact URL.) Name your Gist, add your code, and choose whether you want it to be a secret or public Gist to save that revision. After saving your Gist you will be taken to a screen displaying your new Gist.

Now that you’ve created your Gist, you just need to source it in your R code. That step is easily done via the source() function. Rather than including a file path as you would with a local file, you simply include the URL to your Gist. Note that you can’t just copy the URL of the GitHub page you’re currently on. You need to source the URL of the raw code. To get that URL, click on the <> button on the top right of your Gist (circled in red below). This will take you to the raw code for the current revision. To get the path that always points to the most recent revision, remove everything in the URL after “/raw/”.

[Screenshot: GitHub Gist page with the <> button circled in red]
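The trimming step is just string surgery. Here it is sketched in Python for illustration (the same idea carries over to R); latest_raw_url is a made-up helper, and the URL in the usage note is a placeholder rather than a real Gist.

```python
def latest_raw_url(raw_url):
    """Trim a revision-specific raw Gist URL so that it ends at '/raw/'."""
    marker = '/raw/'
    cut = raw_url.find(marker)
    if cut == -1:
        raise ValueError('no /raw/ segment in URL')
    # Keep everything up to and including '/raw/'.
    return raw_url[:cut + len(marker)]
```

Given a placeholder such as 'https://gist.githubusercontent.com/someuser/abc123/raw/def456/helpers.R', this returns 'https://gist.githubusercontent.com/someuser/abc123/raw/'.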

Regardless of which version you link to, you’ll still be left with a long URL that includes a randomly generated alphanumeric string that you don’t want to have to try to remember. Here’s a pro tip: use a URL shortener that lets you define a custom URL (e.g., tinyurl) to create a shortened version that is easy to remember. Something like the little library at http://tinyurl.com/ProgRLib (YES! You can use and add to this library!).

Now you’ve got a portable, easy to reference, personal library of functions that you can easily include in your code or share with others.