Coding while hiring

If you’re hiring people to write programs, you should work on a program with them.

This advice is nothing new but is too easy to ignore or botch. Switching my interviews to live coding on a real program is the single biggest hiring improvement I’ve ever made. It was helpful enough that I don’t think any of these concerns should stop you from doing the same:

  • Should there be a written test?
  • If so, how tricky should we make it?
  • How many days do they need to complete it?
  • Should they do it on a whiteboard to prove how good they are with markers?

As an engineering team, we’ve worked very hard to develop a culture that rewards programmers who learn to understand the business and our customers, and who make good choices about the right code (and quality) for the problem.

A lot of advice about coding tests comes from companies very different from ours. Instead of working on problems at the bleeding edge of distributed systems or category theory, we work on websites that delight and inform other programmers (and regular folk) with books and videos. That means that most of what I have read from others about interviewing sounds completely contrary to what we do. Instead of favoring a candidate that can plow through a ton of homework, I want someone with enough experience to ask the right questions and challenge shaky assumptions. Instead of picking someone who can write algorithms on the board, I want someone who can use their tools effectively and has great habits.

All this distaste for traditional interviewing methods meant that I was ripe for a change when I sat in on a CTO roundtable at an unconference a few years ago. The theme that emerged was hiring, and I was fortunate enough to hear a few very good folks talk openly and honestly about the techniques they used. One shared some very clear lessons about how to run coding interviews, and out of that I’ve built my own approach. Here are the mechanics behind it.

Continue reading

A style guide for Python tests

Python programmers are fortunate to have a clear, reasonable style guide in PEP8. While PEP8 is widely followed by professional and amateur Python programmers, there’s no widely adopted equivalent style guide for testing in Python.

To help our own team improve code reviews and train newcomers (both to the company and to Python), I’ve started a draft style guide outlining how we write tests for Python code at Safari. It is meant to be expanded and refined over time.

Here’s the outline of the draft style guide (the complete version is available at https://github.com/safarijv/python-testing-style-guide). What would you add or remove?
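To make the flavor concrete, here is a tiny sketch of the kind of convention such a guide might codify: descriptive test names and one behavior per test, so a failure reads like a sentence. The function and the specific rules shown here are illustrative, not quoted from the actual guide.

```python
import unittest

def normalize_isbn(raw):
    """Strip hyphens and surrounding whitespace from an ISBN string."""
    return raw.strip().replace("-", "")

class NormalizeIsbnTests(unittest.TestCase):
    # Each test name describes exactly one behavior, so the test runner's
    # output doubles as a readable specification.

    def test_hyphens_are_removed(self):
        self.assertEqual(normalize_isbn("978-1-4919-1205-8"), "9781491912058")

    def test_surrounding_whitespace_is_stripped(self):
        self.assertEqual(normalize_isbn("  9781491912058  "), "9781491912058")
```

A newcomer reviewing a pull request against conventions like these has something concrete to point at, rather than a vague sense that a test "feels wrong."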

Continue reading

Node.js is wonderful for robots

I was having a problem.

We use JIRA for ticket-tracking. It assigns each ticket a project code and ticket number, or key. I had been encouraging the team to refer to specific tickets in our chat room when discussing what they were working on, versus, “Hoping to finish up the thing with the error on the page.” My problem was that they started actually doing this, but there were now so many people collaborating that we had no idea what any individual “issue key” meant.

Them: Hey, Keith. I was noticing that there
  are a few dependencies linked to ZOMG-1337. 
  How's it going on BLOOP-9234?

Me: Uh...

We needed someone to help us with the menial task of looking up the associated JIRA ticket and summarizing the current status in all of our chatrooms. This sounded like a job for a robot. What surprised me was that it was also a perfect job for something other than Python, specifically Node.js.
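Whatever language the bot ends up in, its first job is spotting issue keys in chat messages. Here's a minimal sketch of that matching step, shown in Python for brevity (the project keys are the made-up ones from the conversation above):

```python
import re

# JIRA issue keys look like PROJECT-123: an uppercase project code,
# a dash, and a ticket number.
ISSUE_KEY_RE = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")

def find_issue_keys(message):
    """Return the issue keys mentioned in a chat message, in order."""
    return ISSUE_KEY_RE.findall(message)

find_issue_keys("How's it going on BLOOP-9234? ZOMG-1337 is blocked.")
# → ['BLOOP-9234', 'ZOMG-1337']
```

From there, the bot just needs to look each key up via the JIRA API and post the summary and status back to the room.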

An automated notification from GitHub triggers a response from jirabot

Continue reading

Conference-driven development

TL;DR — Release products at conferences. The products will be better and you’ll be happier.

Test-driven development is a technique that helps programmers build large applications from small, working components. It has been successful enough to unlock developers’ innate love of acronyms, ranging from ATDD and BDD to MDD and UGG. TDD is important in the industry because it forces a mental shift inside the programmer’s mind. Like most humans, programmers are all too willing to succumb to really lame brain bugs. Instead of falling for the trap of designing and implementing a grand cathedral in one single volcano of brilliance, TDD focuses on a continuous stream of achievable, minor, functional bricks. Conference-driven development offers similar rewards for virtuous choices, but works for the whole product development team rather than just programmers.

Continue reading

Wordless programming

Over the holiday break, I re-read Andy Hunt’s Pragmatic Thinking and Learning on my phone. I had started it mainly to force myself to re-evaluate the iBooks reading experience, but quickly became immersed (again). The book offers an informed, but opinionated, introduction to brain architecture, learning theory, and neuroscience. The compelling central theme is that knowledge workers, programmers in particular, are hopelessly bad at using their minds.

As a parent of two young children, hopelessness is something I’m acutely familiar with. From the outset, a decidedly verbal father like myself is forced to communicate with his kid using a range of awkward and non-verbal techniques. It sucks. Even as they’ve grown older and grasped language, I’ve continued to be astonished by how ineffective words are at teaching key behaviors. “Go back to sleep” is easy for an expert adult to say, but nearly meaningless to a kid. Similarly, they need a different tool than language to distinguish between their “whiny voice” and a tolerable one.

Words fail us

Hunt argues that this blindness about the limitations of language is a particular weakness of programmers, who are tremendously attached to representing everything verbally. He recounts a story from the Inner Game of Tennis about teaching an older neophyte:

The next exercise was to listen to the sound of the ball hitting the racket. If you’ve never played, the ball makes a particularly sweet, clear sound when it hits just the right spot on the racket. This fact wasn’t made explicit; our student was merely told to listen.

Next, it was time to serve. First, she was to just hum a phrase while watching Gallwey serve in order to get the rhythm of the motion. No description of the movements; just watch and hum. Next, she tried the serve—humming the same tune and focusing on the rhythm, not the motions. After twenty minutes of this sort of thing, it was time to play. She made the first point of the game and played a very respectable, lengthy set of volleys.

It is easy to dismiss this focus on sound and movement as irrelevant to programming, but that’s the trap. What non-verbal tools do we ignore as we collaborate and teach?

Programmer pedagogy is terrible in general, so it’s obvious to start there. What would it look like if we showed a learner, in the most literal sense, good testing, bug hunting, or requirements planning? This is a part of the draw of pair programming, but we rarely reference or emphasize any non-verbal elements.

Pictures in particular

As a programmer who never ever uses UML, I am often surprised by how attached and excited my non-programmer colleagues get about a diagram of a software system or process. There’s a whole lot packed into one of these sketches even without the words:

A complicated diagram from a whiteboard (but without labels)

A diagram I actually drew on the whiteboard for colleagues, with the labels removed.

What would change about internal communication if we spent as much time drawing a new icon to represent a new project instead of arguing about the name?

A pictorial representation of the sounds over a telephone line that start a modem connection

A picture of the sounds required to start a modem connection, by Oona Räisänen.

Many HTTP APIs are too “chatty.” How would our API design change if we drew a picture of the desired interplay without ever writing a line of code?


There has to be more opportunity for non-verbal thinking than just images. Is there an opportunity in expressing the auditory layer of programming? The movements? It all sounds terribly New Age, but then I remember trying to talk to a six-week-old kid.

Building Distributed Teams: Driving meetings with Google Docs

Over the last few years, Liza and I have had the pleasure of building an ever-expanding engineering team. We’ve managed to find great people from across the country and were 100% distributed & office-less until a few folks moved into the new Safari office in Boston two months ago. Because our team was remote by rule rather than by exception, we’ve been forced to develop a culture that exploits new tools whenever they can help us cooperate and collaborate. One particular habit we’re fond of is running meetings through Google Docs.

As is typical with developers, we are not generally fond of meetings, especially recurring meetings, so we have tried to distill them into their productive, fundamental essence1. While our meetings are still far from perfect, I think we’ve developed some conventions worth sharing with other distributed teams.

One minutes to rule them all—in real time

Google Docs sometimes makes it too easy to create and share new documents, so the first lesson is to fight against this: use the same Google Doc for the same meeting week after week and write it during the meeting. This practice is laughably simple, but it removes the biggest threats to useful minutes:

  1. the attendees (claim they) don’t know where the minutes are
  2. the minutes feel worthless because they’ll be ignored forever after
  3. the attendees (wrongly) feel that someone else will write the minutes
  4. the attendees claim the minutes did not accurately capture the discussion

To set this up, just use an extremely clear title for the document (“{Team} {Purpose} Rolling Minutes”) and then add the new minutes at the very top for each meeting. Make it clear from the very first meeting that everyone is expected to help write the minutes in real time (plant some willing collaborators beforehand, if necessary).

It’s worth rotating to a new document every 6–12 months or Google Docs will be crashy.

Collect agenda beforehand (the “Pending” bucket)

Once you’ve established that the same document will be used for each meeting in the series, it is time to start turning the rest-of-the-week time into a lever that makes the meeting itself shorter. We keep an (empty) bulleted list under a Pending heading at the top of every minutes document. The Pending list gets filled by folks as something occurs to them throughout the week. In more extreme cases, the Pending list must be filled beforehand or the meeting itself is summarily canceled (depends on the meeting). Developing the Pending list asynchronously can also make it easier for less outspoken people to make sure their topics get some space in the larger forum.

Establish a repeating structure

While it’s easy to screw up, a carefully crafted meeting structure can help everyone understand when they’ll be actively participating and when the thing is nearly done. The problem is that you have to frequently evaluate the structure to make sure it still actually helps the team communicate rather than being wasteful boilerplate.

Our “big” meeting looks like:

PREVIOUS ACTION ITEMS
(social pressure to finish what you promised)

CURRENT WORK
(*extremely* short Before/Now/Next updates from each person
 don't skip this, as it gets every single human to actually say words at each meeting, 
 forces people to write down what they did [more social pressure],
 and establishes a basic record of what the team was doing in any given month)

DISCUSSION
(a heading for every item that was on the Pending list, plus anything emergent)

ACTION ITEMS
(every time a meeting ends without meaningful tasks assigned to specific humans, a kitten loses its wings)

A repeating structure also helps answer questions about who promised what. You just go down far enough to find it in the expected place in the minutes of a previous meeting. While this is a simple act, doing it consistently makes it clear that the minutes serve a purpose and that each member of the team is accountable.

Force collaboration and attention through humor

The biggest benefit of cloud-based minutes is the opportunity to use a meeting as a way to help the group gel a tiny bit more, week after week. For distributed teams, the chances for true collaboration and team-building are already extremely limited, so we take whatever we get. Specifically, I want to use the minutes as a tool to have the team:

  • see each other actually (visibly) contributing to a shared project
  • laugh with each other
  • pay attention

A screenshot of a Google Docs document with humorous images and silly fonts

To achieve these goals, we need only two things: silly cat pictures and collaborative authoring. When the team knows that their colleagues are humorously defacing/lolcatting their section of the minutes, ignoring the document is nearly impossible (we don’t actually use Comments in minutes as much as you would expect). Juxtaposing the boss’ description of a particularly rough moment in the release process with a sad panda provides a rare moment to let off steam for people who almost never see each other face to face. And watching a document being written and edited by 5–10 people at once is really quite enchanting.


1 Andrew, our CEO, bought a huge stack of these and forced us to read them before, ahem, the next meeting.

Joining the IDPF Board

I am pleased to be joining the Board of Directors of the International Digital Publishing Forum (the IDPF). As I suggested in my nomination statement, I will try to focus on three specific goals:

  • Developing clear documentation and best practices to ensure that reading systems consistently implement the technical capabilities of EPUB 3 to achieve a common, interoperable experience
  • Promoting tools and techniques that allow digital publications to be accessible to readers across a range of cultures and reading abilities
  • Broadening both the technical contributors’ and IDPF’s own leadership to include and better serve the international publishing community outside of Europe and North America

It is humbling to be able to join such an accomplished group of contributors, and I hope we are able to serve the members while being receptive to critiques and contributions from the wider community.

Oration: A tiny tool for HTML from Google Docs + tweets

Lighter-weight manuscripts were one of the big ideas from the Books in Browsers 2012 talk that Liza and I presented last month. In our particular case, we focused on the combination of voice recognition and wordprocessor-free authoring, but this is really part of a larger trend which Peter Brantley captured: an “explosion of new services, spreading across many niches of story-telling that never before were beneficiaries of Internet technologies.” Although Google Docs are now more than five years old, the publishing community has not fully grasped the disruptive potential of writing in a world that doesn’t worry about files or formats, where sharing is native and painless.

Our talk presented a very simple demonstration of that disruptive potential by stringing together basic tools for making manuscripts (Dragon Dictate, Google Docs), editing (edits in Google Docs), and commenting (Twitter #hashtags). In the hope that these building blocks might spur more interesting work, I’ve released the code behind one piece of our talk, a tiny project-let called Oration.

Oration transforms a series of Google Docs inside a folder into a presentation with static HTML in the center, Google Doc comments on the left, and Tweets matching a hashtag on the right.

Continue reading

Capturing More Authoring: Liza & Keith’s Books in Browsers 2012 session

By Keith Fahlgren and Liza Daly

A major theme of this year’s Books in Browsers was authoring. Liza and Keith have been trying to move our thinking about digital books beyond the low-level plumbing of files and formats, so we focused on what authoring will look like when files are irrelevant, distribution is seamless and transparent, and voice recognition is mainstream. What we (almost!) pulled off was a demonstration of a new mode of writing:

  • creating manuscripts via voice recognition and Google Docs
  • distributed editing via Google Docs and Google Docs comments
  • collecting marginalia via Twitter

You can watch all of this mostly happen, then totally fall apart during the live demo, which we were “fortunate” enough to have recorded and preserved as a video:

(In fact, the software worked but it relied on Github Pages to post the output; it seems that we triggered some kind of traffic throttling system as our code rapidly posted update after update. We sincerely appreciate the audience’s good humor throughout.)

Streaming authoring: a demo

The actually functioning self-generated, self-published, live-annotated transcript of our talk is now available. It’s worth reading separately from this post.

A three-column version of the talk transcript, with specific annotations from Google Docs on the left, the actual captured content in the middle, and tweets on the right

The vision

Our fundamental idea is that a new ecosystem of tools – like Google Docs, social media, or Siri – will obsolete the laborious workflow of modern publishing: wordprocessor followed by emails followed by files followed by conversions followed by FTP followed by static, siloed presentation (followed by silence).

Manuscript

The first stage of the new process will be based on markedly simpler tools for creating the rough manuscript. While first drafts are likely to be created with the familiar interface of hands + keyboard, as Peter Brantley remarked at Books in Browsers, “We need new entry points for authoring.” His comment referred to video; our direction was live narration and speech recognition.

In our demo, we captured the transcription of Liza’s conference presentation with voice recognition in real time. Each time Liza switched slides, the slide content and transcript were automatically pushed via the Google Drive API to a folder in Google Docs.

Editing

Live gatherings present an opportunity for a different mode of editing because of the tremendous inefficiency of wasted, uncaptured thinking. A conference like Books in Browsers is full – literally – of sharp, thoughtful people who travel great distances to focus their brains on a single topic. To harness some of this brainpower to improve the manuscript, we encouraged the attendees (including remote viewers following the live-stream video) to add comments, corrections, and feedback to each Google Doc slide-transcript. The comments are presented in the pane on the left and editors’ corrections were integrated instantaneously.

Commentary

The final task was to capture a layer of marginalia in the pane on the right. We harvested the ambient and ephemeral twitter stream and rooted each tweet to the exact corresponding moment in the presentation itself. While this is the least deliberate form of creation/editing, it actually worked out well. We’re amazed how thoughtful and complex some of the tweets were, composed in the moment.

“What is this thing called?”

While of course we were disappointed that the demo didn’t quite work, enough people engaged with it that we can’t regret trying something a little out there. As we developed the idea, we found a lot of possible directions for further thought that all seemed interesting.

From what comes a book?
Defining what a book is has become a cliché of every publishing conference, but in this case we really did think about it. Considering every formal or informal talk an opportunity for deliberate authoring greatly expands our capability to create preserved narratives and “books.” This could be a conference, a business meeting, a storytelling session among friends and family, or the inside of a classroom.

The classroom, on- and offline
It’s likely that many, if not most, classrooms are going to be hybrid online and offline experiences. Online participation puts local and remote users on the same footing, and asynchronous commentary means that students who require more time to compose their thoughts get the benefit of “classroom participation.” Is copying down the instructor’s lecture the best use of a student’s attention? How can live transcription, plus peer editing, help students who can’t write quickly, are too easily distracted, or have gotten lost in the material?

Voice is coming
This experiment taught us that voice recognition is at a tipping point. Right now, it’s underutilized by software developers, game-makers, and content creators, but speech recognition (and text-to-speech) will soon be a transformative technology now that it’s become commoditized. Paired with inexpensive mobile technology, its potential reach in the developing world alone is staggering. What do “user interface” and “user experience” mean when voice may be an input or an output?

(While we disabled commenting in the Google Docs to preserve the experiment, we’d love to read further thoughts here.)

Google Apps Auth for Internal Django Sites

Usernames and passwords are lame. Everything that makes them lame on the wider web makes them doubly lame on your company intranet. Here’s how we stopped writing password reset forms.

At Safari, we’ve been trying to make it easier to prototype little applications to show to our colleagues. At the start of the year, we also switched the entire company to Google Apps for Business. While this has mostly been a win for our IT staff and coworkers (with some major exceptions), the range of APIs and developer-focused services provided by Google is tremendous. In particular, it is incredibly straightforward to wire together Django, the django-social-auth package, and Google’s OAuth 2.0 identity service to create secure web applications that are only available to our colleagues. And no more password forms.

API Credentials

The first step in using Google to manage your identities for your Django project is getting API credentials from Google:

  1. Sign into your Google Apps account in your browser
  2. Visit https://code.google.com/apis/console#access in the same browser
  3. On the left menu, Create a new Project
  4. To start, you don’t need any Services, so select the API Access tab from the left menu and “Create an OAuth 2.0 client ID…”
  5. Fill out the Client ID form for a “web application” and use localhost:8000 as your hostname

Now that you have API Access, you need to Edit settings for the new “Client ID for web applications” you just created. Specifically, you need to enter new “Authorized Redirect URIs” (one per line):

http://localhost:8000/complete/google-oauth2/
http://{dev server}/complete/google-oauth2/
https://{prod server}/complete/google-oauth2/

These are the URLs that Google will return the user to after they have authenticated. Omit the dev server and prod server if you don’t yet know them.

Next, we’ll use those credentials to set up the django-social-auth package, so keep this page open.

Using django-social-auth

django-social-auth is a great package for getting started quickly, in part because it supports a wide range of services out of the box and also because it has detailed documentation.

After you’ve installed the django-social-auth package inside your virtualenv, you need to follow the basic configuration instructions for your Django project. In your settings.py, make sure 'social_auth' is in the INSTALLED_APPS and then run ./manage.py syncdb to get the new tables that django-social-auth requires.

You will also need to add a few more things to settings.py:

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Django Social Auth Config

AUTHENTICATION_BACKENDS = ( 
    'social_auth.backends.google.GoogleOAuth2Backend',  # putting this 1st means that most users will auth with their Google identity
    'django.contrib.auth.backends.ModelBackend',        # ...but this one means we can still have local admin accounts as a fallback
)

LOGIN_URL          = '/login/google-oauth2/'       
LOGIN_ERROR_URL    = '/login-error/'

SOCIAL_AUTH_RAISE_EXCEPTIONS = False
SOCIAL_AUTH_PROCESS_EXCEPTIONS = 'social_auth.utils.log_exceptions_to_messages'  # ...assuming you like the messages framework

GOOGLE_OAUTH2_CLIENT_ID      = 'yourCLIENTidHERE'  # this is on the credentials web page from above
GOOGLE_OAUTH2_CLIENT_SECRET  = 'YOURsecretHERE'    # this is also on the credentials web page from above
GOOGLE_WHITE_LISTED_DOMAINS = ['your-domain.com']  # this is what actually limits access

SOCIAL_AUTH_COMPLETE_URL_NAME  = 'socialauth_complete'
SOCIAL_AUTH_ASSOCIATE_URL_NAME = 'socialauth_associate_complete'

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The most important line from the above is the GOOGLE_WHITE_LISTED_DOMAINS. It’s this setting that limits access to users inside your organization.
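Conceptually, the check the package performs boils down to comparing the authenticated account’s email domain against that list. Here’s a simplified illustration of the idea (this is not the package’s actual code, and the helper name is made up):

```python
GOOGLE_WHITE_LISTED_DOMAINS = ['your-domain.com']

def is_whitelisted(email, domains=GOOGLE_WHITE_LISTED_DOMAINS):
    """Allow only accounts whose email domain appears in the whitelist."""
    # Split on the last '@' so odd-but-legal addresses still parse.
    domain = email.rsplit('@', 1)[-1].lower()
    return domain in domains

is_whitelisted('keith@your-domain.com')  # allowed: domain is whitelisted
is_whitelisted('someone@gmail.com')      # rejected: personal account
```

Anyone with a Google account can authenticate successfully, but only colleagues on your domain make it past this gate.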

Views and URLs

Now that we’ve got auth from Google, we need to wire it up. For a normal application, you’ll want to create a typical view and template for logging in, errors while logging in, a logout view (to delete your cookies), and then ensure you are using the login_required decorator or other access control. For this blog post, I’ll just sketch these out, based mainly on your main urls.py:

...
from django.contrib.auth.decorators import login_required
from django.contrib.auth.views import logout
from django.views.generic import TemplateView
...

urlpatterns += patterns('', 
    url(r'', include('social_auth.urls')),                                          # we absolutely need these ones

    url(r'^$', TemplateView.as_view(template_name="login.html")),                   # also fairly important
    url(r'^logout/$', logout, {'next_page': '/'}, name='gauth_logout'),             # this one is nice, but not totally required

    url(r'^login-error/$', TemplateView.as_view(template_name="login-error.html")), # if you've set up messages, you could loop through them here

    # Now we can test whether this stuff works
    url(r'^secrets$', login_required(TemplateView.as_view(template_name="secrets.html"))),  

)

If we set this up and then create a secrets.html in our templates directory:

THIS IS A SECRET!

… and a login.html in our templates directory:

<p>Use your work email credentials to sign in to this application: 
  <a href="{% url socialauth_begin 'google-oauth2' %}?next=/secrets">Sign In</a>
</p>

At this point, you should be able to start the Django debug server with ./manage.py runserver.

Trying it out

If you visit http://localhost:8000/secrets with your browser now, you should be able to try it out. First off, you should not see your secrets yet (even if you are logged in). Before you get there, you need to both be logged into a Google account and grant access to let Google give your identity to this new application. After that happens, Google and django-social-auth will double-check that you are legit and pass the GOOGLE_WHITE_LISTED_DOMAINS check. Finally, Google will send you back to the application and in this case you will be redirected to your ?next param.

Do also try it with a personal GMail account to make sure it errors with the login-error.html template. That’s the GOOGLE_WHITE_LISTED_DOMAINS at work.

Gotchas

The most obvious failure is using a browser that is logged into your personal GMail account rather than your Google Apps for Business account. The less obvious failure is that this will only work locally if you are running the Django debug server on port 8000 and putting localhost:8000 into your browser (127.0.0.1 won’t work).

It’s probably also worth adding this to one of your loggers inside your LOGGING:

        'SocialAuth': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },