Posts about technology (old posts, page 3)

In which I discover how Avahi slowed down ssh

This page, at the time of writing, is served by a virtual machine running at linode. Nothing special: just Debian 6. The node is more than just an Apache server: I connect to it via ssh a lot (most often indirectly, with Git).

For as long as I can remember, there’s been a noticeable delay in establishing an ssh connection. I’d always written this off as a side-effect of living so far away from the server; but the delay is present even when I’m in the US, and it doesn’t affect latency nearly as much once the connection is established.

Tonight, I decided to find out why.

First, I started a parallel instance of sshd on my VM, running on an alternate port.

# /usr/sbin/sshd -ddd -p 2222

Then I connected to this port from my local system.

$ ssh -vvv ln1.civilfritz.net -p 2222

I watched the debug output in both terminals, looking for any obvious delay. On the server, that delay was preceded by a particular log entry.

debug3: Trying to reverse map address 109.171.130.234.

The log on the client was less damning: just a reference to checking the default locations for ssh keys.

debug2: key: /Users/janderson/.ssh/id_ecdsa (0x0)

I confirmed that the address referenced by the server was one of the external NAT addresses used by my ISP. Presumably sshd or PAM is trying to determine a name for use in authorization assertions.
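(For the record, sshd can be told to skip this reverse lookup entirely; a one-line sketch, assuming stock OpenSSH:

# /etc/ssh/sshd_config
UseDNS no

But masking the lookup wouldn’t have explained the delay itself.)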

At that point, I set out to understand why there was a delay reverse-mapping the address. Unsurprisingly, KAUST hasn’t populated reverse-dns for the address.

$ time host 109.171.130.234
Host 234.130.171.109.in-addr.arpa. not found: 3(NXDOMAIN)

real    0m0.010s
user    0m0.004s
sys     0m0.005s

That said, confirming that the reverse-dns was unpopulated took only milliseconds: nothing like the multi-second delay I was seeing when establishing an ssh connection.

I had to go… deeper.

# strace -t /usr/sbin/sshd -d -p 2222

Here I invoked the ssh server with a system call trace. I included timestamps in the trace output so I could more easily locate the delay, though I was able to see it in real time.

00:27:44 socket(PF_FILE, SOCK_STREAM, 0) = 4
00:27:44 fcntl64(4, F_GETFD)            = 0
00:27:44 fcntl64(4, F_SETFD, FD_CLOEXEC) = 0
00:27:44 connect(4, {sa_family=AF_FILE, path="/var/run/avahi-daemon/socket"}, 110) = 0
00:27:44 fcntl64(4, F_GETFL)            = 0x2 (flags O_RDWR)
00:27:44 fstat64(4, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
00:27:44 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7698000
00:27:44 _llseek(4, 0, 0xbf88eb18, SEEK_CUR) = -1 ESPIPE (Illegal seek)
00:27:44 write(4, "RESOLVE-ADDRESS 109.171.130.234\n", 32) = 32
00:27:44 read(4, "-15 Timeout reached\n", 4096) = 20
00:27:49 close(4)                       = 0

That’s right: a five-second delay. The horror!

Pedantic troubleshooting aside, the delay is introduced while reading from /var/run/avahi-daemon/socket. Sure enough, I have Avahi installed.

$ dpkg --get-selections '*avahi*'
avahi-daemon                    install
libavahi-client3                install
libavahi-common-data                install
libavahi-common3                install
libavahi-compat-libdnssd1           install
libavahi-core7                  install

Some research that is difficult to reproduce here revealed that Avahi was installed when I installed Mumble. (Mumble might actually be the most-used service on this box, second only to my own use of the box to store my Org files. Some friends of mine use it for voice chat in games.)

I briefly toyed with the idea of tracking down the source of the delay in Avahi; but this is a server in a dedicated, remote datacenter. I can’t imagine any use case for an mDNS resolver in this environment, and I’m certainly not using it now.

# service avahi-daemon stop

Now it takes less than three seconds to establish the connection. Not bad for halfway across the world.

$ time ssh janderson@ln1.civilfritz.net -p 2222 /bin/false

real    0m2.804s
user    0m0.017s
sys     0m0.006s

Better yet, there are no obvious pauses in the debugging output on the server.

Avahi hooks into the system name service using nsswitch.conf, just as you’d expect.

$ grep mdns /etc/nsswitch.conf
hosts:          files mdns4_minimal [NOTFOUND=return] dns mdns4

The trailing mdns4 entry explains the delay: any lookup that falls through regular dns gets handed to Avahi, which is where my five seconds went. Worst case, I could simply edit this config file to remove mDNS; but I wanted to remove it from my system altogether. It should just be “recommended” by mumble-server, anyway.

# aptitude purge avahi-daemon libnss-mdns

With the relevant packages purged from my system, nsswitch.conf is modified (automatically) to match.

$ grep hosts: /etc/nsswitch.conf
hosts:          files dns

Problem solved.

backporting sudo’s #includedir

sudo version 1.7.2 (possibly earlier) adds the ability to fragment the sudoers file into smaller chunks via an #includedir directive. This is a boon for our use of Puppet, as it affords us the ability to configure sudo from multiple modules at the same time, rather than centralizing all of our privilege-escalation information in one module.
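The directive itself is a single line in /etc/sudoers:

#includedir /etc/sudoers.d

sudo then parses each file in the named directory as though its contents appeared in place, skipping file names that contain a ‘.’ or end in ‘~’. On the Puppet side, a module can then drop its own fragment into place: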

class s_gpfs
{
    [...]

    file
    { '/etc/sudoers.d/nrpe-mmfs':
      content => "nrpe ALL = NOPASSWD: /usr/lpp/mmfs/bin/mmgetstate\n",
      owner   => 'root',
      group   => 'root',
      mode    => '0440',
    }

    [...]
}

Here, we allow the nrpe user (part of our automated monitoring infrastructure) to run the GPFS command mmgetstate as root.

Unfortunately, we also have to support systems whose sudo implementation predates this new feature. (cough SLES 10 cough) In order to provide this functionality forward–compatibly, I wrote a Python script that inlines the contents of the files indicated by an #includedir directive, which would otherwise be ignored as a comment by older versions of sudo.

#!/usr/bin/env python
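"""Inline the contents of sudo #includedir directives.

Reads sudoers content from the files named on the command line (or from
stdin) and writes it to stdout, replacing each #includedir directive
with the concatenated contents of the files in the named directory.
(Unlike sudo itself, this makes no attempt to skip file names that
contain a '.' or end in '~'.)
"""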


import sys
import re
import glob
import os
import fileinput


include_directive = re.compile(r'^[ \t]*#includedir[ \t]+(.*)$')


def main ():
    for line in fileinput.input():
        match = include_directive.match(line)
        if match:
            directory = match.group(1)
            sys.stdout.write(inlined_content(directory))
        else:
            sys.stdout.write(line)


def inlined_content (directory):
    files = get_files(directory)
    return ''.join(read_all(files))


def get_files (directory):
    return [f for f in glob.glob(os.path.join(directory, '*'))
            if os.path.isfile(f)]


def read_all (files):
    for file_ in files:
        try:
            yield open(file_).read()
        except IOError:
            yield ''


if __name__ == '__main__':
    main()
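Run against a sudoers file, the script writes the flattened equivalent to stdout. A usage sketch (the script name is whatever you saved it as; here I’ll call it inline-sudoers.py):

# python inline-sudoers.py /etc/sudoers > /etc/sudoers.inlined
# visudo -c -f /etc/sudoers.inlined

The visudo -c -f check validates the syntax of the generated file before it gets deployed anywhere.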

webbrowser | Python Batteries

Today’s reverse spelunking through the Python standard library reveals the webbrowser module. As a library, it allows a Python application to interact with the default web browser on the host OS, opening a url in a new browser window or tab. This functionality is exposed at the shell, as well:

$ python -m webbrowser
Usage: /System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/webbrowser.py [-n | -t] url
    -n: open new window
    -t: open new tab

You can use the webbrowser module like this to open web pages from any shell or shell script.

$ python -m webbrowser -t http://www.civilfritz.net/
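The same thing is available from the library side, of course; a minimal sketch of the call underlying the -t option:

$ python -c 'import webbrowser; webbrowser.open_new_tab("http://www.civilfritz.net/")'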

The module does what it can to do the right thing given your environment: it will open a browser on X11, OS X, or Windows, and will even fall back to a text–mode browser if no graphical browser is available.

I noticed (on OS X, anyway) that if you don’t specify a protocol (e.g., http) you get an error.

$ python -m webbrowser -n www.civilfritz.net
0:39: execution error: An error of type -2110 has occurred. (-2110)

The browser still opened successfully, though.

You can find more information on the webbrowser module at the Python website.

Getting things done after Getting Things Done

I first looked into Getting Things Done my first year out of university. Though I suppose some sense of personal organization and time management would have been nice during the seventeen years of study prior, it was my new state as an employee that sent me searching for something more than excuses and a general habit of procrastination.

I'm the kind of person who visits a bookstore just to hang out. (If I ever conquer my own commercialism I'll hopefully transform into a library patron.) It was on one of my many trips to the local Barnes & Noble that I saw a copy of David Allen's book. I had heard of the book before, but the simple cover, pleasing proportions, and unassuming title shone through my initial cynicism. I picked up the paperback and made short work of it.

Getting Things Done, by David Allen

Soon my life was awash in contexts… lists… habits… projects… actions. I had a little paper notebook that contained everything I needed to not have to remember; I had 43 folders in my desk and in my reader, and even more manila in a box full of files; I had another box full of "stuff" that kept the stuff out of sight; and I was trying (and mostly failing) to have weekly reviews of what I had done, what I was doing, and what I needed to do next.

This actually worked pretty well. I spent less time worrying about forgetting things, because if I needed to remember it, I just wrote it down. I spent less time trying to spin up to productivity because I already had "next actions" for all of my "projects."

I even had an empty email inbox.

…but all of this new headroom gave me the freedom to notice seams between David Allen's proposed system and my own requirements. The first was contexts: in a corporate office, simple things like "@phone" or "@desk" or "@home" effectively partition a task space into appreciable chunks; but I work in computers. The vast majority of my tasks are "@computer" or, at the very most, "@Internet." That doesn't do much to calm the mind when you're staring at one long to–do list.

I liked the ubiquity and tactility of paper, but the medium has its faults. Completed tasks clutter up the page, and to clear them you have to transcribe any remaining items to a new page. Reorganizing items into different contexts brings the same problem. There's no way to archive (let alone audit) task history without even more transcription, since tasks for different projects are physically intermingled. Most damningly, separating tasks from the notes that go with them is both a mental and physical context–switch that plagues every non-trivial task.

For its faults, GTD had actually taught me a lot of really good lessons:

  1. The brain is way better at thinking than remembering.

  2. The less you have to remember, the more you can think.

  3. The more you need to think, the less you want to think.

  4. Don't waste time making the same decision twice.

  5. Lists add a sense of progression to otherwise intangible work.

  6. There is too much to do to consider all at once.

All existing GTD software was powerless to placate a sysadmin's sensibilities. It's all point–and–clicky, high–friction, and, worst case, web-based. I did the only thing I could do: I moved all my lists to text files, added one project ("write command-line GTD software"), and added one action ("brainstorm requirements for GTD software").

That project didn't go so well; but it's ok, because since then I've migrated to Emacs Org-Mode.

The Org–Mode Unicorn

Rather than being a software environment that I had to re–factor my workflow into, Org–Mode provides a rich set of (extensible and seemingly–infinitely–configurable) functions to manipulate my lists–cum–text–files as I see fit. All of that on top of a mature lisp runtime that masquerades as a text editor (albeit one with which I had no experience).

First off, I configured Org–Mode with some familiar list item types:

(setq org-todo-keywords
      '((type "ACTION(a!)"            "|" "DONE(d!)")
        (type "PROJECT(p!)"           "|" "DONE(d!)")
        (type "WAITING(w!)"           "|" "DONE(d!)")
        (type "SOMEDAY(s)" "MAYBE(m)" "|")
        (type                         "|" "DELEGATED(g@)" "CANCELLED(x@)")))

…then configured some simple logging. (In the keyword definitions above, the letter is a quick-selection key, ! logs a timestamp on entering the state, and @ prompts for a note.)

(setq org-log-into-drawer t)
(setq org-log-reschedule 'note)
(setq org-log-redeadline t)
(setq org-log-done 'time)

Suddenly my lists grew automatic logging in the form of the LOGBOOK drawer:

* PROJECT make a new first post on civilfritz
:LOGBOOK:
- State "PROJECT"    from ""           [2011-08-15 Mon 21:04]
:END:
** DONE figure out the post sorting problem
CLOSED: [2011-08-16 Tue 20:35]
:LOGBOOK:
- State "DONE"       from "PROJECT"    [2011-08-16 Tue 20:35]
- State "PROJECT"    from "ACTION"     [2011-08-16 Tue 08:04]
- State "ACTION"     from ""           [2011-08-15 Mon 22:31]
:END:
** ACTION write about getting things done after getting things done
SCHEDULED: <2011-08-16 Tue>
:LOGBOOK:
- State "ACTION"     from ""           [2011-08-16 Tue 21:09]
:END:

Of course, that's a lot of clutter, too; but that's just what's physically stored in the file. Org–Mode provides a flexible view of the outline. For example:

* PROJECT make a new first post on civilfritz
  :LOGBOOK:...
  * DONE figure out the post sorting problem...
  * ACTION write about getting things done after getting things done
    SCHEDULED: <2011-08-16 Tue>
    :LOGBOOK:...

That's much easier to look at. In Emacs, color is used to make the content even clearer.

As simple as these little bits of text are, the triviality of their automation means that they can be parsed by other parts of Org–Mode. Most notably, by the agenda.

(require 'find-lisp)
(defun org-find-agenda-files ()
  (find-lisp-find-files "~/agenda" "\\.org$"))
(setq org-agenda-files (org-find-agenda-files))
(setq org-agenda-start-on-weekday 6)
(setq org-agenda-skip-scheduled-if-done t)
(setq org-agenda-skip-deadline-if-done t)
(setq org-agenda-custom-commands
      '(("S" "Unscheduled actions" tags-todo "TODO=\"ACTION\"+SCHEDULED=\"\"")
        ("D" "Undefined deadlines" tags-todo "TODO=\"WAITING\"+DEADLINE=\"\"")))
(setq org-stuck-projects
      '("TODO=\"PROJECT\""
        ("ACTION" "WAITING")
        nil
        nil))
Org-Mode agenda view

The agenda serves the same function as the context "next action" lists from GTD; except where contexts are static, the agenda is dynamic, built on-demand and filtered by arbitrary tags (which replace contexts themselves). Further, the "stuck projects," "unscheduled actions," and "undefined deadlines" lists make it easy to find orphaned tasks (now a part of my weekly review).

* ACTION [#A] weekly review                                       :work:home:
  SCHEDULED: <2011-08-20 Sat ++1w>
  :LOGBOOK:...
  :PROPERTIES:...
  - Review stuck projects (C-c a #)
  - Review unscheduled tasks to be done this week. (C-c a S)
  - Review waiting items with no specified deadlines. (C-c a D)
  - Review someday/maybe items. (C-c a t 5 r, C-c a t 6 r)
  - Review the past week's accomplishments. (C-c a a l v w b)
  - Review the upcoming week's actions. (C-c a a v w)

All of the historical logbook data is pulled together in the global logbook view, which I can now inspect separately (again, as part of my weekly review).

Org–Mode logbook view

I use Org–Mode to record virtually everything that I do or need to do, either at work or at home. It really has become my post-GTD, and I have yet to find a requirement that surpasses it. On the contrary, I often find new solutions just as streamlining reveals deeper bottlenecks.

I'll post more of my .emacs and workflow in the future, I'm sure. Until then, feel free to send any questions or comments my way.

bash ‘local’ directive eats status codes

Did you know that the local directive in bash returns a status code? I didn’t.

#!/bin/bash

function return_nonzero
{
    return 1
}

function main
{
    v=$(return_nonzero)
    echo $?

    local v=$(return_nonzero)
    echo $?

    local v
    v=$(return_nonzero)
    echo $?
}

main

At runtime:

$ bash local-return-code.sh
1
0
1

The explanation is simple, once you see it: local is itself a command, and its exit status (zero, so long as the declaration succeeds) replaces that of the command substitution on the same line. Declaring the variable first and assigning it separately, as in the third case, preserves the status you actually care about. At least some bugs teach you something new.

birthdays | leaving Facebook

I’m in the process of closing my Facebook account. I never really used it: I’m just not that sold on the idea of a fake social network that rules your life. There are three use cases I do need to cover, though, by porting the relevant data (or service) elsewhere: the birthday calendar, the instant messaging service, and the photos that are already there.

The birthday calendar

I don’t want to lose track of the birthdays that people have published on Facebook. Going forward, I’ll have to maintain the calendar myself, and that’s fine; but I need to port those birthdays to something more standardized. (That is, likely a regular calendar with iCalendar support.)

Much to my surprise, Facebook makes an iCalendar file available explicitly for birthdays. It’s under Events→Birthdays→Export Birthdays. This link provides a webcal: url, which Wikipedia tells me is an unofficial url scheme for serving iCalendar files. In OS X 10.6, this url was automatically handled by iCal, which wasn’t precisely what I wanted; so I just addressed the same path over http: and got a standard .ics file in my Downloads.

I already have a ‘Birthdays’ calendar in Google Calendar (which I’ll probably be trying to move away from at some point, too) so I just imported this .ics file and merged it into the existing calendar. There are definitely birthdays in there that I don’t really care about (sorry, peoples!), but I can filter those down as they come up.

I thought I understood Unix filesystem permissions

I’ve been using ’nix operating systems (mostly Linux distributions) since my freshman year of college. I’m mostly self-taught, fair enough, but there was an appreciable quantity of ’nix in my coursework as well. I’ve worked in HPC since 2006. I use Apple OS X as a primary desktop OS because it’s a BSD that I don’t have to get working myself.

With this in mind, imagine my embarrassment at discovering, just today, a fundamental misunderstanding of ’nix filesystem permissions.

We, at the KAUST supercomputing laboratory, use a central LDAP directory for authentication and authorization. Our predominantly Linux hosts use a combination (though not necessarily all at the same time) of nss, pam, and sssd to communicate with this directory.

Way back in time immemorial, a coworker designed a series of scripts for doing basic CRUD operations; e.g., creating an account. This solution not only creates a posixAccount object, but also creates a posixGroup object. This group has a cn equal to the posixAccount’s uid, and a gidNumber equal to the posixAccount’s uidNumber. This posixGroup is used as the primary group for that account: its gidNumber is stored in the posixAccount’s gidNumber. The intent here is both to simplify management of “project groups”, such that none of them is used as a user’s primary group, and to protect user files from other accounts on the system.
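For concreteness, each account/group pair looked something like this (a hypothetical user and base DN; the attributes are standard posixAccount and posixGroup schema):

dn: uid=jdoe,ou=People,dc=example,dc=com
objectClass: account
objectClass: posixAccount
cn: Jane Doe
uid: jdoe
uidNumber: 20001
gidNumber: 20001
homeDirectory: /home/jdoe

dn: cn=jdoe,ou=Groups,dc=example,dc=com
objectClass: posixGroup
cn: jdoe
gidNumber: 20001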

For reasons based mostly in my own compulsive desire for neatness, we decided to eliminate these user-private groups in favor of the universal primary gid “100” (or “users”). The CRUD script was updated accordingly, and the long task of updating the existing filesystem began:

find /gpfs \( ${long_series_of_groups} \) \( \
    \( ! -type l -exec chmod g-rwx {} \; \) , \
    -exec chown -h :100 {} \; \)

The intent here is to chown any file currently owned by a user-private group to the new users group, and to chmod away group rights from such files (since effectively no group rights were granted anyway, given ownership by a single-member user-private group).

I was met with surprise when access was later denied to such files.

Apparently, ’nix filesystem permissions are disjoint. Access by the file owner is mediated only by the owner bits; access by group members (other than the owner) is defined only by the group bits; and the other bits apply only to users that fall in neither of the first two categories. This means that a file like testfile:

-rw----r-- 1 root users 0 2011-07-03 07:38 testfile

is not readable by members of the users group, even though the r bit is set in the lowest-order (“other”) position.
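A quick demonstration of the surprise (alice is a hypothetical member of the users group; mode 0604 reproduces the testfile above):

# touch testfile && chown root:users testfile && chmod 0604 testfile
# su alice -c 'cat testfile'
cat: testfile: Permission denied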

It seems that, rather than chmod g-rwx, I should have copied group permissions from the “others” access rights.

edit:

A coworker has advised that I post the find command that I’m using to fix this. Apparently using multiple -execs is convoluted or something.

find $fs -group users ! -type l \( \
    \( -perm -o+r -exec chmod g+r {} \; \) , \
    \( -perm -o+w -exec chmod g+w {} \; \) , \
    \( -perm -o+x -exec chmod g+x {} \; \) \)

migration to linode

I’m migrating to linode. I’m getting

  • Debian Squeeze

  • twice as much memory for the same price

  • IPv6

  • to avoid the forced migration to rackspace

I’ve already moved dns and mumble services, and that went smoothly enough. I’m afraid of moving http for some reason, so I’m just going to do it and fix problems when they arise.

Let me know if something breaks.

edit: Well, that seems to have gone well. Authentication even still works.

The only thing left is mail.

merging two disjoint git repositories

I have this persistent desire to play Pathfinder with some friends. A while back I set up a separate ikiwiki repository for it, but decided today that I wanted to handle it as just a section of this main wiki.

I considered just copying what little content I already had from one wiki to the other; but a little reflection led me to try to just merge the two repositories.

First, I just moved everything in the RPG wiki to the /rpg directory in that repository. Then I added the rpg repository as a remote for the primary repository. git fetch rpg && git merge rpg/master. Done.
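Spelled out, the whole merge was roughly this (the repository paths are illustrative):

$ cd ~/src/civilfritz-wiki
$ git remote add rpg ~/src/rpg-wiki
$ git fetch rpg
$ git merge rpg/master

Since the two histories touch no common paths after the move into /rpg, the merge comes out trivially clean.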

Git is awesome.

LaTeX BEAMER

I just made my first whole presentation with the LaTeX BEAMER class. It wasn’t terribly painful, but wasn’t all that intuitive, either.

At least now I’m in control of the data.