Thursday, February 12, 2015

In a BIND

The Domain Name System (DNS) is one of the foundations of the internet. It resolves symbolic names to numeric addresses, letting computers know that "www.google.com" resolves to one of half a dozen addresses beginning 75.125.21; it resolves aliases to their canonical names, translating "www.ibm.com" to "e2898.x.akamaiedge.net"; it tells the world where to send email addressed to "yahoo.com" or "rcn.net"; it knows that such names have a finite time to live (TTL) and will check again when that time is over. Without DNS, the internet, and for that matter corporate networks of any size, would be unworkable. The software that serves up these names is referred to as BIND on UNIX and Linux machines. At most times in most places it runs properly and nobody but sysadmins gives it more thought than anybody but plumbers and electricians give to water and electricity.
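The time-to-live idea can be illustrated with a toy cache. This is only a sketch, not how BIND does it: the class name TtlCache and the lookup function it wraps are made up for the illustration, and the lookup is assumed to return an address together with a TTL in seconds.

```python
import time


class TtlCache:
    """Toy cache that forgets an answer once its time to live has passed."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup    # hypothetical function: name -> (address, ttl_seconds)
        self._clock = clock      # injectable clock, so the sketch is easy to test
        self._entries = {}       # name -> (address, expiry time)

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]     # still fresh: answer from the cache
        # Expired or never seen: ask the real source again.
        address, ttl = self._lookup(name)
        self._entries[name] = (address, now + ttl)
        return address
```

Within the TTL, repeated calls to resolve() for the same name are answered from the cache; once the TTL has run out, the next call goes back to the lookup function.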

On Tuesday of this week, name resolution seemed to be slow and occasionally unreliable. When somebody from the Help Desk mentioned this to me, I logged into the server that runs the named daemon, and sent it a message (SIGHUP) to make it reread its configuration. This had an effect: the system quit answering queries. The program did write out many messages to the effect that it could not find addresses for particular root servers. A BIND that cannot retrieve information from the root servers is of no use for resolving addresses outside its own network, and, I discovered, may be so busy trying and failing that it can do little else. Yet there was no reason the program should have had trouble finding the root servers--the root hints file was fine.
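For readers who have not met the convention, here is a minimal sketch of how a long-running process can treat SIGHUP as "reread your configuration". It is not named's code: load_config and the config dictionary are stand-ins invented for the example, and it assumes a UNIX-like system, since SIGHUP does not exist on Windows.

```python
import os
import signal

config = {"loads": 0}


def load_config():
    """Stand-in for rereading a configuration file (hypothetical)."""
    config["loads"] += 1


def on_hangup(signum, frame):
    # Called when the process receives SIGHUP.
    load_config()


load_config()                              # initial load at startup
signal.signal(signal.SIGHUP, on_hangup)    # register the reload handler

# An administrator would normally run "kill -HUP <pid>" from a shell;
# here the process signals itself so the sketch is self-contained.
os.kill(os.getpid(), signal.SIGHUP)
```

The point of the convention is that the daemon keeps running, with the same process ID, while picking up its new settings.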

The failure of DNS quickly stops the work of many parts of a network. Email will not go out, and users cannot connect to web sites. I was one of those users, so the usual recourse of checking on Google for the sense of apparently senseless error messages didn't work. After repeated restarts of the named daemon and the caching daemon, we got back to a fairly stable condition, where the name service would respond correctly, if not on the first try, then on the second.

That lasted until I tried another restart Wednesday morning. The daemon reported that it could not find the root servers. Restarting the named and caching daemons was not working. A SIGINT to the named daemon produced a dump of the state in named_dump.db, but that didn't tell me much. Eventually a BSD-oriented blog suggested that the forwarders in my configuration file could be the problem. I commented them out, restarted, and life returned to something like normal.

Clearly I need to be better at reading named_dump.db, and I need to know more about the work of the forwarders in the configuration.

I did notice a few things in named_dump.db that I don't ordinarily think about. There are many curious domain names out there, for one. A few, copied at random, are

  • mundeleinparks.org
  • dayofthegirl.org
  • rantlifestyle.org
  • suspendedforspamandabuse.com
with the last perhaps being my favorite. And I noticed how many domains I found for Amazon Web Services, and other such providers. I vaguely knew that such providers serve many sites, but it looked to me as if about half of the addresses our name server knew about had "aws" somewhere in the second-level domain.
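That impression could be checked with a rough count along the following lines. The sketch assumes the domain names have already been pulled out of the dump into a plain list; it does not parse named_dump.db itself, and it naively treats the label just left of the top-level domain as the second-level domain, which is wrong for names under suffixes such as "co.uk". The function names are made up for the example.

```python
def second_level_label(name):
    """Return the label just left of the top-level domain, or '' if none."""
    labels = name.rstrip(".").lower().split(".")
    return labels[-2] if len(labels) >= 2 else ""


def share_with_aws(names):
    """Fraction of names whose second-level label contains 'aws'."""
    names = list(names)
    if not names:
        return 0.0
    hits = sum(1 for n in names if "aws" in second_level_label(n))
    return hits / len(names)
```

Calling share_with_aws() on the extracted names gives a fraction between 0 and 1 that can be compared against the "about half" guess.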


3 comments:

  1. I wrote a comment that seems to have vanished, to the effect that 'daemon' and 'sigh up' are rather lovely and suggest that programmers have poetic souls

    1. I had to search for "sigh up". On UNIX and Linux systems there are a number of "signals", with names such as "SIGINT" (interrupt), "SIGHUP" (hang up), and "SIGKILL".

      But yes, whoever came up with "daemon" for such processes had a touch of the poet.
