Thu, 03 Jul 2008

// Diomidis Spinellis's book "Code Reading: An Open Source
// Perspective" has this messy code in chapter 2
// > http://www.spinellis.gr/codereading/spinellisch02.pdf
// from netbsdsrc/games/worms/worms.c:419:

op = &(!x ? (!y ? upleft : (y == bottom ? lowleft : left)) :
(x == last ? (!y ? upright : (y == bottom ? lowright : right)) :
(!y ? upper : (y == bottom ? lower : normal))))[w->orientation];

// He rewrites it as follows, the idea being to improve its
// readability:

struct options *locations[3][3] = {
     {upleft, upper, upright},
     {left,    normal, right},
     {lowleft, lower, lowright},
};
int xlocation, ylocation;
// Determine x offset
if (x == 0)
     xlocation = 0;
else if (x ==  last)
     xlocation = 2;
else
     xlocation = 1;
// Determine y offset
if (y == 0)
     ylocation = 0;
else if (y ==  bottom)
     ylocation = 2;
else
     ylocation = 1;
op = &(locations[ylocation][xlocation])[w->orientation];

// In my opinion, the above is actually worse than the original, as a
// result of being so much longer.  However, then Spinellis suggests
// this version, which he credits to Guy Steele:

op =
  &(        !y ? (!x ?  upleft : x!=last ?  upper :  upright ) :
     y!=bottom ? (!x ?    left : x!=last ? normal :    right ) :
                 (!x ? lowleft : x!=last ?  lower : lowright )
   )[w->orientation];


// I agree that the code needs rewriting, but inspired by Scheme, I
// would do it without magic numbers, without nine separate variables
// for the struct options pointers, and with conditions and their
// consequents on the same line:

struct bar { struct options *left, *middle, *right } upper, middle, lower;
struct bar *a = (y == 0      ? &upper :
                 y == bottom ? &lower :
                               &middle);
struct options *b = (x == 0    ? a->left  :
                     x == last ? a->right :
                                 a->middle);
op = &b[w->orientation];

Mon, 30 Jun 2008

I use a sort of log-structured filesystem for my notebooks.  I fill
the notebooks in chronological order (more or less) from the second
page to the last page.  (The first page is left blank at first.)
Everything is under some heading; the current heading is repeated at
the top of every page, with the date, but sometimes there are several
headings on a single page.  The headings are underlined so they're
easy to see looking at the page.

So I can find things by paging through the recent pages and looking at
the headings.  When that gets to be too much, I append a new "table of
previous contents" section, under a heading just like everything else;
it lists all the headings, with dates, since the last "table of
previous contents".  The first page contains a list of tables of
previous contents, with their dates, so that I can find them
relatively quickly.  This allows me to find my notes more quickly by
reading through the few pages that are full of tables of previous
contents, rather than leafing through all the pages in the book
looking for headings.

If I were a disk, which I'm not, this would be a reasonably efficient
scheme for writes: regardless of how much stuff I have to write, I
could append it all in a single write to the end of the
currently-written data, possibly including a new table of previous
contents, then update the "superblock" on the first page with a
pointer to the new table.  So writing any amount of data less than a
notebookfull requires a seek to the end of the previous ToC, possibly
a read of data following it, a write of the new data, and possibly a
second seek and a second write to the superblock.  Two seeks.  Finding
something in a notebook with three ToCs requires at most four seeks:
one to each ToC, then another one to the data; if it's not listed in
any ToC, you can sequentially scan for it after the last ToC.

With this scheme, there's a tradeoff (for either humans or for disks)
between the amount of sequential scanning you may have to do (due to
still-unrubricated items) and the number of ToCs you may have to seek
to and read.

Beatrice pointed out the other day that it would be easier for a human
to write the notes sequentially from the beginning of the book, while
writing the ToC entries sequentially from the end of the book.  This
way, all the ToC entries are in a single sequential chunk, the
tradeoff between maximum sequential scan length and ToC fragmentation
is eliminated, and writing still requires only two seeks.

Of course she is correct, and this might be a reasonable strategy for
log-structured filesystems too, although there are usually more levels
of indirection: from superblock, through various levels of inodes and
directories, to the actual file extents on disk.  You could probably
do a reasonable job by putting a B-tree of pathnames at a fixed
location of the disk, and putting the inodes and data extents
contiguously somewhere else.  `/var/cache/locate/locatedb` is a
reasonable approximation of the contents of this B-tree; on my current
laptop, it's 5.3MB, indexing 95GB of files using 596 662 inodes
(i.e. 596 662 files, although `sudo locate / | wc -l` only finds 
494 488 files.).

Repacking a 5-20MB B-tree when it got too large and loose would take a
significant fraction of a second on a modern disk, but on my laptop
would take perhaps 10-20 seconds, due to the slowness of on-CPU disk
encryption.  So it might be better to defragment the tree incrementally.

Thu, 26 Jun 2008

(This is available in HTML at
<http://canonical.org/~kragen/html-succinct.html>.)

HTML is more succinct for things in its intended domain than
S-expressions, but still has better error-detection and correction
capabilities.

S-expression fans like to say that HTML, SGML, and XML are just
bastardized S-expression languages.  SGML partisans often respond that
matching end-tags allow for better error-reporting and correction.
But for typical HTML content --- mostly running text with a little bit
of interspersed markup --- S-expressions are not only harder to
correct, but also more verbose.

Consider this partial paragraph from the Ur-Scheme web page
<http://pobox.com/~kragen/sw/urscheme>:

    <li><b>Reasonably fast.</b> It <b>generates reasonably fast
    code</b> &mdash; when compiled with itself, it runs 2½ times
    faster (in user CPU time) than when it's compiled with <a
    href="http://www.call-with-current-continuation.org/"
    >Chicken</a>, 1½ times faster than when it's compiled with...</li>

Now, in traditional HTML, I could have left out the quotes around the
URL and the ending `</li>` tag.  Consider this S-expression version:

    (li (b "Reasonably fast.") " It " (b "generates reasonably fast
    code") " " mdash " when compiled with itself, it runs 2½ times
    faster (in user CPU time) than when it's compiled with "
    (a :href "http://www.call-with-current-continuation.org/"
    "Chicken") ", 1½ times faster than when it's compiled with...")

Most of the markup constructs take up more characters here:

    LI: '<li></li>'    (end tag could be omitted in traditional HTML)
        '(li "")'
    B:  '<b></b>'
        '(b "") '
    B:  '<b></b>'      (the second one)
        '" (b "") "'
    --- '&mdash;'
        '" mdash "'
    A:  '<a href=""></a>'  (quotes could traditionally be omitted)
        '" (a :href "" "") "'

If you look at this in a fixed-width font, you'll see that the number
of markup characters is detectably smaller in the S-expression
serialization of the structure, with the exception of the first two.
I maintain that this is typical of the bulk of HTML, especially if you
weight it by how often people write it instead of how often it gets
sent to browsers.  You can come up with examples where that is not the
case:

    <html><head> <title>...</title>
                 <link rel="stylesheet" href="../../style.css" />
                 <meta http-equiv="Content-Type" content="..." />
                 <style type="text/css">...</style></head>...</html>

vs.

    (html (head (title "...") (link :rel "stylesheet" :href "../../style.css")
                (meta :http-equiv "Content-Type" :content "...")
                (style :type "text/css" "...")))

but those structure-heavy, text-light examples with long-winded tag
names are relatively rare for people to read and write.

Of course, the cost of terser syntax is often that errors are hard to
diagnose.  Ada's `end loop`, `end if`, `end record`, and so on mean
that if you leave out an `end` delimiter, the compiler will usually be
able to tell you which one you left out.  At the opposite end of the
spectrum, S-expression languages in which all the various kinds of
`end` are spelled as `)` can only tell you when they get to the end of
the program or to something that doesn't make sense in the current
context.

> This is not a phenomenon limited to end-delimiters.  In
> programming languages, there are many other examples of verbosity
> that helps to diagnose errors; for example, explicit type
> declarations, mandatory delimiter characters (in cases where the
> syntax would be no more ambiguous if they were removed from the
> grammar), sequences of single-line comments, and the conventional
> parenthesization of the arguments of fixed-arity functions ("ratio
> square sin x square sin y" is perfectly unambiguous, after all,
> and Forth, PostScript, Logo, and REBOL use more or less that
> syntax.).

However, in the case of HTML, the terser syntax does not make errors
harder to diagnose; in fact, the HTML syntax permits better
error-detection and even error-correction, because all of the end-tags
are explicitly labeled.  (It differs from SGML in this regard; in
SGML, you can write `<li><b/Reasonably fast./ It ...</>` and eliminate
the redundant end-tags altogether.)

Mon, 23 Jun 2008

(Available in HTML at <http://canonical.org/~kragen/wood-pda.html>.)

Polished-Stone Handheld Computers
---------------------------------

So I've been thinking about making a handheld computer with the look
and feel (shininess, irregularity, weight, seamlessness) of a polished
semiprecious stone.

One way to do this would be to embed the electronics in polyester
resin poured into a mold, with an embedded induction coil for
charging, some embedded lead shot for weight, and a dark, but not
quite opaque, surface layer to hide the interior except for when it
was glowing.  Input would probably be piezoelectric, localizing
surface taps or using rhythm.  (See earlier kragen-tol post [magic
boxes and secret knocks][magic].)  Output could be through embedded
LEDs shining through the surface layer or through audio, especially if
you held it against a window.

(How much lead shot would you need?  Lead has a density of 11.3g/cc,
against quartz's 2.6g/cc and [the polyester resin's 1.11g/cc][EP4117],
so only 14.6% of the volume would need to be lead to equal quartz's
density.)

It would be shockproof, waterproof, crushproof, not particularly prone
to damage from ESD, and it would feel really good in your hand.  Some
hard silicone around the outside might improve its thermal
conductivity.  (There are hard silicone resins with high thermal
conductivity, right?)

Beatrice suggested that you could use an actual polished semiprecious
stone instead; cut out a circle from one side, drill out a cavity
underneath, put the electronics inside, pot them with epoxy, replace
the circle, wipe off the excess epoxy, and then polish the result.

Wood-Block Handheld Computers
-----------------------------

Another "everyday object" kind of electronic device case: a block of
wood.  Some time ago I saw a web page about a wooden clock.  It seems
to be widely available now; for example,
<http://svp.co.uk/products-solo.php?pid=4989&ref=froogle&ci_src=18615224&ci_sku=8028>
advertises it for £93.99.  It explains:

> A totally minimal block of wood with digital numbers floating
> across the surface. These clever clocks have a very thin layer of
> real maple wood veneer that permits the LEDs to shine through.
> 
> Each one is slightly different due to the natural variation in
> wood grain.
> 
> Dimensions: 208 x 90 x 90mm
> Weight: 1.2kg

Another page says:

> TO:CA 'wood' LED clock designed by kouji iwasaki in 2002. this
> 'wooden' LED clock won top prize at the asahikawa international
> design fair in 2002.

A third page says they're actually made of MDF under the maple veneer,
and has a photograph of the back that seems to confirm this, and a
fourth page says the manufacturer is "Takumi of Japan".

I think a handheld computer that looks like a block of wood would be
pretty nice too.  Something the size of a business card (3.5" x 2", or
89 x 51 mm) but fairly thick (say, 15mm), with veneer on at least one
side.  The resolution of the display would be limited by the light
blurring on the way through the translucent veneer; each spot of light
would have a radius on the order of the thickness of the veneer.
[Veneers are typically 0.8mm][veneers] but are available as thin as
0.3mm.

If spaced 1.6mm apart, you could get almost 1800 pixels in a
rectangular array into the business-card size.  You could do a little
better with a hexagonal array: if the distance from the center of a
regular hexagon to the center of one of its sides is r, then the
distance to one of its corners is about 1.15r, which is the same as
the length of each side; and its area is 1½ * 1.15r * 2r = 3.45r²,
which is 14% smaller than a square circumscribed around the same size
of circle.  In the case of r=0.8mm, you'd have 2.2mm² per pixel
instead of 2.56, so you'd get about 2000 pixels.  But then you'd have
to deal with the hexagonal array in your software.

1800 pixels is enough for about 45 letters in a traditional 5x8
single-bit-deep font, which is pretty cramped; my cheap two-year-old
US$30 cellphone has something like 65 letters'worth of space on its
display.  But it's enough to be useful.  It's a lot more than any of
the under-US-$10 devices I picked up for the "[cheap electronics
dissection project][electronics]" in 2006, and they are useful for
some things.

I don't know how easy or hard it is to populate a PC board with
1800-2000 LEDs.  I know I wouldn't want to do it by hand.

You could hollow out the middle of a block of wood with just a drill
and jigsaw; a keyhole saw or wire saw might work in place of the
jigsaw.  Cutting all the way through it would be a lot easier than
just chiseling out a hollow in one side of the block; then you'd need
to put veneer on both sides instead of just one.  To add strength and
keep it from sounding hollow, you'd probably want to pot the whole
interior with epoxy or something.

You could have a couple of finishing nails visible on one end if you
wanted to charge it through actual electrical contacts rather than
with induction.

Other Everyday Items
--------------------

You could also embed handheld computers in the following: oyster
shells; bricks; pens (I suggested this previously on kragen-tol);
ceramic tiles; beanbags, pillows, and stuffed animals (like the Chumby
and the Furby).

References
----------

[EP4117]: http://www.eagerplastics.com/4117.htm "Eager Plastics EP4117"

Eager Plastics, aka Eager Polymers, has an "[EP4117][] General Purpose
Polyester Laminating Resin" with a density of 1.11 g/cc.

[magic]: http://lists.canonical.org/pipermail/kragen-tol/2002-April/000700.html

In April 2002, I posted "[magic boxes and secret knocks][magic]" to
kragen-tol. 

[veneers]: http://www.diyinfo.org/wiki/Using_Veneers "an article on DIYinfo.org"

The article [Using Veneers][veneers] describes the different kinds of
wood veneers available today.

[electronics]: http://courageous.murch-sitaker.org/~kragen/electronics/

In 2006 I wrote a web page about my "[cheap electronics dissection
project][electronics]", where I bought a bunch of cheap electronics
and looked inside them.

Thu, 19 Jun 2008

So I just upgraded to Emacs 22 in April, despite Debian Etch not
supporting it.  It solves several of my daily annoyances with Emacs
21:
- It recognizes "Password: " as a password prompt, so ssh and sudo get
  the benefit of me not having to manually type M-x send-invisible.
- I can paste Unicode text into it from a web browser, including
  asymmetrical quotes, real apostrophes, and em dashes, and have it
  save them to a UTF-8 file without fuss.  (Although it still displays
  the quotes in an obnoxious double-width fashion until the file has
  been saved and reloaded.)
- TRAMP works out of the box.
- The documentation is included, unlike in Debian.  (There's a
  licensing dispute over whether the GNU Free Documentation License is
  free enough to satisfy the Debian Free Software Definition.)
- comment-region now asks what comment syntax to use if it doesn't
  know.
- When I run e.g. "darcs" by itself in shell-mode, occasionally Emacs
  used to take quite a while to display its output usage message,
  because it was reading it one character at a time.  This has been
  fixed.

I also anticipate joy using MuMaMo, but I haven't actually tried that
yet.

There are some changelog/news entries that sounded pretty good:
    ...if you set `set-mark-command-repeat-pop' to t.  I.e. C-u C-SPC           
    C-SPC C-SPC ... cycles through the mark ring.  Use C-u C-u C-SPC
    to set the mark immediately after a jump.  [Haven't tried this yet.]

    ...M-% typed in isearch mode invokes `query-replace' or
    `query-replace-regexp' (depending on search mode) with the current
    search string used as the string to replace.  [Haven't tried this
    yet.]

    You can now customize the use of window fringes.  To control this
    for all frames, use M-x fringe-mode or the Show/Hide submenu
    of... [so now I can have two 80-column windows on my screen at
    once, which is awesome]

    A new minor mode `next-error-follow-minor-mode' ... In this mode,
    cursor motion in the buffer causes automatic display in another
    window of the corresponding matches, compilation errors,
    etc. [Haven't tried this.]

    The new command `multi-occur' is just like `occur', except it can
    search multiple buffers. [Useful. Also I didn't know about
    `occur`.]

    The grep commands provide highlighting support. Hits are fontified
    in green, and hits in binary files in orange.  Grep buffers can be
    saved and automatically revisited.  [This is in fact extremely
    awesome.]

    In addition, when ending or calling a macro with C-x e, the macro
    can be repeated immediately by typing just the `e'. [This sounds
    nice, but the F3 and F4 macro keybindings are better.]

    The new package longlines.el provides ... "soft word wrap" [like
    actual word processors have since the 1970s.  Turns out to be
    fantastic.]

    SES mode (ses-mode) is a new major mode for creating and editing
    spreadsheet files.  [Haven't tried this yet.]

    The new package table.el implements editable, WYSIWYG, embedded
    `text tables' in Emacs buffers  [Haven't tried this yet.]

    The new package flymake.el does on-the-fly syntax checking of
    program source files.  [Haven't tried this yet.]

    savehist saves minibuffer histories between sessions.  [Haven't
    tried this yet.]

    isearch in Info uses Info-search and searches through multiple
    nodes.  [This is fantastic.]

    Atomic change groups: To perform some changes in the current
    buffer "atomically" so that they either all succeed or are all
    undone, use `atomic-change-group' around the code that makes
    changes.  [Sounds like a fantastic idea, but I haven't tried it
    either.]

So far I've only noticed two new annoyances: one is that it uses its
own python-mode that I don't like as well as the one that comes with
Python, and the other is that C-x C-f RET no longer reverts the file
to the version in the filesystem (assuming the buffer wasn't edited);
now you actually have to type the filename.

The stuff in the NEWS file (C-h N) looks pretty innocuous.  Nothing is
terribly exciting, though.