Mon, 09 Apr 2007

> random_shuffle looks pretty hairy.

random_shuffle, as usually implemented, is based on swaps (establishing a  
random permutation) -- but one can also think of it as sorting with a  
"broken" comparison function which returns random results, so there may be  
a clever alternative to doing a real sort with an appended random key?

> There's a non-clever way to make any sort stable, which is to add a
> secondary sort key that is the original position in the list.  There
> may be a clever way to do a stable mergesort without knowing the list
> lengths in advance, but I don't know it.

The idea of radix-sort is that one can reduce the problem of sorting on a  
wide key to several sorts on narrow keys -- adding a secondary sort key is  
a bit self-defeating for this use.

The brute-force way to preserve stability is to reverse the reversed list  
in between passes, as used here with integers providing lists (and a pair  
of integers providing a tape):
http://en.literateprograms.org/Merge_sort_(dc)

The clever way (as used with actual tape drives) was to reverse the sense  
of comparison on each pass.

> That leaves only the heap and permutation operations.  I really don't
> know about them.

I'd guess heaps and permutations are essentially swap-based, and that  
swaps imply random access.
(permutations are closely related to insertion/selection-sort, which are  
not so far away from heap sort)

Knuth Vol 3, "Sorting and Searching", might be a good reference.  Anything  
that was good for external sorting with tapes ought to be good with  
lists.  (indeed, being good with tapes implies that one should be able to  
find more cache-friendly implementations than the standard  
singly-linked-list)

> A remarkably large share of the uses of array indexing by numerical
> variables I found in both my Python code and my JavaScript code were
> for some variant of the string.join problem --- given an array of N
> strings, connect them with N-1 commas, except when N=0, in which case
> we should connect them with N=0 commas instead of N-1 = -1 commas.

Squint at it properly, and tail-call optimization becomes an instance of  
string.join: instead of "a,b,c," (where x is a jump and ',' the equivalent  
return) we want to generate "a,b,c".  The dual (context preservation when  
building argument lists) follows the same pattern: we only need to  
save/restore contexts if there are at least two overlapping evaluations.

This may be a practical reason for the last couple of decades' interest in  
monads: we prefer to program in an algebraic, tree-like form (code as well  
as data), while the iron prefers to handle linear sequences of code  
operating on contiguous chunks of data (op x state -> state') -- and  
monads provide machinery to go from tree inputs to flat results in a  
theoretically nice manner.

-Dave

Introduction
------------

I'm trying to figure out what kind of data structure I should use for
ordinary ordered collections in Bicicleta: some kind of numerically-
indexed vector, or a Lisp-style externally-singly-linked list?  I feel
that I have to decide this now, because I need to add variadic
functions and introspective slot/property/method listing.

Weighing on the side of numerically-indexed vectors, there are more
efficient implementations possible (I think), they work reasonably
well in Python, JavaScript, and Perl, and they don't require people to
think recursively.

Weighing on the side of Lisp-style lists, they don't require an
additional primitive type, they don't introduce numbers (as indices)
into parts of your program that have nothing to do with arithmetic,
and they might not require people to think recursively.

Both structures are capable of the same set of operations; it's just a
matter of which ones are expensive and which ones are dangerous.

So I thought I'd look to see if one or the other would really cramp
anybody's style.

(In here, where I say "array", it means "vector", and where I say
"list", I might mean either "linked list" or "vector".)

Contents
--------

Introduction
Contents
STL Algorithms
Common Lisp Sequences Dictionary
My Own Programming Style in Python
My Own Programming Style in JavaScript
Efficiency
The Strange Case of string.join
Conclusions

STL Algorithms
--------------

So here's a survey of the 1994/1995 STL algorithms and the iterator
types they use.  The STL contains a comprehensive collection of
generally-useful algorithms, and carefully distinguishes what kind of
data structure traversal each algorithm needs.  Immutable Lisp-style
lists can support "input" and "forward" traversal, but not
"bidirectional" or "randomaccess", and by virtue of being immutable,
they don't support "output" directly --- but generally "output"
iterators can map to a returned list value.

    for_each input
    find input
    find_if input
    adjacent_find forward
    count input
    count_if input
    mismatch input
    equal input
    search forward
    copy input output
    copy_backward bidirectional
    iter_swap forward
    swap_ranges forward
    transform input output
    replace forward
    replace_if forward
    replace_copy input output
    replace_copy_if input output
    fill forward
    fill_n output
    generate forward
    generate_n output
    remove forward
    remove_if forward
    remove_copy input output
    remove_copy_if input output
    unique forward
    unique_copy input output
    reverse bidirectional
    reverse_copy bidirectional output
    rotate forward
    rotate_copy forward output
    random_shuffle randomaccess
    partition bidirectional
    stable_partition bidirectional
    sort randomaccess
    stable_sort randomaccess
    partial_sort randomaccess
    partial_sort_copy input randomaccess
    nth_element randomaccess
    lower_bound forward
    upper_bound forward
    equal_range forward
    binary_search forward
    merge input output
    inplace_merge bidirectional
    includes input
    set_union input output
    set_intersection input output
    set_difference input output
    set_symmetric_difference input output
    push_heap randomaccess
    pop_heap randomaccess
    make_heap randomaccess
    sort_heap randomaccess
    max_element forward
    min_element forward
    lexicographical_compare input
    next_permutation bidirectional
    prev_permutation bidirectional
    accumulate input
    inner_product input
    partial_sum input output
    adjacent_difference input output

So only the following 18 algorithms out of the 64 require
bidirectional or random-access iterators:

    copy_backward reverse reverse_copy random_shuffle partition
    stable_partition sort stable_sort partial_sort partial_sort_copy
    nth_element inplace_merge push_heap pop_heap make_heap sort_heap
    next_permutation prev_permutation

The other 46 algorithms work fine without.  For most of the above 18,
there are also well-known ways to get the desired effect with
side-effect-free data structures.

copy_backward sounds like reverse_copy, but it makes a forward copy
--- it just does it in reverse order so that it works when copying
stuff upwards to an overlapping range.

reverse is very straightforward to perform with linked lists.  I think
it is only done with a bidirectional iterator in STL because there's
no "backwards output iterator", which may be because STL's lists are
doubly-linked, so they have bidirectional iterators.

random_shuffle looks pretty hairy.  Probably best just to assign a
random number to each element and sort them, although that feels like
cheating.

partition (and stable_partition) is straightforward to do with lists,
just less efficient.

As for sort and stable_sort, quicksort and mergesort can easily be
done on lists.  The clever thing about mergesort is that it's normally
a stable sort, which is to say that it doesn't reorder elements that
compare equal, but running mergesort using "forward iterators"
suggests that you should use a different way of dividing up the list
--- say, unshuffling it --- but that doesn't give you a stable sort.
There's a non-clever way to make any sort stable, which is to add a
secondary sort key that is the original position in the list.  There
may be a clever way to do a stable mergesort without knowing the list
lengths in advance, but I don't know it.

partial_sort selects the smallest N elements and sorts them.  A lazy
mergesort should provide this nicely.

nth_element is quickselect, which I think can also be done with lists
without too much difficulty.

inplace_merge can be done with merge; indeed, in STL, merge is also
present.

That leaves only the heap and permutation operations.  I really don't
know about them.

This discussion leaves out the fact that several of the algorithms
(upper_bound, lower_bound, equal_range, binary_search, includes) have
implementations that run faster when they have random-access iterators
available.

Common Lisp Sequences Dictionary
--------------------------------

Common Lisp has a similar, though slightly smaller, set of algorithms
that can work on any kind of sequence, including lists and vectors;
it's known as the Sequences Dictionary.  Here's the list:

    (copy-seq sequence)
    (elt sequence index) setfable
    (fill sequence item &key start end)
    (make-sequence result-type size &key initial-element)
    (subseq sequence start &optional end) setfable
    (map result-type function &rest sequences+)
    (map-into result-sequence function &rest sequences+)
    (reduce function sequence &key key from-end start end initial-value)
    (count item sequence &key from-end start end key test test-not)
    (count-if predicate sequence &key from-end start end key)
    (count-if-not predicate sequence &key from-end start end key)
    (length sequence)
    (reverse sequence)
    (nreverse sequence)
    (sort sequence predicate &key key)
    (stable-sort sequence predicate &key key)
    (find item sequence &key from-end test test-not start end key)
    (find-if predicate sequence &key from-end start end key)
    (find-if-not predicate sequence &key from-end start end key)
    (position item sequence &key from-end test test-not start end key)
    (position-if predicate sequence &key from-end start end key)
    (position-if-not predicate sequence &key from-end start end key)
    (search sequence-1 sequence-2 &key from-end test test-not key 
            start1 start2 end1 end2)
    (mismatch sequence-1 sequence-2 &key from-end test test-not key
            start1 start2 end1 end2)
    (replace sequence-1 sequence-2 &key start1 end1 start2 end2)
    (substitute newitem olditem sequence &key from-end test test-not
            start end count key
    (substitute-if newitem predicate sequence &key from-end 
            start end count key)
    (substitute-if-not newitem predicate sequence &key from-end 
            start end count key)
    (nsubstitute newitem olditem sequence &key from-end test test-not 
            start end count key)
    (nsubstitute-if newitem predicate sequence &key from-end 
            start end count key)
    (nsubstitute-if-not newitem predicate sequence &key from-end 
            start end count key)
    (concatenate result-type &rest sequences)
    (merge result-type sequence-1 sequence-2 predicate &key key)
    (remove item sequence &key from-end test test-not start end count key)
    (remove-if test sequence &key from-end start end count key)
    (remove-if-not test sequence &key from-end start end count key)
    (delete item sequence &key from-end test test-not start end count key)
    (delete-if test sequence &key from-end start end count key)
    (delete-if-not test sequence &key from-end start end count key)
    (remove-duplicates sequence &key from-end test test-not start end key)
    (delete-duplicates sequence &key from-end test test-not start end key)

There's also the Conses Dictionary, which is a much larger collection
of algorithms that apply specifically to conses, mostly singly-linked
lists made of them; many of these are widely-applicable algorithms as
well, like set-difference and mapcan, that just happen to work only on
lists.  I haven't surveyed them because there are so many of them,
although perhaps I should survey them if my objective is to see
whether picking vector as a default ordered collection would cause
problems.

There's an Arrays Dictionary too, but it is much more minimal,
containing things like array-dimensions and simple-vector-p.

Some notes on these, because they are different in some particulars
from the corresponding algorithms in other languages:
- unlike in C++ or Haskell, there is no lazy sequence type in Common
  Lisp.
- in the absence of initial-value, reduce calls its function with zero
  arguments when the sequence is empty.
- map maps in parallel over the sequences, which could express certain
  kinds of vector math more conveniently.  It stops when any of the
  input sequences runs out.
- map-into stops when it reaches the end of its output sequence or any
  of the input sequences.
- setfing subseq only works if the new subsequence is of the right
  length, unlike in Python.
- as in STL, sequence indices count from zero, and ranges designated
  by start-end pairs include the start but not the end.
- many of the functions include a :key argument which extracts the key
  to be compared, tested, etc.  Usually, this is equivalent to
  composing some functional argument with the key, potentially in some
  hairy way (e.g. for sort or merge, #'< becomes 
  #'(lambda (x y) (< (key x) (key y)))) but can be implemented more
  efficiently.

My Own Programming Style in Python
----------------------------------

In Python, Perl, and JavaScript, the "lists" are numerically-indexed
arrays, so if there's anywhere where I'd be prone to lots of numerical
indexing of lists, it should be in these languages.  So here's the
Python I have lying around:

    bicicleta/objcalc.py
    bicicleta/objcalcpp.py
    fib.py
    laptoptable.py
    readbmp.py
    safe_repr.py
    ~/cursmail/cursmail.py
    ~/bloompy/bloom.py
    ~/textindex/textindex.py

One by one:

* bicicleta/objcalc.py, bicicleta/objcalcpp.py

These two are the object-calculus interpreter I posted to kragen-hacks
a couple of years ago.  It's extremely non-numerical.  It accesses
lists with zip, for x in y:, list comprehensions, list appending with
+, string.join, and construction of lists with "list displays" (the
Python name for listing a bunch of expressions inside square brackets,
separated by commas, to make a list of their return values); and
there's one place where it says:

        if not lines[-1]: lines = lines[:-1]

It also looks at strings a bit --- it cares whether their length is 0
and whether their char 0 is in "self.namechars".

With these exceptions --- testing for emptiness, looking at the first
item, looking at the last item, removing the last item --- it gains
nothing from the array nature of Python lists.

* fib.py

Contains no lists, because it's a deliberately stupid implementation
of the fibonacci sequence and is only four lines long.

* laptoptable.py

This program parses a text file full of RFC-822 headers and turns it
into a web page with a table.  It tests the first character of a line
with line[0], strips the first character of a key with k[1:], and
tests len(sys.argv) and extracts the first item with sys.argv[1];
otherwise it accesses lists and strings with string.strip,
string.replace, string.join, map, list comprehensions, string.split,
list displays (sort of).

There's also a fair bit of JavaScript in this file.

* readbmp.py

This program is fairly numerical.  It parses a 24-bit-color BMP file
into an array of arrays of floats, generates another array of random
floats, adds them together element by element, and turns the result
into PostScript.

Accordingly, it has stuff like this:
    (width, height) = struct.unpack('<ll', header[18:26])
    (depth,) = struct.unpack('<h', header[28:30])
    return [[(0.5 - image[ii][jj] * 0.5 + pad[ii][jj]) % 1 
             for jj in range(len(image[0]))]
            for ii in range(len(image))]

But it still mostly constructs the arrays with list.append, which adds
a single value to the end of a list, even though it could know the
size of the array before it starts building it; and it still mostly
iterates over the arrays with "for x in y:" and list comprehensions.
The ii jj code above probably should have been written with map,
which, like Common Lisp's map, can map over two sequences at once:

    pad_pixel = lambda pix, padpix: (0.5 - pix * 0.5 + padpix) % 1
    pad_row = lambda imgrow, padrow: map(pad_pixel, imgrow, padrow)
    return map(pad_row, image, pad)

That would leave the only numerical indexing in the program up in the
BMP-header-parsing code, which really *should* be using numerical
indexing, since the "spec" I was using, Greg Roelofs' /etc/magic entry
for BMP files, was written in terms of numerical byte offsets.

* safe_repr.py

This program generates debugging representations of possibly infinite
Python data structures in several different ways --- but without using
too much time or returning too much data.  So it accesses arrays and
strings with %, list comprehensions, string.join, and the like, and
occasionally list.append and list.pop, but it does frequently truncate
strings to a specific numerical length --- that's in the spec!  This
is in statements like this:

	astr = astr[:self.maxlen - self.curlen]

Also, a couple of parts of it do a lazy version of string.join by
iterating over the indices of a sequence, conditionally appending
commas in between; and there's another part where it adapts a
(functional map over a) native Python sequence to an ad-hoc lazy
iteration protocol made out of tuples and thunks:

    def _safe_repr_5_map(xform, val, ii):
        if ii == len(val): return empty_stream
        else:
            return (xform(val[ii]), lambda: _safe_repr_5_map(xform, val, ii+1))

Those parts would be slightly simpler if they were using an iterator
protocol that allowed them to see when the sequence was almost
exhausted, rather than using integers to iterate over the sequences.

* ~/cursmail/cursmail.py

This is an outdated copy of my mailreader.  It uses numbers to index
lists a lot, in the following cases:

    if ii < len(self.lines): return self.lines[ii]
    self.mbox.seek(self.msgs[-1])
    self.mbox.seek(self.msgs[index])
    return self.mbox.fp, self.msgs[index]
    msgid = fields[0]
    self.set_tags(msgid, fields[1:])
    while isinstance(payload, type([])): payload = payload[0].get_payload()
    front, astr = astr[:width], astr[width:]
    else: return astr[:width]
    msg = msglist[msgnum]
    msgid = message_id(mboxlist[message_index])
    if message_matches(search_terms, mboxlist[newmi]):
    no_terms = [term[1:] for term in terms if term.startswith('-')]
    mbox = file(argv[1])

A lot of this is because I wanted to save memory, so I stored a small
amount of data about a large number of messages in an array, I mean a
list.  Some of the other instances happened because the email.Message
API is gross, because I was doing stuff in curses and therefore cared
about how many characters wide things were (although UTF-8 plays havoc
with that anyway), and the usual dropping-the-first, getting-the-rest,
and getting-the-last kind of stuff.

This program is different from the others because it deals with more
external interfaces (this version imports 9 modules) and because I was
concerned with memory usage.  I should probably switch away from the
integer-indexed list of messages, though, and create a mailbox-pointer
class instead.

* ~/bloompy/bloom.py

This program uses arrays, in a big way!

	    self.filter[bucket] |= (1L << bit)
	    if not self.filter[bucket] & (1L << bit): return 0

There are only those two lines, but they are kind of the core of the
program.  It also does a little light string processing:

	for suffix in ['s', 'ing', 'ed', 'es', 'er', 's', 'ly']:
	    if word.endswith(suffix): yield word[:-len(suffix)]

* ~/textindex/textindex.py

This program is a little larger than the others at 1800 lines, and it
implements an on-disk full-text search engine supporting a couple of
different file types and using an innovative data structure.  It
imports 12 modules and implements generalized Bloom filters, binary
serialization, some minimal data compression, and a bunch of other
stuff.  So I haven't read it as thoroughly as the others.

It keeps an array of document hashes that gets indexed numerically in
order to allow the rest of the code to use small integers to represent
indexed documents.

It uses numerical indexing of strings for some minimal parsing:
    self._headwords_fname = firstline[len(magic):-1]
    rv = tuple(map(urllib.unquote, line[:-1].split(' ')))

Also it uses the old numerical interface to os.stat because I didn't
know any better at the time:
    return os.stat(fname)[stat.ST_SIZE]

Some of the tests use numeric slicing to ignore certain elements of a
tuple:
    ok(reader.find('0032')[0:2], find_rv)

It uses numeric slicing to manage byte buffers that are windows onto a
file:
    start += end
    buf = buf[end:]
    self.buf = self.buf[pos:]

It uses numeric slicing to chop up a SHA-1 hash, represented as a
binary byte string, into smaller hashes:
    return (gethash(bighash[0:4])[0], gethash(bighash[4:8])[0],
            gethash(bighash[8:12])[0])

Of course the generalized Bloom filter uses numerical indexing as
well:
    self.additions[bucketnum] = bucket = [doc_id]

* overall

Perhaps unsurprisingly, I seem to try to avoid numerical indices for
Python lists and strings, and the majority of my uses fall into one of
these easy-to-fix categories:
- extracting the first or second item
- extracting the last item
- dropping the first or last item
- keeping track of the state of a simple forward iteration over the list
- interfacing with some Python module that returns a list with special
  significance attached to each element (sys.argv, os.stat)

However, there are some cases demonstrated here that are slightly
harder to fix:
- parsing binary data structures specified in terms of byte offsets
- truncating strings to a specified length
- implementing a Bloom filter
- managing byte buffers

There's this additional concern: iterating over arrays with numbers is
pretty easy to understand, and it's pretty easy to figure out how to
do stuff.  For example, I wrote the parallel iteration over the lists
of lists by using integer indices because I had forgotten that
Python's "map" could iterate over multiple lists in parallel; it was
good that I could just do it instead of having to look through the
manual for some way to iterate over multiple lists in parallel.

Maybe iterator objects can be just as nice that way, but I'll have to
see it to be sure.

My Own Programming Style in JavaScript
--------------------------------------

I have some bits of JavaScript lying around, too.

    laptoptable.py
    sk.js
    formulasheet/formulasheet.js
    formulasheet/freevars.js
    (didn't bother to analyze the following; they total 871 lines)
    js-calc.js
    slow3d/torus.js

* laptoptable.py

This is nominally a Python program, but most of its contents are a big
JavaScript string.

The first thing the JavaScript contains is a bunch of very short
functions like this:

    function forEach(list, fun) {
        for (var ii = 0; ii < list.length; ii++) fun(list[ii])
    }
    // Is item an element of list?
    function element(item, list) {
        if (list.length == null) throw "attempt to find element in non-list"
        for (var ii = 0; ii < list.length; ii++)
            if (item == list[ii]) return true
        return false
    }

Those are there to avoid numerical indexing in the rest of the code.

There's a place where I drop the first item in an array:

    var name = args.splice(0, 1)

JavaScript regular expression matches return arrays:

    var rv = parseFloat(match[1])

And there are other places where I get the first item in an array, or
drop it:
    this.title = this.cells()[0].firstChild.textContent
    rv.splice(0, 1)

Here's a loop where I want to loop over both indices and their
corresponding values, which is stupid in retrospect:
    for (var ii = 0; ii < data_values.length; ii++) {
        values.push([ii, data_values[ii]])
    }

As you can guess, that array being constructed there gets indexed with
integer constants.  I probably should have written {i: ii, d:
data_values[ii]}.  Except that the whole thing is stupid.  But I do
probably need to be indexing data_values with a column-number
variable.

There's another loop later where I loop over the indices of an array
in order to insert its contents into the document with commas in
between.

And another loop where I return the numerical position in an array of
an object with a certain key --- like (position ... :key ...) --- in
order to be able to delete it from the list later.  This actually
happens in two places.

* sk.js

This is a graph-reduction evaluator for SK-combinators, and as such we
would expect it to be entirely non-numerical in nature.

It depends on MochiKit, so it really has no excuse for indexing arrays
numerically.  But it does.

It has:
- a loop that iterates over two arrays in parallel (by indices)
- a loop that does, effectively, join(' '), but could be written with
  a forEach because it starts with an initial value from outside the array
- a forEach that uses an Array as a stack with shift and unshift, and
  therefore uses stack[0] to get at the top-of-stack;
- a function 'last' that returns list[list.length - 1]
- various accesses to expression arguments by position rather than by
  name.

* formulasheet/formulasheet.js

This is mostly a simple user interface wrapped around eval, with some
funky stuff to simulate reflection on free variables.  It doesn't
depend on MochiKit and has its own forEach function, but also uses
explicit numerical indices in several places for other kinds of
iteration: find (sometimes with a :key), :from-end, skipping the first
element, dropping the first and last characters of a string.

* formulasheet/freevars.js

This is actually the funky stuff to simulate reflection.  It indexes
arrays by number because JavaScript's regexp matches are indexed by
number, but that's it.

* overall

I use numerical indices a lot in JavaScript, probably even when I
should be using MochiKit's iteration system instead.  (But MochiKit's
iteration system is sloow.)  Mostly I use them for kinds of iteration
that are a little more complicated than simple forEach, but I also use
them for getting the last item, interfacing with regular expressions,
deleting items from arrays, and in place of Objects with named
properties.

Efficiency
----------

The big attraction of arrays is that most operations on them are
constant-time and damned fast: accessing an element, moving to the
next or previous element, getting the length of the array.  Even
appending a new element is amortized constant time, and again, damned
fast.

However, singly-linked lists have one potentially huge advantage: they
support structure sharing.  They probably ought to explain this in CS
101 when they explain singly-linked lists, but generally they don't,
and the students are left thinking that singly-linked lists are just a
sort of extreme optimization for cases where you don't want to pay the
extra pointer field and update cost for a doubly-linked list.

But, for example, with singly-linked lists, you can maintain a large
stack throughout a computation --- say, in a shift-reduce parser ---
and make a list of all of the stack states throughout the parsing
process, in a relatively reasonable amount of space.  And, in
particular, you can build up a list element by element without
affecting the views of old versions of the list, and you can likewise
iterate over the suffixes of the list efficiently.

And iterating over a singly-linked list is almost as fast as iterating
over an array, modulo locality of reference.

But all of that is only true as long as you aren't mutating the list.
Once you start mutating the list, all of these nice advantages of
structure sharing disappear.

This drawback may not be very relevant to Bicicleta, where I don't
expect mutation in general to be very common.

The Strange Case of string.join
-------------------------------

A remarkably large share of the uses of array indexing by numerical
variables I found in both my Python code and my JavaScript code were
for some variant of the string.join problem --- given an array of N
strings, connect them with N-1 commas, except when N=0, in which case
we should connect them with N=0 commas instead of N-1 = -1 commas.

In Knuth's article "Structured Programming With Go To Statements", as
I remember it, this kind of loop is called the "N and a half times
loop".  I seem to remember that he somewhere suggested a "loop
... while condition ... repeat" construct which allows you to do the
termination test anywhere in the body of the loop, to make this kind
of loop straightforward to write, and I think TeX ended up with that
as its normal while loop construct.

The code I wrote was generally something along these lines:

    for ii in range(len(items)):
	for item in xform(items[ii]): yield item
	if ii < len(items) - 1: yield ', '

With a Python-syntax version of Knuth's construct, we could remove the
-1 and the redundant test:

    ii = 0
    loop:
        for item in xform(items[ii]): yield item
        ii += 1
    aslongas ii < len(items):
        yield ', '

Using Python's iterator protocol, this can be rewritten in an
integer-free form, although it's slightly longer:

    i = iter(items)
    try: yield i.next()
    except StopIteration: return
    for item in i:
        yield ', '
        yield item

A more powerful iterator protocol would allow a simpler version, very
similar to the index-based version:

    i = iter(items)
    while i.nonempty:
        yield i.current_item
        i.next()
        if i.nonempty: yield ', '

Conclusions
-----------

Even when your collections are vectors, it's probably better to
iterate over those vectors with some kind of iterator object rather
than directly with indices --- the code is generally clearer, and as a
bonus, it can work with non-vector collections.  The cases where
integer indexing is really needed are less-common things like
implementing hash tables, efficient sorting and binary search, heaps,
permutations, 3-D graphics, and binary file format parsing.

However, these less-common things are actually really important, so
the language can't get away without a primitive vector type for a very
long time.

But if most things will be accessing vectors and other collections
using an iterator protocol, then variadic function argument lists and
introspective slot listing don't need to be vectors; they can just be
objects that implement the iterator protocol.  So I can put off the
hairy user-interface-design question of how to deal with simple
ordered collections in a language without mutation for a while longer.