Tue, 24 Jun 2008

* zooko <zooko at zooko.com> [2008-06-24 02:30]:
> On Jun 23, 2008, at 1:50 PM, Aristotle Pagaltzis wrote:
>> The problem with the Reiser family FSs is that they are
>> inherently brittle. Now that they have been sufficiently
>> debugged, they no longer lose data often, but if you have even
>> a small unrepairable corruption, it is still more likely that
>> you’ll lose half your disk instead of just a few files, as is
>> the extX family’s failure mode.
>
> How do you know this? It doesn't seem to be implied by any of
> the papers that I referenced, but nor is it contradicted by
> them. I would like more data.

Purely from my own reasoning.

In extX, inodes, directories and the alloc bitmap are all
separate, and two of them are randomly accessible linear data
structures. Actually in some important ways even directories are.
There are a few crucial bits of metadata about these data
structures that, if destroyed, would preclude you from finding
them at all (e.g. the superblock and such), but those are not
written to during normal operation. If you lose a directory,
the inodes are still there so you lose the tree structure but
none of the contents; if the bitmap is affected, as long as you
notice the inconsistency you lose nothing (so it’s really just
a cache); if you lose inodes, only the files described by the
affected inodes are lost. It’s simply impossible to do much
non-localised damage because the metadata layout has such low
entropy.
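
As a toy illustration of that locality (a sketch in Python, not real
extX code; the flat table and all names in it are made up for the
example), treat the inode table as an array addressed by index and
destroy one slot:

```python
# Toy model only: a fixed-size table addressed by index, standing in for
# extX-style linear metadata. Names and layout are hypothetical.

def readable_files(inode_table, corrupted):
    """Return the inode numbers that are still readable after the
    slots listed in `corrupted` have been destroyed."""
    return [i for i in range(len(inode_table)) if i not in corrupted]

inode_table = [f"file-{i}" for i in range(8)]

# Destroy a single slot; every other slot is found by its index alone,
# without consulting the damaged one.
survivors = readable_files(inode_table, corrupted={3})
print(survivors)
```

Only the file behind the destroyed slot is gone, because no other slot
has to be read in order to locate its neighbours.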

Of course that’s also a big reason why it’s impossible to make
extX fast for operations involving a lot of metadata.

In contrast, reiserX mediates all metadata through a B-tree. If
you lose any subtree, the entire information about that subtree
becomes unreachable. You can use a carving-type tool and some
heuristics to try to find the metadata after the fact and restore
it as well as possible, but your chances are still mediocre. This
is how reiserX gets its phenomenal speed, of course – every bit
of metadata read from the disk helps avoid having to read more
metadata. Entropy is very high. That’s also the reason for its
sky-high CPU cycle consumption.
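
The same kind of toy sketch for the tree case (again Python, and again
not the actual reiserX on-disk format; the Node class and the tiny
two-level tree are assumptions made purely for illustration):

```python
# Toy model only: files are reachable solely by walking down from the
# root. The Node class and tree shape are hypothetical.

class Node:
    def __init__(self, children=None, files=None):
        self.children = children or []
        self.files = files or []
        self.corrupted = False

def reachable_files(node):
    """Collect every file that can still be reached from `node`.
    A corrupted node hides itself and everything below it."""
    if node.corrupted:
        return []
    found = list(node.files)
    for child in node.children:
        found.extend(reachable_files(child))
    return found

# Four leaves with two files each, under two internal nodes.
leaves = [Node(files=[f"file-{2 * i}", f"file-{2 * i + 1}"]) for i in range(4)]
left = Node(children=leaves[:2])
right = Node(children=leaves[2:])
root = Node(children=[left, right])

# Destroy one internal node: the leaves below it are intact on "disk",
# but nothing points to them any more.
right.corrupted = True
survivors = reachable_files(root)
print(survivors)
```

One damaged internal node takes half of the files with it here, even
though none of the leaves themselves were touched.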

But it does mean that it is inherently brittle, because you need
all of the participating metadata to get at any piece of data,
whereas in extX a lot of the participating metadata only serves
as middle men providing indirection.

This is an information-theoretically rooted tradeoff. It is
mathematically impossible to make a filesystem both extremely
robust and extremely fast, because those properties lie at
opposite ends of the redundancy scale.

And for my own proclivities, reiserX goes too far toward the
performance end of the scale. At the same time I don’t think
extX is the be-all end-all on its part of the scale; I think
it is entirely possible to achieve robustness at least close
to that of extX without having to accept nearly as limited
performance.

Hence my general dislike of reiserX.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>