Remark on store.encodedir and lowerencode for hashed paths

Adrian Buehlmann adrian at cadifra.com
Fri Oct 5 06:50:45 CDT 2012


Before the recent series of speed refactorings, the code for hybridencode
was (revision 65df60a3f96b):

_maxstorepathlen = 120
_dirprefixlen = 8
_maxshortdirslen = 8 * (_dirprefixlen + 1) - 4

def _hybridencode(path, auxencode):
    if not path.startswith('data/'):
        return path
    # escape directories ending with .i and .d
    path = encodedir(path)
    ndpath = path[len('data/'):]
    res = 'data/' + auxencode(encodefilename(ndpath))
    if len(res) > _maxstorepathlen:
        digest = _sha(path).hexdigest()
        aep = auxencode(lowerencode(ndpath))
        _root, ext = os.path.splitext(aep)
        parts = aep.split('/')
        basename = parts[-1]
        sdirs = []
        for p in parts[:-1]:
            d = p[:_dirprefixlen]
            if d[-1] in '. ':
                # Windows can't access dirs ending in period or space
                d = d[:-1] + '_'
            t = '/'.join(sdirs) + '/' + d
            if len(t) > _maxshortdirslen:
                break
            sdirs.append(d)
        dirs = '/'.join(sdirs)
        if len(dirs) > 0:
            dirs += '/'
        res = 'dh/' + dirs + digest + ext
        spaceleft = _maxstorepathlen - len(res)
        if spaceleft > 0:
            filler = basename[:spaceleft]
            res = 'dh/' + dirs + filler + digest + ext
    return res

For hashed paths, this does

    path = encodedir(path)
    ndpath = path[len('data/'):]
    aep = auxencode(lowerencode(ndpath))

This "fails" to encode directories ending in

   .I, .D, .HG, .Hg, .hG

because encodedir() only replaces the *lowercase*

   .i, .d, .hg

For example, a directory 'foo.I' is lowerencoded to 'foo.i', but
encodedir() never sees that result, because it has already run on the
original, not yet lowercased path.
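To make the ordering issue concrete, here is a minimal sketch. The encodedir() below does the lowercase-only replacement described above; lowerencode_sketch() is a simplified stand-in I am assuming for illustration (it only folds case, whereas the real lowerencode also escapes other characters), and auxencode is left out entirely. hashed_input_sketch() is a hypothetical helper name, not something in store.py.

```python
def encodedir(path):
    # Escapes only the *lowercase* directory suffixes .hg, .i and .d.
    return (path
            .replace(".hg/", ".hg.hg/")
            .replace(".i/", ".i.hg/")
            .replace(".d/", ".d.hg/"))


def lowerencode_sketch(path):
    # Assumption: simplified stand-in that only folds case.
    return path.lower()


def hashed_input_sketch(path):
    # Same order as the hashed branch: encodedir first, lowercase second.
    path = encodedir(path)
    ndpath = path[len("data/"):]
    return lowerencode_sketch(ndpath)


# Uppercase suffix: encodedir does not match, lowercasing then yields 'foo.i'.
print(hashed_input_sketch("data/foo.I/bar.txt"))
# Lowercase suffix: encodedir escapes it before lowercasing.
print(hashed_input_sketch("data/foo.i/bar.txt"))
```

The first call ends up with an unescaped 'foo.i' directory, the second with 'foo.i.hg'.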

In practice, this doesn't matter: the last component of a hashed path
always ends with a forty character hash followed by .i or .d, so it
cannot collide with a directory name, because directory names in hashed
paths are truncated to at most 8 characters (_dirprefixlen).
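The length argument can be checked with a rough sketch of the hashed branch. This is an assumption-laden simplification, not the store code: hashed_path_sketch() is a hypothetical name, it is applied unconditionally rather than only to overlong paths, and encodedir, auxencode and the non-case parts of lowerencode are replaced by a plain lower().

```python
import hashlib
import os

_maxstorepathlen = 120
_dirprefixlen = 8
_maxshortdirslen = 8 * (_dirprefixlen + 1) - 4


def hashed_path_sketch(path):
    # Assumption: plain lower() stands in for auxencode(lowerencode(...)).
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()
    aep = path[len("data/"):].lower()
    _root, ext = os.path.splitext(aep)
    parts = aep.split("/")
    basename = parts[-1]
    sdirs = []
    for p in parts[:-1]:
        d = p[:_dirprefixlen]  # directory names are cut to 8 characters
        if d[-1] in ". ":
            d = d[:-1] + "_"
        t = "/".join(sdirs) + "/" + d
        if len(t) > _maxshortdirslen:
            break
        sdirs.append(d)
    dirs = "/".join(sdirs)
    if dirs:
        dirs += "/"
    res = "dh/" + dirs + digest + ext
    spaceleft = _maxstorepathlen - len(res)
    if spaceleft > 0:
        res = "dh/" + dirs + basename[:spaceleft] + digest + ext
    return res


# A long path with unescaped 'foo.i' and 'some.d' directories.
res = hashed_path_sketch("data/Foo.I/Some.D/" + "x" * 150 + ".i")
components = res.split("/")
print([len(c) for c in components])
```

Every directory component below dh/ has at most 8 characters, while the file component has at least 40, so a directory such as 'foo.i' cannot have the same name as a hashed file.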

See also http://selenic.com/repo/hg/rev/810387f59696, which moved the
encodedir step from filelog into store (2009-05-20).

