[PATCH 2 of 5 v5] store: implement fncache basic path encoding in C

Adrian Buehlmann adrian at cadifra.com
Wed Sep 12 09:35:16 CDT 2012

On 2012-09-12 00:59, Adrian Buehlmann wrote:
> On 2012-09-10 22:34, Bryan O'Sullivan wrote:
>> store: implement fncache basic path encoding in C
> I have a (possibly crazy) idea:
> What if we would do a new repo format - let's call it "fasthash" [1] -
> with the following characteristics:
> a) fixes issue3621
> b) does a slightly simpler encoding for hashed paths

Spinning this idea a bit further, we might also do a simpler encoding of short
paths while we're at it.

For example, we could do a simpler version of store._auxencode:

_winreservednames2 = '''con prn aux nul com lpt'''.split()
def _auxencode2(path, dotencode):
    '''
    >>> _auxencode2('.foo/aux.txt/txt.aux/con/prn/nul/foo./bla.txt', True)
    '_%.foo/_%aux.txt/txt.aux/_%con/_%prn/_%nul/foo._%/bla.txt'
    >>> _auxencode2('.com1com2/lpt9.lpt4.lpt1/conprn/foo /bla.txt', False)
    '.com1com2/_%lpt9.lpt4.lpt1/_%conprn/foo _%/bla.txt'
    >>> _auxencode2('foo. ', True)
    'foo. _%'
    >>> _auxencode2(' .foo', True)
    '_% .foo'
    '''
    res = []
    for n in path.split('/'):
        if n:
            # prefix: leading '.' or ' ' (if dotencode), or reserved prefix
            if (dotencode and n[0] in '. ') or (n[0:3] in _winreservednames2):
                n = '_%' + n
            # suffix: trailing '.' or ' '
            if n[-1] in '. ':
                n = n + '_%'
        res.append(n)
    return '/'.join(res)

I think this should be considerably simpler to translate into C code
than _auxencode.

The encoding done by _auxencode2 is a bit greedier than the old _auxencode:
it looks only at the first three characters of each path component, so it
(needlessly) also encodes, for example, 'lptxyz' to '_%lptxyz'. I think this
might be worth the simpler C code.
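
To make the "greedier" point concrete, here is a rough sketch comparing the two
matching rules. The helper names (old_rule_matches, new_rule_matches) are made
up for this mail, and the old rule is an assumption on my part: I'm taking it
to mean "the part before the first '.' is one of con, prn, aux, nul,
com1..com9, lpt1..lpt9". It only models which components get flagged, not how
they are rewritten.

```python
# Assumed full reserved-name list for the old check (not quoted in this mail).
_old_reserved = set('con prn aux nul'.split()
                    + ['com%d' % i for i in range(1, 10)]
                    + ['lpt%d' % i for i in range(1, 10)])
# Three-letter prefixes used by _auxencode2.
_new_prefixes = 'con prn aux nul com lpt'.split()

def old_rule_matches(n):
    # Old-style check (assumed): base name before the first '.' is reserved.
    return n.split('.')[0] in _old_reserved

def new_rule_matches(n):
    # _auxencode2-style check: the first three characters are a reserved prefix.
    return n[0:3] in _new_prefixes

for n in ['aux.txt', 'lpt9.lpt4', 'lptxyz', 'conprn', 'console.log']:
    print('%-12s old=%-5s new=%s' % (n, old_rule_matches(n), new_rule_matches(n)))
```

Names like 'lptxyz' or 'conprn' are flagged only by the new rule, which is the
price for not having to parse the base name and the trailing digit in C.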

Note: The input sequence '_%' is encoded as '__%', so this doesn't collide with
the '_%' I've used. So far, the only output tokens starting with '_' are '__'
and '_a'..'_z' (produced by function store.encodefilename). '_a', for example,
is produced by encoding 'A'.
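
A small sketch of why the '_%' token stays unambiguous. escape_sketch is a
made-up, simplified stand-in for store.encodefilename that models only the '_'
and uppercase rules (the real function escapes more characters), and tokens is
a hypothetical helper that reads the result left to right, treating '_' plus
the following character as one token.

```python
def escape_sketch(s):
    # Simplified stand-in for store.encodefilename: only the '_' and
    # uppercase rules are modelled; everything else passes through.
    out = []
    for c in s:
        if c == '_':
            out.append('__')
        elif 'A' <= c <= 'Z':
            out.append('_' + c.lower())
        else:
            out.append(c)
    return ''.join(out)

def tokens(encoded):
    # Split left to right; '_' always starts a two-character token.
    res = []
    i = 0
    while i < len(encoded):
        step = 2 if encoded[i] == '_' else 1
        res.append(encoded[i:i + step])
        i += step
    return res

# The escaping step alone never yields a '_%' token, so '_%' remains
# free to be used as a marker by a later step.
for name in ['_%', 'A_%b', '__%', 'Foo_Bar%']:
    assert '_%' not in tokens(escape_sketch(name))
```

Under these assumptions, every '_' coming out of the escaping step is followed
by '_' or a lowercase letter, so a decoder that sees the token '_%' knows it
was added afterwards.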
