RFC: version (big) file snapshots with storage outside a Mercurial repo with snap

Klaus Koch kuk42 at gmx.net
Tue Aug 17 13:52:49 CDT 2010


On Aug 17, 2010, at 12:43 AM, Adrian Buehlmann wrote:

> On 15.08.2010 21:03, Klaus Koch wrote:
> [snip]
>> The repository and bug tracking of snap can be found at:
>> 
>> http://bitbucket.org/kuk42/hgsnap/wiki/Home
> 
> Wow. ~4000 lines of Python code!
> 
> First and fast thought I had: As a user, I wouldn't want to have to depend on that
> much extension code for storing my crown jewels in mercurial repos (let alone maintain
> that code...).
Oh please, no FUD :)  If you look past the boilerplate needed for all the wrapping, most of the functionality could be implemented as small patches to Mercurial's own functions.  Still, it makes sense to first see how this works out as an extension, if only to evaluate the workflow and general approach.  For comparison: I once wrote some additional commands for the bigfiles extension; that came to ~2000 lines of Python, never really worked, and was a maintenance nightmare.

> As an aside: Your hybridencode function looks questionable to me:
> 
> def hybridencode(path):
>    """stored snap file names get the same handling as Mercurial data files"""
>    hpath = _hybridencode('data/'+path)
>    if hpath.startswith('dh/'):
>        return hpath[len('dh/'):]
>    elif hpath.startswith('data/'):
>        return hpath[len('data/'):]
>    raise util.Abort(
>        _("this Mercurial's hybridencode returns unexpected path encoding"))
> 
> Mapping dh/ and data/ store paths into the same namespace may actually
> produce filename collisions.
> 
> Which is one of the reasons why I separated these into dh/ and data/ in
> mercurial.store.hybridencode.
Hm, I just wanted to save the additional path component.  Still, no collision can happen.  First, the sha1 of the content is appended to the original path as an extension.  Since the encoding keeps the extension, two stored names can only be identical if their sha1 is identical, i.e. the content is the same, and then it does not matter that the names clash: we just save storage.  If the content were different despite the same sha1, snap would add an integer to the sha1.
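
A minimal sketch of the naming scheme described above, not snap's actual code.  The helper name snap_store_name, the dict used as the store, and the exact way the integer is attached to the sha1 are assumptions made for illustration only.

```python
import hashlib


def snap_store_name(path, content, existing):
    """Hypothetical helper: pick the store name for `content` saved under `path`.

    `existing` maps store names already in use to the bytes stored there.
    """
    digest = hashlib.sha1(content).hexdigest()
    # The sha1 of the content becomes the extension of the original path.
    name = "%s.%s" % (path, digest)
    counter = 0
    # Same name and same content: reuse the entry, which only saves storage.
    # Same name but different content: add an integer to the sha1
    # (assumed format) until the name is free or holds identical content.
    while name in existing and existing[name] != content:
        counter += 1
        name = "%s.%s%d" % (path, digest, counter)
    return name


store = {}
first = snap_store_name("big.bin", b"abc", store)
store[first] = b"abc"
# Storing identical content again maps to the very same name.
again = snap_store_name("big.bin", b"abc", store)
```

Here `first` and `again` are equal, so the second snapshot costs no extra storage.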

Nevertheless, I may remove that code.  It is not really needed and we save some lines of Python.
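
For reference, the general concern about folding dh/ and data/ into one namespace can be illustrated with a toy encoder.  This is not Mercurial's real hybridencode: toy_hybridencode, its length limit, and the shortened hash are all made up for the example, and it deliberately leaves out the sha1 extension that snap appends.

```python
import hashlib

TOY_MAXLEN = 20  # assumed toy limit, far smaller than any real one


def toy_hybridencode(path):
    """Toy stand-in: keep short paths under data/, hash long ones into dh/."""
    if len(path) <= TOY_MAXLEN:
        return path  # already starts with 'data/'
    return "dh/" + hashlib.sha1(path.encode("utf-8")).hexdigest()[:8]


def strip_prefix(hpath):
    """Drop the leading 'dh/' or 'data/', as the quoted function does."""
    for prefix in ("dh/", "data/"):
        if hpath.startswith(prefix):
            return hpath[len(prefix):]
    raise ValueError("unexpected path encoding: %r" % hpath)


long_path = "data/" + "x" * 40
hashed = toy_hybridencode(long_path)      # lands in dh/
short_path = "data/" + hashed[len("dh/"):]
plain = toy_hybridencode(short_path)      # stays in data/

# Distinct with the prefixes, identical once they are stripped.
collides = hashed != plain and strip_prefix(hashed) == strip_prefix(plain)
```

In this toy setup `collides` is true, which is the situation the separate dh/ and data/ prefixes rule out.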

Klaus

