[RFC] kbfiles: an extension to track binary files with less wasted bandwidth

Greg Ward greg-hg at gerg.ca
Wed Aug 10 21:03:46 CDT 2011


On Wed, Aug 10, 2011 at 4:00 PM, Andrew Pritchard <awpritchard at gmail.com> wrote:
> After a lot of refactoring and bugfixing, as well as plenty of naming
> and copyright concerns, it now seems that largefiles is nearly ready
> to be added to the Mercurial core repository as a bundled extension.

Awesome! I should mention that I am quite happy to see this happen,
and look forward to the day when I can migrate from bfiles to
largefiles at work. So I'm finally looking at the code, running the
tests, etc.

Various concerns...

1) I strongly believe you need some better docs: just trying to piece this
   thing together from 'hg help' is not enough, which is why I wrote usage.txt
   for bfiles. Please steal that and hack it up so it describes largefiles.
   Your post to this list that started this thread a week or two ago was also
   excellent: I suggest you crib from it liberally.

2) test-lockout.t is failing. I tried Mercurial 1.7, 1.8, and 1.9 -- different
   failures, but it failed with all of them. Let me know if you cannot reproduce
   and I'll give more detail.

3) The copyright attribution is inaccurate: bfiles was written as a work for
   hire, and so the first copyright line should be my employer, just as I put
   in the copyright statements for bfiles a few days ago (see
   http://hg.gerg.ca/hg-bfiles/rev/6f832a089582). I took the liberty of
   granting some copyright to myself on the grounds that I have spent a lot of
   my own time on bfiles over the last year or so. If my boss disagrees and
   decides to give me a hard time over that, I'll let you know. ;-)

4) I see you made liberal use of my hgtest.py module. Too bad: I'm pretty sure
   Matt won't like that. The best thing about hgtest.py is that it offended Matt
   so deeply that he implemented the fine new unified test system in Mercurial
   1.7. The worst thing about it is that converting tests from hgtest.py to
   unified is a slow, painful, tedious manual process. I've converted many of
   the tests I wrote for bfiles, but not all.

5) Has anyone reviewed the changes to bfiles since you forked to see if there's
   anything there that needs to be in largefiles? I guess I'll do it if no one
   else has, but a) I don't know the fork point and b) I don't want to duplicate
   the work if it has already been done.

6) It would be nice to make migration from bfiles painless and transparent.
   My first suggestion: make the '.hglf' prefix configurable. Then bfiles
   users can just set it to .hgbfiles and not have to go through a painful
   repository conversion just to remap the standin filenames.

   (Hey, has anyone else noticed that '.hglf' looks like 'mercurial linefeed'?
   I hope this doesn't get confused with eol...)
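
To make the configurable-prefix idea concrete, here is a rough sketch of how
the standin directory could be looked up instead of hard-coded. This is not how
largefiles is written today: the "standindir" option name and all three helper
functions are hypothetical, and the config text is parsed with Python's
configparser purely to keep the example standalone (a real extension would go
through Mercurial's own config machinery).

```python
import configparser
import posixpath

# Directory largefiles uses for standins today.
DEFAULT_STANDIN_DIR = ".hglf"


def standin_dir_from_config(hgrc_text):
    """Return the standin directory named in hgrc-style text.

    Looks for a hypothetical "standindir" option in a [largefiles]
    section and falls back to ".hglf" when it is absent.
    """
    parser = configparser.ConfigParser()
    parser.read_string(hgrc_text)
    return parser.get("largefiles", "standindir",
                      fallback=DEFAULT_STANDIN_DIR)


def standin(filename, standin_dir=DEFAULT_STANDIN_DIR):
    """Map a big file's repo-relative path to its standin path."""
    return posixpath.join(standin_dir, filename)


def is_standin(path, standin_dir=DEFAULT_STANDIN_DIR):
    """Report whether a repo-relative path lives under the standin dir."""
    return path.startswith(standin_dir + "/")


# A bfiles user would point the option at the old directory name.
bfiles_hgrc = "[largefiles]\nstandindir = .hgbfiles\n"
prefix = standin_dir_from_config(bfiles_hgrc)
print(standin("data/big.bin", prefix))
```

With something along these lines, every place that currently spells out '.hglf'
would take the directory as a parameter, and an existing bfiles repository
could keep its .hgbfiles standins untouched.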

> At this point, if no one has any objections, comments, or concerns, I
> can collapse largefiles into a patch against Mercurial (still
> preserving the original repositories on http://developers.kilnhg.com),
> place it in a clone of http://selenic.com/hg, and submit it as a pull
> request (because a 7500-line patchbomb seems ill-advised).

I still very much want to see a repo that *accurately* preserves the
history of bfiles + kbfiles + largefiles. That is, I think you should:

1) start with a partial clone of bfiles, up to the changeset that Fog Creek
   forked it to create kbfiles
2) import patches from the Kiln repo that trace the history of kbfiles
3) import patches from the largefiles repo that Benjamin created on June 20

The resulting repo will give an accurate history of largefiles, and
IMO that is what should be saved for posterity in a prominent
location.

If you need any help doing that, just ask. I'm motivated to get it
done, but I don't have access to the Kiln repo. All I can do is
suggest ways to get the job done.

Greg


More information about the Mercurial-devel mailing list