bfiles integration into core hg

Tue Apr 13 16:47:10 CDT 2010

On Tue, Apr 13, 2010 at 3:37 PM, Benoit Boissinot <bboissin at gmail.com> wrote:
> Sorry if this has already been discussed; I haven't followed bfiles
> development closely.

Nope, AFAIK your idea is original.

> While reading stuff about lwcopy, it got me thinking how close it
> feels to bfiles. Can't bfiles be implemented in the same way as lwcopy?
>
> - store a fake revision, with filelog metadata 'bfiles=<sha1sum of the
> file>' followed by an empty text
> - like lwcopy, teach filelog to first read the metadata, if it's a
> bfiles entry it should fetch it from the store instead of building it
> from the revlog
> - like lwcopy, for efficiency it would be better to modify the network
> protocol (changegroup format) to support it.

Neat.  That would be much more transparent than bfiles currently is.
But the implementation is sufficiently different that I'm not sure it
should be called "bfiles".  If this gets added to the core, it should
be called something *other* than "bigfiles" or "bfiles", and then
those two extensions can quietly die.
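
To make the quoted proposal concrete, here is a minimal sketch of the
read path, not a patch against filelog. It assumes the entry reuses the
"\1\n"-delimited "key: value" header that filelog already uses for copy
metadata; the "bfiles" key comes from the proposal, but its exact
encoding, the function names, and the dict standing in for the big-file
store are all hypothetical.

```python
import hashlib

META_MARKER = b"\x01\n"  # delimiter filelog puts around its metadata header


def pack_bigfile_entry(sha1hex):
    """Build the 'fake' revision text: a metadata header, then empty text.

    The 'bfiles' key and its encoding are assumptions for illustration.
    """
    meta = b"bfiles: " + sha1hex.encode("ascii") + b"\n"
    return META_MARKER + meta + META_MARKER  # nothing follows: empty text


def parse_meta(text):
    """Split stored revision text into (metadata dict, remaining body)."""
    if not text.startswith(META_MARKER):
        return {}, text
    end = text.index(META_MARKER, len(META_MARKER))
    meta = {}
    for line in text[len(META_MARKER):end].splitlines():
        key, value = line.split(b": ", 1)
        meta[key.decode("ascii")] = value.decode("ascii")
    return meta, text[end + len(META_MARKER):]


def read_revision(stored_text, store):
    """Read the metadata first; fetch big files from the store instead.

    `store` is a hypothetical stand-in: a dict mapping sha1 hex to bytes.
    """
    meta, body = parse_meta(stored_text)
    if "bfiles" in meta:
        return store[meta["bfiles"]]
    return body  # ordinary revision: use the text built from the revlog


if __name__ == "__main__":
    data = b"pretend this is several hundred megabytes"
    digest = hashlib.sha1(data).hexdigest()
    entry = pack_bigfile_entry(digest)
    assert read_revision(entry, {digest: data}) == data
```

Note that this sketch returns the whole file as one in-memory string,
which is exactly the memory problem raised further down.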

Have you thought about the UI?  At the very least, you need a way to
add a new big file -- the equivalent of "hg bfadd".  After that, I
think everything should just work ... unless someone wants to convert
a regular file to "big".  That might need another command.

Also, converting a bfiles repo to your scheme would break changeset
IDs, so bfiles-the-extension would not disappear overnight.

> What's missing? (I think some places should accept "streams" instead
> of strings to avoid keeping the object in memory, but that would be a
> later optimization)

I think that is critical, not just an optimization for later.  It's
one of the primary reasons I wrote bfiles, and keeping a lid on memory
use is therefore a key part of bfiles' contract.
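
As a rough illustration of what "streams instead of strings" means in
practice, the sketch below hashes and copies a file in fixed-size
chunks, so peak memory stays near the chunk size regardless of file
size. It is not taken from bfiles; the function name and the 64 KiB
chunk size are assumptions chosen for the example.

```python
import hashlib
import io


def copy_and_hash(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in chunks and return the SHA-1 hex digest.

    Only one chunk is held in memory at a time, so the cost does not
    grow with the size of the file. (Hypothetical helper, not bfiles code.)
    """
    hasher = hashlib.sha1()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        hasher.update(chunk)
        dst.write(chunk)
    return hasher.hexdigest()


if __name__ == "__main__":
    payload = b"x" * 200000  # spans several chunks
    out = io.BytesIO()
    digest = copy_and_hash(io.BytesIO(payload), out)
    assert out.getvalue() == payload
    assert digest == hashlib.sha1(payload).hexdigest()
```

Any code path that instead calls read() with no size, or passes the
file contents around as a single string, gives up that bound.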

Greg