[PATCH stable] largefiles: handle merges between normal files and largefiles (issue3084)

Matt Mackall mpm at selenic.com
Tue Dec 13 17:53:44 CST 2011


On Wed, 2011-12-14 at 00:17 +0100, Martin Geisler wrote:
> "Na'Tosha Bard" <natosha at unity3d.com> writes:
> 
> > 2011/12/13 Martin Geisler <mg at lazybytes.net>
> >
> >> Matt Mackall <mpm at selenic.com> writes:
> >>
> >> > Do largefiles never go through filemerge?
> >>
> >> No, not really. They do run through filemerge.filemerge, but it
> >> only offers users this prompt:
> >>
> >>  largefile %s has a merge conflict
> >>  keep (l)ocal or take (o)ther?
> >>
> >> Seems a bit limited to me, to be honest, since users will probably
> >> need to abort or choose blindly at this point and then dump the two
> >> versions by hand.
> >
> > I think if we printed something like:
> >
> > keep (l)ocal or take (o)ther
> > (other is newer)
> >
> > It would prevent a huge number of cases where the user must abort and
> > find out which one is the one he wants. In most use cases with a binary
> > file, you want the one that is newer -- it's a newer version of the
> > library, or an updated PNG image, or whatever.
> 
> Yeah, but they are in some sense "concurrent" since they were both made
> in parallel from the same ancestor version. So neither is newer than the
> other. But I guess you knew that and just want to compare commit dates?
> That might help, but it won't tell the user much.
> 
> When I was in Copenhagen, you explained that you use largefiles to store
> compiled libraries that are used for the compilation of your product.
> That may be typical, but it was my impression that largefiles was (also)
> meant to be used by people that work "actively" on the large files and
> so treat them like real source artifacts for their product. Then it
> doesn't make sense to just pick the newest image -- there has been some
> communication problem if two artists have worked on the same image in
> parallel and they have probably both had some intent with their commits.
> 
> Based on that, I think that we'll at least need something like
> internal:dump here. It writes the two versions of the file to the
> working copy so that the user can inspect them. Ideally, we would get
> the normal merge machinery to work like normal so that people could
> set up their own merge tools, e.g., a tool that picks the newer file.

Really, I'd like these complex merge decisions to be something you can
defer and then come back and do "hg resolve somefile" and get an
appropriate prompt. Steve worked on this, but I never understood his
patch.

Your internal:dump idea is interesting, but a little unfortunate in the
context of largefiles. Also, does it dump the ancestor, or just "other"?
We've already got "local" by definition.
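
To make the question concrete, here is a rough sketch of a dump-style
fallback that writes every side it is given, ancestor included. This is
not Mercurial's code: the helper name dump_versions and the suffixes it
uses are assumptions made up for illustration.

```python
def dump_versions(path, local, other, base=None):
    """Write each available side of a conflict next to `path`.

    Hypothetical helper, not part of Mercurial. `local`, `other` and
    `base` are the raw bytes of each version; `base` may be None when
    the ancestor is not available.
    """
    written = []
    for suffix, data in (("local", local), ("other", other), ("base", base)):
        if data is None:
            # Skip sides we were not given instead of writing empty files.
            continue
        target = "%s.%s" % (path, suffix)
        with open(target, "wb") as fp:
            fp.write(data)
        written.append(target)
    return written
```

For a largefile this copies the full content up to three times, which is
the unfortunate part.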

> This is kind of separate from what this patch does: right now, merges
> where a file changes largefile-status are *broken*. You do the merge and
> then 'hg status' aborts afterwards. That should be fixed. Making the
> merge behave more like a normal merge could be a second step.
> 
> >> >> This patch fixes this by extending the manifest merge to deal with
> >> >> these kinds of conflicts. If there is a normal file 'foo' in the
> >> >> working copy, and the other parent brings in a '.hglf/foo' file,
> >> >> then the user will be prompted to keep the normal file or the
> >> >> largefile. Likewise for the symmetric case where a normal file is
> >> >> brought in via the second parent. The prompt looks like this:
> >> >
> >> > Seems to me we should just always promote files to largefiles on
> >> > merge. Or, apply the existing 'add' thresholds/patterns to decide.
> >>
> >> Yeah, I also wanted to do this at some point, but Na'Tosha told me
> >> she was fine with the simpler solution of just prompting so I went
> >> with that first. I'll have to look at things again to figure out
> >> if/how an upgrade patch could look.
> >
> > As commented before, I don't think relying on the 'add'
> > thresholds/patterns is a good idea at all. My opinion as a largefiles
> > user is that letting the human decide is best, and that automatically
> > upgrading it to a largefile is second best.
> >
> >
> >> >> The status and diff output looks peculiar after a merge where the
> >> >> type of a file changed. If a normal file 'foo' was changed to a
> >> >> largefile, then we get:
> >> >>
> >> >>   $ hg status
> >> >>   M foo
> >> >>   R foo
> >> >
> >> > Eep. That's seriously ugly.
> >>
> >> Yes, agreed :) After looking at the largefiles code I'm afraid I find
> >> the whole concept rather ugly. Basically all commands need wrapping
> >> and adapting to make '.hglf/x' and 'x' be the same file. It feels
> >> brittle and confusing. More papering over could of course hide the
> >> output above, but it is in some way what you would expect when you
> >> have these two files for every largefile.
> >
> > I don't find largefiles confusing, but brittle (with some sharp edges)
> > is a good way to describe it :-)
> 
> I meant that the code is confusing. Having to wrap all commands to
> carefully make them see and not see the largefiles is weird. Things like
> directory renames are broken because of this -- the merge code does not
> "see" the largefiles, so if you have
> 
>   dir/normal
>      /large
> 
> and you move dir/normal to other-dir/normal, then Mercurial will think
> that you've renamed the whole directory since dir is now empty. That can
> of course also be fixed, but it's just an example of how major parts of
> the code now must deal with largefiles.
> 
> Greg, Benjamin, et al: did you give any thought to using the
> encode/decode filters (or something similar) to handle this instead?
> That is, "decoding" a SHA-1 into the file content when writing the
> working copy, and "encoding" the file content back to a SHA-1 value when
> committing to the repo?
> 
> > I think conceptually, I would expect to see:
> >
> > $ hg status
> > M foo
> >
> > Because *conceptually*, foo is the same file, whether it is a
> > largefile, or a regular file. What the best thing to actually show in
> > hg status is a bit hard to say.
> 
> I think that's the right thing to show as well.

We might need a cleaner, more self-contained hack underneath so that
it's more transparent at higher levels and needs fewer wrappers. 

-- 
Mathematics is the supreme nostalgia of our time.



