[PATCH 3 of 3 RFC] import: add new --faithful flag to use metadata but relax checks

Matt Mackall mpm at selenic.com
Tue Apr 12 23:59:06 UTC 2016


On Tue, 2016-03-29 at 09:23 -0500, Kevin Bullock wrote:
> > 
> > On Mar 27, 2016, at 13:16, Pierre-Yves David <pierre-yves.david at ens-lyon.org> wrote:
> > 
> > On 03/11/2016 01:12 PM, Augie Fackler wrote:
> > > 
> > > On Fri, Mar 11, 2016 at 3:11 PM, Pierre-Yves David
> > > <pierre-yves.david at ens-lyon.org> wrote:
> > > > 
> > > > 
> > > > 
> > > > On 03/11/2016 06:25 PM, Augie Fackler wrote:
> > > > > 
> > > > > 
> > > > > # HG changeset patch
> > > > > # User Augie Fackler <augie at google.com>
> > > > > # Date 1457539063 18000
> > > > > #      Wed Mar 09 10:57:43 2016 -0500
> > > > > # Node ID 7d53477e4496e8f2b16b12ed445407e79bbb787b
> > > > > # Parent  602504c64084d85820c883a43b02951a61e992f5
> > > > > # EXP-Topic import
> > > > > import: add new --faithful flag to use metadata but relax checks
> > > > > 
> > > > > Sometimes it's helpful to import a patch with as much of the metadata
> > > > > (especially parents) intact as possible, but some bit of extra didn't
> > > > > make the trip through the exported patch. This gives users a tool to
> > > > > preserve as much metadata as possible without having to get an exact
> > > > > byte-for-byte match on the import process.
> > > > 
> > > > In my opinion, the key part here is to:
> > > > 1) bypass working copy
> > > > 2) use the parent information
> > I'm not sure we should automatically turn on bypassing here. Having a flag
> > that makes sure a patch is applied on its parent seems useful on its own.
> > 
> > (On a "perpendicular" but important topic, there is the question of
> > whether to update to the result or not).
> > 
> > > 
> > > > 
> > > > I would rather see a name related to parents than "faithful". I don't
> > > > think "faithful" is very explicit, and it mostly makes sense in relation
> > > > to --exact.
> > > I'm not in love with the name "faithful", so I'd love constructive
> > > suggestions about what we could name the flag.
> > The important part here is the fact we read and use the parent information
> > in the patch. So I think having "parent" in the name makes sense:
> > 
> >  --originalparents
> >  --useparents
> >  --onparents
> >  --parents
> >  --onorigin
> > 
> > I think --useparents is my favorite but I don't have a strong opinion.
> It seems like we have three classes of metadata here:
> 
> 1. user, date, description, branch name - always imported, no problems here

There's definitely a set of stuff we do ok at, yes.

> 2. node id + original parent (others?) - can be used as hints in importing

This is probably one of the key things that makes this confusing. Some people
are using --exact not because they care about exactitude, but because they want
to preserve topology when they're moving commits.

This turns out to be hard.. as soon as you think about the SECOND import. If
you've failed to exactly preserve the hash, the second import's parent won't
exist (sad trombone). So --faithful as implemented here ends up disappointing.
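
To make the second-import problem concrete, here is a minimal Python sketch. It
is a toy model, not Mercurial's import code: node_for, toy_import, and the
changeset text layout are made up for illustration. The only idea borrowed from
the real thing is that a node is a SHA-1 over the sorted parents plus the
changeset text, so any lost metadata changes the node.

```python
import hashlib

NULLID = b"\0" * 20


def node_for(text, p1, p2=NULLID):
    """Toy node hash: SHA-1 over the sorted parents followed by the text."""
    s = hashlib.sha1(min(p1, p2))
    s.update(max(p1, p2))
    s.update(text)
    return s.digest()


def toy_import(repo, text, recorded_parent):
    """Apply a patch on its recorded parent; fail if that parent is unknown."""
    if recorded_parent != NULLID and recorded_parent not in repo:
        raise LookupError("recorded parent not found in repo")
    node = node_for(text, recorded_parent)
    repo.add(node)
    return node


# Source repo: commit A carries an extra field, commit B is its child.
a_text = b"user\ndate\nextra:topic=import\n\nfirst"
b_text = b"user\ndate\n\nsecond"
a_node = node_for(a_text, NULLID)

# The exported patch for A lost the extra field on the way.
dest = set()
a_imported = toy_import(dest, b"user\ndate\n\nfirst", NULLID)
hash_preserved = a_imported == a_node

# The patch for B still records a_node as its parent.
try:
    toy_import(dest, b_text, a_node)
    second_import_ok = True
except LookupError:
    second_import_ok = False
```

The first import goes through with a different node, and the second one then
has nothing to attach to.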

Which is why parents have always been tied to --exact. Even though --exact isn't
"about" parents per se, it was always the only -possible- way to preserve the
DAG topology with import/export.

But obsmarkers actually might allow us to do something reasonable here by
importing the second commit onto the successor of whatever it thinks its parent
should be. If we do something in that direction, we could have an option like
--topo ("preserve topology").
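
A rough sketch of the successor lookup, assuming the inexact import had
recorded a marker from the original node to the node it actually created.
resolve_parent and the plain dict of markers are hypothetical stand-ins, not
the obsolescence API, and this ignores splits and divergence by assuming each
node has at most one successor.

```python
def resolve_parent(recorded_parent, repo_nodes, successors):
    """Follow successor markers from recorded_parent to a node in the repo.

    Returns None when the chain ends (or loops) before reaching a known node.
    """
    seen = set()
    node = recorded_parent
    while node not in repo_nodes:
        if node in seen or node not in successors:
            return None
        seen.add(node)
        node = successors[node]
    return node


# "a1" is the parent named in the patch; the earlier import produced "a2".
repo_nodes = {"a2"}
successors = {"a1": "a2"}

found = resolve_parent("a1", repo_nodes, successors)
missing = resolve_parent("zz", repo_nodes, successors)
```

With the marker in place, the second patch lands on "a2"; without one, the
lookup still fails and the caller has to decide what to do.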

> 3. extra data that is part of the commit hash but not the patch - causes
> problems with --exact

We should try harder here (probably by default), but it'll be a long way from
perfect because:

4. the numerous classic and git diff corner cases
   - linefeed style
   - empty vs missing files
   - exec bits and symlinks
   - funky filenames
   - merge/copy/rename cases that git doesn't care about
   - etc.

5. the numerous edges of our own metadata hashing
   - dates with floating point seconds
   - unsorted extra
   - unsorted file-level metadata
   - pre-UTF8 commits
   - etc.

6. all the awful things that email clients do


And it's unlikely we'll fix all of those.. ever.
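
As an illustration of two of the edges in item 5 (unsorted extra and floating
point dates), here is a small Python sketch. The header layout and the
serialize helper are invented for the example and are not the real changelog
format; the point is only that the hash covers the serialized bytes, so the
same logical metadata can hash differently.

```python
import hashlib


def serialize(date, extra, sort_extra=True):
    """Toy changeset header: a date line followed by key:value extra pairs."""
    items = sorted(extra) if sort_extra else list(extra)
    lines = ["%s" % date] + ["%s:%s" % (k, v) for k, v in items]
    return "\n".join(lines).encode("utf-8")


def digest(data):
    return hashlib.sha1(data).hexdigest()


# Same extra data, stored in an order that is not sorted by key.
extra = [("topic", "import"), ("branch", "default")]

sorted_hash = digest(serialize(1457539063, extra))
unsorted_hash = digest(serialize(1457539063, extra, sort_extra=False))
float_hash = digest(serialize(1457539063.0, extra))
```

All three digests differ, even though a human would call the metadata
identical. An importer that normalizes on the way in cannot reproduce a hash
that was computed over the non-normalized form.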

-- 
Mathematics is the supreme nostalgia of our time.


