[PATCH 1 of 2 RFC] export: list all key/values from extra in the patch header

Wed Jan 6 04:48:37 UTC 2016

On Mon, 28 Dec 2015 10:21:58 -0500, Yuya Nishihara <yuya at tcha.org> wrote:

> On Sun, 27 Dec 2015 15:29:57 -0500, Matt Harbison wrote:
>> On Fri, 25 Dec 2015 08:16:01 -0500, Yuya Nishihara <yuya at tcha.org>  
>> wrote:
>> >> This uses the existing mechanism for extensions to export data in the
>> >> header,
>> >> but deviates from the design of handling specific keys.  That seems
>> >> reasonable
>> >> for core behavior, since:
>> >>
>> >>   1) I don't see any way to locate all of the extras used in core and
>> >> hgext/
>> >>   2) Even if there was, I'm not sure that there's a way to enforce
>> >> handling them
>> >>      here
>> >>   3) Even if there was, exporting via a new hg and importing with an
>> >> older one
>> >>      would result in silent data loss.
>> >
>> > I'm not excited about this. I think most of extra data (e.g.
>> > amend_source)
>> > are noisy rather than informative.
>>
>> I agree preserving some may not be generally useful if exporting from  
>> one
>> repo and importing elsewhere (e.g. amend_source).  It's hard to tell  
>> what
>> a user will be interested in though, and picking and choosing is
>> problematic as mentioned above.  I can see how unconditional import of
>> everything would be a pain for people running a patch based submission
>> process.
>>
>> I went looking for a bug report I read this summer where someone
>> complained about import trouble, and it was diagnosed as not preserving
>> all of the extras.  I ended up finding this bug where mpm pondered  
>> making
>> "more/all" extra available, as well as again in the more recent bug
>> referenced in the second patch.  Maybe he can clarify.
>>
>>    https://bz.mercurial-scm.org/show_bug.cgi?id=2742
>>
>> (I couldn't find the report I'm thinking of, either in bz or the ML, but
>> maybe it was a rehash of something like this?
>> http://markmail.org/message/pnfg7exvvtc7wzzx)
>
> Good point. Importing all extras would sometimes useful if he want to
> reproduce exactly the same commit.
>
>> >> If each extension handled the import/export processing for its own
>> >> keys, that
>> >> would mean the extension would need to be loaded for both import and
>> >> export.
>> >> That may not be reasonable for some extensions.  (e.g. I only load
>> >> convert when
>> >> actively converting something.)  This doesn't preclude extensions  
>> from
>> >> adding
>> >> headers.  For example, it may be useful for transplant to emit a  
>> human
>> >> readable
>> >> header for 'transplant_source', since the raw value is an escaped
>> >> binary.  No
>> >> processing would be necessary on import, since the raw key/value is
>> >> still
>> >> present.
>> >
>> > Really? IIRC, 'transplant_source' is an unescaped binary node id,  
>> which
>> > may
>> > contain random bytes including NUL, CR and LF.
>>
>> Sorry, I meant this as two separate thoughts:
>>
>> 1) Needing extensions loaded at the time of export or import is bad.
>>
>> 2) Extensions could still export machine write-only, but human readable
>> lines, and it would be OK.  Similar to how there are two lines for date-
>> the old machine readable line is the only thing parsed on import.  If  
>> the
>> transplant extension wrote a line with the hex node to be human  
>> friendly,
>> it wouldn't need to deal with that line on import because core already
>> handled the raw escaped value (see the tests).  Therefore, no data loss  
>> on
>> import if transplant isn't loaded.
>
> Maybe I don't follow, but the core can't handle raw "transplant_source"
> because it may contain LF. It isn't escaped, the test runner does escape.
> So the core have to at least escape or ignore "transplant_source".

Oh, I see what you mean.  I didn't realize the test runner was what was  
doing that, since I've never transplanted outside of the tests.  (I'm a  
little surprised nobody has complained about mangled output with -T  
{extras}.)

So I guess if core has to special case this one attribute, it might as  
well go a few more lines and node.hex()/node.bin() it.  Then this whole  
paragraph goes away.  I guess it also points out that it would be useful  
to test for and skip attributes with non printable characters in case some  
3rd party extension decided to stuff binary data in extra.  Then it would  
be on them to handle it.

>> Basically, I was trying to justify that even though extras are  
>> generically
>> exported from core, there may still be a reason for the existing  
>> framework
>> to provide extensions a way to hook into the process.  (I assume that  
>> the
>> "Available At" line that's popped up recently is courtesy of an  
>> extension,
>> but dealing with things outside of ctx.extra() like that is probably  
>> rare.)