[PATCH 1 of 2 RFC] export: list all key/values from extra in the patch header

Sun Dec 27 14:29:57 CST 2015

On Fri, 25 Dec 2015 08:16:01 -0500, Yuya Nishihara <yuya at tcha.org> wrote:

> On Sun, 20 Dec 2015 22:52:26 -0500, Matt Harbison wrote:
>> # HG changeset patch
>> # User Matt Harbison <matt_harbison at yahoo.com>
>> # Date 1450661679 18000
>> #      Sun Dec 20 20:34:39 2015 -0500
>> # Node ID eff20351cc9b29232493885887e82576f772d1b6
>> # Parent  5df74b2f296df7f44a08106df4f9dd97a5aa726a
>> export: list all key/values from extra in the patch header
>>
>> This uses the existing mechanism for extensions to export data in the  
>> header,
>> but deviates from the design of handling specific keys.  That seems  
>> reasonable
>> for core behavior, since:
>>
>>   1) I don't see any way to locate all of the extras used in core and  
>> hgext/
>>   2) Even if there was, I'm not sure that there's a way to enforce  
>> handling them
>>      here
>>   3) Even if there was, exporting via a new hg and importing with an  
>> older one
>>      would result in silent data loss.
>
> I'm not excited about this. I think most of extra data (e.g.  
> amend_source)
> are noisy rather than informative.

I agree preserving some may not be generally useful if exporting from one  
repo and importing elsewhere (e.g. amend_source).  It's hard to tell what  
a user will be interested in though, and picking and choosing is  
problematic as mentioned above.  I can see how unconditional import of  
everything would be a pain for people running a patch based submission  
process.

I went looking for a bug report I read this summer where someone  
complained about import trouble, and it was diagnosed as not preserving  
all of the extras.  I ended up finding this bug where mpm pondered making  
"more/all" extra available, as well as again in the more recent bug  
referenced in the second patch.  Maybe he can clarify.

   https://bz.mercurial-scm.org/show_bug.cgi?id=2742

(I couldn't find the report I'm thinking of, either in bz or the ML, but  
maybe it was a rehash of something like this?   
http://markmail.org/message/pnfg7exvvtc7wzzx)

>> If each extension handled the import/export processing for its own  
>> keys, that
>> would mean the extension would need to be loaded for both import and  
>> export.
>> That may not be reasonable for some extensions.  (e.g. I only load  
>> convert when
>> actively converting something.)  This doesn't preclude extensions from  
>> adding
>> headers.  For example, it may be useful for transplant to emit a human  
>> readable
>> header for 'transplant_source', since the raw value is an escaped  
>> binary.  No
>> processing would be necessary on import, since the raw key/value is  
>> still
>> present.
>
> Really? IIRC, 'transplant_source' is an unescaped binary node id, which  
> may
> contain random bytes including NUL, CR and LF.

Sorry, I meant this as two separate thoughts:

1) Needing extensions loaded at the time of export or import is bad.

2) Extensions could still export machine write-only, but human readable  
lines, and it would be OK.  Similar to how there are two lines for date-  
the old machine readable line is the only thing parsed on import.  If the  
transplant extension wrote a line with the hex node to be human friendly,  
it wouldn't need to deal with that line on import because core already  
handled the raw escaped value (see the tests).  Therefore, no data loss on  
import if transplant isn't loaded.

Basically, I was trying to justify that even though extras are generically  
exported from core, there may still be a reason for the existing framework  
to provide extensions a way to hook into the process.  (I assume that the  
"Available At" line that's popped up recently is courtesy of an extension,  
but dealing with things outside of ctx.extra() like that is probably rare.)