[PATCH 1 of 2] patch: support diff data loss detection and upgrade

Thu Dec 31 05:42:46 CST 2009

On 31/12/09 11:31, Thomas Arendsen Hein wrote:
> * Patrick Mézard <pmezard at gmail.com> [20091231 10:12]:
>> On 30/12/09 18:38, Patrick Mezard wrote:
>>> # HG changeset patch
>>> # User Patrick Mezard <pmezard at gmail.com>
>>> # Date 1262194535 -3600
>>> # Node ID 81c70639660b227332226cf2679e5c4f698fb486
>>> # Parent  c31ac3f7fd8f83ff728bfda18e2d0a5276f2bff8
>>> patch: support diff data loss detection and upgrade
>>>
>>> In worst case, generating diff in upgrade mode can be two times more expensive
>>> than generating it in git mode directly: we may have to regenerate the whole
>>> diff again whenever a git feature is detected. Also, the first diff attempt is
>>> completely buffered instead of being streamed. That said, even without having
>>> profiled it yet, I am convinced we can fast-path the upgrade mode if necessary
>>> were it to be used in regular diff commands, and not only in mq where avoiding
>>> data loss is worth the price.
>>
>> Matt, does it look like what you expected?
> 
> I guess it is not.
> 
> Imagine "hg export -o %n.patch 0:100" or patchbombing multiple
> changesets and want to abort (instead of warn) if this would not be
> possible with standard patches before writing 90 of 101 files or,
> even more important, before sending out 7 of 10 mails to a public
> mailing list.
> 
> Ideally in this case there would be a quick check if
> - copies have been used
> - executable status has changed
> - empty files have been created or deleted

Thank you, I missed empty file removals.

> - binary files are affected
> 
> and only after that real patches will be generated.

Sure, but this quick check must run over every individual patch being generated, with the same arguments that are passed to patch.diff(). If you look at patch.diff(), you will see that this quick check *is* patch.diff() with dodiff=False, which disables binary and unidiff generation. So a first approximation of the quick check is to call patch.diff() and discard the result. Alternatively, we could add something like a --dry-run option to avoid generating the output. I would consider duplicating the checking logic elsewhere only if these solutions turn out not to be good enough.
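
To make the ordering concrete, here is a minimal standalone sketch of "check everything first, write only afterwards". It does not use Mercurial's API: FileChange, lossy_reason, DataLoss and export_all are hypothetical names invented for this illustration. In practice the first pass would more likely be patch.diff() with its output discarded than a separate predicate like this one.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class FileChange:
    """Hypothetical summary of one file change; not a Mercurial type."""
    path: str
    copied_from: Optional[str] = None     # set when the file is a copy or rename
    flags_changed: bool = False           # executable status changed
    empty_added_or_removed: bool = False  # empty file created or deleted
    binary: bool = False                  # binary content is affected


class DataLoss(Exception):
    """Raised when a plain (non-git) patch would lose information."""


def lossy_reason(change: FileChange) -> Optional[str]:
    """Return why a plain diff would lose data for this change, or None."""
    if change.copied_from is not None:
        return "copy or rename"
    if change.flags_changed:
        return "executable status changed"
    if change.empty_added_or_removed:
        return "empty file created or deleted"
    if change.binary:
        return "binary file"
    return None


def export_all(
    changesets: Iterable[List[FileChange]],
    write: Callable[[int, List[FileChange]], None],
) -> None:
    """Check every changeset before writing any of them."""
    pending = list(changesets)

    # Pass 1: abort before any output exists if one patch would be lossy.
    for index, changes in enumerate(pending):
        for change in changes:
            reason = lossy_reason(change)
            if reason is not None:
                raise DataLoss(
                    "changeset %d, %s: %s" % (index, change.path, reason)
                )

    # Pass 2: only now produce the real output (files, mails, ...).
    for index, changes in enumerate(pending):
        write(index, changes)
```

With this shape, a problem in the last of 101 changesets aborts the export before the first file is written or the first mail is sent. The cost is one extra pass over all changesets.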

--
Patrick Mézard