[PATCH] changegroup: allow sending snapshot deltas in cg2

Michael Edgar adgar at google.com
Mon Dec 1 01:39:46 UTC 2014


On Sat, Nov 29, 2014 at 2:22 PM, Gregory Szorc <gregory.szorc at gmail.com> wrote:
>
> Mike,
>
> I had a random thought the other day: could your work on censored nodes be "abused" to inject "pre-history" onto existing repositories?


Hi Gregory,

Thanks for taking an interest! Unfortunately, censor support as envisioned
probably won't help as directly as you'd like. Presently I've been limiting
censorship support to filelogs, since the goal was to hide sensitive content
and I'm trying super hard to keep the surface area minimal.

An extension or core addition supporting your use case will definitely be
easier once the censorship work settles into core. The details around exchange
are general and seem to be the trickiest part so far.

The additional work for changelogs would be to define how tombstone metadata is
stored (presumably a key in "extra"), check it in changelog's #checkhash
method, and then add special casing for censorship whenever changelog entries
get read.
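
To make those three steps a bit more concrete, here is a rough sketch. It is
not Mercurial code: the "censored" key name, the Entry class and the read()
helper are all hypothetical, and "extra" is modelled as a separate dict for
simplicity. The only part taken from the real revlog is the node hash scheme
(SHA-1 over the sorted parents followed by the text).

```python
import hashlib
from dataclasses import dataclass, field

NULLID = b"\0" * 20
TOMBSTONE_KEY = "censored"  # hypothetical key name stored in "extra"


class CensoredEntryError(Exception):
    """Raised when a tombstoned changelog entry is read."""


@dataclass
class Entry:
    # Hypothetical, simplified stand-in for a changelog entry.
    node: bytes
    p1: bytes
    p2: bytes
    text: bytes
    extra: dict = field(default_factory=dict)


def nodehash(text, p1, p2):
    # Revlog-style node id: SHA-1 of the sorted parents plus the text.
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()


def checkhash(entry):
    # A tombstone replaces the original text, so the stored node can no
    # longer match; skip verification when the tombstone key is present.
    if TOMBSTONE_KEY in entry.extra:
        return
    if nodehash(entry.text, entry.p1, entry.p2) != entry.node:
        raise ValueError("integrity check failed")


def read(entry):
    # Special-case censorship at read time instead of returning content.
    checkhash(entry)
    if TOMBSTONE_KEY in entry.extra:
        raise CensoredEntryError(entry.extra[TOMBSTONE_KEY])
    return entry.text
```

Raising on read is just one possible choice for the special casing; returning
a placeholder would fit the same shape.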

Let me know if you have any more questions! I anticipate sharing a detailed
design on a wiki page early this week.

Cheers!

Mike


>
> On Thu, Nov 27, 2014 at 12:37 PM, Michael Edgar <adgar at google.com> wrote:
>>
>> Sorry for taking a while to get back to this thread - I wanted to take a big
>> step back before committing to any one direction.
>>
>> On Tue, Nov 25, 2014 at 6:58 PM, Matt Mackall <mpm at selenic.com> wrote:
>> >
>> > On Tue, 2014-11-25 at 13:17 -0800, Pierre-Yves David wrote:
>> > >
>> > > On 11/21/2014 05:28 PM, Matt Mackall wrote:
>> > > > On Fri, 2014-11-21 at 17:53 -0500, Mike Edgar wrote:
>> > > >> # HG changeset patch
>> > > >> # User Mike Edgar <adgar at google.com>
>> > > >> # Date 1416610038 18000
>> > > >> #      Fri Nov 21 17:47:18 2014 -0500
>> > > >> # Node ID ce1d4cdad3e2a324198a348f5a62f86e9b0e1a73
>> > > >> # Parent  a179db3db9b96b38c10c491e6e7e7ad5f40a7787
>> > > >> changegroup: allow sending snapshot deltas in cg2
>> > > >>
>> > > >> The changegroup2 format allows each revision to be sent with
>> > > >> a configurable base. That base is presently restricted to
>> > > >> p1, p2, or the previous revision in the revlog. By allowing
>> > > >> a null base, we can send snapshot deltas when it is efficient
>> > > >> to do so.
>> > > >
>> > > > Not sure what you mean by efficient here. We generally assume that
>> > > > bandwidth is more scarce than CPU, so calculating a new delta is
>> > > > generally preferred to sending a full revision. This seems to prefer the
>> > > > opposite trade-off?
>> > >
>> > > This change comes to support the "Censored" nodes effort. The censored
>> > > node cannot be used as delta base (and would likely be inefficient
>> > > anyway) so we have an alternative. In some cases it would be an easy but
>> > > suboptimal solution to issue a full delta. In others it will be the
>> > > only available option.
>> >
>> > This doesn't get me any closer to answering the above concerns.
>>
>> I agree with Matt, this change is wrong. I came upon the "no snapshot"
>> limitation in the course of implementing censorship exchange, but too quickly
>> assumed lifting the "no snapshot" restriction would be appropriate in
>> non-censor cases too.
>>
>> The revlog might not have a deltaparent because there is actually no suitable
>> delta base globally, but there are also purely local reasons it might have
>> stored a snapshot. Recent changes to limit delta chain length are one example
>> (hg: 76effa770ff9). Variations in compression algorithms are another.
>>
>> This change trusts the revlog whenever it has no delta base, but the revlog
>> only provides a *hint* as to a suitable delta base, and changegroup/exchange
>> ignores that hint if it's unhelpful in the exchange context. For example, if
>> it can't prove the recipient has the delta base, it ignores the hint. I agree
>> with Matt that the bandwidth/CPU tradeoff is also an important reason to
>> ignore the hint.
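
A minimal sketch of the "hint, not instruction" idea from the paragraph above.
This is an illustration only, not the actual changegroup code: the function
name, its signature and the recipient_has predicate are hypothetical. The
allowed bases (p1, p2, previous revision) come from the patch description.

```python
NULLREV = -1  # "no revision"; a null delta base means a full snapshot


def choose_delta_base(hint, p1, p2, prev, recipient_has):
    """Pick a delta base for sending one revision (hypothetical helper).

    hint          -- delta parent recorded in the local revlog, or NULLREV
    p1, p2, prev  -- the bases the cg2 format allows
    recipient_has -- predicate: can we prove the recipient has this rev?
    """
    allowed = (p1, p2, prev)
    # Use the local hint only if it is a real, allowed base that the
    # recipient is known to have.
    if hint != NULLREV and hint in allowed and recipient_has(hint):
        return hint
    # Otherwise ignore the hint (including a locally stored snapshot)
    # and prefer computing a fresh delta against a known base.
    for candidate in allowed:
        if candidate != NULLREV and recipient_has(candidate):
            return candidate
    # Nothing usable: only now send a full revision.
    return NULLREV
```

For example, a local snapshot (hint == NULLREV) with p1 known to the recipient
still yields p1, which spends CPU on a new delta to save bandwidth.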
>>
>> As for the censor line of work: I have a new approach that I'm writing up
>> and will share on a Wiki page early next week. It will include a channel for
>> signaling that a revision is tombstoned without decoding the revision's
>> text. This new signal can ultimately be considered in code paths like this
>> one.
>>
>>
>>
>>
>>
>> --
>> Michael Edgar | Software Engineer | adgar at google.com | 518-496-6958
>
>



-- 
Michael Edgar | Software Engineer | adgar at google.com | 518-496-6958

