[PATCH RFC] repo: add an ability to hide nodes in an appropriate way
Ryan McElroy
rm at fb.com
Thu Mar 30 13:08:36 EDT 2017
On 3/30/17 3:15 PM, Pierre-Yves David wrote:
> On 03/27/2017 07:24 PM, Augie Fackler wrote:
>> On Mon, Mar 27, 2017 at 10:27:33AM +0200, Pierre-Yves David wrote:
>>>
>>>
>>> On 03/27/2017 12:32 AM, Kostia Balytskyi wrote:
>>>> # HG changeset patch
>>>> # User Kostia Balytskyi <ikostia at fb.com>
>>>> # Date 1490567500 25200
>>>> # Sun Mar 26 15:31:40 2017 -0700
>>>> # Node ID 43cd56f38c1ca2ad1920ed6804e1a90a5e263c0d
>>>> # Parent c60091fa1426892552dd6c0dd4b9c49e3c3da045
>>>> repo: add an ability to hide nodes in an appropriate way
>>>>
>>>> Potentially frequent usecase is to hide nodes in the most appropriate
>>>> for current repo configuration way. Examples of things which use this
>>>> are rebase (strip or obsolete old nodes), shelve (strip of obsolete
>>>> temporary nodes) and probably others.
>>>> Jun Wu had an idea of having one place in core where this
>>>> functionality
>>>> is implemented so that others can utilize it. This patch
>>>> implements this idea.
>>>
>>> I do not think this is a step in the right direction, creating
>>> obsolescence
>>> marker is not just about hiding changeset.
>>
>> But we're using them for that today, all over the place.
>
> Actually, we do not use it all over the place. Current markers usages
> in core are:
>
> amend:
> - record evolution history
> - prune the temporary amend commit
> (that I would rather see disappear all together)
>
> rebase:
> - record evolution history on successful completion
>
> histedit:
> - record evolution history on successful completion
> - prune temporary nodes on successsful completion
> (This ones used to be stripped even with evolution)
>
> (let us talk about shelve later)
>
>
> So they are mainly used to record evolution history on successful
> operation (their intended usage) between "real-changesets". This is
> the intended usage of obsolescence markers.
>
> In addition, they are also used in a couple of place to hide temporary
> changesets (in the "utility-space") after an operation succeeed.
> This a bit an unorthodox uses of the obsolescence markers, but is
> "okay" to use them in these case because we know:
> 1) this is utility-space so user never interact with them.
> 2) we only create them for completed successful operation so we'll
> never need them ever again.
> Strictly speaking we could just strip these temporary commits (and we
> have done so in the past) and this will not change anything for the
> user. Any other hiding mechanism aimed at "internal temporary" commit
> would do the job just fine. The internal commit never needs (and
> actually should ever) to leave the user repository.
> In practice, we could use in memory commit for most of these temporary
> commit (if not all) and never have to deal with hiding them.
>
>> We're also having pretty productive (I think!) discussions about
>> non-obsmarker
>> based mechanisms for hiding things that are implementation details of
>> a sort (amend changesets that get folded immediately, shelve changes,
>> etc).
>
> (Let me be at be long to try to be clear and comprehensive)
>
> I can see three categories of things we want to hide:
>
> :obsolete changeset (evolution semantic):
>
> A rewrite operation has created/recorded a new version of this
> changesets. that old version no longer needs to be shown to users (if
> possible). There is a strong semantic associated with such changesets
> and the property will be propagated to other repositories
>
> :temporary/internal changesets:
>
> Such changesets are created as a side effect of other operation
> (amend, histedit, etc). They have the following properties (once we
> are done using them)
>
> * They are irrelevant to the users,
> * We should never-ever base anything on them,
> * We should never-ever access them ever again,
> * They should never-ever leave the repository.
> * They never stop being internal-changesets
>
> :locally-hidden changeset:
>
> An hypothetically (not used) yet local-only hidden mechanism
> (similar to strip but without actual repo stripping). This could be
> used for local hiding/masking of changeset in the "real-changeset"
> space. Such data would not be exchanged. Commit reappears when
> "recreated" or re-pulled.
>
> ---
>
> To take practical examples:
>
> :amend:
> - amend source: "obsolete-changeset"
> - temporary changeset: "internal-changeset"
>
> :rebase:
> - rebase source, on success: "obsolete-changeset"
> - rebase result, on abort: "local-hiding"
>
>
> :histedit:
> - histedit source, on success: "obsolete-changeset"
> - histedit temporary node, on success: "internal-changeset"
> - histedit result, on abort: "local-hiding"
> - histedit temporary node, on abort: "local-hiding (internal-node)"
>
> extra notes:
> * If local hiding (strip) would take care of obsolescence marker,
> rebase and histedit could create them "as they go" instead of to the
> end: "on success".
> * In rebase --abort and histedit --abort, strip is used on freshly
> created changesets, so its performance impact is limited
> * we use "local hiding" (strip) on temporary nodes on histedit abort
> because we needs to be able to reuse them if histedit is rerun. So we
> cannot throw them in oblivion just yet.
>
> ---
>
> regarding implementations
>
> "obsolete-changesets": we already have a solid mechanism for that.
>
> "internal-changesets"" has very constrained property. it allow us to
> abuse obsolescence markers for them.
> It would be quite easy and fast to build an ad-hoc mechanism for them
> and stop abusing obsmarkers. (for example a simple bitmaps or maybe
> some "internal-*" phases. Having a dedicated tacking of internal
> changesets helps enforcing the their property and simplify the
> obsmarker space.
>
> "locally-hidden-changesets": this is trickier. For now the only usable
> mechanism we have for them is stripping. We cannot use obsolescence
> marker for them since obsolescence markers are building a global
> persistent and distributed stated and we are explicitly aims at
> staying light and local in this case (I can elaborate on that last bit
> if you think it is needed). Having or not having node-version does not
> make a difference here. locally-hiding do not want to have any global
> side effect and creating obsmarkers will always results in global side
> effect.
>
> Having local-hidding mechanism is actually not too far away, it
> requires a couple of things:
> 1) applying the hiding,
> 2) storing the data,
> 3) automatic lifting of the hiding.
>
> (1) is already here for free. As explained during the sprint, the API
> for hidden things is already flexible enough from the start to be fed
> with other mechanism than just obsolescence.
> (2) should not be too complicated, either a bitmap approach or a root
> based approached (bitmap is probably more solid).
> (3) probably already exist in Jun "archived" experiment. We have to
> make sure recommit/repulling a locally-hidden changeset makes it
> available again.
>
> Overall, this is not too much work.
>
> One extra thing to take care of is obsmarkers. When we strip a
> changesets we want to be able to strip directly related obsmarker.
> Strip is not currently able to do so, but it will be able to do so
> soon and we have a clear technical path here. So non-destructive
> "hiding" of changeset would needs its counterpart for obsmarkers
> before we can use it in case where obsmarkers are involved..
>
>>> It is about recording the
>>> evolution of time of changesets (predecessor ← [successors]). This
>>> data is
>>> used in multiple place for advance logic, exchanged across
>>> repository etc.
>>> Being too heavy handed on stripping will have problematic
>>> consequence for
>>> the user experiences. Having an easy interfaces to do such heavy handed
>>> pruning will encourage developer do to such mistake.
>>
>> I'm struggling a bit to understand your argument here. I think your
>> argument is this: hiding changes can be done for many reasons, and
>> having a simple "just hide this I don't care how" API means people
>> won't think before they do the hiding. Is that an accurate
>> simplification and restating of your position here? (If no, please try
>> and restate with more nuance.)
>
> My point go a bit further than that. Let me restate:
>
> * The purpose of obsolescence marker is to (globally) track the
> evolution of changesets through rewrite,
>
> * The fact obsolete changesets get hidden is just an (intended)
> consequence of this tracking,
>
> * Creation of new obsolescence markers should be done after careful
> consideration and weighting.[1]
>
> This current proposal goes in the wrong direction in regards with the
> above. Creating a shinny "This hides nodes" hammer that use obsmarker
> underneath is misleading. It will encourage developer to create
> obsolescence markers in case where it is inappropriate to do so.
> Because "hide" is the wrong semantic for markers.
>
> On a general basis, a function creating markers without taking a
> (predecessors → successors) mapping is a strong hit that this function
> not approaching obsolescence markers in an appropriate way.
>
> [1] eg: we allow ourself to use obsolescence marker for some internal
> changeset because we have carefully though about it and we decided it
> won't be problematic. I would rather not do it at all.
>
>>> The fact this function only takes a list of nodes (instead of (pec,
>>> suc)
>>> mapping) is a good example of such simplification. Series from Jun
>>> about
>>> histedit is another. Obsolescence markers are not meant as a generic
>>> temporary hidden mechanism for local/temporary nodes and should not
>>> be used
>>> that way.
>>
>> In broad strokes, I agree here. Have we made any conclusive progress
>> on a mechanism for archiving changes? The reason we're seeing all
>> these "abuses" of obsolete markers is that they're currently the only
>> append-only means by which things can be discarded, and there are (as
>> you know) ugly cache implications to running strip.
>
> Strip is far from ideal. However there are multiple mitigation
> mechanism that exist today to limit its impact on cache and there is
> many low hanging fruit to improves this mitigation.
>
> In addition, changeset filtering currently also have some bad impact
> on cache, so they do not simply goes away if we stop using strip.
> Their are also many low hanging fruits to mitigate this impact.
>
>> (I'm sort of +0 on this series, but I'd be more enthusiastic about
>> having a proper hiding mechanism in place that isn't obsmarkers before
>> we go too far down this road.)
>
> To summarize my rational above, the proposed function is misusing the
> obsolescence markers semantic and will leads to problematic usages of
> them. So I'm -1 for it.
>
> I'm not just bringing this out of potential fear. In the past years
> I've seen (and prevented) people to misuse obsolescence marker for
> histedit-abort, rebase-abort, and histedit-abort again, etc… (we just
> fixed a regression as this thread was starting).
>
> I hope this long message help to clarify various concept. We have way
> forward to reduce the use of stripping without abusing the
> obsolescence concept in a way that will create issue for users. These
> way forward are in reach and would not take too long to build.
>
>
For what it's worth, I find this essay pretty convincing: obsmarkers
have their use-case, they are lightly abused now (for temp-amend-commit)
but it probably makes sense to not abuse them further.
It sounds like an "internal" phase above "secret" would actually map
fairly well onto the temp-amend-commit, hidden shelves never meant to be
exchanged, aborted rebase, etc places.
Pierre-Yves, since you've thought about this a lot, does a phase here
make sense (note that I'm not saying a general hiding place like
"archive" phase, but just for internal nodes that we generally don't
want shown or exchanged, but are also not "technically obsolete".
More information about the Mercurial-devel
mailing list