[PATCH 4 of 4 V2] obsolete: allow cycles

Wed Mar 22 01:42:11 EDT 2017

Excerpts from Gregory Szorc's message of 2017-03-21 21:25:03 -0700:
> Correct me if I'm wrong, but can't incorrect clocks result in the "wrong"
> version of a changeset being visible? For example, I'm working on different
> machines and pushing changes to a central server. I create divergence then
> correct it later via various obsolescence markers. But, because of clocks,

Let's look at an example of divergence:

    o   C (amended from A)
    |
    | o B (amended from A)
    |/
    x   A

If you see the above divergence on one machine, fix it with "hg prune B -s C",
and verify the fix on that machine, then the fix applies globally. That is,
once the related markers are sent elsewhere, the fix will be observed there
as expected.

If the clock is wrong, what you may notice is that "hg prune" looks like a
no-op. But it does not unhide anything.

In this case only "A" is hidden. Since you are not creating markers
involving "A" to resolve the divergence, nothing gets "unhidden".

I'm trying to construct a case that you may be concerned about, to explain
the difference caused by date:

                                        o C (amended from B)
                                        |   (marker 2.2)
    o B (amended from A)                | x B (amended from A)
    |   (marker 1.1)                    |/    (marker 2.1)
    x A                                 x A

    Machine 1                          Machine 2

Then, if Machine 1 gets markers from Machine 2 and marker 1.1 is newer than
marker 2.1, B stays visible and is divergent. You may think "B" should be
hidden, but the result is explainable. Consider the following order:

    1. hg amend # A -> B
    2. hg amend # B -> C
    3. hg update --hidden -r A; ...; hg amend # A -> B

If "3" is the latest operation, its intent is to make whatever it creates
visible - that's B. And that creates a divergence.

The key here is the order of operations 2 and 3. If 2 happens after 3, then B
gets hidden as expected. Currently, this cycle will get both B and C hidden,
which I don't think is sane behavior.
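
To make the ordering argument concrete, here is a toy model in Python of the
three operations above. It is only a sketch, not the code in this patch: the
"Marker" tuple, the integer dates, and the rule "replay oldest-first, each
marker hides its precursor and revives its successor" are simplifying
assumptions made for illustration.

```python
from typing import Iterable, NamedTuple, Set


class Marker(NamedTuple):
    # Hypothetical, simplified marker: one precursor, one successor.
    precursor: str
    successor: str
    date: int  # illustrative timestamp; larger means newer


def hidden_after_replay(markers: Iterable[Marker]) -> Set[str]:
    """Replay markers oldest-first (toy rule, assumed for this sketch)."""
    hidden: Set[str] = set()
    for m in sorted(markers, key=lambda m: m.date):
        hidden.add(m.precursor)      # the rewritten commit is hidden
        hidden.discard(m.successor)  # whatever the operation creates is visible
    return hidden


# Operation 3 (the second "A -> B") is the newest: B is revived next to C.
three_last = [Marker("A", "B", 1), Marker("B", "C", 2), Marker("A", "B", 3)]
# Operation 2 ("B -> C") is the newest: B ends up hidden.
two_last = [Marker("A", "B", 1), Marker("A", "B", 2), Marker("B", "C", 3)]

print(sorted(hidden_after_replay(three_last)))  # ['A']
print(sorted(hidden_after_replay(two_last)))    # ['A', 'B']
```

In the first ordering B and C are both visible, which is the divergence
described above. In the second, only C remains visible.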

> the "wrong" changeset is unhidden and I unknowingly start basing new work
> on it instead of the "correct" changeset. I accidentally land a predecessor
> because I assumed obsolescence "just worked" and introduce a subtle bug in
> the process.
> 
> What are the processes in place to prevent this from happening? How will a
> power user even know to use a power command to fix the situation?

They will notice that an hg command looks like a no-op. We may be able to
print a friendly hint about how to fix it, or just write a faked date
automatically with a warning.

> >
> > Not to mention most systems have sane clocks,
> 
> This is dangerous to believe.
> 
> Anecdotally, I've experienced the following problems on just the machines I
> use day-to-day:
> 
> * An old battery on a motherboard caused the system clock to go haywire,
> drifting so fast that the default ntp configuration failed to readjust the
> clock fast enough, leading to clock skew on the order of several minutes.
> Other times, the system would boot up without any persisted time, thinking
> it was at some epoch. This resulted in mtimes decades in the past.
> 
> * Hyper-V attempts to adjust the system clock inside running Linux VMs by
> default. Windows was on California time but the Linux VM was on UTC. The
> timezones conflicts and every few seconds Hyper-V and something in systemd
> would race to fix the system time. The system time bounced around 7 hours
> at a time.
> 
> * Virtual machines (in Virtualbox IIRC) resumed after hibernate with the
> system time from when they were powered down. It took several seconds or
> minutes for NTP or some other process to kick in to adjust the VM's clock
> to reality.
> 
> And these problems all occurred on machines with internet connectivity.
> Could you imagine what would happen if they weren't able to communicate
> with an NTP server?

As I said, the worst case is only about visibility - no repo corruption, no
data loss, no broken consistency (e.g. two synchronized repos having a
different understanding of things). And visibility can be fixed by simply
adding markers.

> > unrelated - only those cycles matter - cycles are uncommon.
> >
> 
> OK. Maybe that makes things better. In what scenarios can we get cycles?

Sorry, I was wrong. Precisely speaking, the date will also affect non-cycles
where a successor is another marker's precursor. So in theory it could
unhide commits in non-cycle situations.

You can think of it as approximately "sorting markers within a single marker
chain". It's definitely not sorting all markers in obsstore, which would
also be unacceptably slow.

We avoid creating cycles currently. But at Facebook "unamend" needs to
restore the old hash, which will be done by just creating a cycle.
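
As a rough illustration of the "unamend" cycle, the Python sketch below
compares two toy visibility rules on the markers "A -> B" followed by
"B -> A". Both rules are simplified assumptions for this example (plain
tuples, integer dates), not the actual obsolete.py logic: the order-free rule
hides anything recorded as a precursor, and the date rule replays markers
oldest-first.

```python
def hidden_precursor_rule(markers):
    """Order-free toy rule: anything recorded as a precursor is hidden."""
    return {prec for prec, _succ, _date in markers}


def hidden_date_rule(markers):
    """Date-ordered toy rule: replay oldest-first, the newest marker wins."""
    hidden = set()
    for prec, succ, _date in sorted(markers, key=lambda m: m[2]):
        hidden.add(prec)      # the precursor is hidden
        hidden.discard(succ)  # the successor becomes visible again
    return hidden


# (precursor, successor, date): amend A -> B, then unamend B -> A.
cycle = [("A", "B", 1), ("B", "A", 2)]

print(sorted(hidden_precursor_rule(cycle)))  # ['A', 'B']
print(sorted(hidden_date_rule(cycle)))       # ['B']
```

Under the order-free rule the cycle hides both commits. With dates, the
unamend is the newest marker, so A (the old hash) is visible again.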

> >
> > I think dsop summaried this up well:
> >
> >   Mar 13 10:57:26 <dsop> junw: so basically it boils down to: using date is
> >   not perefct, it makes the solution easy and elegant and if clocks on
> >   computers are wrong, the user might have a non-optimal user experience,
> >   but we never loose data
> >
> 
> Strictly speaking, yes, we never lose data. But I'd argue that using the
> wrong data (changeset) inadvertently would be a massive bug and would
> create distrust in version control.

In theory we can build a DAG for the obsstore, and have some fancy conflict
resolution. I feel that makes the problem unnecessarily complex, and the
code complexity isn't worth the problem it solves (or creates).

> >
> > I've been thinking about the cycle problem for a long time and don't think
> > there is a better solution practically. The current approach (tens of
> > lines)
> > is probably the most elegant thing I've ever contributed to the list.
> > You're
> > encouraged to suggest new ideas. But if the new idea is like some fancy
> > format change plus some fancy conflict resolution during exchange, which
> > sounds like thousands of lines, I think it's reasonable to say no-no to it.
> >
> 
> I agree the solution is simple, elegant, and probably works most of the
> time. But you've built this castle on an unstable foundation. Convince me
> it doesn't matter.
> 
> As for alternatives, a correct solution needs to refer to the marker it is
> replacing. Since cycles should be rare, can we record the full "replaced
> marker" data inline in the new marker? In local storage, that could be an

If "markers being replaced" are explicitly recorded, you will miss remote
markers that could be replaced, because you don't know about them at the
time you append a new local marker. So you end up with some "conflict
resolution" logic during exchange.

That is not very different from just using the offsets - since obsstore is
append-only, new markers just "replace" old ones (I don't think there is
ever a case, when working locally, where a newly added marker is intended to
be replaced by an earlier one). It's simpler but has the same exchange
headache.
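
A small Python sketch of that exchange headache, under the same toy
assumptions as before (tuples of precursor, successor and an integer date;
replaying a marker hides its precursor and revives its successor). The two
"stores" are hypothetical: they hold the same markers but appended in a
different order, as could happen after exchange.

```python
def replay(markers):
    """Apply markers in the given order (toy rule, assumed for this sketch)."""
    hidden = set()
    for prec, succ, _date in markers:
        hidden.add(prec)
        hidden.discard(succ)
    return hidden


amend = ("A", "B", 1)    # created first
unamend = ("B", "A", 2)  # created second

# Same markers, different append order on two hypothetical machines.
store1 = [amend, unamend]
store2 = [unamend, amend]

# Offset order: each machine trusts its own append order.
by_offset = [replay(store) for store in (store1, store2)]
# Date order: each machine sorts by the recorded date before replaying.
by_date = [replay(sorted(store, key=lambda m: m[2])) for store in (store1, store2)]

print(by_offset)  # [{'B'}, {'A'}]
print(by_date)    # [{'B'}, {'B'}]
```

With offsets the two machines disagree about what is hidden unless exchange
reorders or reconciles markers. With dates they agree without any extra
logic, as long as the recorded dates reflect the real order.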

> integer offset to the marker. (Honestly, I don't fully understand the
> problem space. I just saw "sort by clocks" and all kinds of alarm bells
> went off.)