Bug 6303 - hg complains about "ignoring uknown working directory parent" in commands that don't even need the working copy
Summary: hg complains about "ignoring uknown working directory parent" in commands tha...
Status: RESOLVED FIXED
Alias: None
Product: Mercurial
Classification: Unclassified
Component: Mercurial (show other bugs)
Version: 5.3
Hardware: PC Linux
: urgent bug
Assignee: Bugzilla
URL:
Keywords: regression
Depends on:
Blocks:
 
Reported: 2020-04-20 11:28 UTC by vgatien-baron
Modified: 2020-06-01 00:11 UTC (History)
2 users (show)

See Also:
Python Version: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description vgatien-baron 2020-04-20 11:28 UTC
If I commit in a loop and in parallel log in a loop, and the log command doesn't need to know about the dirstate, it looks at it anyway since 5.3 (it seems because of this localrepo._quick_access_changeid_wc business) and cause racy prints:


commit-loop:

#!/bin/bash                                                                                                                                                                 
export HGRCPATH=
cd /tmp
rm -rf repo
hg init repo
cd repo
echo a > a
hg commit -Am_
i=0
while true; do
  i=$((i + 1))
  echo $i
  echo $i > a
  hg commit -m _
done

log-loop:
#!/bin/bash                                                                                                                                                                 
export HGRCPATH=
cd /tmp/repo
while true; do
  i=$((i + 1))
  echo $i
  hg log -q > /dev/null
done

$ ./commit-loop & ./log-loop
...
4
5
warning: ignoring unknown working parent c52cf1df54c5!
6
7
...


I expect there should be no print, which is what happens in 5.2.

There are two bugs here:
- there's no reason the dirstate should be read (strace confirms hg log reads the .hg/dirstate in 5.3 but not 5.2)
- for better transactionality, the dirstate should be open (not necessarily read) before opening the changelog, so the revisions in the dirstate point to existing revisions (if we ignore removal of committed revision with hg rollbacks or hg strip)
Comment 1 Bugzilla 2020-05-01 00:01 UTC
Bug marked urgent for 10 days, bumping
Comment 2 Pierre-Yves David 2020-05-04 11:58 UTC
The dirstate is read, because it is used to fast path some lookup. However, that fast should now result in such spurious print. Looking into it.
Comment 3 Pierre-Yves David 2020-05-04 12:25 UTC
Actually, something trigger the dirstate loading earlier.
Comment 4 Pierre-Yves David 2020-05-04 15:05 UTC
So, the current issue is that the `mercurial.logcmdutil.getrevs` function load the changelog (through a `_initialrevs` call) then a call to `_makematcher` will create a workingctx that will trigger a dirstate loading.

forcingly loading the dirstate before the `_initialrevs` fixes it.

I currently see multiple way to improve this.

1) try to prevent the dirstate loading in the current code → could solve the current issue, but are at risk of later change triggering it again

2) pre-loading the dirstate in `mercurial.logcmdutil.getrevs` → works but we can have the same issue in other code

3) blindly pre-load the dirstate before loading the changelog → ensure the loading order but might have performance impact,

4) pre-open the dirstate file and load later (Valentin suggestion) → more complex and will be frown upon by some OS (eg: windows)

5) Further repository format rework to ensure transaction on this → much more work

Right now, I am vaforing pre-loading a simple but efficient way to solve the issue. Either locally in the function or globally (option 2 and 3). If the performance impact of the generic on real word repository seems small enough, that is probably the best to do.
Comment 5 Pierre-Yves David 2020-05-11 10:41 UTC
Ther is a simple way to only preload the `parents` part of the dirstate before loading the changelog. The performance impact is minimal except when using the Rust extension in some specific cases. We are digging this out with Raphaël and wil submit the patch once the Rust performance regression is fixed.
Comment 6 Bugzilla 2020-05-22 00:01 UTC
Bug marked urgent for 10 days, bumping
Comment 7 HG Bot 2020-05-24 07:55 UTC
Fixed by https://mercurial-scm.org/repo/hg/rev/35b255e474d9
Pierre-Yves David <pierre-yves.david@octobus.net>
dirstate: make sure the dirstate is loaded before the changelog (issue6303)

Before this change, it was possible for the changelog to be loaded before the
dirstate. If a transaction happens betwen the changelog and dirstate reading,
the dirstate can up end poitning toward a revision not existing in the (olded)
changelog. This lead to a warning.

With this revision, we preload the dirstate parent before reading the changelog.
This has a negligible performance impact on performance for all case we are
tracking.

Differential Revision: https://phab.mercurial-scm.org/D8528

(please test the fix)
Comment 8 Bugzilla 2020-06-01 00:11 UTC
Bug was set to TESTING for 7 days, resolving