If I commit in a loop and in parallel log in a loop, and the log command doesn't need to know about the dirstate, it looks at it anyway since 5.3 (it seems because of this localrepo._quick_access_changeid_wc business) and cause racy prints: commit-loop: #!/bin/bash export HGRCPATH= cd /tmp rm -rf repo hg init repo cd repo echo a > a hg commit -Am_ i=0 while true; do i=$((i + 1)) echo $i echo $i > a hg commit -m _ done log-loop: #!/bin/bash export HGRCPATH= cd /tmp/repo while true; do i=$((i + 1)) echo $i hg log -q > /dev/null done $ ./commit-loop & ./log-loop ... 4 5 warning: ignoring unknown working parent c52cf1df54c5! 6 7 ... I expect there should be no print, which is what happens in 5.2. There are two bugs here: - there's no reason the dirstate should be read (strace confirms hg log reads the .hg/dirstate in 5.3 but not 5.2) - for better transactionality, the dirstate should be open (not necessarily read) before opening the changelog, so the revisions in the dirstate point to existing revisions (if we ignore removal of committed revision with hg rollbacks or hg strip)
Bug marked urgent for 10 days, bumping
The dirstate is read, because it is used to fast path some lookup. However, that fast should now result in such spurious print. Looking into it.
Actually, something trigger the dirstate loading earlier.
So, the current issue is that the `mercurial.logcmdutil.getrevs` function load the changelog (through a `_initialrevs` call) then a call to `_makematcher` will create a workingctx that will trigger a dirstate loading. forcingly loading the dirstate before the `_initialrevs` fixes it. I currently see multiple way to improve this. 1) try to prevent the dirstate loading in the current code → could solve the current issue, but are at risk of later change triggering it again 2) pre-loading the dirstate in `mercurial.logcmdutil.getrevs` → works but we can have the same issue in other code 3) blindly pre-load the dirstate before loading the changelog → ensure the loading order but might have performance impact, 4) pre-open the dirstate file and load later (Valentin suggestion) → more complex and will be frown upon by some OS (eg: windows) 5) Further repository format rework to ensure transaction on this → much more work Right now, I am vaforing pre-loading a simple but efficient way to solve the issue. Either locally in the function or globally (option 2 and 3). If the performance impact of the generic on real word repository seems small enough, that is probably the best to do.
Ther is a simple way to only preload the `parents` part of the dirstate before loading the changelog. The performance impact is minimal except when using the Rust extension in some specific cases. We are digging this out with Raphaël and wil submit the patch once the Rust performance regression is fixed.
Fixed by https://mercurial-scm.org/repo/hg/rev/35b255e474d9 Pierre-Yves David <pierre-yves.david@octobus.net> dirstate: make sure the dirstate is loaded before the changelog (issue6303) Before this change, it was possible for the changelog to be loaded before the dirstate. If a transaction happens betwen the changelog and dirstate reading, the dirstate can up end poitning toward a revision not existing in the (olded) changelog. This lead to a warning. With this revision, we preload the dirstate parent before reading the changelog. This has a negligible performance impact on performance for all case we are tracking. Differential Revision: https://phab.mercurial-scm.org/D8528 (please test the fix)
Bug was set to TESTING for 7 days, resolving