[PATCH 0 of 2 RFC] auditing paths on flag changes

Adrian Buehlmann adrian at cadifra.com
Sun May 22 02:31:56 CDT 2011

On 2011-05-22 01:12, Matt Mackall wrote:
> On Sun, 2011-05-22 at 00:32 +0200, Adrian Buehlmann wrote:
>> I'll most likely get a layering violation brown bag from Matt for this.
> Actually, no, I don't see a problem with this.

Ok. Then I'm fine with this patch series getting included, since from a
functional standpoint, I'm convinced that this patch series does the
right thing.

>> We probably need to have a design chat about which audit cache should
>> be used for what tree.
>> My understanding is, that the cache of the auditor of the repo's wopener is
>> conceptually tied to _the tree we're updating to_ (on 'hg update').
> Not sure what you mean by 'tree' here. You seem to be suggesting the
> directory state at a given revision. But it's really just attached to
> the working directory. 

Sure. We try to transform that working directory from state A into a
different tree state B, by taking instructions from changesets, which we
might have pulled from an evil remote attacker, so we have to examine
these instructions to make sure we don't blindly modify local files
(outside the working directory) by traversing symlinks.

> Now, it may be the case that the cache can get confused if it persists
> across multiple updates, and we should be nuking it across working
> directory 'transactions' (ie periods where we hold the wlock).


One detail that started me thinking was that we have multiple audit
caches. For example, I spotted another one at localrepo.auditor.

I'm wondering a little bit whether we should "simply" quit using the
wopener's audit cache in applyupdates for the repo.wwrite(s).

Otherwise we might get derailed by poisoned or stale cache content when
applying the changes.

We would probably need two separate transient auditors with separated
caches in applyupdates: one for the unlinks (these are tied to state A
of the working dir) and another one for wwrite(s), flag changes and
targets of moves (these are tied to state B of the working dir).

State A is the precondition of applyupdates and state B is the

> But other
> than that, the fact that a cache exists shouldn't be a visible
> implementation detail.

I think applications like TortoiseHg could probably benefit from such a

There are probably other GUIs or tools in the wild which might keep repo

> (We could have another long discussion about whether we should be paying
> attention to the TOCTOU exploits possible here that could be exploited
> by hostile local users with write permissions in the working directory.
> But my general approach here is that once you've given a local hostile
> user write permissions in your working directory, you've already lost so
> we can ignore all -local- coherency issues for the purposes of the
> cache.)

Agreed. I'm interested here in scrutinizing changesets pulled from
potentially evil remotes.

As you have demonstrated in the traversing symlinks thread.

>> But we might have a problem already with the current design: What happens if
>> someone uses the same localrepo object for multiple operations (e.g. multiple
>> updates)?
>> merge.applyupdates currently uses a local transient auditor for files that are
>> unlinked. Which is good because thanks to this, these paths won't be added
>> to the auditor's cache of localrepo.wopener. But is this enough? Can't these
>> paths already be lingering in the auditor's cache of localrepo.wopener from earlier
>> file accesses?
>> Shouldn't we remove the paths of the files we are unlinking from the auditor's
>> cache?

More information about the Mercurial-devel mailing list