[PATCH 05 of 10] chgserver: add a context manager to learn what to include for confighash
Yuya Nishihara
yuya at tcha.org
Wed Jul 6 09:15:23 EDT 2016
On Tue, 5 Jul 2016 21:10:43 +0100, Jun Wu wrote:
> Excerpts from Yuya Nishihara's message of 2016-07-05 22:46:04 +0900:
> > > I don't quite get your question. So, I draft some questions:
> > >
> > > 1. Is a static whitelist enough?
> > >
> > > No. The issue is 3rd-party extensions.
> > >
> > > We can say this is a limitation of chg and document it somewhere but
> > > people just don't read docs and yell at you whenever they see hg works
> > > while chg does not.
> >
> > As long as third-party extensions use their own config sections, e.g.
> > "remotefilelog.blah-blah", all of them are included in the hash if the list is
> > a whitelist of safe sections. If they read e.g. ui.username in uisetup() and
> > we have to work around that, a static list wouldn't help.
>
> I finally got your idea. It's a "blacklist" about what *not* to hash. I
> object because it leads to unnecessary server processes. Given the fact
> each process needs 100MB+ memory and probably does not share many pages in
> the kernel (not forked from an Python ancestor), I'd treat process number
> seriously.
What you say black is what I see white. Anyway, do you have many variants of
configurations in production? Most of my repository's hgrc files only set
"ui", "paths", and "email", so the not-to-hash list would work for me.
Also, I guess 100MB+ is the VIRT value. I think it is the size of the address
space.
> > Huh? Do they want to trust only [ui]debug ignoring --debug option?
>
> --debug is just a sugar setting the config item "ui.debug", which I think is
> the source of truth:
>
> if options['verbose'] or options['debug'] or options['quiet']:
> for opt in ('verbose', 'debug', 'quiet'):
> val = str(bool(options[opt]))
> for ui_ in uis:
> ui_.setconfig('ui', opt, val, '--' + opt)
Extensions are loaded before the ui options are set.
> > > Therefore I think the "dynamically learn" approach is the way to go.
> > > Developers don't need to know about chg and aren't forced to rewrite their
> > > code (sometimes it's hard) using other patterns. Developers with chg in mind
> > > can write code to reduce the count of chgservers.
> >
> > I prefer the static approach if it can save 90% of the problems with a little
> > penalty of excessive server redirection. Dynamically updating the list would
> > make debugging harder and would need more tests.
>
> I understand your worry. I admit I didn't think too carefully about this
> approach (but now I'm clear). The issue is: if the servers disagree about
> what to hash, they can generate different hashes. It's especially a problem
> if one server changes its idea about what to hash.
>
> I think we should freeze the list at chgserver startup to make sure a single
> server has consistent view about what to hash all the time.
>
> For multiple servers, the following (bad) situation may happen:
>
> 1. server A tell client to redirect to hash1
> 2. client does not find hash1, starts a new server
> 3. the new server generates hash2, while there is already a server running
> at hash2, therefore the starting of the new server is a waste.
>
> To solve it, it seems we have to store the "what to hash" information on
> disk, which is more complicated than I thought.
>
> Therefore, what I think the next steps are:
>
> 1. Add new configs to chgserver deciding what to hash:
>
> [chgserver]
> confighash.sections = section1, section2
> confighash.items = section.item, section.item
>
> This would also serve as the "api" for other extensions to hash their
> configs as they just need to use setconfig.
>
> This is a one-time config, chgserver reads it during startup and forget
> about it afterwards.
>
> 2. (Optionally) add the auto-learn logic. I probably won't actively
> work on this, but put it here as it's a possible step.
>
> What do you think? 1 is pretty easy and straight-forward and solves a lot
> of issues. I may want to follow up with the chg-compat checker test then.
I don't think (1) needs to be a config section, which may be updated by
repository's hgrc. Instead, extensions can update the list by ui/extsetup()
chgserver = extensions.find('chgserver')
chgserver._configsections.add(...)
This will be simpler if chgserver gets out of hgext.
> > > > > Note this is not a solution for all cases, namely the following cases still
> > > > > need extra work:
> > > > >
> > > > > 1. "if ui.config('x'): ui.setconfig('x', 'y')" in uisetup.
> > > > > This will overwrite config 'x' and chg will be unable to hash the value
> > > > > user provides. The pattern is mainly used in the evolve extension. It
> > > > > could be solved if we have clear layers about what are set by extensions
> > > > > and what are set by users.
> > > > >
> > > > > 2. Patterns like "self.ui = ui" in uisetup.
> > > > > This will keep a reference to a stale ui object. We probably want to
> > > > > add devel-warn for the case. It may also be reasonable to add a similar
> > > > > check to reposetup.
> > > > >
> > > > > 3. Interacting with filesystem or network in uisetup.
> > > > > In general, extensions doing these should be aware of race conditions.
> > > >
> > > > (1) and (2) are generally wrong because the ui passed to uisetup() and
> > > > extsetup() are "lui", a temporary copy of ui.
> > >
> > > Why are they wrong?
> > >
> > > For (1), you can have an extension:
> > >
> > > # a.py
> > > def uisetup(ui):
> > > # extdiff is in chgserver._configsections
> > > ui.setconfig('extdiff', 'x', 'x')
> > >
> > > And run chg:
> > >
> > > chg --config extensions.s=$PWD/a.py --config extdiff.x=1 version
> > > chg --config extensions.s=$PWD/a.py --config extdiff.x=2 version
> > >
> > > The second command will not start a new server.
> >
> > IIRC, config values set in uisetup() aren't copied back to the original ui
> > and repo.ui, which is why I said generally wrong.
>
> That's not true :( Otherwise the "# stolen from
> tortoisehg.util.copydynamicconfig()" hack will be unnecessary.
The hack should be necessary because config values are updated by reading
.hg/hgrc.
Try "hg config foo.bar" with/without a repository to see if ui.setconfig()
survives.
def uisetup(ui):
ui.setconfig('foo', 'bar', 'baz')
More information about the Mercurial-devel
mailing list