[PATCH 1 of 3 V2] util: add method to hash nested combination of python data structures

Sun Jul 12 09:37:14 CDT 2015

On Sun, 12 Jul 2015 14:27:41 +0100, Pierre-Yves David wrote:
> On 07/09/2015 03:23 PM, Yuya Nishihara wrote:
> > On Wed, 8 Jul 2015 16:55:09 +0000, Laurent Charignon wrote:
> >> On Jul 8, 2015, at 6:32 AM, Yuya Nishihara <yuya at tcha.org<mailto:yuya at tcha.org>> wrote:
> >> On Tue, 7 Jul 2015 15:21:08 +0000, Laurent Charignon wrote:
> >> On Jul 7, 2015, at 5:54 AM, Yuya Nishihara <yuya at tcha.org<mailto:yuya at tcha.org>> wrote:
> >> On Mon, 6 Jul 2015 11:42:40 -0700, Laurent Charignon wrote:
> >> # HG changeset patch
> >> # User Laurent Charignon <lcharignon at fb.com<mailto:lcharignon at fb.com>>
> >> # Date 1435794507 25200
> >> #      Wed Jul 01 16:48:27 2015 -0700
> >> # Node ID 38cd2c6265855f0a8e65e02e2cc181921df498ca
> >> # Parent  2748bf78a5bf610da4f2d90fd1eea19a3b360c04
> >> util: add method to hash nested combination of python data structures
> >>
> >> The goal of this series of patches is to have a method to compute the hash of
> >> config objects. This will enable restarting the command server when the config
> >> change.
> >>
> >> I'm curious who and how the command server will be restarted. This patch seems
> >> to imply that we can't rely on the mtime of the config files.
> >>
> >> - I will send you a pull request for that
> >>
> >> A pull request? Oh, it seems you're playing with chg, thanks.
> >>
> >> Yep, I actually sent it, you can have a look :)
> >
> > Thanks, I'll look into it.
> >
> >> - We can, but if the config ends up being the same we don't want to restart
> >> the command server preemptively.
> >>
> >> Hmm, because config files are edited by user, I think we can assume that the
> >> mtime change denotes the content change in most cases.
> >>
> >> How about automated deployment through configuration tools?
> >
> > Does the deployment tool run with chg?
> 
> Goal will be to have everything running with chg.
> 
> I think we should stick to the config hashing for now (or any "always" 
> right alternative). Using mtime have issues (cf foozy series about 
> dirstate and mtime). I would rather go with the full and solution 
> solution first and then see if performance are an issue looking for a 
> better solution.
> 
> One of the goal here is to use chg with generic test, script and 
> automatic deployment tool. They will modify config and run hg close to 
> each other.

Oh, I see. If we want to run the tests using chg, mtime isn't enough,
especially on ext3 that provides low time resolution.

> What is the slow down involved by the config hashing here? My 
> expectation is that it will nefligible. In all cases, I would be 
> surprise if chg+config-hashing is slower that no-chg.

My point is that the proposed hashing function has little benefit. It's slower
than __eq__() and we have objects to be compared.

Also, I'm not sure if a hash() value is strong enough for collision because it
is designed for a hash table where speed is more important than cryptographic
hashes.

> >> What if the client adds a space or a comment, do we really want to
> >> restart the server then?
> >
> > Well, it's acceptable cost for me. It's just an extra ~100msec, is shorter than
> > I type C-x C-s C-x C-c.
> 
> I stand with yuya here. space change in the config is a config change. 
> It will be rare enough so that we can bare with a server restart here.
> 
> >> My idea for chg is:
> >>
> >> 1. server writes mtime (or sha1 hash) and path to "rcmtimes" file at startup
> >> 2. client read "rcmtimes" to detect config change (everything done in C *1)
> >> 3. kill -TERM, wait and respawn server, or kill -HUP ?
> 
> This server lifecycle being controled by its many client seems strange 
> to me. In my opinion, the only responsibility the client should have 
> regarding the server is to get one rolling if it does not already 
> exists. All life cycles decisions (shutdown//reload if config changed, 
> shutdown after timeout) should be done by the server, probably controled 
> by config option.

I won't argue if the server can restart itself. I just don't like the
following:

 1. server shutdown by itself if config changed
 2. but client have to start new server

> > My position for this patch:
> >
> >   - it won't be a general solution for the issue of the command server
> >     https://mercurial.selenic.com/wiki/CommandServer#Known_issues
> >   - because in common scenario of using pipe, server and client are 1:1,
> >     a single long-lived connection + many "runcommand" requests,
> >     which is very different from how chg works
> >   - so for now, I don't think this patch will be useful in Mercurial core
> 
> I think the chg use case is worthwhile (one server, multiple 
> connection). How is hglib working in this regard ? Having a server 
> capable of restarting on config changes seems a valuable option in 
> general. I think this is series (config comparison) still make senses 
> for core.

The hglib spawns its own server process and communicates via stdio, so there's
no "connection" thing. And the current protocol won't allow the server to
restart in the middle of the "runcommand" request.

> I also think it make sense for a long running connection to keep the 
> same config (the same way it keeps the same code and in memory object).

I agree.

> (on other note, I think have chg available in core would also be good.
> Would you be okay with that?)

Maybe in contrib? That will allow me to eliminate dirty hacks in chg.