[PATCH 1 of 3 V2] util: add method to hash nested combination of python data structures

Sun Jul 12 08:27:41 CDT 2015

On 07/09/2015 03:23 PM, Yuya Nishihara wrote:
> On Wed, 8 Jul 2015 16:55:09 +0000, Laurent Charignon wrote:
>> On Jul 8, 2015, at 6:32 AM, Yuya Nishihara <yuya at tcha.org<mailto:yuya at tcha.org>> wrote:
>> On Tue, 7 Jul 2015 15:21:08 +0000, Laurent Charignon wrote:
>> On Jul 7, 2015, at 5:54 AM, Yuya Nishihara <yuya at tcha.org<mailto:yuya at tcha.org>> wrote:
>> On Mon, 6 Jul 2015 11:42:40 -0700, Laurent Charignon wrote:
>> # HG changeset patch
>> # User Laurent Charignon <lcharignon at fb.com<mailto:lcharignon at fb.com>>
>> # Date 1435794507 25200
>> #      Wed Jul 01 16:48:27 2015 -0700
>> # Node ID 38cd2c6265855f0a8e65e02e2cc181921df498ca
>> # Parent  2748bf78a5bf610da4f2d90fd1eea19a3b360c04
>> util: add method to hash nested combination of python data structures
>>
>> The goal of this series of patches is to have a method to compute the hash of
>> config objects. This will enable restarting the command server when the config
>> change.
>>
>> I'm curious who and how the command server will be restarted. This patch seems
>> to imply that we can't rely on the mtime of the config files.
>>
>> - I will send you a pull request for that
>>
>> A pull request? Oh, it seems you're playing with chg, thanks.
>>
>> Yep, I actually sent it, you can have a look :)
>
> Thanks, I'll look into it.
>
>> - We can, but if the config ends up being the same we don't want to restart
>> the command server preemptively.
>>
>> Hmm, because config files are edited by user, I think we can assume that the
>> mtime change denotes the content change in most cases.
>>
>> How about automated deployment through configuration tools?
>
> Does the deployment tool run with chg?

Goal will be to have everything running with chg.

I think we should stick to the config hashing for now (or any "always" 
right alternative). Using mtime have issues (cf foozy series about 
dirstate and mtime). I would rather go with the full and solution 
solution first and then see if performance are an issue looking for a 
better solution.

One of the goal here is to use chg with generic test, script and 
automatic deployment tool. They will modify config and run hg close to 
each other.

What is the slow down involved by the config hashing here? My 
expectation is that it will nefligible. In all cases, I would be 
surprise if chg+config-hashing is slower that no-chg.

>> What if the client adds a space or a comment, do we really want to
>> restart the server then?
>
> Well, it's acceptable cost for me. It's just an extra ~100msec, is shorter than
> I type C-x C-s C-x C-c.

I stand with yuya here. space change in the config is a config change. 
It will be rare enough so that we can bare with a server restart here.

>> My idea for chg is:
>>
>> 1. server writes mtime (or sha1 hash) and path to "rcmtimes" file at startup
>> 2. client read "rcmtimes" to detect config change (everything done in C *1)
>> 3. kill -TERM, wait and respawn server, or kill -HUP ?

This server lifecycle being controled by its many client seems strange 
to me. In my opinion, the only responsibility the client should have 
regarding the server is to get one rolling if it does not already 
exists. All life cycles decisions (shutdown//reload if config changed, 
shutdown after timeout) should be done by the server, probably controled 
by config option.

> (OT: we'll also want to detect change on mercurial/__version__.py)

ho yeah! good catch.

>> Your idea seems simpler to implement, however, how do you deal with clients
>> that were connected to the same server before, what is going to happen to them?
>
> Perhaps the forked server process will stay until the client disconnects. If
> we want to make the main process take care of forked processes, maybe we should
> implement the signal handler appropriately. I didn't carefully write chg to be
> safe for a startup/shutdown race.

I think the forker server process will stay around until the client 
disconnect (thanks to fork). This will create issue for client "in the 
process to be forked", rare but possible occurrence.

> My position for this patch:
>
>   - it won't be a general solution for the issue of the command server
>     https://mercurial.selenic.com/wiki/CommandServer#Known_issues
>   - because in common scenario of using pipe, server and client are 1:1,
>     a single long-lived connection + many "runcommand" requests,
>     which is very different from how chg works
>   - so for now, I don't think this patch will be useful in Mercurial core

I think the chg use case is worthwhile (one server, multiple 
connection). How is hglib working in this regard ? Having a server 
capable of restarting on config changes seems a valuable option in 
general. I think this is series (config comparison) still make senses 
for core.

I also think it make sense for a long running connection to keep the 
same config (the same way it keeps the same code and in memory object).

(on other note, I think have chg available in core would also be good. 
Would you be okay with that?)

-- 
Pierre-Yves David