[PATCH 1 of 3 V2] util: add method to hash nested combination of python data structures

Pierre-Yves David pierre-yves.david at ens-lyon.org
Tue Jul 14 05:50:59 CDT 2015



On 07/12/2015 03:37 PM, Yuya Nishihara wrote:
> On Sun, 12 Jul 2015 14:27:41 +0100, Pierre-Yves David wrote:
>> On 07/09/2015 03:23 PM, Yuya Nishihara wrote:
>>> On Wed, 8 Jul 2015 16:55:09 +0000, Laurent Charignon wrote:
>>>> On Jul 8, 2015, at 6:32 AM, Yuya Nishihara <yuya at tcha.org> wrote:
>>>> On Tue, 7 Jul 2015 15:21:08 +0000, Laurent Charignon wrote:
>>>> On Jul 7, 2015, at 5:54 AM, Yuya Nishihara <yuya at tcha.org> wrote:
>>>> On Mon, 6 Jul 2015 11:42:40 -0700, Laurent Charignon wrote:
>>>> - We can, but if the config ends up being the same we don't want to restart
>>>> the command server preemptively.
>>>>
>>>> Hmm, because config files are edited by user, I think we can assume that the
>>>> mtime change denotes the content change in most cases.
>>>>
>>>> How about automated deployment through configuration tools?
>>>
>>> Does the deployment tool run with chg?
>>
>> Goal will be to have everything running with chg.
>>
>> I think we should stick to the config hashing for now (or any "always
>> right" alternative). Using mtime has issues (cf. foozy's series about
>> dirstate and mtime). I would rather go with the full solution first
>> and then, if performance turns out to be an issue, look for a
>> better solution.
>>
>> One of the goals here is to use chg with generic tests, scripts and
>> automatic deployment tools. They will modify the config and run hg in
>> quick succession.
>
> Oh, I see. If we want to run the tests using chg, mtime isn't enough,
> especially on ext3 that provides low time resolution.
>
>> What is the slowdown involved in the config hashing here? My
>> expectation is that it will be negligible. In all cases, I would be
>> surprised if chg+config-hashing is slower than no-chg.
>
> My point is that the proposed hashing function has little benefit. It's slower
> than __eq__() and we have objects to be compared.
>
> Also, I'm not sure if a hash() value is strong enough against collisions,
> because it is designed for hash tables, where speed matters more than
> cryptographic strength.

I do not have a strong opinion on how we compare configs, as long as we
do not use loose heuristics like mtime that are known to have issues.

That said, if hashing is a simple, valid and already implemented way to
do the __eq__ comparison, we should maybe move forward with it. If this
creates performance issues, we can be smarter later.
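
To make the trade-off concrete, here is a minimal sketch of content-based
config hashing. It assumes the config can be represented as nested dicts,
lists, tuples, sets and scalars. The names `_digest` and `confighash` are
hypothetical and are not taken from the patch under review. It uses SHA-1
rather than Python's built-in hash(), which addresses the collision concern
raised above at some cost in speed.

```python
import hashlib


def _digest(obj):
    """Return the raw 20-byte SHA-1 digest of obj, recursing into containers."""
    h = hashlib.sha1()
    if isinstance(obj, dict):
        # Digest each (key, value) pair, then sort the pair digests so that
        # insertion order does not affect the result.
        h.update(b"dict:")
        for pair in sorted(_digest(k) + _digest(v) for k, v in obj.items()):
            h.update(pair)
    elif isinstance(obj, (set, frozenset)):
        # Sets are unordered, so sort the member digests as well.
        h.update(b"set:")
        for item in sorted(_digest(i) for i in obj):
            h.update(item)
    elif isinstance(obj, (list, tuple)):
        # Sequences are order-sensitive; lists and tuples hash alike here.
        h.update(b"seq:")
        for item in obj:
            h.update(_digest(item))
    elif isinstance(obj, bytes):
        h.update(b"bytes:" + obj)
    else:
        # Scalars (str, int, float, bool, None): type name plus repr.
        h.update(("%s:%r" % (type(obj).__name__, obj)).encode("utf-8"))
    return h.digest()


def confighash(obj):
    """Return a 40-character hex digest identifying the content of obj."""
    return _digest(obj).hex()
```

Two configs with the same content produce the same digest regardless of
dict ordering, while any change to a value produces a different one, so
the comparison never depends on file timestamps.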


>>>> What if the client adds a space or a comment, do we really want to
>>>> restart the server then?
>>>
>>> Well, it's an acceptable cost for me. It's just an extra ~100msec, which is
>>> shorter than the time I take to type C-x C-s C-x C-c.
>>
>> I stand with Yuya here. A whitespace change in the config is a config change.
>> It will be rare enough that we can bear with a server restart here.
>>
>>>> My idea for chg is:
>>>>
>>>> 1. server writes mtime (or sha1 hash) and path to "rcmtimes" file at startup
>>>> 2. client reads "rcmtimes" to detect config change (everything done in C *1)
>>>> 3. kill -TERM, wait and respawn server, or kill -HUP ?
>>
>> The server lifecycle being controlled by its many clients seems strange
>> to me. In my opinion, the only responsibility the client should have
>> regarding the server is to get one rolling if one does not already
>> exist. All lifecycle decisions (shutdown/reload if config changed,
>> shutdown after timeout) should be made by the server, probably controlled
>> by a config option.
>
> I won't argue if the server can restart itself. I just don't like the
> following:
>
>   1. server shuts down by itself if config changed
>   2. but client has to start a new server

I'm not sure what your issue with it is. I see this as expressible as:

  Client starts a server if none is available.

And that seems very reasonable to me. This treats "available" as "not
dead and not dying", but that is fine with me.
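
A minimal sketch of that client rule, under stated assumptions: `connect`
and `spawn` are hypothetical callables supplied by the caller (they are not
chg functions), and `connect` raises OSError whenever no usable server is
listening, whether it is dead, dying, or was never started.

```python
import time


def getserver(connect, spawn, retries=50, delay=0.1):
    """Return a connection to a live server, starting one if needed.

    connect() returns a connection or raises OSError when no usable
    server is listening; spawn() starts a new server in the background.
    Both are hypothetical callables supplied by the caller.
    """
    try:
        return connect()
    except OSError:
        # Dead, dying, or never started: all look the same to the client.
        spawn()
    # Give the freshly spawned server some time to start listening.
    for _ in range(retries):
        try:
            return connect()
        except OSError:
            time.sleep(delay)
    raise RuntimeError("server did not come up")
```

With this shape the client never decides when a server should stop. It
only reacts to one not being there.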

Can you elaborate on what your issue is?

>> (On another note, I think having chg available in core would also be good.
>> Would you be okay with that?)
>
> Maybe in contrib? That will allow me to eliminate dirty hacks in chg.

Contrib would be the obvious place to start. Once we have the tests
running with it, we can decide if we want more. But having it in
contrib will make the whole idea of running the tests with chg (and a
bit of perf testing with it) much easier for everybody.

-- 
Pierre-Yves David

