[PATCH 1 of 2] verify: document corner cases
Ryan McElroy
rm at fb.com
Thu Mar 30 13:15:11 EDT 2017
On 3/30/17 5:01 PM, Jun Wu wrote:
> Excerpts from Ryan McElroy's message of 2017-03-30 09:55:50 +0100:
>> On 3/29/17 10:51 PM, Jun Wu wrote:
>>> # HG changeset patch
>>> # User Jun Wu <quark at fb.com>
>>> # Date 1490823901 25200
>>> # Wed Mar 29 14:45:01 2017 -0700
>>> # Node ID 1f7890370b5437466534cbd0a313d21671dade03
>>> # Parent e9fda3b8614a8b701bd48041afa8b709e1227f27
>>> verify: document corner cases
>>>
>>> It seems a good idea to list all kinds of "surprises" and expected behavior
>>> to make the upcoming changes easier to understand.
>>>
>>> Note: the comment added do not reflect the actual behavior of the current
>> grammar-nit: s/do/does
>>
>>> code.
>>>
>>> diff --git a/mercurial/verify.py b/mercurial/verify.py
>>> --- a/mercurial/verify.py
>>> +++ b/mercurial/verify.py
>>> @@ -381,4 +381,44 @@ class verifier(object):
>>>
>>> # verify contents
>>> + #
>>> + # Define 4 cases to help understand corner cases:
>>> + #
>>> + # | common | rename | meta | ext
>> Can you define these four words?
> The first table is supposed to define them precisely.
I guess it didn't for me, which is why I asked and thought "english
descriptions" would be useful.
>
>> common = common case?
>> rename = file was just copied from somewhere else?
>> meta = ???
> "rename" is called "metadata" in filelog.py, so it means metadata without
> rename.
When would this happen? It's still not clear to me.
>
>> ext = extension that adds a flags processor?
>>> + # -------------------------------------------------------
>>> + # flags() | 0 | 0 | 0 | not 0
>>> + # renamed() | False | True | False | ?
>>> + # rawtext[0:2]=='\1\n'| False | True | True | ?
>>> + #
>>> + # "rawtext" means the raw text stored in revlog data, which
>>> + # could be retrieved by "revision(rev, raw=True)". "text"
>>> + # mentioned below is "revision(rev, raw=False)".
>>> + #
>>> + # There are 3 different lengths stored physically:
>>> + # 1. L1: rawsize, stored in revlog index
>>> + # 2. L2: len(rawtext), stored in revlog data
>>> + # 3. L3: len(text), stored in revlog data if flags=0, or
>>> + # possibly somewhere else if flags!=0
>>> + #
>>> + # L1 should be equal to L2. L3 is unrelated to L1 or L2. But
>> How about "L3 can be different than L1 and L2" -- it would be the same
>> if there are no flags, right?
> Yes. I notice the minor difference between the language. Thanks for the
> suggestion!
>
>>> + # text may or may not affect commit hash depending on flag
>>> + # processors (see revlog.addflagprocessor).
>>> + #
>>> + # | common | rename | meta | ext
>>> + # -------------------------------------------------
>>> + # rawsize() | L1 | L1 | L1 | L1
>>> + # size() | L1 | L2-LM | L1(*) | L1 (?)
>>> + # len(rawtext) | L2 | L2 | L2 | L2
>>> + # len(text) | L2 | L2 | L2 | L3
>>> + # len(read()) | L2 | L2-LM | L2-LM | L3
>>> + #
>>> + # LM: length of metadata, depending on rawtext
>>> + # (*): not ideal, see comment in filelog.size
>>> + # (?): could be "- len(meta)" if the resolved content has
>>> + # rename metadata
>>> + #
>>> + # Checks needed to be done:
>>> + # 1. length check: L1 == L2, in all cases.
>>> + # 2. hash check: depending on flag processor, we may need to
>>> + # use either "text" (external), or "rawtext" (in revlog).
>>> try:
>>> l = len(fl.read(n))
>> Overall this is pretty easy to follow and adds a lot of clarity. Just a
>> couple of questions inline.
More information about the Mercurial-devel
mailing list