[PATCH 1 of 2 V2] verify: document corner cases
Ryan McElroy
rm at fb.com
Fri Mar 31 06:14:34 EDT 2017
On 3/30/17 10:51 PM, Jun Wu wrote:
> # HG changeset patch
> # User Jun Wu <quark at fb.com>
> # Date 1490823901 25200
> # Wed Mar 29 14:45:01 2017 -0700
> # Node ID 17b41390f4912a4c18538d778837bc2cf4a1be92
> # Parent dea2a17cbfd00bf08ee87b3e44b1c71499189f89
> # Available At https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Jw8rundaE7TbmqBYd1txIQ&m=kuj9GJXY5gbugZhlyrbxjthYrYkZ6M3HZHYFrfQg_ZY&s=yj22qlE05uO47PYA0xtQ2kgE_-D5kRIk7mybLrSUZdg&e=
> # hg pull https://urldefense.proofpoint.com/v2/url?u=https-3A__bitbucket.org_quark-2Dzju_hg-2Ddraft&d=DwIGaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=Jw8rundaE7TbmqBYd1txIQ&m=kuj9GJXY5gbugZhlyrbxjthYrYkZ6M3HZHYFrfQg_ZY&s=yj22qlE05uO47PYA0xtQ2kgE_-D5kRIk7mybLrSUZdg&e= -r 17b41390f491
> verify: document corner cases
>
> It seems a good idea to list all kinds of "surprises" and expected behavior
> to make the upcoming changes easier to understand.
>
> Note: the comment added does not reflect the actual behavior of the current
> code.
>
> diff --git a/mercurial/verify.py b/mercurial/verify.py
> --- a/mercurial/verify.py
> +++ b/mercurial/verify.py
> @@ -380,5 +380,51 @@ class verifier(object):
> del filenodes[f][n]
>
> - # verify contents
> + # Verify contents. 4 cases to care about:
> + #
> + # common: the most common case
> + # rename: with a rename
> + # meta: file content starts with b'\1\n', the metadata
> + # header defined in filelog.py, but without a rename
> + # ext: largefiles. content stored elsewhere
> + #
> + # More formally, their differences are shown below:
> + #
> + # | common | rename | meta | ext
> + # -------------------------------------------------------
> + # flags() | 0 | 0 | 0 | not 0
> + # renamed() | False | True | False | ?
> + # rawtext[0:2]=='\1\n'| False | True | True | ?
> + #
> + # "rawtext" means the raw text stored in revlog data, which
> + # could be retrieved by "revision(rev, raw=True)". "text"
> + # mentioned below is "revision(rev, raw=False)".
> + #
> + # There are 3 different lengths stored physically:
> + # 1. L1: rawsize, stored in revlog index
> + # 2. L2: len(rawtext), stored in revlog data
> + # 3. L3: len(text), stored in revlog data if flags==0, or
> + # possibly somewhere else if flags!=0
> + #
> + # L1 should be equal to L2. L3 could be different from them.
> + # "text" may or may not affect commit hash depending on flag
> + # processors (see revlog.addflagprocessor).
> + #
> + # | common | rename | meta | ext
> + # -------------------------------------------------
> + # rawsize() | L1 | L1 | L1 | L1
> + # size() | L1 | L2-LM | L1(*) | L1 (?)
> + # len(rawtext) | L2 | L2 | L2 | L2
> + # len(text) | L2 | L2 | L2 | L3
> + # len(read()) | L2 | L2-LM | L2-LM | L3 (?)
> + #
> + # LM: length of metadata, depending on rawtext
> + # (*): not ideal, see comment in filelog.size
> + # (?): could be "- len(meta)" if the resolved content has
> + # rename metadata
> + #
> + # Checks needed to be done:
> + # 1. length check: L1 == L2, in all cases.
> + # 2. hash check: depending on flag processor, we may need to
> + # use either "text" (external), or "rawtext" (in revlog).
> try:
> l = len(fl.read(n))
>
This comment is very clear to me now. Well done and thanks!
More information about the Mercurial-devel
mailing list