[PATCH RFC v10] scmutil: add a simple key-value file helper

Kostia Balytskyi kobalyts at outlook.com
Mon Feb 13 17:44:32 EST 2017


On 13/02/2017 21:58, Jun Wu wrote:
> As said before, I still prefer a simpler design without the "required"
> boolean flag, and move the responsibility of checking requirements to the
> higher-level code. Because:
>
>    - As Yuya pointed out at the first place, "required" here will have
>      trouble dealing with future schema changes. So it's not that useful
>      unless you are 100% sure that the schema stays unchanged forever.
That's the whole point of required - to mark fields which you are sure 
can't be removed.  At least the way I see it and the way I intended to 
use it - is to future-proof my schema.
>    - It'll be shorter, and easier to read - no need to jump here to
>      understand what "True" means in "KEYS = ..." in shelve.py.
We can use named tuples for this purpose to say "required=True" ...
>    - It'll be faster, although just slightly.
>
>  From another perspective, "Checking requirements" is part of "data
> validation", which cannot be done in this low-level simple structure alone.
> For example, some fields may have to be 40-byte hex string and that kind of
> checks are probably done in a higher level. Since the higher level code will
> do "data validation" anyway, it makes more sense to just not do "data
> validation" here so they happen in a single place. The higher level code
> could also print more user-friendly error messages.
Yes, fully functional data validation is not possible on this level, but 
we can make some people's lives easier by outsourcing requirements checking.
>
> If you agree these are reasonable, but have a long stack which is hard to
> change, I could help fix them later.
I will change and re-send withoug required-related functionality as this 
patch is taking forever to get in.
>
> Excerpts from Kostia Balytskyi's message of 2017-02-03 02:47:26 -0800:
>> # HG changeset patch
>> # User Kostia Balytskyi <ikostia at fb.com>
>> # Date 1484824655 28800
>> #      Thu Jan 19 03:17:35 2017 -0800
>> # Node ID 19a449c91ef14e691cf1347748473e0094fedc86
>> # Parent  9f264adbe75bfae8551dc0e6e0fce8d43fc7b43a
>> scmutil: add a simple key-value file helper
>>
>> The purpose of the added class is to serve purposes like save files of shelve
>> or state files of shelve, rebase and histedit. Keys of these files can be
>> alphanumeric and start with letters, while values must not contain newlines.
>> Keys which start with an uppercase letter are required, while other keys
>> are optional.
>>
>> In light of Mercurial's reluctancy to use Python's json module, this tries
>> to provide a reasonable alternative for a non-nested named data.
>> Comparing to current approach of storing state in plain text files, where
>> semantic meaning of lines of text is only determined by their oreder,
>> simple key-value file allows for reordering lines and thus helps handle
>> optional values.
>>
>> Initial use-case I see for this is obs-shelve's shelve files. Later we
>> can possibly migrate state files to this approach.
>>
>> The test is in a new file beause I did not figure out where to put it
>> within existing test suite. If you give me a better idea, I will gladly
>> follow it.
>>
>> diff --git a/mercurial/error.py b/mercurial/error.py
>> --- a/mercurial/error.py
>> +++ b/mercurial/error.py
>> @@ -246,3 +246,6 @@ class UnsupportedBundleSpecification(Exc
>>   
>>   class CorruptedState(Exception):
>>       """error raised when a command is not able to read its state from file"""
>> +
>> +class MissingRequiredKey(Exception):
>> +    """error raised when simple key-value file misses a required key"""
>> diff --git a/mercurial/scmutil.py b/mercurial/scmutil.py
>> --- a/mercurial/scmutil.py
>> +++ b/mercurial/scmutil.py
>> @@ -1571,3 +1571,51 @@ class checkambigatclosing(closewrapbase)
>>       def close(self):
>>           self._origfh.close()
>>           self._checkambig()
>> +
>> +class simplekeyvaluefile(object):
>> +    """A simple file with key=value lines
>> +
>> +    Keys must be alphanumerics and start with a letter, values must not
>> +    contain '\n' characters"""
>> +
>> +    # if KEYS is non-empty, read values are validated against it:
>> +    # each key is a tuple (keyname, required)
>> +    KEYS = []
>> +
>> +    def __init__(self, vfs, path):
>> +        self.vfs = vfs
>> +        self.path = path
>> +
>> +    def validate(self, d):
>> +        for key, req in self.KEYS:
>> +            if req and key not in d:
>> +                e = "missing a required key: '%s'" % key
>> +                raise error.MissingRequiredKey(e)
>> +
>> +    def read(self):
>> +        lines = self.vfs.readlines(self.path)
>> +        try:
>> +            d = dict(line[:-1].split('=', 1) for line in lines if line)
>> +        except ValueError as e:
>> +            raise error.CorruptedState(str(e))
>> +        self.validate(d)
>> +        return d
>> +
>> +    def write(self, data):
>> +        """Write key=>value mapping to a file
>> +        data is a dict. Keys must be alphanumerical and start with a letter.
>> +        Values must not contain newline characters."""
>> +        lines = []
>> +        for k, v in data.items():
>> +            if not k[0].isalpha():
>> +                e = "keys must start with a letter in a key-value file"
>> +                raise error.ProgrammingError(e)
>> +            if not k.isalnum():
>> +                e = "invalid key name in a simple key-value file"
>> +                raise error.ProgrammingError(e)
>> +            if '\n' in v:
>> +                e = "invalid value in a simple key-value file"
>> +                raise error.ProgrammingError(e)
>> +            lines.append("%s=%s\n" % (k, v))
>> +        with self.vfs(self.path, mode='wb', atomictemp=True) as fp:
>> +            fp.write(''.join(lines))
>> diff --git a/tests/test-simplekeyvaluefile.py b/tests/test-simplekeyvaluefile.py
>> new file mode 100644
>> --- /dev/null
>> +++ b/tests/test-simplekeyvaluefile.py
>> @@ -0,0 +1,87 @@
>> +from __future__ import absolute_import
>> +
>> +import unittest
>> +import silenttestrunner
>> +
>> +from mercurial import (
>> +    error,
>> +    scmutil,
>> +)
>> +
>> +contents = {}
>> +
>> +class fileobj(object):
>> +    def __init__(self, name):
>> +        self.name = name
>> +
>> +    def __enter__(self):
>> +        return self
>> +
>> +    def __exit__(self, *args, **kwargs):
>> +        pass
>> +
>> +    def write(self, text):
>> +        contents[self.name] = text
>> +
>> +    def read(self):
>> +        return contents[self.name]
>> +
>> +class mockvfs(object):
>> +    def read(self, path):
>> +        return fileobj(path).read()
>> +
>> +    def readlines(self, path):
>> +        return fileobj(path).read().split('\n')
>> +
>> +    def __call__(self, path, mode, atomictemp):
>> +        return fileobj(path)
>> +
>> +class testsimplekeyvaluefile(unittest.TestCase):
>> +    def setUp(self):
>> +        self.vfs = mockvfs()
>> +
>> +    def testbasicwriting(self):
>> +        d = {'key1': 'value1', 'Key2': 'value2'}
>> +        scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
>> +        self.assertEqual(sorted(self.vfs.read('kvfile').split('\n')),
>> +                         ['', 'Key2=value2', 'key1=value1'])
>> +
>> +    def testinvalidkeys(self):
>> +        d = {'0key1': 'value1', 'Key2': 'value2'}
>> +        with self.assertRaisesRegexp(error.ProgrammingError,
>> +                                     "keys must start with a letter.*"):
>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
>> +        d = {'key1@': 'value1', 'Key2': 'value2'}
>> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid key.*"):
>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
>> +
>> +    def testinvalidvalues(self):
>> +        d = {'key1': 'value1', 'Key2': 'value2\n'}
>> +        with self.assertRaisesRegexp(error.ProgrammingError, "invalid val.*"):
>> +            scmutil.simplekeyvaluefile(self.vfs, 'kvfile').write(d)
>> +
>> +    def testrequiredkeys(self):
>> +        d = {'key1': 'value1', 'Key2': 'value2'}
>> +        scmutil.simplekeyvaluefile(self.vfs, 'allkeyshere').write(d)
>> +
>> +        class kvf(scmutil.simplekeyvaluefile):
>> +            KEYS = [('key3', False), ('Key2', True)]
>> +        self.assertEqual(sorted(kvf(self.vfs, 'allkeyshere').read().items()),
>> +                         [('Key2', 'value'), ('key1', 'value')])
>> +
>> +        d = {'key1': 'value1', 'Key3': 'value2'}
>> +        scmutil.simplekeyvaluefile(self.vfs, 'missingkeys').write(d)
>> +
>> +        class kvf(scmutil.simplekeyvaluefile):
>> +            KEYS = [('key3', False), ('Key2', True)]
>> +        with self.assertRaisesRegexp(error.MissingRequiredKey, "missing a.*"):
>> +            kvf(self.vfs, 'missingkeys').read()
>> +
>> +    def testcorruptedfile(self):
>> +        contents['badfile'] = 'ababagalamaga\n'
>> +        with self.assertRaisesRegexp(error.CorruptedState,
>> +                                     "dictionary.*element.*"):
>> +            scmutil.simplekeyvaluefile(self.vfs, 'badfile').read()
>> +
>> +if __name__ == "__main__":
>> +    silenttestrunner.main(__name__)
> _______________________________________________
> Mercurial-devel mailing list
> Mercurial-devel at mercurial-scm.org
> https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel



More information about the Mercurial-devel mailing list