D3175: commands: implement `export --format=X` with support for CBOR

indygreg (Gregory Szorc) phabricator at mercurial-scm.org
Fri Apr 6 22:11:26 UTC 2018


indygreg created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  What's better than having to parse patch files? Not having to parse
  them.
  
  The current text-based patch file format used by `hg export` is good
  for humans to exchange. But for machines, it is better to use a
  data format that is more structured.
  
  We recently introduced support for CBOR. CBOR is a great, binary
  preserving data format (unlike JSON), and I think we should use it
  heavily for data interchange.
  
  This commit teaches `hg export` to write data to CBOR. It adds a
  --format argument. Hopefully this is the most controversial part of
  this patch. I thought about using --template/-T. We already have
  -Tjson. -Tcbor seems like a reasonable feature addition. That might
  be the right way forward. I'm not sure. I didn't want to scope bloat
  the patch. At least not initially. (The code for exporting a patch
  is a bit wonky and I'm not very comfortable with the templating
  layer.)

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D3175

AFFECTED FILES
  mercurial/cmdutil.py
  mercurial/commands.py
  mercurial/utils/stringutil.py
  tests/test-completion.t
  tests/test-export.t

CHANGE DETAILS

diff --git a/tests/test-export.t b/tests/test-export.t
--- a/tests/test-export.t
+++ b/tests/test-export.t
@@ -258,6 +258,51 @@
   abort: export requires at least one changeset
   [255]
 
+--format=hg works and is the default
+
+  $ hg export > expected
+  $ hg export --format hg > got
+  $ diff got expected
+
+--format=cbor emits a CBOR patch
+
+  $ hg export --format cbor > cbor
+  $ f --sha256 --hexdump cbor
+  cbor: sha256=825d5c5a23bb64d1aad17e39f1e151c933d519293f3d05eb591a7479453ea21a
+  0000: 9f bf 47 76 65 72 73 69 6f 6e 01 44 75 73 65 72 |..Gversion.Duser|
+  0010: 44 74 65 73 74 44 74 69 6d 65 fb 00 00 00 00 00 |DtestDtime......|
+  0020: 00 00 00 48 74 69 6d 65 7a 6f 6e 65 00 46 62 72 |...Htimezone.Fbr|
+  0030: 61 6e 63 68 47 64 65 66 61 75 6c 74 44 6e 6f 64 |anchGdefaultDnod|
+  0040: 65 54 19 7e cd 81 a5 7f 76 0b 54 f3 4a 58 81 7a |eT.~....v.T.JX.z|
+  0050: d5 b0 49 91 fa 47 47 70 61 72 65 6e 74 73 81 54 |..I..GGparents.T|
+  0060: f3 ac ba fa c1 61 ec 68 f1 59 8a f3 8f 79 4f 28 |.....a.h.Y...yO(|
+  0070: 84 7c a5 d3 4b 64 65 73 63 72 69 70 74 69 6f 6e |.|..Kdescription|
+  0080: 58 5b 20 21 22 23 24 25 26 28 2c 2d 2e 2f 30 31 |X[ !"#$%&(,-./01|
+  0090: 32 33 34 35 36 37 38 39 3a 3b 3c 3d 3e 3f 40 41 |23456789:;<=>?@A|
+  00a0: 42 43 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f 50 51 |BCDEFGHIJKLMNOPQ|
+  00b0: 52 53 54 55 56 57 58 59 5a 5b 5c 5d 5e 5f 60 61 |RSTUVWXYZ[\]^_`a|
+  00c0: 62 63 64 65 66 67 68 69 6a 6b 6c 6d 6e 6f 70 71 |bcdefghijklmnopq|
+  00d0: 72 73 74 75 76 77 78 79 7a 7b 7c 7d 7e 4b 70 61 |rstuvwxyz{|}~Kpa|
+  00e0: 74 63 68 63 68 75 6e 6b 73 9f 58 7b 64 69 66 66 |tchchunks.X{diff|
+  00f0: 20 2d 72 20 66 33 61 63 62 61 66 61 63 31 36 31 | -r f3acbafac161|
+  0100: 20 2d 72 20 31 39 37 65 63 64 38 31 61 35 37 66 | -r 197ecd81a57f|
+  0110: 20 66 6f 6f 0a 2d 2d 2d 20 61 2f 66 6f 6f 09 54 | foo.--- a/foo.T|
+  0120: 68 75 20 4a 61 6e 20 30 31 20 30 30 3a 30 30 3a |hu Jan 01 00:00:|
+  0130: 30 30 20 31 39 37 30 20 2b 30 30 30 30 0a 2b 2b |00 1970 +0000.++|
+  0140: 2b 20 62 2f 66 6f 6f 09 54 68 75 20 4a 61 6e 20 |+ b/foo.Thu Jan |
+  0150: 30 31 20 30 30 3a 30 30 3a 30 30 20 31 39 37 30 |01 00:00:00 1970|
+  0160: 20 2b 30 30 30 30 0a 58 2f 40 40 20 2d 31 30 2c | +0000.X/@@ -10,|
+  0170: 33 20 2b 31 30 2c 34 20 40 40 0a 20 66 6f 6f 2d |3 +10,4 @@. foo-|
+  0180: 39 0a 20 66 6f 6f 2d 31 30 0a 20 66 6f 6f 2d 31 |9. foo-10. foo-1|
+  0190: 31 0a 2b 6c 69 6e 65 0a ff ff ff                |1.+line....|
+
+  >>> from __future__ import print_function
+  >>> from mercurial.thirdparty import cbor
+  >>> from mercurial.utils import stringutil
+  >>> with open('cbor', 'rb') as fh:
+  ...     print(stringutil.pprint(cbor.load(fh)))
+  [{b'branch': b'default', b'description': b' !"#$%&(,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~', b'node': b'\x19~\xcd\x81\xa5\x7fv\x0bT\xf3JX\x81z\xd5\xb0I\x91\xfaG', b'parents': [b'\xf3\xac\xba\xfa\xc1a\xech\xf1Y\x8a\xf3\x8fyO(\x84|\xa5\xd3'], b'patchchunks': [b'diff -r f3acbafac161 -r 197ecd81a57f foo\n--- a/foo\tThu Jan 01 00:00:00 1970 +0000\n+++ b/foo\tThu Jan 01 00:00:00 1970 +0000\n', b'@@ -10,3 +10,4 @@\n foo-9\n foo-10\n foo-11\n+line\n'], b'time': 0.000000, b'timezone': 0, b'user': b'test', b'version': 1}]
+
 Check for color output
   $ cat <<EOF >> $HGRCPATH
   > [color]
diff --git a/tests/test-completion.t b/tests/test-completion.t
--- a/tests/test-completion.t
+++ b/tests/test-completion.t
@@ -231,7 +231,7 @@
   clone: noupdate, updaterev, rev, branch, pull, uncompressed, stream, ssh, remotecmd, insecure
   commit: addremove, close-branch, amend, secret, edit, interactive, include, exclude, message, logfile, date, user, subrepos
   diff: rev, change, text, git, binary, nodates, noprefix, show-function, reverse, ignore-all-space, ignore-space-change, ignore-blank-lines, ignore-space-at-eol, unified, stat, root, include, exclude, subrepos
-  export: output, switch-parent, rev, text, git, binary, nodates
+  export: output, switch-parent, rev, format, text, git, binary, nodates
   forget: include, exclude, dry-run
   init: ssh, remotecmd, insecure
   log: follow, follow-first, date, copies, keyword, rev, line-range, removed, only-merges, user, only-branch, branch, prune, patch, git, limit, no-merges, stat, graph, style, template, include, exclude
diff --git a/mercurial/utils/stringutil.py b/mercurial/utils/stringutil.py
--- a/mercurial/utils/stringutil.py
+++ b/mercurial/utils/stringutil.py
@@ -34,6 +34,10 @@
             '%s: %s' % (pprint(k), pprint(v)) for k, v in sorted(o.items())))
     elif isinstance(o, bool):
         return b'True' if o else b'False'
+    elif isinstance(o, int):
+        return '%d' % o
+    elif isinstance(o, float):
+        return b'%f' % o
     else:
         raise error.ProgrammingError('do not know how to format %r' % o)
 
diff --git a/mercurial/commands.py b/mercurial/commands.py
--- a/mercurial/commands.py
+++ b/mercurial/commands.py
@@ -1893,6 +1893,7 @@
      _('print output to file with formatted name'), _('FORMAT')),
     ('', 'switch-parent', None, _('diff against the second parent')),
     ('r', 'rev', [], _('revisions to export'), _('REV')),
+    ('', 'format', 'hg', _('format of written data (EXPERIMENTAL)')),
     ] + diffopts,
     _('[OPTION]... [-o OUTFILESPEC] [-r] [REV]...'), cmdtype=readonly)
 def export(ui, repo, *changesets, **opts):
@@ -1966,14 +1967,20 @@
     revs = scmutil.revrange(repo, changesets)
     if not revs:
         raise error.Abort(_("export requires at least one changeset"))
+
+    if opts['format'] not in ('hg', 'cbor'):
+        raise error.Abort(_('invalid value for --format: only "hg" and "cbor" '
+                            'are supported'))
+
     if len(revs) > 1:
         ui.note(_('exporting patches:\n'))
     else:
         ui.note(_('exporting patch:\n'))
     ui.pager('export')
     cmdutil.export(repo, revs, fntemplate=opts.get('output'),
-                 switch_parent=opts.get('switch_parent'),
-                 opts=patch.diffallopts(ui, opts))
+                   switch_parent=opts.get('switch_parent'),
+                   opts=patch.diffallopts(ui, opts),
+                   fmt=opts['format'])
 
 @command('files',
     [('r', 'rev', '', _('search the repository as it is in REV'), _('REV')),
diff --git a/mercurial/cmdutil.py b/mercurial/cmdutil.py
--- a/mercurial/cmdutil.py
+++ b/mercurial/cmdutil.py
@@ -19,7 +19,9 @@
     nullrev,
     short,
 )
-
+from .thirdparty import (
+    cbor,
+)
 from . import (
     bookmarks,
     changelog,
@@ -50,6 +52,7 @@
 )
 
 from .utils import (
+    cborutil,
     dateutil,
     stringutil,
 )
@@ -1559,8 +1562,47 @@
     for chunk, label in patch.diffui(repo, prev, node, match, opts=diffopts):
         yield chunk, label
 
+def _exportsinglecbor(repo, ctx, match, switch_parent, seqno, diffopts,
+                      encoder):
+    """Export a single patch to CBOR."""
+    node = scmutil.binnode(ctx)
+    parents = [p.node() for p in ctx.parents() if p]
+    branch = ctx.branch()
+    if switch_parent:
+        parents.reverse()
+
+    if parents:
+        prev = parents[0]
+    else:
+        prev = nullid
+
+    with cborutil.streammap(encoder) as writeitem:
+        writeitem(b'version', 1)
+        writeitem(b'user', encoding.fromlocal(ctx.user()))
+        writeitem(b'time', ctx.date()[0])
+        writeitem(b'timezone', ctx.date()[1])
+        writeitem(b'branch', branch)
+        writeitem(b'node', node)
+        writeitem(b'parents', parents)
+
+        extras = []
+        for headerid in extraexport:
+            header = extraexportmap[headerid](seqno, ctx)
+            if header:
+                extras.append(header)
+
+        if extras:
+            writeitem(b'extras', extras)
+
+        writeitem(b'description',
+                  encoding.fromlocal(ctx.description().rstrip()))
+
+        encoder.encode(b'patchchunks')
+        chunks = patch.diff(repo, prev, node, match, opts=diffopts)
+        cborutil.streamarrayitems(encoder, chunks)
+
 def export(repo, revs, fntemplate='hg-%h.patch', fp=None, switch_parent=False,
-           opts=None, match=None):
+           opts=None, match=None, fmt='hg'):
     '''export changesets as hg patches
 
     Args:
@@ -1572,6 +1614,8 @@
                      Default is false, which always shows diff against p1.
       opts: diff options to use for generating the patch.
       match: If specified, only export changes to files matching this matcher.
+      fmt: Format to export the patch to. ``hg`` is the default Mercurial
+           patch format. ``cbor`` encodes to CBOR.
 
     Returns:
       Nothing.
@@ -1613,17 +1657,35 @@
             else:
                 fp.write(s)
 
+    if fmt == 'cbor' and allfp:
+        allcbor = cbor.CBOREncoder(allfp)
+        cborutil.beginindefinitearray(allcbor)
+    else:
+        allcbor = None
+
     for seqno, rev in enumerate(revs, 1):
         ctx = repo[rev]
 
         if not allfp:
             assert fntemplate
-
             with makefileobj(ctx, fntemplate, mode='wb', modemap=filemode,
                              total=total, seqno=seqno, revwidth=revwidth) as fp:
-                doexport(ctx, fp, fp.name)
+                if fmt == 'cbor':
+                    encoder = cbor.CBOREncoder(fp)
+                    with cborutil.streamarray(encoder):
+                        _exportsinglecbor(repo, ctx, match, switch_parent,
+                                          seqno, opts, encoder)
+                else:
+                    doexport(ctx, fp, fp.name)
         else:
-            doexport(ctx, allfp, dest)
+            if fmt == 'cbor':
+                _exportsinglecbor(repo, ctx, match, switch_parent, seqno, opts,
+                                  allcbor)
+            else:
+                doexport(ctx, allfp, dest)
+
+    if allcbor:
+        cborutil.endindefinite(allcbor)
 
 def showmarker(fm, marker, index=None):
     """utility function to display obsolescence marker in a readable way



To: indygreg, #hg-reviewers
Cc: mercurial-devel


More information about the Mercurial-devel mailing list