D1974: narrow: import experimental extension from narrowhg revision cb51d673e9c5

durin42 (Augie Fackler) phabricator at mercurial-scm.org
Mon Feb 12 15:12:32 EST 2018


durin42 added a comment.


  Okay, I've pushed more followups. The bulk of the TODOs are recorded in a file introduced in https://phab.mercurial-scm.org/D2196, but there are also some added inline in the code.
  
  Let me know how I can help - I'll make time to video conference this week if it'd facilitate getting this landed with minimal additional lost sanity.
  
  Thanks!

INLINE COMMENTS

> indygreg wrote in narrowbundle2.py:138-151
> I'm not an expert on the manifest APIs. There //may// be a more optimal way to implement this...

I wrote lazymanifest, and this was the best adgar and I could come up with...

> indygreg wrote in narrowrepo.py:85-87
> I agree with the inline todo :)

Okay if we come back to that, since it's already logged as a TODO?

> indygreg wrote in narrowrevlog.py:1
> I'd like to see this file's content moved into core sooner rather than later. There are a lot of implications for storage that need to be in people's minds when they are touching code in core.

Yes, noted this in a TODO.rst.

> indygreg wrote in narrowrevlog.py:90-94
> Consider doing that and changing this to raise if called.

Added to a TODO file.

> indygreg wrote in narrowrevlog.py:105
> Nit: `dir` is a builtin. If this matches core, fine. But I'd prefer avoiding the name collision.

Ouch, good catch. This does match core, so I fixed both in my followup.

> indygreg wrote in narrowrevlog.py:114-115
> In-place mutation of low-level types. Yummy.
> 
> Please add a todo for the post-landing list to construct the proper type from the beginning. This likely requires some API changes in core. I'm thinking some function should return the type to use for new revlogs. Or we should spawn this type and call super.__init__ from its __init__.

Yeah, this is one of several places where I want to just hoist internals changes to core, with the only customer being narrow (for now, at least). It's super gross the way it currently is.

> indygreg wrote in narrowrevlog.py:128-131
> I think this wants a comment explaining why we lie about rename metadata when the destination is outside of the narrow spec. Also, we'll want to flag this for further review, since there could be some interesting implications to lying here.

Yeah, all I remember about this (https://bitbucket.org/Google/narrowhg/commits/3cd72b1a1b41c9e46f12eba78c253da169277374) is that git-diffs break in the case of a rename from outside->inside. There are almost certainly other problems lurking in the weeds here, but to my knowledge we've not seen them at Google...

> indygreg wrote in narrowrevlog.py:134-139
> This is making assumptions about code that hasn't landed yet. Not sure if we should replace this with a TODO or what.

Added an explicit TODO. We can make this a TON cleaner when narrowhg can be formally aware of remotefilelog, this was mostly a kludge to work around them being in disjoint repos and not wanting to assume we could find remotefilelog cleanly.

> indygreg wrote in narrowspec.py:31-42
> Oh, hey, this looks just like sparse profiles! I sense some code conversion in our future...

Yes, the goal is that the two extensions will share a fair amount of logic. :)

> indygreg wrote in narrowspec.py:49-54
> Playing devil's advocate, do we really need two formats doing the same thing?

That's a great question. I don't really know the answer offhand. Added a TODO...

> indygreg wrote in narrowspec.py:83
> Can we use `str.count()` to avoid creating objects for each line?

Gave it a shot, and things broke all over the place. :(

> indygreg wrote in narrowspec.py:93-95
> What about `\` as a path separator? Should we also ban that? I assume we're interpreting the value here and paths as bytes, so `\` will never be used as an escape character?

@martinvonz please correct me if this is wrong.

The paths are stored in the narrowspec with canonical (that is /) path separators, just like the rest of the internals. We're reading things as bytes, so yes, \ should never occur.

> indygreg wrote in narrowspec.py:110-115
> It is better to build a list of lines and `'\n'.join()` them.

It shouldn't be: this particular "str with refcount of 1" behavior was optimized a long time ago in cpython to not be a horrendous stack of copies.

If it shows up in a profile, we can obviously fix it.

> indygreg wrote in narrowspec.py:133-138
> This reinvents `repo.vfs.tryread()`.

Yep, but it also does some extra cache invalidation, so I don't think I can fuse them.

> indygreg wrote in narrowspec.py:146-150
> This is called from `narrowrepo.setnarrowparts()` and neither of them cares about locking. Somewhere we should ensure we hold the wlock.

Added to my TODO doc because I don't want reasoning about locking to block getting out of this pile-of-comments hell.

> indygreg wrote in narrowwirepeer.py:45-48
> This will blindly add `includepats` and `excludepats` as wire protocol arguments to the `unbundle` command. In the Python server, wire protocol arguments have to be defined as part of the command handler or the server will refuse to serve the request.
> 
> So, this code is missing the server-side declaration of these arguments.
> 
> And, the client/peer code here is buggy for not conditionally adding these arguments based on the presence of a server capability advertising support for receiving these arguments.

Yeah, this is a mess. I've added a TODO, as I'm not entirely sure how to work around this for now (calling capabilities in here seems...fraught.)

> indygreg wrote in test-narrow-archive.t:30
> I'm not sure if this invocation of `tar` is portable. I'm sure others will fix things later if it breaks tests on other platforms.

Noted. AFAIK this should be portable at least to bsdtar, but probably solaris or someone will eventually complain?

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D1974

To: durin42, #hg-reviewers
Cc: idlsoft, martinvonz, indygreg, mercurial-devel


More information about the Mercurial-devel mailing list