Interest in integrating hg-git into Mercurial

Augie Fackler raf at durin42.com
Fri Aug 16 16:58:56 EDT 2019



> On Aug 1, 2019, at 22:30, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> 
> On Thu, Aug 1, 2019 at 10:38 AM Augie Fackler <raf at durin42.com> wrote:
> 
> 
> > On Aug 1, 2019, at 13:01, Gregory Szorc <gregory.szorc at gmail.com> wrote:
> > 
> > Is there any interest in integrating hg-git (or hg-git functionality) into the Mercurial distribution as an officially supported extension?
> > 
> > Given the popularity of Git and the difficulty of installing semi-complex Python software like hg-git, I was thinking it would be beneficial to end-users for Mercurial to support interacting with Git repositories out-of-the-box with as little set-up pain as possible. hg-git feels like the path of least resistance towards attaining that goal. (I would eventually like to see support for Mercurial opening Git repositories natively. While I think this is technically viable, it is probably a year or two away, as it needs significant work to shore up Mercurial's storage interfaces and internal code contracts to support such a significant invariant as interfacing with Git repositories.)
> > 
> > Vendoring hg-git would likely also entail vendoring its dependencies: urllib3, certifi, and dulwich. Vendoring urllib3 and certifi is probably not the worst thing in the world, as it would give us an excuse to refactor Mercurial's HTTP internals to move off the ugly hacks we employ to use Python's standard library.
> > 
> > I'm not promising I will follow through and do this work. At this time I'm mostly interested in taking a quick pulse to see if there is any interest in doing it. If there is general support, I may follow through :)
> 
> I'm semi-enthusiastic, but I'd rather we took a storage-abstraction approach than the hg-git "convert the repo" approach. With our newfound storage abstractions, I think it's reasonable to index a git repository and present its data in an hg-friendly format. A few places in hg would likely need patching to handle more than two parents in a merge, but other than that I think things basically work.
> 
> I would prefer we interface with Git repositories using the storage abstraction and not have to maintain a shadow Mercurial repository as well. And I do think that is technically viable.

For your consideration, a (crude) prototype: phab.mercurial-scm.org/D6734

It's a hack, but that's under a week of hacking, even when I try to account for the reuse of code I had lying around. You're right that octopus merges will be a pain, but it looks like rationalizing the dirstate interface is the real big hassle at the moment. I suspect we could dummy up octopus merges with some weird hash tricks...
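
To make the "hash tricks" idea concrete, here is one possible reading of it, and not necessarily what the prototype does or will do: express an N-parent Git merge as a chain of two-parent merges, giving each made-up intermediate merge a deterministic id derived from its parents. Everything named below (`synthetic_node`, `split_octopus`, the `octopus-shim:` prefix) is hypothetical and exists only for illustration.

```python
import hashlib
from typing import List, Tuple

# 20 zero bytes, used here as the "no parent" placeholder.
NULL_ID = b"\0" * 20


def synthetic_node(p1: bytes, p2: bytes) -> bytes:
    """Derive a deterministic 20-byte id for a made-up intermediate merge.

    Hypothetical scheme: the prefix only keeps these ids from being
    mistaken for hashes of real commit data.
    """
    return hashlib.sha1(b"octopus-shim:" + p1 + p2).digest()


def split_octopus(node: bytes, parents: List[bytes]) -> List[Tuple[bytes, bytes, bytes]]:
    """Return (node, p1, p2) triples covering an N-parent merge.

    Commits with at most two parents come back unchanged (padded with
    NULL_ID). Larger merges become a left-leaning chain whose last
    entry keeps the real node id.
    """
    if len(parents) <= 2:
        padded = list(parents) + [NULL_ID] * (2 - len(parents))
        return [(node, padded[0], padded[1])]

    chain = []
    left = parents[0]
    # Fold in every parent except the last through synthetic merges.
    for right in parents[1:-1]:
        mid = synthetic_node(left, right)
        chain.append((mid, left, right))
        left = mid
    # The real commit merges the accumulated chain with the last parent.
    chain.append((node, left, parents[-1]))
    return chain
```

Because the intermediate ids depend only on the parents, re-indexing the same Git history would produce the same synthetic nodes. Whether code that walks ancestry would tolerate those extra nodes is an open question this sketch does not answer.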

> But I think there is a long, long tail of issues that will prevent that from working as robustly as we all want it to - and as many end-users will want it to. The handling of octopus merges alone will be a giant PITA because there is code all over Mercurial that assumes exactly 2 parents. Plus, the storage abstraction work stopped short of the changelog and locks/transactions, which I think will be the hardest parts to abstract. I think there's a lot of work there and we shouldn't block Git repo integration on solving that general problem.
> 
> If we were to move forward with integrating hg-git, I would do so by making the hg<->git repo coupling stronger. For example, I would install hooks into the Git repository that prevented mutations unless Mercurial were driving them (e.g. by looking for the presence of an environment variable). And I would define repo requirements that denoted special behavior in the presence of an associated Git repository. This would be designed such that in a future world one could run `hg debugupgraderepo` and replace the Mercurial full conversion repo with storage abstractions that allow us to write directly into the Git repo.
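
The hook idea in the quoted paragraph could look roughly like the sketch below: a check that a Git hook (for example `pre-commit`) would run, refusing the operation unless an environment variable shows that Mercurial is driving. The variable name `HG_GIT_DRIVER` and both function names are assumptions made up for this example; nothing in hg-git or Mercurial defines them.

```python
import os
import sys
from typing import Mapping

# Hypothetical marker that Mercurial would export before invoking Git.
GUARD_VAR = "HG_GIT_DRIVER"


def mutation_allowed(environ: Mapping[str, str]) -> bool:
    """Return True only when the guard variable is set to "1"."""
    return environ.get(GUARD_VAR) == "1"


def main(environ: Mapping[str, str] = os.environ) -> int:
    """Return the exit status a hook script would use.

    Git aborts the operation when a hook such as pre-commit exits
    with a non-zero status.
    """
    if mutation_allowed(environ):
        return 0
    sys.stderr.write(
        "this Git repository is managed by Mercurial; "
        "make changes through hg instead\n"
    )
    return 1
```

A real hook script would end with `sys.exit(main())`. Note that client-side hooks are advisory (`git commit --no-verify` skips `pre-commit`), so a guard like this catches accidents rather than enforcing anything.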

My worry is that hg-git is a tortured codebase, and I'm not sure the effort necessary to bring it up to sane standards for core is well-spent when it feels like we're _really close_ to having small-to-medium repositories working.

> Note that I'm also willing to try and push this forward, and have some experimental hacks lying around that should make a proof of concept fairly easy. The reaction at the last sprint took away some of my enthusiasm, but if people are receptive to the idea I'll carve out some time in August...
> 
> (I think your "a year or two" estimate is pessimistic on this front: I think it could be done very quickly by indexing the git repo at load time, and the indexing can be made reasonably quick, especially given that I had this _working_ in the 3.x era, albeit slowly.)
> 
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel at mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel
> 


