rust hg status

Augie Fackler raf at durin42.com
Wed Feb 20 19:51:51 EST 2019



> On Feb 19, 2019, at 14:43, Valentin Gatien-Baron <vgatien-baron at janestreet.com> wrote:
> 
> 
> 
> On Tue, Feb 19, 2019 at 10:46 AM Augie Fackler <raf at durin42.com> wrote:
> On Fri, Feb 15, 2019 at 02:39:44PM -0500, Valentin Gatien-Baron wrote:
> > Hello,
> >
> > I wrote a fraction of hg status in rust, just the minimum needed to
> > compare current revision and working copy with few of the flags and
> > config settings supported. As you can imagine, the goal was better
> > performance.  Before trying to upstream bits of this, I figured I'd
> > check there's interest for this change in particular, or this kind
> > of change in general (I suspect rust would bring significant
> > improvements to hg cat or hg files). The rest of this mail is more
> > details.
> 
> This sounds _very_ promising and I'd love to see what you've got!
> 
> Cool!
> Rereading my mail, I may not have made it clear that what I have, and what I timed below, is a fully rust exe that implements a fraction of hg status, not a change to python hg that uses big chunks of rust some fraction of the time. It seems that upstreaming would take the latter approach, though, at least to start with.

I'm still interested in both - we've talked on and off about a small native-binary helper for things like printing information that people like in their shell prompt.

> >
> > While the implementation doesn't handle every uncommon situation right
> > and could use some serious cleanup, it's an interesting performance
> > improvement. In a repository with 100k tracked files and 500k ignored
> > files, in the best case and measuring on a good machine:
> >
> > - hg-rs st takes ~50ms
> > - hg-rs st -mard takes ~14ms
> > - hg-rs st -u takes ~39ms
> >
> > By contrast, hg+chg+fsmonitor's best case is 110ms regardless of
> > flags. Without fsmonitor, we're talking about 2.4s for hg st or hg st
> > -u, and 400ms for hg st -mard. As a baseline, hg st --syntax-error
> > takes 12ms.
> 
> Fascinating! Are you using re2 or Python's built-in re?
> 
> Definitely using re2. If I disable re2, the full status goes from 2.4s to 5.7s. I didn't say how the rust implementation differs from the python version, but using rust+re2 alone is not enough to get to 40ms for finding unknown files. There are optimizations to the hgignore handling (mostly special treatment of globs that can match exactly one file), parallelism, and no pointless lstat'ing of untracked files on filesystems that provide the filetype in readdir. On top of that, there's a cache that holds a list of "this directory is known to have no untracked files, assuming it has this timestamp, the hgignore is such-and-such and the dirstate is such-and-such". It usually makes it possible to skip listing untracked files in most directories, and thus to skip applying the hgignore to those files.
> Even when the cache fails to help, like when the hgignore changes, rust status takes 300ms (and it's quite plausible there's room for improvement here; I stopped optimizing when it felt like a good enough replacement).

Wow, even more impressive.
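For anyone following along, the "don't lstat when readdir already gives the file type" point can be sketched with the Rust standard library alone. This is only an illustration, not Valentin's code: `list_dir` is a hypothetical helper name, and it relies on `DirEntry::file_type`, which the std docs describe as free on most Unix platforms (some may still fall back to an lstat-like call).

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: split the entries of `dir` into
/// (subdirectories, everything else) without a separate stat per entry.
pub fn list_dir(dir: &Path) -> io::Result<(Vec<PathBuf>, Vec<PathBuf>)> {
    let mut dirs = Vec::new();
    let mut others = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // file_type() does not follow symlinks. Where the platform reports
        // the type in the directory entry, no extra system call is needed.
        if entry.file_type()?.is_dir() {
            dirs.push(entry.path());
        } else {
            others.push(entry.path());
        }
    }
    Ok((dirs, others))
}
```

A status walk would recurse into the first list and only run ignore matching on the second, so untracked files never need their own lstat.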

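The cache idea, as I understand it from the description, could look roughly like the sketch below. Everything here is an assumption made for illustration: the type names, the use of plain `u64` fingerprints for the hgignore and the dirstate, and keeping the cache in an in-memory `HashMap` are not taken from the actual implementation.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Everything the cached answer depends on (hypothetical layout).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CacheKey {
    /// mtime of the directory when it was last listed.
    pub dir_mtime: SystemTime,
    /// Some fingerprint of the hgignore contents, e.g. a hash.
    pub ignore_fingerprint: u64,
    /// Some fingerprint of the dirstate.
    pub dirstate_fingerprint: u64,
}

/// Remembers directories that were found to contain no untracked files.
#[derive(Default)]
pub struct CleanDirCache {
    entries: HashMap<PathBuf, CacheKey>,
}

impl CleanDirCache {
    /// Record that `dir` had no untracked files under the given key.
    pub fn mark_clean(&mut self, dir: &Path, key: CacheKey) {
        self.entries.insert(dir.to_path_buf(), key);
    }

    /// True if listing `dir` can be skipped: it was clean before and
    /// nothing the answer depends on has changed since.
    pub fn can_skip(&self, dir: &Path, current: &CacheKey) -> bool {
        self.entries.get(dir) == Some(current)
    }
}
```

With something like this, a changed hgignore changes every key at once, which would match the "cache fails to help" case mentioned above.
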
> >
> > A 2x ratio compared with fsmonitor+chg is nice, but while neither
> > best case is what you get all the time, fsmonitor degrades pretty
> > badly, oftentimes in hard-to-understand ways, making for an
> > unpredictable experience that is frequently bad.
> > Say you change the hgignore: the rust version will take 300ms, the
> > fsmonitor version will take 4.4s (I think 2s timeout + 2.4s regular
> > status).
> > Say you remove a directory at the root of the repository: 50ms rust
> > vs 4.4s fsmonitor.
> > Say you haven't used a particular share in some time: you may well see
> > 1s rust vs 4.4s fsmonitor.
> >
> > So I think there's a lot of value in making status without fsmonitor
> > go much faster:
> > - it significantly increases the scale at which fsmonitor is needed
> > - it improves the bad cases of fsmonitor (or even the fast path, depending
> > on how things are made to work together)
> >
> > Regards,
> >
> > Valentin Gatien-Baron
> 
> > _______________________________________________
> > Mercurial-devel mailing list
> > Mercurial-devel at mercurial-scm.org
> > https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel