Development of a performance tracking tool for Mercurial
Matt Mackall
mpm at selenic.com
Tue Apr 12 19:18:52 EDT 2016
On Tue, 2016-04-12 at 12:03 +0200, Philippe Pepiot wrote:
> Hello,
>
> I published a new demo with parametrized benchmarks:
> https://hg.logilab.org/review/hgperf/raw-file/685dfc2bbe87/html/index.html
This regression looks interesting:
https://hg.logilab.org/review/hgperf/raw-file/685dfc2bbe87/html/index.html#others.time_tags?branch=default&x=28265&idx=2
Am I right in thinking this is the time it takes to run "hg tags" against the
Mozilla repo?
> The code used to run benchmarks was:
> https://hg.logilab.org/review/hgperf/file/da745dae4dd1 (See README.rst)
>
> All benchmarks were run against three reference repositories (hg, pypy
> and mozilla-central), and the revsets are parametrized with variants
> (sort(), first(), last(), etc.).
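[For readers unfamiliar with ASV's parametrized benchmarks, a minimal sketch of such a setup might look like the class below. The repository layout, paths and variant list are hypothetical placeholders, not the actual hgperf configuration linked above.]

```python
# Hypothetical sketch of an ASV parametrized benchmark timing a revset
# across several reference repositories and variants. ASV runs the
# benchmark once per combination of the entries in `params`.
import subprocess


class TimeRevset:
    params = [
        ["hg", "pypy", "mozilla-central"],          # reference repository
        ["", "sort(%s)", "first(%s)", "last(%s)"],  # revset variant wrapper
    ]
    param_names = ["repo", "variant"]

    def setup(self, repo, variant):
        base = "roots((0::) - (0::tip))"
        # Wrap the base revset in the variant, if any.
        self.revset = (variant % base) if variant else base
        self.repo_path = "/benchmarks/repos/%s" % repo  # hypothetical layout

    def time_revset(self, repo, variant):
        # Shell out to hg so the whole command is measured, in the same
        # spirit as contrib/perf.py-style benchmarks.
        subprocess.check_call(
            ["hg", "-R", self.repo_path, "log", "-q", "-r", self.revset],
            stdout=subprocess.DEVNULL,
        )
```

ASV would then render one curve per (repo, variant) pair on the same graph, which is the behaviour shown in the demo.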
>
> Our remarks after analyzing this demo:
>
> - As expected, having multiple reference repositories gives more
> information about regressions and improvements (for example, the
> improvement from d2ac8b57a for the revset last(roots((0::) - (0::tip)))
> is only visible on the mozilla repo).
> - Colors that change when (de)selecting parameters are very
> disturbing; we need to fix that.
> - Maybe we could select log scale by default?
> - Handle all parameters in the URL to make them shareable.
>
> After seeing this demo, is having revset variants on the same page
> relevant for you?
>
> Now I have a question about writing and maintaining benchmark code, as
> we have multiple choices here:
>
> 1) Use the Mercurial internal API. Benefits: unlimited possibilities
> without modifying Mercurial, we can write backward compatible
> benchmarks with some 'if' statements and benchmark older versions, and
> we profit from all ASV features (profiling, memory benchmarks, etc.).
> Drawbacks: duplicates code with contrib/perf.py, will break on internal
> API changes, and needs more maintenance and more code to write and keep
> backward compatible benchmarks.
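[To make the backward-compatibility part of option 1 concrete: the 'if' statements would typically branch on the benchmarked Mercurial's version. A minimal sketch of such a guard follows; the helper name and the version thresholds are hypothetical.]

```python
# Hypothetical sketch of a version guard for option 1: parse the
# benchmarked Mercurial's version string and branch on it.
def version_tuple(v):
    """Turn a version string like '3.7.3' or '3.8-rc' into a tuple of
    integers usable in comparisons, ignoring non-numeric suffixes."""
    parts = []
    for item in v.split('.'):
        digits = ''
        for ch in item:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Usage inside a benchmark (illustrative only):
#     if version_tuple(hgversion) >= (3, 7):
#         ...call the newer internal API...
#     else:
#         ...fall back to the older code path...
```

Each internal API change Mercurial makes would add one more branch like this, which is exactly the maintenance cost mentioned above.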
>
> 2) Use the contrib/perf.py extension from the benchmarked version of
> Mercurial. Benefits: de facto backward compatible. Drawbacks: limited
> to what the tool can do in previous versions.
>
> 3) Use the contrib/perf.py extension from the latest version of
> Mercurial. Benefits: no duplicate code, easier maintenance, and new
> benchmarks benefit both tools. Drawbacks: not backward compatible for
> now (it only works for versions >= 3.7). We could also implement some
> glue code, either in the tracking tool or in contrib/perf.py, to list
> the available benchmarks and their parameters.
>
> At this stage of the project my advice is to use 1), but we could also
> have a mix of 1) and 3). It depends on how fast the internal API
> changes and on your short/mid/long term objectives for the level of
> integration of the tracking tool.
>
> What do you think ?
>
> On 04/04/2016 01:41 PM, Philippe Pepiot wrote:
> >
> > Hello,
> >
> > We (people at Logilab and Pierre-Yves) had a discussion last Friday
> > about the performance tracking tool; here is the summary:
> >
> > - The choice has been made to use ASV http://asv.readthedocs.org/ as
> > it appears to us to be the most complete tool. ASV will be enhanced
> > to fit our needs: at least fixing hg branch handling, using revision
> > instead of date for the X axis, adding a notification system, and
> > building a better home page.
> > - We will setup a new buildbot job in the existing mercurial
> > infrastructure and provide an online version of the performance
> > tracking tool that is continuously updated when changes are pushed in
> > the mercurial repository.
> > - We discussed parametrized benchmarks that could be displayed on
> > the same graph (multiple reference repositories and revset variants).
> > ASV has this feature
> > (http://asv.readthedocs.org/en/latest/writing_benchmarks.html#parameterized-benchmarks),
> > and we will experiment with it.
> > - We also discussed tracking improvements: a change can have a
> > positive or negative impact on multiple benchmarks (especially on
> > revset benchmarks), and having a global view of this information
> > could be a good feature.
> > - We planned a possible sprint on the topic in May 2016, either in
> > Paris or London.
> > - The wiki page
> > https://www.mercurial-scm.org/wiki/PerformanceTrackingSuitePlan needs
> > to be updated and completed to reflect the current state of the topic.
> >
> >
> > On 03/31/2016 10:24 AM, Philippe Pepiot wrote:
> > >
> > > Hello,
> > >
> > > Besides my replies below, a new demo of ASV with more bench values
> > > is available at
> > > https://hg.logilab.org/review/hgperf/raw-file/454c2bd71fa4/index.html#/regressions
> > > (this was tested against the pypy repository located at
> > > https://bitbucket.org/pypy/pypy)
> > >
> > > The results database can be seen in
> > > https://hg.logilab.org/review/hgperf/file/454c2bd71fa4/results
> > >
> > > On 03/30/2016 12:21 AM, Pierre-Yves David wrote:
> > > >
> > > > >
> > > > > - How do we manage the non-linear structure of a Mercurial history?
> > > > That's a fun question. The Mercurial repository is mostly linear
> > > > as long as only one branch is concerned. However:
> > > >
> > > > - We don't (and have no reason to) enforce it,
> > > > - the big picture with multiple branches is still non-linear.
> > > >
> > > The solution proposed in ASV is to have a graph per branch and to
> > > only follow the first parent of merges (to avoid unwanted ups and
> > > downs that disturb regression detection); this is what I've done in
> > > the demo. The revset used to build the default branch graph is:
> > > hg log --follow-first -r 'sort(ancestors(default), -rev)'
> > >
> > > The drawback is that we cannot always pinpoint the particular
> > > changeset which introduced a regression if it occurred on a merge
> > > changeset (but we can give a range here).
> > >
> > >
> > > >
> > > >
> > > > >
> > > > >
> > > > > Airspeed velocity
> > > > > ~~~~~~~~~~~~~~~~~
> > > > >
> > > > > - http://asv.readthedocs.org/
> > > > > - used by the http://www.astropy.org/ project and inspired by
> > > > > https://github.com/pydata/vbench
> > > > > - Code: https://github.com/spacetelescope/asv
> > > > > - Presentation (2014): https://www.youtube.com/watch?v=OsxJ5O6h8s0
> > > > > - Python, Javascript (http://www.flotcharts.org/)
> > > > >
> > > > >
> > > > > This tool aims at benchmarking Python packages over their
> > > > > lifetime. It is mainly a command line tool, ``asv``, that runs
> > > > > a series of benchmarks (described in a JSON configuration file)
> > > > > and produces a static HTML/JS report.
> > > > >
> > > > > When running a benchmark suite, ``asv`` takes care of cloning
> > > > > or pulling the source repository into a virtualenv and running
> > > > > the configured tasks in that virtualenv.
> > > > >
> > > > > Results of each benchmark execution are stored in a "database"
> > > > > (consisting of JSON files). This database is used to produce
> > > > > plots of the evolution of the time required to run a test (or
> > > > > any metric; out of the box, asv supports 4 types of benchmarks:
> > > > > timing, memory, peak memory and tracking), and to run the
> > > > > regression detection algorithms.
> > > > >
> > > > > One key feature of this tool is that it's very easy for every
> > > > > developer to use it in their own development environment. For
> > > > > example, it provides an ``asv compare`` command for comparing
> > > > > the results of any two revisions.
> > > > >
> > > > > However, asv will require some work to fit our needs:
> > > > >
> > > > > - The main drawback with asv is that it's designed with commit
> > > > > date as the X axis. We must adapt asv's code to properly handle
> > > > > the "non-linearity" related to dates
> > > > > (see https://github.com/spacetelescope/asv/issues/390)
> > > > > - Tags are displayed in the graphs as secondary X axis labels
> > > > > tied to the commit date of the tag; they should be displayed
> > > > > as annotations on the dots instead.
> > > > >
> > > > >
> > > > > :Pros:
> > > > >
> > > > > - Complete and covers most of our needs (and more)
> > > > > - Handles Mercurial repositories
> > > > > - Generates a static website with dashboard and interactive graphs
> > > > > - Detects regressions, implements step detection algorithms:
> > > > > http://asv.readthedocs.org/en/latest/dev.html#module-asv.step_detect
> > > > > - Parametrized benchmarks
> > > > > - Can collect metrics from multiple machines
> > > > > - Shows tags on the graph, links to commits
> > > > > - Framework to write time, memory or custom benchmarks
> > > > > - Facilities to run benchmarks (run against a revset, compute
> > > > > only missing values, etc.)
> > > > > - Can easily be used on the developer side as well (before
> > > > > submitting patches)
> > > > > - Seems easily extensible through a plugin system
> > > > >
> > > > > :Cons:
> > > > >
> > > > > - No email notifications
> > > > > - Need to plot the graph by revision number instead of commit date
> > > > > - The per-branch graphs need to be fixed for Mercurial
> > > > This one seems pretty solid and I like the idea of being able to
> > > > run it locally.
> > > >
> > > > The dashboard seems a bit too simple to me, and I'm a bit worried
> > > > here. The branch part is another unknown.
> > > >
> > > > How hard would it be to implement a notification system on top
> > > > of that?
> > > I agree the home page with summary graphs seems useless; the
> > > regressions page could be a better entry point.
> > >
> > > To implement a notification system we could track modifications of
> > > the file "regression.json", which is generated when the static site
> > > is built (asv publish). At this point my idea is to keep the
> > > history of the static site in a dedicated repository, generate an
> > > rss/atom feed by looking at the history of the regression.json
> > > file, and then plug in any external tool that produces
> > > notifications from the feed (irc, mail, etc.). Another idea could
> > > be to have a mercurial hook that does the same thing.
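[The diffing step of that feed idea could be sketched as below. The schema is a placeholder, assumed here to map benchmark names to regression info; the real file produced by asv may well be shaped differently.]

```python
# Hypothetical sketch: diff two historical snapshots of regression.json
# (one per changeset in the site repository) and report regressions that
# appeared in the newer snapshot, each of which would become one
# rss/atom feed entry for irc/mail bridges to consume.
import json


def new_regressions(old_text, new_text):
    """Return the benchmark names present only in the new snapshot.

    `old_text` may be empty (first revision of the file)."""
    old = json.loads(old_text) if old_text else {}
    new = json.loads(new_text)
    return sorted(set(new) - set(old))
```

Walking `hg log regression.json` in the site repository and applying this to each consecutive pair of file revisions would yield the feed entries.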
> > >
> > >
> > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > > EzBench
> > > > > ~~~~~~~
> > > > >
> > > > > - Code: https://cgit.freedesktop.org/ezbench
> > > > > - Used to benchmark graphics-related patches on the Linux kernel.
> > > > > - Slides:
> > > > > https://fosdem.org/2016/schedule/event/ezbench/attachments/slides/1168
> > > > > /export/events/attachments/ezbench/slides/1168/fosdem16_martin_peres_e
> > > > > zbench.pdf
> > > > > - Shell scripts
> > > > >
> > > > > EzBench (https://cgit.freedesktop.org/ezbench) is a collection
> > > > > of tools to benchmark graphics-related patchsets on the Linux
> > > > > kernel. It runs the benchmark suite on a particular commit and
> > > > > stores the results as CSV files. It has tools to read the
> > > > > results and generate static HTML reports. It can also automate
> > > > > the bisect process to find the commit which introduced a
> > > > > regression. It's written in shell and Python and is tightly
> > > > > coupled to its purpose.
> > > > >
> > > > > :Pros:
> > > > >
> > > > > - Generates reports
> > > > > - Bisects performance changes automatically and confirms a
> > > > > detected regression by reproducing it
> > > > > - Tips for reducing variance; captures all benchmark context
> > > > > (hardware, libraries, versions)
> > > > >
> > > > > :Cons:
> > > > >
> > > > > - Not usable as is
> > > > > - Doesn't handle Mercurial repositories
> > > > It is unclear to me what makes it not usable as is (besides the
> > > > lack of Mercurial support?)
> > > Well, it seems we have to write a bunch of shell code to create a
> > > "profile" and "tests"; there is a kind of common library for
> > > writing these files, but it's only about graphics stuff. All I was
> > > able to get was a temporary black screen :)
> > >
> > >
--
Mathematics is the supreme nostalgia of our time.