[Bug 3557] New: Issues loading http://hg.mozilla.org/try

bugzilla-daemon at bz.selenic.com
Thu Jul 26 09:09:15 CDT 2012


http://bz.selenic.com/show_bug.cgi?id=3557

          Priority: normal
            Bug ID: 3557
                CC: mercurial-devel at selenic.com
          Assignee: bugzilla at selenic.com
           Summary: Issues loading http://hg.mozilla.org/try
          Severity: bug
    Classification: Unclassified
                OS: All
          Reporter: shyam at mozilla.com
          Hardware: All
            Status: UNCONFIRMED
           Version: 2.2.3
         Component: hgweb
           Product: Mercurial

Over at Mozilla, we have an hg repo called try, which developers use to
"test" changes to Firefox code before actually committing them to the
source repos (like mozilla-central or integration/mozilla-inbound).

We've always had issues with try once it reaches about 3000 heads, but of late
that's been happening pretty quickly (presumably because of the faster pace of
development). At that point, we used to simply reset try (re-clone from
mozilla-central, fix up perms) and things would be fine.

The most recent issue started out with try taking time to load, just over http
or https. We last reset try on June 26th 2012. 

Just over a week later, on July 4th, our developers started noticing slowness
when trying to load the web interface of try (i.e. http://hg.mozilla.org/try).
On checking, try was found to have 670+ heads.

The other gory details are in
https://bugzilla.mozilla.org/show_bug.cgi?id=770811 but over the course of the
month, hg.mozilla.org/try has gotten progressively worse.

http://hg.mozilla.org/mozilla-central or https://hg.mozilla.org/mozilla-central
always loads in under a second, whereas hg.mozilla.org/try takes anywhere from
under a second to over 3 minutes.

Yesterday a bunch of us sat down and spent some time getting strace information
to see what was actually happening behind the scenes :

1) Successful request (under 3 seconds) - http://pastebin.com/yj4dfwjU
2) Unsuccessful request (random amt of time) - http://pastebin.com/C2imiHh0

By Unsuccessful, I mean the request takes a long time to process and then
eventually works. The delay is seemingly random and the load times are
completely sporadic, which is why we're a little miffed. As of now, we've
not been able to pin this down to a pattern.
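
One way to look for a pattern would be to sample load times repeatedly and
compare the spread between repos. The following is only a minimal sketch,
assuming Python 3 and plain HTTP access from a client machine; the function
names, sample count and pause length are illustrative and not something we
have run as part of this report.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=300):
    """Return the wall-clock seconds taken to fetch url completely."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start


def summarize(durations):
    """Summarize a non-empty list of request durations (in seconds)."""
    ordered = sorted(durations)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


def sample(url, n=20, pause=5.0):
    """Fetch url n times, pausing between requests, and summarize."""
    durations = []
    for _ in range(n):
        durations.append(time_request(url))
        time.sleep(pause)  # hypothetical pause, to avoid hammering the server
    return summarize(durations)
```

Calling sample("http://hg.mozilla.org/try") and
sample("http://hg.mozilla.org/mozilla-central") at different times of day
would give comparable min/median/max figures for the two repos.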

The biggest difference between all the fast-loading repos and try is the
number of heads:

[root@hgweb1.dmz.scl3 mozilla-central]# hg heads -q | wc -l
8

[root@hgweb1.dmz.scl3 try]# hg heads -q | wc -l
2855
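
The same head count could be tracked over time with a small script. This is
a rough sketch, assuming Python 3.7+ and `hg` on the PATH; the helper names
and the repo path passed in are hypothetical, and it simply wraps the
`hg heads -q` command shown above.

```python
import subprocess


def count_heads_from_output(output):
    """Count non-empty lines, roughly mirroring `hg heads -q | wc -l`."""
    return sum(1 for line in output.splitlines() if line.strip())


def count_heads(repo_path):
    """Run `hg heads -q` inside repo_path and count the heads it reports."""
    result = subprocess.run(
        ["hg", "heads", "-q"],
        cwd=repo_path,          # run from within the repository
        capture_output=True,
        text=True,
        check=True,             # raise if hg exits with an error
    )
    return count_heads_from_output(result.stdout)
```

Logging count_heads() for try alongside the observed load times would show
whether the slowdowns track the number of heads.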

So I'm inclined to think some combination of hgweb and that many heads is
causing a problem somewhere.

To give some background on this setup, we have about 5 web nodes, with the hg
repos mounted read-only over NFS and fronted by Apache and mod_wsgi. The NFS
bits are handled by NetApp.

Any help/pointers will be awesome! Thanks.

-- 
You are receiving this mail because:
You are on the CC list for the bug.