[PATCH stable?] run-test: ensure the test port are available before launching test

Fri May 8 01:14:07 CDT 2015

On 05/07/2015 08:26 PM, Gregory Szorc wrote:
> On Thu, May 7, 2015 at 8:00 PM, Pierre-Yves David
> <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>>
> wrote:
>
>
>
>     On 05/07/2015 07:37 PM, Gregory Szorc wrote:
>
>         On Thu, May 7, 2015 at 6:21 PM, Pierre-Yves David
>         <pierre-yves.david at ens-lyon.org
>         <mailto:pierre-yves.david at ens-lyon.org>
>         <mailto:pierre-yves.david at ens-lyon.org
>         <mailto:pierre-yves.david at ens-lyon.org>>>
>         wrote:
>
>              # HG changeset patch
>              # User Pierre-Yves David <pierre-yves.david at fb.com
>         <mailto:pierre-yves.david at fb.com>
>              <mailto:pierre-yves.david at fb.com
>         <mailto:pierre-yves.david at fb.com>>>
>
>              # Date 1431044040 25200
>              #      Thu May 07 17:14:00 2015 -0700
>              # Node ID c25b2adb3664cd3c488e2c53aab0c64100d40af7
>              # Parent  17ba4ccd20b48511b3d06ab47fb1b2faf31410d7
>              run-test: ensure the test port are available before
>         launching test
>
>              I'm running into systematic issue because there is always
>         some port
>              taken in the
>              1500 wide range of port used by the test (3 for each test
>         file).
>
>              diff --git a/tests/run-tests.py b/tests/run-tests.py
>              --- a/tests/run-tests.py
>              +++ b/tests/run-tests.py
>              @@ -47,10 +47,11 @@ import errno
>                import optparse
>                import os
>                import shutil
>                import subprocess
>                import signal
>              +import socket
>                import sys
>                import tempfile
>                import time
>                import random
>                import re
>              @@ -76,10 +77,22 @@ processlock = threading.Lock()
>                if sys.version_info < (2, 5):
>                    subprocess._cleanup = lambda: None
>
>                wifexited = getattr(os, "WIFEXITED", lambda x: False)
>
>              +def checkportisavailable(port):
>              +    """return true is nobody seem a port seems free to bind on
>              localhost"""
>              +    try:
>              +        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>              +        s.bind(('localhost', port))
>              +        s.close()
>              +        return True
>              +    except socket.error, exc:
>              +        if not exc.errno == errno.EADDRINUSE:
>              +            raise
>              +        return False
>              +
>                closefds = os.name <http://os.name> <http://os.name> ==
>         'posix'
>
>                def Popen4(cmd, wd, timeout, env=None):
>                    processlock.acquire()
>                    p = subprocess.Popen(cmd, shell=True, bufsize=-1,
>         cwd=wd, env=env,
>                                         close_fds=closefds,
>              @@ -1601,10 +1614,12 @@ class TestRunner(object):
>                        self._tmpbinddir = None
>                        self._pythondir = None
>                        self._coveragefile = None
>                        self._createdfiles = []
>                        self._hgpath = None
>              +        self._portoffset = 0
>              +        self._ports = {}
>
>                    def run(self, args, parser=None):
>                        """Run the test suite."""
>                        oldmask = os.umask(022)
>                        try:
>              @@ -1805,10 +1820,28 @@ class TestRunner(object):
>                        if failed:
>                            return 1
>                        if warned:
>                            return 80
>
>              +    def _getport(self, count):
>              +        port = self._ports.get(count) # do we have a
>         cached entry?
>              +        if port is None:
>              +            port = self.options.port + self._portoffset
>              +            portneeded = 3
>              +            # above 100 tries we just give up and let test
>         reports
>              failure
>              +            for tries in xrange(100):
>              +                allfree = True
>              +                for idx in xrange(portneeded):
>              +                    if not checkportisavailable(port + idx):
>
>              +                        allfree = False
>              +                        break
>              +                self._portoffset += portneeded
>              +                if allfree:
>              +                    break
>              +            self._ports[count] = port
>              +        return port
>              +
>
>
>         This is hard to read. Part of me thinks there is a race
>         condition here.
>         The offset management in particular has me worried.
>
>
>     The test (therefore this function) is called in the main thread
>     before we start using thread so no race condition should happen.
>
>         Ideally, tests would bind to port 0 and let the OS choose an
>         existing
>         port. That's the only guaranteed way to prevent race conditions. But
>         then we'd need to parse port numbers out of command output so
>         they could
>         be used in the rest of tests and we'd need a mechanism to register
>         replacements in test output. Scope boat.
>
>
>     Would could have variable management inside the test, but I expect
>     headache from it.
>
>         If you are going to rewrite port selection, I'd almost prefer
>         having the
>         test harness choose random numbers between 1024 and 65535,
>         attempt to
>         bind, and assign that. This seems simple and effective. There is
>         no need
>         to maintain state: just try binding in a loop until it works. Is
>         this
>         solution too simple? Is user control over the port range really that
>         important? We're not talking about a production service here...
>
>
>     Choosing random number for each test does not guarantee test will
>     not choose overlapping port number. The math says there is more than
>     99% change it will happen.
>
>
> Err, yeah.
>
> Have the test runner choose N random ports when the test executes. Keep
> track of which ports are reserved. Refuse to try those until the test
> finishes executing. That is doable. And it doesn't require nested range
> loops.

We have nested for loops because we bound the amount of tries, and each 
test get assigned multiple ports. I do not see these double for going 
away in another implementation. I also not sure why you flag that as 
"complicated".

-- 
Pierre-Yves David