[PATCH stable?] run-test: ensure the test port are available before launching test

Thu May 7 22:26:52 CDT 2015

On Thu, May 7, 2015 at 8:00 PM, Pierre-Yves David <
pierre-yves.david at ens-lyon.org> wrote:

>
>
> On 05/07/2015 07:37 PM, Gregory Szorc wrote:
>
>> On Thu, May 7, 2015 at 6:21 PM, Pierre-Yves David
>> <pierre-yves.david at ens-lyon.org <mailto:pierre-yves.david at ens-lyon.org>>
>> wrote:
>>
>>     # HG changeset patch
>>     # User Pierre-Yves David <pierre-yves.david at fb.com
>>     <mailto:pierre-yves.david at fb.com>>
>>
>>     # Date 1431044040 25200
>>     #      Thu May 07 17:14:00 2015 -0700
>>     # Node ID c25b2adb3664cd3c488e2c53aab0c64100d40af7
>>     # Parent  17ba4ccd20b48511b3d06ab47fb1b2faf31410d7
>>     run-test: ensure the test port are available before launching test
>>
>>     I'm running into systematic issue because there is always some port
>>     taken in the
>>     1500 wide range of port used by the test (3 for each test file).
>>
>>     diff --git a/tests/run-tests.py b/tests/run-tests.py
>>     --- a/tests/run-tests.py
>>     +++ b/tests/run-tests.py
>>     @@ -47,10 +47,11 @@ import errno
>>       import optparse
>>       import os
>>       import shutil
>>       import subprocess
>>       import signal
>>     +import socket
>>       import sys
>>       import tempfile
>>       import time
>>       import random
>>       import re
>>     @@ -76,10 +77,22 @@ processlock = threading.Lock()
>>       if sys.version_info < (2, 5):
>>           subprocess._cleanup = lambda: None
>>
>>       wifexited = getattr(os, "WIFEXITED", lambda x: False)
>>
>>     +def checkportisavailable(port):
>>     +    """return true is nobody seem a port seems free to bind on
>>     localhost"""
>>     +    try:
>>     +        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>     +        s.bind(('localhost', port))
>>     +        s.close()
>>     +        return True
>>     +    except socket.error, exc:
>>     +        if not exc.errno == errno.EADDRINUSE:
>>     +            raise
>>     +        return False
>>     +
>>       closefds = os.name <http://os.name> == 'posix'
>>
>>       def Popen4(cmd, wd, timeout, env=None):
>>           processlock.acquire()
>>           p = subprocess.Popen(cmd, shell=True, bufsize=-1, cwd=wd,
>> env=env,
>>                                close_fds=closefds,
>>     @@ -1601,10 +1614,12 @@ class TestRunner(object):
>>               self._tmpbinddir = None
>>               self._pythondir = None
>>               self._coveragefile = None
>>               self._createdfiles = []
>>               self._hgpath = None
>>     +        self._portoffset = 0
>>     +        self._ports = {}
>>
>>           def run(self, args, parser=None):
>>               """Run the test suite."""
>>               oldmask = os.umask(022)
>>               try:
>>     @@ -1805,10 +1820,28 @@ class TestRunner(object):
>>               if failed:
>>                   return 1
>>               if warned:
>>                   return 80
>>
>>     +    def _getport(self, count):
>>     +        port = self._ports.get(count) # do we have a cached entry?
>>     +        if port is None:
>>     +            port = self.options.port + self._portoffset
>>     +            portneeded = 3
>>     +            # above 100 tries we just give up and let test reports
>>     failure
>>     +            for tries in xrange(100):
>>     +                allfree = True
>>     +                for idx in xrange(portneeded):
>>     +                    if not checkportisavailable(port + idx):
>>
>>     +                        allfree = False
>>     +                        break
>>     +                self._portoffset += portneeded
>>     +                if allfree:
>>     +                    break
>>     +            self._ports[count] = port
>>     +        return port
>>     +
>>
>>
>> This is hard to read. Part of me thinks there is a race condition here.
>> The offset management in particular has me worried.
>>
>
> The test (therefore this function) is called in the main thread before we
> start using thread so no race condition should happen.
>
>  Ideally, tests would bind to port 0 and let the OS choose an existing
>> port. That's the only guaranteed way to prevent race conditions. But
>> then we'd need to parse port numbers out of command output so they could
>> be used in the rest of tests and we'd need a mechanism to register
>> replacements in test output. Scope boat.
>>
>
> Would could have variable management inside the test, but I expect
> headache from it.
>
>  If you are going to rewrite port selection, I'd almost prefer having the
>> test harness choose random numbers between 1024 and 65535, attempt to
>> bind, and assign that. This seems simple and effective. There is no need
>> to maintain state: just try binding in a loop until it works. Is this
>> solution too simple? Is user control over the port range really that
>> important? We're not talking about a production service here...
>>
>
> Choosing random number for each test does not guarantee test will not
> choose overlapping port number. The math says there is more than 99% change
> it will happen.
>

Err, yeah.

Have the test runner choose N random ports when the test executes. Keep
track of which ports are reserved. Refuse to try those until the test
finishes executing. That is doable. And it doesn't require nested range
loops.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://selenic.com/pipermail/mercurial-devel/attachments/20150507/d4181291/attachment.html>