SSL troubleshooting on Windows

Matt Harbison mharbison72 at gmail.com
Sun Apr 16 22:56:19 EDT 2017


On Wed, 12 Apr 2017 17:02:41 -0400, Augie Fackler <raf at durin42.com> wrote:

> On Mon, Apr 10, 2017 at 01:02:19AM -0400, Matt Harbison wrote:
>> I had a little adventure a couple of weeks ago, changing a certificate
>> on the server.  I'm not sure if it's worth doing anything, or if this
>> just serves as a heads up to others using Windows.
>>
>> We use SCM Manager (which uses tomcat) on CentOS, with Mercurial 4.0.2.
>> The certificate had expired over the weekend.  So the admin imported the
>> new certificate into the tomcat keystore, saw that the main page loaded
>> in a web browser (probably Chrome, and I think I subsequently tried
>> FireFox), and declared victory.  But Mercurial was still failing the
>> verification.  After a bit of fiddling around, he eventually imported
>> the root and intermediate certificates for the chain into the keystore
>> as well, and then Mercurial clients worked.  Except for one Windows
>> machine.  (I'm a bit puzzled that the web browsers were OK with this
>> config, but not Mercurial.)
>>
>> I was able to get the one failing machine working by temporarily
>> setting the fingerprint.  After awhile of debugging, I realized that
>> this is a python bug [1].  Basically, Windows doesn't ship with a full
>> set of certificates, and instead downloads them on demand.  But python
>> isn't triggering the download.  In this case, Windows had the root
>> certificate and the leaf certificate from the website, but was missing
>> the intermediate certificate.
>
> This is actually a server misconfiguration: some browsers do AIA
> chasing to mitigate missing intermediate certs (Chrome does, but
> Firefox doesn't!), so strictly speaking the right fix is to adjust
> your webserver on the CentOS box.
>
> (Thanks to Alex Gaynor for confirming my intuition on this and filling
> in some blanks.)

Thanks for the reply, and sorry for the delay.  I only know how some of  
this works in theory, so I had to do a lot of digging.

This explains why Chrome was OK with the original keystore, but
Mercurial wasn't.  But when the one Windows machine still failed later,  
the server was sending the intermediate certificates, which I confirmed  
with s_client in OpenSSL:

0: server cert
1: Go Daddy Secure Certificate Authority - G2 (exp 2026)
2: Go Daddy Root Certificate Authority - G2 (exp 2037)

I rebuilt the keystore using the "G2 with cross to G1, with root" [1],  
which added a new certificate to the exchanged list:

0: server cert
1: Go Daddy Secure Certificate Authority - G2 (exp 2026)
2: Go Daddy Root Certificate Authority - G2 (exp 2037)
3: Go Daddy Class 2 Certification Authority (exp 2034)

2 and 3 are both under "Trusted Root Certification Authorities" and  
"Third-Party Root Certification Authorities" in certmgr.msc.  Looking at  
the Certification Path tab for 2 and 3 in certmgr shows both as root  
certificates.  1 is shown as issued by 2, and there is no (obvious)
relationship between 2 and 3.

If the "Class 2 Certification Authority" certificate is deleted from  
"Third-Party Root Certification Authorities" (but still in "Trusted Root  
Certification Authorities"), Mercurial fails.  If you load the page in IE,  
it puts the cert back into "Third-Party Root Certification Authorities",  
and then everything works in Mercurial again.  (You need to close the Find
Certificate window and refresh in certmgr.msc, or certmgr's Find dialog
doesn't see the newly installed cert after IE puts it there.  Nice.)
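
For anyone who wants to repeat the store check without clicking through
certmgr.msc, here is a rough Python sketch. It assumes that "Third-Party
Root Certification Authorities" corresponds to the system store named
"AuthRoot" and "Trusted Root Certification Authorities" to "ROOT"; the
helper names are made up for illustration. ssl.enum_certificates() only
exists on Windows, so the sketch reports nothing found elsewhere.

```python
import hashlib
import ssl
import sys


def thumbprint(der_bytes):
    """Return the SHA-1 hash of a DER certificate as lowercase hex."""
    return hashlib.sha1(der_bytes).hexdigest()


def store_thumbprints(store_name):
    """Collect thumbprints of all X.509 certs in a Windows system store.

    Returns an empty set on non-Windows platforms, where
    ssl.enum_certificates() is not available.
    """
    if sys.platform != "win32":
        return set()
    return {
        thumbprint(cert)
        for cert, encoding, _trust in ssl.enum_certificates(store_name)
        if encoding == "x509_asn"
    }


def where_is(wanted, stores=("ROOT", "AuthRoot", "CA")):
    """Map each store name to whether the wanted thumbprint is in it.

    The store names are assumptions about how the certmgr.msc folders
    are named at the API level.
    """
    wanted = wanted.replace(" ", "").lower()
    return {name: wanted in store_thumbprints(name) for name in stores}
```

Calling where_is() with the thumbprint from the certificate's Details tab,
before and after loading the page in IE, should show whether the cert
moved into "AuthRoot".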

A second interesting thing that I stumbled into is that on another machine  
where I was experimenting with this, Mercurial was able to verify the  
connection with the short list above, but failed when sent the longer  
list.  I could toggle back and forth between success and failure by simply  
switching between the old and new keystore on the server.  I ended up  
fixing this by reimporting the "Class 2 Certification Authority"  
certificate from another Windows machine.

I know that the traditional model for this requires that you have your  
trusted certificates on hand, and identified as such when verifying a  
connection.  But it seems that newer versions of Windows ship with a
minimal set of certificates, and then call out to Windows Update to fetch
the issuing certificates, and trust them (or not) based on what is set up
on the MS server side of things.  See "How Windows updates root certificates"
here:

https://support.microsoft.com/en-us/help/931125/how-to-get-a-root-certificate-update-for-windows

>> So the "fix" was to load the main web page in Internet Explorer, which
>> silently builds the certificate chain on Windows, and then Mercurial
>> works.  An additional problem is that the error message is the same for
>> both this failure, and a real certificate problem.
>>
>> I started adding code to:
>>
>> 1) Detect and fix this without Internet Explorer via a debug function
>>    that calls CertGetCertificateChain()
>> 2) Detect the problem when connecting, and print out a message pointing
>>    to the debug command
>> 3) Just fix the problem when connecting, so that the debug command isn't
>>    necessary
>>
>> The problem is that you need to successfully connect to the server in
>> order to get the certificate to pass to the Win32 function.  Since the
>> connection fails if verification is on, a second (non verifying)
>> connection needs to be established to get the certificate, and then a
>> third (verifying) connection to retry the original.  Ugh.  I tried
>> turning off SSL verification on Windows, so that only the first
>> connection is needed.  It looks like there are Win32 functions that can
>> also verify the certificate chain, so we shouldn't be missing any
>> functionality.  But python doesn't return the non-binary certificate
>> when verification is off, so another check we do fails [2].
>>
>> I assume the chance of landing this hacky code is about 0.  The other
>> alternative seems to be to package certifi or similar in the binary
>> Windows installers.
>
> Sadly, that probably wouldn't fix this case because of the above.

I agree that the missing intermediate cert is a server issue.  But missing
the root certificates is a Windows problem that apparently can be fixed by
calling a specific CryptoAPI function (CertGetCertificateChain(), as
mentioned in my original mail).  (Really, python should be doing this.)
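
To make the "second (non verifying) connection" from my original mail
concrete, here is a minimal sketch of fetching the leaf certificate with
verification turned off. The function names are hypothetical, and this is
not the code I was adding to Mercurial. Note that getpeercert() returns an
empty dict when the peer certificate was not validated, which is why only
the binary (DER) form is usable here; those DER bytes are what would then
be handed to the Win32 chain-building function.

```python
import socket
import ssl


def make_unverified_context():
    """Build a context that accepts any certificate (diagnostics only)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch_leaf_der(host, port=443, timeout=10.0):
    """Connect without verification and return the leaf cert as DER bytes."""
    ctx = make_unverified_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            # binary_form=True works even though nothing was validated.
            return tls.getpeercert(binary_form=True)
```

The result must not be trusted for anything beyond seeding the chain
build; the original verifying connection still has to be retried
afterwards.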

>> That was mentioned when Greg overhauled the SSL stuff last year, but I
>> think it was deferred because there wasn't any real reason to when
>> python can access the Windows certificate store.  Maybe this is a
>> reason?
>
> I'm going to lean on my above claim about server misconfiguration for  
> now.
>
>>
>> [1] https://bugs.python.org/issue20916
>> [2] https://www.mercurial-scm.org/repo/hg/file/e0dc40530c5a/mercurial/sslutil.py#l623

[1] https://certs.godaddy.com/repository/gd_bundle-g2-g1.crt


More information about the Mercurial-devel mailing list