Tuesday, June 4, 2019

Handling Authentication Requests with Selenium: Epilogue

I hope the preceding series of posts has been useful. To wrap things up, I want to share a GitHub repository that contains sample code for each of the items we've discussed. It includes an ASP.NET Core demo web site that implements Basic, Digest, and NTLM authentication. It includes sample Selenium code using BenderProxy (version 1.1.2 or later) and PassedBall (version 1.2.0 or later) to automate the site. The Selenium code runs in a console application, which will await you pressing the Enter key before shutting down the proxy and quitting the browser. This will allow you to see the state of the browser before everything quits. Other features of the sample repo include working factory classes for Selenium sessions and the demo cases themselves.

To make the demo in the source repo properly work, you must run it on Windows, because we are enabling NTLM authentication. Also, you will need administrative access on your Windows machine, which is unfortunate, but there is no other way to get the development web server to listen on a host name other than "localhost". If you change the test to navigate to the site on "localhost", the browser will likely bypass the proxy, because most browsers bypass proxies browsing on localhost without taking other configuration steps. By default, the demo project uses www.seleniumhq-test.test and port 5000, but you can use whatever you want. Here's how to configure your test environment so that the demo app will work properly:

From an elevated ("Run as Administrator") command prompt, edit your hosts file to contain a mapped entry for the host you wish to use. The hosts file can be edited in any text editor, including Notepad, so the following command will open it:

notepad.exe %WinDir%\System32\drivers\etc\hosts

Once open, add the following line:

127.0.0.1 <host name>

Be sure to substitute your preferred host name for <host name> . Save and close the hosts file. As an aside, this is a very useful technique for Selenium code to simulate navigation to external sites without actually having to navigate outside one's local machine.

Also in the elevated command prompt, execute the following command:

netsh http add urlacl url="http://<host name>:<port>/" user=everyone

Be sure to substitute your preferred host name and port for <host name> and <port> respectively. You should see a message that the URL reservation was successfully added. Now, this is a dangerous command, because it does open up a URL reservation for everyone, so you don't want to leave this permanently in place. You can remove it at any time after you're done using the sample by using another elevated command prompt to execute:

netsh http remove urlacl url="http://<host name>:<port>/"

Once you've added the hosts file entry and the URL ACL, you're ready to load and run the authentication tests. Open the solution in Visual Studio 2019, and you should be able to build and run. When running, the solution runs a console application that will launch the test web app, start the proxy server, start a browser configured to use the proxy with Selenium, navigate to a protected URL for a specific authentication scheme, and then wait for the Enter key to be pressed. This will let you examine the browser to validate that, yes, the authentication succeeded You can also examine the diagnostic output written to the console by the test code, which describes the WWW-Authenticate and Authorization headers being used. Once you've validated to your satisfaction that in the browser really did authenticate using Selenium and without prompting the user, you can press Enter, which will quit the browser, stop the proxy server, and shut down the test web app. As an extra validation step, you can also start the test web app from Visual Studio and manually navigate to the URLs to validate that they really do prompt for credentials when browsed to.

Here's the Main method of the test app:

As you can see, you can change the browser being used (line 5 in the listing above), and the authentication type (line 9 in the listing above) being tested by changing the commented lines in the main method. If you decided to use a different host name or port, you can also change that by uncommenting and changing the appropriate lines (lines 15 and 16 in the listing above, respectively).

Hopefully, this series has given you some insights into how browsers perform authentication, and how it's possible to automate this using Selenium, without resorting to other UI automation tools. Happy coding!

Monday, June 3, 2019

Handling Authentication Requests with Selenium - Part 4: NTLM Authentication

Now that we've created a thorough intellectual framework for how to handle authentication requests using Selenium in combination with a web proxy, and thanks to our last post, we can handle more than Basic authentication, let's take things a step further, and see how you can use Selenium in automating pages secured with NTLM authentication. Before we can do that, though, we need to have an understanding of how NTLM authentication differs from the previous types of authentication we've used before.

NTLM authentication is a Microsoft-developed technology, originally implemented in the company's IIS web server product. It's not widely used on the public internet, but it does integrate nicely with things like Active Directory, so it can be quite useful for web applications used on company intranets that require security based on Active Directory credentials. This means that to provide sample code, we'll need to have a few things in place first. First, we'll need a test website that we can run locally, running on a server that implements NTLM authentication. Since we're working in C# in this series, we can create an ASP.NET Core web project to do that.

Second, we'll also need to host the application using Windows. Even though the ASP.NET Core project can run against .NET Core, and that can run on platforms other than Windows, we'll need to actually run on Windows to take advantage of NTLM authentication, unless we want to introduce a ton of complexity with Active Directory domains and the like (which we don't for this post).

Finally, most browsers bypass the use of a proxy when running strictly on localhost. This means that if you're running things all on the same system, you'll need to either configure the browser not to do this, or trick it into thinking the site the browser is connecting to isn't the local machine. The latter is far easier, since it only involves adding a line to the Windows hosts file (located at %WinDir%\System32\drivers\etc\hosts). On my test system, I've redirected www.seleniumhq-test.test to 127.0.0.1 by using the hosts file, and the sample code will reflect this.

NTLM authentication is a challenge-response based authentication scheme, and it differs from other HTTP authentication schemes in that it authenticates a connection, not an individual request. This means that the browser and server must support so-called "keep-alive," or persistent TCP connections between them. It also means that our proxy has to support persistent TCP connections, and must allow us to use that exact connection for making the requests. Fortunately, the proxy we've been using so far, BenderProxy, does support this.

The challenge-response mechanism used is complicated. Very complicated. So again, we'll be using the PassedBall library to parse authentication headers and generate authorization responses. It also requires multiple request/response round trips to perform the authentication handshake. Here's the implementation code for handling the NTLM authentication challenge for a sample site hosted on our local host machine:

Note carefully that the initial 401 Unauthorized response may contain multiple WWW-Authenticate headers, so one may need to make sure the proper one is being used to interpret the response. Browsers, when faced with this, will usually choose what they perceive to be the "strongest" authentication method. In our case, we need to do that determination for ourselves.

We'll wrap up this series with one more post, summing everything up.