Friday, April 19, 2019

Announcing Selenium 4.0 Alpha .NET Bindings

I am very proud to announce the release of the first alpha version of the Selenium 4.0 .NET language bindings! These bindings have been years in the making, and are now available for the first time in alpha form. They are by no means finished, and new features will be available before release. Some things to note about the bindings:
  • The bindings now only support .NET Framework 4.5 and above, and .NET Core 2.0 and above (via .NET Standard). This is to gain support for additional classes in the .NET Framework that are unavailable in previous versions of the framework.
  • The internals of how the bindings communicate with the browser drivers have been completely rewritten to use System.Net.Http.HttpClient. I'm sure that something in this conversion has been missed, so this needs thorough testing.
  • The bindings now only support the W3C WebDriver Specification dialect of the wire protocol. This simplifies the code for the .NET bindings considerably.
  • Methods and classes that were marked with the Obsolete attribute in 3.141 of the .NET bindings have been fully removed. This includes the ExpectedConditions and PageFactory classes. If you want to continue to use those structures, the DotNetSeleniumExtras packages will be updated for the final release.
The complete list of changes is available in the bindings' CHANGELOG. Please download and try out the bindings, and send your feedback. If you run into issues, you can file a new issue in the issue tracker at the Selenium project GitHub repository or you can contact me via Twitter (@jimevansmusic) or on the Selenium project's IRC or Slack channel. Happy automating!

Sunday, February 17, 2019

Improving IE Driver Use with Invalid Protected Mode Settings

Users who cannot properly set the browser's Protected Mode settings, usually because of restrictions imposed by an overzealous IT department, have always faced challenges when using the IE driver. This has been a known issue ever since the rewritten driver was introduced in 2011. The technical reasons for requiring the setting of Protected Mode settings are well-documented, and haven't changed in the intervening years.

In order to use the driver without setting the Protected Mode settings, the user had to resort to passing capabilities into the driver session creation, but this was still dicey because the driver could do nothing to mitigate the problem when a Protected Mode boundary was crossed. So even then, it was possible, even likely, to receive errors like, "Unable to get current browser," or, "Unable to find element on closed window." As such, there was no conceivable way to work around the issue and still support all of the versions of Internet Explorer that were required. Since, as of July 2019, the driver will support no versions other than IE 11, that landscape has changed somewhat. A change to the IE driver was recently (at the time of this writing) committed that attempts to at least make the experience somewhat better.

Now, when the user does not set the Protected Mode settings of the browser and sends the capability to bypass the checks for those settings, the driver will attempt to predict when a Protected Mode boundary will be crossed, and set in motion a process to reattach itself to the newly created browser. This process is far from perfect. It is subject to really challenging race conditions that are truly impossible to eliminate entirely, because of the architecture of the browser itself. Nevertheless, even in its flawed state, this is still a better outcome than it was previously for users.
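As a rough illustration of what "sending the capability" looks like on the wire, here is a sketch in Python of a W3C-style New Session request body for IE. The helper name build_ie_session_payload is hypothetical, and the "se:ieOptions" and "ignoreProtectedModeSettings" names reflect my understanding of how the language bindings serialize this option; treat both as assumptions and check the documentation for the bindings you use.

```python
import json


def build_ie_session_payload(ignore_protected_mode=False):
    """Build a W3C-style New Session request body for Internet Explorer.

    Hypothetical helper for illustration only. The "se:ieOptions" and
    "ignoreProtectedModeSettings" names are assumptions about how the
    bindings serialize the bypass option.
    """
    ie_options = {}
    if ignore_protected_mode:
        # Asks the driver to skip its Protected Mode settings check.
        ie_options["ignoreProtectedModeSettings"] = True

    always_match = {"browserName": "internet explorer"}
    if ie_options:
        always_match["se:ieOptions"] = ie_options

    return {"capabilities": {"alwaysMatch": always_match, "firstMatch": [{}]}}


if __name__ == "__main__":
    payload = build_ie_session_payload(ignore_protected_mode=True)
    print(json.dumps(payload, indent=2))
```

In practice you would set this through your language bindings' IE options class rather than building the JSON by hand; the sketch only shows where the flag ends up.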

Please note that the advice and support policy of the IE driver will continue to be that the user must set the Protected Mode settings of the browser properly before using the driver. Any "issues" that arise by not having the settings set, but that disappear when the settings are corrected, are not considered by the project to be valid issues. This will include, but not be limited to, issues like abandoned browser instances not being closed, use of multiple instances of the driver where the wrong browser window is connected to and automated, and issues where the driver appears to hang upon navigation to a new page. If the problem disappears when the browser is properly configured, any issue reports will be immediately closed with a note to properly configure the browser and remove the capability.

The following situations should be at least partially mitigated by the change:

  •  Navigation to a new page
  •  Clicking on a link (specifically an <a> tag) that will lead to navigation to a new page
  •  Clicking a link that opens a new window
Other cases, like navigating backward and forward through the browser history, clicking an element that submits a form, and so on, may not be handled. In those cases, issue reports will be summarily closed, unless a specific pull request fixing the issue is also provided. Additionally, use of things like proxies to capture traffic between the browser and web server may miss some traffic because of the race conditions inherent in the mechanism used to reattach to a newly created browser. Again, these race conditions are unavoidable, and issue reports that are based on them will be immediately closed with a note indicating that the browser must have its settings properly set. These strict guidelines are not intended to be harsh, and are not put in place with the intent to avoid investigating and fixing issues; rather, they must be enforced because the underlying architecture of the browser makes them unavoidable.

While not perfect, it's hoped that these changes will make things a little easier for users who run against Internet Explorer, but are prevented by circumstances beyond their control from properly configuring the browser. If you're one of those unlucky users, I hope you'll give the driver a spin, and see how it works for you.

Sunday, March 11, 2018

Deprecating Parts of Selenium's .NET Bindings

Note: This blog post should be considered a work-in-progress until this note is removed.

With the release of 3.11.0 of the Selenium .NET bindings, a few things in the support library (WebDriver.Support.dll) have been marked with the Obsolete attribute. This will come as something of a surprise for some users when they update to that version. In particular, the .NET implementation of the PageFactory and the ExpectedConditions class used with WebDriverWait have been deprecated. It's understandable that there would be some consternation about suddenly seeing compile warnings mentioning removing components in a future release of Selenium, particularly if one's own code makes use of those components. Why would the .NET bindings' maintainers do this?

First, one must consider the original intent of the WebDriver.Support.dll assembly. It was originally designed to showcase some of the things that would be possible to create based on the WebDriver API. It was not intended that large numbers of users would plug those examples directly into production code.

Second, the .NET implementation of these constructs was created mostly because some users asked, "Java has it, so why doesn't .NET?" Rather than blindly copying the Java implementations as was done, it would have been better to think about what actually makes sense when using C#. In other words, "C# isn't Java, and therefore the things that work best for Java may not be entirely appropriate for C#."

In the case of the .NET PageFactory, the implementation was problematic and cumbersome, as well as not nearly flexible enough for the myriad ways people wanted to create Page Objects. Additionally, when .NET Core 2.0 was released, the classes upon which the .NET PageFactory relied were not included in .NET Core 2.0. This meant that to get the PageFactory working under .NET Core, the project had to take on a new dependency, mangle the code with conditional compile directives, or leave it unsupported in .NET Core. The first approach is a non-starter for the Selenium project's .NET bindings, the reasons for which should be a subject of its own blog post. The second approach made the code nearly impossible to properly maintain.

Furthermore, with respect to the PageFactory in particular, there is no benefit to be gained by identifying elements via an attribute over doing it directly in runtime code. Claims that the PageFactory made Page Object creation and maintenance less verbose simply do not hold up under close scrutiny.
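To make the comparison concrete, here is a minimal sketch of a page object that looks its elements up directly in runtime code, with no factory and no attributes. It is written in Python for brevity rather than C#; the same shape applies to C# properties that call driver.FindElement(By.Id(...)). The LoginPage class and the element IDs are hypothetical, and the driver is only assumed to expose a find_element(strategy, value) method, as the Selenium Python bindings do.

```python
class LoginPage:
    """Page object that locates elements at call time, with no factory.

    `driver` is assumed to expose find_element(strategy, value); the
    element IDs used here are hypothetical.
    """

    def __init__(self, driver):
        self._driver = driver

    @property
    def username_field(self):
        # Looked up on every access, so no element reference is cached.
        return self._driver.find_element("id", "username")

    @property
    def password_field(self):
        return self._driver.find_element("id", "password")

    def log_in(self, username, password):
        """Fill in the credentials and submit the form."""
        self.username_field.send_keys(username)
        self.password_field.send_keys(password)
        self._driver.find_element("id", "submit").click()
```

Each locator appears exactly once, next to the property that uses it, which is about as much code as the attribute-based version would need.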

With respect to ExpectedConditions, again, this was an addition that was created in .NET solely because "Java has it." At the time the ExpectedConditions class in Java was created, the syntax for creating a lambda function (or something that acted like one) was particularly arcane and difficult to understand. In that case, a helper class made lots of sense for the Java bindings. However, C# isn't Java. In C#, the syntax for creating lambda functions ("anonymous methods" in the language of Microsoft's documentation) has been well understood by C# developers for many years, and is a standard tool in their arsenal.

In this case, the question of code verbosity does have some merit, but since wait conditions are rarely one-size-fits-all, it would be a much cleaner approach for users to develop their own conditions class that has the wait conditions they're interested in. This, however, is something users have an aversion to. Additionally, the thought of a "standard" collection of implementations of specific wait conditions seems to be a good idea on its face, but there is a great deal of variation on the way users want any given condition to work. Having a collection of wait conditions might be a good thing, but the Selenium project is not the place for it.
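For a sense of how little machinery a custom wait condition needs, here is a sketch of a polling wait that takes the condition as a lambda. It is written in Python to keep it self-contained; in C# the equivalent is a lambda passed to WebDriverWait.Until. The wait_until name and its defaults are hypothetical, and unlike a real WebDriverWait this sketch does not swallow exceptions thrown by the condition.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or time runs out.

    Returns the truthy value; raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Example use with a driver object (not defined here):
#   wait_until(lambda: driver.title.startswith("Results"), timeout=5)
```

Because the condition is just a function, each project can keep a small module of exactly the conditions it cares about, written the way it wants them to behave.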

So that people would still have access to the existing implementations, a new organization has been set up on GitHub. The code for these two pieces of the support library has been migrated there, and the first binary artifacts have been distributed concurrent with Selenium 3.11. People can move over to the ported implementations with minimal change (usually just a namespace change) to their own code. The new repo is awaiting someone from the community who feels strongly about maintaining these types of libraries.

Monday, September 25, 2017

Selenium WebDriver Support For .NET Core 2.0

Starting with release 3.6.0 of the .NET bindings, Selenium now has initial support for .NET Core 2.0. The .NET bindings in that release contain versions of the assemblies that are built against the .NET Standard 2.0 platform, which means they're intended to be used with .NET Core 2.0 projects. I know this has been a feature many people have wanted for a long time, and I'm glad the project can now deliver it. However, it does come with some associated costs, and with a few known issues.

The first known issue is that calls to localhost in .NET Core are slower than those in the full .NET Framework. This is due to internal differences in the .NET libraries themselves, and is not the fault of the bindings directly. See this issue in the .NET Core repository for more details.

Secondly, attempting to save a screenshot to any graphics file format other than Portable Network Graphics (PNG) will throw an exception. .NET Core does not provide the image manipulation classes that the full .NET Framework does, and there are no production-ready third-party libraries that provide that functionality yet and also only rely on managed code. It's fully possible to save a screenshot when using .NET Core, but you can only save it to the PNG file format within the Selenium libraries. This concern is over and above the difficulties with adding dependencies to the language bindings.
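The reason PNG keeps working is that, under the WebDriver protocol, a screenshot comes back from the driver as base64-encoded PNG data, so saving it is just a decode and a write; it is converting to another format that needs an image library. A minimal sketch of the PNG-only path, in Python rather than C# (the save_screenshot_png name is hypothetical):

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot_png(base64_png, path):
    """Decode a base64-encoded PNG and write it to `path` unchanged.

    Returns the number of bytes written. Raises ValueError if the decoded
    data does not start with the PNG signature.
    """
    data = base64.b64decode(base64_png)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("screenshot data is not a PNG image")
    with open(path, "wb") as image_file:
        image_file.write(data)
    return len(data)
```

If you need another format, one option is to save the PNG this way and convert it afterwards with whatever imaging tool your project already depends on.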

Speaking of difficulties with adding dependencies to the Selenium project leads me to the next known issue. When using the bindings against .NET Core, there is no PageFactory class available. This is not an oversight, nor is it a bug. I have long said that the .NET PageFactory implementation is not required for effective implementation of the Page Object Pattern, and the .NET PageFactory does not provide any tangible benefits to the user. Even the argument that the code is easier to read is specious with properly constructed page objects. Moreover, the existing .NET PageFactory implementation requires use of classes that are not available in .NET Core. It is a non-trivial matter to add additional dependencies to the .NET bindings, so simply replacing those classes with a third-party library that is compatible with .NET Core is not a "perfectly obvious" option.

Finally, references to the .NET Standard 2.0 library versions provided in this and future releases are only valid when using NuGet package references. Simply copying the assembly and adding an assembly reference to the .NET Core 2.0 project will not work. This is by design of the .NET Core ecosystem, which is now entirely dependent on NuGet to properly resolve dependencies.

One last note with the 3.6.0 release of the .NET bindings. Previously, the .zip archives that were provided at the official Selenium release site contained only the assemblies (.dlls) for the various frameworks that we supported. Starting with this release, the downloadable .zip archives contain NuGet package (.nupkg) files inside the .zip. To extract the actual .dlls from the packages, you can use any .zip reader to extract files from a .nupkg file. Yes, this means that we're putting a .zip inside a .zip, which is less than efficient, and we may revisit this mechanism of distributing the binary releases in the future.
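If you'd rather script the extraction than click through a .zip reader, here is a small sketch in Python using only the standard library. The extract_assemblies name is hypothetical, and the sketch assumes you have already pulled the .nupkg file out of the downloaded .zip archive.

```python
import zipfile
from pathlib import Path


def extract_assemblies(nupkg_path, dest_dir):
    """Extract every .dll from a .nupkg (a zip archive) into dest_dir.

    Paths inside the package are preserved, so assemblies built for
    different frameworks do not overwrite each other. Returns the list
    of extracted member names.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(nupkg_path) as package:
        for name in package.namelist():
            if name.lower().endswith(".dll"):
                package.extract(name, dest)
                extracted.append(name)
    return extracted
```

Remember that for .NET Core projects the extracted assemblies are not a substitute for a NuGet package reference, as noted above.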

Wednesday, March 22, 2017

Announcing Beta Release of Selenium IE Driver

One of the most common questions I get asked is, "How can I help contribute to Selenium?" Usually my answer involves pull requests and the like, but today, I can offer a much easier way for people to contribute. A significant part of my attention over the last four years has been thinking about and working on the W3C specification for WebDriver. While the specification codifies many of the things that the open source Selenium project has done for years, there are some significant changes to the wire protocol that the language bindings use to communicate with the drivers themselves. The specification already has an implementation in wide use, in geckodriver, Mozilla's driver implementation for Firefox. In order to move forward, however, the IE driver needs to be updated to follow the specification. Here's where you come in.

I've modified the IE driver to use the W3C dialect of the wire protocol. This modification, while significant internally, shouldn't show any differences in behavior from the existing, shipping IE driver. It currently passes all of the tests in the Selenium project for IE. While these tests are pretty extensive, the permutations available in the DOM and in Selenium WebDriver code used to automate it are nearly infinite. To that end, I'm announcing the availability of a beta version of the IE driver. What am I asking you to do? Simply download the new driver executable, and use it in place of the existing driver you're using in your Internet Explorer automation.


  • The beta driver should be a drop-in replacement for the existing 3.3.0 IEDriverServer.exe release. It should require no changes in your code, save maybe pointing to the new executable.
  • Having said that, there are some differences that are expected due to spec compliance. Full-page screenshots, for example, are explicitly disallowed by the specification, so are no longer generated by the driver.
  • The beta driver's version number (visible by executing IEDriverServer.exe --version) will be 3.3.99.x. Bug fix releases will increment the "build" (fourth) field of the version number.
  • This executable will only be available via the download site; it will not be available via package managers (Maven, NuGet, npm, etc.). If the beta appears in any of the (unofficial) packages that may be used for IEDriverServer.exe in a package manager, a request will be sent to the package owner to remove it, so please don't rely on those.
  • There have been some extensive internal rewrites due to the nature of the protocol changes. More on what to look for below.
  • Only the 32-bit version of the driver is being provided for the beta.
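Since issue reports need the specific beta version, it can be handy to check the version string in a script. Here is a sketch in Python that picks the build field out of a 3.3.99.x version number. The beta_build_number name is hypothetical, and because I am not assuming anything about the exact text that IEDriverServer.exe --version prints, the sketch simply searches for the first four-field version number in whatever string it is given.

```python
import re

# Four dot-separated numeric fields: major.minor.revision.build
_VERSION_PATTERN = re.compile(r"\b(\d+)\.(\d+)\.(\d+)\.(\d+)\b")


def beta_build_number(version_text):
    """Return the build (fourth) field if the text holds a 3.3.99.x version.

    Returns None when no four-field version number is found, or when the
    first one found is not a 3.3.99.x beta version.
    """
    match = _VERSION_PATTERN.search(version_text)
    if match is None:
        return None
    major, minor, revision, build = (int(group) for group in match.groups())
    if (major, minor, revision) != (3, 3, 99):
        return None
    return build
```

A return value of None means you are not running the beta, which is worth confirming before filing an issue against it.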

Areas of Concern

We want to know if there are any differences between the shipping 3.3.0 version of IEDriverServer.exe and the beta version. You should see the same behavior, including bugs; do not expect the beta driver to magically fix issues you may have experienced with IE in the past. Updating to support the specification wire protocol has required extensive rewrites, but these should all be transparent to the language bindings. The biggest changes have happened in the areas of element interactions, so you should pay special attention to things like WebElement.sendKeys(). There is one known issue: if you call WebElement.submit(), and the onsubmit event handler throws up a JavaScript alert(), the driver will hang. This issue won't be fixed until after the merge back to master. Also note that the beta has to date only been tested against IE 11, and per the driver's current support policy, only officially supports IE 9, 10, and 11.

Reporting Issues

Issues with the beta can be reported to the Selenium project's issue tracker. However, we have to set some ground rules for the issues that you submit. Here they are:
You'll need to provide the following information with any issue report:
  • Language bindings (Java, .NET, Ruby, Python, JavaScript) and version number you're using
  • The specific version of the beta you're using
  • The WebDriver code that behaves differently
  • An HTML page (or link to one) that the WebDriver code can be run against
Lack of any of this information will cause the issue to be closed immediately, without action or investigation! There are simply too many other potential issues with the existing IE driver, and the timeline for getting this merged into the main code line is simply too short to be able to go back and forth with issue reporters trying to set up a reproducible case. Moreover, here are some further guidelines about submitting issues.
  • Prefixing your issue title with "IE Driver Beta" will get it processed more quickly than if you don't.
  • The beta has only been tested with 3.3.x versions of any language bindings. It should still work with any language bindings of the 3.x vintage, but if you haven't tried your code with at least 3.3.x, you will be asked to do so before further investigation can continue on your issue.
  • You should be able to concretely demonstrate a difference in behavior from IEDriverServer.exe 3.3.0 and the beta you're using. If you cannot, you will be asked to do so before investigation can continue.
  • If you are using a test framework, and your sample code cannot be extracted to simple, straightforward WebDriver-only code, your issue will be closed. Developer bandwidth is just too narrow to wade through tons of framework code to get to the few lines of WebDriver code that are exhibiting different behavior.
  • If you omit an HTML page that can be tested against, your issue will be closed. Again, this may seem overly restrictive, but without this caveat, it will be nearly impossible to debug the issue with the beta driver.
This is pretty time-sensitive, so if you'd like to give this a try, the Selenium project developers would really appreciate it.

Monday, February 13, 2017

Announcing End of Life of .NET Selenium RC Language Bindings

This post will serve as the official announcement that version 3.1 of the Selenium .NET language bindings will be the last to provide a Selenium RC library. Users still relying on the RC API will be able to continue to do so using WebDriverBackedSelenium, or can convert their code to use WebDriver proper. Selenium RC has been deprecated for over six years, and the .NET Selenium RC language bindings have not been updated with a code change other than a version bump in nearly that long. This change isn't likely to affect many users at this point, and the 3.1 versions of the language bindings will continue to be available more-or-less indefinitely, but there will be no further changes to the .NET RC library or releases of it.

Let me restate this so that it's blatantly obvious. This does not affect the .NET language bindings for WebDriver, and WebDriverBackedSelenium will remain a viable path forward for some time. It only affects Selenium RC in the .NET language bindings.

Tuesday, August 23, 2016

Polyamory, Pride Flags, and Patterns of Feedback

Warning: For those of you who come here looking for technical advice and inside information about the Selenium project, WebDriver, or browser automation, this post isn't about any of those. You might just want to skip this one altogether.

One thing about me I'm not really sure how many people are aware of is that I'm polyamorous. That means that I am comfortable being in simultaneous romantic relationships with multiple partners at once, and that my participation in those relationships is openly known by all people involved. I've been polyamorous, or "poly" for short, for nearly all of my adult life. A little over 20 years ago, I lived in the Pacific Northwest, and for the first time in my life, I experienced first-hand the struggles and celebrations of what is now known as the LGBT community. One thing that struck me was the imagery and symbolism those communities used to rally around, identify other members, and publicly announce their membership in the community. The pride flag was one image that made a huge impression on me. At that time, the poly community didn't really have similar symbols to use, so I took it upon myself to create one. Here's what I made up, and released into the public domain in the late summer or early fall of 1995.

Here's the text I wrote up describing it to the first mailing list I shared it with. It's become the canonical description of this particular flag.
The poly pride flag consists of three equal horizontal colored stripes with a symbol in the center of the flag. The colors of the stripes, from top to bottom, are as follows: blue, representing the openness and honesty among all partners with which we conduct our multiple relationships; red, representing love and passion; and black, representing solidarity with those who, though they are open and honest with all participants of their relationships, must hide those relationships from the outside world due to societal pressures. The symbol in the center of the flag is a gold Greek lowercase letter 'pi', as the first letter of 'polyamory'. The letter's gold color represents the value that we place on the emotional attachment to others, be the relationship friendly or romantic in nature, as opposed to merely primarily physical relationships.
Now, here are some things to understand. Clearly, I'm not a visual artist. My tools for creation at the time were literally limited to Microsoft Paint, running on Windows 3.1. Nevertheless, the flag design managed to limp along, with little fanfare. My friends and I used it, and thought of it as quirky and something that could be used in the way other pride flags were used, as a symbol to rally around and for identification.

Fast forward 20 years. Apparently, this thing called the World Wide Web happened, and let all sorts of people communicate and discover things they'd never known about before. New polyamorous people began to discover the flag existed. One would think that people might think it was an interesting idea, given its intent. One would be wrong. The flag has been called vile, no good, hideous, disappointing, ugly, and many other negative things.

One of the issues frequently brought up is that the color scheme is garish or unpleasing. That's subjective, and I can't argue with their perception. I still think there's value in the color symbology, if not the actual RGB values I used when creating it.

Many people seem to take issue with the pi symbol as obscure. There were specific reasons for choosing it at the time. First, I specifically avoided imagery that included a heart. The leather pride flag, which predates the design of mine, includes a heart, and I was trying to avoid confusion, given that community was there first. The "infinity heart" was not yet as widely accepted a symbol for polyamory, and would have been challenging for me to incorporate given my limited abilities in the visual arts. The letter pi was readily available on computer typographic platforms even in those days, so I chose that.

Also, at the time, I was more concerned with "in the closet" polyfolk, and was far more in the closet myself than I am these days. I wanted a symbol that could be used relatively anonymously, that could let people who were in on the symbology connect, without it being too specific.

Additionally, there was already a rich history of existing pride symbols using Greek letters, the use of lambda as an LGBT symbol being a concrete example. I was hoping to evoke similarity and solidarity without being too explicit or derivative. Finally, the fact that the "poly" in polyamory is a Greek root seemed to indicate that would be a natural choice. In retrospect, perhaps a lemniscate ("infinity symbol") would've been a better choice, but nobody spoke up then.

Poly people coming to read this full story for the first time, welcome. Glad to meet you. If you don't care for the flag, I'm sorry to have offended your sensibilities. Today, there are a number of alternative symbols you can rally around. Use mine, don't use it, I'm just glad some people found a banner to rally around in the late '90s. Feel free to leave comments, but dismissive and abusive comments will be removed.