Wednesday, January 9, 2013

Revisiting Native Events in the IE Driver

A little over six months ago, I wrote a blog post that concentrated on so-called "native events" in the IE driver. To refresh your memory, native events are using OS-level mechanisms for simulating mouse and keyboard input in the browser. This is distinct from "simulated events", which rely on using JavaScript to trigger the events on the elements in a page. Since I wrote that blog post, there have been some developments in the IE driver that have changed the landscape a little, and it's probably a good time to revisit the topic.

If you'll recall, there were two major issues that people have had challenges with using native events with the IE driver. The first is mouse clicks being swallowed up if the IE window isn't the foreground window. The second is hovering over elements, where the elements displayed on hover would appear for a fraction of a second, then disappear. This second issue was particularly vexing, since it only happens when the physical mouse cursor is within the bounds of the IE window. See the aforementioned previous blog post if you want more details about what exactly causes these issues, and why they're particularly hard to solve. Having said that, let's look at the newer approaches that have come available in the last six months.

Persistent Hovers

The first approach is using what's called "persistent hovers." By default, the IE driver simulates native mouse movements by using the Windows SendMessage API to send WM_MOUSEMOVE messages to the IE window. This can work, but if the IE window is in the foreground, and the physical mouse cursor is within the bounds of the IE window, the hover won't persist; it'll flash, and disappear. To solve this, the IE driver launches a separate thread and continually throws WM_MOUSEMOVE messages at the IE window for the coordinates that your test code specifies.

This means that the hover doesn't flash once and disappear, now it flickers, but at least it sticks around long enough to do something with. With the flickering, there's a far better chance that the hidden element you're attempting to display with your hover will be visible when you try to interact with it. Careful readers will note that there's still a race condition here, which your code might lose. Also, using persistent hovers renders that particular instance of IE unusable if you want to do something with it manually after you've finished with your test. Finally, notice that they only attempt to address the hover problem, not the "no focus click swallowing" problem.

Persistent hovers are available now with native events in the IE driver. They're enabled by default. If, for whatever reason, you need to revert to the behavior without them, they can be controlled by the "enablePersistentHover" capability.

Simulated Events

Over the last six months or so, there have been some strong improvements made in the simulated events implementation in IE. Simulated events have the advantage that they're not reliant on window focus. They're also a bit faster than the default native events implementation. However, these advantages are not without their costs.

Simulated events are limited to the JavaScript sandbox. There are certain effects that can't be activated via JavaScript, like a hover triggered solely by CSS. Blocking JavaScript events, like using alert(), confirm() or prompt(), especially in an onclick or onsubmit event handler, can cause your JavaScript execution (and therefore your WebDriver code) to hang.

If your application doesn't run afoul of the issues endemic to pure JavaScript input simulation, simulated events may be a good choice for you. You can use simulated events by setting the "nativeEvents" capability to false.

Requiring Window Focus

One of the guiding principles of the WebDriver project is that a driver should not require the browser window to be in focus to work properly. This principle is so important to the project that it's been codified as part of the draft W3C specification for WebDriver. However, there is some subset of users that don't care about needing window focus in their environment, they just want the most accurate test utility possible. Often these people mention that they're running their tests in a test lab or on dedicated virtual machines, and don't need to worry that focus is taken by the browser window. Until now, the IE driver had no way to accommodate such a requirement.

However, with a recent change set in the IE driver code base, it is now possible to ask the IE driver to bring the window currently focused in the driver (as determined by the WebDriver.getWindowHandle) to be the window in the foreground. This attempt is not guaranteed to succeed, because making a window from another process the foreground window is a Bad Thing, but it's the only way to make simulated input work properly. Additionally, instead of using the flawed SendMessage/PostMessage approach, this change introduces use of the Windows SendInput API.

This functionality is brand new. It should be considered very experimental. All input using the WebDriver Advanced User Interactions API (the "Actions" class) will go through this code path if set correctly, but mouse clicks using WebElement.click() will not yet do so. Additionally, there's no mechanism (yet!) in the wire protocol to enforce that a sequence of actions should be handled as an atomic unit, though the new internal classes added to the IE driver will make it easy to implement such a feature. To enable this functionality, set the "requireWindowFocus" capability to true; it is defaulted to false, to maintain existing functionality. Note that requireWindowFocus and enablePersistentHover are mutually exclusive.

Hopefully, these methods will give you the most flexibility with how you perform keyboard and mouse interaction with Internet Explorer. Most of these are hacks at best, but that's the best we can do until Microsoft begins producing the IE driver.