Wednesday, July 25, 2012

WebDriver: Y U NO HAVE HTTP Status Codes?!

There's a long-standing issue in the Selenium issue tracker dealing with the fact that the WebDriver API does not expose HTTP status codes to the user. For those of you who like to keep score at home, it's issue #141 in the tracker. The issue was opened in February 2009, and was closed as "Won't Fix" in December of that year. Even though many project contributors and the main project architect have said with finality that this feature will not be made available, the issue has continued to garner comments, many of them vehement, asking for reconsideration. I'm going to spend a little time making what I hope is a reasoned, rational argument for why this decision was made, and why the feature isn't needed in the project.

How Did We Get Here?

What follows is an oversimplified brief recap of some history. In the beginning was Selenium RC, which was an API that grew organically during its existence, with no rhyme or reason for how methods were tacked onto its single object. Over time, it became a brilliant example of the God object anti-pattern, violating all kinds of object-oriented programming principles. One of the more obscure methods engendered by the organic growth of the project was called captureNetworkTraffic(). This method purported to capture all of the network traffic between the browser and the site being automated, and was made possible because in some browser configurations, Selenium acted as a proxy between the browser and the site being automated, thus all browser traffic passed through Selenium, and could be captured or manipulated.

And so it came to pass that Selenium RC was found wanting, and lo, there was much weeping and wailing and gnashing of teeth in the browser automation community. And thus it was that the WebDriver project was born, and eventually merged into the Selenium project to become Selenium WebDriver. WebDriver was a completely different approach to browser automation, preferring to act more like a user, which solved the fundamental problems inherent in the Selenium RC approach. However, since much of the actual driving of the browser would now be done external to the browser itself, and with no proxy in between the browser and the site being automated, it made creating a method similar to captureNetworkTraffic() difficult at best, and impossible at worst. 

Where Are We Now?

This brings us to the current state of affairs, with HTTP status codes being unavailable in the WebDriver API. An architectural decision was made during the creation and development of the WebDriver API that this feature would not be implemented, and that it would be declared "out of scope" for Selenium WebDriver. This has caused some consternation among users, especially since the issue has been closed and the project team has said in extremely plain, even blunt, terms that the feature won't be implemented. There are some valid technical reasons why this decision was made. Here are just a few of them:
  • What about redirects? Do you just return the last code after the redirect, or the code that indicates a redirect? What do you say to those who disagree with your answer to that question?
  • Some browsers make it impossible to get the status code. Do you really want an API that works on some browsers, but not others?
  • WebDriver is concerned with driving the browser, not necessarily just web application testing. Does returning HTTP status codes really fit this mission?
  • WebDriver is concerned with driving the browser as a user would. Does a browser show HTTP status codes to the user, or just rendered pages?
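
To make the redirect question concrete, here is a minimal sketch in Python using only the standard library. The tiny local server, its /old and /new paths, and the status_chain helper are all made up for illustration; none of this is WebDriver code. It simply follows a redirect by hand and collects every status code along the way, which shows that "the" status code of a navigation is really a list.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical site: /old redirects to /new, which serves a page."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/new":
            body = b"<html><body>Welcome</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def status_chain(host, port, path, max_hops=5):
    """Follow redirects by hand and return every status code seen, in order."""
    codes = []
    for _ in range(max_hops):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()
            codes.append(response.status)
            location = response.getheader("Location")
        finally:
            conn.close()
        if response.status not in (301, 302, 303, 307, 308) or not location:
            break
        # Only the path is kept; this sketch never leaves the local server.
        path = urlsplit(location).path or "/"
    return codes


def demo():
    """Start the throwaway server on a free port and request /old."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return status_chain("127.0.0.1", server.server_port, "/old")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    chain = demo()
    print("first code:", chain[0])  # the redirect itself
    print("last code:", chain[-1])  # the page the user ends up on
```

An API that returned a single number would have to pick one entry from that chain, and whichever one it picked, someone would want the other.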

Nevertheless, that hasn't stopped people from arguing that it should be included. Let's take a few minutes to examine some of those arguments, and then we'll take a look at what options there are for solving this issue.

"HTTP status codes are an important part of website testing."

Yes, I can see the argument that HTTP status codes could be an important part of testing your website. However, while web application testing is an important use case of the Selenium project, it's far from the only one. A frequent response to this is, "But your own web pages and literature say it's for web application testing!" That's a fair critique about documentation and public perception of the project, but that's really a separate discussion. HTTP status codes are not a required part of automating the browser.

"I don't care that it doesn't work on all browsers, I need those status codes."

One of the major advantages of the WebDriver API over what has come before is its elegance and purity. Implementing a solution that works in only some browsers is vastly inferior to one that works for all browsers, especially when the solution isn't in the core competency of the library. There are solutions that do work for all browsers without polluting the API, albeit they require integration with other tools that <gasp> are not Selenium. Tacking on this feature for some browsers in a suboptimal way that might miss important edge cases because it's not a core competency is akin to driving a wood screw into a board with a hammer because that's the only tool you have. Yes, it'll work, but it's not going to be pretty, and is likely to fail you somewhere down the line.

"Other parts of the WebDriver API let you do things a user can't do, why not expose status codes?"

Proponents of this argument usually cite manipulation of cookies, or finding of elements by anything other than visual inspection, or viewing a page's HTML source, or any number of other items. All of those items are germane to interacting with the page itself, with what is displayed. The HTTP status code is not directly concerned with what is displayed on the page.

"But so many people want the feature, you should really add it."

Ah, yes, the old, "But I want a pony," argument. Or its corollary, "20 million New Kids On The Block fans can't be wrong!" This argument is occasionally followed by a sometimes hostile, "Well, your lack of this feature makes your library completely useless to me," or even, "You'd better add it or else I'll stop using it!" This latter response is the equivalent of, "I'm going to hold my breath until you give me exactly what I want!" Just because something's popular doesn't make it a good idea.

Where Do We Go From Here?

Just as proponents of exposing HTTP status codes in the WebDriver API are convinced that the arguments against including them don't hold water, the members of the project team are equally convinced that adding them is a bad idea. So if you feel like you absolutely need to have them, what can you do? Well, you have a couple of options.

Remember how I said earlier that Selenium RC was able to give you this information because it acted as a proxy? You can recreate that exact same environment with WebDriver! Of course, you have to use a dedicated proxy to do it, but guess what? There are lots of software proxies around that make this really easy. Additionally, you're using the proxy to capture the traffic, not something half-baked that's been shoehorned into the WebDriver API, which is a little thing I like to call using the right tool for the job. As an example, the BrowserMob proxy is one that's being used successfully by lots of people. It's open-source, and it's written in Java, so if you're using Java, you can even control it directly from your existing code. If you're not using Java, fear not, as there are wrapper libraries written for many different languages, including Python, .NET, Ruby, and PHP, to name a few. The WebDriver library even allows you to set the browser you're automating to use the proxy.
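
To show the idea without depending on any particular proxy product, here is a rough sketch in Python using only the standard library. It is not BrowserMob and does not use its API; the RecordingProxy and Origin classes and the fetch_via_proxy helper are invented for this illustration, and the sketch handles only plain-HTTP GET requests. A plain HTTP client stands in for the browser so the example runs without Selenium; in a real setup, the browser driven by WebDriver would be configured to send its traffic through the proxy instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Headers the proxy regenerates itself instead of copying from upstream.
SKIPPED_HEADERS = {"connection", "transfer-encoding", "content-length", "server", "date"}


class Origin(BaseHTTPRequestHandler):
    """Hypothetical site under test: '/' exists, everything else is a 404."""

    def do_GET(self):
        status, body = (200, b"home") if self.path == "/" else (404, b"not found")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


class RecordingProxy(BaseHTTPRequestHandler):
    """Forwards plain-HTTP GET requests and records each status code."""

    recorded = []  # (url, status) pairs, shared by all requests

    def do_GET(self):
        # A client that uses an HTTP proxy sends the absolute URL as the path.
        target = urlsplit(self.path)
        path = target.path or "/"
        if target.query:
            path += "?" + target.query
        upstream = http.client.HTTPConnection(target.hostname, target.port or 80, timeout=5)
        try:
            upstream.request("GET", path)
            response = upstream.getresponse()
            body = response.read()
        finally:
            upstream.close()
        RecordingProxy.recorded.append((self.path, response.status))
        # Relay the upstream answer to the client unchanged.
        self.send_response(response.status)
        for name, value in response.getheaders():
            if name.lower() not in SKIPPED_HEADERS:
                self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass


def start(handler):
    """Run a server for the given handler on a free local port."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_via_proxy(proxy_port, url):
    """Stand-in for the browser: request an absolute URL through the proxy."""
    conn = http.client.HTTPConnection("127.0.0.1", proxy_port, timeout=5)
    try:
        conn.request("GET", url)
        conn.getresponse().read()
    finally:
        conn.close()


def demo():
    RecordingProxy.recorded.clear()
    origin, proxy = start(Origin), start(RecordingProxy)
    try:
        base = "http://127.0.0.1:%d" % origin.server_port
        for path in ("/", "/missing"):
            fetch_via_proxy(proxy.server_port, base + path)
        return [status for _url, status in RecordingProxy.recorded]
    finally:
        for server in (origin, proxy):
            server.shutdown()
            server.server_close()


if __name__ == "__main__":
    print(demo())
```

The point of the design is that the status codes are read from the proxy's record after the fact, so the test code that drives the browser never needs to know about HTTP at all. A production proxy also has to deal with HTTPS, other request methods, and streaming, which is exactly why a dedicated tool is the better choice.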

"But I don't want to use a proxy! I only want to have to manage Selenium as a dependency," I hear you say. I've got a little secret for you: WebDriver is Open Source Software. It's even very liberally licensed. If you don't like a decision made by the project team, fork it, create a patch, and share with the world. I personally love innovation in the Open Source world. Don't tell me how you want to see it done, show me, with working code. Unless you can demonstrate your solution working for all browsers though, I'll probably use a proxy if I need this functionality, and that's my choice.  
   

22 comments:

  1. The problem with BrowserMob is that it's still in its beta phase and it's not very stable. Although I've used it successfully in many projects, it has always caused some problems, sometimes to the extent that I had to abandon it. Alternatively, I use Firefox+Firebug+NetExport, but this solution is slow and has its drawbacks too. If anyone knows of a stable proxy I could use with Java, I would appreciate it if they shared their knowledge.

    1. Thanks for the explanation, Jim. You motivated me to write a blog post with some alternatives and workarounds for those of us who need this feature:

      http://www.ninthavenue.com.au/how-to-get-the-http-status-code-in-selenium-webdriver

  2. Not exposing HTTP Status codes was the single most important reason I stopped using Selenium WebDriver. It's just a way too important detail about HTTP to leave out of an HTTP testing framework.

    The only argument you make that holds water (imho) is that introducing HTTP status codes won't be coherent across all implementations. The solution to this is to throw an exception in browsers that don't expose it. People are used to special-casing their code for different browsers already, so this wouldn't cause much confusion (if the exception thrown has an intuitive message, at least).

    So, I didn't understand this decision before reading this blog post and still don't.

    1. Just out of curiosity, what did you decide to use instead?

      I'll point out that WebDriver is most emphatically *not* an "HTTP testing framework," except in the sense that it drives a browser, which uses HTTP to communicate with a server.

      Once again, there are options, the most extreme of which is to implement it, and show me how it's done. Less extreme is to use the right tool for capturing that information, a proxy.

      Rather than continue to shout at the proverbial rain about what idiots the developers are for making poor decisions, I invite you to be the agent of the change you want to see.

    2. Not sure why you characterize @asbjornu's arguments as "continuing to shout at the proverbial rain." That implies that WebDriver cannot possibly change - which is untrue.

      Would you characterize Aung San Suu Kyi's decades of resistance to the Myanmar regime in the same way? Maybe WebDriver *won't* change, but that doesn't per se imply that Selenium's users who see the change in direction as a serious mistake should stop replying to posts that defend the decision.

    3. Nice strawman you've created there. There's a fundamental difference between political regimes and a discussion about API decisions in an open-source project. Furthermore, I resent the comparison.

      I'd never suggest that anyone should stop making the case for an opinion they hold. Having said that, the important thing to take away from my comments is that if it's that all-fired important to someone, there are other courses of action than just "replying to posts that defend the decision." Create the change. Write the code. Show me a patch that implements the way you want it to work.

      I suspect I know what many people's response to the "do it yourself" attitude may be, but I'll give my audience the benefit of the doubt and refrain from attempting a preemptive response.

    4. Sorry the comparison to Myanmar got in your way ... that was not the point of the comparison. The point (I could have been more explicit but I thought it was clear) was your apparent characterization of continued opposition as pointless and foolish.

      I agree with the take-away you explained in this last comment: those who disagree are encouraged not only to express their disagreement but also to help create the change. But what you said before was not "Rather than *just* continue... I *also* invite you to", but "Rather than continue ... I invite you to". So yes, you did appear to be suggesting that he stop making the case for the opinion he holds. I'm glad to hear that is not the case.

    5. Likewise, I thought my statement was clear as well, that I'm not opposed to a thoughtful, respectful discussion of differing opinion, but that actions are even better. I apologize for the lack of clarity.

    6. While I agree with you, Jim, that the most constructive thing to do would be to provide a patch or something similar to show how it could be done, I'm not a Java developer, so that would require a bit more investment on my part than this feature is worth to me.

      Having said that, I still think a good way to implement it is to throw exceptions in the one(?) browser where it's not available, and return the status code in the others. Redirects should be handled like they are now; they can be followed automatically (in which case you should return the status code of the final page) or they can be followed manually (in which case you should return the HTTP redirect status code of the one or many redirecting pages in-between).

      But as I wrote, I've moved on. The company I work for and I are now using HtmlUnit (or rather, NHtmlUnit) directly, since it was headless testing we were after all along. The JavaScript support is so-so, but it's better than nothing and helps us tremendously in keeping a solid harness around our web applications.

    7. If "headless testing" is what you were really after, then it sounds like you made the right decision. As much as I like the project and am invested in it, WebDriver isn't the right tool for every job. NHtmlUnit is an interesting choice.

      There are just a couple of misconceptions that I want to correct. Given your stated goals, you were probably more interested in the HtmlUnitDriver than any other implementation, but I don't want anyone to come away from this discussion with the idea that you *must* be a Java developer to make meaningful contributions to WebDriver. If you're changing something in the WebDriver API, submitting that change in any language binding would be fine. For the implementation side on a specific browser, that language will change depending on the driver implementation (C++ for Chrome and IE, mostly JavaScript for Firefox and Safari, Objective-C for iPhone, and so on).

      Yes, that may be a huge investment for someone who's not natively a developer in those languages, and that's a perfectly fair assessment. Nevertheless, that's the tradeoff one always faces when using open-source software: is my pain at not having a feature greater than the effort it would require for me to build it myself? This is especially true when the project is produced entirely by unpaid volunteers, as WebDriver is.

      Finally, it isn't just one browser that makes it impossible to get the HTTP status code; there are at least three that I know of. IE, desktop Safari, and mobile Safari (iPhone) are all in that boat. Android may or may not be; I honestly don't know.

    8. Jim,

      An old thread I know, but new to me.

      I want to "create robust, browser-based regression automation" for my project, written in Python. Selenium seems like the perfect tool for me; in fact, I grabbed the requirements text from http://docs.seleniumhq.org/, as it matched my needs exactly.

      After getting a few simple test cases going, I wanted to expand my automation suite to include checks for various HTTP status codes. Imagine my surprise to see they are not available.

      In the perhaps naive attempt to have your team reconsider this early design decision, allow me to respond to your stated rationale:

      - It was an early design decision not to return HTTP status codes. This one should be easy; lots of decisions made early in a project get reconsideration based on user feedback, right?

      - What about redirects? Redirects can be handled in a few ways, but if you want to start with returning the last status code, that would likely handle a large percentage of the test needs, and that approach seems to be consistent with the concept of matching the user experience with the tool.

      - Some browsers make it impossible to get the status. OK, I’m sure a lot of your users can live with that. >50% is a lot better than 0% coverage of this feature.

      - Webdriver does more than support testing. I get it, but you promote it as a test tool, so I assume you want it to be a great test tool. Why would you limit its usability for this case because there are other use cases?

      - Webdriver drives the browser like a user. Users can see the HTTP status codes in most modern browsers.

      Anyway, it might be informative if you take a poll on the topic. It is surprising to see such resistance to a popular idea with some technical merit.

  3. "Yes, I can see the argument that HTTP status codes could be an important part of testing your website."

    It's encouraging to see that you are listening to these important points in favor of exposing HTTP status codes.

    "while web application testing is an important use case of the Selenium project, it's far from the only one"

    That's a bit like saying that going forward is an important use case for a car. The Seleniumhq web site says it is the *primary* use case: "Primarily it is for automating web applications for testing purposes." And it's very difficult to reconcile this with your conclusion that "the feature isn't needed in the project".

    "That's a fair critique about documentation and public perception of the project, but that's really a separate discussion."

    You can try to separate it, but it still needs to be addressed, because this whole issue turns on the question (it didn't use to be a question at all) of whether Selenium is for testing web applications. Until the conflict between public promises of web application testing and statements that WebDriver is only for browser automation is addressed, you will continue to have users who feel like they've been "baited and switched."

    "HTTP status codes are not a required part of automating the browser."

    This just begs the above question.

    I hear you acknowledging that the feature is important, but that it would be very difficult to implement in a way that works across all browsers. That's a fair answer. Asking others to help/implement the feature is a fair answer, if the project committers would accept contributions. What's not fair is claiming that web application testing features are not needed in Selenium/WebDriver, despite years of marketing Selenium as a web application testing tool.

    1. "if the project committers would accept contributions" should read "if the project committers would accept such contributions"

    2. I believe the issue of "whether Selenium is for testing web applications" *is* a separate discussion. Clearly you believe it to be the central issue. It's not one I'm prepared to discuss or debate.

      With respect to the specific issue of HTTP status codes, to date, I've not seen any submissions or contributions from anyone attempting to solve the problem. If you have, and if you feel they've been rejected out of hand without consideration, please point me to the artifacts of that submission.

  4. This is probably too late for anyone to read.. but just in case.. I am using Webdriver and a site I test, upon login, redirects me to another page. The response on login is 302. When I do this manually the site works fine. When I record a script using Selenium IDE and play it back, it works fine (uses clickAndWait). For some reason, despite putting wait code in my java webdriver code, the redirect never occurs upon login and the main login screen shows up again. It's as if the browser that is being driven just doesn't issue the redirect for some reason. I've no idea why this is. I would love to have access to the status code and if it's a 302, grab the location and use webdriver to navigate to that location, but obviously that's impossible. I'd be fine without the codes if the webdriver driven browser would just redirect while my code "waits" for an element to show up.

    If any of you know why this happens.. why selenium ide works, manual works, but webdriver prevents the browser from redirecting.. I'd appreciate it.

    1. First, thanks for taking the time to post a comment. Second, this really isn't the best place to get direct support for an issue like this; one of the user-facing mailing lists like selenium-users (https://groups.google.com/forum/?fromgroups#!forum/selenium-users) or the webdriver list (https://groups.google.com/forum/?fromgroups#!forum/webdriver) would be.

      As for your exact question, redirects via 302 HTTP status codes are not globally broken for every page everywhere. There is an explicit test for this in the WebDriver tests, and it passes on all browsers we test against (http://ci.seleniumhq.org:8080/).

      Without seeing the exact page you're testing against, I could only hazard a wild guess as to the cause. I'd certainly be suspicious of JavaScript running on the login page that doesn't fully initialize the user name and password, so the user name and password you're typing into the login page aren't being sent when the button is clicked.

    2. Hey Jim.. wow.. thought maybe this would never be read. Thank you.

      The site happens to have an option when you register to redirect you to a search page when you log in. Being a JEE web app developer, I am surprised it just doesn't respond with the page directly, but instead this site sends a redirect (302) to the browser so that it can then redirect to the specific search page rather than the default internal home page.

      For the record, I've posted on a few forums, selenium and otherwise, and so far no help or ideas.

      Can you elaborate on how to use Webdriver to test for this?

      Like I said, it's strange to me that Webdriver stops the web browser from redirecting.. if that is what is going on. I would have thought the wait code snippet (or a Thread.sleep()) would work, allowing the browser being driven to carry out the redirect. Looking at web developer I can only see that the POST goes to the server with a 302 response. When I manually do this, the 302 comes back, then a GET request is issued and the search page is returned. When I do this with Webdriver, the 302 is returned but many more GET requests (for js, images, etc) are issued. Finally a last GET to another site is issued.

      I don't know.. it's an odd issue.. but one I fixed by simply switching back to the default home page, and now I am off to control the rest of the site. It may be that this particular web site is written badly or doing something it shouldn't, and that responding properly to it is out of scope for Webdriver. Still, I can't help but wonder why it does this only when driven by Webdriver, while the IDE and doing it manually do not cause this to happen.

      Thanks.

  5. This comment has been removed by a blog administrator.

  6. Hi,
    Well explained about Rantings of a Selenium Contributor. can you explain about What is the difference between Selenium core extensions and Selenium IDE extensions?
    Thanks,
    David,
    Selenium Developer

  7. While I appreciate that getting the response codes is not always possible, the argument that Selenium mimics a real browser user doesn't hold much water.

    Why? Because a computer program is not a person.

    When I browse to my favourite site, to read the news or to watch a funny cat video, I do so with the specific purpose of getting information, or entertainment, or whatever. A computer program going to the same site does so for very different reasons: either to test the site, or to scrape information from it. A computer program doesn't care about what happened in the world, or cat antics.

    A computer program needs to get information from the responses to the requests it makes. And some of the most valuable sources of information are the response codes.

    It's not a problem that not all browsers offer this information. In fact, it's totally irrelevant. If I write a computer program that needs this information, then it's up to me to choose a browser that does. It's not up to Selenium to decide that I can't have this information, because some browsers that I don't care about don't offer it.
