The path to learning WebDriver is an interesting and often mind-bending one, so get ready… We are going to encounter some wonderful, wild and wacky things as we trek through the land of Southern Surprises.
You are about to find out why Selenium WebDriver is going to make your life so much better (well, in a QA sense) and why we are calling it The Ruling Champ! To get a grip on the tool and build a test automation framework, it is really important to understand what we are dealing with. So, what are we waiting for? Let us get a good foundation started, now!
What way is simpler to understand than a pictorial representation? That’s how our brain likes to remember things and that’s how we are going to proceed too.
From this picture we can make out that there are 3 layers in this architecture:
- Language level bindings,
- WebDriver API and
- Drivers
Let us talk through these one step at a time. Big Word Alert! Bindings: according to Wikipedia, a binding is a mapping of one thing to another. Just remember these two words: glue code.
There are numerous high level programming languages out there. You might want to use C#, while others might prefer Python, yet everyone wants to use the same WebDriver API to automate the browser in their own language of comfort. This is where language level bindings come into the picture. They are glue code (wrapper libraries) written in each language to communicate with the WebDriver API. Apart from the Java, C#, Ruby and Python bindings, there are many more, and it is quite easy to add new ones as well.
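To make the "glue code" idea a little more concrete, here is a minimal sketch in Java of what a binding roughly does: it turns a friendly method call into a command that can be sent to the browser's driver (more on that part further down). `MiniBinding` and `WireCommand` are hypothetical names made up for this illustration and are not part of Selenium. The paths follow the WebDriver protocol's routes for navigating, reading the title and ending a session, but the real bindings do a lot more, such as creating sessions, encoding JSON properly and handling errors.

```java
/** A command as it would travel to the driver: HTTP method, path and JSON body. */
final class WireCommand {
    final String method;
    final String path;
    final String body;

    WireCommand(String method, String path, String body) {
        this.method = method;
        this.path = path;
        this.body = body;
    }

    @Override
    public String toString() {
        return method + " " + path + (body.isEmpty() ? "" : " " + body);
    }
}

/** Hypothetical, stripped-down binding (an illustration, not Selenium's real code). */
final class MiniBinding {
    private final String sessionId;

    MiniBinding(String sessionId) {
        this.sessionId = sessionId;
    }

    /** Rough equivalent of driver.get(url). No JSON escaping is done in this sketch. */
    WireCommand get(String url) {
        return new WireCommand("POST", "/session/" + sessionId + "/url",
                "{\"url\": \"" + url + "\"}");
    }

    /** Rough equivalent of driver.getTitle(). */
    WireCommand getTitle() {
        return new WireCommand("GET", "/session/" + sessionId + "/title", "");
    }

    /** Rough equivalent of driver.quit(). */
    WireCommand quit() {
        return new WireCommand("DELETE", "/session/" + sessionId, "");
    }

    public static void main(String[] args) {
        MiniBinding driver = new MiniBinding("abc123"); // made-up session id
        System.out.println(driver.get("https://example.com"));
        System.out.println(driver.getTitle());
        System.out.println(driver.quit());
    }
}
```

A binding in C# or Python would expose the same three calls in its own idiom and produce the same commands, which is why everyone can share one WebDriver API.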
Next, on to drivers. The WebDriver API lets us plug in drivers, each of which knows how to drive the particular browser it corresponds to. We have Chrome driver, IE driver, Microsoft Edge driver, Firefox driver (built-in) etc. There are also mobile specific drivers such as iOS-driver, Selendroid (Selenium for Android) etc. A Chrome driver, for example, knows how to drive the Chrome browser to perform low level activities such as manipulating web elements, navigating to web pages, getting user input from them and so on.
We specify the required driver in our code. The driver server comes as an executable file. When we run our tests, the driver server listens on a port on our local machine. It interprets the commands received from the WebDriver API, executes them on the actual browser and returns the results to our code through the API.
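If "listens on a port" sounds abstract, the toy below may help. It is only a sketch using the HTTP server and client that ship with the JDK (Java 11 or later): `FakeDriverServer` is a made-up class that drives no browser at all, and the JSON it returns is a simplified assumption rather than the exact payload a real driver server sends. The one thing it does have in common with a real driver server is the shape of the interaction: a process waits on a local port, receives a request and answers it.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Toy stand-in for a driver server; it drives no browser at all. */
final class FakeDriverServer {
    private final HttpServer server;

    private FakeDriverServer(HttpServer server) {
        this.server = server;
    }

    /** Starts listening on a free local port chosen by the operating system. */
    static FakeDriverServer start() {
        try {
            HttpServer http = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            http.createContext("/status", exchange -> {
                // Simplified, assumed payload; real servers report more detail.
                byte[] body = "{\"value\": {\"ready\": true}}".getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            http.start();
            return new FakeDriverServer(http);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }

    /** Plays the role of our test code: sends GET /status and returns the reply body. */
    static String askStatus(int port) {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://127.0.0.1:" + port + "/status"))
                .GET()
                .build();
        try {
            return HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString())
                    .body();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeDriverServer server = FakeDriverServer.start();
        try {
            System.out.println("Listening on port " + server.port());
            System.out.println(askStatus(server.port()));
        } finally {
            server.stop();
        }
    }
}
```

In real life we never write this server ourselves; the downloaded driver executable plays that part, and the language bindings play the part of `askStatus`.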
Zooming in and putting it all together:
In this series, we are going to program our tests in Java. Think of it as our scripting language for automating the browser. The corresponding Java binding code issues commands to the WebDriver API. All implementations of WebDriver that communicate with the browser use a common wire protocol. The wire protocol is basically a RESTful web service over HTTP, implemented as request/response pairs of “commands” and “responses”. So we can send HTTP requests, such as GET, POST and DELETE, to the driver server. In wire protocol terms, the server is the machine running the RemoteWebDriver; the term can also refer to a browser-specific driver that implements the wire protocol directly, such as the ChromeDriver server for the Chrome browser. When the Java test is run, this server listens and waits for these commands. It interprets them, performs the low level browser activities and sends back an HTTP response message.
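Here is a rough sketch of the "interpret the command and respond" half of that conversation, written as plain Java with no networking so it is easy to read. Everything in it is an assumption made for illustration: `CommandInterpreter` and `FakeBrowser` are invented names, the "browser" only remembers a URL, and the JSON replies are simplified shapes loosely modelled on WebDriver-style responses, not the exact messages a real driver server produces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical in-memory stand-in for a browser; it only remembers the current URL. */
final class FakeBrowser {
    String currentUrl = "about:blank";
}

/** Sketch of how a driver server might interpret incoming requests. */
final class CommandInterpreter {
    private static final Pattern URL_PATH = Pattern.compile("/session/([^/]+)/url");
    private static final Pattern URL_BODY = Pattern.compile("\"url\"\\s*:\\s*\"([^\"]*)\"");

    private final FakeBrowser browser = new FakeBrowser();

    /** Takes one request (method, path, JSON body) and returns a JSON response body. */
    String handle(String method, String path, String body) {
        if (URL_PATH.matcher(path).matches()) {
            if (method.equals("POST")) {
                Matcher m = URL_BODY.matcher(body);
                if (!m.find()) {
                    return "{\"value\": {\"error\": \"invalid argument\"}}";
                }
                browser.currentUrl = m.group(1); // the low level activity: "navigate"
                return "{\"value\": null}";
            }
            if (method.equals("GET")) {
                return "{\"value\": \"" + browser.currentUrl + "\"}";
            }
        }
        return "{\"value\": {\"error\": \"unknown command\"}}";
    }

    public static void main(String[] args) {
        CommandInterpreter server = new CommandInterpreter();
        // Command: navigate. Response: an empty value.
        System.out.println(server.handle("POST", "/session/abc123/url",
                "{\"url\": \"https://example.com\"}"));
        // Command: ask for the current URL. Response: the URL we navigated to.
        System.out.println(server.handle("GET", "/session/abc123/url", ""));
    }
}
```

Each call to `handle` is one request/response pair: the same path means different things depending on the HTTP method, which is the RESTful flavour the wire protocol relies on.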
Language level bindings (issue commands) -> WebDriver common wire protocol (REST based web service over HTTP) -> driver servers (interpret HTTP requests and respond with HTTP response messages)
Do not panic if you are not getting the full picture. Take a break as there are numerous posts to follow and you are sure to get a clear understanding as we move forward.
A heads-up: while setting up WebDriver in Eclipse, you will get to see the actual Java language bindings. Very soon we will download the driver server executable files, include them in our code and automate the browser actions as well. So, cheer up and watch for upcoming posts!
See you soon and have a great day ahead.