
This week I’ll be zooming in on Sikuli, a testing tool which uses computer vision to aid in verifying UI elements. Sikuli was created by the MIT User Design Group in 2010. The Sikuli IDE allows use of Jython to write simple test cases based on identifying visual elements on the screen, like buttons, and interacting with them, then verifying that other elements look correct. This comes close to a manual tester’s role of visually verifying software and allows test automation without having serious development skills, or knowledge of the underlying code. Sikuli is written on C/C++ and is currently maintained by Raimund Hocke.
If you’ve ever tried to do visual verification as a test automation approach in a web environment, you know that it’s a pretty difficult task. From my own experience of trying to setup visual verification on our video player at SDL Media Manager using Selenium and these Test API Utilities for verification, you will experience issues like:
- different rendering of ui elements in different browsers
- delays or latency makes tests unreliable
- constantly updating web browsers getting ahead of your Selenium drivers
- browsers rendering differently on different OS’s
- browser rendering changes after updates
- interacting with elements that are dynamically loaded on the screen, with non-static identifiers is inconsistent
- creating and maintaining tests requires specialised skills and knowledge.
Sikuli aims to reduce the effort of interacting with the application under test. I have downloaded it to give it a test drive. I decided to try out interacting with our SDL Media Manager video player to take Sikuli through its paces, since I already have some experience testing it with Selenium.
The first thing I had to do was setup the test video I created for video player testing. It’s comprised of static basic shapes on a black background which helps increase repeatability of tests since its hard to get a snapshot at exactly the right moment in the video. The black background also helps with transparency effects. I then started the player running and clicked on the test action buttons in the IDE to try to interact with the player.
Some Sikuli commands:
- wait
- either waits for a certain amount of time, or waits for the pattern specified
- find
- finds and returns a pattern match for whatever you are looking for
- click
- perform a mouse click in the middle of the pattern
I had to play around a bit but this is what finally worked.
The click function was not working on the Mac because the Chrome app was not in focus, so I had to use switchApp. After this the automation seemed to work quite nicely in locating the play/pause button of the player, clicking it to pause, clicking to resume playing, then waiting for the next next part of the video to show, containing the yellow square, and clicking on that to pause the video.
This is what happened when the test did not succeed:
An interesting characteristic of Sikuli is that you can specify how strong a pattern match you would like to trigger a positive result. It uses the OpenCV computer vision API which was built to accelerate adoption of computer perception and which contains hundreds of computer vision algorithms for image processing, video analysis and object and feature detection. It’s built for real-time image and video processing and is pretty powerful. It was created by Intel and can be used in C\C++, Java, Ruby and Python. There is even a wrapper for C# called Emgu CV. Check out the Wiki page for a nice overview.
Traditional automated testing methods which validate web UI element markup might miss issues with browser rendering that would be fairly obvious to a human. Although automated UI tests are costly to setup and maintain, in my opinion, they represent a vital aspect of product quality that could and should be automated.
Sikuli has a lot of potential, especially since it’s built on a solid computer vision API and is actively being maintained. This indicates to me that there is still room for growth in automated visual verification. I would love to hear your stories about visual verification or Sikuli or any ideas you have on this topic. Comment below!