Automating GUI tasks with Sikuli and Jenkins

GUIs have always killed automation. We’re giving automation a fighting chance.

Automating UI interactions in Jenkins jobs. Crazy? Yes. Cool? Oh yes. Useful? I hope not.

Many companies have that application. The old, unavoidable GUI-based beast of legacy code that is essential to their process. Ever dreamt up the perfect automated pipeline, only to be stopped by the beast?

In a perfect world, you would like to add command line support, or just rewrite the damn thing. But that takes too many resources, so you put the pipeline idea on hold and wait for better times. Fortunately, with the aid of some clever tools, it’s actually possible to automate the pain away.

Introducing Sikuli

Sikuli is a visual framework for automating user interface tasks. Scripts are written in Jython and find their targets on screen through image recognition, using a set of very handy and self-explanatory methods.

The built-in IDE makes it easy to take screenshots and use them directly in the code. Here’s an example of a simple script that opens Paint, draws a line and closes the program. Take a look at the video below to see it in action. The accompanying image shows the script being run.

Sikuli script

See the official documentation for details on the different methods.

Invoking a Sikuli script outside the IDE is easy. Sikuli ships in a handy JAR, so simply run java -jar sikulix.jar -r projects/my-script.sikuli and watch the magic happen.
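When that command is called from a build step, it can help to wrap it so a hung script cannot block the job forever. Here’s a minimal sketch in plain Python. It assumes sikulix.jar sits in the working directory and that a failed script shows up as a non-zero exit code. The function names and the ten-minute timeout are my own choices, not part of Sikuli.

```python
import subprocess


def build_command(script, jar="sikulix.jar"):
    """Return the argument list for running a .sikuli script outside the IDE."""
    return ["java", "-jar", jar, "-r", script]


def run_script(script, jar="sikulix.jar", timeout_s=600):
    """Run the script and return its exit code, or 1 if it exceeds the timeout."""
    try:
        result = subprocess.run(build_command(script, jar), timeout=timeout_s)
        return result.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return 1
```

Calling run_script("projects/my-script.sikuli") and passing the result to sys.exit() lets the build step fail when the script does.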

Sikuli in itself is a useful tool for developers tired of clicking the same 20 buttons every time some project needs to run. It removes the risk of human error and is genuinely satisfying to watch.

Going one step further with Jenkins

You might be thinking: “Running Sikuli on Jenkins, are you completely mad?” The answer is “Yes, but…”

In certain unfortunate situations, it’s still useful:

  • When a GUI tool is needed to set up prerequisites for tests or builds to run, such as resetting proprietary hardware or other system-related tasks.
  • When you need to create UI tests for an application that has no command-line options and no element IDs, so a tool like Selenium is not an option.

If a potential Jenkins pipeline is dependent on a GUI application, Sikuli can run directly on a Jenkins slave. Sikuli runs on pretty much any platform.

For Linux-based systems, there are Docker containers that make it possible to run several containerised instances of Sikuli on one machine.

On Windows slaves there are a couple of requirements:

  • A Java installation along with the Sikuli JAR.
  • The Jenkins slave process needs to run as a user and not a service, so the correct desktop environment is allocated. This means that Jenkins must run through cmd at startup. See this post for more details.
  • An active remote desktop session must be open for the desktop to exist. This is solved by installing VNC and having the slave connect to itself.

After following these steps you will be able to invoke any Sikuli script from Jenkins and the world will be open to you.

There are some obvious dangers involved in this, especially when running Sikuli directly on a slave. You need proper exception handling to make sure your slave does not enter an error state, leaving applications open. Solid concurrency controls should also be in place, ensuring that multiple jobs don’t attempt to grab one desktop at the same time. For the latter, I recommend using the throttle concurrent builds plugin.
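The exception-handling half of that advice can be sketched roughly like this. It’s plain Python with the GUI work passed in as callables, so task and cleanup are placeholders for whatever Sikuli calls your script makes. The name run_guarded and the 0/1 exit codes are assumptions for illustration.

```python
import traceback


def run_guarded(task, cleanup):
    """Run task(), then cleanup() no matter what; return 0 on success, 1 on failure."""
    exit_code = 0
    try:
        task()
    except Exception:
        # Log the failure instead of dying with the application still open.
        traceback.print_exc()
        exit_code = 1
    finally:
        try:
            # Always try to close the application so the desktop is left clean.
            cleanup()
        except Exception:
            traceback.print_exc()
            exit_code = 1
    return exit_code
```

The idea is that the script exits with the returned code, so Jenkins marks the build as failed while the next job still finds a clean desktop.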

Enjoy, but don’t overdo it

Image recognition can be fickle. Scripts can break due to interface scaling, window size changes and similar visual differences. They also take a lot of maintenance when your UI changes often or has many execution paths.

Avoid cramming GUI interactions into your pipeline. While it’s tempting to use Sikuli extensively, consider it only as a last resort. You have the power to automate GUIs, to slay the beast, but use it wisely.

Author: Michael Madsen
