GUI automation of program with changing menu options - automation

I'm looking to automate software on Windows 2008. The automation software doesn't have to be Windows 2008 compatible (I can use remote desktop).
The GUI has two main areas: a list of embedded images on the left and a display pane on the right. The display pane shows where all the embedded images have been placed on the screen (the program is used for building Human Machine Interfaces [HMIs]).
I need to click each of the embedded images in the list on the left and extract some data from them. The problem is that, depending on the main display file chosen, the list of embedded images will have different names and be of a different length.
The automated task therefore changes depending on which main display file is opened. Is there an automation program that can be customized for this? I could write separate scripts for each main display file, but this defeats the purpose of automating. I looked into Sikuli, AutoIt, pywinauto and others, but have not found examples of what I'm trying to accomplish.

AutoHotkey can do what you're asking for with little difficulty.
You can use some basic OOP principles to write one program that has different clicking locations etc. based on which display file you're running.
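
To illustrate the idea, here is a minimal sketch in Python rather than AutoHotkey. It assumes the list rows are evenly spaced, so click positions can be computed from the row count instead of being hard-coded per display file. `ListPane`, `visit_all`, and the `click` and `extract` callbacks are hypothetical names, not part of any tool mentioned here; the callbacks would wrap whatever your automation tool provides.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ListPane:
    """Hypothetical geometry of the image list on the left (screen pixels)."""
    left: int        # x of the pane's left edge
    top: int         # y of the top edge of the first row
    row_height: int  # height of one list row
    width: int       # width of the pane

    def row_center(self, index: int) -> Tuple[int, int]:
        """Return the screen point in the middle of the given row."""
        x = self.left + self.width // 2
        y = self.top + index * self.row_height + self.row_height // 2
        return (x, y)


def visit_all(pane: ListPane,
              row_count: int,
              click: Callable[[int, int], None],
              extract: Callable[[], object]) -> List[object]:
    """Click every row in turn and collect whatever extract() returns."""
    results = []
    for index in range(row_count):
        x, y = pane.row_center(index)
        click(x, y)                # supplied by the automation tool
        results.append(extract())  # e.g. read a field from the display pane
    return results
```

Only `row_count` (and possibly the pane geometry) changes per display file, so the same script can serve lists of any length.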


Generate a url for a specific text or area in a pdf (like a bookmark)

So that the link can be used in some external applications, and users in that application can click the link and navigate to that specific location (text or area) in the pdf.
Is this possible?
There are several different URL # (fragment) constructs that were introduced by Adobe for use with their web browser PDF viewer plug-in. The current accessible list is at
Many newer competing PDF-enabled browsers may use similar constructs, but there is no guarantee that one URL.pdf#fragment fits all. Most will respect #page=number and a zoom, view or fit setting, but support is very variable, so you need to check each feature across browsers.
It is important to switch off any "remember my view" settings in the PDF viewer.
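
As a rough illustration, the fragment can be assembled with a small helper. This is only a sketch: `pdf_view_url` is a hypothetical name, it covers just the `page` and `zoom=scale,left,top` parameters, and whether a given viewer honours them still has to be checked per browser.

```python
from typing import Optional


def pdf_view_url(base_url: str,
                 page: Optional[int] = None,
                 zoom: Optional[int] = None,
                 left: Optional[int] = None,
                 top: Optional[int] = None) -> str:
    """Append a #page=...&zoom=... fragment to a PDF URL."""
    parts = []
    if page is not None:
        # page goes first so that the zoom offsets apply to that page
        parts.append(f"page={page}")
    if zoom is not None:
        value = f"zoom={zoom}"
        if left is not None and top is not None:
            # optional scroll offsets
            value += f",{left},{top}"
        parts.append(value)
    if not parts:
        return base_url
    return base_url + "#" + "&".join(parts)
```

For example, `pdf_view_url("file:///C:/docs/sample.pdf", zoom=300, left=200, top=200)` yields a URL ending in `#zoom=300,200,200` (the path here is a placeholder).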
As for your earlier request, using bookmarks can be unreliable (as can comment IDs) unless you are using Acrobat and can check the bookmark function. By far the more reliable approach may be to use zoom at a scroll location, so here using MS Edge:
"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe" file:///C:/Users/.../Desktop/CEI-PID-Sample-Rev4annotsC.pdf#zoom=300,200,200
I support SumatraPDF; however, its browser plug-in, whilst still functional, is way past EOL and thus unsupported. It had limited support for those fragments, even though the CLI was recently extended for some Acrobat /Action commands.
However, related to your goal, the single portable EXE can jump to well-formed text blocks. Here I search for "V 4508" in a fairly similar fashion:
"C:\Program Files\SumatraPDF\SumatraPDF.exe" "C:\Users\WDAGUtilityAccount\Desktop\SandBox\apps\Graphics\CEI-PID-Sample-Rev4annotsC.pdf" -zoom 300 -search "V 4508"
And if I press A, the text will be highlighted for comments.
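
If the external application needs to launch the viewer itself, the SumatraPDF command above can be built as an argument list. A minimal sketch, using only the `-zoom` and `-search` flags shown in that command; `sumatra_search_args` is a hypothetical helper name and the paths are placeholders.

```python
from typing import List, Optional


def sumatra_search_args(exe: str,
                        pdf: str,
                        text: str,
                        zoom: Optional[int] = None) -> List[str]:
    """Build the argument list: exe, pdf, optional -zoom, then -search text."""
    args = [exe, pdf]
    if zoom is not None:
        args += ["-zoom", str(zoom)]
    # Keeping the search text as one list item preserves embedded spaces.
    args += ["-search", text]
    return args


if __name__ == "__main__":
    # Placeholder paths; adjust to the local installation and document.
    command = sumatra_search_args(
        r"C:\Program Files\SumatraPDF\SumatraPDF.exe",
        r"C:\docs\sample.pdf",
        "V 4508",
        zoom=300,
    )
    print(command)  # pass the list to subprocess.Popen(command) to launch
```

Passing a list rather than one string avoids quoting problems with spaces in the path or in the search text.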

GUI automation with clicking on text

There are many GUI automation tools that allow clicking on a specified image (well-known Sikuli, for example). Is there any way to click on the specified text, not image? This way the tool will:
make screenshot
recognize text on it
find text position (somehow)
send click event to this position
It would be much easier to write tests using this approach (many interfaces have text button, inputs etc.) rather than make screenshots for every single element.
I've seen an OCR feature in Sikuli, but it didn't work for me (I tried invoking click('some-text-here')).
Sikuli's built-in OCR features are pretty buggy and unstable. All (or at least most) of the related issues are listed in this BUG. However, there are a few possible workarounds, although they are not always applicable.
If the text is known, you can take a screenshot of the text and then look for it as an image. For example, if you know the exact font of this text, you can automatically generate such text on the screen and use it as a pattern to locate it elsewhere.
The built-in Tesseract-based OCR performs significantly better when the font is bigger, "fatter" and (usually) in grayscale. Hence you might do some background image processing before attempting the actual recognition. I used ImageMagick to resize and filter the images for better recognition. It can run in the background as a command line tool. For example:
convert -filter spline -resize 100x -unsharp 10x20 -type Grayscale
I am aware that this does not answer your question directly but these are steps you might consider taking towards the final solution.
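
For reference, the four steps listed in the question (screenshot, recognize, locate, click) can be sketched outside Sikuli. This assumes the third-party packages `pyautogui` and `pytesseract` (plus a Tesseract install) are available, and it only matches a single word exactly. `find_text_center` and `click_text` are hypothetical helper names.

```python
from typing import Dict, List, Optional, Tuple


def find_text_center(ocr: Dict[str, List], target: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first OCR word equal to target, ignoring case.

    `ocr` is expected to have the per-word lists produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT):
    "text", "left", "top", "width" and "height".
    """
    wanted = target.strip().lower()
    for i, word in enumerate(ocr["text"]):
        if word.strip().lower() == wanted:
            x = ocr["left"][i] + ocr["width"][i] // 2
            y = ocr["top"][i] + ocr["height"][i] // 2
            return (x, y)
    return None


def click_text(target: str) -> bool:
    """Screenshot, OCR, locate the word, click it. True if it was found."""
    # Imported here so the pure helper above works without these packages.
    import pyautogui
    import pytesseract

    shot = pyautogui.screenshot()  # step 1: make screenshot
    ocr = pytesseract.image_to_data(  # step 2: recognize text
        shot, output_type=pytesseract.Output.DICT)
    position = find_text_center(ocr, target)  # step 3: find position
    if position is None:
        return False
    pyautogui.click(*position)  # step 4: send click
    return True
```

Recognition quality has the same limits described above, so preprocessing the screenshot (resizing, grayscale) before the OCR call may still be needed.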
I'm a developer at Deskover, and we are currently developing an application, UiPath Studio, that meets your needs.
We provide text recognition on various technologies with 100% accuracy, the ability to find specific text in an area on screen, a control or an entire window, and also the ability to click text or controls.
You can execute different actions sequentially by creating workflows.
We at Deskover are big fans of the Sikuli project. We actually use the same image recognition engine in UiPath Studio.
UiPath Studio is a visual tool that helps you create workflows easily, but you can also use the underlying API and implement an application that extracts text and clicks on it.
You can find more details about the UiPath library here.

PDF Outline Text - Automation of Acrobat Sequences

I have built an application that automates the filling out of form fields inside a pdf. It then takes various assets and combines them together to generate a "print ready" product. All of this is accomplished using the magic of iTextSharp. When form fields are populated, they are then flattened to text. The problem is that even with the fonts embedded they aren't really attached to the form fields in a meaningful way (like straight text elements are) and the printers are complaining that the pdf is generating licensing errors due to this. I researched this a bit and it just seems to be the nature of how form fields are.
The artists we are working with requested that we research a way to "outline" the text that is created from flattening the form fields. I found that running the PDF Optimizer with a custom preset allows for Text Outlining in Acrobat, and even better, I can generate an Acrobat Sequence that runs this command on the pdf. The problem is that Sequences cannot be automated at all.
I found a plug-in called AutoBatch that allows for the execution of Sequences on the command line through a batch file. The downside is that this would require installing Acrobat Pro and the plug-in on the server this application will be running on. Further, it seems like an overkill solution just to outline the text in the pdf. For all I know at this point, iTextSharp may allow me to do this programmatically, but searching for such a thing on Google returns few results and nothing relevant.
So the question: Is there a better way to outline text in a pdf than the current solution I have implemented or am I kind of stuck?
TL;DR: The PDF is generated with non-standard fonts. I need to "outline" this text to send it to the printer. I am currently using the AutoBatch Acrobat plug-in to execute an Acrobat Sequence from the command line. This seems excessive, and I am wondering if anyone knows a better way to automate font outlining.
I am also in a printing environment and have used forms for "Box Covers" plenty of times to shorten the code used to produce box covers.
I simply use "pdfStamper.FormFlattening = true;" and the printers (Xerox DP180 and DC5000) have no problems using the PDF.
The moment I leave out FormFlattening the printer gives a lot of errors regarding the PDF.
If you are using FormFlattening then check if the printer has the font locally installed in order for it to reference the font from the print engine instead of the PDF resources.

How to create a program on top of Visio?

Is it possible to make a standalone program (independent from Visio) that is built on Visio? Say, can I attach some of the design templates and a Visio drawing page to my form?
Visio supports VBA. With that, you can add all kinds of interactivity to your document.
You can also embed Visio in another program with the ActiveX control.
Both of these methods require Visio to be installed on the machine (if that's what you were getting at with the "independent" comment).
The Visio Viewer may or may not install the ActiveX control or support VBA; I don't know.
There's a value stream mapping tool called SigmaFlow VSM that is an application built on Visio like you want to make. Basically the tool loads up Visio and strips out a lot of the Visio toolbars and puts their own UI in. Obviously it requires you have Visio installed.
There's a similar tool called eVSM that leaves the Visio UI in place, but provides a toolbar and templates and stencils for the purpose of building value stream maps.
I prefer the eVSM approach, where you end up giving the user the full ability to do whatever they want within Visio, while making the very specific task of Value Stream Map diagram creation easy.

Automated Development of Presentation with Interactivity

I am trying to identify the right tool, language, software package, or other for the automated development of presentations, where the presentation is user interactive.
The presentation will consist of images with titles and some descriptive text. Most of the time there will be 35–70 images. I would like to show each image on a separate page, slide, tab, etc. (I guess proper terminology depends on the solution.)
The images will change, but the titles will remain the same, and there will be a little bit of change to the description of each image.
After putting the presentation together, I would like the user to be able to circle and "write" on the electronic image in kind of the wax pencil sense (I previously worked in a photo lab and we worked with wax pencils on negatives all the time and would like to have kind of a similar flexibility). Moreover, I would like users to be able to add comments as well, kind of in the way Adobe PDF Professional allows, e.g. inserting bubble comments, etc.
Most importantly, I would like to be able to do this in an automated way. Right now we are using PowerPoint, but the amount of time it is taking to put an image on a slide in PowerPoint, resize it, and then set up the text is killing us. Plus, as the images change it takes tons of time to go back and update them. Thus, we would like something that is a bit faster to update images and get the feedback from our few users. Does not necessarily have to be a web hosted solution, but could be run through a browser.
Sorry this is so long and thanks for any ideas and feedback, especially if there is an existing software package solution, language that can be used, or other approach to get this done.
These days, two of the most popular are Adobe Captivate and Articulate Presenter. If you want a service instead of a product, you can check out services like
I don't know of any product that completely answers your requirements.
But, for similar results I use two different tools for developing the presentations and another one for drawing while presenting.
If I just want to make a presentation made of pictures and text, and I want to automate its creation, I use IrfanView with its wonderful feature for automated slideshows. I put all the images together, annotate them (I use either their filenames or, if that is not enough, the EXIF and comment fields) and create a slideshow that can be compiled into an .exe file.
If I want a more elaborate presentation with full annotation capabilities, I use Wink.
For drawing over the screen during the presentation, I use a very old bitmap drawing program called PC-Draw. With one hotkey it captures the screen as a bitmap so that I can begin drawing over it, and with another hotkey it returns to the original screen without altering the running programs at all. I have not found it anywhere on the web. However, I found similar programs just a quick Google search away.
All three tools are free and easy (and even fun) to use.
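
Since the question allows a solution that runs in a browser, the build step alone (not the drawing or commenting) could also be scripted. A minimal sketch using only the Python standard library: `build_presentation` is a hypothetical helper that turns (title, image path, description) tuples into one HTML page with a section per image. The file names are placeholders.

```python
import html
from pathlib import Path
from typing import Iterable, Tuple


def build_presentation(slides: Iterable[Tuple[str, str, str]]) -> str:
    """Return an HTML page with one <section> per (title, image, description)."""
    sections = []
    for title, image_path, description in slides:
        safe_title = html.escape(title, quote=True)
        safe_src = html.escape(image_path, quote=True)
        sections.append(
            "<section>\n"
            f"  <h2>{safe_title}</h2>\n"
            f'  <img src="{safe_src}" alt="{safe_title}">\n'
            f"  <p>{html.escape(description)}</p>\n"
            "</section>"
        )
    body = "\n".join(sections)
    return f"<!DOCTYPE html>\n<html>\n<body>\n{body}\n</body>\n</html>\n"


if __name__ == "__main__":
    # Placeholder data; in practice the titles stay fixed and only the
    # image paths and descriptions are regenerated for each batch.
    page = build_presentation([
        ("Overview", "images/001.png", "First image of the batch."),
        ("Detail", "images/002.png", "Second image of the batch."),
    ])
    Path("presentation.html").write_text(page, encoding="utf-8")
```

Re-running the script after the images change regenerates every page at once, which is the step that is slow to do by hand in PowerPoint.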