Universal Robot Reading HMI or GUI Tablet Screen

  • I have a machine tending application that requires reading a menu on a Windows tablet GUI in order to select the job to start and then push Start. We already have a UR3 in line with the equipment to move parts on and off once complete, but I cannot find a workaround for autonomous control of the system start-up that avoids this GUI. Obviously we would need to add a "thumb" to the gripper so the UR can interface with the tablet screen, but that is a pretty easy addition once the UR knows where the buttons are via some vision system.


    Has anyone implemented a method of using a mounted camera to read and navigate a simple GUI or know of a company that already has software for this?


    Thanks!

  • There are a couple of variations on this -- first off, if you're working with a Windows interface, it will be a lot more reliable to integrate a software agent on the OS that can control the application (perhaps linking to the robot via Modbus for job selection and job control). Windows has pretty good accessibility APIs for iterating over an application's visual control tree and taking actions.


    If that's absolutely not an option and the robot has to press the buttons, then the next best step might be to integrate a vision camera (Cognex, etc.) with the screen and train it to recognize the patterns of the buttons you're looking for. Provide the X/Y coordinates of your target GUI button to the robot, and the robot can navigate to those coordinates and press the screen.
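    The camera hands the robot pixel coordinates, so something has to convert those into robot-frame coordinates on the screen plane. A minimal sketch of that mapping, assuming the screen is calibrated by jogging the robot to two reference corners and the camera axes are aligned with the screen (so a per-axis scale and offset is enough; lens distortion and skew are ignored, and all names here are hypothetical):

```python
def make_pixel_to_robot(px_ref1, rb_ref1, px_ref2, rb_ref2):
    """Build a pixel->robot mapping from two (pixel, robot) calibration pairs.

    px_ref1/px_ref2: pixel (x, y) of two reference points, e.g. screen corners.
    rb_ref1/rb_ref2: the robot-frame (x, y) touched at those same points.
    """
    sx = (rb_ref2[0] - rb_ref1[0]) / (px_ref2[0] - px_ref1[0])
    sy = (rb_ref2[1] - rb_ref1[1]) / (px_ref2[1] - px_ref1[1])

    def to_robot(px, py):
        # Linear interpolation along each axis from the first reference point.
        x = rb_ref1[0] + (px - px_ref1[0]) * sx
        y = rb_ref1[1] + (py - px_ref1[1]) * sy
        return (x, y)

    return to_robot
```

    With a 1280x800 screen calibrated at two opposite corners, a button found at pixel (640, 400) then maps to the robot-frame midpoint of the screen, and the robot approaches that point along the screen normal.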

  • Thanks for your reply!


    I have much more familiarity with vision system software like Cognex or OpenCV, so that would probably be my preferred method. However, we are stuck with a Windows tablet and a simple program from the manufacturer that controls the equipment, so I would like to look into your solution; do you have any specific links or key terms to look up some tutorials? I am a complete novice with this type of overlay control, but that hasn't stopped me from learning a new skill in the past.


    Of course, time is money, so I would also love to have full examples to work from, or simply to purchase a service or software from a company (for either the overlay control or the robotic reading of and interfacing with the tablet).


    Thanks for any further information!

  • How the tablet app works is obviously a consideration, but generally, anything you would be able to automate with a camera on a Windows PC you should be able to automate in a more robust way by interacting with the Windows app's control tree directly.


    Here is a doc that covers how to iterate a control tree. https://msdn.microsoft.com/en-…op/ee671590(v=vs.85).aspx


    And here's a more in-depth writeup: https://www.codeproject.com/Ar…crosoft-Automation-Framew


    A reasonable play here would probably be a C# app running on the tablet, polling some Modbus register(s) on the UR robot for what to do next (there is a great NModbus library already out there for C#). For a "start machine" type workflow, the app would see the Modbus value that represents "press the start button"; it would then walk the control tree, find the start button control (by text, button size/position, or whatever), and issue a click event for that button. The robot has a built-in Modbus server with a lot of registers reserved for application-specific use, so you can keep it simple that way and have plenty of room for adding registers for status, error codes, etc.
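    To make the polling side concrete, here is a sketch of the Modbus TCP framing the tablet app would use to read a "what to do next" register; the actual app would be C# with NModbus, but the wire format is the same, so this is shown in Python for brevity. The register address (128) and the command semantics are assumptions for illustration, not the UR's documented register map:

```python
import struct

def build_read_request(tx_id, unit_id, start_addr, count):
    """Encode a Modbus TCP read-holding-registers (function 0x03) request."""
    pdu = struct.pack(">BHH", 0x03, start_addr, count)
    # MBAP header: transaction id, protocol id (0), length (unit id + PDU), unit id.
    mbap = struct.pack(">HHHB", tx_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu

def parse_read_response(frame):
    """Decode the 16-bit register values from a function 0x03 response."""
    tx_id, proto, length, unit_id, func, byte_count = struct.unpack(
        ">HHHBBB", frame[:9])
    if func != 0x03:
        raise ValueError("unexpected function code: %#x" % func)
    n = byte_count // 2
    return list(struct.unpack(">%dH" % n, frame[9:9 + 2 * n]))
```

    In the polling loop, the app would send `build_read_request(tx, unit, 128, 1)` to the robot on TCP port 502, parse the reply, and dispatch on the value (e.g. 1 means "press Start via the control tree, then write a status register back").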


    To take it one step further, if this is a wireless remote tablet, I'd take a look at what protocol the vendor's app uses to control the machine (run Wireshark on the tablet to capture the traffic). If it turns out to be a simple socket command, something like "START JOB 3", you might even be able to issue the command directly from the robot using socket scripting and make the whole thing simpler.
