Sporadic EKI and Internal file error.

  • I've been encountering a sporadic EKI error state when leaving our system unattended overnight.

    I have a loop running in the SPS that uses Ethernet.KRL to try to open a connection to a local server and then waits to be sent a command. I've found that when the KUKA side of the system is left on (but no commands are being sent, or the server is off), the following errors sometimes occur after a number of hours. To be clear, this is when the SPS is running but no other programs are selected.

    First error:  EKI00022: Error while reading the configuration. XML not valid. Originator: <EthernetKRL> 

    Upon clearing error 1: EKI00003: File access failed Originator: <EthernetKRL>

    Upon clearing error 2:  KSS02804: Internal error (file: ads_in.c, line 689, value: 'H5BF8')  (yikes)

    I've verified that the EKI XML file is correct, and a reboot clears all the errors, so I assume the XML error is a symptom of something else going wrong.

    KUKA support informed me that the KSS error means a program has more than 9500 lines, but they were unable to tell me what the function of ads_in.c was.

    Does anyone know what could be causing this? I've included a stripped down (but still rather impenetrable) version of the SPS file below, along with the software versions.

    What's confusing me is the randomness of the error: it only ever occurs when the system has been left inactive, but the time to failure can range from 6 to 50 hours.

    I've tested to see if it's caused by the spam of MsgNotify() calls, but I've successfully called it far more than 9500 times without the error presenting.

    My next theory was that I was somehow not correctly clearing the connection or RET object. Any help would be appreciated!

    Robot: KR6 R900-2

    Controller: KRC4 Compact

    KSS: V8.3.36

    Kernel: KS V8.3.403

    Ethernet.KRL: Version 2.2.8

    The server is just a Linux NUC, but since this error presents when the NUC is off, I haven't included details on it.

  • hello and welcome to robot-forum

    first of all, nice job posting version numbers.

    you did not post communication settings of your XML configuration so there is no telling what the environment is, if alive flag is configured etc.

    it is clear that SPS is acting as Client and normally it is the Client that starts.. and stops connection... normally... but connection can be dropped for various reasons - maybe your Server gets restarted and by the looks of your SPS, robot is not aware of that, keeps assuming that connection is open since flagConnected is still true.

    to be honest i only skimmed through the code so my observations may not be very accurate but i think i see the issue:

    i see no error handling or reconnecting method. flag Alive is not there at all so how is SPS supposed to know if connection is lost?

    also i don't see your flags ever initialized (i keep pestering others with initializing variables). basically when robot is booted and SPS starts, $FLAG[flagConnected] is FALSE... accidentally (it just happens to be that, this is not because programmer made sure that is the case). so thanks to that accident, EKI code has a chance to connect to server - once! that's it... after that SPS assumes that it is connected since nothing ever resets that flag - except reboot.

    this also means that manually restarting SPS is of no use in this case, so clearly someone would rather do a longer reboot than restart the program...

    and manual reset does not work since $FLAG[flagConnected] is not visible globally. more precisely $FLAG[] is global but flagConnected is a runtime variable. one could reset $FLAG[] manually if one knows what specific element of that array is assigned to flagConnected.
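
    The logic described above (initialize the flag explicitly, watch an alive signal, reset the flag when the link drops) can be sketched as a small model. This is Python used only to show the control flow; it is not KRL and does not call the EKI API. The names `ConnectionSupervisor`, `try_open` and `is_alive` are hypothetical stand-ins for whatever the real submit code uses to open the channel and read the alive flag.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConnectionSupervisor:
    """Toy model of the submit-loop logic; not KRL and not the EKI API."""
    try_open: Callable[[], bool]   # hypothetical: True if the open attempt succeeded
    is_alive: Callable[[], bool]   # hypothetical: mirrors an "alive" flag or output
    connected: bool = False        # explicitly initialized, never left to chance

    def cycle(self) -> str:
        """Run one pass of the loop and report what happened."""
        if self.connected and not self.is_alive():
            # Link dropped: reset the flag so the next pass reconnects.
            self.connected = False
            return "lost"
        if not self.connected:
            self.connected = self.try_open()
            return "opened" if self.connected else "retry"
        return "idle"


# Simulate a server that comes up and later goes away again.
server_up = {"value": False}
sup = ConnectionSupervisor(
    try_open=lambda: server_up["value"],
    is_alive=lambda: server_up["value"],
)
events = [sup.cycle()]        # server off
server_up["value"] = True
events.append(sup.cycle())    # server came up
events.append(sup.cycle())    # nothing to do
server_up["value"] = False
events.append(sup.cycle())    # link dropped, flag reset
events.append(sup.cycle())    # reconnect attempt fails, keeps retrying
print(events)  # ['retry', 'opened', 'idle', 'lost', 'retry']
```

    The point of the model is that the flag is set in exactly one place and cleared in exactly one place, so restarting the loop always begins from a known state instead of from whatever value the flag happened to hold.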

    i would add that MsgNotify is not a (real) problem here. it can be used ... a lot.... because notification messages (unlike all other message types) do not get stored in the message buffer (which can only handle up to 100 messages). i used to generate notification messages in a loop for hours, without issue (100000+ messages)

    Thing to note here is that MsgNotify does not put messages into logbook database ... unless one explicitly modifies MsgLib.src or uses message option parameter. which this code does not... so if you acknowledge messages, or reboot for example, all your messages are gone. cannot go back in time to see what happened and when, and therefore they will not make it into archive either.

    finally i would add that i am not much of a fan of serious demolition/reconstruction of the standard files, including main submit (SPS.SUB). i think it is better to place user code into a separate module(s), then just call module from SPS. it is not only more portable but also easier to comment out when not needed and - far far easier to test since it could be called from any interpreter. for example instead of calling your module from SPS, you can call it from a robot program or - run it as a robot program. then you can step through code, view runtime variable etc. once you are happy with it, call it from SPS or one of extended submits.

  • On some older versions of EKI, I could reliably cause the System Fault (forcing a cold boot) by "hammering" the socket interface too often. I had to add some deliberate time delays between Open and Close events, and also between successive Send events during the same Open period.
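
    One way to picture the "deliberate time delays" idea is a gate that refuses an action until a minimum interval has passed since the last one. The sketch below is Python for illustration only, not KRL; `MinInterval` is a hypothetical helper and the 2 second spacing is an arbitrary example value, not a figure from KUKA. On the controller the same effect would come from a timer or wait in the submit loop.

```python
import time


class MinInterval:
    """Allow an action only if at least `seconds` have passed since the last one."""

    def __init__(self, seconds: float, clock=time.monotonic):
        self.seconds = seconds
        self.clock = clock   # injectable so the logic can be checked without waiting
        self.last = None     # no action performed yet

    def ready(self) -> bool:
        now = self.clock()
        if self.last is None or now - self.last >= self.seconds:
            self.last = now
            return True
        return False


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
open_gate = MinInterval(2.0, clock=lambda: fake_now[0])

results = []
for t in (0.0, 0.5, 1.9, 2.0, 3.0, 4.0):
    fake_now[0] = t
    results.append(open_gate.ready())
print(results)  # [True, False, False, True, False, True]
```

    A separate gate per operation (one for Open, one for Close, one for Send) keeps the spacings independent of each other.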

    Another thing was the Error and Warning configuration in my channel XML files. It turned out (don't know if this is still the case) that one setting would make the Error counter in the Diagnostic Monitor keep counting up over time, and eventually, when that counter maxed out, it would 100% cause the System Fault. We eventually fixed this by changing the Error/warning settings in the XML file.

    ads_in.c is part of the KSS operating system (well, probably part of VxWorks underlying KSS), and isn't user accessible. I suspect it has an unhandled counter rollover condition, but it was never fixed on the EKI versions that I was using back then. I don't know if it's been fixed in later updates or not.
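
    A rough way to picture the rollover theory, and why the time to failure could vary so much: if some internal counter only advances when an error or failed attempt occurs, the hours until it hits its limit depend entirely on how often that happens. The counter width and rates below are purely hypothetical (nothing here is known about ads_in.c); the Python sketch only shows the arithmetic.

```python
def hours_until_limit(limit: int, increments_per_hour: float) -> float:
    """Time for a counter starting at 0 to reach `limit` at a steady rate."""
    if increments_per_hour <= 0:
        raise ValueError("rate must be positive")
    return limit / increments_per_hour


def wrap_increment(value: int, bits: int) -> int:
    """Increment an unsigned counter of the given width, wrapping to 0 on overflow."""
    return (value + 1) % (1 << bits)


# Hypothetical numbers, chosen only to show the shape of the problem.
LIMIT_16_BIT = (1 << 16) - 1                              # 65535
print(wrap_increment(LIMIT_16_BIT, 16))                   # 0: the counter wraps
print(round(hours_until_limit(LIMIT_16_BIT, 3600), 1))    # 18.2 at one increment per second
print(round(hours_until_limit(LIMIT_16_BIT, 1200), 1))    # 54.6 at one increment every 3 seconds
```

    Under these assumed numbers, a threefold change in how often the counter advances moves the failure from under a day to over two days, which is the kind of spread a sporadic overnight fault would show.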
