Increasing Timeouts

Every command sent to an agent is subject to a timeout. If the agent does not respond in time, Anthill times out the step. These timeouts sometimes need to be extended, for example when an agent is overloaded with work or the network is saturated. The timeout values can be raised to allow more time to pass before a job is marked 'dead'. Timeouts currently work as follows:

  • Send command
  • Wait up to defaultTimeout for acknowledgement that the command was received
  • Loop until command status is "completed":
    • Send request for command status
    • Wait up to cmdResultTimeout for status

NOTE: The agent will try to respond within cmdResultTimeout/2 milliseconds as a safety margin, but the server will wait the full time!
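
The flow above can be sketched in a few lines of Python. This is only an illustration of the sequence described here, not Anthill's actual implementation: the function name run_command and the three callables passed into it are hypothetical stand-ins for the server's messaging layer, and retry handling (maxCmdResultRetries) is not modeled.

```python
def run_command(send, wait_for_ack, request_status,
                default_timeout_ms, cmd_result_timeout_ms):
    """Hypothetical sketch of the send / acknowledge / poll sequence.

    send()                      -- sends the command to the agent
    wait_for_ack(timeout_ms)    -- returns True if the agent acknowledged in time
    request_status(timeout_ms)  -- returns a status string, or None on timeout
    """
    send()

    # Step 2: wait up to defaultTimeout for the acknowledgement.
    if not wait_for_ack(default_timeout_ms):
        raise TimeoutError("no acknowledgement within defaultTimeout")

    # Step 3: poll until the command reports "completed".
    while True:
        status = request_status(cmd_result_timeout_ms)
        if status is None:
            # The server waited the full cmdResultTimeout without an answer.
            raise TimeoutError("no status within cmdResultTimeout")
        if status == "completed":
            return status
```

Each status request gets its own cmdResultTimeout window, so a long-running command does not time out as long as the agent keeps answering status requests in time.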

Server installed.properties

Update these properties in the installed.properties file in the server/conf directory:

install.server.maxCmdResultRetries=10
install.server.defaultTimeout=10000
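
If you prefer to script the change rather than edit the file by hand, something like the following Python helper could work. It is a minimal sketch: the function name set_properties is made up for this example, and it assumes the file uses simple key=value lines. Back up the file before trying it.

```python
from pathlib import Path


def set_properties(path, updates):
    """Set key=value pairs in a .properties file, leaving other lines untouched."""
    path = Path(path)
    remaining = dict(updates)
    out = []
    for line in path.read_text().splitlines():
        is_comment = line.lstrip().startswith(("#", "!"))
        key = line.split("=", 1)[0].strip()
        if "=" in line and not is_comment and key in remaining:
            # Replace the existing value for this key.
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    # Append any properties that were not already present in the file.
    out.extend(f"{key}={value}" for key, value in remaining.items())
    path.write_text("\n".join(out) + "\n")
```

For example, set_properties("server/conf/installed.properties", {"install.server.maxCmdResultRetries": 10, "install.server.defaultTimeout": 10000}) would apply the two values shown above; adjust the path to wherever your server is installed.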


Server system properties

Add these to the ah3server script in your server's bin directory. Edit the script, find the JAVA_OPTS variable, and add options like this:

-Dcom.urbancode.anthill3.maxMissedAnnouncements=10


com.urbancode.anthill3.announceIntervalMillis=30000
com.urbancode.anthill3.maxMissedAnnouncements=2


Agent system properties

Add these to the worker-args.conf file in your agent's bin directory, one per line, like this:

-Dcom.urbancode.anthill3.maxMissedAnnouncements=10

com.urbancode.anthill3.announceIntervalMillis=30000
com.urbancode.anthill3.maxMissedAnnouncements=2
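
Both the server and the agent take these settings as -D system property flags, and only the layout differs: a single string for JAVA_OPTS on the server, one flag per line in worker-args.conf on the agent. The short Python sketch below shows that formatting. The helper name to_jvm_flags and the chosen values are illustrative only, not recommendations.

```python
def to_jvm_flags(props):
    """Format each property as a -Dkey=value system property flag."""
    return [f"-D{key}={value}" for key, value in props.items()]


# Example values only; pick numbers that suit your environment.
props = {
    "com.urbancode.anthill3.announceIntervalMillis": 30000,
    "com.urbancode.anthill3.maxMissedAnnouncements": 10,
}

flags = to_jvm_flags(props)
server_opts = " ".join(flags)   # single string to add to JAVA_OPTS in ah3server
agent_lines = "\n".join(flags)  # one flag per line for worker-args.conf

print(agent_lines)
```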