How to Fix Ansible Timeout Error on Ubuntu 22.04
Troubleshooting Guide: Ansible Timeout Error on Ubuntu 22.04
For a Senior DevOps Engineer, an “Ansible Timeout Error” is a common yet frustrating experience. The message is straightforward, but its root cause can be any of several things. This guide walks you systematically through diagnosing and resolving these timeouts when targeting Ubuntu 22.04 hosts from your Ansible controller.
1. The Root Cause: Why this happens on Ubuntu 22.04
An Ansible timeout error indicates that a connection attempt or a task execution exceeded a predefined time limit. On Ubuntu 22.04, this can stem from several common issues:
- Network Latency or Congestion: The most frequent culprit. Slow or unstable network links between your Ansible controller and the Ubuntu 22.04 target host can cause SSH connection attempts or data transfer during task execution to exceed the default timeout.
- Firewall Restrictions (UFW): Ubuntu 22.04 ships with ufw (Uncomplicated Firewall), which is typically installed but inactive by default. Once it is enabled, if port 22 (SSH) isn’t explicitly open, or if other ports required for specific tasks are blocked, Ansible connections will hang and eventually time out.
- SSH Server (sshd) Configuration on Target: The OpenSSH server on Ubuntu 22.04 has its own timeout mechanisms. If ClientAliveInterval or ClientAliveCountMax are too low, inactive connections might be prematurely terminated by the target, leading to Ansible timeouts. Conversely, an overloaded sshd or too many concurrent connections (e.g., hitting the MaxStartups limit) can prevent new connections.
- Target Host Resource Exhaustion: An Ubuntu 22.04 server struggling with high CPU, low memory, or slow disk I/O can delay command execution significantly, causing Ansible tasks to time out even if the initial SSH connection was successful.
- Ansible’s Default Timeout Settings: Ansible itself has a default connection timeout of 10 seconds. For environments with higher latency or complex initial SSH handshakes, this default is often insufficient. Task execution timeouts are generally controlled by the module, by the task-level timeout keyword, or by async operations.
- DNS Resolution Issues: If your controller cannot quickly resolve the hostname of your Ubuntu 22.04 target, the SSH connection setup is delayed, potentially leading to a timeout.
2. Quick Fix (CLI)
Before diving into configuration files, these CLI-based steps can help you quickly identify and often mitigate timeout issues for immediate testing.
- Increase Ansible’s Connection Timeout (Ad-hoc Command): This is the fastest way to test whether a longer timeout resolves the issue.

    # Test with the ping module, setting the SSH connect timeout to 30 seconds
    ansible your_ubuntu_host -m ping -i inventory.ini -e 'ansible_ssh_common_args="-o ConnectTimeout=30"'

  Or, using the general Ansible timeout:

    ANSIBLE_TIMEOUT=30 ansible your_ubuntu_host -m ping -i inventory.ini

  Replace your_ubuntu_host and inventory.ini with your actual host and inventory file. Ansible also passes its own ConnectTimeout to ssh, derived from its timeout setting, so if the first form appears to have no effect, prefer the second.

- Verify SSH Connectivity Manually: Attempt to connect directly to the Ubuntu 22.04 host using SSH from your Ansible controller. Add verbose output (-vvv) for detailed debugging.

    ssh -vvv user@your_ubuntu_host

  Look for hangs, specific error messages, or delays in the output. This helps differentiate between an Ansible-specific issue and a fundamental SSH connectivity problem.

- Check Network Reachability: A simple ping can confirm basic network connectivity.

    ping -c 4 your_ubuntu_host

  High packet loss or very high latency (e.g., >200ms) indicates a network issue that needs addressing outside of Ansible.
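The ping check above can be turned into a repeatable verdict. The following is a minimal sketch, assuming the iputils ping output format found on Ubuntu; summarize_ping is a hypothetical helper name, and the 200 ms default simply mirrors the example threshold in this guide.

```shell
#!/bin/sh
# Sketch only: summarize_ping is a hypothetical helper, not part of Ansible or ping.
# It reads `ping -c N host` output (iputils format) on stdin and prints
#   loss=<percent> avg_ms=<average rtt> verdict=<ok|degraded>
# The optional first argument is the latency limit in ms (assumed default: 200).
summarize_ping() {
    awk -v limit="${1:-200}" '
        / packet loss/ {
            # e.g. "4 packets transmitted, 4 received, 0% packet loss, time 3004ms"
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
        }
        /^rtt / {
            # e.g. "rtt min/avg/max/mdev = 0.031/0.040/0.052/0.008 ms"
            split($4, parts, "/")
            avg = parts[2]
        }
        END {
            if (loss == "") { print "verdict=unparsed"; exit 1 }
            if (avg == "") avg = "n/a"   # no rtt line: nothing came back
            verdict = (loss + 0 > 0 || avg == "n/a" || avg + 0 > limit) ? "degraded" : "ok"
            printf "loss=%s avg_ms=%s verdict=%s\n", loss, avg, verdict
        }
    '
}
```

Typical use would be ping -c 4 your_ubuntu_host | summarize_ping. A degraded verdict suggests fixing the network path before tuning any Ansible setting.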
3. Configuration Check
For persistent solutions, you’ll need to adjust configuration files on both your Ansible controller and the target Ubuntu 22.04 hosts.
3.1. Ansible Controller Configuration (ansible.cfg)
Edit your global ansible.cfg (e.g., /etc/ansible/ansible.cfg) or a project-specific ansible.cfg file.
- Increase timeout: Modify the timeout parameter in the [defaults] section to a higher value (e.g., 30 or 60 seconds).

    # /etc/ansible/ansible.cfg or project_path/ansible.cfg
    [defaults]
    timeout = 30

  Explanation: This sets the default timeout, in seconds, used by connection plugins, which covers connection establishment.

- Configure SSH Arguments for Connection Stability: In the [ssh_connection] section, you can add specific SSH client arguments.

    [ssh_connection]
    # Use ControlPersist for faster subsequent connections
    ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=publickey -o ConnectTimeout=30
    # Enable pipelining for faster execution of tasks
    # Ensure 'requiretty' is disabled in sudoers if using pipelining with become
    pipelining = True

  Explanation:
  - ControlMaster=auto and ControlPersist=60s: These reuse an existing SSH connection, significantly speeding up subsequent tasks and reducing the chance of a timeout on new connections. 60s keeps the master connection open for 60 seconds of inactivity.
  - ConnectTimeout=30: Explicitly sets the SSH client-side connection timeout to 30 seconds.
  - pipelining = True: Reduces the number of SSH operations per task by sending the module to the remote interpreter over the existing SSH session instead of copying it to a temporary file first. Requires requiretty to be off for sudo on the target if using become.
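Ansible loads only the first configuration file it finds, so a change to /etc/ansible/ansible.cfg has no effect when a project-level ansible.cfg exists. The sketch below is one way to check which file would win and what it sets. Both helper names are hypothetical, the search order follows Ansible's documented precedence (ANSIBLE_CONFIG, ./ansible.cfg, ~/.ansible.cfg, /etc/ansible/ansible.cfg), and only key = value lines are handled.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and simplify what Ansible does.
# Not modelled: Ansible ignoring ./ansible.cfg in a world-writable directory.

# Print the first ansible.cfg that exists, in Ansible's search order.
active_ansible_cfg() {
    for candidate in "${ANSIBLE_CONFIG:-}" ./ansible.cfg "$HOME/.ansible.cfg" /etc/ansible/ansible.cfg; do
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# cfg_get FILE SECTION KEY: print the value of KEY inside [SECTION].
# Returns non-zero if the key is absent from that section.
cfg_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*[#;]/ { next }                  # skip comment lines
        /^[ \t]*\[/ {                           # section header such as [defaults]
            name = $0
            gsub(/^[ \t]*\[|\][ \t]*$/, "", name)
            in_section = (name == section)
            next
        }
        in_section {
            eq = index($0, "=")
            if (eq == 0) next
            k = substr($0, 1, eq - 1)
            v = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) { value = v; found = 1 }   # last assignment wins
        }
        END { if (found) print value; else exit 1 }
    ' "$1"
}
```

For example, cfg=$(active_ansible_cfg) && cfg_get "$cfg" defaults timeout prints the timeout from the file in effect. For the authoritative view of every non-default setting, ansible-config dump --only-changed is the better tool; this sketch only answers which file is being read.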
3.2. Target Ubuntu 22.04 Host SSH Server Configuration (sshd_config)
If the issue is dropped connections, you might need to adjust the server-side SSH keep-alive settings on your Ubuntu 22.04 hosts.
- Edit sshd_config:

    sudo nano /etc/ssh/sshd_config

  Add or modify the following lines:

    # /etc/ssh/sshd_config
    ClientAliveInterval 60
    ClientAliveCountMax 5

  Explanation:
  - ClientAliveInterval 60: If sshd has received no data from the client for 60 seconds, it sends a keep-alive message through the encrypted channel and expects a response. This keeps the connection alive through network devices that might otherwise close idle connections.
  - ClientAliveCountMax 5: If 5 consecutive client alive messages are sent without a response from the client, sshd will disconnect the client.
  - Combined, this means the connection will be terminated after 60 * 5 = 300 seconds (5 minutes) of unresponsiveness. Adjust ClientAliveCountMax higher if your network is extremely unreliable, but be mindful of resource usage.

- Restart SSH Service: After modifying sshd_config, you must restart the SSH service for changes to take effect. On Ubuntu 22.04 the systemd unit is named ssh.

    sudo systemctl restart ssh
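Before restarting, it can help to validate the file and confirm the values sshd will actually use. sshd -t checks the configuration for errors, and sshd -T prints the effective settings with lowercase keywords. The sketch below computes the 60 * 5 = 300 second window from that output; client_alive_window is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: client_alive_window is a hypothetical helper.
# It reads `sshd -T` style output on stdin and prints
# ClientAliveInterval * ClientAliveCountMax in seconds.
# A result of 0 means server-side keep-alives are disabled.
client_alive_window() {
    awk '
        tolower($1) == "clientaliveinterval" { interval = $2 }
        tolower($1) == "clientalivecountmax" { count = $2 }
        END {
            if (interval == "" || count == "") exit 1   # settings not found in input
            print interval * count
        }
    '
}
```

A possible sequence on the target host is sudo sshd -t && sudo sshd -T | client_alive_window, followed by the restart only if the first command reports no errors. This avoids locking yourself out with a broken sshd_config.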
3.3. Target Ubuntu 22.04 Host Firewall (UFW)
Ensure SSH traffic is explicitly allowed on your Ubuntu 22.04 hosts.
- Check UFW Status:

    sudo ufw status

  Look for a rule allowing OpenSSH or port 22.

- Allow OpenSSH: If it is not allowed, add the rule:

    sudo ufw allow OpenSSH
    # or, if you prefer, by port
    # sudo ufw allow 22/tcp

- Enable UFW (if disabled): If UFW is inactive, it is not filtering traffic and is unlikely to be the cause of the timeout. If you do enable it, be careful: this activates all configured rules, so make sure the SSH rule above is in place first.

    sudo ufw enable
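The status check can be scripted so it is easy to run across several hosts. This is a minimal sketch that assumes the usual column layout of ufw status (To, Action, From); ufw_ssh_state is a hypothetical helper name. It reports a LIMIT rule separately because ufw's rate limiting denies an address that opens 6 or more connections within 30 seconds, which a busy Ansible run without connection reuse could plausibly trigger.

```shell
#!/bin/sh
# Sketch only: ufw_ssh_state is a hypothetical helper.
# It reads `ufw status` output on stdin and prints one of:
#   inactive | allowed | limited | blocked | unknown
ufw_ssh_state() {
    awk '
        /^Status: inactive/ { inactive = 1 }
        /^Status: active/   { active = 1 }
        # Rule rows look like: "OpenSSH   ALLOW   Anywhere" or "22/tcp   LIMIT   Anywhere"
        active && ($1 == "OpenSSH" || $1 == "22" || $1 == "22/tcp") {
            if ($0 ~ /LIMIT/) limited = 1
            else if ($0 ~ /ALLOW/) allowed = 1
        }
        END {
            if (inactive)     print "inactive"
            else if (!active) { print "unknown"; exit 1 }
            else if (limited) print "limited"
            else if (allowed) print "allowed"
            else              print "blocked"
        }
    '
}
```

On a target host this would be run as sudo ufw status | ufw_ssh_state. A result of blocked means the allow rule above is still needed, and limited is worth a closer look if timeouts appear only under many parallel connections.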
4. Verification
After making changes, it’s crucial to verify that the timeout issues are resolved.
- Rerun Your Original Playbook/Command: The most direct way to verify is to run the exact Ansible playbook or ad-hoc command that was previously timing out.

    # If using a playbook
    ansible-playbook your_playbook.yml -i inventory.ini

    # If using an ad-hoc command (or your specific module)
    ansible your_ubuntu_host -m setup -i inventory.ini

- Monitor with Increased Verbosity: If the issue persists, run your Ansible command with -vvv for detailed output. This can reveal exactly where the process is hanging.

    ansible-playbook your_playbook.yml -i inventory.ini -vvv

  Examine the output for clues like “Authentication failed,” “Connection refused,” or long pauses at specific task points.

- Check ControlPersist (if enabled): If you enabled ControlPersist, check for control sockets shortly after a successful run:

    # The default control path directory is ~/.ansible/cp
    ls -l ~/.ansible/cp/

  Socket files in this directory indicate that ControlPersist is active and should speed up subsequent connections. They disappear once the ControlPersist idle period (60 seconds in the example above) expires, and their names vary by Ansible version.

- Observe System Logs: On the target Ubuntu 22.04 host, follow the SSH daemon logs for connection issues. The unit is named ssh on Ubuntu.

    sudo journalctl -u ssh -f

  Look for messages indicating failed authentication, dropped connections, or MaxStartups warnings.
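When the log is long, counting the relevant messages can be quicker than reading it. The sketch below is a rough tally, not a parser: summarize_sshd_log is a hypothetical helper, and the substrings it matches are assumptions based on typical OpenSSH log messages, so the wording may differ between versions.

```shell
#!/bin/sh
# Sketch only: summarize_sshd_log is a hypothetical helper.
# It reads sshd log lines on stdin and counts three kinds of events.
# The matched substrings are assumed from typical OpenSSH messages.
summarize_sshd_log() {
    awk '
        /MaxStartups/                    { throttled++ }   # connection throttling or drops
        /Timeout, client not responding/ { keepalive++ }   # ClientAlive limit reached
        /Failed (password|publickey)/    { authfail++ }    # authentication failures
        END {
            printf "maxstartups=%d keepalive_timeouts=%d auth_failures=%d\n", throttled, keepalive, authfail
        }
    '
}
```

One way to use it is sudo journalctl -u ssh --since "1 hour ago" --no-pager | summarize_sshd_log. Non-zero maxstartups counts point at too many concurrent connections (consider lowering Ansible forks or raising MaxStartups), while keepalive_timeouts point back at the ClientAlive settings in section 3.2.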
By systematically applying these steps, you should be able to diagnose and resolve most Ansible timeout errors when working with Ubuntu 22.04 hosts. Remember to test changes incrementally and revert if they introduce new issues.