Dec 24, 2025

How to Fix Ansible Timeout Error on CentOS 7

Troubleshooting Ansible Timeout Errors on CentOS 7

Few things frustrate a DevOps engineer more than an Ansible run stalling with a “timeout” error. The message is direct, but the underlying causes range from simple network hiccups to subtle system configuration issues. This guide focuses on diagnosing and resolving Ansible timeout issues when targeting CentOS 7 hosts.

1. The Root Cause: Why this Happens on CentOS 7

An Ansible timeout error signifies that the Ansible control node failed to establish an SSH connection or receive a response from the target host within the allotted time. On CentOS 7, several factors frequently contribute to this:

Aggressive Firewall Rules (firewalld): CentOS 7 utilizes firewalld by default. A common scenario is that the target host’s firewall is blocking incoming SSH connections (port 22). This is especially true for freshly provisioned VMs or servers.
SSH Daemon Configuration (sshd_config): The SSH server configuration on a CentOS 7 host might set a short LoginGraceTime (how long the server waits for a client to authenticate) or leave ClientAliveInterval and ClientAliveCountMax (the server-side keep-alive settings) unset. Either can cause dropped connections during prolonged operations, or timeouts during the initial handshake if the network is slow.
Network Latency or Packet Loss: High latency or significant packet loss between the Ansible control node and the CentOS 7 target can naturally lead to timeouts as SSH handshakes and command executions take longer than expected.
SELinux Interference: While less common for a pure “timeout” during initial connection, SELinux in enforcing mode can block SSH or subsequent Ansible operations, causing commands to hang and eventually time out if it prevents necessary file access or process execution.
Target Host Resource Exhaustion: If the CentOS 7 target host is under severe load (CPU, memory, I/O), its ability to respond to SSH connection attempts or execute commands promptly can be impaired, leading to timeouts.
Ansible Control Node SSH Client/Ansible Configuration: The Ansible control node itself might have a low default SSH timeout in its ansible.cfg or the underlying SSH client configuration (~/.ssh/config or /etc/ssh/ssh_config).

2. Quick Fix (CLI)

These commands are for immediate diagnosis and temporary resolution. Use with caution in production environments.

Verify Basic Connectivity: First, ensure the target host is reachable on the network and SSH port.
```
# From your Ansible Control Node
ping <target_host_ip>
telnet <target_host_ip> 22
ssh -v <user>@<target_host_ip>
```
- ping: Checks basic ICMP reachability.
- telnet: Confirms if port 22 is open and listening. A “Connected” message is good.
- ssh -v: Provides verbose output for the SSH connection attempt, invaluable for diagnosing SSH handshake issues. Look for “Permission denied”, “Connection timed out”, or “Connection refused” messages.
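On minimal control nodes telnet is often not installed. As a rough alternative, the same port check can be sketched with bash's /dev/tcp redirection and the coreutils timeout command. The function name check_tcp and its 5-second default are illustrative choices, not part of any tool:

```shell
# Hypothetical helper: report whether a TCP port accepts connections.
# Usage: check_tcp <host> [port] [seconds]
check_tcp() {
    local host=$1 port=${2:-22} wait=${3:-5} rc
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>;
    # timeout aborts the attempt (exit code 124) if it takes too long.
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
    rc=$?
    case $rc in
        0)   echo "open" ;;
        124) echo "timed out" ;;
        *)   echo "closed" ;;
    esac
    return "$rc"
}
```

For example, check_tcp <target_host_ip> 22 prints one word. A result of “timed out” usually means packets are being silently dropped somewhere on the path, while “closed” means the attempt failed immediately (refused or rejected).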
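Temporarily Disable Firewall (on Target CentOS 7 Host): If telnet hangs or ssh -v reports “Connection timed out” or “No route to host”, the firewall is a prime suspect. An immediate “Connection refused” more often means sshd is not running or not listening on port 22.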
```
# On the Target CentOS 7 Host (via console or alternative access)
sudo systemctl stop firewalld
sudo systemctl disable firewalld # Optional, but prevents it from starting on reboot
```
- Caution: This completely disables the firewall. Do not do this permanently in a production environment.
Temporarily Set SELinux to Permissive Mode (on Target CentOS 7 Host): If the connection establishes but Ansible tasks hang, SELinux might be interfering.
```
# On the Target CentOS 7 Host (via console or alternative access)
sudo setenforce 0
```
- Caution: This sets SELinux to permissive mode, logging but not enforcing security policies. Do not do this permanently in a production environment without proper analysis.
Increase Ad-Hoc Ansible SSH Timeout: For a quick test, you can override the SSH timeout for a specific Ansible command.
```
# From your Ansible Control Node
ansible -m ping <target_host_group_or_name> -i /path/to/inventory.ini -e ansible_ssh_timeout=30
```
- The ansible_ssh_timeout variable sets the SSH connection timeout in seconds. Try increasing it from the default (10 seconds) to 30 or 60.
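The ansible command also accepts -T/--timeout to override the connection timeout for a single run. To find out roughly how much headroom a slow host needs, the ping can be retried with increasing values. This is a minimal sketch; the function name probe_timeout and the 10/30/60 steps are assumptions chosen for illustration:

```shell
# Hypothetical helper: print the first timeout (in seconds) at which
# an ad-hoc ping succeeds, or return non-zero if none of them works.
# Usage: probe_timeout <host_or_group> <inventory_path>
probe_timeout() {
    local target=$1 inventory=$2 t
    for t in 10 30 60; do
        # -T overrides the connection timeout for this invocation only
        if ansible "$target" -i "$inventory" -m ping -T "$t" >/dev/null 2>&1; then
            echo "$t"
            return 0
        fi
    done
    return 1
}
```

If the ping only succeeds at 30 or 60, that value is a reasonable starting point for the permanent settings in the next section. If it fails at every step, the problem is unlikely to be the timeout value itself.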

3. Configuration Check

For robust, permanent solutions, modify the relevant configuration files.

Target CentOS 7 Host - Firewall (firewalld): Instead of disabling, properly open the SSH port.
```
# On the Target CentOS 7 Host
sudo firewall-cmd --add-service=ssh --permanent
sudo firewall-cmd --reload
```
- These commands add the SSH service to the permanent configuration of the default firewall zone (pass --zone=<zone> to target a different one), then reload the rules without dropping established connections.
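To confirm the rule took effect, firewall-cmd can be asked directly: --query-service exits with status 0 when the service is allowed. The sketch below checks both the running rules and the permanent configuration. The function name ssh_rule_status is hypothetical:

```shell
# Hypothetical helper: report whether ssh is allowed in the default zone,
# both in the running rules and in the permanent configuration.
ssh_rule_status() {
    local runtime=no permanent=no
    # --query-service exits 0 if the service is enabled, non-zero otherwise
    sudo firewall-cmd --query-service=ssh >/dev/null 2>&1 && runtime=yes
    sudo firewall-cmd --permanent --query-service=ssh >/dev/null 2>&1 && permanent=yes
    echo "runtime=$runtime permanent=$permanent"
    [ "$runtime" = yes ] && [ "$permanent" = yes ]
}
```

A result of runtime=no permanent=yes typically means the reload step was skipped. The reverse means the rule will be lost on the next reload or reboot.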
Target CentOS 7 Host - SSH Daemon (/etc/ssh/sshd_config): Edit this file to adjust connection parameters.
```
# On the Target CentOS 7 Host
sudo vi /etc/ssh/sshd_config
```
Add or modify the following lines:
- LoginGraceTime 60s: Sets how long a client has to authenticate after connecting. The OpenSSH default is 120 seconds, so 60s is only an increase if your configuration was previously tightened below that; otherwise leave the default in place.
- ClientAliveInterval 30: If no data has been received from the client for 30 seconds, the server sends a keep-alive message through the encrypted channel and waits for a response. This keeps the connection alive.
- ClientAliveCountMax 5: If 5 consecutive keep-alive messages go unanswered, the server disconnects the client (30s * 5 = 150s of unresponsiveness before disconnect).
After editing, restart the SSH service:
```
sudo systemctl restart sshd
```
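A syntax error in sshd_config can leave sshd unable to start, which would lock you out of the host, so it is worth validating the file before restarting. The sketch below uses sshd -t (OpenSSH's configuration test mode). The function names are hypothetical, and keepalive_window simply reproduces the interval-times-count arithmetic from the list above:

```shell
# Seconds an unresponsive client is tolerated:
# ClientAliveInterval * ClientAliveCountMax
keepalive_window() {
    echo $(( $1 * $2 ))
}

# Validate sshd_config first; restart only if the syntax check passes.
safe_restart_sshd() {
    if sudo sshd -t; then
        sudo systemctl restart sshd
    else
        echo "sshd_config has errors; sshd was not restarted" >&2
        return 1
    fi
}
```

keepalive_window 30 5 prints 150. After a successful restart, sudo sshd -T prints the effective configuration, which can be filtered for logingracetime and clientalive to confirm the new values are in use.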
Ansible Control Node - Global Ansible Configuration (/etc/ansible/ansible.cfg or ~/.ansible.cfg): Modify the global timeout settings for Ansible.
```
# On the Ansible Control Node
vi /etc/ansible/ansible.cfg
```
Under the [defaults] and [ssh_connection] sections, add or modify:
```
[defaults]
timeout = 30

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=publickey
```
- timeout: This is Ansible’s default connection timeout in seconds. Increase it to 30, 60, or even 120 for very slow networks.
- ssh_args:
  - ControlMaster=auto and ControlPersist=60s: These settings allow Ansible to reuse existing SSH connections for subsequent tasks within a playbook run, significantly reducing connection overhead and potential timeouts. ControlPersist specifies how long the master connection remains open after the last client connection closes.
  - PreferredAuthentications=publickey: Restricts the SSH client to key-based authentication, so it does not spend time attempting other methods first.
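Ansible reads only one configuration file, chosen in this order: the ANSIBLE_CONFIG environment variable, ansible.cfg in the current directory, ~/.ansible.cfg, then /etc/ansible/ansible.cfg. Edits to the global file are therefore silently ignored if a project-level ansible.cfg exists. A small sketch to confirm which file is active, using the "config file" line that ansible --version prints (the function name active_ansible_cfg is hypothetical):

```shell
# Hypothetical helper: print the path of the ansible.cfg currently in use.
# Run it from the directory where you normally launch your playbooks.
active_ansible_cfg() {
    # ansible --version includes a line such as:
    #   config file = /etc/ansible/ansible.cfg
    ansible --version | sed -n 's/^ *config file = //p'
}
```

If the printed path is not the file you edited, apply the timeout and ssh_args settings to the file it reports instead.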
Ansible Control Node - Inventory File (/path/to/inventory.ini): You can set host-specific timeouts in your inventory for problematic hosts.
```
[webservers]
web1.example.com ansible_host=192.168.1.10 ansible_ssh_timeout=45
web2.example.com ansible_host=192.168.1.11
```
- This ansible_ssh_timeout variable will override the global timeout setting for web1.example.com.
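To check that the inventory variable is actually picked up for a host, the debug module can print it without opening an SSH connection. A minimal sketch, assuming the inventory layout shown above (the function name host_timeout is hypothetical):

```shell
# Hypothetical helper: print the ansible_ssh_timeout value that Ansible
# resolves for one host, or nothing if the variable is not set.
# Usage: host_timeout <inventory_path> <host>
host_timeout() {
    # The debug module runs on the control node, so this works even
    # while the target host itself is unreachable.
    ansible "$2" -i "$1" -m debug -a "var=ansible_ssh_timeout" |
        sed -n 's/.*"ansible_ssh_timeout": *\([0-9][0-9]*\).*/\1/p'
}
```

With the example inventory, host_timeout /path/to/inventory.ini web1.example.com should print the value set for web1, while web2.example.com prints nothing because it falls back to the global timeout.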
Target CentOS 7 Host - SELinux (Permanent, if needed): If you definitively determine SELinux is causing the issue beyond initial SSH (e.g., specific Ansible modules failing with permission denied after connection), you might need to manage it.
- Recommended: Do not permanently disable SELinux. Instead, identify the specific SELinux denial using sudo ausearch -c sshd -m avc --raw or sudo journalctl -t audit | grep AVC and create a custom SELinux policy module to allow the necessary action.
- Last Resort (not recommended for production): Change /etc/selinux/config to SELINUX=permissive and reboot. This makes the change persistent but reduces security.
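The recommended route can be sketched with audit2allow, which on CentOS 7 is typically provided by the policycoreutils-python package. The function below drafts a policy module from recent AVC denials but deliberately does not load it. The function name and the default module name ansible_local are hypothetical:

```shell
# Hypothetical helper: draft a local SELinux policy module from the AVC
# denials logged in the last few minutes. It does NOT install the module.
# Usage: draft_selinux_module [module_name]
draft_selinux_module() {
    local name=${1:-ansible_local}
    # -ts recent limits the search to roughly the last 10 minutes;
    # audit2allow -M writes <name>.te (source) and <name>.pp (compiled).
    sudo ausearch -m avc -ts recent --raw | audit2allow -M "$name" || return 1
    echo "Review ${name}.te, then load it with: sudo semodule -i ${name}.pp"
}
```

Reproduce the failing Ansible task first so the denial is recent, and read the generated .te file before loading anything: audit2allow allows whatever was denied, which is not always what should be allowed.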

4. Verification

After applying any configuration changes, it’s crucial to verify the fix.

Direct SSH Connectivity Test: From your Ansible control node, ensure you can manually SSH into the target host without issues.
```
ssh <user>@<target_host_ip>
```

Ansible Ad-Hoc Ping Test: Run a simple Ansible ad-hoc command.

```
ansible -m ping <target_host_group_or_name> -i /path/to/inventory.ini
```

A successful response should look like:

```
<target_host_ip> | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "ping": "pong"
}
```

Run a Simple Playbook: Create and run a minimal playbook to ensure all steps of a typical Ansible run complete without timeouts.

test_playbook.yml:

```
---
- name: Test connectivity and uptime
  hosts: <target_host_group_or_name>
  tasks:
    - name: Get uptime information
      command: uptime
      register: uptime_output

    - name: Print uptime
      debug:
        var: uptime_output.stdout
```

Execute:

```
ansible-playbook -i /path/to/inventory.ini test_playbook.yml
```

Increase Verbosity for Deeper Insight: If issues persist, use Ansible’s verbose output to pinpoint the exact failure point.
```
ansible -m ping <target_host> -i /path/to/inventory.ini -vvv
ansible-playbook -i /path/to/inventory.ini test_playbook.yml -vvv
```
The -vvv (or even -vvvv) flag will provide detailed SSH and Ansible execution logs, which can often highlight the specific point of failure or the cause of the delay leading to a timeout.

By methodically working through these steps, you should be able to diagnose and resolve most Ansible timeout errors encountered when managing CentOS 7 hosts.