How to Fix Terraform Permission Denied on Azure VM


Troubleshooting “Terraform Permission Denied” on Azure VM

“Permission Denied” errors are a common hurdle when running Terraform deployments against Azure. This guide gives a direct, actionable approach to diagnosing and resolving them when Terraform runs from an Azure Virtual Machine (VM).


1. The Root Cause: Why This Happens on Azure VM

At its core, a “Permission Denied” error in Terraform on Azure signifies that the identity Terraform is using lacks the necessary Role-Based Access Control (RBAC) permissions to perform a requested action on a specific Azure resource or scope.

When you run Terraform from an Azure VM, the identity can manifest in a few ways:

  1. Service Principal (SPN): This is the most common scenario for automated deployments. Terraform uses a client ID and client secret (or certificate) to authenticate as an SPN, which then needs RBAC roles assigned to it in Azure.
  2. Managed Identity: If your VM has a System-Assigned or User-Assigned Managed Identity, Terraform can authenticate as that identity without storing a client secret, provided the provider is told to use it (use_msi = true or ARM_USE_MSI=true). Like an SPN, the Managed Identity requires appropriate RBAC roles.
  3. User Account (via Azure CLI): If you’ve logged into Azure CLI (az login) as a user on the VM and Terraform is configured to use CLI credentials, then that user’s permissions are in play.

The error itself originates from the Azure API, not Terraform, and usually arrives as an HTTP 403 with the code AuthorizationFailed. Terraform merely translates your desired state into API calls, and Azure rejects a call when none of the roles assigned to the authenticated principal (SPN, Managed Identity, or User) includes the required action (for example, Microsoft.Compute/virtualMachines/write) at the given scope.
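The rejection message from Azure normally names the principal, the action, and the scope involved. The following is a minimal sketch for pulling those three values out of such a message. The sample text follows the usual wording of an authorization failure, all IDs in it are placeholders, and extract_field is a hypothetical helper name.

```shell
# Sample message in the usual shape of an Azure authorization failure.
# All IDs are placeholders, not real values.
sample_error="The client '11111111-1111-1111-1111-111111111111' with object id '22222222-2222-2222-2222-222222222222' does not have authorization to perform action 'Microsoft.Resources/subscriptions/resourceGroups/write' over scope '/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/demo-rg' or the scope is invalid."

# Print the single-quoted value that follows a given phrase in the message.
extract_field() {
  local phrase="$1" message="$2"
  printf '%s\n' "$message" | sed -n "s/.*${phrase} '\([^']*\)'.*/\1/p"
}

denied_object_id=$(extract_field "with object id" "$sample_error")
denied_action=$(extract_field "perform action" "$sample_error")
denied_scope=$(extract_field "over scope" "$sample_error")

echo "principal: $denied_object_id"
echo "action:    $denied_action"
echo "scope:     $denied_scope"
```

The object ID tells you which identity Terraform actually used, and the action and scope tell you which role assignment is missing.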

Common culprits include:

  • Incorrectly configured Service Principal/Managed Identity.
  • Missing or insufficient RBAC role assignments for the identity.
  • Scope of the role assignment is too narrow (e.g., assigned to a single resource group when Terraform also needs to create resource groups or manage resources elsewhere in the subscription).
  • Typographical errors in environment variables or Terraform provider configuration.

2. Quick Fix (CLI)

The fastest way to troubleshoot a permission issue is often to grant the identity broader permissions temporarily, then narrow them down once functionality is confirmed. Use this as a diagnostic step, not a permanent security solution. Always aim for the principle of least privilege.

Assuming your Terraform is using a Service Principal or Managed Identity, you’ll need its Object ID.

Steps:

  1. Identify the Service Principal/Managed Identity Object ID:

    • For a Service Principal: If you know its name, use:
      az ad sp list --display-name "Your-Terraform-SPN-Name" --query "[0].id" -o tsv
      If you only have the client_id, use:
      az ad sp show --id "your-client-id" --query "id" -o tsv
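    • For a System-Assigned Managed Identity (on the VM itself):
      # Requires the Azure CLI, logged in with an account that can read the VM
      az vm identity show --name "Your-VM-Name" --resource-group "Your-Resource-Group" --query "principalId" -o tsv
      # From within the VM, the VM name and resource group can be read from the Instance Metadata Service:
      curl -s -H Metadata:true "http://169.254.169.254/metadata/instance/compute/name?api-version=2021-02-01&format=text"
      curl -s -H Metadata:true "http://169.254.169.254/metadata/instance/compute/resourceGroupName?api-version=2021-02-01&format=text"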
    • For a User-Assigned Managed Identity:
      az identity show --name "Your-UAI-Name" --resource-group "Your-Resource-Group" --query "principalId" -o tsv
    • For the currently logged-in user (if az login is used):
      az ad signed-in-user show --query "id" -o tsv
    • Store this ID in a variable:
      PRINCIPAL_ID="<The-Object-ID-You-Found>"
  2. Identify the Scope:

    • For broad permissions (e.g., creating resource groups, managing multiple resources across the subscription):
      SCOPE="/subscriptions/$(az account show --query "id" -o tsv)"
    • For permissions limited to a specific resource group:
      SCOPE="/subscriptions/$(az account show --query "id" -o tsv)/resourceGroups/YourResourceGroupName"
  3. Grant “Contributor” Role (Diagnostic): This role allows full access to all resources within the scope, but does not allow managing access permissions.

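    az role assignment create --assignee "$PRINCIPAL_ID" --role "Contributor" --scope "$SCOPE"
    • Note: New role assignments can take a few minutes to propagate, so wait and retry before changing anything else. If the error persists after granting Contributor, the failing action may involve managing access (e.g., creating other role assignments), which Contributor does not cover. In such cases, temporarily try “User Access Administrator” or “Owner” for diagnostic purposes (--role "Owner"), but revert immediately.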
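The steps above can be wrapped in a few small functions so that the diagnostic grant is as easy to remove as it is to add. This is a sketch only. The function names are hypothetical, and the az calls assume the Azure CLI is installed and logged in with an account that is allowed to manage role assignments.

```shell
# Build an Azure scope string from a subscription ID and an optional resource group.
build_scope() {
  local subscription_id="$1" resource_group="${2:-}"
  if [ -n "$resource_group" ]; then
    printf '/subscriptions/%s/resourceGroups/%s\n' "$subscription_id" "$resource_group"
  else
    printf '/subscriptions/%s\n' "$subscription_id"
  fi
}

# Grant the diagnostic role (Contributor unless a third argument is given).
grant_diagnostic_role() {
  local principal_id="$1" scope="$2" role="${3:-Contributor}"
  az role assignment create --assignee "$principal_id" --role "$role" --scope "$scope"
}

# List what the principal currently holds anywhere in the subscription.
list_roles() {
  az role assignment list --assignee "$1" --all \
    --query "[].{role:roleDefinitionName, scope:scope}" -o table
}

# Remove the diagnostic role again once the missing permission is known.
revoke_diagnostic_role() {
  local principal_id="$1" scope="$2" role="${3:-Contributor}"
  az role assignment delete --assignee "$principal_id" --role "$role" --scope "$scope"
}

# Example usage (not executed here):
#   SCOPE=$(build_scope "$(az account show --query id -o tsv)" "YourResourceGroupName")
#   grant_diagnostic_role "$PRINCIPAL_ID" "$SCOPE"
#   list_roles "$PRINCIPAL_ID"
#   revoke_diagnostic_role "$PRINCIPAL_ID" "$SCOPE"
```

Keeping the revoke step next to the grant step makes it harder to leave a broad role in place by accident.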

3. Configuration Check

After ensuring the identity has the necessary permissions, verify that Terraform is actually using that identity and that its credentials are correctly configured.

Terraform’s Azure provider (azurerm) sources its credentials in a specific order of precedence:

  1. Explicitly defined values in the provider block in .tf files.
  2. Environment variables.
  3. Managed Identity, but only when enabled with use_msi = true or ARM_USE_MSI=true (the provider does not pick it up automatically just because the VM has one).
  4. Azure CLI credentials (from az login).
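To see which of these sources the current shell is likely to select, a rough check of the ARM_* variables can help. The sketch below is an approximation, not the provider's actual logic. It assumes the provider block sets no credential arguments of its own, and it assumes a configured client secret or certificate wins over ARM_USE_MSI. The function name is hypothetical.

```shell
# Guess the credential source from ARM_* environment variables alone.
# Assumption: nothing in the provider block overrides these values.
guess_arm_auth_source() {
  if [ -n "${ARM_CLIENT_ID:-}" ] && [ -n "${ARM_CLIENT_CERTIFICATE_PATH:-}" ]; then
    echo "service-principal-certificate"
  elif [ -n "${ARM_CLIENT_ID:-}" ] && [ -n "${ARM_CLIENT_SECRET:-}" ]; then
    echo "service-principal-secret"
  elif [ "${ARM_USE_MSI:-false}" = "true" ]; then
    echo "managed-identity"
  else
    # Nothing else is configured, so the Azure CLI login would be used.
    echo "azure-cli"
  fi
}

guess_arm_auth_source
```

If the result is not the identity you granted roles to in the previous section, fix the configuration before touching permissions again.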

Check these common configuration points:

A. Environment Variables (Most Common for SPNs)

Ensure these are set correctly in your VM’s shell where Terraform is run.

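echo "$ARM_CLIENT_ID"
echo "$ARM_TENANT_ID"
echo "$ARM_SUBSCRIPTION_ID"
# Avoid printing the secret; only confirm that it is set
[ -n "$ARM_CLIENT_SECRET" ] && echo "ARM_CLIENT_SECRET is set" || echo "ARM_CLIENT_SECRET is empty"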
  • Verification:
    • Does ARM_CLIENT_ID match the Application (client) ID of your Service Principal?
    • Is ARM_CLIENT_SECRET correct and not expired?
    • Does ARM_TENANT_ID match your Microsoft Entra ID (formerly Azure Active Directory) tenant ID?
    • Does ARM_SUBSCRIPTION_ID match the subscription you intend to deploy to?
  • Action: If any are incorrect or missing, set them:
    export ARM_CLIENT_ID="<Your_Service_Principal_Client_ID>"
    export ARM_CLIENT_SECRET="<Your_Service_Principal_Secret>"
    export ARM_TENANT_ID="<Your_Azure_Tenant_ID>"
    export ARM_SUBSCRIPTION_ID="<Your_Azure_Subscription_ID>"
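Because the client, tenant, and subscription IDs are GUIDs, a quick format check catches truncated or mistyped values before Terraform ever calls Azure. The following sketch assumes all three variables are meant to hold GUIDs. is_guid and check_arm_ids are hypothetical helper names.

```shell
# Succeed only if the argument looks like a GUID (8-4-4-4-12 hex digits).
is_guid() {
  printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Report every ID variable that is unset, empty, or malformed.
check_arm_ids() {
  local name value status=0
  for name in ARM_CLIENT_ID ARM_TENANT_ID ARM_SUBSCRIPTION_ID; do
    eval "value=\${$name:-}"   # indirect lookup of the variable named in $name
    if ! is_guid "$value"; then
      echo "$name is missing or not a GUID"
      status=1
    fi
  done
  return $status
}

# Example usage (not executed here):
#   check_arm_ids && echo "ID variables look well-formed"
```

A well-formed value can still be the wrong value, so this complements the checks above and does not replace them.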

B. Terraform provider Block

Open your main.tf or provider.tf file and check the provider "azurerm" block.

provider "azurerm" {
  features {}
  # If explicitly defined:
  subscription_id = "<Your_Azure_Subscription_ID>"
  tenant_id       = "<Your_Azure_Tenant_ID>"
  client_id       = "<Your_Service_Principal_Client_ID>"
  client_secret   = "<Your_Service_Principal_Secret>"
  # OR if using MSI (Managed Service Identity/Managed Identity):
  # use_msi = true # For System-Assigned
  # For User-Assigned, keep use_msi = true and set client_id to the identity's client ID
}
  • Verification:
    • Do the client_id, client_secret, tenant_id, and subscription_id values match your intended credentials?
    • If use_msi = true is set, ensure the Managed Identity exists, is attached to this VM, and has the necessary permissions. For a User-Assigned identity, client_id must be that identity’s client ID.
  • Action: Correct any discrepancies. If you are using environment variables, these parameters should ideally not be hardcoded in the provider block to avoid conflicts unless you explicitly intend to override.
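Hardcoded secrets in .tf files are easy to overlook in larger configurations. As a rough aid, the sketch below searches a directory for literal string assignments to client_secret or client_certificate_password. The function name is hypothetical, the pattern only catches quoted values at the start of a line, and it assumes a grep that supports --include (GNU and BSD grep both do).

```shell
# List lines in *.tf files that assign a quoted literal to a credential argument.
# Returns non-zero when nothing is found, like grep itself.
find_hardcoded_credentials() {
  local dir="${1:-.}"
  grep -rnE --include='*.tf' \
    '^[[:space:]]*(client_secret|client_certificate_password)[[:space:]]*=[[:space:]]*"' \
    "$dir"
}

# Example usage (not executed here):
#   find_hardcoded_credentials . || echo "no literal credentials found"
```

Values passed through variables (for example client_secret = var.client_secret) are not reported, which is the intended behavior here.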

C. Azure CLI Login (if applicable)

If you’re relying on az login, ensure the correct account is active:

az account show --output json
  • Verification:
    • Does the id field match your target subscription ID?
    • Do the user.name and user.type fields (user or servicePrincipal) show the identity you expect to be using?
  • Action: If incorrect, log in with the right account or select the correct subscription:
    az login # Follow prompts
    az account set --subscription "<Your_Subscription_ID>"
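When the target subscription is already known, the comparison can be scripted so that a wrong active subscription is caught before every run. A minimal sketch, assuming the Azure CLI is installed and logged in. check_active_subscription is a hypothetical helper name.

```shell
# Compare the CLI's active subscription with the one Terraform should target.
# Returns 0 on match, 1 on mismatch, 2 if the CLI call itself fails.
check_active_subscription() {
  local expected="$1" active
  active=$(az account show --query id -o tsv) || return 2
  if [ "$active" = "$expected" ]; then
    echo "OK: CLI is on subscription $active"
  else
    echo "MISMATCH: CLI is on $active, expected $expected"
    return 1
  fi
}

# Example usage (not executed here):
#   check_active_subscription "<Your_Subscription_ID>" || az account set --subscription "<Your_Subscription_ID>"
```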

D. Debugging Terraform Output

Increase Terraform’s verbosity for more detailed error messages:

export TF_LOG=TRACE
export TF_LOG_PATH=./terraform-trace.log # Write the log to a file instead of the terminal
terraform plan

This can reveal exactly which Azure API call is failing and often includes the specific HTTP status code (e.g., 403 Forbidden) and a more detailed message from Azure.
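Trace logs are long, so it helps to filter them for authorization failures. The sketch below assumes the log was written to a file (for example by setting TF_LOG_PATH) and searches it for a few strings that typically accompany a rejected call. The exact log wording varies between provider versions, so treat the patterns as a starting point. find_auth_failures is a hypothetical helper name.

```shell
# Print numbered log lines that look like an Azure authorization failure.
find_auth_failures() {
  local log_file="$1"
  grep -nE 'AuthorizationFailed|403 Forbidden|does not have authorization' "$log_file"
}

# Example usage (not executed here):
#   TF_LOG=DEBUG TF_LOG_PATH=./tf-debug.log terraform plan
#   find_auth_failures ./tf-debug.log
```

Debug and trace logs can contain sensitive values, so delete the file once the failing call has been identified.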


4. Verification

Once you’ve adjusted permissions and/or configuration, verify your fix.

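  1. Re-initialize the Working Directory (Optional): This rarely fixes a permission error by itself, but it rules out stale provider plugins or cached backend settings. Keep .terraform.lock.hcl unless you intend to change provider versions.

    rm -rf .terraform # Remove downloaded providers and cached backend settings
    terraform init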
  2. Perform a Terraform Plan: This will simulate the changes without applying them, allowing you to confirm if the permission issue is resolved.

    terraform plan
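    • Expected Outcome (Success): Terraform will output a plan detailing the resources it intends to create, modify, or destroy, without any “Permission Denied” errors. Note that plan mostly issues read calls, so a missing write permission may only surface during apply.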
    • Expected Outcome (Failure): If you still encounter “Permission Denied,” re-evaluate the previous steps. The granted role might still be insufficient, the scope too narrow, or the credentials still misconfigured. Review the verbose logs (TF_LOG=TRACE) closely.
  3. Apply the Terraform Configuration: If terraform plan succeeds, proceed with the application.

    terraform apply
    • Expected Outcome: Terraform successfully deploys your resources to Azure.
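For scripted verification, terraform plan accepts a -detailed-exitcode flag that distinguishes an error from a successful plan with or without changes. The sketch below maps those exit codes to a short message. describe_plan_exit is a hypothetical helper name.

```shell
# Translate the exit code of `terraform plan -detailed-exitcode` into a message.
describe_plan_exit() {
  case "$1" in
    0) echo "plan succeeded, no changes" ;;
    1) echo "plan failed, check the error output" ;;
    2) echo "plan succeeded, changes pending" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Example usage (not executed here):
#   terraform plan -detailed-exitcode -out=tfplan
#   describe_plan_exit $?
#   terraform apply tfplan   # apply exactly the plan that was reviewed
```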

By systematically addressing the identity, role assignments, and credential configuration, you can effectively troubleshoot and resolve “Terraform Permission Denied” errors on Azure VMs. Remember to prioritize the principle of least privilege in your permanent solutions.