Cisco ISE Production Deployment on AWS

Purpose & Background

Cisco Identity Services Engine (ISE) provides network access control (NAC), RADIUS/TACACS authentication, authorization, accounting, and policy enforcement services.

This document describes the Production deployment of Cisco ISE on AWS, including:

infrastructure components
failover mechanism
health checks
operational guidance

High-Level Architecture

Two Cisco ISE nodes are deployed on EC2 forming a TACACS cluster across two availability zones.

RADIUS/TACACS traffic health is actively managed by a UDP Lambda probe, which performs a real RADIUS check on port 1812 and dynamically registers or deregisters nodes from the NLB target groups.

Failover of admin traffic on TCP/443 is not automated via Lambda. Instead, the Admin NLB has exactly one node registered at a time, controlled via Terraform.

Components

VPC

Network Operations Centre VPC

vpc-035cbba5e5264a6bc

EC2 Instances

Deployed across two Availability Zones.

TACACS Cluster

Two EC2 instances are used:

TACACS1

Instance name: MOJ-NS-NOC-ISE-TACACS001
Private IP: 172.27.3.63
Subnet: subnet-0e2b800aa0df1c3f0 (Private1-az1)

TACACS2

Instance name: MOJ-NS-NOC-ISE-TACACS002
Private IP: 172.27.4.63
Subnet: subnet-009e4b1fc6fb32cec (Private1-az2)

Key Pair

Stored in 1Password:

Vault: Network Operation Team - NOC
Item: CISCO ISE EC2 Prod KeyPair moj-aws-noc-ise-kp

Elastic Load Balancer

Network Load Balancers are deployed across two AZs with source IP stickiness enabled.

Subnets

subnet-0e2b800aa0df1c3f0 (Private1-az1)
subnet-009e4b1fc6fb32cec (Private1-az2)

Traffic Types

TACACS / RADIUS Ports

ISE-TACACS-NETWORK-NLB

Handles authentication traffic.

Port	Protocol	Purpose
49	TCP	TACACS
1812	UDP	RADIUS Auth
1813	UDP	RADIUS Accounting

Target health is controlled by:

AWS NLB health checks
Custom UDP Lambda probe

Admin Port

ISE-TACACS-ADMIN-NLB

Handles GUI traffic on TCP/443

Only one node per cluster is registered at a time.

Normal state:

ISE-TACACS-443 172.27.3.63

Failover state:

ISE-TACACS-443 172.27.4.63

Failover is controlled through Terraform configuration, not Lambda.

AWS Transfer Family and S3

AWS Transfer Family SFTP is configured to send backup files from Cisco ISE to Amazon S3.

To use AWS Transfer Family with SFTP, each user must authenticate with an SSH key pair. The LAN team generates the key pair and uploads the public key into Transfer Family.

S3 bucket for TACACS backups

tacacs1 = { home_directory = "/moj-noc-prod-ise-backup/tacacs" }

Secrets Manager

Two AWS Secrets Manager secrets are used:

ise_basic_auth

Basic authentication credentials used by the TACACS failover Lambda when calling the Cisco ISE API.

ise_shared_secret

Shared secret used by the UDP health-check Lambda for RADIUS probe requests.

Route53

Two alias records are created for HTTPS traffic routing to the Network Load Balancer.

VPC Endpoints

VPC Endpoints allow the Lambda functions to communicate with AWS APIs without using the public internet.

Lambda

Lambda Source Files

lambda_files = {
  tacacs_udp = "ise_tacacs_udp_failover.py"
}

EventBridge Scheduler

Lambda functions are triggered every minute.

Previous Failover Automation Design - Admin (443) Lambda for ISE-TACACS-ADMIN-NLB target group

Originally, admin failover was designed to be handled by the Lambda function:

ise_tacacs_https_failover.py

This function queried the Cisco ISE API to determine which node was PrimaryAdmin, then dynamically registered the correct node.

Design Decision

Due to the limitations described in "Remove Admin Load Balancer Failover Lambda", the Lambda-based Admin failover logic was removed.

Admin traffic failover is now handled by:

AWS NLB TCP health checks
Terraform-controlled active target

Health Check - UDP Lambda for ISE-TACACS-NETWORK-NLB target group

This Lambda acts as a custom health checker for the two TACACS nodes. It actively tests whether each node responds on UDP/1812 (RADIUS authentication) and then updates both NLB target groups (1812 and 1813) so that only healthy nodes receive traffic.

AWS native load balancer health checks are HTTP, HTTPS, or TCP based and do not perform a real RADIUS validation on UDP/1812.

A TCP reachability check cannot confirm that the RADIUS service is responding correctly, so this Lambda performs an application-level check instead.

Function Overview

Fetch RADIUS Shared Secret

Retrieve the RADIUS shared secret from AWS Secrets Manager.
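
As a minimal sketch of this step, the lookup could look like the following. It assumes the Lambda uses boto3 and that ise_shared_secret is stored as a plain string rather than JSON; neither is confirmed by this page. The function name and the injected client parameter are illustrative.

```python
def fetch_shared_secret(client, secret_id="ise_shared_secret"):
    """Return the RADIUS shared secret as bytes.

    `client` is a boto3 Secrets Manager client, for example
    boto3.client("secretsmanager"). Passing it in keeps the function
    easy to test with a stub.
    Assumption: the secret value is a plain string, not a JSON document.
    """
    response = client.get_secret_value(SecretId=secret_id)
    secret = response.get("SecretString")
    if not secret:
        raise ValueError(f"secret {secret_id!r} has no SecretString value")
    return secret.encode("utf-8")
```

The secret is returned as bytes because RADIUS password hiding operates on raw bytes.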

Health Check

Probe port 1812 with a dummy Access-Request using radtest.

Treat any valid reply as a successful response:

Access-Accept
Access-Reject
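
The probe described above uses radtest. For illustration only, the same idea can be sketched with the Python standard library by building an Access-Request by hand following RFC 2865. This is a swapped-in technique, not the deployed implementation. The username, password and NAS-Identifier values are hypothetical, and the sketch neither adds a Message-Authenticator attribute nor verifies the Response Authenticator, which some servers or policies may require.

```python
import hashlib
import os
import socket
import struct

ACCESS_REQUEST, ACCESS_ACCEPT, ACCESS_REJECT = 1, 2, 3


def encrypt_password(password, secret, authenticator):
    """Hide the User-Password value as described in RFC 2865 section 5.2."""
    padded = password + b"\x00" * (-len(password) % 16) or b"\x00" * 16
    out, prev = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + prev).digest()
        block = bytes(a ^ b for a, b in zip(padded[i:i + 16], digest))
        out += block
        prev = block  # the next block is keyed on the previous ciphertext
    return out


def attribute(attr_type, value):
    """Encode one RADIUS attribute as type, length, value."""
    return struct.pack("!BB", attr_type, 2 + len(value)) + value


def build_access_request(identifier, username, password, secret):
    """Build an Access-Request with User-Name, User-Password, NAS-Identifier."""
    authenticator = os.urandom(16)
    attrs = (
        attribute(1, username)
        + attribute(2, encrypt_password(password, secret, authenticator))
        + attribute(32, b"health-probe")  # hypothetical NAS-Identifier
    )
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(attrs))
    return header + authenticator + attrs


def is_valid_reply(data, identifier):
    """Accept or Reject with a matching identifier both count as 'responding'."""
    return (
        len(data) >= 20
        and data[0] in (ACCESS_ACCEPT, ACCESS_REJECT)
        and data[1] == identifier
    )


def probe(host, secret, timeout=3.0, port=1812):
    """Return True if the node answers a dummy Access-Request on UDP/1812."""
    identifier = os.urandom(1)[0]
    packet = build_access_request(identifier, b"healthcheck", b"dummy", secret)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            data, _ = sock.recvfrom(4096)
        except OSError:  # includes timeouts
            return False
    return is_valid_reply(data, identifier)
```

An Access-Reject counts as healthy because it shows that the RADIUS service parsed the request and answered, which is all the probe needs to establish.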

Health Rules

Node is healthy if port 1812 responds
Node is unhealthy if port 1812 does not respond

Target Group Registration

Desired state:

Nodes that successfully pass the 1812 health check.

Apply the same desired node set to both target groups:

  1812
  1813
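
A minimal sketch of how the same desired node set might be applied to both target groups, assuming IP-type targets and a boto3 elbv2 client passed in by the caller. The function names and the arns_by_port mapping are hypothetical; the real target group ARNs are not listed on this page.

```python
def registered_ips(elbv2, target_group_arn):
    """IPs currently registered in a target group, ignoring draining targets."""
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    return {
        d["Target"]["Id"]
        for d in response["TargetHealthDescriptions"]
        if d["TargetHealth"]["State"] != "draining"
    }


def reconcile(elbv2, target_group_arn, port, desired_ips):
    """Register missing healthy nodes and deregister nodes no longer desired."""
    desired = set(desired_ips)
    current = registered_ips(elbv2, target_group_arn)
    to_add = sorted(desired - current)
    to_remove = sorted(current - desired)
    if to_add:
        elbv2.register_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in to_add],
        )
    if to_remove:
        elbv2.deregister_targets(
            TargetGroupArn=target_group_arn,
            Targets=[{"Id": ip, "Port": port} for ip in to_remove],
        )
    return to_add, to_remove


def apply_to_both(elbv2, arns_by_port, desired_ips):
    """arns_by_port maps 1812 and 1813 to their (hypothetical) target group ARNs."""
    return {
        port: reconcile(elbv2, arn, port, desired_ips)
        for port, arn in arns_by_port.items()
    }
```

Because only the difference between current and desired state is applied, a run in which nothing changed makes no register or deregister calls.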

Failure Behaviour

If both nodes fail
- Log an ALERT
- Do not modify existing target groups
If a node starts responding again on 1812
- It will be added back to both target groups during the next Lambda run
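
The failure behaviour above can be sketched as a small decision function. This is an illustration under the assumption that probe results are held as a mapping of node IP to a boolean; the function name and log wording are hypothetical.

```python
import logging

logger = logging.getLogger(__name__)


def desired_targets(probe_results):
    """Map {ip: responded_on_1812} to the set of IPs that should be registered.

    Returns None when every node failed, which the caller should treat as
    "leave the existing target groups untouched".
    """
    healthy = {ip for ip, ok in probe_results.items() if ok}
    if not healthy:
        logger.error(
            "ALERT: no ISE node responded on UDP/1812; target groups left unchanged"
        )
        return None
    return healthy
```

Returning None rather than an empty set keeps "nothing is healthy" distinct from "deregister everything", so a probe-side fault such as a wrong shared secret cannot empty the target groups.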

Operational Guidance

Normal Operations

Admin NLB should have only one registered target.
RADIUS/TACACS traffic dynamically adjusts based on probe results. If a node becomes unreachable, it will be temporarily deregistered and automatically re-added once healthy.

Admin NLB Failover Procedure

Admin NLB ISE-TACACS-ADMIN-NLB failover is performed through Terraform.

Terraform variable: admin_failover_to_secondary

Normal State

admin_failover_to_secondary = false

Active node: 172.27.3.63

Failover to Secondary

Change Terraform configuration:

admin_failover_to_secondary = true

Run the pipeline, or run Terraform apply.

Terraform will:

deregister 172.27.3.63
register   172.27.4.63

Failback

Set:

admin_failover_to_secondary = false

Apply Terraform again.
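
After a failover or failback, the registered target can be checked against the value of admin_failover_to_secondary. The following is a hedged sketch assuming a boto3 elbv2 client and the ARN of the ISE-TACACS-443 target group, which is not listed on this page; the function names are illustrative.

```python
PRIMARY_IP = "172.27.3.63"
SECONDARY_IP = "172.27.4.63"


def expected_admin_ip(admin_failover_to_secondary):
    """Mirror the Terraform variable: false selects primary, true secondary."""
    return SECONDARY_IP if admin_failover_to_secondary else PRIMARY_IP


def check_admin_target(elbv2, target_group_arn, admin_failover_to_secondary):
    """Return (ok, active_ips); ok is True only if exactly the expected node is registered."""
    response = elbv2.describe_target_health(TargetGroupArn=target_group_arn)
    active = sorted(
        d["Target"]["Id"]
        for d in response["TargetHealthDescriptions"]
        if d["TargetHealth"]["State"] != "draining"  # still draining after the switch
    )
    expected = expected_admin_ip(admin_failover_to_secondary)
    return active == [expected], active
```

Targets in the draining state are ignored because a node that was just deregistered can remain visible for a short time after the apply completes.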

Troubleshooting

UDP probe failing

Check the following:

RADIUS shared secret configuration
NLB target deregistration events
Security group rules
Cisco ISE service status

Persistent deregistration

If both nodes appear unhealthy:

Validate connectivity to both nodes
Verify credentials and shared secrets
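
As a starting point for validating connectivity, a TCP connect test against the TACACS (49) and admin (443) ports can be run from a host inside the VPC. This is a minimal sketch and not part of the deployed tooling. It cannot test UDP/1812 or UDP/1813, which need a real RADIUS request.

```python
import socket

NODES = {"TACACS1": "172.27.3.63", "TACACS2": "172.27.4.63"}
TCP_PORTS = (49, 443)


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def connectivity_report(nodes=NODES, ports=TCP_PORTS, check=tcp_reachable):
    """Return {(node_name, port): reachable} for every node and port."""
    return {
        (name, port): check(ip, port)
        for name, ip in nodes.items()
        for port in ports
    }


if __name__ == "__main__":
    for (name, port), ok in connectivity_report().items():
        print(f"{name} tcp/{port}: {'reachable' if ok else 'NOT reachable'}")
```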

Monitoring and Alerting

CloudWatch log groups are used for Lambda logging.

The current logging level is set to:

LOG_LEVEL=INFO

To enable more detailed troubleshooting logs, change the environment variable to:

LOG_LEVEL=DEBUG
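
A Lambda typically reads this variable at start-up. The sketch below shows one way to do that with the standard logging module; the function name is illustrative and the actual handling inside ise_tacacs_udp_failover.py may differ.

```python
import logging
import os


def configure_logging(default="INFO"):
    """Set the root logger level from LOG_LEVEL, falling back to INFO."""
    level_name = os.environ.get("LOG_LEVEL", default).upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        level = logging.INFO  # unknown value: keep the normal level
    logger = logging.getLogger()
    logger.setLevel(level)
    return logger
```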

CloudWatch Log Groups

tacacs_udp = "/aws/lambda/ise-tacacs-udp-failover"

This page was last reviewed on 16 March 2026. It needs to be reviewed again on 16 September 2026 by the page owner #nvvs-devops.