Issue
Following this guide:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/UsingAlarmActions.html
I have created alarms to alert me on StatusCheckFailed_System events. Through the console UI it is possible to set up EC2 actions that do something with the instance once the alarm fires. In this case, I'd like to reboot the instance when the StatusCheckFailed_System alarm triggers. Is this possible in Terraform? I have a fleet of Windows Server 2016/2019 instances and I'm using version 4.x of the Terraform AWS provider. Here is a snippet of my TF code for reference:
    locals {
      all_ec2s = toset(data.aws_instances.all_ec2.ids)
      all_ebs  = toset(data.aws_ebs_volumes.all_volumes.ids)
    }

    # EC2 System failures
    resource "aws_cloudwatch_metric_alarm" "system_failure" {
      for_each                  = local.all_ec2s
      alarm_name                = "${data.aws_instance.each_ec2[each.key].tags["Name"]} - EC2 System failures"
      alarm_description         = "Systems that have failed the EC2 Status check."
      comparison_operator       = "GreaterThanOrEqualToThreshold"
      threshold                 = var.cloudwatch_ec2_system_failure_evaluation_threshold
      period                    = var.cloudwatch_ec2_system_failure_alarm_period
      evaluation_periods        = var.cloudwatch_ec2_system_failure_evaluation_periods
      metric_name               = "StatusCheckFailed_System"
      namespace                 = "AWS/EC2"
      statistic                 = "Maximum"
      alarm_actions             = [aws_sns_topic.cloudwatch_alerts_topic.arn]
      insufficient_data_actions = [aws_sns_topic.cloudwatch_alerts_topic.arn]
      ok_actions                = [aws_sns_topic.cloudwatch_alerts_topic.arn]
      dimensions                = { InstanceId = each.key }
    }
Solution
Yes, it's possible; it's just not well documented. Alongside the SNS topic, pass the EC2 reboot action ARN as one of the alarm actions, like so:
    alarm_actions = [
      aws_sns_topic.cloudwatch_alerts_topic.arn,
      "arn:aws:automate:${data.aws_region.current.name}:ec2:reboot"
    ]
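For completeness, here is a minimal sketch of how that action might fit into the alarm from the question. The snippet above references data.aws_region.current, which the question's code does not show, so the sketch assumes it still has to be declared. The local name reboot_action_arn is hypothetical and only there for readability. The rest is the question's resource with one argument changed.

    # Assumption: this data source is not in the question's code, but the
    # ARN interpolation needs it to resolve the current region.
    data "aws_region" "current" {}

    locals {
      # Hypothetical helper name; the value is the ARN given in the answer.
      reboot_action_arn = "arn:aws:automate:${data.aws_region.current.name}:ec2:reboot"
    }

    # EC2 System failures: notify via SNS and reboot the affected instance
    resource "aws_cloudwatch_metric_alarm" "system_failure" {
      for_each            = local.all_ec2s
      alarm_name          = "${data.aws_instance.each_ec2[each.key].tags["Name"]} - EC2 System failures"
      alarm_description   = "Systems that have failed the EC2 Status check."
      comparison_operator = "GreaterThanOrEqualToThreshold"
      threshold           = var.cloudwatch_ec2_system_failure_evaluation_threshold
      period              = var.cloudwatch_ec2_system_failure_alarm_period
      evaluation_periods  = var.cloudwatch_ec2_system_failure_evaluation_periods
      metric_name         = "StatusCheckFailed_System"
      namespace           = "AWS/EC2"
      statistic           = "Maximum"

      # The reboot action goes only in alarm_actions, so it runs on the
      # transition into ALARM and not on OK or INSUFFICIENT_DATA.
      alarm_actions = [
        aws_sns_topic.cloudwatch_alerts_topic.arn,
        local.reboot_action_arn,
      ]
      insufficient_data_actions = [aws_sns_topic.cloudwatch_alerts_topic.arn]
      ok_actions                = [aws_sns_topic.cloudwatch_alerts_topic.arn]

      # The EC2 action targets the instance named by this dimension.
      dimensions = { InstanceId = each.key }
    }

The same ARN pattern is documented with stop, terminate, and recover in place of reboot. As far as I know, AWS's guidance suggests recover for StatusCheckFailed_System and reboot for StatusCheckFailed_Instance, so it may be worth checking which action suits the failure you are targeting.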
Answered By - Mark B Answer Checked By - Marilyn (WPSolving Volunteer)