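I have an ECS cluster (on Fargate 1.4.0) with many tasks and services, all of which log to CloudWatch, and everything works fine. I also have a few ECS scheduled tasks (via EventBridge), and I know they run at the scheduled time as expected. I know this because a) I can see it in the Monitoring tab of the EventBridge rule, and b) the job of one of the scheduled tasks is to send an email, and I receive that email. So it is running, but it does not log to CloudWatch like the other tasks?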
Before I go through my troubleshooting steps, let me give you some more context:
In each task definition, I have this logging block:
logConfiguration = {
  logDriver = "awslogs"
  options = {
    awslogs-group         = aws_cloudwatch_log_group.ecs_log_group.name
    awslogs-region        = "us-east-1"
    awslogs-stream-prefix = "prod-cron-engage"
  }
}
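I know everything there is correct, because my other, non-scheduled tasks (which run 24/7 through services) log there successfully.
Every task has the following two parameters: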
execution_role_arn = aws_iam_role.ecs_task_execution_role.arn
task_role_arn = aws_iam_role.ecs_task_execution_role.arn
Both of them point to this role:
resource "aws_iam_role" "ecs_task_execution_role" {
  name = "ecsTaskExecutionRole"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ecs-tasks.amazonaws.com"
        }
      }
    ]
  })
  tags = {
    "Name" = "${var.name_prefix}-iam-ecs-role"
  }
}
resource "aws_iam_role_policy_attachment" "ecs_task_execution_role_policy" {
  role       = aws_iam_role.ecs_task_execution_role.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
Here, AmazonECSTaskExecutionRolePolicy is the managed policy with the basic permissions for ECS, including access to CloudWatch.
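To make "access to CloudWatch" concrete: as far as I know, the managed policy only allows creating log streams and writing log events, plus pulling images from ECR. A rough inline equivalent is sketched below. The resource name `ecs_execution_inline_sketch` is hypothetical, and the action list is an assumption that should be checked against the current version of the managed policy.

```hcl
# Sketch only: an inline policy that mirrors what the managed
# AmazonECSTaskExecutionRolePolicy is understood to grant.
# Note that logs:CreateLogGroup is NOT part of this list.
resource "aws_iam_role_policy" "ecs_execution_inline_sketch" {
  name = "ecs-execution-inline-sketch" # hypothetical name
  role = aws_iam_role.ecs_task_execution_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = [
          "ecr:GetAuthorizationToken",
          "ecr:BatchCheckLayerAvailability",
          "ecr:GetDownloadUrlForLayer",
          "ecr:BatchGetImage",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ]
        Resource = "*"
      }
    ]
  })
}
```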
Also, this is my EventBridge rule:
resource "aws_cloudwatch_event_rule" "prod_cron_engage_rule" {
  name                = "prod-engage-rule"
  description         = "Run Prod Engage task every 30 minutes."
  schedule_expression = "rate(30 minutes)"
}
resource "aws_cloudwatch_event_target" "prod_cron_engage_target" {
  target_id = "run-prod-engage-task-every-half-an-hour"
  rule      = aws_cloudwatch_event_rule.prod_cron_engage_rule.name
  arn       = aws_ecs_cluster.ecs_cluster.arn
  role_arn  = aws_iam_role.eventbridge_role.arn
  ecs_target {
    task_definition_arn = aws_ecs_task_definition.prod_cron_engage_task.arn
    task_count          = 1
    launch_type         = "FARGATE"
    network_configuration {
      subnets          = module.vpc.private_subnets
      security_groups  = [aws_security_group.ecs_sg.id]
      assign_public_ip = false
    }
    tags = {
      "Name" = "${var.name_prefix}-ecs-prod-cron-engage"
    }
  }
}
Here are the EventBridge role and policy:
resource "aws_iam_role" "eventbridge_role" {
  name = "eventbridge-ecs-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Principal = {
          Service = "events.amazonaws.com"
        }
        Effect = "Allow"
        Sid    = ""
      }
    ]
  })
}
resource "aws_iam_role_policy" "eventbridge_policy" {
  name = "eventbridge-ecs-policy"
  role = aws_iam_role.eventbridge_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "ecs:RunTask"
        Effect = "Allow"
        Resource = [
          aws_ecs_task_definition.prod_cron_engage_task.arn
        ]
      },
      {
        Action   = "iam:PassRole"
        Effect   = "Allow"
        Resource = aws_iam_role.eventbridge_role.arn
      }
    ]
  })
}
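What have I done so far?
At first, I thought there might be some restrictive rule denying these scheduled tasks access to CloudWatch (unlikely, but I figured that since they are now scheduled through EventBridge, it might be possible), so I gave both ECS and EventBridge full CloudWatch access. Nothing changed.
I tried specifying a new log group, with those broad permissions in place, to see whether the task could create it. The new log group never appeared, so nothing was created through the task definition.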
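A side note on that test: the awslogs driver does not create a missing log group unless it is asked to. The sketch below assumes the `awslogs-create-group` option and an extra `logs:CreateLogGroup` permission on the execution role. The log group name `/ecs/prod-cron-engage-test`, the local `log_configuration_sketch`, and the policy resource name are all hypothetical.

```hcl
# Assumption: the execution role needs logs:CreateLogGroup in addition to
# whatever the managed execution policy grants.
resource "aws_iam_role_policy" "ecs_create_log_group_sketch" {
  name = "ecs-create-log-group-sketch" # hypothetical name
  role = aws_iam_role.ecs_task_execution_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = "logs:CreateLogGroup"
        Effect   = "Allow"
        Resource = "*"
      }
    ]
  })
}

locals {
  # Would be used as the logConfiguration of the container definition.
  log_configuration_sketch = {
    logDriver = "awslogs"
    options = {
      awslogs-group         = "/ecs/prod-cron-engage-test" # hypothetical group
      awslogs-create-group  = "true"                       # ask the driver to create it
      awslogs-region        = "us-east-1"
      awslogs-stream-prefix = "prod-cron-engage"
    }
  }
}
```

Even with this in place, the group is only created when a container actually starts, so a missing group can also mean the task never launched.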
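It turned out that the EventBridge policy did not have sufficient permissions.
Following that documentation, I changed my policy to this, and everything works as expected!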
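The updated policy itself is not reproduced here. As a minimal sketch of what such a policy might look like (an assumption, not necessarily the exact policy used), the key change is that `iam:PassRole` covers the role the task definition references rather than the EventBridge role itself. It would replace the `eventbridge_policy` resource shown earlier.

```hcl
resource "aws_iam_role_policy" "eventbridge_policy" {
  name = "eventbridge-ecs-policy"
  role = aws_iam_role.eventbridge_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        # Allow EventBridge to start the scheduled task definition.
        Action   = "ecs:RunTask"
        Effect   = "Allow"
        Resource = [aws_ecs_task_definition.prod_cron_engage_task.arn]
      },
      {
        # Assumption: EventBridge has to pass the role(s) used by the task
        # (here one role serves as both execution and task role), and only
        # to the ECS tasks service.
        Action   = "iam:PassRole"
        Effect   = "Allow"
        Resource = [aws_iam_role.ecs_task_execution_role.arn]
        Condition = {
          StringLike = {
            "iam:PassedToService" = "ecs-tasks.amazonaws.com"
          }
        }
      }
    ]
  })
}
```

Because the target also sets `tags`, the role may additionally need `ecs:TagResource`; that is worth checking against the current ECS documentation rather than assuming.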