How to schedule ECS Services in AWS easily (start/stop)

This is my solution, which is largely based on the work of a great AWS employee (Alfredo J).

If you are reading this article, you probably already know that this cannot be done in AWS without a workaround. You might think of using a scheduled task or some more complex solution, but after a while, Alfredo from Mexico helped me put together this one so I could share it with all of you.

First, create a Policy:

And choose the JSON option:

In the editor, add the following policy, which allows the scheduler to write logs and to update your ECS services programmatically.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:*"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "ecs:ListServices",
                "ecs:ListTagsForResource",
                "ecs:DescribeServices",
                "ecs:UpdateService"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}

It can be called ServicessShedulerPolicy.

After the policy is created, you must create an IAM role:

Attach your new policy (ServicessShedulerPolicy) to this role. It can be called ECSSchedulerRole.
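For Lambda to be able to use ECSSchedulerRole, the role's trust relationship has to allow the Lambda service to assume it. The console normally sets this up when you pick Lambda as the trusted service; if you create the role some other way, the trust policy would look roughly like the one this small sketch prints. The variable name trust_policy is only illustrative.

```python
import json

# Trust policy that lets the Lambda service assume the role.
# This is the usual shape for a Lambda execution role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=4))
```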

The next step is to define a tag on each ECS service you want to schedule:

I tend to use ecs as the key and auto-scheduler as the value, but you can use whatever key and value fit your needs best.
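If you have many services, tagging them from a script may be quicker than using the console. The following is a minimal sketch, assuming the ECS client comes from boto3.client("ecs") and that whoever runs it is allowed to call ecs:TagResource. The helper name tag_service is hypothetical; the client is passed in as a parameter so the function can be tried against a stand-in object.

```python
def tag_service(ecs_client, service_arn, key="ecs", value="auto-scheduler"):
    """Attach the scheduler tag to one ECS service.

    ecs_client is expected to be boto3.client("ecs"); service_arn is the
    full ARN of the service to tag.
    """
    ecs_client.tag_resource(
        resourceArn=service_arn,
        tags=[{"key": key, "value": value}],
    )
    # Return the tag that was applied, mostly for logging.
    return {"key": key, "value": value}
```

With real credentials, a call could look like tag_service(boto3.client("ecs", region_name="us-west-1"), service_arn), where service_arn is the ARN of your own service.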

Next, you need to create a Lambda function choosing Python 3.8 as your programming language:

Then, in the permissions section, choose the new IAM role (ECSSchedulerRole) as the execution role:

As soon as your function has been created, add the Python script:

import json
import os
import boto3
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Configuration comes from the Lambda environment variables.
region = os.environ['REGION']
cluster_name = os.environ['CLUSTER']
client = boto3.client('ecs', region_name=region)

def lambda_handler(event, context):
    key = os.environ['KEY']
    value = os.environ['VALUE']
    # The rule input sends the count as a string, for example "0".
    service_desired_count = int(event["service_desired_count"])

    # The paginator follows nextToken on its own, so the key is never read
    # directly; it is absent whenever the results fit in a single page.
    paginator = client.get_paginator('list_services')
    response_iterator = paginator.paginate(cluster=cluster_name)
    for page in response_iterator:
        for service_arn in page['serviceArns']:
            resp2 = client.list_tags_for_resource(resourceArn=service_arn)

            # Update only the services that carry the configured tag.
            for tag in resp2.get('tags', []):
                if tag['key'] == key and tag['value'] == value:
                    service_name = service_arn.split('/')[-1]

                    client.update_service(
                        cluster=cluster_name,
                        service=service_name,
                        desiredCount=service_desired_count
                    )

                    logger.info("Updated {0} service in {1} cluster with desired count set to {2} tasks".format(service_name, cluster_name, service_desired_count))

    return {
        'statusCode': 200,
        'new_desired_count': service_desired_count
    }
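One way to exercise this logic before deploying is to pass the ECS client in as a parameter and run it against a stand-in object. The sketch below is not the Lambda function itself: scale_tagged_services and FakeECS are hypothetical names, the ARNs are made up, and the fake only imitates the four client calls the script relies on.

```python
def scale_tagged_services(ecs_client, cluster, key, value, desired_count):
    """Set desiredCount on every service in `cluster` that carries the tag."""
    updated = []
    paginator = ecs_client.get_paginator("list_services")
    for page in paginator.paginate(cluster=cluster):
        for arn in page["serviceArns"]:
            tags = ecs_client.list_tags_for_resource(resourceArn=arn).get("tags", [])
            if any(t["key"] == key and t["value"] == value for t in tags):
                name = arn.split("/")[-1]
                ecs_client.update_service(
                    cluster=cluster, service=name, desiredCount=desired_count
                )
                updated.append(name)
    return updated


class FakeECS:
    """Hypothetical stand-in for boto3.client('ecs'), for local runs only."""

    def __init__(self, tags_by_arn):
        self.tags_by_arn = tags_by_arn
        self.updates = []

    def get_paginator(self, operation_name):
        return self  # the fake acts as its own paginator

    def paginate(self, cluster):
        # A single page without nextToken, like a small real cluster.
        yield {"serviceArns": list(self.tags_by_arn)}

    def list_tags_for_resource(self, resourceArn):
        return {"tags": self.tags_by_arn[resourceArn]}

    def update_service(self, cluster, service, desiredCount):
        self.updates.append((cluster, service, desiredCount))


fake = FakeECS({
    "arn:aws:ecs:us-west-1:123456789012:service/demo/web": [
        {"key": "ecs", "value": "auto-scheduler"}
    ],
    "arn:aws:ecs:us-west-1:123456789012:service/demo/db": [],
})
print(scale_tagged_services(fake, "demo", "ecs", "auto-scheduler", 0))
```

Only the tagged service should appear in the printed list, and fake.updates records what would have been sent to ECS.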

In the configuration of the Lambda function, you must define your Environment variables:

Where:
  • CLUSTER is the name of your cluster.
  • KEY is the key defined in your ECS service (tags).
  • REGION is your region like us-west-1.
  • VALUE is the value defined in your ECS service (tags). 
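Because the script reads these variables with os.environ[...], a missing one surfaces as a bare KeyError. A small check like the following could give a clearer message; it is only a sketch, and load_config and REQUIRED_VARS are hypothetical names that are not part of the original script.

```python
import os

REQUIRED_VARS = ("CLUSTER", "KEY", "REGION", "VALUE")


def load_config(env=os.environ):
    """Return the four settings, or raise one error naming every missing variable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```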
After everything is created, you need to create some rules in Amazon EventBridge (formerly CloudWatch Events). Each rule defines the schedule, written as a cron expression, on which your Lambda function is triggered.
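EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year), and rule schedules are evaluated in UTC. As an illustration, the sketch below builds two weekday expressions, one for starting and one for stopping; the helper weekday_cron and the chosen times are assumptions, not values from this setup.

```python
def weekday_cron(hour, minute=0):
    """Build an EventBridge cron expression for Monday to Friday at hour:minute UTC."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minutes hours day-of-month month day-of-week year.
    # Day-of-month is "?" because day-of-week is the field being set.
    return f"cron({minute} {hour} ? * MON-FRI *)"


start_rule = weekday_cron(8)       # start the services at 08:00 UTC
stop_rule = weekday_cron(20, 30)   # stop the services at 20:30 UTC
print(start_rule)
print(stop_rule)
```

You would then create one rule per expression, so starting and stopping each get their own schedule.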

In this section, choose your Lambda function as the target and configure its input by choosing Constant (JSON text). In the JSON, define the desired number of tasks:

{
  "service_desired_count": "0"
}

Where:
  • service_desired_count is the desired number of tasks for each tagged service. 0 stops the service(s); any other number starts them with that many tasks.
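Before waiting for the schedule to fire, it can help to invoke the function once by hand with the same JSON shape. This is a rough sketch, assuming lambda_client comes from boto3.client("lambda"); the helper name invoke_scheduler and any function name you pass to it are hypothetical.

```python
import json


def invoke_scheduler(lambda_client, function_name, desired_count):
    """Invoke the function once with the same JSON shape the rule sends."""
    payload = json.dumps({"service_desired_count": str(desired_count)})
    response = lambda_client.invoke(
        FunctionName=function_name,
        Payload=payload.encode("utf-8"),
    )
    # boto3 returns the function's result as a stream of JSON bytes.
    return json.loads(response["Payload"].read())
```

Calling it with a count of 0 should scale the tagged services down, and the returned dictionary should echo the count in new_desired_count.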
If something fails, double-check that the IAM role you created has the required permissions, such as ecs:UpdateService. You can check this in the Lambda function's logs.

Comments

  1. Hi @Federico Navarrete,

    Thank you for the blog. When I execute the Python script, it fails with the error below:
    {
    "errorMessage": "'nextToken'",
    "errorType": "KeyError",
    "stackTrace": [
    " File \"/var/task/lambda_function.py\", line 19, in lambda_handler\n next_token = (response['nextToken'])\n"
    ]
    }

    Replies
    1. Hi Krishna, the KeyError occurs because the nextToken key is not always present in the response from the list_services call. This can happen for a few reasons:

      1) Single Page of Results: If the list_services call returns all results in a single response, there won't be a nextToken in the response. The nextToken is only included when there are more results to be retrieved in subsequent calls.

      2) No Services: If there are no services in the ECS cluster, the response will not include a nextToken (and also serviceArns will be an empty list).

      Are you sure the cluster has any services? It looks like a cluster configuration issue.
