Before Lambda was announced, if you wanted to schedule EC2 instances on and off to save costs, creating an extra micro or nano instance as “the lord of the instances” to rule other instances was kind of a no-brainer. However, the lord of the instances has its dark side. In order to use the AWS CLI to control other resources, an Internet connection is a must. Not only that, this instance has to run 24×7. So, with the force of internet connection plus 24×7 plus the power of controlling your EC2 instances, this tiny lord of the instances could potentially unleash its dark side and become the Darth Vader to ruin your entire EC2 fleet.
However, the new AWS service Lambda, which hopefully will be available in Sydney this year, can actually be the lord of the instances but without the evil side. With Lambda, you can put your fleet of EC2s in safe hands through “the true lord of the instances”.
In this post, I will show you a few basic moves for using your Lambda “force” to schedule EC2 instances on and off.
To start with, I need two Lambda functions, each performing only one action: starting instances or shutting them down. I named them START-TaggedEC2-test and STOP-TaggedEC2-test. The two functions are basically the same except for their filter settings and the action they perform. The Python script I used is based on M. Lapidakis’ GitHub code EC2-Tag-Assets-Lambda.py. The script goes through all instances in the same account. If an instance meets both criteria written in the script, its state and its identifying tag, the Lambda function performs its action, start or stop, on that instance. For example, for START-TaggedEC2-test I use the tag name “AutoOn” with value “TRUE” as the tag filter to identify the wanted instances, and I check the instance state so that only stopped instances are acted on. I also create an IAM role called “lambda_basic_execution” using the Lambda role creation wizard that pops up, select this role, leave the other settings on the function creation page at their defaults, and create the function.
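As a rough illustration of what such a START function can look like, here is a minimal sketch using boto3. It is not the original EC2-Tag-Assets-Lambda.py script: the helper names (find_tagged_instances, start_tagged_instances) are my own, made up for illustration, and it makes a single describe_instances call, so a large fleet would need pagination on top of this.

```python
# Hypothetical sketch of START-TaggedEC2-test, not the original script.
TAG_KEY = "AutoOn"   # tag name used as the identifier in the post
TAG_VALUE = "TRUE"   # tag value the script filters on


def find_tagged_instances(ec2, tag_key, tag_value, state):
    """Return the IDs of instances that carry the tag and are in `state`."""
    response = ec2.describe_instances(
        Filters=[
            {"Name": "tag:" + tag_key, "Values": [tag_value]},
            {"Name": "instance-state-name", "Values": [state]},
        ]
    )
    return [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]


def start_tagged_instances(ec2):
    """Start every stopped instance tagged AutoOn=TRUE; return their IDs."""
    instance_ids = find_tagged_instances(ec2, TAG_KEY, TAG_VALUE, "stopped")
    if instance_ids:  # start_instances rejects an empty ID list
        ec2.start_instances(InstanceIds=instance_ids)
    return instance_ids


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tried without AWS.
    import boto3

    return start_tagged_instances(boto3.client("ec2"))
```

A STOP variant along these lines would differ only in the tag filter (AutoOff), the state filter (running) and the call (stop_instances). Note that EC2 filter values are case-sensitive, so the value in the script has to be written exactly as it appears on the instance tag.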
Use the same settings with a slightly different script to create the Lambda function STOP-TaggedEC2-test. When both functions are in place, go to the IAM console and attach the following policies to the role so that the Lambda functions have the privileges to access EC2 instances.
Note: In an actual production environment, we recommend using an individual role for each function and restricting its policy to just enough for the job. That way, if one function goes wrong, the damage is limited to the resources it has privileges over.
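To make the "just enough for the job" idea concrete, here is one possible shape for a per-function policy, written as a small Python helper that prints the JSON. This is my own sketch, not the policies attached in this test: the build_policy helper is hypothetical, and scoping by the ec2:ResourceTag condition key with the value “TRUE” is an assumption about how you might want to restrict each function to its own tagged instances.

```python
import json


def build_policy(action, tag_key, tag_value):
    """Build an IAM policy document allowing one EC2 action on tagged instances."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed so the function can list and filter instances.
                "Effect": "Allow",
                "Action": "ec2:DescribeInstances",
                "Resource": "*",
            },
            {
                # The single action this function is allowed to perform,
                # limited to instances that carry the matching tag.
                "Effect": "Allow",
                "Action": action,
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {"ec2:ResourceTag/" + tag_key: tag_value}
                },
            },
        ],
    }


# Assumed tag values; adjust them to match how the instances are tagged.
start_policy = build_policy("ec2:StartInstances", "AutoOn", "TRUE")
stop_policy = build_policy("ec2:StopInstances", "AutoOff", "TRUE")

print(json.dumps(start_policy, indent=2))
```

With a policy like this on its own role, the START function could not stop anything and the STOP function could not start anything, which is the kind of separation the note above is after.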
To trigger these two Lambda functions, I use CloudWatch. On the Rules page of the CloudWatch console, create a new rule for each Lambda function with the settings shown in the following diagram. In this test, rule No.1 triggers START-TaggedEC2-test every 3 hours, and rule No.2 triggers STOP-TaggedEC2-test every 67 minutes. (I trigger the STOP function every 67 minutes to avoid triggering the START and STOP functions at the same time.)
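For reference, the two schedules above correspond to rate() schedule expressions, and a little arithmetic shows why 67 minutes is a handy choice. The sketch below is only illustrative: rate_expression and coincidence_interval are hypothetical helper names, and the calculation assumes both rules happened to fire in the same minute once, which in practice depends on when each rule was enabled.

```python
from math import gcd


def rate_expression(value, unit):
    """Build a rate() schedule expression; the unit is singular only for 1."""
    if unit not in ("minute", "hour", "day"):
        raise ValueError("unit must be 'minute', 'hour' or 'day'")
    if value < 1:
        raise ValueError("value must be a positive integer")
    return "rate({} {})".format(value, unit if value == 1 else unit + "s")


def coincidence_interval(a_minutes, b_minutes):
    """Minutes between two moments at which both schedules fire together."""
    return a_minutes * b_minutes // gcd(a_minutes, b_minutes)


start_schedule = rate_expression(3, "hour")     # rule No.1
stop_schedule = rate_expression(67, "minute")   # rule No.2

print(start_schedule)
print(stop_schedule)
print(coincidence_interval(3 * 60, 67))
```

Because 67 is prime and does not divide 180, the two schedules share no common factor, so even if they lined up once they would not line up again for 12060 minutes, or 201 hours.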
To test the auto-start and auto-stop Lambda functions I have created, I set up the following environment:
Default VPC with internet gateway (default-vpc)
    1 default subnet
    1 instance with public IP address
        instance name: Lambda-test-ec2-public
        instance ID: i-eec1b561
        Tag: AutoOn: False
        Tag: AutoOff: True
New VPC without internet gateway (lambda-test-vpc)
    1 private subnet with local route only
    1 instance with private IP only
        instance name: Lambda-test-ec2-private
        instance ID: i-8e836e2a
        Tag: AutoOn: True
        Tag: AutoOff: True
In this environment, we have two different VPC settings. The default-vpc has internet access while the lambda-test-vpc is totally isolated from public access.
With the whole setup finalized, the rules for the Lambda functions enabled, and the two test instances running, I leave the test stack running. Ideally, instance Lambda-test-ec2-public should be stopped the next time STOP-TaggedEC2-test runs; instance Lambda-test-ec2-private should be started every three hours and remain off most of the time, because STOP-TaggedEC2-test runs every 67 minutes.
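The expected behaviour can be worked out with a small simulation before looking at any graphs. This is a toy model under loud assumptions: both rules are taken to be enabled at minute 0 and to fire first after one full interval, both instances start out running, the simulate function is a hypothetical name of mine, and tag values are compared case-insensitively for simplicity (the script value is written TRUE while the tag list shows True).

```python
# Tags copied from the test environment described in this post.
INSTANCE_TAGS = {
    "Lambda-test-ec2-public": {"AutoOn": "False", "AutoOff": "True"},
    "Lambda-test-ec2-private": {"AutoOn": "True", "AutoOff": "True"},
}


def is_true(value):
    """Case-insensitive tag check (a simplification for this toy model)."""
    return value.upper() == "TRUE"


def simulate(total_minutes, start_every=180, stop_every=67):
    """Return, per instance, the (started, stopped) minute windows it ran."""
    running_since = {name: 0 for name in INSTANCE_TAGS}  # all running at t=0
    windows = {name: [] for name in INSTANCE_TAGS}
    for minute in range(1, total_minutes):
        if minute % stop_every == 0:  # STOP-TaggedEC2-test fires
            for name, tags in INSTANCE_TAGS.items():
                if is_true(tags["AutoOff"]) and running_since[name] is not None:
                    windows[name].append((running_since[name], minute))
                    running_since[name] = None
        if minute % start_every == 0:  # START-TaggedEC2-test fires
            for name, tags in INSTANCE_TAGS.items():
                if is_true(tags["AutoOn"]) and running_since[name] is None:
                    running_since[name] = minute
    return windows


one_day = simulate(24 * 60)
for name, runs in one_day.items():
    print(name, runs)
```

Under these assumptions the public instance runs only until the first STOP at minute 67, while the private instance comes back every 180 minutes and runs until the next STOP, for example from minute 180 to 201 and from 360 to 402. Real timings will shift with the moment each rule was actually enabled.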
The next day, I have the following CPU utilization details:
The following diagram shows instance Lambda-test-ec2-public. The instance uses no CPU at all after approximately 7:30 UTC on 14 March. Because this instance’s AutoOn tag is set to False, the Lambda function START-TaggedEC2-test ignores it at each run. As a result, the instance stays in the stopped state after being turned off by the STOP function.
From the graph below, we can easily see that CPU usage on instance Lambda-test-ec2-private appears every 3 hours and disappears after a while. This indicates that the two Lambda functions start the instance every 3 hours and turn it off on the 67-minute schedule.
Although only two basic functions are involved in this test, it gives a rough idea of how good Lambda is. It works even though instance Lambda-test-ec2-private cannot access the internet or be accessed from it at all.
Without the cost of keeping an instance running 24×7 or a heavy dependency on internet access, the amazing Lambda service can easily manage your fleet of EC2s. Not only that, we can give each Lambda function the strictest privileges to keep the damage to a minimum if something goes wrong. Also, unlike the “lord of the instances”, which needs VPC peering to spread its power to other VPCs in the same account, Lambda can manage all resources in your account without any special connection requirement. What a fancy service!