This procedure was created to limit and control instance deployment. The need for it arose from an AWS account under our management that had a significant amount of on-demand resources available. Unfortunately, a very common pattern is that people simply assign AdministratorAccess or AmazonEC2FullAccess to users in order to sidestep the sometimes complicated details of IAM policies.
Background
AWS accounts have a default limit of twenty on-demand instances for basic instance types. In other words, if you have not purchased or reserved capacity, you cannot start more than twenty machine instances in an individual region.
This is a fairly flexible number, but it still leaves some room for problems. In particular, there are some very large instances with a significant per-hour cost. As of January 2018, the largest p3 instances cost over $24/hr (although they are separately limited in number). Starting more pedestrian instances by the hundreds can dwarf even that cost.
Since the limits on this particular account had been set fairly high, and the workloads were well understood, we moved to limit instance size deployment using IAM policies tied to an IAM group. Anyone in the group would have the ability to deploy EC2 instances but only smaller sizes. If the need arose to deploy larger instances, the idea was that a role would be defined so that through a conscious effort, larger instances would be available.
This approach protects against both unexpected deployments and automation mistakes. In a production environment, it would be reasonable to place certain instance types completely off-limits.
IAM Structures
It is very helpful to organize permissions with IAM user groups instead of attaching policies to users directly, since a number of IAM entities are involved. The complete list is:
- A user group for attaching policies.
- A policy managing a specific set of actions, attached to the user group.
- A role that lets a user elevate to a higher-level set of permissions when desired.
- A policy detailing the elevated permission set, attached to the role.
Workflow
Since the same policy that limits permissions is used to grant access to the role, we need to create the role first. Every entity in AWS is given an Amazon Resource Name or ARN, which is used to refer to the entity. By creating the role first, we are able to gather and fill in the ARN for the role in the new policy. We can then define the main policy and attach it to the user group.
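Since the role's ARN has to be copied into the group policy, it can help to see how an ARN is put together. The following is a minimal sketch, assuming the general layout arn:partition:service:region:account-id:resource; the helper names parse_arn and role_arn are illustrative and not part of any AWS tooling, and the account ID is the placeholder used later in this article.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its colon-separated fields."""
    # The resource field may itself contain colons, so split at most 5 times.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])


def role_arn(account_id: str, role_name: str) -> str:
    """Build the ARN of an IAM role. IAM is global, so the region field is empty."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


if __name__ == "__main__":
    arn = role_arn("123456789012", "ec2-all-instance-launch-cap")
    print(arn)
    print(parse_arn(arn))
```

In practice you would copy the ARN from the role's page in the console rather than build it by hand; the sketch only shows which parts of the string identify the account and the role.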
To begin, navigate to your AWS console, and select the IAM service.
Role for Overriding Restriction
We need to create the role that will allow us to switch permission sets. The name of the role in our example is ec2-all-instance-launch-cap, but you can select any name you like.
The role needs to have its own set of policies attached. Keep in mind that switching to the role replaces all policies of the user. There is no blending of the original permission set for the duration the role is held. The ability to switch to the role is a specific action, which needs to be permitted, and will be allowed in the next policy.
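Besides its permission policies, a role carries a trust policy that states who may assume it. The article does not show one, so the following is only a sketch of what such a document typically looks like when the trusted entity is your own account; the function name trust_policy and the account ID 123456789012 are placeholders chosen for illustration.

```python
import json


def trust_policy(account_id: str) -> dict:
    """Return a trust policy that lets principals in one account assume the role.

    Assumption: the role is used from within the same account. Trusting the
    account only makes the role assumable in principle; each user still needs
    an explicit sts:AssumeRole permission of their own.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy("123456789012"), indent=2))
```

With a trust policy of this shape, the Allow statement for sts:AssumeRole in the group policy is what actually decides which users can switch to the role.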
Role Policy
For the role, we simply attach the AWS managed policy AmazonEC2FullAccess. This means that by switching roles, the user is able to elevate to full EC2 admin rights. Because the role's permissions replace the user's, any other policy attached to the group must also be attached to the role if it is needed to perform a given action. Since the role is to be used only to override the block on the run request, we only need to attach the EC2 access policy. If the expectation is that other operations will be performed while holding the role (like updating DNS or creating network objects), those policies would need to be attached as well.
Once you create the role, navigate to the role definition and copy the ARN of the role. It will be used in the next step.
Policy for Limits
The user group defines all of the policies for the user and can have multiple policies attached. One of these policies sets up the access control for starting instances. Instead of listing every instance type that may be run, we use a Deny effect in the statement, which allows for a short and simple policy. This has the further effect of preventing any other policy from overriding the block, since an explicit Deny takes precedence over any Allow. You will need to modify the policy by adding the ARN of the role you created in the previous step (where the policy listing shows ARN-OF-ROLE).
The Policy
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "overrideBlockOnReq",
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "ARN-OF-ROLE"
        },
        {
            "Sid": "limitedSize",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "ForAnyValue:StringNotLike": {
                    "ec2:InstanceType": [
                        "*.nano",
                        "*.small",
                        "*.micro",
                        "*.medium"
                    ]
                }
            }
        }
    ]
}
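To see which instance types the limitedSize statement blocks, the condition can be approximated locally. This is a rough sketch, not a reimplementation of IAM's evaluation logic: it assumes that the deny applies whenever the requested type matches none of the listed wildcard patterns, and it uses Python's fnmatchcase as a stand-in for IAM's case-sensitive wildcard matching. The function name is_denied is made up for the example.

```python
from fnmatch import fnmatchcase

# Patterns copied from the "limitedSize" statement above.
ALLOWED_PATTERNS = ["*.nano", "*.small", "*.micro", "*.medium"]


def is_denied(instance_type: str, patterns=ALLOWED_PATTERNS) -> bool:
    """Return True if the instance type matches none of the allowed patterns.

    Approximation of the StringNotLike condition: the Deny takes effect
    only when no pattern in the list matches the requested type.
    """
    return not any(fnmatchcase(instance_type, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["t2.micro", "t2.medium", "m4.large", "p3.16xlarge"]:
        verdict = "denied" if is_denied(candidate) else "not denied"
        print(f"{candidate}: {verdict}")
```

Note that "not denied" is not the same as "allowed": the user still needs an Allow for ec2:RunInstances from some other policy attached to the group.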
Procedures Around this Configuration
Detecting Role Usage
This elevation is recorded in CloudTrail. For a CLI request, the use of the role is not logged as a separate event, but appears as part of the RunInstances event. If you look at the raw event record, the difference shows up in the userIdentity section.
Action using user identity:
...
"userIdentity": {
    "type": "IAMUser",
    "principalId": "AAAAAAAAAAAAAAAAAAAAA",
    "arn": "arn:aws:iam::123456789012:user/loginuser",
    "accountId": "123456789012",
    "accessKeyId": "AAAAAAAAAAAAAAAAAAAA",
    "userName": "loginuser"
},
...
Action using role permissions:
...
"userIdentity": {
    "type": "AssumedRole",
    "principalId": "AAAAAAAAAAAAAAAAAAAAA:AWS-CLI-session-1234567890",
    "arn": "arn:aws:sts::123456789012:assumed-role/ec2-all-instance-launch-cap/AWS-CLI-session-1234567890",
    "accountId": "123456789012",
    "accessKeyId": "BBBBBBBBBBBBBBBBBBBB",
    "sessionContext": {
        "attributes": {
            "mfaAuthenticated": "false",
            "creationDate": "2018-01-17T15:50:17Z"
        },
        "sessionIssuer": {
            "type": "Role",
            "principalId": "AAAAAAAAAAAAAAAAAAAAA",
            "arn": "arn:aws:iam::123456789012:role/ec2-all-instance-launch-cap",
            "accountId": "123456789012",
            "userName": "ec2-all-instance-launch-cap"
        }
    }
},
...
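If you export CloudTrail records and want to flag the launches that used the override, the two userIdentity shapes above can be told apart with a few lines of code. The sketch below assumes each record has already been loaded into a Python dict and relies only on the fields shown in the excerpts; describe_identity is a hypothetical helper name.

```python
def describe_identity(user_identity: dict) -> str:
    """Summarize who made a request, based on a CloudTrail userIdentity block."""
    id_type = user_identity.get("type")
    if id_type == "AssumedRole":
        issuer = user_identity.get("sessionContext", {}).get("sessionIssuer", {})
        role = issuer.get("userName", "unknown-role")
        # The session name is the last path component of the assumed-role ARN.
        session = user_identity.get("arn", "").rsplit("/", 1)[-1]
        return f"role:{role} session:{session}"
    if id_type == "IAMUser":
        return f"user:{user_identity.get('userName', 'unknown-user')}"
    return f"other:{id_type}"


if __name__ == "__main__":
    direct = {"type": "IAMUser", "userName": "loginuser"}
    print(describe_identity(direct))
```

Filtering RunInstances events for identities that start with "role:" would then list every launch made while the override role was held.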
Using the Role: Switching to the Role in the CLI
We assume you already have a profile configured for the AWS CLI client. If not, set one up using the aws configure command.
Edit the file ~/.aws/config and add a stanza for the account that will use the role as follows:
# Note: the source profile is either the name of another stanza or "default" if it is the default profile.
[profile prof-override]
role_arn = ARN-OF-ROLE
source_profile = default
You will now be able to override the block using the alternate profile:
aws ec2 run-instances --dry-run --profile prof-override --image-id ami-cb9ec1b1 --instance-type m4.large
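If you would rather generate the profile stanza than edit the file by hand, the standard library's configparser can produce it. This is a minimal sketch that works on the file's text rather than on ~/.aws/config itself; add_override_profile is an illustrative name, and rewriting a file this way discards any comments it contained, so treat it as a starting point only.

```python
import configparser
import io


def add_override_profile(config_text: str, profile: str, role_arn: str,
                         source_profile: str = "default") -> str:
    """Return config_text with a role-switching profile stanza added or updated."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    section = f"profile {profile}"
    if not parser.has_section(section):
        parser.add_section(section)
    parser[section]["role_arn"] = role_arn
    parser[section]["source_profile"] = source_profile
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    existing = "[default]\nregion = us-east-1\n"
    print(add_override_profile(existing, "prof-override", "ARN-OF-ROLE"))
```

The output keeps the existing sections and appends a [profile prof-override] stanza with the same two keys as the hand-written version above.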
Using the Role: Switching to the Role in the Console
The AWS console provides a friendlier working environment. To switch roles in the console, go to the role definition under IAM and locate the switch-role URL shown there. Share that URL with anyone who might be expected to switch roles. The format is fairly simple, and you can build it yourself if need be. After opening the URL, click Switch Role to make the transition.
In the current version of the console, your username changes color to indicate that you are now in a role. When you are done, select your username at the top of the console page and choose Back to username from the drop-down menu.
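For building the switch-role URL yourself, the following sketch assumes the usual sign-in link format with account and roleName query parameters plus an optional displayName; the function name switch_role_url is made up for the example, and the link shown on your role's page in the console remains the authoritative version.

```python
from typing import Optional
from urllib.parse import urlencode


def switch_role_url(account: str, role_name: str,
                    display_name: Optional[str] = None) -> str:
    """Build a console switch-role link for the given account ID (or alias) and role."""
    params = {"account": account, "roleName": role_name}
    if display_name:
        # Optional label shown in the console while the role is held.
        params["displayName"] = display_name
    return "https://signin.aws.amazon.com/switchrole?" + urlencode(params)


if __name__ == "__main__":
    print(switch_role_url("123456789012", "ec2-all-instance-launch-cap"))
```

The account ID is the same placeholder used in the CloudTrail excerpts; substitute your own account ID or alias.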
What does this policy not protect us from?
- This does not impose limits on the size of instances deployed through other services, most importantly through an auto-scaling group.
- We are not limiting in any way the total number of instances spun up by a given user. The normal account limits help here, but a policy alone has no way to determine how many instances are currently running.
- We're not capping costs, just adding some protections. If you want to set alerts for cost overruns or spikes, you need to look into the Billing controls.
- We do not require multi-factor authentication (MFA) for the role switch. The assumption is that users logging in are already using some form of second factor for authentication. This choice is mostly due to the increased complexity of using the CLI with MFA, and the fact that the role change only allows an override rather than an elevation to another security domain. You might want to add the requirement; it is not difficult to add, and can bring reasonable protections.
- We are only looking at instance startup, but you may also want to consider who can shut down or terminate instances, as that may result in downtime or data loss.
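As an illustration of the MFA point in the list above, one common way to require a second factor for the role switch is a condition on the aws:MultiFactorAuthPresent key in the statement that permits sts:AssumeRole. The sketch below adds that condition to a statement held as a Python dict; require_mfa is a hypothetical helper, and where you place the condition (the role's trust policy is the usual spot) is a design decision the article leaves open.

```python
import copy


def require_mfa(statement: dict) -> dict:
    """Return a copy of a policy statement that additionally requires MFA."""
    updated = copy.deepcopy(statement)  # leave the caller's statement untouched
    condition = updated.setdefault("Condition", {})
    condition.setdefault("Bool", {})["aws:MultiFactorAuthPresent"] = "true"
    return updated


if __name__ == "__main__":
    allow_switch = {
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
        "Action": "sts:AssumeRole",
    }
    print(require_mfa(allow_switch))
```

For CLI use, the profile that assumes the role would also need an mfa_serial entry naming the user's MFA device, so that the CLI can prompt for a code when it assumes the role.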
Additional Options
It's an easily tailored recipe and there are a number of additional conditions that you could add or check. You might also consider:
- Should the region be limited?
  - In most cases it would be unusual to place instances outside of a home country or territorial region, and doing so might involve legal/privacy ramifications.
- Is there a benefit to restricting certain instance sizes outright?
  - Some of the very large instances are really for specific use cases such as HPC, data mining, or FPGA computing.
- Is the time of day of importance?
  - Would it be reasonable for someone to start instances during the middle of the night?
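The region question from the list above lends itself to the same Deny pattern as the size limit. The following sketch generates an extra statement that blocks ec2:RunInstances outside a chosen set of regions. It assumes the global condition key aws:RequestedRegion, which the original policy does not use; the Sid and the function name deny_outside_regions are invented for the example.

```python
import json
from typing import Iterable


def deny_outside_regions(allowed_regions: Iterable[str]) -> dict:
    """Build a Deny statement for instance launches outside the allowed regions."""
    return {
        "Sid": "limitedRegions",
        "Effect": "Deny",
        "Action": "ec2:RunInstances",
        "Resource": "*",
        "Condition": {
            # With several values, the condition holds only when the requested
            # region equals none of them.
            "StringNotEquals": {"aws:RequestedRegion": sorted(allowed_regions)}
        },
    }


if __name__ == "__main__":
    print(json.dumps(deny_outside_regions(["us-east-1", "ca-central-1"]), indent=4))
```

The resulting statement could be appended to the Statement list of the group policy. As with the size limit, the override role would bypass it, because the group's policies do not apply while the role is held.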
Conclusion
I've detailed how to use an IAM policy to protect against unintended operations, and have included a role so that the restriction adds mindfulness to an operation without reducing the effectiveness of anyone doing the work. In the Unix world, it took many years for sudo to be accepted, and even more for it to be carefully configured on a regular basis. Hopefully, as you begin using AWS services, you will come to understand your usage and how to reduce risks before an incident shows you why these controls can be so valuable.