A gentle introduction to scripting Amazon EC2

2019 January 18

When I set out to use Amazon Web Services’ (AWS) cloud offering, Elastic Compute Cloud (EC2), it was to solve a simple problem: run a process (e.g. a web server, a compiler, or a test suite) on someone else’s machine. I was surprised by, and ill-prepared for, the amount of boilerplate setup required to reach that point.

AWS has loads of very thorough documentation, but digging through it can be tedious. Further, I prefer command-line tools, but most of the documentation, including the Quick Start Guide and 10-Minute Tutorial for EC2, is centered on their browser application. The document I’ve written here is the introduction I wish I had when I started using AWS.

AWS interfaces

There are three interfaces to AWS: a REST API, the Console (a browser application in front of the REST API), and the CLI (a command-line tool in front of the REST API). Note that “the Console” will be carefully and deliberately capitalized both here and in the official AWS documentation to indicate the browser application. Do not confuse it with your terminal.

Before using the Console, you’ll need an account with a password. Before using the API or CLI, you’ll need an access key (and its secret key), which you can create in the Console.

Create a user

When you sign up for AWS, you’ll start with a root user with unlimited permissions, including access to your billing information. (You must provide billing information to sign up, even for the free tier, but it will not be charged until you exceed the free tier.) The AWS best practices recommend that you never use your root user for anything except to create other users. We’ll create a user, Developer, who can manage only EC2 resources, and only from the REST API or CLI (it will have an access key but no password).

Another best practice is to never assign permissions to users directly, but only through their group membership, which means we’ll start by creating a group. In the Console, navigate to the Identity and Access Management (IAM) service. Amazon has helpfully placed a “Security Status” checklist at the bottom of that page, which you can follow instead (its steps are a superset of my instructions here). Next, step into the Groups section.

Click “Create New Group”.
Enter “Developers”, then click “Next Step”.
Search for “AmazonEC2FullAccess”, select it, then click “Next Step”.
Click “Create Group”.

Do not worry that these permissions are too limited. They are enough for this introduction, and you will be able to add more later.

Next, move to the Users section of IAM.

Click “Add user”.
In the “User name” field, enter “Developer”. For “Access type”, select only “Programmatic access”. Click “Next: Permissions”.
Select the “Developers” group. Click “Next: Tags”.
Click “Next: Review”.
Click “Create user”.

By default, AWS will create an access key for this user and show you both the access key and its secret key on the “Add user” confirmation page. You will only ever be shown a secret key once, when it is created, so now is your only chance to copy it. However, you can always add or delete access keys for a user, so if you miss this one, just create another.

Install and configure the CLI

To install the CLI, you’ll need Python. Installing Python for your platform is out of scope for this introduction. Once you have it, install the AWS CLI with pip:

pip install awscli --upgrade

Once this is finished, you should have the aws command in your $PATH. Next, we’ll take the access key and secret key we saved from creating the Developer user and configure the CLI:

aws configure

You’ll be prompted for the keys, as well as a default region and a default output format (from among json, text, and table).
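As a quick sanity check that the keys were accepted, here is a minimal sketch that asks AWS which identity the CLI is signed in as. The helper name whoami_aws is my own invention; `aws sts get-caller-identity` is a standard CLI command and should work even for a user whose only permissions are for EC2.

```shell
# Print the ARN of the identity that the configured access key belongs to.
# (whoami_aws is a hypothetical helper name, not part of the CLI.)
whoami_aws() {
  aws sts get-caller-identity --query 'Arn' --output text
}
```

If configuration went as described above, running whoami_aws should print an ARN ending in user/Developer.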

Instances

Now comes the juicy part: launching and connecting to machines in EC2. Before we run any commands, however, we’ll need to understand a few concepts.

Markets

There are three primary markets for instances:

On-Demand instances are launched as needed (“on demand”), with no long-term commitment, but with an indefinite lifetime that will not be interrupted (unlike a Spot instance).
Spot instances are launched as needed, with no long-term commitment, but with a limited lifetime that may be interrupted if the spot price exceeds your bid price. Usually much cheaper than On-Demand (~70% off).
Reserved instances are launched upon agreeing to a long-term commitment. Usually much cheaper than On-Demand (~40% off), but not as cheap as Spot.

You need to choose the market before you launch the instance. It cannot be changed after.
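To get a feel for what those discounts mean, here is a small back-of-the-envelope sketch. The hourly price of $0.0116 and the 730 hours per month are assumptions for illustration only, and the 70% and 40% figures are the rough discounts quoted above, not guaranteed rates. Check the current price list for your region and instance type.

```shell
# Estimate a monthly bill from an hourly price and a percentage discount.
# usage: monthly_cost HOURLY_PRICE DISCOUNT_PERCENT
# (monthly_cost is a hypothetical helper; 730 is an assumed hours-per-month.)
monthly_cost() {
  awk -v price="$1" -v off="$2" \
    'BEGIN { printf "%.2f\n", price * 730 * (100 - off) / 100 }'
}

monthly_cost 0.0116 0    # On-Demand: prints 8.47
monthly_cost 0.0116 70   # Spot:      prints 2.54
monthly_cost 0.0116 40   # Reserved:  prints 5.08
```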

Types

An instance’s type defines its combination of CPU, memory, storage, and networking capacity. The free tier lets you launch t2.micro instances, which have 1 CPU, 1 GB RAM, and network performance on the order of 100 Mbps.
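Once the CLI is configured, you can look up these numbers yourself rather than take my word for them. The sketch below assumes a CLI recent enough to include the describe-instance-types command; the helper name type_specs is hypothetical.

```shell
# Print the vCPU count, memory in MiB, and network performance of a type.
# usage: type_specs INSTANCE_TYPE
type_specs() {
  aws ec2 describe-instance-types \
    --instance-types "$1" \
    --query 'InstanceTypes[0].[VCpuInfo.DefaultVCpus, MemoryInfo.SizeInMiB, NetworkInfo.NetworkPerformance]' \
    --output text
}

# Example: type_specs t2.micro
```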

Volumes

You can attach multiple volumes of storage to your instance. There are two types: Elastic Block Store (EBS) and instance store. Just use EBS, which is the default. Instance store volumes are not even available for the free tier (t2.micro) instances. You should only switch from EBS to instance store as an optimization, and only when you know you need it.

Images

Instances are launched with an Amazon Machine Image (AMI). AMIs can be created and shared by anyone, and Amazon offers a few standard AMIs, which you can most easily find by starting the Launch Instance Wizard. Once you launch an instance, you can install the software you need and save the instance as a new AMI, to save yourself the installation step on duplicate instances.
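Saving an instance as a new AMI can also be done from the command line. This is only a sketch: save_image is a hypothetical helper name, and the instance ID and image name in the example are placeholders. Be aware that, by default, create-image reboots the instance to take a consistent snapshot.

```shell
# Save a configured instance as a new AMI and print the new image's ID.
# usage: save_image INSTANCE_ID IMAGE_NAME
save_image() {
  aws ec2 create-image \
    --instance-id "$1" \
    --name "$2" \
    --query 'ImageId' \
    --output text
}

# Example (placeholder values): save_image i-0123456789abcdef0 my-image
```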

You must select an AMI when launching an instance; there is no default image. For our example, we’ll use AMI ami-04328208f4f0cf1fe, an image of Amazon Linux 2, which is Amazon’s customization of Linux, built and tuned for EC2 instances. There are free tier images available for several flavors of Linux and Windows Server that you can try instead.
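One caveat: AMI IDs are specific to a region and are superseded as new images are published, so a hard-coded ID may not work for you. The sketch below looks up the newest Amazon Linux 2 image in your configured region instead of the Launch Instance Wizard. The name pattern amzn2-ami-hvm-*-x86_64-gp2 is my assumption about how Amazon names these images, and latest_al2_ami is a hypothetical helper name.

```shell
# Print the ID of the most recently created Amazon Linux 2 AMI
# in the currently configured region.
latest_al2_ami() {
  aws ec2 describe-images \
    --owners amazon \
    --filters 'Name=name,Values=amzn2-ami-hvm-*-x86_64-gp2' \
              'Name=state,Values=available' \
    --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
    --output text
}
```

The printed ID can be passed to any command that takes an --image-id.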

Regions and Availability Zones

Instances are grouped into availability zones within geographic regions. us-east-2 (which is in Ohio) is generally the cheapest region, but if latency to the instance is a concern for you, you will want to choose a closer region.

The availability zone has no impact on the price of the instance, but if you will be launching several instances that need to communicate with each other, then there is a small charge for network data that crosses availability zones within a region, and a larger charge for network data that crosses regions. Spanning a network over multiple availability zones is better for reliability, but the reliability of a single availability zone is good enough for most small users.

You’ll want to choose your region carefully before launching an instance. Moving an instance across availability zones, let alone regions, is tedious.
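To see which regions and availability zones are on offer, something like the following should work. The helper names list_regions and list_zones are hypothetical; the underlying describe-regions and describe-availability-zones commands are part of the CLI.

```shell
# Print the names of all regions available to your account.
list_regions() {
  aws ec2 describe-regions --query 'Regions[].RegionName' --output text
}

# Print the names of the availability zones in one region.
# usage: list_zones REGION
list_zones() {
  aws ec2 describe-availability-zones \
    --region "$1" \
    --query 'AvailabilityZones[].ZoneName' \
    --output text
}

# Example: list_zones us-east-2
```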

Security Groups

Much as permissions are given to users through their membership in groups, firewalls are configured for instances through their membership in security groups. A security group exists within a Virtual Private Cloud (VPC), which exists within a region. In each region, AWS gives you one default VPC with one default security group. You probably don’t need to worry about creating your own VPC; the security of the default VPC is good enough for most small users. If you want to connect to your instances over SSH, however, you’ll need to edit the default security group (or create a new one) and add a rule to open port 22 over TCP for inbound connections.

Security groups exclusively use whitelists of rules. Each rule has four parameters: inbound or outbound, protocol (e.g. TCP or UDP), port range, and IP address range. A rule says: “Machines in this security group can receive requests from (or send requests to) addresses in this range, on ports in this range, over this protocol.” A single machine can exist in multiple security groups, in which case it gets the union of their whitelists.
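Here is a sketch of the SSH rule expressed with those four parameters: inbound (ingress), TCP, port 22, and an address range of your choosing. It assumes you are using the default security group of the default VPC, which can be referred to by the name default. The helper name open_ssh is hypothetical, and the address in the example is a placeholder from the range reserved for documentation; substitute your own.

```shell
# Allow inbound SSH (TCP port 22) to the default security group
# from the given address range.
# usage: open_ssh CIDR
open_ssh() {
  aws ec2 authorize-security-group-ingress \
    --group-name default \
    --protocol tcp \
    --port 22 \
    --cidr "$1"
}

# Example (placeholder address): open_ssh 203.0.113.7/32
```

A range of 0.0.0.0/0 would admit the whole internet, so a narrow range such as a single /32 address is the more cautious choice.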

You don’t have to be too careful picking your security group before launching an instance. After the instance is launched, you can still change both (a) the security group memberships of the instance and (b) the rules of its security groups.
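For the first kind of change, a rough sketch follows. Both helper names are hypothetical, and the IDs in the example are placeholders. Note that the --groups option replaces the instance’s whole list of security groups rather than adding to it, so list every group you want the instance to keep.

```shell
# Print the ID of the default security group in the default VPC.
default_group_id() {
  aws ec2 describe-security-groups \
    --group-names default \
    --query 'SecurityGroups[0].GroupId' \
    --output text
}

# Replace the security groups of a launched instance.
# usage: set_instance_groups INSTANCE_ID GROUP_ID...
set_instance_groups() {
  target_instance=$1
  shift
  aws ec2 modify-instance-attribute \
    --instance-id "$target_instance" \
    --groups "$@"
}

# Example (placeholder IDs):
#   set_instance_groups i-0123456789abcdef0 "$(default_group_id)" sg-0123abcd
```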

Key Pairs

If you want to connect to your instance over SSH, you’ll need to launch it with a key pair. You must create the key pair before you launch the instance, and save its private half for when you connect. Much like with your API secret key, you’ll be given the private key only once, when the pair is created. You should save it in an identity file.

aws ec2 create-key-pair \
  --key-name my-key \
  --query 'KeyMaterial' \
  --output text \
  > my-key.pem
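One thing to watch for: ssh typically refuses to use a private key file that other users on your machine can read, and a file created by shell redirection usually is readable by others. A small sketch of the fix, using a hypothetical helper name lock_key:

```shell
# Restrict a private key file to read-only access for its owner,
# which is what ssh expects of an identity file.
# usage: lock_key KEY_FILE
lock_key() {
  chmod 400 "$1"
}

# Example: lock_key my-key.pem
```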

Launch and connect to an instance

Now that we know everything we need to know about instances and have all of our parameters lined up, it’s time to launch an EC2 instance from the command line:

aws ec2 run-instances \
  --count 1 \
  --image-id ami-04328208f4f0cf1fe \
  --instance-type t2.micro \
  --key-name my-key \
  --query 'Instances[0].InstanceId' \
  --output text \
  | tee instance_id

We can try connecting to it over SSH, but we first need to wait for it to become ready:

aws ec2 wait instance-running --instance-ids $(cat instance_id)
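In my understanding, an instance can report that it is running before its operating system has finished booting, so an immediate SSH attempt may still be refused. If that happens, a stricter waiter that holds until the instance passes its status checks may help, at the cost of a longer wait. The helper name wait_until_healthy is hypothetical.

```shell
# Block until the instance passes its status checks.
# usage: wait_until_healthy INSTANCE_ID
wait_until_healthy() {
  aws ec2 wait instance-status-ok --instance-ids "$1"
}

# Example: wait_until_healthy "$(cat instance_id)"
```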

Once it’s ready, we need to get its public domain name:

aws ec2 describe-instances \
  --instance-ids $(cat instance_id) \
  --query 'Reservations[0].Instances[0].PublicDnsName' \
  --output text \
  | tee address

Now we can finally log in using the private key we saved earlier. We need to log in as the default user, which depends on the AMI. For Amazon Linux 2, it is ec2-user:

ssh -i my-key.pem ec2-user@$(cat address)
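When you are done experimenting, remember that an On-Demand instance keeps running, and may keep costing money, until you stop or terminate it. A sketch of the cleanup, with a hypothetical helper name terminate_instance:

```shell
# Terminate an instance, print the state it is moving to,
# then block until it is fully terminated.
# usage: terminate_instance INSTANCE_ID
terminate_instance() {
  aws ec2 terminate-instances \
    --instance-ids "$1" \
    --query 'TerminatingInstances[0].CurrentState.Name' \
    --output text
  aws ec2 wait instance-terminated --instance-ids "$1"
}

# Example: terminate_instance "$(cat instance_id)"
```

Termination is permanent, so save anything you need from the instance first.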

Now you should be ready to explore the other commands in the CLI. Bon voyage!