Wednesday, February 13, 2013

Amazon Loadbalancing and Autoscaling

Elastic Load Balancing detects unhealthy instances within a pool and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored. It is supported across multiple availability zones, is available in a VPC, and supports sticking user sessions to specific EC2 instances.

Before you get started with loadbalancing, you need two replicated machines. Stop your instance and create an image. This is a snapshot from which you can deploy new instances, but they will have different local IP/DNS and different underlying hardware, so you will need to update any config that relies on these.

AWS > EC2 > Instances
 yourInstance > Actions > Stop > Yes, Stop
 yourInstance > Actions > Create Image (EBS AMI)
  Image Name: imageForReplication
  (everything else default)
  > Yes, Create > Close

AWS > EC2 > AMIs
 (wait until your new AMI is "available")
 imageForReplication > Launch
 Number of Instances: 1
 Instance Type: t1.micro // based on the original
 Launch Instances: VPC: Subnet: yourSubnet
 > Continue > Termination Protection: [checked]
 > Continue > Continue > Name: clonedInstance
 > Continue > Choose from your existing Key Pairs: sameAsBefore 
 > Continue > Choose one or more of your existing Security Groups: sameAsBefore
 > Continue > Launch > Close

Before creating a loadbalancer you'll need to decide on its security group. For the sake of this test let's assume you want inbound on 80/443 and outbound to anywhere. Once you're satisfied with the loadbalancer you can further lock this down.

AWS > VPC > Security Groups > Create Security Group
 Name: LBgroup
 Description: whatever
 VPC: yourVPC
 > Yes, Create
 Inbound > Add Rule
  Port Range: 80
  > Add Rule
  Port Range: 443
  > Add Rule
 > Apply Rule Changes
 (note, by default there is already an allow-all outbound rule)

Next we create the actual loadbalancer. You will be charged $0.025 per hour your Load Balancer is running and $0.008 per GB of data transferred through it. Unlike a bare-metal loadbalancer, Amazon's elastic loadbalancer isn't a real machine. It's effectively a loadbalanced loadbalancer, designed to handle arbitrary traffic and to never go down. A side effect of this prompts the following warning (displayed in AWS beneath the DNS name of the loadbalancer).

Because the set of IP addresses associated with a LoadBalancer can change over time, you should never create an "A" record with any specific IP address. If you want to use a friendly DNS name for your LoadBalancer instead of the name generated by the Elastic Load Balancing service, you should create a CNAME record for the LoadBalancer DNS name.

If your loadbalancer will be passing traffic to normal EC2 instances then you must create it inside EC2. If it will be passing traffic to VPC instances then you must create it inside your VPC.

You have the option of making it an internal load balancer. A normal loadbalancer in a VPC has a DNS name that points to the loadbalancer's public IP so it can be addressed from outside the VPC. An internal loadbalancer in a VPC has a DNS name that points to the loadbalancer's VPC private IP. The DNS records are publicly resolvable in both cases.

AWS > EC2 > Load Balancers > Create Load Balancer
 Load Balancer Name: LB
 Create LB inside: yourVPC
 Create an internal load balancer: unchecked
 Listener Configuration:
  TCP/80 -> TCP/80
  TCP/443 -> TCP/443
 > Continue
 Ping Protocol: HTTP
 Ping Port: 80
 Ping Path: /some/valid/path
 (all others default)
 > Continue
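The health check configured above just issues an HTTP GET to the Ping Path on the Ping Port and treats a 200 response as healthy. Here's a minimal sketch (Python stdlib, not anything AWS-specific) of a backend handler that would pass it — /some/valid/path is the placeholder from the wizard:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PING_PATH = "/some/valid/path"  # must match the Ping Path configured on the ELB

class HealthCheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == PING_PATH:
            body = b"OK"
            self.send_response(200)           # 200 counts as healthy
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)              # repeated failures mark the instance OutOfService

    def log_message(self, fmt, *args):        # silence per-request logging
        pass

def serve(port=0):
    """Start the server on an ephemeral port; returns (server, port)."""
    server = HTTPServer(("127.0.0.1", port), HealthCheckHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, server.server_address[1]
```

In practice your real site answers the ping, of course; the point is only that the path must return a success status or the ELB will pull the instance out of rotation.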

The GUI for adding subnets is a bit confusing. You click the green plus to move a subnet from the available list to the selected list. It then appears with a red x meaning that it is selected and can be moved back to available.

Choose the appropriate security group and the instances you want load balanced, then complete the wizard.

We can tail the apache access logs on the target EC2 instances to see the loadbalancer's health checks. Note that a loadbalancer in a VPC has been autoassigned an internal IP and addresses the balanced instances by their internal IPs.

Multiple CTRL+F5 refreshes of the loadbalancer url cause access logs alternating between the two boxes.

Your loadbalancer can be configured to support session stickiness.

Elastic Load Balancing supports two mechanisms to provide session stickiness: load balancer-generated HTTP cookies, which allow browser-based session lifetimes, and application-generated HTTP cookies, which allow application-specific session lifetimes.
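To make the first (duration-based) mechanism concrete, here's a toy model — not Amazon's implementation. ELB's real cookie is named AWSELB and its value is opaque; this sketch just stores the backend name directly so you can see the routing decision:

```python
import itertools

class StickyBalancer:
    """Toy model of ELB duration-based stickiness (not Amazon's code)."""
    COOKIE = "AWSELB"

    def __init__(self, backends):
        self._rr = itertools.cycle(backends)   # round-robin for new sessions
        self._backends = set(backends)

    def route(self, cookies):
        """Return (backend, set_cookie) for a request's cookie dict."""
        backend = cookies.get(self.COOKIE)
        if backend in self._backends:
            return backend, None               # stick to the previous backend
        backend = next(self._rr)               # new session: round-robin
        return backend, {self.COOKIE: backend}
```

The application-generated flavor works the same way conceptually, except the balancer piggybacks on your app's own cookie (e.g. JSESSIONID) and keys its routing off that value.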

If you know nothing about sessions, you might want to read shlomoswidler's super basic article. There is also a great article that gives the following details.

Amazon ELB only supports round robin and session sticky algorithms. It doesn't support weighted or least connection algorithms. It cannot load balance based on URL patterns. It doesn't provide logs. It is not a page cache server or web accelerator. shlomoswidler also has a detailed post although it is much older and may be out of date.

Amazon ELB is designed to handle unlimited concurrent requests per second with a “gradually increasing” load pattern. If your traffic is spiky, the ramp-up characteristics of ELB could be a problem for you. According to RightScale it easily handled 20K+ requests/sec given a slow ramp up. Amazon ELB can be pre-warmed by contacting Amazon Support when you expect a sudden increase in load, for example correlated with promotions or ad campaigns.

Amazon ELB times out persistent socket connections after 60 seconds of idle. Amazon ELB does not provide a fixed IP so it must be mapped by cname or by route53. Note that some ISPs limit cnames to 32 characters, so try to keep your loadbalancer names short.

Amazon ELB can loadbalance to multiple EC2 instances inside multiple availability zones inside a single region. It cannot speak to other regions. You can use route53 to direct to separate ELBs for each region.

(false) Amazon ELB sticks requests when traffic is generated from a single IP. Note: my observations contradict this claim. shlomoswidler's detailed post says: According to user reports in other forum posts, clients from a single IP address will tend to be connected to the same back-end instance. It is likely that earlier versions of ELB exhibited some stickiness, but today the only way to ensure stickiness is to use the Sticky Sessions feature.

Sometimes ELB assigns its load balancers with IP address ending with X.X.X.255 which certain networks will not properly route. Note this when you are debugging ELB for missing requests.

Amazon AutoScaling does not gracefully (without interruption to existing connections) remove Web/App EC2 from Amazon ELB. The connections are instantly dropped when the Web/App EC2 is removed and no grace period is given by ELB or AutoScaling.

Low level details on the loadbalancer's abilities are given in Choosing Listeners for Your Load Balancer.

When you use TCP for both front-end and back-end connections, your load balancer will forward the request to the back-end instances without modification to the headers.

If you use SSL for your front-end connection, you have to install an SSL certificate on your load balancer. You can also configure ciphers for SSL negotiation between your client and the load balancer.

When you use HTTP for both front-end and back-end connections, your load balancer parses the headers in the request and terminates the connection before re-sending the request.

If you use HTTPS for your front-end listener, you must install an SSL certificate on your load balancer. You can also configure ciphers for SSL negotiation between your client and the load balancer.

When you use HTTP or HTTPS, the load balancer inserts or updates the X-Forwarded-* headers and can insert or update cookies for sticky sessions.

If you are using SSL and do not want Elastic Load Balancing to terminate, you can use a TCP listener and install certificates on all the back-end instances handling requests.

To enable HTTPS support, use AWS Identity and Access Management (IAM) to upload your SSL certificate and key. After you upload your certificate, specify its Amazon Resource Name (ARN) when you create a new load balancer or update an existing load balancer. To update an existing SSL certificate, use IAM to upload your new SSL certificate. After you upload the new certificate, update your load balancer with the new certificate.

That doesn't quite spell it out for you, but what it means is that the loadbalancer can pass TCP level traffic (normal or SSL), in which case it won't support sticky sessions. Or the loadbalancer can terminate HTTP (or HTTPS) requests and send new ones (with modified headers) to the internal boxes, in which case it will support sticky sessions. When it terminates HTTPS, you must have installed an SSL cert and key-pair on the loadbalancer via IAM. When it terminates HTTP or HTTPS, the link from loadbalancer to internal box can be HTTP or HTTPS.

That is explicitly stated in Elastic Load Balancing Concepts and Terminologies.

You can associate stickiness duration for only HTTP/HTTPS load balancer listeners.
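The combinations described above can be condensed into a little lookup table. This is my own summary of the quoted docs — not exhaustive and not an AWS API:

```python
# (frontend, backend) -> capabilities, condensed from the listener docs above.
# Not exhaustive: e.g. SSL frontends can also forward to SSL backends.
LISTENER_MATRIX = {
    ("TCP",   "TCP"):   {"headers_rewritten": False, "sticky": False, "cert_on_elb": False},
    ("SSL",   "TCP"):   {"headers_rewritten": False, "sticky": False, "cert_on_elb": True},
    ("HTTP",  "HTTP"):  {"headers_rewritten": True,  "sticky": True,  "cert_on_elb": False},
    ("HTTPS", "HTTP"):  {"headers_rewritten": True,  "sticky": True,  "cert_on_elb": True},
    ("HTTPS", "HTTPS"): {"headers_rewritten": True,  "sticky": True,  "cert_on_elb": True},
}

def listener_capabilities(frontend, backend):
    """Look up what a listener pair supports; KeyError for unlisted pairs."""
    return LISTENER_MATRIX[(frontend, backend)]
```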

The purpose of SSL is to prevent someone from observing the pipe. If your loadbalancer and EC2 instances are inside a VPC then the pipe should be private, so SSL shouldn't be necessary. In any case, anyone capable of observing the pipe in a virtual environment is probably also capable of reading your EC2 disk and scooping the SSL keys, so I'd advise not bothering with SSL for connections inside the VPC.

When terminating HTTP requests, the loadbalancer will insert X-Forwarded-For and X-Forwarded-Proto headers. These let you discover the original client IP and the original protocol. I'm not sure when it was added, but in a VPC loadbalancers have a security group which effectively lets you filter client IPs, so the only use I can think of for the X-Forwarded-For header would be analytics or customized marketing, like with demandbase. But X-Forwarded-Proto is critical. You probably have portions of your site that you're only willing to serve over https, but if the link between loadbalancer and EC2 is http, then you have to inspect the X-Forwarded-Proto header to know what the original link was (and return a redirecting Location header when a protected page has been requested in the clear).
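That X-Forwarded-Proto check is a few lines of backend code. A sketch (the protected path prefixes are hypothetical, and the function shape is my own — wire it into whatever framework you actually run):

```python
PROTECTED_PREFIXES = ("/account", "/checkout")   # hypothetical https-only paths

def enforce_https(path, headers, host):
    """Return a (status, headers) redirect if a protected path was requested
    in the clear, else None.  `headers` is a dict of request headers.

    Behind an HTTP-terminating ELB the backend socket is always plain HTTP,
    so X-Forwarded-Proto is the only record of what the client actually used.
    """
    proto = headers.get("X-Forwarded-Proto", "http")
    if proto == "http" and path.startswith(PROTECTED_PREFIXES):
        return 301, {"Location": "https://%s%s" % (host, path)}
    return None
```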

There are some conflicting statements out there about ELB support because it has added features as time goes by. Originally it didn't support terminating SSL and didn't support session stickiness. Support for session stickiness was added circa April 2010. Support for SSL termination was added circa October 2010. It appears that early versions of ELB had some stickiness based on client IP, but that no longer exists.


If you're serious then you require portions of your site to be accessed by serverAuth HTTPS and you have multiple application server replicas, for example three tomcats serving the same site. When one goes down you don't want that user logged out so you need to support session sharing, like with ehcache. But for efficiency you want one session to always hit the same tomcat (as long as it's still alive), so you want your loadbalancer to be sticky. With ELB that means you have to terminate SSL at the load balancer. With your infrastructure inside a VPC, the only public access point is the SSL terminating ELB, so your internal links (ELB to apache to tomcat to database or whatever) could be in the clear (HTTP). But some organizations will have policy that demands that any connection that transmits sensitive data (like password) must be encrypted. No problem, the link from ELB to internal box can also be HTTPS.

Below, we'll setup an ELB that accepts incoming HTTP/S (80/443) and forwards HTTP/S (80/443). To do this we'll upload a cert and private key. This terminates SSL at the ELB. As far as the client knows, the ELB is the final server. It is the ELB cert that is public facing so it has to be issued from a public trusted CA. We'll also SSL protect the link from the ELB to the internal box, but this cert is internal only so we can generate it ourselves. In this example, I'll generate both certs. The only difference in a real deployment is that you would send the ELB certReq to a real CA to get a real cert, rather than just processing it yourself.

This command will generate a private and public key pair. We specify the full subject wanted in the certificate request. We request output of the private key as a subjectPrivateKeyInfo in an unprotected pem file. We specify the type of key to generate. We request output of the public key as a certificate request in an unprotected pem file.

openssl req -nodes -subj "/C=CA/OU=yourcompany/" -keyout "/tmp/keyInfo.pem" -newkey rsa:1024 -new -out "/tmp/certReq.pem"

Next we create an extensions file for more details for the certificate request. We specify that this key should be used for signing and encryption. We specify that this cert should be used to authenticate a web server. We specify all the domain names for which the certificate is valid.

echo "" > "/tmp/ext.txt"
echo "keyUsage=digitalSignature,keyEncipherment" >> "/tmp/ext.txt"
echo "extendedKeyUsage=serverAuth" >> "/tmp/ext.txt"
echo "subjectAltName=DNS:my.site.com,DNS:*.my.site.com,DNS:another.com" >> "/tmp/ext.txt"

Next we process the certificate request, producing the certificate (which contains the public key) and signing it with its own private key. This means that no one will trust it. In production you would pay a publicly trusted CA to sign it for you. But if you're just using it internally or for test, you can tell your boxes to trust it.

We specify a lifetime of 730 days and select a random serial number using $RANDOM, which is a bash feature that just prints a random number. We specify SHA1 as the signing algorithm.

openssl x509 -req -days 730 -set_serial $RANDOM$RANDOM$RANDOM$RANDOM$RANDOM -sha1 -in "/tmp/certReq.pem" -extfile "/tmp/ext.txt" -signkey "/tmp/keyInfo.pem" -out "/tmp/cert.pem"

Finally we will output our private key as a raw privateKey in an unprotected pem file. Other tutorials describe this as removing a passphrase from a key, but our input keyInfo was not protected by a passphrase.

openssl rsa -in /tmp/keyInfo.pem -out /tmp/key.pem

We have produced the following files.

$ cat /tmp/keyInfo.pem

$ cat /tmp/key.pem

$ cat /tmp/cert.pem

To better understand what's going on here we can run the pem blobs (not including the --- BEGIN/END --- comments) through an ASN.1 decoder. The keyInfo decodes as follows.

   INTEGER 0x00 (0 decimal)
      OBJECTIDENTIFIER 1.2.840.113549.1.1.1 (rsaEncryption)
   OCTETSTRING 3082025c02010002818100ae9560687bd1ac33980c9a5fa8f1b917463fb439962a5827b672b79bb9fc50a5a3955acef5e1eb2fc516d0a9d9c3b4d7855ca2a1452db4889bda264028daaa0b071b8a01b03b5b2c81ff51fc36204c3417eb349b896c68e54b620d45d6bbb6bd771302ee9bdc8c49ca8a642277a845b30a4a4dc5d01b0c96b762d13f8308386f02030100010281804071350ffc3c6e02f16a1d8597f7f9e9646dd959b45b5704f9aca8a79be44de48658781792dd5c91da7f4c7095c84eb58b2da17e43e9d60ce2f2885200828e673d9346db3d95db4b60e735ee0a9fe2a2037abdd30975291802782b8de854ec11ee41a001fc83740fd68772d5a749ddd116bc4ec9f82c909139c4bd92c1fc80e1024100d7959ad37cdfaa07489b3d81984095e25a55ad88374b08c89156eb7492c118a3e9c82c684ee2fa75c44abc484f3f2fb792c26bddcfd83c79d66c3e0b2d87a7df024100cf500b3d599a23e8d966e2d8392136fe0af4785b8dba8784faa308454583e640a3d925e03c42b2e7e70f245264f27a077b94655bc5fd06413c8b46a54e68c171024100ae9a8d6e126a3814741abf62f1c40560f19708d815286171c83ce4b062979ff449c905266a15ed926a2bb978bb2e4ae05c2db91d4a54310ee0ba84399b638e730240020d8fdeeea9391bd03355a1c08714ad555c7068afb19e2ff1ef7560823cb92600b960c7a4b120666d8257e0bd012db62f421bf2d9b614bec6a3b67262a164f102406f6900abb894428cb2f8004eca94eb91020b3e2799d432f5f678944ef8270cca3e3b4712482efb22dcca3f4303d1f8907f2a36dd7b00c32da5d7e3afc9c33509

That just says, hey, here's a private key. You can decode the octet string to see the raw private key. If you do, you'll see the following, which is exactly the same as when you decode key.pem. That is to say, our earlier openssl rsa -in -out command just scooped the key blob out and stored it as a separate file.

   INTEGER 0x00 (0 decimal)
   INTEGER 0x00ae9560687bd1ac33980c9a5fa8f1b917463fb439962a5827b672b79bb9fc50a5a3955acef5e1eb2fc516d0a9d9c3b4d7855ca2a1452db4889bda264028daaa0b071b8a01b03b5b2c81ff51fc36204c3417eb349b896c68e54b620d45d6bbb6bd771302ee9bdc8c49ca8a642277a845b30a4a4dc5d01b0c96b762d13f8308386f
   INTEGER 0x010001 (65537 decimal)
   INTEGER 0x4071350ffc3c6e02f16a1d8597f7f9e9646dd959b45b5704f9aca8a79be44de48658781792dd5c91da7f4c7095c84eb58b2da17e43e9d60ce2f2885200828e673d9346db3d95db4b60e735ee0a9fe2a2037abdd30975291802782b8de854ec11ee41a001fc83740fd68772d5a749ddd116bc4ec9f82c909139c4bd92c1fc80e1
   INTEGER 0x00d7959ad37cdfaa07489b3d81984095e25a55ad88374b08c89156eb7492c118a3e9c82c684ee2fa75c44abc484f3f2fb792c26bddcfd83c79d66c3e0b2d87a7df
   INTEGER 0x00cf500b3d599a23e8d966e2d8392136fe0af4785b8dba8784faa308454583e640a3d925e03c42b2e7e70f245264f27a077b94655bc5fd06413c8b46a54e68c171
   INTEGER 0x00ae9a8d6e126a3814741abf62f1c40560f19708d815286171c83ce4b062979ff449c905266a15ed926a2bb978bb2e4ae05c2db91d4a54310ee0ba84399b638e73
   INTEGER 0x020d8fdeeea9391bd03355a1c08714ad555c7068afb19e2ff1ef7560823cb92600b960c7a4b120666d8257e0bd012db62f421bf2d9b614bec6a3b67262a164f1
   INTEGER 0x6f6900abb894428cb2f8004eca94eb91020b3e2799d432f5f678944ef8270cca3e3b4712482efb22dcca3f4303d1f8907f2a36dd7b00c32da5d7e3afc9c33509

Finally, if you decode the cert you'll get the following.

      [0] {
         INTEGER 0x02 (2 decimal)
      INTEGER 0x210a921777a330d017ce
      SEQUENCE {
         OBJECTIDENTIFIER 1.2.840.113549.1.1.5 (sha1WithRSAEncryption)
      SEQUENCE {
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (countryName)
               PrintableString 'CA'
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (organizationalUnitName)
               UTF8String 'yourcompany'
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (commonName)
               UTF8String ''
      SEQUENCE {
         UTCTime '130201195453Z'
         UTCTime '150201195453Z'
      SEQUENCE {
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (countryName)
               PrintableString 'CA'
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (organizationalUnitName)
               UTF8String 'yourcompany'
         SET {
            SEQUENCE {
               OBJECTIDENTIFIER (commonName)
               UTF8String ''
      SEQUENCE {
         SEQUENCE {
            OBJECTIDENTIFIER 1.2.840.113549.1.1.1 (rsaEncryption)
         BITSTRING 0x30818902818100ae9560687bd1ac33980c9a5fa8f1b917463fb439962a5827b672b79bb9fc50a5a3955acef5e1eb2fc516d0a9d9c3b4d7855ca2a1452db4889bda264028daaa0b071b8a01b03b5b2c81ff51fc36204c3417eb349b896c68e54b620d45d6bbb6bd771302ee9bdc8c49ca8a642277a845b30a4a4dc5d01b0c96b762d13f8308386f0203010001 : 0 unused bit(s)
      [3] {
         SEQUENCE {
            SEQUENCE {
               OBJECTIDENTIFIER (keyUsage)
               OCTETSTRING 030205a0
            SEQUENCE {
               OBJECTIDENTIFIER (extKeyUsage)
               OCTETSTRING 300a06082b06010505070301
            SEQUENCE {
               OBJECTIDENTIFIER (subjectAltName)
               OCTETSTRING 3029820b6d792e736974652e636f6d820d2a2e6d792e736974652e636f6d820b616e6f746865722e636f6d
      OBJECTIDENTIFIER 1.2.840.113549.1.1.5 (sha1WithRSAEncryption)
   BITSTRING 0x2830856c98c3879c007f040ea572de32302636a5980ecc8203f35b6ff2b7a002051e07bb6d992afa17132e759752b8ddca70fbac6ddc1b1d39c83ad7ac54d22505fd9f07845fb610da9569e08f2b3d35351ce2e1f0b1bc59fd9276d6d68d38c7737363e618be826104ebe461e288365c7196482d11b65d00f4a924c08841cf8f : 0 unused bit(s)

Here is a simplified annotated version of the above.

SEQUENCE { // this is our certificate
      INT 0x02 // always version 3 indicated by decimal 2
      INT 0x210a921777a330d017ce // selected serial number
      OID 1.2.840.113549.1.1.5 // this cert is signed with sha1
      SEQUENCE {,ou=yourcompany,c=ca } // cert issued to this person
      SEQUENCE {
         UTCTime '130201195453Z' // not valid before 2013-02-01...
         UTCTime '150201195453Z' // not valid after 2015-02-01...
      SEQUENCE {,ou=yourcompany,c=ca } // cert issued by this person
      SEQUENCE {
         OID 1.2.840.113549.1.1.1 // here comes an rsa public key
         BITSTRING 0x30818902818100ae9560687bd1ac33980c9a5fa8f1b917463fb439962a5827b672b79bb9fc50a5a3955acef5e1eb2fc516d0a9d9c3b4d7855ca2a1452db4889bda264028daaa0b071b8a01b03b5b2c81ff51fc36204c3417eb349b896c68e54b620d45d6bbb6bd771302ee9bdc8c49ca8a642277a845b30a4a4dc5d01b0c96b762d13f8308386f0203010001 : 0 unused bit(s)
      [3] { // extensions
         SEQUENCE {
            SEQUENCE {
               OBJECTIDENTIFIER (keyUsage)
               OCTETSTRING 030205a0 // signing and encryption
            SEQUENCE {
               OBJECTIDENTIFIER (extKeyUsage)
               OCTETSTRING 300a06082b06010505070301 // used by an ssl web server
            SEQUENCE {
               OBJECTIDENTIFIER (subjectAltName)
               // if you decode this octetstring you'll see it's just a sequence of ascii strings
               // namely the web server names you specified earlier, clients will understand
               // this cert to be valid for these domains as well as the one specified in the
                // common name (cn) of the subject of the cert (above)
               OCTETSTRING 3029820b6d792e736974652e636f6d820d2a2e6d792e736974652e636f6d820b616e6f746865722e636f6d
   OID 1.2.840.113549.1.1.5 // here comes a sha1 signature
   BITSTRING 0x2830856c98c3879c007f040ea572de32302636a5980ecc8203f35b6ff2b7a002051e07bb6d992afa17132e759752b8ddca70fbac6ddc1b1d39c83ad7ac54d22505fd9f07845fb610da9569e08f2b3d35351ce2e1f0b1bc59fd9276d6d68d38c7737363e618be826104ebe461e288365c7196482d11b65d00f4a924c08841cf8f : 0 unused bit(s)


Amazon's developer guide explains how to configure SSL for your ELB, but it is Amazon's IAM guide that explains how to create and upload the SSL cert/key-pair. It is misleading, though: the actual process is much simpler and doesn't require IAM.

Note that the private key and public key certificate are in the format of openssl unprotected pem files, and the comments are required. Also note that the private key must be a raw private key, not the keyInfo produced by the openssl req command (see above). If the comments aren't included around the private key pem blob, AWS will fail with "Error: Invalid Private Key". If the private key blob is actually a keyInfo, AWS will fail with "Error: Invalid Private Key". If you got creative and just changed the comment of the keyInfo from "-----BEGIN PRIVATE KEY-----" to "-----BEGIN RSA PRIVATE KEY-----", AWS will fail with "Error: Public Key Certificate and Private Key doesn't match", which is misleading.
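Those failure modes come down to the PEM armor lines, so you can sanity-check a key before pasting it into the console. A rough sketch (the function name and return values are my own):

```python
def classify_private_key_pem(pem_text):
    """Rough check of a private-key PEM against the pitfalls AWS trips on.

    Returns one of:
      'rsa'           -- raw RSA key ("BEGIN RSA PRIVATE KEY"), what AWS wants
      'keyinfo'       -- PKCS#8 keyInfo ("BEGIN PRIVATE KEY"), rejected by AWS
      'missing-armor' -- no BEGIN/END comments at all, also rejected
    """
    text = pem_text.strip()
    if "-----BEGIN RSA PRIVATE KEY-----" in text and "-----END RSA PRIVATE KEY-----" in text:
        return "rsa"
    if "-----BEGIN PRIVATE KEY-----" in text and "-----END PRIVATE KEY-----" in text:
        return "keyinfo"
    return "missing-armor"
```

Note this only inspects the armor; it won't catch the "relabeled keyInfo" case described above, where someone hand-edits the comments without converting the blob.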

AWS > EC2 > Load Balancers > Create Load Balancer
 Load Balancer Name: LB
 Create LB inside: yourVPC
 Create an internal load balancer: unchecked
 Listener Configuration:
  // already contains: HTTP/80 -> HTTP/80
  HTTPS/443 -> HTTPS/443
 > Continue
 Upload a new SSL Certificate:
  // using a self signed cert created with openssl
  Certificate Name: yourCert
  Private Key: -----BEGIN RSA PRIVATE KEY-----...
   Public Key Certificate: -----BEGIN CERTIFICATE-----...
  Certificate Chain: [empty]
 > Continue
 // you are given the option to edit the acceptable ciphers
 > Continue
 // you are given the option to import trust for backend authentication
 > Continue
 Ping Protocol: HTTP
 Ping Port: 80
 Ping Path: /some/valid/path
 (all others default)
 > Continue
 // add your subnet
 > Continue
 // select your loadbalancer security group
 > Continue
 // select the instances to loadbalance
 > Continue > Create > Close

The wizard did prompt for SSL details but didn't prompt for session stickiness. We have to add that after the fact. If your backend server is tomcat, your sessions are usually mapped by the JSESSIONID cookie. If your backend server is php, your sessions are usually mapped by the PHPSESSID cookie.

AWS > EC2 > Load Balancers > yourLoadBalancer > Description > Port Configuration
 443 > Stickiness: Disabled (edit) > Enable Application Generated Cookie Stickiness > Cookie Name: JSESSIONID
 80 > Stickiness: Disabled (edit) > Enable Application Generated Cookie Stickiness > Cookie Name: JSESSIONID

Now when we navigate to the site via the loadbalancer all traffic for a particular session goes to the same backend box. Also note that since we keep the http and https streams synchronized (i.e. if it came to the LB as X then it went to the internal box as X) we can ignore the X-Forwarded-Proto headers. If we had all the LB-Internal traffic be one or the other then we would have to update backend logic that redirected when some urls were accessed by http.

It's not 100% clear to me that this is working correctly. Without the stickiness, if I navigate to a site, log in, and browse through it, the requests bounce between boxes, which is bad. With stickiness, after login all requests stick to the same box. But when I log out and log back in they're still on the same box, and when I log in from a new browser they're still on the same box — yet when I ask a guy down the hall to log in, he's on the other box. So I guess it's working, but not exactly as I expected.

Auto Scaling

At this stage you have created a backend box, created an image of that box, launched a new box from that image, run a one-click script to update any config (perhaps based on local IP or machine-bound passwords), and configured a loadbalancer for session-sticky access to these two backend boxes.

Next we want to automatically add a third box. How to do it? Auto Scaling is enabled by Amazon CloudWatch. All Amazon EC2 instances and Elastic Load Balancers are automatically enabled for Basic Monitoring (at no charge).


It looks like we may have to dip into the command line tools to get this working, as suggested by kkpradeeban's 2011 article. That's true, but at first I wasn't convinced. Consider the task of launching an AMI. It's not automatic. You have to select things like size (micro), security group, ssh keypair, etc. It looks like AWS CloudFormation can help with that, and that CloudFormer will create a CloudFormation template from an existing EC2 instance.

AWS > CloudFormation > Launch CloudFormer >
 Stack Name: yourStack
 Stack Template Source:
  Use a sample template:
  CloudFormer - create a template from your existing resources
 > Continue
 I acknowledge that this template may create IAM resources [checked]
 > Continue > Continue > Continue > Close

This creates a bunch of stuff in your AWS account, like an IAM user, a security group and an EC2 t1.micro instance. The only component that causes billing is the EC2 t1.micro instance. Once the status of your CloudFormer shows CREATE_COMPLETE, you can click on the Outputs tab, which has a URL to the newly created EC2 instance; that instance hosts a website implementing the CloudFormer service.

While the EC2 instance is running, you are billed for it. I tried stopping and restarting it, but then the CloudFormer service URL gave error 500, so it appears you can't stop the CloudFormer EC2 instance. You can, however, delete the entire CloudFormer stack, which seems to remove everything it added (user, security group, EC2 instance).

At the CloudFormer url, click Create Template to launch the CloudFormer wizard. For more information on how to build a template see the AWS CloudFormation User Guide and sample templates.

Create Template > Continue > Continue >
 [check] yourElasticLoadBalancer
> Continue >
 // all instances currently loadbalanced are automatically checked
 // it notes that no auto scaling groups exist
> Continue > ...

It appears to me that CloudFormation goes beyond autoscaling. If you had set up a replicated system with loadbalancing and autoscaling, and you wanted to be able to repeat that process automatically (so someone elsewhere could set up fundamentally the same thing as you had manually set up), then CloudFormation is of use.

Auto Scaling

According to Joseph Mornin, as of Dec. 2012 you must use the CLI (command line interface) to configure autoscaling. Tutorials are offered by techrepublic and cardinalpath, and official Amazon docs describe how to Setup the CLI.

First you must download the AutoScaling CLI zip. If you already have Java on your path and JAVA_HOME defined, then all you need to do is unzip the CLI tools, set AWS_AUTO_SCALING_HOME, and add its bin directory to your path. The CLI is stateless, so each command requires a credential in order to establish permissions. That can be provided as a command line argument (--aws-credential-file Z:\secret\cliUserCredential.txt), but that gets tiresome. Instead you can set AWS_CREDENTIAL_FILE in the environment. Remember to keep this unsecured file on a thumb drive, unplugged when not in use. I explain how to create the credential file below.

My Computer > Properties > Advanced > Environment Variables 
 > User Variables > New
  Variable name: AWS_AUTO_SCALING_HOME
  Variable value: C:\some where\CLIs\AutoScaling-
  > Ok
  Variable name: AWS_CREDENTIAL_FILE
  Variable value: Z:\secret\cliUserCredential.txt
  > Ok
 > System Variables > Path > Edit > ...;C:\some where\CLIs\AutoScaling-\bin
  > Ok > Ok > Ok

At this stage as-cmd from a new console should display the list of available amazon CLI commands. However, these are of no use to us without a credential.

It's a bad idea to leave a master user access key lying around, so we will create a user specifically for using the CLI.

AWS > IAM > Users > Create New Users
 user name: cliUser
 [checked] Generate an access key for each User
 > Create > recordAccessKey > Close Window
> cliUser > Permissions > Attach User Policy > Policy Generator > Select
 Effect: Allow 
 AWS Service: Auto Scaling
 Actions: All Actions (*)
 > Add Statement > Continue > Apply Policy

The above produces an accessKeyId and secretAccessKeyId that can be copied from screen or downloaded as csv. It will actually be consumed by the CLI as a text file like the following.

AWSAccessKeyId=<Write your AWS access ID>
AWSSecretKey=<Write your AWS secret key>

Since this is an unprotected credential, I suggest storing it on a thumb drive as cliUserCredential.txt, so it is only exposed to attack while in use.
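The file is just key=value lines, so if you're scripting around the CLI you can parse it yourself. A sketch (the comment/blank-line handling is my own convenience, not part of any spec):

```python
def parse_credential_file(text):
    """Parse the key=value lines of an AWS CLI credential file.

    Blank lines and #-comments are skipped (my own convention, hedged).
    Raises ValueError if AWSAccessKeyId or AWSSecretKey is missing.
    """
    creds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        creds[key.strip()] = value.strip()
    missing = {"AWSAccessKeyId", "AWSSecretKey"} - set(creds)
    if missing:
        raise ValueError("credential file missing: %s" % ", ".join(sorted(missing)))
    return creds
```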

Note the instructions on how to specify region if you are outside of us-east-1.

Using the CLI

When you manually launch an AMI, you specify things like instance type, security group, etc. A launch configuration is a template with those answers pre-filled. We need one for auto scaling. Note that a launch config doesn't specify EC2/VPC, that is determined by the auto-scaling-group. Note that --monitoring-disabled true means enable basic (free) monitoring instead of detailed (not-free) monitoring.

as-create-launch-config --help

as-create-launch-config newLaunchConfigName
 --image-id ami-12345abc
 --instance-type t1.micro
 --key existingSSHKeyName
 --group sg-12345abc
 --monitoring-disabled true

as-create-auto-scaling-group --help

as-create-auto-scaling-group newAutoScaleGroupName
 --default-cooldown 120
 --min-size 1
 --max-size 3
 --tag "k=Name,v=autoScaledInstance,p=true"
 --availability-zones us-east-1b
 --vpc-zone-identifier subnet-12345abc
 --launch-configuration existingLaunchConfigName
 --load-balancers existingLoadBalancerName

The cooldown 120 means wait two minutes between each scaling action. Min size is the minimum number of machines launched by this autoscaler. Thus you may have several manually added machines from the same AMI in the same region behind the same loadbalancer, and they won't count toward this total. Max size is the max machines this auto scaler will launch. Tags apply to the auto-scale-group, but with the p=true flag they can be propagated to the EC2 instances that are launched.

Just the act of creating the auto-scaling group should have launched the first instance. You can confirm this with the describe command.

as-describe-auto-scaling-groups --headers

AUTO-SCALING-GROUP  yourScaler  yourConfig     us-east-1b          yourLB          1         3         1                 Default

INSTANCE  i-1234abcd   us-east-1b         InService  Healthy  yourConfig

TAG  yourScaler   auto-scaling-group  Name  autoScaledInstance true

When you later need to clean these up, they can be described and deleted. Note that you have to update the auto-scale-group to contain no instances before it can be deleted. You can refresh the list of instances in EC2 or describe auto-scaling-groups from CLI to see when the auto-scaled instances have terminated. Once terminated you can delete the auto-scale group.

as-describe-auto-scaling-groups --headers

as-update-auto-scaling-group yourGroup --min-size 0 --max-size 0

as-delete-auto-scaling-group yourGroup
Are you sure? y

as-describe-launch-configs --headers

as-delete-launch-config yourConfig
Are you sure? y


The launched instance isn't associated with an Elastic IP, which in a VPC means it can't call out to other machines on the web. I've updated my Amazon EC2 for Enterprise post to describe how to add a t1.micro NAT box and subnet to a simple VPC.

Also, the launched instance has a new internal IP and perhaps machine-bound passwords, so there is config that needs to be updated. There is the option of passing in user data during launch. Alternatively, a startup script could detect the change in IP and react appropriately. I discuss this in Machine Binding in the Amazon Cloud and Hostname in Amazon Linux.


Your auto-scaling-group can specify termination policies via --termination-policies VALUE1,VALUE2,VALUE3. For example, you can specify that Auto Scaling should terminate the oldest instance, the newest instance, the instance with the oldest launch configuration, or the instance that is nearest to the next instance-hour. The default policy is to terminate the instance with the oldest launch configuration. Find more details in the Amazon guide.
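To make the selection rules concrete, here's an illustrative Python sketch of two of these policies. The instance records and helper are hypothetical; Auto Scaling applies these rules internally.

```python
# Model of two termination policies: OldestInstance terminates the earliest
# launch, NewestInstance the latest. ISO-8601 timestamps sort lexicographically.
instances = [
    {"id": "i-aaa", "launch_time": "2013-02-15T16:42:00Z"},
    {"id": "i-bbb", "launch_time": "2013-02-15T17:10:00Z"},
    {"id": "i-ccc", "launch_time": "2013-02-15T15:16:00Z"},
]

def pick_to_terminate(instances, policy):
    if policy == "OldestInstance":
        return min(instances, key=lambda i: i["launch_time"])
    if policy == "NewestInstance":
        return max(instances, key=lambda i: i["launch_time"])
    raise ValueError("unmodeled policy: " + policy)

print(pick_to_terminate(instances, "OldestInstance")["id"])  # i-ccc
print(pick_to_terminate(instances, "NewestInstance")["id"])  # i-bbb
```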


When you create your auto-scale-group, it immediately launches the minimum number of instances. You need to define triggers on CloudWatch metrics in order to scale up the number of instances. This is described on techrepublic and thatsgeeky. As far as I can tell, all this can be done in the AWS CloudWatch console, but I'll start with the CLI.

Download the CloudWatch CLI zip. From above you should already have Java on your path and JAVA_HOME and AWS_CREDENTIAL_FILE defined. Unzip the CLI tools and set AWS_CLOUDWATCH_HOME and add its bin directory to your path. Usage details are provided in the developer's guide.

My Computer > Properties > Advanced > Environment Variables 
 > User Variables > New
  Variable name: AWS_CLOUDWATCH_HOME
  Variable value: C:\some where\CLIs\CloudWatch-
  > Ok
 > System Variables > Path > Edit > ...;C:\some where\CLIs\CloudWatch-\bin
  > Ok > Ok > Ok

We only gave auto scaling permissions to the cliUser created above. In order to execute the CloudWatch commands, we have to add that permission.

AWS > IAM > Users > cliUser > Permissions > Attach User Policy > Policy Generator > Select
 Effect: Allow
 AWS Service: Amazon CloudWatch
 Actions: All Actions (*)
 > Add Statement > Continue > Apply Policy

Autoscaling uses CloudWatch alarms to trigger scaling events. CloudWatch's basic monitoring is in 5 minute increments. Metrics that differ in name, namespace, or dimensions (all case sensitive) are classified as different metrics.
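That identity rule can be modeled as a tuple of name, namespace, and dimensions. A small Python sketch (my own illustration, not CloudWatch code):

```python
# A CloudWatch metric is identified by its name, namespace, and dimensions,
# all case sensitive. Any difference means a different metric.
def metric_id(name, namespace, dimensions):
    return (name, namespace, frozenset(dimensions.items()))

a = metric_id("CPUUtilization", "AWS/EC2", {"InstanceId": "i-1234abcd"})
b = metric_id("CPUUtilization", "AWS/EC2", {"AutoScalingGroupName": "yourGroup"})
c = metric_id("cpuutilization", "AWS/EC2", {"InstanceId": "i-1234abcd"})

print(a == b)  # False: different dimensions
print(a == c)  # False: case matters
```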

A good start is listing the pre-defined metrics and available stats. mon-list-metrics shows a huge list. Each metric is available in several scopes, such as CPUUtilization in a specific instance or aggregated across all instances in an auto-scaling group. Here's a simplified example. These metrics are also visible in AWS > CloudWatch > Metrics > All Metrics.

Metric Name     Namespace    Dimensions

CPUUtilization  AWS/EC2      {InstanceId=i-1234abcd}
CPUUtilization  AWS/EC2      {AutoScalingGroupName=yourGroup}
CPUUtilization  AWS/EC2      {ImageId=ami-1234abcd}

DiskReadBytes   AWS/EC2      {InstanceId=i-1234abcd}
DiskReadBytes   AWS/EC2      {AutoScalingGroupName=yourGroup}
DiskReadBytes   AWS/EC2      {ImageId=ami-1234abcd}
DiskReadOps     AWS/EC2      {InstanceId=i-1234abcd}
DiskReadOps     AWS/EC2      {AutoScalingGroupName=yourGroup}
DiskWriteBytes  AWS/EC2      {InstanceId=i-1234abcd}
DiskWriteBytes  AWS/EC2      {AutoScalingGroupName=yourGroup}
DiskWriteOps    AWS/EC2      {InstanceId=i-1234abcd}
DiskWriteOps    AWS/EC2      {AutoScalingGroupName=yourGroup}

Latency         AWS/ELB      {LoadBalancerName=yourLB}
Latency         AWS/ELB      {LoadBalancerName=yourLB,AvailabilityZone=us-east-1b}
Latency         AWS/ELB      {AvailabilityZone=us-east-1b}

NetworkIn       AWS/EC2      {InstanceId=i-1234abcd}
NetworkIn       AWS/EC2      {AutoScalingGroupName=yourGroup}
NetworkOut      AWS/EC2      {InstanceId=i-1234abcd}
NetworkOut      AWS/EC2      {AutoScalingGroupName=yourGroup}

RequestCount    AWS/ELB      {AvailabilityZone=us-east-1b}
RequestCount    AWS/ELB      {LoadBalancerName=yourLB,AvailabilityZone=us-east-1b}
RequestCount    AWS/ELB      {LoadBalancerName=yourLB}

VolumeIdleTime  AWS/EBS      {VolumeId=vol-1234abcd}

These metrics are described in the cloudwatch and loadbalancer documentation. Here are some highlights.

CPUUtilization is the percentage of allocated EC2 compute units that are currently in use on the instance.

DiskReadOps is the number of completed read operations from all ephemeral disks available to the instance (i.e. none for a t1.micro).

DiskReadBytes is the number of bytes read from all ephemeral disks available to the instance (i.e. none for a t1.micro).

NetworkIn is the number of bytes received on all network interfaces by the instance.

Latency is the seconds elapsed after the request leaves the load balancer until it receives the corresponding response.

RequestCount is the number of requests handled by the load balancer.

I just recently created my auto-scaling group, so there aren't many stats yet. We can look at CPUUtilization for example. Note that stats are collected at five minute intervals. CloudWatch free basic monitoring for EC2 provides ten pre-selected metrics at five-minute frequency.

mon-get-stats CPUUtilization --statistics "Average" --namespace "AWS/EC2" --dimensions "AutoScalingGroupName=yourGroup"

2013-02-15 15:16:00  41.333999999999996  Percent
2013-02-15 15:21:00  2.334               Percent
2013-02-15 15:26:00  2.0                 Percent
2013-02-15 15:31:00  2.0                 Percent
2013-02-15 15:36:00  2.34                Percent

Most of the tutorials I saw scale based on CPUUtilization, but I think we're more interested in average Latency. Note that in an interval with no traffic, there is no entry for Latency. Also note that stats are collected at 1 minute intervals. CloudWatch free basic monitoring for ELB provides ten pre-selected metrics at one-minute frequency.

mon-get-stats Latency --statistics "Average" --namespace "AWS/ELB" --dimensions "LoadBalancerName=yourLB"

2013-02-15 15:43:00  0.09276564864864864  Seconds
2013-02-15 15:45:00  0.2294718            Seconds
2013-02-15 15:46:00  0.46668775           Seconds

Also note that many statistics are available for each metric.

C:\>mon-get-stats Latency --headers --statistics "SampleCount,Average,Sum,Minimum,Maximum" --namespace "AWS/ELB" --dimensions "LoadBalancerName=yourLB"
Time                 SampleCount  Average              Sum        Minimum   Maximum   Unit
2013-02-15 15:43:00  111.0        0.09276564864864864  10.296987  0.005962  1.321208  Seconds
2013-02-15 15:45:00  10.0         0.2294718            2.294718   0.007558  0.606748  Seconds
2013-02-15 15:46:00  4.0          0.46668775           1.866751   0.212814  0.998193  Seconds
2013-02-15 15:50:00  3.0          0.284503             0.853509   0.230918  0.38009   Seconds

When you create an alarm (mon-put-metric-alarm or AWS > CloudWatch > Alarms > Create Alarm), you specify an alarm action. That might be "send me an email when billing goes above $40", but it also might be "follow this scaling policy when the cpu utilization in this auto-scaling group has been above 80% for three intervals". Scaling policies are defined by the CLI as-put-scaling-policy and are of the form "add one machine to this auto-scaling group". There's an example on techrepublic.

Let's start by defining the scaling policies then decide how to trigger them.

Scaling Policy

as-put-scaling-policy PolicyName
                      --type value
                      --auto-scaling-group value
                      --adjustment value
                     [--cooldown value]
                     [--min-adjustment-step value]

--type Specify whether the adjustment is the new desired size or an increment to current capacity. Value must be one of: ExactCapacity, ChangeInCapacity, PercentChangeInCapacity.

--adjustment The amount to scale the capacity of the associated group. Use negative values to decrease capacity. For negative numeric values, specify this option as --adjustment=-1 on Unix or "--adjustment=-1" on Windows.

--cooldown Time (in seconds) between a successful Auto Scaling activity and the succeeding scaling activity.

as-put-scaling-policy scaleUp --type ChangeInCapacity --auto-scaling-group yourGroup --adjustment 1 --cooldown 300

as-put-scaling-policy scaleDown --type ChangeInCapacity --auto-scaling-group yourGroup "--adjustment=-1" --cooldown 300

We now have the ability to add or remove one machine in our auto-scaling-group. We will trigger these actions with CloudWatch alarms. When you create the above scaling policies, the CLI outputs the ARN, like the following. The ARN is how the alarm will reference the policy, so you need it. You can view the ARNs at any time via as-describe-policies.


To create the alarms that will use these policies we can use the CLI or the AWS console.

mon-put-metric-alarm AlarmName
                     --comparison-operator value
                     --evaluation-periods value
                     --metric-name value
                     --namespace value
                     --period value
                     --statistic value
                     --threshold value
                    [--actions-enabled value]
                    [--alarm-actions value[,value...]]
                    [--alarm-description value]
                    [--dimensions "key1=value1,key2=value2..."]
                    [--insufficient-data-actions value[,value...]]
                    [--ok-actions value[,value...]]
                    [--unit value]

--comparison-operator must be one of: GreaterThanOrEqualToThreshold, GreaterThanThreshold, LessThanThreshold, LessThanOrEqualToThreshold.

--evaluation-periods is the number of consecutive periods for which the value of the metric needs to be compared to threshold.

--period defines how often in seconds the alarm should sample the metric. With free basic monitoring, CPU is collected at 5 minute intervals and Latency is collected at 1 minute intervals. Your alarm period should be a multiple of these. For example, the best you can do for Latency is a period of 60 seconds; however, you may want to average out noise by only checking every three minutes. Even if that check would trigger your alarm, the --evaluation-periods could require another three-minute sample to also exceed the threshold before triggering action.

--statistic is the statistic of the metric on which to alarm. Value must be one of: SampleCount, Average, Sum, Minimum, Maximum.

--threshold is the value to which the metric will be compared. For example, if the Latency metric was 1.8 seconds, the comparison was GreaterThanThreshold, and the threshold was 1.6 seconds, then the comparison would return true. If the CPUUtilization metric was 42%, the comparison was GreaterThanThreshold, and the threshold was 80%, then the comparison would return false. Note that you only define the raw value here; units are defined next.

--unit is units of the value in the --threshold, such as Seconds or Milliseconds. mon-get-stats shows that the Unit of CPUUtilization is Percent and the Unit of Latency is Seconds. Be sure not to mix these, i.e. don't specify a threshold in seconds when the metric is in percent.

--dimensions lets you specify whether you are looking at a particular loadbalancer, all loadbalancers in an availability zone, etc. To see what's available, list all metrics.

--alarm-actions defines the SNS topics to which notification should be sent when the alarm goes into the ALARM state. If this is the arn of an auto-scale policy, that action will be taken.

Here is an example of an alarm based on LoadBalancer latency. My feeling is that you really want a more advanced alarm that is a mix of cpu and latency over several short periods. The latency is what the end user sees, but the cpu gives confidence that adding more machines will actually have a positive effect.

mon-put-metric-alarm alarmUp
                     --namespace "AWS/ELB"
                     --metric-name Latency
                     --dimensions "LoadBalancerName=yourLB"
                     --period 60
                     --statistic Average
                     --comparison-operator GreaterThanThreshold
                     --threshold 2
                     --unit Seconds
                     --evaluation-periods 2
                     --alarm-actions arn:aws:autoscaling:us-east-1:123456789123:scalingPolicy:1234abcd-12ab-12ab-12ab-123456abcdef:autoScalingGroupName/yourGroup:policyName/scaleUp

This alarm says: check the AWS/ELB Latency metric for yourLB every 60 seconds to see if the average latency is greater than 2 seconds. If it is true for two consecutive checks, activate the scaleUp policy.
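Here's a rough Python model of that evaluation logic (my own sketch; CloudWatch's actual state machine also involves OK and INSUFFICIENT_DATA states):

```python
# Model of --comparison-operator plus --evaluation-periods: the action fires
# only after the threshold comparison is true for N consecutive periods.
import operator

OPS = {
    "GreaterThanThreshold": operator.gt,
    "GreaterThanOrEqualToThreshold": operator.ge,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}

def alarm_fires(samples, comparison, threshold, evaluation_periods):
    consecutive = 0
    for value in samples:
        consecutive = consecutive + 1 if OPS[comparison](value, threshold) else 0
        if consecutive >= evaluation_periods:
            return True
    return False

latency = [0.5, 2.3, 1.1, 2.5, 2.8]  # illustrative per-minute averages in seconds
print(alarm_fires(latency, "GreaterThanThreshold", 2, 2))  # True: 2.5 then 2.8
```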

After you create alarms with the CLI, they are visible in the AWS CloudWatch console. This makes it easy to add actions such as sending you an email when an alarm fires.

With the CLI you can view existing alarms and alarm history.

C:\>mon-describe-alarms --headers
alarmDown  INSUFFICIENT_DATA  arn:aws:...    AWS/ELB    Latency      60      Average    20            LessThanThreshold     2.0
alarmUp    INSUFFICIENT_DATA  arn:aws:...    AWS/ELB    Latency      60      Average    2             GreaterThanThreshold  2.0

alarmDown  2013-02-15T17:43:04.674Z  StateUpdate          Alarm updated from ALARM to INSUFFICIENT_DATA
alarmUp    2013-02-15T17:25:07.856Z  StateUpdate          Alarm updated from OK to INSUFFICIENT_DATA
alarmUp    2013-02-15T17:13:07.849Z  StateUpdate          Alarm updated from INSUFFICIENT_DATA to OK
alarmDown  2013-02-15T17:13:04.782Z  Action               Successfully executed action arn:aws:autoscaling...scaleDown
alarmDown  2013-02-15T17:13:04.693Z  Action               Successfully executed action arn:aws:sns...
alarmDown  2013-02-15T17:13:04.676Z  StateUpdate          Alarm updated from INSUFFICIENT_DATA to ALARM
alarmUp    2013-02-15T17:10:07.846Z  StateUpdate          Alarm updated from OK to INSUFFICIENT_DATA
alarmDown  2013-02-15T17:09:04.672Z  StateUpdate          Alarm updated from OK to INSUFFICIENT_DATA
alarmUp    2013-02-15T16:55:07.858Z  StateUpdate          Alarm updated from INSUFFICIENT_DATA to OK
alarmUp    2013-02-15T16:51:07.852Z  StateUpdate          Alarm updated from ALARM to INSUFFICIENT_DATA
alarmUp    2013-02-15T16:42:07.959Z  Action               Successfully executed action arn:aws:sns...
alarmUp    2013-02-15T16:42:07.883Z  Action               Successfully executed action arn:aws:autoscaling...scaleUp
alarmUp    2013-02-15T16:42:07.860Z  StateUpdate          Alarm updated from OK to ALARM
alarmDown  2013-02-15T16:41:04.667Z  StateUpdate          Alarm updated from INSUFFICIENT_DATA to OK
alarmUp    2013-02-15T16:40:07.848Z  StateUpdate          Alarm updated from INSUFFICIENT_DATA to OK
alarmDown  2013-02-15T16:36:44.631Z  ConfigurationUpdate  Alarm "alarmDown" updated
alarmUp    2013-02-15T16:36:23.182Z  ConfigurationUpdate  Alarm "alarmUp" updated
alarmDown  2013-02-15T16:32:05.630Z  ConfigurationUpdate  Alarm "alarmDown" created
alarmUp    2013-02-15T16:31:56.122Z  ConfigurationUpdate  Alarm "alarmUp" created

This is also available at AWS > CloudWatch > Alarms > yourAlarm > History [tab]

With the above in place we have a highly available, auto-scalable solution. Truly instance-specific data like the private IP used in callback urls can be dynamically determined rather than configured in properties files, and machine binding secrets can be passed in at launch via user data. But this leaves us with a stateless system. If your software has rarely used configuration files, you'll want to be able to update them on one instance and have the effect propagated to all instances. In the long run, you'll need to create an updated AMI with those changes so that newly launched instances will start with the latest configuration, although depending on your solution for these config files, they may not be part of the base AMI.

One option is a shared file system (backed by either S3 or EBS) that contains just your config files, potentially symlinked from their native locations. This is described on stackoverflow and turnkeylinux but can have the significant disadvantage of a single point of failure: if your file share goes down, all your instances go down. Another option is a shared system like zookeeper, although unless that solution is distributed (a single zookeeper server is centralized) it suffers from the same single point of failure. Amazon S3 is a highly redundant system; however, it has failed in the past and is not ideal for incremental changes to files. Since zookeeper supports clustering, it is possible to run it on each instance in your auto-scale group, making it as reliable as the applications that depend on it. I talk more about this in Zookeeper in AWS.
