Wednesday, November 13, 2013

AWS Conference

The sessions I attended aren't released yet, but they will be available on YouTube.


[seems like vaporware] Xceedium provides single sign-on. Uses a password vault. Integrates with LDAP. Adds granularity so one user can't see another's EC2 instances. Adds an audit record and can play back actions inside the AWS console.

[seems really good for a mid-to-large organization] Cloudability uses read access to your usage-stats S3 bucket, plus the tags you (hopefully consistently) add to your EC2 assets, to produce billing reports and predictions, and helps you plan your reserved-instance purchases.

[seems essential for a large organization] Datadog has a very slick interface for showing you stats about your systems. It also shows bubbles on charts to tell you what to watch for that might be bad. It also has a chat-like interface inside the stats display (per event) so people in your IT can discuss and solve your problems. This also serves as a historical knowledge base and is full-text searchable.


Use a TCP backend sparingly; primarily use HTTP. You'll need both DB and static data storage.

EC2 behind ELB from day 1, to get used to the pattern. Use RDS for the DB. This is the core: an HTTP-based API talking JSON to EC2, storing in your DB.

For production, spin up a second availability zone. Check multi-AZ on your DB and RDS will handle failover automatically.

Use S3 for your static content. This doesn't go through your ELB. From your app, both ELB and S3 are just RESTful APIs.

For more load, use Auto Scaling and ElastiCache. For still more load, CloudFront is important, e.g. for game updates; at large scale it's cheaper. CloudFront isn't just GET: user-data PUTs can reach S3 via CloudFront (S3 buckets have a geographic location).

The above is tricky; ELB can make it easier, and it adds CloudWatch. A URL swap gives zero-downtime deploys, which is hard to do on your own. ELB is not a proxy; it's like config macros. Thus, start with ELB.

DBs are a problem for games because games are write-heavy, so the DB becomes the bottleneck. The solution is DynamoDB: NoSQL, so no relational cost. You specify throughput and Amazon does the configuration in the background. Thus, start with DynamoDB instead of RDS.
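The "you specify throughput" part is literally a parameter at table-creation time. A minimal boto3 sketch; the table name, key schema, and capacity numbers are all invented for illustration:

```python
# Sketch of a DynamoDB table with explicit provisioned throughput.
# Table name, key schema, and capacity numbers are hypothetical.
LEADERBOARD_TABLE = {
    "TableName": "leaderboard",
    "AttributeDefinitions": [
        {"AttributeName": "player_id", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "player_id", "KeyType": "HASH"},
    ],
    # Write-heavy game workload: ask for more write capacity than read.
    "ProvisionedThroughput": {"ReadCapacityUnits": 50, "WriteCapacityUnits": 200},
}

def create_leaderboard_table():
    """Request the table; Amazon does the partitioning needed to meet
    the requested throughput in the background."""
    import boto3  # deferred so the spec above is inspectable without the SDK
    return boto3.client("dynamodb").create_table(**LEADERBOARD_TABLE)
```

If the workload grows, you raise the capacity units rather than re-architecting the storage tier.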

SNS gives you mobile push, good for the social aspect of gaming. Also use it with CloudWatch to get notified about usage peaks or errors in your system.

Use SQS for background jobs, like avatar resizes or leaderboard recalculations, that you don't want bogging down your front-end servers.

Combo: CloudWatch uses SNS to notify you of a problem. You reply by SMS to an SNS topic hooked to SQS, which launches a background process to fix the problem.
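The consumer side of that combo is just a long-polling SQS worker. A boto3 sketch; the queue URL, alarm names, and job names are all hypothetical (SNS wraps the CloudWatch alarm JSON in a notification envelope whose "Message" field is a string):

```python
import json

def pick_job(message_body: str) -> str:
    """Map a CloudWatch alarm (delivered via SNS -> SQS) to a background
    job. Alarm and job names here are invented."""
    notification = json.loads(message_body)
    alarm = json.loads(notification.get("Message", "{}")).get("AlarmName", "")
    return {"HighCPU": "scale_out", "QueueDepth": "drain_queue"}.get(alarm, "page_oncall")

def worker_loop(queue_url: str):
    """Long-poll SQS and run fix-up jobs off the front-end servers."""
    import boto3  # deferred so pick_job stays testable without the SDK
    sqs = boto3.client("sqs")
    while True:
        resp = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            job = pick_job(msg["Body"])
            print("running", job)  # in real life: launch the background process
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```

Deleting the message only after the job is dispatched means a crashed worker lets the message reappear for another worker.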

Analytics: game developers prefer Redshift to MapReduce. Both have adapters for S3 and DynamoDB.

The game writes a log to the device and periodically uploads it to S3 via CloudFront, where it gets picked up by Redshift. Same deal for Apache server logs.


Normally you read manuals and write a script for how to provision and connect all the components in your architecture. CloudFormation is a formal way to do this, with built-in knowledge of how to configure and connect AWS components.

CloudFormation gives you automation. Intuit has been successful with this. They started with a simple migration to AWS using EC2, S3, RDS, and ELB. Managing it all alone was hard, so they started using CloudFormation and Chef.

Infrastructure as code: write modular JSON templates and treat them as source, with check-ins and peer review. He called this "modular templates, loosely coupled": one JSON CloudFormation stack per architecture tier, e.g. one JSON script just for SNS.
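One of those per-tier stacks can be tiny. Sketched here as a Python dict dumped to JSON, so it's easy to generate and to diff in review; the topic and output names are made up:

```python
import json

# A minimal single-tier CloudFormation stack: just the SNS topic for alerts.
# Other tiers (ELB, RDS, ...) would live in their own templates, loosely
# coupled to this one via stack outputs. Names are hypothetical.
SNS_TIER = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Alerting tier: one SNS topic",
    "Resources": {
        "AlertTopic": {
            "Type": "AWS::SNS::Topic",
            "Properties": {"TopicName": "ops-alerts"},
        }
    },
    "Outputs": {
        "AlertTopicArn": {"Value": {"Ref": "AlertTopic"}}
    },
}

def render() -> str:
    """Emit the template as the JSON file you'd check in and peer-review."""
    return json.dumps(SNS_TIER, indent=2, sort_keys=True)
```

Sorting keys on output keeps the checked-in file stable across regenerations, which keeps code review diffs small.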

Now they have many JSON stacks, which has become difficult to manage. He concluded that the best way to use this is a blue/green deploy: use clone to create a new environment with a few slight changes, then once it proves sane, use destroy to release all the previous resources. His scripts are highly dependent on Chef.

New CF features: parallel stack processing, richer template language, user defined resource names.

It looks like IAM is more powerful than I thought. For example, you can have some users who can only touch certain CloudFormation stacks (say dev) and others who can only touch others (say ops). Furthermore, you can have CF tag all the resources it creates and write an IAM policy saying certain users can't touch any resources with that tag.
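The tag-based restriction might look like the following IAM policy, shown as a Python dict. The tag key and value are invented; the `ec2:ResourceTag` condition key is the real mechanism for matching EC2 resources by tag:

```python
# Deny-by-tag sketch: users carrying this policy cannot perform any EC2
# action on resources tagged team=ops (a hypothetical tag that a
# CloudFormation stack could apply to everything it creates).
DENY_OPS_RESOURCES = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "ec2:*",
            "Resource": "*",
            "Condition": {"StringEquals": {"ec2:ResourceTag/team": "ops"}},
        }
    ],
}
```

Because an explicit Deny beats any Allow, this can be attached on top of a broader permissions policy.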

Use CF to do a rolling software update on EC2 instances in an Auto Scaling group. You can tell it to work in batches, with a pause between batches, so that if the update doesn't have the desired effect you can stop the rolling update and cycle out the already-updated machines.
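In the template, that batching is expressed as an UpdatePolicy on the Auto Scaling group resource. A fragment as a Python dict; the batch size, minimum in service, and pause time are arbitrary example values:

```python
# CloudFormation UpdatePolicy fragment for a rolling update: replace
# instances a couple at a time, keeping a floor of healthy capacity and
# pausing between batches so a bad update can be stopped mid-roll.
ROLLING_UPDATE_POLICY = {
    "UpdatePolicy": {
        "AutoScalingRollingUpdate": {
            "MaxBatchSize": "2",
            "MinInstancesInService": "4",
            "PauseTime": "PT5M",  # ISO-8601 duration: pause 5 minutes per batch
        }
    }
}
```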


NASA claims that AWS is more secure than on-premises.

People in your organization may already have uncontrolled cloud accounts.

Paying AWS for support, especially in the early years, was very helpful. It demonstrated that AWS acts as a partner, not just a provider.

Automate your asset provisioning and compliance monitoring. Generally, trust but verify. It's important to tell people how instead of telling people no.

Instead of giving people direct access to AWS, you can have them sign in to a local portal that proxies commands to AWS, so you can control what they can do.

Use local keys to encrypt some data that is uploaded to AWS.

Governance in the cloud is achievable. JPL's requirements were based on NIST 800-53. Rather than checking the box that says "yes, I have a fence around my data center", you have to say: that control is supplied by this line in our service agreement with Amazon. That's easier now that GovCloud has FedRAMP certification.

Commanding a spacecraft directly out of Amazon is probably a bad idea, but most things can be migrated to the cloud.

If you have a public EC2 instance then you don't have network logs. That's okay for some use cases, but other things should be in a VPC with a layer between you and the web.

He suggests Trusted Advisor as an app that will greatly help you with compliance.

They have approved AMIs with their security software, and the Amazon APIs can tell him when someone has launched a non-approved instance.

He claims that IAM is very powerful, but you have to understand how it works. They have a forensics IAM power user. The root account is always owned by the security team. Thus the organization has multiple Amazon root (i.e. billing) accounts, but security effectively controls them all.

How is the cloud better than what you have on site?

- Incident response: forensics can look at machines without the knowledge of the owner. Also, Amazon will give them the RAM of a live machine. When an instance is deemed a bad actor they can contain it with a security group, but can still investigate because they didn't have to kill it.

- Visibility: API access gives instant, up-to-date stats about all artifacts in the system. Outside the cloud you rely on your staff to physically provide this data.

- Data integrity: encrypted S3/Glacier with geographic diversity is the safest data of all time.

- Compliance: was initially an issue, but now with automation and stats it's easier than on-premises, although he did mention they had waivers.

- AWS security: gives announcements like "we see this vulnerability trending and you need this Tomcat patch to be safe", and AWS is more incented to have a bulletproof hypervisor than you are to keep your stuff safe, so that's not the weak link.

See governance talk and build from there.

If you have a devops style you must train the developers about iptables, etc.

Automate everything. Use the api, not the gui.

Get support from Amazon so that you have a trust relationship.

NASA JPL's lead security engineer asserts that AWS is more secure (when used correctly) than any on-prem setup could be.

When asked about insider threats, his answer suggested that his source of confidence was conversations with Amazon support, not some posted document.

Most custom tools they use are very old and more elegant tools exist today.

Started weak but was a very good talk.


Subatomic particles... because my 4:15 was moved.

Spot is the primary way that many workloads should use EC2. You can get a 10x cost savings.

HTCondor is a mechanism to cleanly distribute jobs across Spot Instances, cleanly detach when those spots are reclaimed, and cleanly detach from On-Demand instances when cheaper spots become available.


The first step in customer-focused delivery is to write the press release and FAQ. Then you know what the important part of what you're trying to deliver is.

Quote: security may be the most important thing in AWS. Systems are locked down by default. IAM supports SAML 2.0. He expects that in a few years encryption of data in transit and at rest will be the default; this is already the case with Redshift.

Airbnb has a 5 person operations team.

Dropcam ingests petabytes per month. Free inbound bandwidth. Awesome service.


Large-scale distributed systems.

TLA+ by Leslie Lamport, and PlusCal, are used to specify systems before implementation.


Our first priority day in and day out is security.

June 2013: launched fine-grained, resource-level permissions, i.e. only Mike can shut down this instance.

Calgary Scientific: PureWeb software dev kit. PureWeb is different from AppStream. Medical imaging: MRI and CT are causing an explosion of medical imaging data. FDA-cleared for web and iOS. The Mayo Clinic is a user of PureWeb; it saved 11 minutes in stroke diagnosis.

They do the CPU-bound processing in the cloud and only stream the screen image to the device. Supports collaboration across multiple devices simultaneously.


S3 is eventually consistent.


S3/Glacier helpers (with local cache):
- AWS Storage Gateway
- TwinStrata
- Panzura
- CloudBerry
- Cycle DataMan
- Aspera
- CrossFTP

A customer has 20 TB of CAT scans, etc., which they will upload to S3 and perform computations on there.

Send to S3 first, then use a policy to push to Glacier; that way all your object references stay in S3, which makes them easier to use when retrieved from Glacier.
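With boto3, that policy is a lifecycle rule on the bucket. A sketch; the bucket name, prefix, and the 30-day threshold are all made up:

```python
# Lifecycle rule sketch: objects land in S3, then transition to Glacier
# after 30 days, while their keys remain listable in the bucket.
# Prefix and timing are hypothetical.
GLACIER_RULE = {
    "Rules": [
        {
            "ID": "scans-to-glacier",
            "Filter": {"Prefix": "scans/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER"}],
        }
    ]
}

def apply_rule(bucket: str):
    """Attach the rule; S3 performs the transitions in the background."""
    import boto3  # deferred so the rule above is inspectable without the SDK
    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=GLACIER_RULE
    )
```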

What are they doing today? Storing it on HD and DVD in a fireproof safe. Difficult to use, difficult to share.

Janssen Diagnostics, lab image problem. J&J; Huntington, PA; Beerse, Belgium. Volume: 20 TB/yr. FDA-validated solution.

Use concurrent transfers into S3. Generally compress and then encrypt. Per month, less than 5% egress (free?).

Make sure your router is actually using your Direct Connect. Saturating Direct Connect requires multiple hosts. Confirm all links perform at advertised speed.


IDS in AWS should take advantage of the AWS environment and thus will look different.

Create IAM roles/identities with minimal access and multifactor for everything of value.

Turn on s3 bucket logging.

Create a security role in each AWS account. A role is a user without credentials; it can be used wherever a user can be used.

Use S3 as write-once storage for config comparison, via S3 versioning and multi-factor delete. Go hardcore and create a second account, so even if production is compromised they can't touch your write-once storage.
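Turning a bucket into that write-once store is mostly two bucket settings. A boto3 sketch; the bucket and MFA device string are placeholders, and note that MFA delete can only be enabled by the account root with its MFA device:

```python
# Desired bucket state: every overwrite keeps the old version, and
# permanently deleting a version requires a second factor.
WRITE_ONCE_CONFIG = {"Status": "Enabled", "MFADelete": "Enabled"}

def enable_write_once(bucket: str, mfa_device_and_code: str):
    """Enable versioning plus MFA delete, so an attacker holding stolen
    API keys cannot silently purge the config history. Arguments are
    placeholders, e.g. mfa_device_and_code = "<serial> <6-digit code>"."""
    import boto3  # deferred so the config above is inspectable without the SDK
    boto3.client("s3").put_bucket_versioning(
        Bucket=bucket,
        MFA=mfa_device_and_code,
        VersioningConfiguration=WRITE_ONCE_CONFIG,
    )
```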

In IAM, go to Users to see what credentials they have. If they have any you don't expect, you've been compromised.

S3, SQS, and SNS policies are part of your security perimeter. Watch for implicit permissions, i.e. if you can reset the admin password then you are admin.

Use a Python script to dump your AWS policies and config daily, diff it against your write-once copy, and notify someone of changes.
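The skeleton of that daily script might look like this (boto3's `get_account_authorization_details` returns users, groups, roles, and policies in one call; the notification step is omitted and paging is ignored for brevity):

```python
import json

def normalize(snapshot: dict) -> str:
    """Canonical JSON so two dumps of identical config compare equal
    regardless of key order."""
    return json.dumps(snapshot, indent=2, sort_keys=True)

def config_changed(today: dict, baseline: dict) -> bool:
    """True when today's dump differs from the write-once baseline copy."""
    return normalize(today) != normalize(baseline)

def dump_iam() -> dict:
    """Fetch the current IAM users/groups/roles/policies (single page;
    a real script would follow the paging marker)."""
    import boto3  # deferred so the pure diff helpers are testable offline
    return boto3.client("iam").get_account_authorization_details()
```

The baseline would be read back from the versioned write-once bucket, and any difference triggers the notification.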

Can use provided by aws.

Given that the attacker broke in, how can you tell what they've done? Use S3 logging. Also use billing alerts to notice if the attacker is stealing your resources. You can link accounts so your write-once account notifies you about mass spending in your production account.

Rebuild instances daily. Then you only have a day's worth to investigate, and if an attacker gets access, they lose it when the instance gets recycled.

If you pay for support, AWS Trusted Advisor will warn you about dangerous config. AWS support is trained to escalate potential security issues to the AWS security team.

New security best practices whitepaper.


Encryption and key management: in AWS, partner solutions, CloudHSM.

S3, Glacier, Redshift, and RDS (Oracle and MSSQL) all support automatic server-side encryption prior to any writes to disk.

Activation? For S3 it's just a checkbox.

Each service generates an AES-256 key per object, archive, or cluster DB.

Regularly rotated master encryption keys are stored separately from the ciphertext.

Client-side encryption is the same deal, but you do all the work. To make this easier, the S3 SDK (Java, Ruby, .NET) provides an S3 encryption client. Here AWS has supplied the code, but only you have access to your keys.
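The idea behind those encryption clients is envelope encryption: a fresh data key per object, wrapped under a master key only you hold. A rough Python sketch (not the SDK client itself; it uses the third-party `cryptography` package and boto3, the metadata names approximate what the SDK stores, and the key-wrap step is deliberately simplistic):

```python
import base64
import os

def envelope_metadata(wrapped_key: bytes, iv: bytes) -> dict:
    """S3 object metadata carrying the per-object encrypted data key and IV,
    in the spirit of the SDK encryption clients."""
    return {
        "x-amz-key": base64.b64encode(wrapped_key).decode(),
        "x-amz-iv": base64.b64encode(iv).decode(),
    }

def encrypt_and_put(bucket: str, key: str, plaintext: bytes, master_key: bytes):
    """Encrypt client-side, then upload: S3 only ever sees ciphertext,
    and only you ever hold master_key. All names are placeholders."""
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    import boto3
    data_key, iv = os.urandom(32), os.urandom(12)          # AES-256 key per object
    body = AESGCM(data_key).encrypt(iv, plaintext, None)   # encrypt the object
    wrapped = AESGCM(master_key).encrypt(iv, data_key, None)  # wrap the data key
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=body,
        Metadata=envelope_metadata(wrapped, iv),
    )
```

Retrieval reverses the steps: unwrap the data key with the master key, then decrypt the body.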

Key management infrastructure partner solutions: Trend, SafeNet, Gazzang, CipherCloud, Voltage, Vormetric.

CloudHSM: you have dedicated access to the appliance. It lives in your VPC. Only you have the security-officer role; AWS just has the admin role.

A CloudFormation template provisions the VPC, subnets, and EC2 with SafeNet drivers. CloudHSM is included in 2013 PCI compliance.

SafeNet ProtectV encrypts I/O from EC2 to EBS.

SafeNet ProtectApp: same deal for S3.

Redshift natively uses your CloudHSM directly.

Entersekt migrated their entire infrastructure to AWS. They've integrated with LastPass.

Netflix uses CloudHSM. Lots of application-level use cases: password reset tokens, data encryption, DRM, hash/verify.

Complexity is anti-security. They have three levels: low (the instance has the key, which gives high throughput), medium (the key lives elsewhere and is accessed via a REST layer), and high (accessed via a REST call, but the key lives on the HSM).

For low and medium keys, the proxy layer maintains a key DB. That DB requires a high-level key. The internal CA also uses high for its root. Device activation also uses high.

They want to move payments to the cloud and are looking at PCI.

Using HSMs per region: lower latency, and it saved 33% in cost.

They just released a whitepaper on data encryption and key management. Several other resources are in the slides.
