Notes and information that were collected while studying and prepping for the AWS SA Associate Exam.
Topic
Answer
Exam Time:
80 Minutes
No. Questions:
60 Questions
Question Types:
Scenario and Multiple Choice
Passing Score:
~ 70%
Validity Period:
2 years
Renewal Exam:
Half price
General
Amazon History:
2003 - Chris Pinkham and Benjamin Black presented a paper on what Amazon's internal infrastructure should look like and suggested selling it as a service
2004 - SQS the first AWS service launched
2006 - Official AWS Launch
2007 - 180K devs on platform
2010 - Amazon.com moved to the AWS platform
2012 - First Re-Invent conference in Las Vegas
2013 - Certifications Launched
2014 - AWS committed to achieve 100% renewable energy usage for its global footprint
2015 - AWS broke out its revenue: 6 billion USD per annum and growing close to 90% year over year
Global Infrastructure:
Regions Vs. Availability Zones:
A Region is a geographical area which consists of at least 2 Availability Zones (AZs). An AZ is simply a data center.
AWS reserves the first 4 and the last IP address of every subnet:
x.x.x.0 - Network Address
x.x.x.1 - Gateway Address
x.x.x.2 - DNS Address
x.x.x.3 - Future Allocation Address
x.x.x.255 - Broadcast Address
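The reserved-address list above can be sketched with Python's standard ipaddress module. This is a minimal illustration, not an AWS API: the function names and the 10.0.1.0/24 example subnet are assumptions chosen for the example.

```python
import ipaddress


def reserved_addresses(cidr: str) -> dict:
    """Map each reserved role to its address, following the list above."""
    net = ipaddress.ip_network(cidr)
    base = net.network_address
    return {
        "network": base,
        "gateway": base + 1,
        "dns": base + 2,
        "future allocation": base + 3,
        "broadcast": net.broadcast_address,
    }


def usable_host_count(cidr: str) -> int:
    """Addresses left for your own resources after the 5 reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - 5


if __name__ == "__main__":
    # Hypothetical example subnet
    for role, addr in reserved_addresses("10.0.1.0/24").items():
        print(f"{addr} - {role}")
    print(usable_host_count("10.0.1.0/24"))
```

For a /24 this leaves 256 - 5 = 251 addresses for instances.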
Consolidated Billing:
Accounts roll for customers:
Paying account is independent, can not access resources of the other accounts
Linked accounts are independent from one another
Currently there is a limit of 20 linked accounts for consolidated billing (soft limit)
One bill per AWS account
Easy to track charges and allocate costs between linked accounts
Volume pricing discount
Resources across all linked accounts are tallied, and billing is applied collectively to allow bigger discounts
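A rough sketch of why tallying usage across linked accounts yields a bigger volume discount. The tier sizes and prices below are made up for illustration and are NOT real AWS prices; the function names are also assumptions.

```python
from decimal import Decimal

# Hypothetical price tiers (NOT real AWS pricing):
# (tier size in GB, price per GB); None means "all remaining usage".
TIERS = [(1000, Decimal("0.030")), (None, Decimal("0.020"))]


def tiered_cost(gb: int) -> Decimal:
    """Price usage tier by tier, cheapest rate applying to the highest volume."""
    cost = Decimal("0")
    remaining = gb
    for size, price in TIERS:
        chunk = remaining if size is None else min(remaining, size)
        cost += chunk * price
        remaining -= chunk
        if remaining == 0:
            break
    return cost


def separate_bills(usages) -> Decimal:
    """Each account is billed on its own, so each starts at the first tier."""
    return sum((tiered_cost(u) for u in usages), Decimal("0"))


def consolidated_bill(usages) -> Decimal:
    """Usage is tallied across linked accounts before the tiers are applied."""
    return tiered_cost(sum(usages))


if __name__ == "__main__":
    usage = [800, 800]  # GB used by two linked accounts
    print(separate_bills(usage), consolidated_bill(usage))
```

With these made-up tiers, two accounts using 800 GB each pay 48 when billed separately but 42 when consolidated, because 600 GB of the combined usage reaches the cheaper tier.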
Active Directory:
Provides single sign-on to the AWS console, which authenticates directly off of your Active Directory infrastructure
Uses Security Assertion Markup Language (SAML) authentication responses.
Behind the scenes, the sign-in uses the AssumeRoleWithSAML API to request temporary security credentials and then constructs a sign-in URL for the AWS Management Console
Browser will then receive the sign-in URL and will be redirected to the console
You always authenticate against AD first, and then are granted security credentials that allow you to log into the AWS console
Best Practices:
Business Benefits of Cloud:
Almost 0 upfront infrastructure investment
Just in time Infrastructure
More efficient resource utilization
Usage based costing
Reduced time to market
Technical Benefits of Cloud:
Automation - Scriptable infrastructure
Auto-Scaling
Proactive Scaling
More efficient development life cycle
Improved testability
DR and Business Continuity
Overflow the traffic to the cloud
Design for Failure
Rule of thumb: Be a pessimist when designing architectures in the cloud
Assume things will fail; always design, implement, and deploy for automated recovery from failure
Assume your hardware will fail
Assume outages will occur
Assume that some disaster will strike your application
Assume that you will be slammed with more than the expected number of requests per second
Assume that with time your application software will fail too
Decouple your components:
Think SQS
Build components that do not have tight dependencies on each other so that if one component dies, fails, sleeps, or becomes busy, the other components are built so they can continue to work as if no failure is happening. Build each component as a black box
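The decoupling idea above can be sketched with an in-memory queue standing in for SQS. This is only an illustration of the pattern: WorkQueue and drain are hypothetical names, and nothing here is the actual SQS API.

```python
from collections import deque


class WorkQueue:
    """In-memory stand-in for a message queue such as SQS (not the SQS API)."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        """Return the oldest message, or None when the queue is empty."""
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


def drain(queue, handler):
    """Consumer: process every queued message.

    A message whose handler raises is put back on the queue for a later
    retry, so one bad message does not stop the rest from being processed.
    """
    done, failed = [], []
    while True:
        msg = queue.receive()
        if msg is None:
            break
        try:
            done.append(handler(msg))
        except Exception:
            failed.append(msg)
    for msg in failed:
        queue.send(msg)
    return done
```

The producer only ever calls send, so it keeps working even while the consumer is down or busy; the queue absorbs the backlog until a consumer drains it.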
Service Limits:
Each service has default limits defined; see the official AWS documentation on service limits for the current values.
Amazon Virtual Private Cloud (VPC) lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define. You have complete control over your virtual networking, IP ranges, creation of subnets and configuration of route tables and network gateways.
Virtual data center in the cloud
Allowed up to 5 VPCs in each AWS region by default
All subnets in the default VPC have a route to the Internet gateway attached to that VPC
Multiple IGW's can be created, but only a single IGW can be attached to a VPC
Each EC2 instance has both a public and private IP address
If you delete the default VPC, the only way to get it back is to submit a support ticket
By default when you create a VPC, a default main routing table automatically gets created as well.
Subnets are always mapped to a single AZ
Subnets cannot span multiple AZs
/16 is the largest CIDR block available when provisioning an IP space for a VPC
Amazon reserves 5 of the available IP addresses in a newly created subnet
x.x.x.0 - Always subnet network address and is never usable
x.x.x.1 - Reserved by AWS for the VPC router
x.x.x.2 - Reserved by AWS for subnet DNS
x.x.x.3 - Reserved by AWS for future use
x.x.x.255 - Always subnet broadcast address and is never usable.
169.254.169.253 - Amazon DNS
By default all traffic between subnets is allowed
By default not all subnets have access to the Internet. Either an Internet Gateway or NAT gateway is required for private subnets
You can only have 1 Internet gateway per VPC
A security group can stretch across different AZ's
You can also create Hardware Virtual Private Network (VPN) connection between your corporate data center and your VPC and leverage the AWS cloud as an extension of your corporate data center
Network Address Translation (NAT) Instances:
When creating a NAT instance, disable Source/Destination checks on the instance or you could encounter issues
NAT instances must be in a public subnet
There must be a route out of the private subnet to the NAT instance in order for it to work
The amount of traffic that a NAT instance supports depends on the size of the NAT instance
If you are experiencing any sort of bottleneck issues with a NAT instance, then increase the instance size
HA can be achieved by using Auto-scaling groups, or multiple subnets in different AZ's with a scripted fail-over procedure
NAT instances are always behind a security group
Network Address Translation (NAT) Gateway:
NAT Gateways scale automatically up to 10Gbps
There is no need to patch NAT gateways as the AMI is handled by AWS
NAT gateways are automatically assigned a public IP address
When a new NAT gateway has been created, remember to update your route table
No need to assign a security group, NAT gateways are not associated with security groups
Preferred in the Enterprise
No need to disable Source/Destination checks
Network Access Control Lists (NACLs):
Numbered list of rules that are evaluated in order starting at the lowest numbered rule first to determine what traffic is allowed in or out depending on what subnet is associated with the rule
The highest rule number is 32766
Start with rules starting at 100 so you can insert rules if needed
Default NACL will allow ALL traffic in and out by default
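Every subnet must be associated with a NACL; if you do not explicitly associate one, the subnet is automatically associated with the default NACL. A newly created custom NACL denies all traffic in and out until you add rules
NACL rules are stateless: allowing inbound traffic does not automatically allow the matching outbound response, so you need an explicit outbound rule
A subnet can be associated with only one NACL at a time, although one NACL can be associated with multiple subnets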
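A minimal sketch of how numbered NACL rules are evaluated: lowest rule number first, first match wins, and anything unmatched is denied. The Rule class, the single-port matching, and the rule values are simplifications assumed for illustration; real NACL rules also match on protocol, port ranges and CIDR.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    number: int  # lower numbers are evaluated first
    port: int    # a single port, to keep the sketch small
    allow: bool  # True = ALLOW, False = DENY


def evaluate(rules: List[Rule], port: int) -> bool:
    """Return True if traffic to `port` is allowed.

    Rules are checked in ascending rule-number order and the first
    matching rule decides. Traffic that matches no rule is denied.
    """
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.allow
    return False


# Hypothetical inbound rule set, numbered from 100 to leave room for inserts
inbound = [
    Rule(200, 22, True),    # never reached for port 22: rule 100 matches first
    Rule(100, 22, False),
    Rule(110, 443, True),
]

if __name__ == "__main__":
    for p in (22, 443, 80):
        print(p, "ALLOW" if evaluate(inbound, p) else "DENY")
```

Because rules are stateless, a real setup would need a separate outbound rule list evaluated the same way for the response traffic.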
VPC Peering:
Connection between two VPCs that enables you to route traffic between them using private IP addresses via a direct network route
Instances in either VPC can communicate with each other as if they are within the same network
You can create VPC peering connections between your own VPCs or with a VPC in another account within a SINGLE REGION
AWS uses existing infrastructure of a VPC to create a VPC peering connection. It is not a gateway nor a VPN, and does not rely on separate hardware
There is NO single point of failure for communication nor any bandwidth bottleneck
There is no transitive peering between VPC peers (Can't go through 1 VPC to get to another)
Hub and spoke configuration model (1 to 1)
Be mindful of IPs in each VPC, if multiple VPCs have the same IP blocks, they will not be able to communicate
You can peer VPC's with other AWS accounts as well as with other VPCs in the same account
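Two of the peering rules above, overlapping IP blocks and no transitive peering, can be sketched as follows. The VPC names, CIDR blocks and function names are hypothetical, and the check uses Python's standard ipaddress module rather than any AWS API.

```python
import ipaddress


def cidrs_overlap(a: str, b: str) -> bool:
    """VPCs whose CIDR blocks overlap cannot be peered."""
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))


def can_communicate(src: str, dst: str, peerings: set) -> bool:
    """Only a direct peering connection counts; peering is not transitive."""
    return frozenset((src, dst)) in peerings


# Hypothetical hub and spoke: B is peered with both A and C
peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}

if __name__ == "__main__":
    print(can_communicate("A", "B", peerings))  # direct peering
    print(can_communicate("A", "C", peerings))  # would have to go through B
    print(cidrs_overlap("10.0.0.0/16", "10.0.1.0/24"))
```

For A and C to talk to each other, they need their own peering connection; routing through B is not possible.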
Resource or Operation
Default Limit
Comments
VPCs per region:
5
The limit for Internet gateways per region is directly correlated to this one. Increasing this limit will increase the limit on Internet gateways per region by the same amount.
Subnets per VPC:
200
Internet gateways per region:
5
This limit is directly correlated with the limit on VPCs per region. You cannot increase this limit individually; the only way to increase this limit is to increase the limit on VPCs per region. Only one Internet gateway can be attached to a VPC at a time.
Customer gateways per region:
50
VPN connections per region:
50
VPN connections per VPC (per virtual private gateway):
10
Route tables per VPC:
5
Including the main route table. You can associate one route table to one or more subnets in a VPC.
Routes per route table (non-propagated routes):
50
This is the limit for the number of non-propagated entries per route table. You can submit a request for an increase of up to a maximum of 100; however, network performance may be impacted.
BGP advertised routes per route table (propagated routes):
100
You can have up to 100 propagated routes per route table; however, the total number of propagated and non-propagated entries per route table cannot exceed 100. For example, if you have 50 non-propagated entries (the default limit for this type of entry), you can only have 50 propagated entries. This limit cannot be increased. If you require more than 100 prefixes, advertise a default route.
Elastic IP addresses per region for each AWS account:
5
This is the limit for the number of VPC Elastic IP addresses you can allocate within a region. This is a separate limit from the Amazon EC2 Elastic IP address limit.
Security groups per VPC:
500
Inbound or outbound rules per security group:
50
You can have 50 inbound and 50 outbound rules per security group (giving a total of 100 combined inbound and outbound rules). If you need to increase or decrease this limit, you can contact AWS Support — a limit change applies to both inbound and outbound rules. However, the multiple of the limit for inbound or outbound rules per security group and the limit for security groups per network interface cannot exceed 250. For example, if you want to increase the limit to 100, we decrease your number of security groups per network interface to 2.
Security groups per network interface:
5
If you need to increase or decrease this limit, you can contact AWS Support. The maximum is 16. The multiple of the limit for security groups per network interface and the limit for rules per security group cannot exceed 250. For example, if you want 10 security groups per network interface, we decrease your number of rules per security group to 25.
Network interfaces per region:
350
This limit is the greater of either the default limit (350) or your On-Demand instance limit multiplied by 5. The default limit for On-Demand instances is 20. If your On-Demand instance limit is below 70, the default limit of 350 applies. You can increase the number of network interfaces per region by contacting AWS Support, or by increasing your On-Demand instance limit.
Network ACLs per VPC:
200
You can associate one network ACL to one or more subnets in a VPC. This limit is not the same as the number of rules per network ACL.
Rules per network ACL:
20
This is the one-way limit for a single network ACL, where the limit for ingress rules is 20, and the limit for egress rules is 20. This limit can be increased upon request up to a maximum of 40; however, network performance may be impacted due to the increased workload to process the additional rules.
Active VPC peering connections per VPC:
50
If you need to increase this limit, contact AWS Support. The maximum limit is 125 peering connections per VPC. The number of entries per route table should be increased accordingly; however, network performance may be impacted.
Outstanding VPC peering connection requests:
25
This is the limit for the number of outstanding VPC peering connection requests that you've requested from your account.
Expiry time for an unaccepted VPC peering connection request:
1 week (168 hrs)
VPC endpoints per region:
20
The maximum limit is 255 endpoints per VPC, regardless of your endpoint limit per region.
Flow logs per single eni, single subnet, or single VPC in a region:
2
You can effectively have 6 flow logs per network interface if you create 2 flow logs for the subnet, and 2 flow logs for the VPC in which your network interface resides. This limit cannot be increased.
NAT gateways per Availability Zone:
5
A NAT gateway in the pending, active, or deleting state counts against your limit.
AWS Direct Connect lets you establish a dedicated network connection between your network and one of the AWS Direct Connect locations, using industry standard 802.1q VLANs.
Makes it easy to establish a dedicated network connection from your premises to AWS
Establish private connectivity between AWS and your data center, office or colocation environment
Can reduce network costs, increase bandwidth throughput, and provide more consistent network connectivity than Internet-based connections.
Requires a dedicated line such as MPLS, or another circuit run from a telco.
From this line, you would have a cross connect from your on-premises device direct to AWS data centers
Resource or Operation
Default Limit
Comments
Virtual interfaces per AWS Direct Connect connection:
50
Active AWS Direct Connect connections per region per account:
Elastic Compute Cloud (EC2) - Backbone of AWS, provides resizable compute capacity in the cloud. Reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change.
Once an Instance has been launched with instance store storage, you can not attach additional instance store volumes after the instance is launched, only EBS volumes
When using an instance store volume, you cannot stop the instance (the option to do so will not be available, as the instance would move to another host, which would cause complete data loss)
When using ephemeral storage, an underlying host failure will result in data loss
You can reboot both instance types (w/ephemeral and EBS volumes) and will not lose data, but again, an ephemeral volume based instance can NOT be stopped
By default both Root volumes will be deleted on termination, however you can tell AWS to keep the root device volume on a new instance during launch
Information about a running instance is retrieved from meta-data, not user-data; remember the URL is http://169.254.169.254/latest/meta-data/
Can not encrypt root volumes, but you can encrypt any additional volumes that are added and attached to an EC2 instance.
You can have up to 10 tags per EC2 instance
AWS does not recommend ever putting RAID 5's on EBS
When configuring a launch configuration for an auto-scaling group, the Health Check Grace Period is the period of time to ignore health checks while instances or auto-scaled instances are added and booting.
Termination protection is turned off by default, you must turn it on
Roles:
You can only assign an EC2 role to an instance on create. You can not assign a role after the instance has been created and/or is running
You can change the permissions on a role post creation, but can NOT assign a new role to an existing instance
Role permissions can be changed, but not swapped
Roles are more secure than storing your access key and secret key on individual EC2 instances
Roles are easier to manage. You can assign a role and change permissions on that role at any time, and the changes take effect immediately
Roles can only be assigned when that EC2 instance is being provisioned
Roles are universal, you can use them in any region
Instance sizing:
T2 - Lowest Cost General Purpose - Web/Small DBs
M4 - General Purpose - App Servers
M3 - General Purpose - App servers
C4 - Compute Optimized - CPU Intensive Apps/DBs
C3 - Compute Optimized - CPU Intensive Apps/DBs
R3 - Memory Optimized - Memory Intensive Apps/DBs
G2 - Graphics / General Purpose - Video Encoding/Machine Learning/3D App Streaming
I2 - High Speed Storage - NoSQL DBs, Data Warehousing
Instance store is also referred to as ephemeral storage and is not persistent
Instances using instance store storage can not be stopped. If they are, data loss would result
If there is an issue with the underlying host and your instance needs to be moved, or is lost, Data is also lost
Instance store volumes cannot be detached and reattached to other instances; They exist only for the life of that instance
Best used for scratch storage, storage that can be lost at any time with no bad ramifications, such as a cache store
EBS (Elastic Block Store):
Elastic Block Store is persistent storage that can be used to provide storage to EC2 instances.
You can NOT mount 1 EBS volume to multiple EC2 instances; use EFS instead
Default action for EBS volumes is for the root EBS volume to be deleted when the instance is terminated
By default, ROOT volumes will be deleted on termination, however with EBS volumes only, you can tell AWS to keep the root device volume
EBS backed instances can be stopped, you will NOT lose any data
EBS volumes can be detached and reattached to other EC2 instances
3 Types of available EBS volumes can be provisioned and attached to an EC2 instance:
General Purpose SSD (GP2):
General Purpose up to 10K IOPS
99.999% availability
Ratio of 3 IOPS per GB with up to 10K IOPS and ability to burst
Up to 3K IOPS for short periods for volumes under 1TB
Provisioned IOPS SSD (IO1)
Designed for I/O intensive applications such as large relational or NoSQL DBs.
Use if need more than 10K IOPS
Magnetic (Standard)
Lowest cost per GB
Ideal for workloads where data is accessed infrequently and apps where the lowest cost storage is important.
Ideal for fileservers
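The GP2 ratio above (3 IOPS per GB, capped at 10K) can be worked through with a short calculation. This sketch uses only the figures quoted in these notes and ignores burst credits and any minimum baseline AWS may apply; the function names are assumptions.

```python
GP2_IOPS_PER_GB = 3      # ratio quoted in the notes above
GP2_MAX_IOPS = 10_000    # cap quoted in the notes above


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a GP2 volume of the given size."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS)


def needs_provisioned_iops(required_iops: int) -> bool:
    """Per the notes, move to Provisioned IOPS when you need more than 10K."""
    return required_iops > GP2_MAX_IOPS


if __name__ == "__main__":
    for size in (100, 500, 5000):
        print(size, "GB ->", gp2_baseline_iops(size), "IOPS")
```

A 500 GB volume gets a 1,500 IOPS baseline, while anything from about 3,334 GB upward is pinned at the 10K cap.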
Encryption:
Root Volumes cannot be encrypted by default, you need a 3rd party utility
Other volumes added to an instance can be encrypted.
AMIs:
AMIs are simply snapshots of a root volume and are stored in S3
AMI's are regional. You can only launch an AMI from the region in which it was stored
You can copy AMI's to other regions using the console, CLI or Amazon EC2 API
Provides information required to launch a VM in the cloud
Template for the root volume for the instance (OS, Apps, etc)
Permissions that control which AWS accounts can use the AMI to launch instances
When you create an AMI, by default it's marked private. You have to manually change the permissions to make the image public or share images with individual accounts
Block device mapping that specifies volumes to attach to the instance when it's launched
Hardware Virtual Machines (HVM) AMI's Available
Paravirtual (PV) AMI's Available
You can select an AMI based on:
Region
OS
Architecture (32 vs. 64 bit)
Launch Permissions
Storage for the root device (Instance Store Vs. EBS)
Security Groups:
Act like virtual firewalls for the associated EC2 instance
If you edit a security group, it takes effect immediately.
You can not set any deny rules in security groups, you can only set allow rules
There is an implicit deny any any at the end of the security group rules
You don't need outbound rules for any inbound request. Rules are stateful meaning that any request allowed in, is automatically allowed out
You can have any number of EC2 instances associated with a security group
Snapshots:
You can take a snapshot of a volume; the snapshot is stored on S3
Snapshots are point in time copies of volumes
The first snapshot will be a full snapshot of the volume and can take a little time to create
Snapshots are incremental, which means that only the blocks that have changed since your last snapshot are moved to S3
Snapshots of encrypted volumes are encrypted automatically
Volumes restored from encrypted snapshots are encrypted automatically
You can share snapshots but only if they are not encrypted
Snapshots can be shared with other AWS accounts or made public in the marketplace, again as long as they are NOT encrypted
If you are making a snapshot of a root volume, you should stop the instance before taking the snapshot
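The incremental behaviour of snapshots described above can be pictured as a block-level diff. This is a conceptual sketch only: the dictionary-of-blocks model and the function name are assumptions, not how EBS is implemented internally.

```python
def blocks_to_upload(previous: dict, current: dict) -> dict:
    """Return only the blocks that are new or changed since the last snapshot.

    Both arguments map a block index to that block's bytes.
    """
    return {
        index: data
        for index, data in current.items()
        if previous.get(index) != data
    }


if __name__ == "__main__":
    volume = {0: b"boot", 1: b"data", 2: b"logs"}
    first = blocks_to_upload({}, volume)          # first snapshot: every block
    changed = dict(volume)
    changed[2] = b"more logs"
    second = blocks_to_upload(volume, changed)    # later snapshot: block 2 only
    print(len(first), len(second))
```

The first snapshot has nothing to compare against, so every block is copied, which is why it takes longer than the ones that follow.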
RAID Volumes:
If you take a snapshot, the snapshot excludes data held in the cache by applications or OS. This tends to not be an issue on a single volume, however multiple volumes in a RAID array, can cause a problem due to interdependencies of the array
Take an application consistent snapshot
Stop the application from writing to disk
Flush all caches to the disk
Snapshot of RAID array --> 3 Methods:
Freeze the file system
Unmount the RAID Array
Shutdown the EC2 instance --> Take Snapshot --> Turn it back on
Placement Groups:
A logical group of instances in a single AZ
Using placement groups enables applications to participate in a low latency, 10Gbps network
Placement groups are recommended for applications that benefit from low network latency, high network throughput or both
A placement group can't span multiple AZ's so it is a SPoF.
The name you specify for a placement group must be unique within your AWS account
Only certain types of instances can be launched in a placement group: Compute Optimized, GPU, Memory Optimized, and Storage Optimized.
AWS recommends that you use the same instance family and same instance size within the placement group.
You can't merge placement groups
You can't move an existing instance into a placement group
You can create an AMI from your existing instance and then launch a new instance from the AMI into a placement group
Pricing Models:
On Demand:
Pay fixed rate by the hour with no commitment
Users that want the low cost and flexibility of EC2
Apps with short term, spiky or unpredictable workloads that cannot be interrupted
Apps being developed or tested on EC2 for the first time
Reserved:
Provide capacity reservation and offer significant discount on the hourly charge for an instance (1-3 year terms)
Applications have steady state, or predictable usage
Apps that require reserved capacity
Users able to make upfront payments to reduce their total computing costs even further.
Spot:
Bid whatever price you want for instance capacity by the hour
When your bid price is greater than or equal to the spot price, your instance will boot
When the spot price is greater than your bid price, your instance will be terminated with a two-minute warning.
Applications have flexible start and end times
Apps that are only feasible at very low compute prices
Users with urgent computing needs for large amounts of additional capacity
If the spot instance is terminated by Amazon EC2, you will not be charged for a partial hour of usage
If you terminate the instance yourself you WILL be charged for any partial hours of usage.
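The spot rules above (run while the bid is at least the spot price, and partial hours billed only when you terminate the instance yourself) can be sketched as below. The function names and the "aws" / "user" labels are assumptions for the example.

```python
import math


def instance_runs(bid: float, spot_price: float) -> bool:
    """The instance runs while the bid is greater than or equal to the spot price."""
    return bid >= spot_price


def billable_hours(hours_run: float, terminated_by: str) -> int:
    """Hours charged under the partial-hour rule described above.

    terminated_by: "aws"  -> the partial hour is not charged
                   "user" -> the partial hour is charged in full
    """
    if terminated_by == "aws":
        return math.floor(hours_run)
    if terminated_by == "user":
        return math.ceil(hours_run)
    raise ValueError("terminated_by must be 'aws' or 'user'")


if __name__ == "__main__":
    print(billable_hours(2.5, "aws"), billable_hours(2.5, "user"))
```

An instance that ran for two and a half hours is billed for 2 hours if EC2 reclaimed it and 3 hours if you shut it down yourself.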
Elastic Load Balancing offers two types of load balancers that both feature high availability, automatic scaling, and robust security. These include the Classic Load Balancer that routes traffic based on either application or network level information, and the Application Load Balancer that routes traffic based on advanced application level information that includes the content of the request.
When configuring ELB health checks, bear in mind that you may want to create a file like healthcheck.html or point the ping path of the health check to the main index file in your application
Remember the health check interval is how often a health check will occur
Your Healthy/Unhealthy thresholds are the number of consecutive successful or failed health checks required before a back-end instance is marked healthy or unhealthy
Health Check Interval: 10 seconds
Unhealthy Threshold: 2
Healthy Threshold: 3
This means that if the health check fails twice in a row, the back-end instance will be marked as unhealthy. This is 2 checks at 10 seconds per check, so after roughly 20 seconds the instance is marked unhealthy
Likewise, with a healthy threshold of 3, it takes 3 x 10 seconds = 30 seconds. After 30 seconds with 3 consecutive successful checks, the instance will be marked as healthy.
Enabling Cross-Zone Load Balancing distributes load across all back-end instances, even if they exist in different AZ's
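The threshold arithmetic above can be sketched as a small state machine. The function names are hypothetical, and the sketch assumes an instance changes state only after the required number of consecutive contrary results, as the notes describe.

```python
def seconds_to_change_state(interval_s: int, threshold: int) -> int:
    """Time needed for `threshold` consecutive checks at the given interval."""
    return interval_s * threshold


def simulate(results, healthy_threshold=3, unhealthy_threshold=2,
             start_healthy=True):
    """Replay a sequence of check results (True = success).

    Returns the instance state (True = healthy) after each check.
    """
    healthy = start_healthy
    streak = 0  # consecutive results that disagree with the current state
    states = []
    for ok in results:
        if ok == healthy:
            streak = 0
        else:
            streak += 1
            needed = healthy_threshold if ok else unhealthy_threshold
            if streak >= needed:
                healthy = ok
                streak = 0
        states.append(healthy)
    return states


if __name__ == "__main__":
    print(seconds_to_change_state(10, 2))   # time to mark unhealthy
    print(seconds_to_change_state(10, 3))   # time to mark healthy again
    print(simulate([False, False, True, True, True]))
```

With the settings from the notes, two failures flip the instance to unhealthy on the second check, and it returns to healthy only on the third consecutive success.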
ELBs are NEVER given public IP Addresses, only a public DNS name
ELBs can be In Service or Out of Service depending on health check results
Charged by the hour and on a per GB basis of usage
Must be configured with at least one listener
A listener must be configured with a protocol and a port for the front end (client to ELB connection), as well as a protocol and port for the back end (ELB to instances connection)
ELBs support HTTP, HTTPS, TCP, and SSL (Secure TCP)
In a VPC, ELBs support all ports (1-65535)
ELBs do not support multiple SSL certificates
In EC2-Classic, ELBs support the following ports:
25 (SMTP)
80 (HTTP)
443 (HTTPS)
465 (SMTPS)
587 (SMTPS)
1024-65535
HTTP Error Codes:
200 - The request has succeeded
3xx - Redirection
4xx - Client Error (404 not found)
5xx - Server Error
Application Load Balancer Limit
Default Limit
Load balancers per region:
20
Target groups per region:
50
Listeners per load balancer:
10
Targets per load balancer:
1000
Subnets per Availability Zone per load balancer:
1
Security groups per load balancer:
5
Rules per load balancer (excluding defaults):
10
No. of times a target can be registered per LB:
100
Load balancers per target group:
1
Targets per target group:
1000
Classic Load Balancer Limit
Default Limit
Load balancers per region:
20
Listeners per load balancer:
100
Subnets per Availability Zone per load balancer:
1
Security groups per load balancer:
5
Load Balancers per Region Limit NOTE:
This limit includes both your Application load balancers and your Classic load balancers. This limit can be increased upon request.
Amazon EC2 Container Service (ECS) is a highly scalable, high performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.
Not covered as exam topic currently
Resource or Operation
Default Limit
Number of clusters per region per account:
1000
Number of container instances per cluster:
1000
Number of services per cluster:
500
For additional information about Elastic Container Service Limits, see Limits in Amazon ECS
AWS Elastic Beanstalk is an easy-to-use service for deploying and scaling web applications and services developed with Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS.
AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying compute infrastructure resources for you.
Serverless processing
AWS Lambda can automatically run code in response to modifications to objects in S3 buckets, messages arriving in Amazon Kinesis streams, table updates in DynamoDB, API call logs created by CloudTrail, and custom events from mobile applications, web applications, or other web services
Lambda runs your code on high-availability compute infrastructure and performs all of the administration of the compute resources including server and operating system maintenance, capacity provisioning and automatic scaling, code and security patch deployment, and code monitoring and logging. All you need to do is supply the code.
Supports NodeJs, Python 2.x, Java
99.99% availability for both the service itself and the functions it operates.
First 1 million requests are free
$0.20 per 1 million requests thereafter
Duration is calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100 ms
The price depends on the amount of memory you allocate to your function. You are charged $0.00001667 for every GB-second used
Free Tier gives you 1 Million free requests per month, and 400K GB-Seconds of compute time per month
The memory size you choose for your functions, determines how long they can run in the free tier
The lambda free tier does not automatically expire at the end of your 12 month AWS free tier term, but is available to both existing and new AWS customers indefinitely
Functions can be run in response to HTTP requests using API Gateway or API calls made using AWS SDKs
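The pricing rules above can be combined into a rough monthly estimate. This sketch uses only the figures quoted in these notes (which may be out of date) and assumes the prices are in USD; the function names and the example workload are hypothetical.

```python
from decimal import Decimal

PRICE_PER_REQUEST = Decimal("0.20") / Decimal("1000000")  # after the free tier
PRICE_PER_GB_SECOND = Decimal("0.00001667")
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = Decimal("400000")


def billed_ms(duration_ms: int) -> int:
    """Round a duration up to the nearest 100 ms."""
    return -(-duration_ms // 100) * 100


def monthly_cost(requests: int, avg_duration_ms: int, memory_mb: int) -> Decimal:
    """Estimated monthly charge after subtracting the free tier."""
    gb_seconds = (Decimal(requests) * Decimal(billed_ms(avg_duration_ms)) / 1000
                  * Decimal(memory_mb) / 1024)
    compute = max(gb_seconds - FREE_GB_SECONDS, Decimal("0")) * PRICE_PER_GB_SECOND
    request_charge = max(requests - FREE_REQUESTS, 0) * PRICE_PER_REQUEST
    return (compute + request_charge).quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical workload: 3M requests a month, 1 second each, 512 MB
    print(monthly_cost(3_000_000, 1000, 512))
```

For that workload the compute usage is 1,500,000 GB-seconds, of which 1,100,000 are billable (18.34), plus 2 million billable requests (0.40), for a total of about 18.74. Halving the memory halves the GB-seconds, which is why the memory size determines how far the free tier stretches.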
Amazon Simple Storage Service (Amazon S3), provides developers and IT teams with secure, durable, highly-scalable cloud storage. Amazon S3 is easy to use object storage, with a simple web service interface to store and retrieve any amount of data from anywhere on the web.
Object based storage only for files, can not install OS or applications
Data is spread across multiple devices and multiple facilities
Can lose 2 facilities and still have access to your files
Files can be between 1 byte and 5TB, and there is no limit on total storage
Files are stored flatly in buckets, Folders don't really exist, but are part of the file name
S3 bucket names have a universal name-space, meaning each bucket name must be globally unique
S3 stores data in alphabetical order (lexicographical order)
Read after write consistency for PUTS of new objects (As soon as you write an object, it is immediately available)
Eventual consistency for overwrite PUTS and DELETES. (Updating or deleting an object could take time to propagate)
S3 is basically a key value store and consists of the following:
Key - Name of the object
Value - Data made up of bytes
Version ID (important for versioning)
Meta-data - Data about what you are storing
ACLs - Permissions for stored objects
Amazon guarantees 99.99% availability for the S3 platform
Amazon guarantees 99.999999999% durability for S3 information (11 x 9's)
Tiered storage, and life-cycle management available
Versioning is available but must be enabled. It is off by default
Offers encryption, and allows you to secure the data using ACLs
S3 charges for storage, requests, and data transfer
Bucket names must be all lowercase, however in US-Standard if creating with the CLI tool, it will allow capital letters
The transfers tab shows uploads, downloads, permission changes, storage class changes, etc..
When you upload a file to S3, by default it is set private
You can transfer files up to 5GB using PUT requests
You can set up access control for your buckets by using bucket policies or ACLs
Change the storage class under the Properties tab when an object is selected
S3 buckets can be configured to create access logs which logs all requests to the S3 bucket
S3 event notifications can be sent to SNS, SQS, or Lambda functions. Lambda is location specific, not available in South Korea
All storage tiers have SSL support, millisecond first byte latency, and support life-cycle management policies.
Storage Tiers:
Standard S3:
Stored redundantly across multiple devices in multiple facilities
Designed to sustain the loss of 2 facilities concurrently
11-9's durability, 99.99% availability
S3-IA (Infrequently Accessed):
For data that is accessed less frequently, but requires rapid access when needed
Lower fee than S3, but you are charged a retrieval fee
Also designed to sustain the loss of 2 facilities concurrently
11-9's durability, 99.99% availability
Reduced Redundancy Storage (RRS):
Use for data such as thumbnails or data that could be regenerated
Costs less than Standard S3
Designed to provide 99.99% durability and 99.99% availability of objects over a year
Designed to sustain the loss of a single facility
Glacier:
Very cheap, Stores data for as little as $0.01 per gigabyte, per month
Optimized for data that is infrequently accessed. Used for archival only
It takes 3-5 hours to restore access to files from Glacier
Versioning and Cross-Region Replication (CRR):
Versioning must be enabled in order to take advantage of Cross-Region Replication
Versioning resides under Cross Region Replication tab
Once Versioning is turned on, it can not be turned off, it can only be suspended
If you truly wanted versioning off, you would have to create a new bucket and move your objects
When versioning is enabled, you will see a slider tab at the top of the console that will enable you to hide/show all versions of files in the bucket
If a file is deleted for example, you need to slide this tab to show in order to see previous versions of the file
With versioning enabled, if you delete a file, S3 creates a delete marker for that file, which tells the console to not display the file any longer
In order to restore a deleted file you simply delete the delete marker file, and the file will then be displayed again in the bucket
To move back to a previous version of a file including a deleted file, simply delete the newest version of the file or the delete marker, and the previous version will be displayed
Versioning stores multiple full copies of the same file. For example, upload a 1MB file and your storage usage is 1MB. If you then tweak the file so that the content changes but the size stays the same, and upload it again, you will see only the single updated file while the version slider is on hide. With the slider on show, you will see both the original 1MB file and the updated 1MB file, so your total S3 usage is now 2MB, not 1MB
Versioning does NOT support de-duplication or any similar technology currently
For Cross Region Replication (CRR), as long as versioning is enabled, clicking on the tab will now give you the ability to suspend versioning, and enable cross region replication
Cross Region Replication (CRR) requires versioning to be enabled on both the source and destination buckets in the selected regions
The destination bucket must exist and its name must again be globally unique (it can be created right from the versioning tab, in the CRR configuration section via button)
You have the ability to select a separate storage class for any Cross Region Replication destination bucket
CRR does NOT replicate existing objects. Only objects stored after the feature is turned on are replicated
Versioning integrates with life-cycle management and also supports MFA delete capability. This will use MFA to provide additional security against object deletion
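The delete marker and storage usage behavior described above can be sketched with a toy in-memory model. This is a hypothetical class for illustration only, not an AWS API; the real service tracks version IDs, which are left out here.

```python
class VersionedBucket:
    """Toy in-memory model of a versioned bucket (hypothetical, not an AWS API)."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}  # key -> list of versions, oldest first

    def put(self, key, data: bytes):
        # Each upload is stored as a new version on top of the older ones.
        self._versions.setdefault(key, []).append(data)

    def delete(self, key):
        # A delete on a versioned bucket only adds a delete marker on top.
        self._versions.setdefault(key, []).append(self.DELETE_MARKER)

    def remove_latest(self, key):
        # Permanently remove the newest version (or the delete marker).
        self._versions[key].pop()

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1] is self.DELETE_MARKER:
            return None  # hidden: no versions, or the latest is a delete marker
        return versions[-1]

    def storage_used(self):
        # Every stored version counts toward usage; there is no de-duplication.
        return sum(
            len(v)
            for versions in self._versions.values()
            for v in versions
            if v is not self.DELETE_MARKER
        )


bucket = VersionedBucket()
bucket.put("report.txt", b"a" * 1024)  # original upload
bucket.put("report.txt", b"b" * 1024)  # same size, different content
print(bucket.storage_used())           # both versions count toward usage

bucket.delete("report.txt")            # adds a delete marker
print(bucket.get("report.txt"))        # the object now appears deleted
bucket.remove_latest("report.txt")     # delete the delete marker
print(bucket.get("report.txt")[:1])    # the latest real version is visible again
```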
Life-cycle Management:
When adding a rule under Life-cycle, the rule can be applied to either the entire bucket or a single 'folder' in a bucket
Rules can be set to move objects to separate storage tiers or delete them altogether
Can be applied to current version and previous versions
If multiple actions are selected, for example transition from STD to IA storage 30 days after upload and Archive 60 days after upload, the object will be moved to IA storage 30 days after it is uploaded, and moved to Glacier 30 days after that.
Calculates based on UPLOAD date, not Action date
Transition from STD to IA storage class requires a MINIMUM of 30 days. You can not select or set any date range less than 30 days
Archive to Glacier can be set at a minimum of 1 day If STD->IA is NOT set
If STD->IA IS set, then you will have to wait a minimum of 60 days to archive the object because the minimum for STD->IA is 30 days, and the transition to glacier then takes an additional 30 days
When you enable versioning, there will be 2 sections in life-cycle management tab. 1 for the current version of an object, and another for previous versions
Minimum object size for IA storage is 128KB
Can set policy to permanently delete an object after a given time frame
If versioning is enabled, then the object must be set to expire, before it can be permanently deleted
Can not move objects to Reduced Redundancy using life-cycle policies
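The timing rules above (everything counted from the upload date, 30-day minimum for STD->IA, and a further 30 days before Glacier when IA is used) can be sketched as a small date calculation. The function and constant names are hypothetical, and this is only a model of the rules as stated in these notes.

```python
from datetime import date, timedelta

MIN_DAYS_TO_IA = 30  # minimum age before STD -> IA
MIN_DAYS_IN_IA = 30  # additional days in IA before archiving to Glacier


def transition_dates(uploaded: date, ia_after=None, glacier_after=None) -> dict:
    """Return the date each transition fires, always counted from the upload date."""
    if ia_after is not None and ia_after < MIN_DAYS_TO_IA:
        raise ValueError("STD -> IA needs at least 30 days")
    if glacier_after is not None:
        # Without an IA step Glacier can start after 1 day; with one, IA days + 30.
        minimum = 1 if ia_after is None else ia_after + MIN_DAYS_IN_IA
        if glacier_after < minimum:
            raise ValueError(f"Glacier transition needs at least {minimum} days")
    schedule = {}
    if ia_after is not None:
        schedule["IA"] = uploaded + timedelta(days=ia_after)
    if glacier_after is not None:
        schedule["Glacier"] = uploaded + timedelta(days=glacier_after)
    return schedule


# Uploaded on 1 Jan 2016, IA after 30 days, Glacier after 60 days.
print(transition_dates(date(2016, 1, 1), ia_after=30, glacier_after=60))
```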
S3 Transfer Acceleration:
Utilizes the CloudFront Edge Network to accelerate your uploads to S3
Instead of uploading directly to your S3 bucket, you can use a distinct URL to upload directly to an edge location which will then transfer the file to S3
There is a test utility available that will test uploading direct to S3 vs through Transfer Acceleration, which will show the upload speed from different global locations
Turning on and using Transfer Acceleration will incur an additional fee
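A minimal sketch of the "distinct URL" idea, assuming the accelerate endpoint takes the form bucketname.s3-accelerate.amazonaws.com and that accelerated bucket names can not contain periods. The helper name is hypothetical, and the endpoint format should be checked against current AWS documentation before relying on it.

```python
from urllib.parse import quote


def accelerate_url(bucket: str, key: str) -> str:
    """Build an upload URL for the accelerate endpoint (format is an assumption)."""
    if "." in bucket:
        raise ValueError("bucket names with periods are assumed not to work here")
    # quote() leaves '/' alone, so 'folder' style keys keep their structure.
    return f"https://{bucket}.s3-accelerate.amazonaws.com/{quote(key)}"


print(accelerate_url("my-bucket", "videos/big file.mp4"))
```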
2 types of encryption available:
In transit:
Uses SSL/TLS to encrypt the transfer of the object
At Rest (AES 256):
Server Side: S3 Managed Keys (SSE-S3)
Server Side: AWS Key Management Service, Managed Keys (SSE-KMS)
Server Side: Encryption with Customer provided Keys (SSE-C)
Client Side Encryption
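The three server side options differ mainly in which request headers accompany the upload. The sketch below builds those headers for each option; the function name and the `mode` strings are hypothetical, and the header names are given as I understand the S3 REST API, so treat them as an assumption to verify.

```python
import base64
import hashlib


def sse_headers(mode: str, kms_key_id=None, customer_key=None) -> dict:
    """Return the request headers for one server side encryption option."""
    if mode == "SSE-S3":
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        # The key and its MD5 digest are both sent base64-encoded.
        key_b64 = base64.b64encode(customer_key).decode()
        md5_b64 = base64.b64encode(hashlib.md5(customer_key).digest()).decode()
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key": key_b64,
            "x-amz-server-side-encryption-customer-key-MD5": md5_b64,
        }
    raise ValueError(f"unknown mode: {mode}")


print(sse_headers("SSE-S3"))
print(sorted(sse_headers("SSE-C", customer_key=bytes(32))))
```

With SSE-C the key travels with every request, which is one reason to only send such requests over SSL/TLS.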
Pricing (What you're charged for when using S3):
Storage used
Number of Requests
Data Transfer
Resource or Operation
Default Limit
Buckets per account:
100
Largest file size you can transfer with PUT request:
5GB
CloudFront:
Amazon CloudFront is a global content delivery network (CDN) service that accelerates delivery of your websites, APIs, video content or other web assets.
Edge Location is the location where content will be cached, separate from an AWS Region/AZ
Origin is the source of all files, can be S3, an EC2 instance, an ELB, or Route53
Distribution is the name given to the CDN, which consists of a collection of edge locations
Web Distributions are used for websites
RTMP - (Real-Time Messaging Protocol) used for streaming media, typically Adobe Flash files
Edge locations can be R/W and will accept a PUT request on an edge location, which then will replicate the file back to the origin
Objects are cached for the life of the TTL (24 hours by default)
You can clear objects from edge locations, but you will be charged
When enabling CloudFront from an S3 origin, you have the option to restrict bucket access; this will disable the direct link to the file in the S3 bucket, and ensure that the content is only served from CloudFront
The path pattern uses wildcards (* and ?), not full regular expressions
You can restrict access to your distributions using signed URLs
You can assign Web Application Firewall rules to your distributions
Distribution URLs are going to be non-pretty names such as random_characters.cloudfront.net; you can create a CNAME that points to the CloudFront name to make the URL user friendly
You can restrict content based on geographical locations in the behaviors tab
You can create custom error pages via the error pages tab
Purging content is handled in the Invalidations tab
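The TTL and invalidation behavior above can be sketched with a toy model of a single edge location. The class and method names are hypothetical and this is not a CloudFront API; time is passed in explicitly so the example stays deterministic.

```python
class EdgeCache:
    """Toy model of one edge location's cache (hypothetical, not a CloudFront API)."""

    def __init__(self, origin: dict, ttl_seconds: int = 24 * 60 * 60):
        self.origin = origin    # path -> content held at the origin
        self.ttl = ttl_seconds  # 24 hours by default, as in the notes
        self._cache = {}        # path -> (content, time fetched)

    def get(self, path: str, now: float):
        entry = self._cache.get(path)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0], "hit"        # served from the edge until the TTL expires
        content = self.origin[path]       # otherwise fetch from the origin again
        self._cache[path] = (content, now)
        return content, "miss"

    def invalidate(self, path: str):
        # Clears the cached object early (the real service charges for this).
        self._cache.pop(path, None)


origin = {"/index.html": "v1"}
edge = EdgeCache(origin)
print(edge.get("/index.html", now=0))     # first request goes to the origin
origin["/index.html"] = "v2"              # the origin content changes
print(edge.get("/index.html", now=3600))  # the edge still serves the cached copy
edge.invalidate("/index.html")
print(edge.get("/index.html", now=3601))  # after invalidation the new copy is fetched
```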
Resource or Operation
Default Limit
Data transfer rate per distribution:
40 Gbps
Requests per second per distribution:
100,000
Web distributions per account:
200
RTMP distributions per account:
100
Alternate domain names (CNAMEs) per distribution:
100
Origins per distribution:
25
Cache behaviors per distribution:
25
White-listed headers per cache behavior:
10
White-listed cookies per cache behavior:
10
SSL certificates per account when serving HTTPS requests using dedicated IP addresses (no limit when serving HTTPS requests using SNI):
2
Custom headers that you can have Amazon CloudFront forward to the origin:
Elastic File System (EFS):
File storage service for EC2 instances. It's easy to use and provides a simple interface that allows you to create and configure file systems quickly and easily. With EFS, storage capacity is elastic, growing and shrinking automatically as you add and remove files, so your applications have the storage they need, when they need it.
Think NFS, only without a set storage limit
Supports NFSv4, and you only pay for the storage you use
Billing rate is 30 cents per GB per month
Can scale to petabytes
Can support thousands of concurrent NFS connections
Data is stored across multiple AZs within a region
File based storage accessed over NFS, not object based storage like S3.
Can be shared with multiple instances
Read after Write Consistency
You must ensure that instances that will mount EFS are in the same security group as the EFS allocation. If they are not, you can modify the security groups, and add them to the same security group that was used to launch the EFS storage
Storage Gateway:
The AWS Storage Gateway is a service connecting an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization’s on-premises IT environment and AWS’s storage infrastructure. It is an on-premises virtual appliance that can be downloaded and used to cache S3 locally at a customer’s site
Replicates data to and from AWS platform
Gateway Cached Volumes:
Entire dataset is stored on S3 and the most frequently accessed data is cached on-site
These volumes minimize the need to scale your on-prem storage infrastructure while providing your applications with low-latency access to their frequently accessed data
Can create storage volumes up to 32TBs in size and mount them as iSCSI devices from your on-premises application servers.
Data written to these volumes is stored in S3, with only a cache of recently written and recently read data stored locally on your on-premises storage hardware.
Gateway Stored Volumes:
Store your primary data locally while asynchronously backing up that data to AWS
Provide low-latency access to their entire datasets, while providing durable, off-site backups.
Can create storage volumes up to 1TB in size and mount them as iSCSI devices from your on-premises application servers.
Data written to your gateway stored volumes is stored on your on-prem storage hardware, and asynchronously backed up to S3 in the form of EBS snapshots.
Gateway Virtual Tape Library (VTL):
Used for backup, and works with popular backup applications like NetBackup, Backup Exec and Veeam
Pricing:
You pay for what you use
Has 4 pricing components:
Gateway usage (per gateway per month)
Snapshot storage usage (per GB per month)
Volume storage usage (Per GB per month)
Data transfer out (Per GB per month)
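The four pricing components above add up as a simple sum. In the sketch below every rate is a made-up placeholder in US cents, NOT a real AWS price, and the class and function names are hypothetical; it only shows how the components combine into a monthly bill.

```python
from dataclasses import dataclass


@dataclass
class GatewayRates:
    """Rates in US cents. The values used below are made up for illustration."""
    per_gateway_month: int
    snapshot_per_gb_month: int
    volume_per_gb_month: int
    transfer_out_per_gb: int


def monthly_cost_cents(rates, gateways, snapshot_gb, volume_gb, transfer_out_gb):
    """Sum the four pricing components for one month."""
    return (
        gateways * rates.per_gateway_month
        + snapshot_gb * rates.snapshot_per_gb_month
        + volume_gb * rates.volume_per_gb_month
        + transfer_out_gb * rates.transfer_out_per_gb
    )


# Placeholder rates only; look up current pricing before estimating a real bill.
example_rates = GatewayRates(
    per_gateway_month=12500,
    snapshot_per_gb_month=5,
    volume_per_gb_month=3,
    transfer_out_per_gb=9,
)
total = monthly_cost_cents(
    example_rates, gateways=1, snapshot_gb=200, volume_gb=1000, transfer_out_gb=50
)
print(f"${total / 100:.2f}")
```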
Import Export:
Import Export Disk - Import to EBS, S3, Glacier but only export from S3
Pay for what you use
Has 3 pricing components:
Per device fee
Data load time charge per data-loading-hour
Possible return shipping charges for expedited shipping or shipping to destinations not local to the Import/Export region