Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. Amazon RDS offers six familiar database engines to choose from: Amazon Aurora, Oracle, Microsoft SQL Server, PostgreSQL, MySQL, and MariaDB.
Traditional relational databases that include tables, rows, fields
On-Line Transaction Processing (OLTP) type DB
You can copy a snapshot to another region if you want to have your database available in another region
You scale your DB by taking a snapshot and doing a restore to a larger sized tier
RDS maximum size for a MS SQL Server DB with SQL Server Express Edition is 10GB per DB
Supported RDS Platforms:
MS SQL Server
Oracle
MySQL Server
PostgreSQL
Aurora
MariaDB
When a backup is restored, the restore will always be a new RDS instance, with a new DNS name
Backup types:
Automated backups
Allows you to recover your database to any point in time within a retention period
Retention periods can be between 1 and 35 days
Takes a full daily snapshot and will also store transaction logs through the day
When you do a recovery, AWS will choose the most recent daily backup and then apply transaction logs
Allows point-in-time recovery down to the second within the retention period
Enabled by default
Backup data is stored in S3
You get free storage space equal to the size of your database.
Taken within a defined window
During the backup, storage I/O may be suspended and you may experience increased latency
Database snapshots
User initiated from the console
Stored even after you delete the original RDS instance unlike automatic backups
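The point-in-time recovery described under automated backups (start from the most recent daily snapshot, then apply transaction logs up to the requested second) can be sketched as a toy model. This is only an illustration of the selection logic, not how RDS is implemented; the function name `plan_restore` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def plan_restore(snapshots, log_times, target):
    """Pick the newest snapshot at or before `target`, plus the log
    entries that must be replayed on top of it to reach `target`."""
    candidates = [s for s in snapshots if s <= target]
    if not candidates:
        raise ValueError("target is earlier than the oldest snapshot")
    base = max(candidates)
    replay = sorted(t for t in log_times if base < t <= target)
    return base, replay

# Hypothetical data: one snapshot per day, one log entry per hour.
day0 = datetime(2024, 1, 1, 3, 0, 0)
snapshots = [day0 + timedelta(days=d) for d in range(3)]
log_times = [day0 + timedelta(hours=h) for h in range(1, 72)]

target = datetime(2024, 1, 2, 10, 30, 0)
base, replay = plan_restore(snapshots, log_times, target)
print(base)         # snapshot used as the starting point
print(len(replay))  # number of log entries replayed after it
```

In this model a target earlier than the oldest snapshot is rejected, which mirrors the idea that recovery is only possible within the retention period.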
Encryption:
Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL, and MariaDB
Encryption is done using the AWS Key Management Service (KMS)
Once your RDS instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its automated backups, read replicas, and snapshots
To use RDS encryption, create a new DB instance with encryption enabled and migrate your data to it
Encrypting an existing DB instance is not supported
Multi-AZ:
Allows you to have an exact copy of your production database in another AZ
AWS handles the replication for you, so when your prod database is written to, the write will automatically be synchronized to the stand-by DB
In the event of DB maintenance, instance failure or AZ failure, RDS will automatically fail-over to the standby so that database operations can resume quickly without Admin intervention.
In a fail-over scenario, the same DNS name is used to connect to the secondary instance, so there is no need to reconfigure your application
Multi-AZ configurations are used for HA/DR only, and are not used for improving performance
To scale for performance you need to set up read replicas
Available for SQL Server, Oracle, MySQL, PostgreSQL, and Aurora
Read Replicas:
Uses asynchronous replication, from the primary instance to other instances that can be read from
You can have up to 5 read replicas of your main database
Allow you to have a read only copy of your prod database
Used primarily for very read-heavy database workloads
SQL Server and Oracle are not supported
Used for scaling not DR
Must have automatic backups setup
You can have read replicas of read replicas (but this could add replication latency, as they are daisy-chained)
Each read replica will have its own DNS endpoint
You cannot have read replicas that have Multi-AZ
You can create read replicas of Multi-AZ source databases however
Read Replicas can be promoted to be their own databases, however this breaks replication
Read Replicas in a second region for MySQL and MariaDB, not for PostgreSQL
Read Replicas can be bigger than the primary source DB from a resource perspective
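Because each read replica has its own DNS endpoint, the application has to decide which endpoint each statement goes to. A minimal sketch of that read/write split follows; the class name `ReplicaRouter`, the hostnames, and the naive "starts with SELECT" check are all assumptions made for illustration, not part of RDS.

```python
import itertools

class ReplicaRouter:
    """Send writes to the primary endpoint and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary if no replicas are configured.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, sql):
        # Naive classification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().lower().startswith("select")
        return next(self._readers) if is_read else self.primary

router = ReplicaRouter(
    primary="primary.example.internal",  # hypothetical hostnames
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
print(router.endpoint_for("select count(*) from orders"))
```

Since replication to the replicas is asynchronous, a read routed this way may briefly return data older than the latest write.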
Aurora:
MySQL compatible relational database engine that combines speed and availability of high end commercial databases with the simplicity and cost-effectiveness of open source databases
Provides up to 5 times better performance than MySQL at a price point 1/10th of a commercial database while delivering similar performance and availability
Starts with 10GB, scales in 10GB increments up to 64TB (Storage Auto scaling)
Compute resources can scale up to 32 vCPUs and 244 GB of memory
Maintains 2 copies of your data in each Availability Zone, across a minimum of 3 AZs, for 6 copies of your data in total
Designed to transparently handle the loss of up to two copies of data without affecting the DB write availability and up to 3 copies without affecting read availability
Designed to handle loss of up to 2 copies without affecting DB write availability
Designed to handle loss of up to 3 copies without affecting DB read availability
Self healing storage, data blocks and disks are continuously scanned for errors and repaired automatically
2 Types of replicas available:
Aurora Replicas - Separate aurora DB, can have up to 15 replicas
MySQL read replicas, can have up to 5
If a failure occurs of the primary database, a fail-over will happen automatically to an aurora replica, but will NOT auto fail over to a MySQL read replica.
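The Aurora storage fault-tolerance figures above (6 copies, writes survive the loss of up to 2, reads survive the loss of up to 3) can be written down as a small check. This is just a restatement of those numbers in code, with the hypothetical function name `availability`; it says nothing about how Aurora implements them.

```python
TOTAL_COPIES = 6          # 2 copies in each of 3 AZs
MAX_LOST_FOR_WRITES = 2   # writes survive the loss of up to 2 copies
MAX_LOST_FOR_READS = 3    # reads survive the loss of up to 3 copies

def availability(copies_lost):
    """Report whether writes and reads remain available after losing copies."""
    if not 0 <= copies_lost <= TOTAL_COPIES:
        raise ValueError("copies_lost must be between 0 and 6")
    return {
        "writes": copies_lost <= MAX_LOST_FOR_WRITES,
        "reads": copies_lost <= MAX_LOST_FOR_READS,
    }

for lost in range(5):
    print(lost, availability(lost))
```

Losing one whole AZ removes 2 copies, so by these figures both reads and writes stay available.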
Fast and flexible NoSQL DB service for all apps that need consistent, single-digit millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and many other applications.
Non Relational DB (No-SQL), comprised of collections (tables), of documents (rows), with each document consisting of key/value pairs (fields)
Document oriented DB
Offers push button scaling, meaning that you can scale your db on the fly without any downtime
RDS is not so easy, you usually have to use a bigger instance size or add read replicas
Stored on SSD Storage
Spread across 3 geographically distinct data centers
Eventually Consistent Reads (Default)
Consistency across all copies of data is usually reached within 1 second
Repeating a read after a short time should return updated data
Best Read Performance
Strongly Consistent Reads
Returns a result that reflects all writes that received a successful response prior to the read
Structure:
Tables
Items (Think rows in a traditional table)
Attributes (Think columns of data in a table)
Provisioned throughput capacity
Write throughput: $0.0065 per hour for every 10 units
Read throughput: $0.0065 per hour for every 50 units
First 25 GB of storage is free
Storage costs of 25 cents per additional GB per Month
Can be expensive for writes, but really really cheap for reads
The combined key/value size must not exceed 400 KB for any given document
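The pricing figures in these notes can be combined into a rough monthly estimate. The sketch below assumes the rates are in US dollars, that partial blocks of capacity units are billed as whole blocks, and that a month is 720 hours; none of these assumptions comes from the notes, and the function name `monthly_cost` is hypothetical.

```python
import math

HOURLY_RATE = 0.0065        # assumed USD, per block of capacity units
WRITE_UNITS_PER_BLOCK = 10
READ_UNITS_PER_BLOCK = 50
FREE_STORAGE_GB = 25
STORAGE_RATE_PER_GB = 0.25  # per additional GB per month

def monthly_cost(write_units, read_units, storage_gb, hours=720):
    # Assumption: partial blocks are billed as whole blocks.
    write_blocks = math.ceil(write_units / WRITE_UNITS_PER_BLOCK)
    read_blocks = math.ceil(read_units / READ_UNITS_PER_BLOCK)
    throughput = (write_blocks + read_blocks) * HOURLY_RATE * hours
    storage = max(0, storage_gb - FREE_STORAGE_GB) * STORAGE_RATE_PER_GB
    return round(throughput + storage, 2)

print(monthly_cost(write_units=100, read_units=100, storage_gb=40))
```

With equal read and write capacity, the write side needs five times as many billing blocks, which is the sense in which writes are expensive and reads are cheap.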
Default limits, US East (N. Virginia) Region:
Maximum capacity units per table or global secondary index: 40,000 read capacity units and 40,000 write capacity units
Maximum capacity units per account: 80,000 read capacity units and 80,000 write capacity units
Default limits, all other regions:
Maximum capacity units per table or global secondary index: 10,000 read capacity units and 10,000 write capacity units
Maximum capacity units per account: 20,000 read capacity units and 20,000 write capacity units
Amazon ElastiCache is a web service that makes it easy to deploy, operate, and scale an in-memory data store or cache in the cloud.
Can be used for DB caching in conjunction with services like RDS
Web service that makes it easy to deploy, operate, and scale in memory cache in the cloud
Improves the performance of web applications by allowing you to retrieve information from fast, managed in-memory caches, instead of relying entirely on slower disk based databases
Improves application performance by storing critical pieces of data in memory for low-latency access
Cached information may include the results of I/O intensive database queries or the results of computationally intensive calculations
Supports 2 open-source in-memory caching engines:
Memcached:
Widely adopted memory object caching system
ElastiCache is protocol compliant with Memcached, so popular tools that you use today with existing Memcached environments will work seamlessly with the service
No Multi AZ support
Redis:
Popular open-source in-memory key-value store that supports data structures such as sorted sets and lists
ElastiCache supports Master/Slave replication and Multi-AZ, which can be used to achieve cross-AZ redundancy
Good choice if your db is read heavy and not prone to frequent changing
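Caching the results of expensive database queries, as described above, is commonly done with a cache-aside pattern: check the cache first and only query the database on a miss. The sketch below uses a plain dictionary as a stand-in for a Memcached or Redis client; `slow_query`, `get_customer`, and the 60-second TTL are hypothetical, and this is not the ElastiCache API.

```python
import time

cache = {}                 # stand-in for a Memcached/Redis client
stats = {"db_calls": 0}    # counts how often the "database" is hit
TTL_SECONDS = 60           # hypothetical expiry for cached entries

def slow_query(customer_id):
    """Placeholder for an I/O-intensive database query."""
    stats["db_calls"] += 1
    return {"customer_id": customer_id, "orders": 3}

def get_customer(customer_id, now=time.monotonic):
    entry = cache.get(customer_id)
    if entry is not None and entry[1] > now():
        return entry[0]                      # cache hit
    value = slow_query(customer_id)          # cache miss: go to the database
    cache[customer_id] = (value, now() + TTL_SECONDS)
    return value

get_customer(42)   # first call misses and populates the cache
get_customer(42)   # second call is served from memory
print(stats["db_calls"])
```

The expiry matters because cached data can go stale, which is why this approach suits read-heavy data that does not change often.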
Default limits (all regions):
Nodes per region: 50 (the maximum number of nodes across all clusters in a region)
Nodes per cluster (Memcached): 20 (the maximum number of nodes in an individual Memcached cluster)
Nodes per cluster (Redis): 1 (the maximum number of nodes in an individual Redis cluster)
Clusters per replication group (Redis): 6 (the maximum number of clusters in a Redis replication group; one is the read/write primary, all others are read-only replicas)
Parameter groups per region: 20 (the maximum number of parameter groups you can create in a region)
Security groups per region: 50 (the maximum number of security groups you can create in a region)
Subnet groups per region: 50 (the maximum number of subnet groups you can create in a region)
Subnets per subnet group: 20 (the maximum number of subnets you can define for a subnet group)
Fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just 25 cents per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per TB per year, less than a tenth of the cost of most other data warehousing solutions.
Used for data warehousing / business intelligence
Uses 1024KB/1MB block size for its columnar storage
Tools like Cognos, Jaspersoft, SQL Server Reporting Services, Oracle Hyperion, SAP NetWeaver
Used to pull in very large and complex data sets
Used by management to run queries on data, such as current performance vs target
10 times faster than traditional RDS
Massively Parallel Processing (MPP)
Automatically distributes data and query load across all nodes
Currently only available in 1 AZ at a time
Can restore snapshots to new AZ's in the event of an outage
2 types of transactions:
On-line Transaction Processing (OLTP) - Standard transaction-driven database insert/retrieval
Pulls up a row of data such as Name, Date, etc.
On-line Analytics Processing (OLAP) - Report-driven queries over large data sets
Uses different type of architecture both from a DB and infrastructure layer
Pull in data from multiple queries, gathering tons of information depending on what type of report is required
Start with Single Node (160GB)
Multi-node configurations available:
Leader Node - Manages client connections and receives queries
Compute Node - Store data and perform queries and computations
Can have up to 128 compute nodes
Columnar data storage:
Instead of storing data as a series of rows, Redshift organizes data by column.
Unlike row-based systems, which are ideal for transaction processing, column-based systems are ideal for data warehousing and analytics, where queries often involve aggregates performed over large data sets.
Only columns involved in the queries are processed and columnar data is stored sequentially on the storage media
Column-based systems require far fewer I/Os, greatly improving query performance
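The difference between row and column layouts can be seen in a few lines of Python. This is a conceptual sketch with made-up sample data, not a description of Redshift's on-disk format: the same aggregate is computed once by walking whole records and once by reading a single column.

```python
rows = [
    {"name": "Ann", "date": "2024-01-01", "amount": 120.0},
    {"name": "Bob", "date": "2024-01-01", "amount": 80.0},
    {"name": "Cy", "date": "2024-01-02", "amount": 200.0},
]

# Row layout: every whole record is touched, even though only one field is needed.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: each column is stored contiguously as its own sequence.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# The aggregate reads a single column and ignores the others.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals are the same; what changes is how much data has to be read to produce them, which is where the I/O saving comes from.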
Advanced compression:
Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk
Redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores
Does not require indexes or materialized views so uses less space than traditional relational db systems
Automatically samples your data and selects the most appropriate compression scheme
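To see why sequentially stored, similar values compress well, consider run-length encoding. It is used here only as an illustrative technique; the notes do not say which compression schemes Redshift applies, and the column values below are invented.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]

region_column = ["EU", "EU", "EU", "US", "US", "EU", "EU", "EU", "EU"]
encoded = run_length_encode(region_column)
print(encoded)
print(run_length_decode(encoded) == region_column)
```

A single column tends to contain long runs of repeated values, whereas a row interleaves unrelated fields, so the same trick gains far less on row-oriented data.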
Priced on 3 things
Total number of hours you run across your compute nodes for the billing period
You are billed for 1 unit per node per hour, so a 3-node cluster running an entire month would incur 2,160 instance hours
You will not be charged for leader node hours, only compute nodes will incur charges
Charged on backups
Charged for data transfers (only within VPC not outside)
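The 2,160 instance-hour figure in the pricing notes corresponds to 3 compute nodes running 24 hours a day for a 30-day month. A short check of that arithmetic follows; the function name `instance_hours` and the 30-day month are assumptions.

```python
def instance_hours(compute_nodes, days, hours_per_day=24):
    # Only compute nodes are counted; the leader node is not billed.
    return compute_nodes * days * hours_per_day

print(instance_hours(compute_nodes=3, days=30))  # 2160
```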
Security:
Encrypted in transit using SSL
Encrypted at rest using AES-256 encryption
Takes care of key management by default
Manage your own keys through Hardware Security Module (HSM)
AWS Database Migration Service helps you migrate databases to AWS easily and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases. The service supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora or Microsoft SQL Server to MySQL.
Allows migration of your production DB platforms to AWS or between services like MySQL -> PostgreSQL
Once started, AWS manages all the complexities of the migration process like data type transformation, compression, and parallel transfer for faster transfer, while ensuring that data changes to the source database that occur during the migration process are automatically replicated to the target
AWS schema conversion tool automatically converts the source DB schema and a majority of the custom code, including views, stored procedures and functions to a format compatible with the target DB
AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premise data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon Elastic MapReduce (EMR).
Not covered as exam topic currently
Default limits (resource or operation: default limit, adjustable or not):
Number of pipelines: 100 (adjustable)
Number of objects per pipeline: 100 (adjustable)
Number of active instances per object: 5 (adjustable)
Number of fields per object: 50 (not adjustable)
Number of UTF8 bytes per field name or identifier: 256 (not adjustable)
Number of UTF8 bytes per field: 10,240 (not adjustable)
Number of UTF8 bytes per object: 15,360, including field names (not adjustable)
Rate of creation of an instance from an object: 1 per 5 minutes (not adjustable)
Retries of a pipeline activity: 5 per task (not adjustable)
Minimum delay between retry attempts: 2 minutes (not adjustable)
Minimum scheduling interval: 15 minutes (not adjustable)
Maximum number of roll-ups into a single object: 32 (not adjustable)
Maximum number of EC2 instances per Ec2Resource object:
The three capacity limits scale proportionally. For example, if you increase the throughput limit to 10MB/second, the other limits increase to 4,000 transactions/sec and 10,000 records/sec.
Shards per region:
US EAST, US WEST, EU: 50 All other supported regions: 25
Amazon Machine Learning is a service that makes it easy for developers of all skill levels to use machine learning technology. Amazon Machine Learning provides visualization tools and wizards that guide you through the process of creating machine learning (ML) models without having to learn complex ML algorithms and technology.
Not covered as exam topic currently
Default limits:
Data file size: 100 GB
Batch prediction input size: 1 TB
Batch prediction input (number of records): 100 million
Number of variables in a data file (schema): 1,000
Recipe complexity (number of processed output variables): 10,000
Transactions per second for each real-time prediction endpoint: 200
Total transactions per second for all real-time prediction endpoints: 10,000
Total RAM for all real-time prediction endpoints: 10 GB
Number of simultaneous jobs: 5
Longest run time for any job: 7 days
Number of classes for multiclass ML models: 100
ML model size: 2 GB
Data File Size Note:
The size of your data files is limited to ensure that jobs finish in a timely manner. Jobs that have been running for more than seven days will be automatically terminated, resulting in a FAILED status.
For additional information about Machine Learning Limits, see Limits in Amazon ML
Amazon QuickSight is a very fast, cloud-powered business intelligence (BI) service that makes it easy for all employees to build visualizations, perform ad-hoc analysis, and quickly get business insights from their data.
Allows for centralized control and shared access to your AWS Account and/or AWS services
By default when you create a user, they have NO permissions to do anything
Root account has full admin access upon account creation
Not region specific, can be shared between all regions
Granular permission sets for AWS resources
Includes Federation Integration, which taps into Active Directory, Facebook, LinkedIn, etc. for authentication
Multi-factor authentication support
Allows configuration of temporary access for users, devices and services
Set up and manage password policy and password rotation policy for IAM users
Integration with many different AWS services
Supports PCI DSS compliance
Access can be applied to:
Users - End users (people)
Groups - Collection of users under one set of permissions
Roles - Assigned to AWS resources, specifying what the resource (such as EC2) is allowed to access on another resource (S3)
Policies - Document that defines one or more permissions
Policies can be applied to users, groups and roles
You can assign up to 10 policies to a single group
Policy documents must have a version and a statement in the body; the statement must consist of an Effect (Allow or Deny), Actions (which actions to allow or deny, such as * for all actions), and Resources (the affected resources, such as * for all resources)
All resources can share the same policy document
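The policy document structure described above (a version plus statements made of Effect, Action, and Resource) can be illustrated by building one as a Python dictionary and serializing it to JSON. The bucket name is hypothetical, and `find_problems` is a made-up helper that only checks the parts named in these notes; real IAM policies support further elements that it does not know about, so it is no substitute for IAM's own validation.

```python
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",  # hypothetical bucket
        }
    ],
}

def find_problems(doc):
    """Very small structural check covering only Version/Effect/Action/Resource."""
    problems = []
    if "Version" not in doc:
        problems.append("missing Version")
    statements = doc.get("Statement", [])
    if not statements:
        problems.append("missing Statement")
    for i, stmt in enumerate(statements):
        if stmt.get("Effect") not in ("Allow", "Deny"):
            problems.append(f"statement {i}: Effect must be Allow or Deny")
        for key in ("Action", "Resource"):
            if key not in stmt:
                problems.append(f"statement {i}: missing {key}")
    return problems

print(json.dumps(policy, indent=2))
print(find_problems(policy))
```

Using "*" for Action or Resource would grant the effect for all actions or all resources, as noted above.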
There are 3 different types of roles:
Service Roles
Cross account access roles
Used when you have multiple AWS accounts and another AWS account must interact with the current AWS account
Identity provider access roles
Roles for Facebook or similar identity providers
In order for a new IAM user to be able to log into the console, the user must have a password set
By default, a new user's access is only accomplished through the use of the access key/secret access key
If the user's password is a generated password, it will also only be shown at the time of creation.
Customizable Console Sign-in link can be configured on the main IAM page (aws.yourdomain.com)
Customizable Console Sign-in links must be globally unique. If a sign in link name is already taken, you must choose an alternative
The root account is the email address that you used to register your account
It is recommended that the root account not be used for day-to-day login and that it be secured with Multi-factor Authentication (MFA)
Can create Access Keys/ Secret Access Keys to allow IAM users (or service accounts) to be used with AWS CLI or API calls
Access Key ID is equivalent to a user-name, Secret Access Key is equivalent to a password
When creating a user's credentials, you can only see/download the credentials at the time of creation not after.
Access Keys can be retired, and new ones can be created in the event that secret access keys are lost
To create a user password after the users have been created, choose the user you want to set the password for and, from the User Actions drop-down list, click Manage Password. Here you can opt to create a generated or custom password. If generated, there is an option to force the user to set a custom password on next login. As with access keys, a generated password is shown only once, at the time it is issued.
Click on Policies from the left side menu and choose the policies that you want to apply to your users. When you pick a policy that you want applied to a user, select the policy, and then from the top Policy Actions drop menu, choose attach and select the user that you want to assign the policy to
Default limits:
Groups per account: 100
Instance profiles: 100
Roles: 250
Server certificates: 20
Users: 5,000
Number of policies allowed to attach to a single group:
AWS Directory Service makes it easy to setup and run Microsoft Active Directory (AD) in the AWS cloud, or connect your AWS resources with an existing on-premises Microsoft Active Directory.
Amazon Inspector is an automated agent based security assessment service that helps improve the security and compliance of applications deployed on AWS.
Allows customers to install agents on EC2 instances and inspect the instance for security vulnerabilities
AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.
Allows customers to secure their cloud infrastructure
Not covered as exam topic currently
Default limits:
Web ACLs per account: 10
Rules per account: 50
Conditions per account: 50
For additional information about Web Application Firewall Service Limits, see Limits in Amazon WAF
The AWS CloudHSM service helps you meet corporate, contractual and regulatory compliance requirements for data security by using dedicated Hardware Security Module (HSM) appliances within the AWS cloud.
Allows customers to secure their cloud infrastructure
AWS Key Management Service (KMS) is a managed service that makes it easy for you to create and control the encryption keys used to encrypt your data, and uses Hardware Security Modules (HSMs) to protect the security of your keys.
Not covered as exam topic currently
Default limits:
Customer Master Keys (CMKs): 1,000
Aliases: 1,100
Grants per CMK: 2,500
Grants for a given principal per CMK: 30
Requests per second: varies by API operation
KMS Note:
All limits in the preceding table apply per region and per AWS account.
For additional information about Key Management Service Limits, see Limits in Amazon KMS
We recommend subscriptions if you are continuously processing new data. If you need historical data, we recommend exporting your data to Amazon S3. This limit can be changed only in special circumstances.
AWS Config is a fully managed service that provides you with an AWS resource inventory, configuration history, and configuration change notifications to enable security and governance.
Provides customer with configuration history, change notifications, and inventory
Can perform tasks such as ensuring that all EBS volumes are encrypted etc..
An on-line resource to help you reduce cost, increase performance, and improve security by optimizing your AWS environment, Trusted Advisor provides real time guidance to help you provision your resources following AWS best practices.
Automated service that scans the customer environment, offers advice on how to save money and lock down resources, and reports security vulnerabilities
Amazon AppStream enables you to stream your existing Windows applications from the cloud, reaching more users on more devices, without code modifications.
AWS version of XenApp
Stream Windows apps from the cloud
Not covered as exam topic currently
Default limits:
Concurrent streaming sessions per account: 5
Concurrent streaming application deployments using the interactive wizard: 2
streaming applications in the Building, Active, or Error states:
Amazon CloudSearch is a managed service in the AWS Cloud that makes it simple and cost-effective to set up, manage, and scale a search solution for your website or application.
Makes it simple to manage and scale search across your entire application
Amazon Elastic Transcoder is media transcoding in the cloud. It is designed to be a highly scalable, easy to use and a cost effective way for developers and businesses to convert (or “transcode”) media files from their source format into versions that will playback on devices like smart phones, tablets and PCs.
Media transcoder in the cloud
Convert media files from their original source format to different formats that will play on smart phones, tablets, PC's etc.
Provides transcoding presets for popular output formats, which means you don't need to know or guess which settings work best on which devices
Pay based on the minutes that you transcode and the resolution at which you transcode.
Resource or Operation
Default Limit
US-EAST (VA) , US-WEST(Oregon), EU (Ireland)
All Others
Pipelines per region:
4
User-defined presets:
50
Max no. of jobs processed simultaneously by each pipeline:
Amazon Simple Email Service (Amazon SES) is a cost-effective email service built on the reliable and scalable infrastructure that Amazon.com developed to serve its own customer base. With Amazon SES, you can send and receive email with no required minimum commitments – you pay as you go, and you only pay for what you use.
Not covered as exam topic currently
Default limits:
Daily sending quota: 200 messages per 24-hour period
Maximum send rate: 1 email per second
Recipient address verification: all recipient addresses must be verified
Maximum Send Rate:
The rate at which Amazon SES accepts your messages might be less than the maximum send rate.
For additional information about Simple E-Mail Service Limits, see Limits in Amazon SES
Web service that gives you access to a message queue that can be used to store messages while waiting for a computer to process them. SQS is a distributed queue system that enables applications to quickly and reliably queue messages that one component of the application generates to be consumed by another component. A queue is a temp repository for messages that are awaiting processing.
Used to allow customers the ability to decouple infrastructure components
Very first service AWS released. Even older than EC2
Messages can contain up to 256 KB of text in any format
Acts as a buffer between the component producing and saving data, and the component receiving and processing the data
Ensures delivery of each message at least once and supports multiple readers and writers interacting with the same queue
A single queue can be used simultaneously by many distributed application components, with no need for those components to coordinate or communicate with each other
Will always be available and deliver messages
Does not guarantee FIFO delivery of messages
Messages can be delivered multiple times and in any order
FIFO is not supported
If sequential processing is a requirement, sequencing information can be placed in each message so that message order can be preserved
SQS always asynchronously PULLs messages from the queue
Retention period of up to 14 days (4 days by default)
30 second visibility timeout by default (12 hour maximum)
If you find that the default visibility timeout period (30 seconds) is insufficient to fully process and delete the message, the visibility timeout can be extended using the ChangeMessageVisibility action
If the ChangeMessageVisibility action is specified to set an extended timeout period, SQS restarts the timeout period using the new value
Engineered to provide delivery of all messages at least once
Default short polling will return messages immediately if messages exist in the queue
Long polling is a way to retrieve messages from a queue as soon as they are available; long polling requests don't return a response until a message arrives in the queue or the long poll times out
Maximum long poll time out is 20 seconds
256 KB message sizes (originally 64 KB)
Billed in 64 KB chunks
First million messages free, then $0.50 per additional million thereafter
Single request can have from 1 to 10 messages, up to a max payload of 256KB
Each 64KB chunk of payload is billed as 1 request. If you send a single API request with a 256KB payload, you will be billed for 4 requests (256/64 KB chunks)
"Decouple" = SQS on exam
Auto-scaling supported
Message prioritization is not supported
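The 64 KB billing rule noted above can be expressed as a one-line calculation. The function name `billed_requests` is hypothetical; the chunk size and the 256 KB payload ceiling are the ones given in these notes.

```python
import math

CHUNK_KB = 64
MAX_PAYLOAD_KB = 256

def billed_requests(payload_kb):
    """Number of requests billed for one API call with the given payload size."""
    if not 0 < payload_kb <= MAX_PAYLOAD_KB:
        raise ValueError("payload must be greater than 0 and at most 256 KB")
    return math.ceil(payload_kb / CHUNK_KB)

print(billed_requests(256))  # 4
print(billed_requests(64))   # 1
print(billed_requests(65))   # 2
```

A payload that is only slightly over a chunk boundary is billed for the next whole chunk.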
Process:
Component 1 sends a message to the queue
Component 2 retrieves the message from the queue and starts the visibility timeout period
Visibility timer only starts when the message is picked up from the queue
Component 2 processes the message and then deletes it from the queue during the visibility timeout period
If the visibility timeout period expires before the message is deleted, the message stays in the queue and becomes visible to other consumers again
The process is only complete when the queue receives the command to delete the message from the queue
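The send, receive, and delete cycle above can be simulated with a small in-memory queue. `ToyQueue` is a hypothetical stand-in written for these notes, not the SQS API; time is passed in explicitly as `now` (in seconds) so the visibility timeout can be followed step by step.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._messages = {}   # message id -> [body, time it becomes visible again]
        self._next_id = 0

    def send(self, body):
        self._next_id += 1
        self._messages[self._next_id] = [body, 0]
        return self._next_id

    def receive(self, now):
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Picking the message up starts its visibility timer.
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)

queue = ToyQueue(visibility_timeout=30)
queue.send("resize image 17")

first = queue.receive(now=0)     # component 2 picks the message up
hidden = queue.receive(now=10)   # still invisible to other consumers
again = queue.receive(now=31)    # timeout expired without a delete: delivered again
queue.delete(again[0])           # processing finished, so the message is removed
done = queue.receive(now=100)
print(first, hidden, again, done)
```

The second delivery at `now=31` is the reason consumers have to tolerate receiving the same message more than once.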
Simple Workflow Service is a web service that makes it easy to coordinate work across distributed application components. Enabled for a range of uses such as media processing, web back ends, business process work-flows, and analytics pipelines, all to be designed as a coordination of tasks. Tasks represent invocations of various processing steps in an application which can be performed by code, API calls, human action and scripts.
Build, run and scale background jobs or tasks that have sequential steps
Way to process human oriented tasks using a framework
SQS has a retention period of 14 days, whereas SWF allows up to 1 year for workflow executions
Workflow retention is always shown in seconds (3.1536E+07 seconds, i.e. 365 days)
"Task could take a month" = SWF, as SQS only has a 14 day retention
Presents a task-oriented API, whereas SQS offers a message-oriented API
Ensures a task is assigned only once and is never duplicated; with SQS, duplicate messages are allowed and must be handled
Keeps track of all tasks and events in an application, SQS would need an implementation of a custom application-level tracking mechanism
A collection of work-flows is referred to as a domain
Domains isolate a set of types, executions, and task lists from others within the same account
You can register a domain by using the AWS console or using the RegisterDomain action in the SWF API
Domain parameters are specified in JSON format
SWF Actors:
Workflow starters - An application that can initiate a Workflow
Deciders - Control the flow or coordination of activity tasks, such as concurrency or scheduling, in a work-flow execution; if something has finished in a work-flow (or fails), a decider decides what to do next
Activity Workers - Programs that interact with SWF to get tasks, process received tasks, and return the results
Brokers the interactions between workers and the decider; Allows the decider to get consistent views into the progress of tasks and to initiate new tasks in an ongoing manner
Stores tasks, assigns them to workers when they are ready and monitors their progress
Ensures that a task is assigned only once and is never duplicated
Maintains the application state durably, so workers and deciders don't have to keep track of the execution state and can run independently, with the ability to scale quickly