Tutorial / Cram Notes
Amazon Web Services offers a variety of database services to meet different needs, such as high performance, ease of management, scalability, and full-text search capabilities. As a candidate preparing for the AWS Certified Solutions Architect – Professional (SAP-C02) exam, understanding these services and their use cases is essential. Below we discuss popular database services such as Amazon DynamoDB, Amazon OpenSearch Service (formerly known as Amazon Elasticsearch Service), Amazon RDS, and self-managed databases on Amazon EC2.
Amazon DynamoDB
DynamoDB is a fully managed NoSQL database service that supports key-value and document data structures. It is built for high performance, scalability, and low-latency access. DynamoDB automatically spreads the data and traffic for tables over a sufficient number of servers to handle throughput and storage requirements.
Features:
- Single-digit millisecond performance at any scale.
- Fully managed with automatic scaling of throughput and storage.
- Built-in high availability and fault tolerance.
- Supports ACID transactions for complex business workflows.
Use Cases:
DynamoDB is well-suited for scenarios where rapid, consistent performance is required, such as mobile backends, gaming, IoT, and many others.
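To make the key-value and document model, and the idea of spreading data across servers, more concrete, here is a minimal sketch in plain Python. It is not DynamoDB's actual partitioning algorithm: the hash function, the partition count, the `user_id` key attribute, and the helper names `partition_for`, `put_item`, and `get_item` are all assumptions chosen for illustration.

```python
import hashlib
from typing import Optional

NUM_PARTITIONS = 4  # hypothetical; DynamoDB manages partitions for you


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key to a partition index using a stable hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


# One dict per partition, standing in for separate storage servers.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def put_item(item: dict, key_name: str = "user_id") -> int:
    """Store an item in the partition chosen by its key; return the index."""
    index = partition_for(item[key_name])
    partitions[index][item[key_name]] = item
    return index


def get_item(key: str) -> Optional[dict]:
    """Look up an item by key; only one partition has to be consulted."""
    return partitions[partition_for(key)].get(key)


# Items share the key attribute but may otherwise differ (document model).
put_item({"user_id": "u-1", "name": "Ana", "scores": [10, 12]})
put_item({"user_id": "u-2", "device": {"os": "android", "version": 14}})
print(get_item("u-2"))
```

The point of the sketch is that a lookup by key touches a single partition, which is one reason key-based access can stay fast as a table grows.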
Amazon OpenSearch Service
Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) is a fully managed service that makes it easy to deploy, operate, and scale OpenSearch clusters; it also supports legacy Elasticsearch versions up to 7.10. It provides full-text search that applications call through HTTP/REST requests.
Features:
- Real-time distributed search and analytics.
- Seamless scaling as the size of the data and traffic to your application change.
- OpenSearch Dashboards (the successor to Kibana) is integrated into the service for visualization.
- Support for log analytics and real-time application monitoring.
Use Cases:
OpenSearch Service is often used for log analytics, real-time application monitoring, and full-text search features integrated into other applications.
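As a rough illustration of what a full-text search over HTTP/REST looks like, the sketch below builds (but does not send) a search request. The `_search` endpoint and the `match` query belong to the OpenSearch query DSL; the index name `app-logs`, the field `message`, and the helper `build_search_request` are hypothetical.

```python
import json


def build_search_request(index: str, field: str, text: str, size: int = 10) -> dict:
    """Describe a full-text search as an HTTP method, path, and JSON body."""
    body = {
        "size": size,                        # maximum number of hits to return
        "query": {"match": {field: text}},   # analyzed full-text match on one field
    }
    return {
        "method": "GET",
        "path": f"/{index}/_search",
        "body": json.dumps(body),
    }


request = build_search_request("app-logs", "message", "connection timeout")
print(request["method"], request["path"])
print(request["body"])
```

A real call would also need the domain endpoint and whatever authentication the domain's access policy requires; both are left out here.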
Amazon Relational Database Service (Amazon RDS)
Amazon RDS simplifies the setup, operation, and scaling of a relational database in the cloud. It provides efficient and resizable capacity while managing time-consuming database administration tasks.
Features:
- Support for several database engines (MySQL, MariaDB, PostgreSQL, Oracle, SQL Server, Db2, and Amazon Aurora).
- Automated backups, database snapshots, and automatic host replacement.
- Easy to deploy read replicas for improved read performance.
- Multi-AZ deployments for increased availability.
Use Cases:
RDS is ideal for applications that need a relational database and require complex transactions, such as enterprise applications, e-commerce websites, and mobile apps.
Self-Managed Databases on Amazon EC2
Self-managing databases on Amazon EC2 involves installing, configuring, and managing your database server on EC2 instances. This gives you complete control over the database environment and the underlying host.
Features:
- Full control over the database server and environment.
- Flexibility to use any database software that can run on an EC2 instance.
- Choice of instance type to match your workload requirements.
- Use of EC2 features such as Elastic IP addresses, security groups, and IAM roles.
Use Cases:
This is suitable for applications with unique requirements that are not met by AWS-managed services, or if you need to use a specific version or configuration of a database.
In preparing for the AWS Certified Solutions Architect – Professional exam, you will need to understand the core features, advantages, and potential use cases for each of these database services. AWS also emphasizes best practices and architectural principles, so you should be familiar with topics such as:
- Ensuring high availability and fault tolerance for your databases.
- Choosing the right database based on the needs of your application.
- Configuring read replicas, backups, and disaster recovery strategies.
- Scaling databases and estimating costs based on expected workloads.
By understanding these concepts and the specific features of AWS database services, you’ll be better prepared to design effective, scalable, and reliable solutions on the AWS platform.
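The selection guidance in these notes can be condensed into a rule-of-thumb function. This is a deliberate oversimplification for revision purposes, not an official decision tree: the three boolean inputs, their order of precedence, and the name `suggest_database` are assumptions, and real designs weigh many more factors such as cost, latency, and operational skills.

```python
def suggest_database(
    needs_sql_joins: bool,
    needs_full_text_search: bool,
    needs_os_level_control: bool,
) -> str:
    """Return a first-guess service based on three coarse requirements."""
    if needs_os_level_control:
        # Unsupported engines, custom plugins, or host access point to EC2.
        return "Self-managed database on Amazon EC2"
    if needs_full_text_search:
        return "Amazon OpenSearch Service"
    if needs_sql_joins:
        return "Amazon RDS"
    # Key-based access with low latency at scale.
    return "Amazon DynamoDB"


print(suggest_database(needs_sql_joins=True,
                       needs_full_text_search=False,
                       needs_os_level_control=False))
```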
Practice Test with Explanation
True/False: Amazon DynamoDB supports both document and key-value data models.
- True
- False
Answer: True
Amazon DynamoDB is a NoSQL database service that supports both document and key-value store models, making it a flexible option for various use cases.
Which of the following AWS services provides a managed relational database?
- Amazon DynamoDB
- Amazon RDS
- Amazon S3
- Amazon EC2
Answer: Amazon RDS
Amazon Relational Database Service (RDS) is a managed service that makes it easier to set up, operate, and scale a relational database in the cloud.
True/False: Amazon OpenSearch Service is primarily used for full-text indexing and search capabilities.
- True
- False
Answer: True
Amazon OpenSearch Service (formerly known as Amazon Elasticsearch Service) is a managed service that is designed to be used for search and analytics, including full-text search capabilities.
Which AWS service would you use to run a self-managed database on virtual machines?
- Amazon ECS
- Amazon RDS
- Amazon DynamoDB
- Amazon EC2
Answer: Amazon EC2
Amazon Elastic Compute Cloud (EC2) allows users to run virtual servers where they can manage and run their own database instances.
True/False: Amazon DynamoDB automatically replicates data across multiple AWS Availability Zones to ensure data availability and durability.
- True
- False
Answer: True
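DynamoDB automatically replicates table data across multiple Availability Zones within an AWS Region, which provides built-in availability and durability. Separately, it spreads the data and traffic for tables over a sufficient number of servers to handle throughput and storage requirements while maintaining consistent and fast performance.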
Amazon RDS supports which of the following database engines? (Choose ALL that apply)
- MySQL
- Oracle
- MongoDB
- SQL Server
- PostgreSQL
Answer: MySQL, Oracle, SQL Server, PostgreSQL
Amazon RDS supports several database engines including MySQL, Oracle, SQL Server, and PostgreSQL. MongoDB is not supported by Amazon RDS; it is typically run on Amazon EC2 or replaced with Amazon DocumentDB, which is MongoDB-compatible.
True/False: Amazon DynamoDB has a SQL-like query language.
- True
- False
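Answer: True
DynamoDB supports PartiQL, a SQL-compatible query language, for selecting, inserting, updating, and deleting items. Its native API is not SQL: operations such as GetItem, Query, and Scan use key conditions and filter expressions to retrieve data from tables. Neither interface provides relational joins.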
A feature of Amazon RDS that allows you to run database instances in multiple AWS regions for disaster recovery is called:
- RDS Multi-AZ
- RDS Multi-Region
- RDS Read Replicas
- Amazon Aurora Global Database
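Answer: RDS Read Replicas
RDS Multi-AZ provides high availability within a single Region. For disaster recovery across Regions, Amazon RDS offers cross-Region read replicas, which can be promoted to standalone instances if the source Region becomes unavailable. There is no RDS feature called RDS Multi-Region, and Amazon Aurora Global Database serves a similar purpose but applies only to Aurora clusters.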
True/False: Amazon EC2 instances running databases can be scaled automatically using Auto Scaling Groups.
- True
- False
Answer: True
It is technically possible to place EC2 instances running databases in Auto Scaling groups, so the statement is true. In practice, databases are stateful, so data replication and storage need careful handling before instances are added or removed automatically.
Which AWS service automatically handles the patching of the database software?
- Amazon EC2
- Amazon DynamoDB
- Amazon RDS
- Amazon S3
Answer: Amazon RDS
Amazon RDS is a managed service that handles administrative tasks such as database software patching for you, typically during the maintenance window you configure.
True/False: Amazon DynamoDB tables have a fixed schema once created.
- True
- False
Answer: False
Unlike traditional relational databases, DynamoDB tables do not require a fixed schema for all records. Each item in a table may have a unique set of attributes.
To ensure that sensitive query data is never logged in any form, you would enable which feature in Amazon RDS?
- Transparent Data Encryption
- Enhanced Monitoring
- Performance Insights
- RDS Database Log Export
Answer: None of the listed options
This is a trick question. Transparent Data Encryption encrypts data at rest and does not affect logging. Enhanced Monitoring and Performance Insights collect metrics and query statistics, and RDS Database Log Export publishes logs rather than suppressing them. None of the options ensures that sensitive query data is never logged; that depends on how the database engine's own logging parameters are configured.
Interview Questions
Can you explain the key differences between Amazon RDS and DynamoDB, and when you might choose one over the other?
Amazon RDS is a relational database service that supports various database engines like MySQL, PostgreSQL, Oracle, SQL Server, and MariaDB. It is designed for structured data models and supports SQL queries, transactions, and joins. Amazon RDS is suitable when dealing with complex queries, joins, and transactions.
On the other hand, DynamoDB is a NoSQL database service that provides fast and predictable performance with seamless scalability. It suits key-value and semi-structured document data, and use cases that require high throughput and low-latency access to data regardless of scale. It’s often chosen for web-scale applications, real-time workloads, and serverless applications.
What are the benefits of using Amazon RDS Multi-AZ deployments, and how does it work?
Amazon RDS Multi-AZ deployments provide high availability and failover support for DB instances. They work by automatically provisioning and maintaining a synchronous standby replica of the primary database in a different Availability Zone (AZ). The benefits of using RDS Multi-AZ include improved fault tolerance, automatic failover without administrative intervention in the case of an infrastructure failure, and minimized downtime during maintenance events or DB instance failure.
Describe Auto Scaling in the context of DynamoDB and how it can help manage workloads.
DynamoDB auto scaling automatically adjusts the provisioned read and write capacity units for your DynamoDB tables and global secondary indexes in response to changing traffic patterns. It scales capacity up or down toward a target utilization that you set, within minimum and maximum limits, so that applications maintain their performance while costs fall during periods of low usage.
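The arithmetic behind utilization-based scaling can be sketched in a few lines. This is a simplified model and not the service's actual scaling policy: the function name `desired_capacity`, the integer-percent target, and the sample numbers are assumptions for illustration.

```python
def desired_capacity(consumed_units: int, target_percent: int,
                     min_units: int, max_units: int) -> int:
    """Capacity that would bring utilization down (or up) to the target."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be between 1 and 100")
    # Integer ceiling of consumed_units / (target_percent / 100).
    wanted = (consumed_units * 100 + target_percent - 1) // target_percent
    # Never go below the configured minimum or above the maximum.
    return max(min_units, min(max_units, wanted))


# 700 consumed units at a 70% target suggests provisioning 1000 units.
print(desired_capacity(700, 70, min_units=5, max_units=4000))
```

Working in integer percentages avoids floating-point rounding surprises when the result should be a whole number of capacity units.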
What is the primary purpose of Amazon OpenSearch Service, and how does it integrate with other AWS services?
Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) is a managed service that makes it easy to deploy, operate, and scale OpenSearch, an open-source search and analytics suite. Its primary purpose is to offer powerful full-text search, data visualization, and real-time analytics capabilities. It integrates with services like Amazon Kinesis for real-time analytics, AWS Lambda for data processing, and Amazon S3 for storing large amounts of data.
How would you secure data in transit and at rest using Amazon RDS?
To secure data in transit, Amazon RDS supports SSL/TLS to encrypt connections between your application and your RDS instance. To secure data at rest, RDS provides an option to enable encryption using AWS Key Management Service (KMS) when you create the DB instance. When enabled, RDS encrypts the underlying storage, automated backups, read replicas, and snapshots.
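As a hedged illustration of the at-rest settings, the sketch below only builds a parameter dictionary and sends nothing to AWS. The keys follow the parameter names of the RDS CreateDBInstance API; the helper name `encrypted_instance_params`, the identifier, instance class, storage size, and key alias are placeholders, and other parameters a real request needs (such as credentials and networking) are omitted.

```python
def encrypted_instance_params(identifier: str, kms_key_id: str) -> dict:
    """Build (but do not send) parameters for an encrypted DB instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": "db.m5.large",  # placeholder instance class
        "AllocatedStorage": 100,            # GiB, placeholder size
        "StorageEncrypted": True,           # turn on encryption at rest
        "KmsKeyId": kms_key_id,             # KMS key used for the encryption
    }


params = encrypted_instance_params("orders-db", "alias/example-rds-key")
print(params["StorageEncrypted"], params["KmsKeyId"])
```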
When would you consider using a self-managed database on Amazon EC2 instead of a managed database service like RDS or DynamoDB?
You might consider using a self-managed database on Amazon EC2 when you need total control over the database configuration, the database version, the operating system, the extensions or plugins to install, or when you need to use a database engine that’s not supported by Amazon RDS. Additionally, for licensing or compliance reasons, you may decide to manage your database on EC2.
How can you migrate a self-hosted MySQL database to Amazon RDS, and what are the considerations you should take into account?
To migrate a self-hosted MySQL database to Amazon RDS, you can use tools like the AWS Database Migration Service (DMS), which supports live migration and continuous replication. Considerations include the size of the database, the permissible downtime, security requirements, compatibility with RDS MySQL versions, network bandwidth, and ensuring that the necessary access permissions are configured for the migration process.
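Database size, network bandwidth, and permissible downtime interact, and a quick back-of-the-envelope estimate helps when planning the initial load. The function below is only a rough sketch: the name `estimate_transfer_hours` and the default 80% link efficiency are assumptions, and it ignores compression, source database load, and ongoing change replication.

```python
def estimate_transfer_hours(size_gib: float, bandwidth_mbps: float,
                            efficiency: float = 0.8) -> float:
    """Estimate hours needed to copy size_gib over a link of bandwidth_mbps."""
    if size_gib < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid size, bandwidth, or efficiency")
    size_bits = size_gib * 1024**3 * 8                        # GiB -> bits
    usable_bits_per_second = bandwidth_mbps * 1_000_000 * efficiency
    return size_bits / usable_bits_per_second / 3600          # seconds -> hours


hours = estimate_transfer_hours(size_gib=500, bandwidth_mbps=1000)
print(f"{hours:.1f} hours")
```

With these sample inputs the estimate comes out at roughly an hour and a half, which can then be compared against the downtime the business will tolerate.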
What is Amazon RDS Read Replica, and how does it differ from Multi-AZ deployments?
An Amazon RDS Read Replica is a read-only copy of your primary database instance that is replicated asynchronously. It is intended to serve high-volume read workloads as a way to increase performance and scalability. In contrast, Multi-AZ deployments are designed for high availability and failover; the standby replica in a Multi-AZ deployment is replicated synchronously and is not accessed directly under normal operations.
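One common way applications use read replicas is read/write splitting. The sketch below shows the idea with a naive router; the endpoint host names, the round-robin choice, and the `endpoint_for` helper are hypothetical, and the SELECT check is intentionally crude (for example, it does not special-case `SELECT ... FOR UPDATE`).

```python
import itertools

PRIMARY = "mydb.example.internal"             # hypothetical writer endpoint
REPLICAS = [
    "mydb-ro-1.example.internal",             # hypothetical reader endpoints
    "mydb-ro-2.example.internal",
]

_replica_cycle = itertools.cycle(REPLICAS)


def endpoint_for(sql: str) -> str:
    """Send SELECT statements to a replica (round-robin), the rest to the primary."""
    is_read = sql.lstrip().lower().startswith("select")
    return next(_replica_cycle) if is_read else PRIMARY


print(endpoint_for("INSERT INTO orders VALUES (1)"))
print(endpoint_for("SELECT * FROM orders"))
```

Because replication to read replicas is asynchronous, reads that must reflect the most recent write would still need to go to the primary.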
Explain how Amazon DynamoDB Accelerator (DAX) can be beneficial for performance enhancement.
Amazon DynamoDB Accelerator (DAX) is an in-memory caching service that provides fast read performance for DynamoDB tables by delivering response times in microseconds. DAX is beneficial for applications requiring extremely low latency reads and allows you to reduce the read load on your tables, thus saving costs on provisioned throughput.
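The benefit of an in-memory cache in front of a table can be illustrated with a small read-through cache. This is a conceptual sketch, not the DAX client or its behavior in detail: the class `ReadThroughCache`, the TTL value, and the `load_from_table` stand-in for a table read are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve repeated reads from memory; fall back to the loader on a miss."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float = 300.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, dict]] = {}
        self.misses = 0

    def get(self, key: str) -> dict:
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                   # cache hit: no table read
        self.misses += 1
        value = self._loader(key)             # cache miss: read from the table
        self._entries[key] = (now + self._ttl, value)
        return value


table_reads = []  # records every simulated read against the table


def load_from_table(key: str) -> dict:
    """Stand-in for a table read; a real loader would call the database."""
    table_reads.append(key)
    return {"pk": key, "payload": "..."}


cache = ReadThroughCache(load_from_table)
cache.get("item-1")
cache.get("item-1")   # second read is served from memory
print(len(table_reads))
```

Every hit is a read the table never sees, which is where both the latency improvement and the reduced read capacity consumption come from.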
What are the key considerations when choosing provisioned IOPS storage for an Amazon RDS instance?
When choosing provisioned IOPS (input/output operations per second) storage for an RDS instance, key considerations include the database workload’s performance requirements, the expected IOPS, the consistency of the workload, and the cost implications. Provisioned IOPS is generally chosen for I/O-intensive, mission-critical database workloads that require low-latency and high-throughput IOPS performance.
How does Amazon RDS automate database maintenance, and what options are available to control the maintenance windows?
Amazon RDS automates database maintenance by applying patches, conducting backups, and performing other maintenance tasks. Users have control over the maintenance windows by specifying a preferred time period during which these tasks should take place, ensuring minimal disruption to the application’s operation. Users can also opt to receive notifications for upcoming maintenance and can sometimes defer specific maintenance events.
In the context of a database running on Amazon EC2, what strategies would you use to ensure the durability and reliability of the data?
For a database running on Amazon EC2, strategies to ensure durability and reliability include taking regular Amazon EBS snapshots (which are stored in Amazon S3), replicating the database across EC2 instances in multiple Availability Zones for high availability, employing RAID configurations for data redundancy, and using EBS volumes with provisioned IOPS for consistent and reliable performance. Additionally, automating failover, for example by remapping an Elastic IP address or using Amazon Route 53 health checks, can reduce downtime.