Amazon RDS, Redshift, DynamoDB, and SimpleDB are four Amazon database services widely used by AWS professionals. You will usually have to pick one of them to suit your workload, and this post is meant to make that decision easier. Developers rarely get to try every database engine side by side, so comparing them requires clear criteria and a good grasp of each engine's features.
In this post, we first introduce each of these database engines and then compare them feature by feature, so that you can select the engine that fits your needs and get on with development. Let us begin with a short introduction to all four.
Amazon Relational Database Service, or Amazon RDS, makes it easier to set up, operate, and scale a relational database in the cloud. Managing a running database involves a lot of repetitive work, which can become a bottleneck as your organization grows.
Amazon RDS offers six database engines: Amazon Aurora, Microsoft SQL Server, Oracle, MariaDB, PostgreSQL, and MySQL. Users can keep working with their existing tools and can manage the database without installing any additional hardware or software. By default, Amazon RDS patches the database software automatically and takes periodic backups. This is why it is considered cost-efficient, resizable, and time-saving.
When you buy a server package, the CPU, IOPS, and storage come bundled together. With RDS, they are separate components that can be scaled independently. The basic building block is the DB instance, an isolated database environment in the cloud. A DB instance can hold multiple user-created databases.
You can create and access these databases with the same client tools and applications you would use with a standalone database. Amazon RDS works with any standard SQL client application, but it does not permit direct host access. When you create a DB instance, RDS first creates a master user account for it. The master user has many permissions, including creating databases and running select, insert, update, and delete operations on tables. You also set a master password at creation time, which you need in order to access and update the database. You can change this password at any time using the AWS command-line tools, the Amazon RDS API, or the AWS Management Console. Standard SQL commands can also help you manage the database.
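As a rough illustration of changing the master password through the RDS API, the sketch below builds the parameters for a `ModifyDBInstance` request. The helper names and the instance identifier `my-db-instance` are hypothetical; the actual call assumes the third-party `boto3` SDK is installed and AWS credentials are configured, so it is kept in a separate function that is not run here.

```python
def build_password_change_request(instance_id, new_password, apply_immediately=True):
    """Build keyword arguments for an RDS ModifyDBInstance request.

    Hypothetical helper for illustration; it only assembles parameters
    and never talks to AWS.
    """
    if not instance_id or not new_password:
        raise ValueError("instance identifier and password are required")
    return {
        "DBInstanceIdentifier": instance_id,
        "MasterUserPassword": new_password,
        "ApplyImmediately": apply_immediately,
    }


def change_master_password(instance_id, new_password):
    """Send the request with boto3 (assumption: boto3 and credentials are available)."""
    import boto3  # imported lazily so the sketch loads without the SDK

    rds = boto3.client("rds")
    return rds.modify_db_instance(
        **build_password_change_request(instance_id, new_password)
    )


# Example parameters for a hypothetical instance; nothing is sent to AWS here.
request = build_password_change_request("my-db-instance", "a-new-strong-password")
```

Keeping the parameter building apart from the network call makes it possible to check the request without touching a real database.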
Amazon took the name Redshift from astronomy, where the term is associated with the Big Bang theory; the suggestion is that Redshift can expand to handle whatever amount of data your service requires. It was initially the fastest-growing AWS service, which reinforced that reputation and encouraged Amazon customers to adopt it. Adopting Redshift can also save the time you would otherwise spend swapping out an enterprise data warehouse (EDW).
Redshift, popularly known as an analytics database, is suited to maintaining large data volumes. It is fully managed and can easily run big, heavy queries against large datasets. A Redshift data warehouse is a collection of computing resources called nodes, which are organized into groups called clusters. Each cluster runs an Amazon Redshift engine and contains one or more databases.
Cluster nodes come in two types, DW1 and DW2. DW2 nodes are very fast because they use solid-state drives (SSDs), which bring large I/O benefits. DW1 nodes run on traditional storage disks and can scale up to a petabyte of storage; they are slower, but they cost less.
Amazon SimpleDB is best suited to small businesses and small workloads: a domain cannot exceed 10 GB of storage for storing and querying data. If your tables are likely to grow beyond that, SimpleDB will not be a good fit. In that case you can partition the data manually across domains, which may still be a workable option for small organizations.
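Manual partitioning across domains can be sketched as a simple hash-based routing function. This is only one possible scheme, not something SimpleDB provides; the function name, the `orders` prefix, and the choice of four domains are assumptions made for illustration.

```python
import hashlib


def domain_for_item(item_name, domain_prefix, num_domains):
    """Pick the domain that should hold an item, based on a hash of its name.

    Hypothetical routing helper: the same item name always maps to the
    same domain, so reads know where to look.
    """
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % num_domains
    return f"{domain_prefix}_{shard}"


# Route 1000 hypothetical order IDs across four domains.
assignments = [domain_for_item(f"order-{i}", "orders", 4) for i in range(1000)]
domains_used = sorted(set(assignments))
```

The trade-off is that any query spanning several items may have to be sent to every domain and merged in application code.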
Amazon SimpleDB is organized around domains, which are analogous to relational tables. A domain contains multiple items, and each item is a set of key-value pairs. It supports a simple select statement that anyone with basic SQL knowledge can use. SimpleDB does not support joins across domains: to combine data from multiple domains, you have to write custom code. If you need complex join operations, you are better off with another database.
Amazon designed DynamoDB specifically for the most demanding applications, which need reliable and scalable data storage. It targets apps that need more advanced data management than traditional hard disks can provide. It runs on solid-state storage to deliver low latency and consistently fast item updates. It can easily manage large data volumes while maintaining, and even improving, system performance.
Because DynamoDB usually serves larger enterprise databases, it may need additional tooling for effective data management. For this reason, AWS integrates DynamoDB with Elastic MapReduce (EMR), the AWS Hadoop service, and with Redshift. You can use EMR or Amazon Redshift for large-scale analytical queries, while DynamoDB handles the more targeted queries based on hash and hash-and-range keys. DynamoDB also spares you the overhead of managing partitioned domains for one very good reason: it has no size limit.
DynamoDB indexes data on the primary key and also allows secondary indexes. Access is not based on a single select statement; instead, it is based on hash and hash-and-range keys. The service provides Query and Scan operations. Scan reads every item in a table, which offers flexibility but can slow processing considerably, especially on large tables.
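The difference between Query and Scan can be illustrated with a small in-memory model. This is not DynamoDB's implementation or API; `TinyTable` and its methods are hypothetical names, and the sketch only shows why a lookup by hash and range key touches far fewer items than a scan.

```python
from bisect import bisect_left, bisect_right, insort


class TinyTable:
    """In-memory stand-in for a table with a hash key and a range key."""

    def __init__(self):
        # hash key -> {"keys": sorted range keys, "items": {range key: item}}
        self._partitions = {}

    def put(self, hash_key, range_key, item):
        part = self._partitions.setdefault(hash_key, {"keys": [], "items": {}})
        if range_key not in part["items"]:
            insort(part["keys"], range_key)  # keep range keys sorted
        part["items"][range_key] = item

    def query(self, hash_key, range_low, range_high):
        """Go straight to one partition, then binary-search its range keys."""
        part = self._partitions.get(hash_key)
        if part is None:
            return []
        lo = bisect_left(part["keys"], range_low)
        hi = bisect_right(part["keys"], range_high)
        return [part["items"][k] for k in part["keys"][lo:hi]]

    def scan(self, predicate):
        """Visit every item in every partition and filter afterwards."""
        return [
            item
            for part in self._partitions.values()
            for item in part["items"].values()
            if predicate(item)
        ]


table = TinyTable()
table.put("alice", 1, {"user": "alice", "ts": 1})
table.put("alice", 5, {"user": "alice", "ts": 5})
table.put("bob", 3, {"user": "bob", "ts": 3})

early_alice = table.query("alice", 0, 3)          # reads one partition only
recent = table.scan(lambda item: item["ts"] >= 3)  # reads all three items
```

In this model the query cost depends on the size of one partition, while the scan cost grows with the whole table.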
| Feature | Amazon RDS | Amazon Redshift | Amazon DynamoDB | Amazon SimpleDB |
|---|---|---|---|---|
| Database engine | Amazon Aurora, MySQL, MariaDB, Oracle Database, SQL Server, PostgreSQL | Redshift (adapted PostgreSQL) | NoSQL | NoSQL (with limited capacity) |
| Computing resources | Instances with 64 vCPU and 244 GB RAM | Nodes with vCPU and 244 GB RAM | Not specified, software as a service | Not specified, software as a service |
| Data storage facilities (max) | 6 TB per instance, 20,000 IOPS | 16 TB per instance | Unlimited storage size, 40,000 reads/writes per table | 10 GB per domain, 25 writes/sec |
| Maintenance windows | 30 minutes per week | 30 minutes per week | No effect | No effect |
| Multi-AZ replication | As an additional service | Manual | Built-in | Built-in |
| Tables (per basic structural unit) | Defined by the database engine | 9,900 | 256 | 250 |
| Main usage feature | Conventional database | Data warehouse | Database for dynamically modified data | Simple database for small records or auxiliary roles |
RDS is for those who want to run a relational database without taking on its administration and maintenance. AWS positions RDS as a fully functional alternative to databases running on conventional hardware. The available RDS engines are Amazon Aurora, MySQL, MariaDB, Oracle Database, Microsoft SQL Server, and PostgreSQL.
Several versions of each of these database engines are available.
Moreover, there is no hard restriction on these engines, although running other engines may require you to flush, lock, and stop all tables manually. Running these engines also takes computing resources. For example, to run a standard RDS deployment you need to provision:
For databases and logs, Amazon RDS provides three types of attached storage that differ in price and performance characteristics: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic.
For better availability, RDS offers Multi-AZ (multiple Availability Zone) deployment. With this feature, a replica of the full database, along with its settings, is kept at a separate, distant location in a different Availability Zone. The two instances run on independent infrastructure, so a single failure or disaster is unlikely to affect both data centers at the same time.
Amazon Redshift is designed to work with huge volumes of data, up to petabytes. It can be used from almost any SQL application with minimal changes. Technically, Redshift is a clustered database without consistency-enforcement features. A cluster consists of a number of nodes, which are virtual database servers powered by Amazon Elastic Compute Cloud (EC2) instances. There are two kinds of node: a leader node and compute nodes.
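The division of labor between the leader node and the compute nodes can be pictured with a toy aggregation. This is a simplified sketch, not Redshift's actual execution engine: the function names, the `customer` distribution key, and the sample rows are all made up for illustration.

```python
import zlib


def distribute_rows(rows, num_nodes, dist_key):
    """Assign each row to a compute node by hashing its distribution key."""
    slices = [[] for _ in range(num_nodes)]
    for row in rows:
        node = zlib.crc32(str(row[dist_key]).encode("utf-8")) % num_nodes
        slices[node].append(row)
    return slices


def compute_node_sum(rows, column):
    """Work done on one compute node: aggregate only its local rows."""
    return sum(row[column] for row in rows)


def leader_sum(slices, column):
    """Work done on the leader: collect partial results and merge them."""
    partials = [compute_node_sum(rows, column) for rows in slices]
    return sum(partials)


sales = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 25},
    {"customer": "c", "amount": 7},
    {"customer": "a", "amount": 8},
]
slices = distribute_rows(sales, 3, "customer")
total = leader_sum(slices, "amount")
```

Because rows with the same key always land on the same node, each node can aggregate independently and the leader only has to combine a handful of partial results.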
Users can rely on Redshift for huge data volumes, but it still comes with some limitations:
As with RDS, all Redshift infrastructure is maintained and repaired by AWS, so the user does not get root access. The main downside of Redshift is its maintenance window: the user has to plan for the database downtime, because it is not scheduled by default as it is in RDS. Other features such as auto-scaling, monitoring, and networking are well supported by Redshift. It is most often used for data warehousing, database analytics, customer activity monitoring, and big data processing.
This AWS NoSQL database service is used for fast processing of small data items that can grow and change dynamically. DynamoDB tables have no fixed structure: they can store values in key-value format or as plain text. DynamoDB places no hardware restrictions on its capabilities; the main capacity measure is the read/write throughput the database uses. DynamoDB places no restriction on storage either, so a table can grow as the database grows. Data availability is comparable to RDS, but here the data is replicated automatically across three Availability Zones within the selected region.
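Since throughput is the main capacity measure, it helps to see how it is estimated. The sketch below assumes DynamoDB's provisioned-capacity rules, where one read unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half) and one write unit covers one write per second of an item up to 1 KB. The function names and the sample workload are illustrative.

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Estimate read units: item size is rounded up to the next 4 KB step."""
    units_per_read = max(1, math.ceil(item_size_kb / 4))
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_kb, writes_per_second):
    """Estimate write units: item size is rounded up to the next 1 KB step."""
    return max(1, math.ceil(item_size_kb)) * writes_per_second


# Hypothetical workload: 6 KB items read 100 times/sec, 2.5 KB items written 50 times/sec.
strong_reads = read_capacity_units(6, 100)
eventual_reads = read_capacity_units(6, 100, strongly_consistent=False)
writes = write_capacity_units(2.5, 50)
```

Rounding up per item is why keeping items small matters: a 4.1 KB item costs as much to read as an 8 KB one.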
With DynamoDB, administrative tasks such as data replication and performance scaling are handled for you, which is also why it is extremely durable. It does not, however, support advanced querying functions or transactions. It has the following restrictions on storage capacity:
Some of the additional features are:
SimpleDB is another Amazon NoSQL database engine that technically resembles DynamoDB. Its basic structural unit is the domain, which corresponds to a table in a relational database. A domain can hold up to 10 GB, and you can deploy additional domains when you need more. The maximum query execution time is 5 seconds. SimpleDB and DynamoDB also differ in their capacities.
| Feature | Amazon DynamoDB | Amazon SimpleDB |
|---|---|---|
| Write capacity (per table) | 10,000-40,000 units | 25 writes/sec |
| Performance scaling method | Presettable throughput | Horizontal (no bursts available) |
| Attributes per table | Unlimited | 1 billion |
| Attributes per item | Unlimited | 256 |
| Items per table (with maximal size) | Unlimited | 3,906,250 |
| Tables per account | 256 | 250 |
| Maximum size of item | 400 KB | 1 KB |
| Data types supported | Number, String, Binary, Boolean, NULL values, collection data | String |
| Encoding of string data | UTF-8 | UTF-8 |
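The SimpleDB figure for items per table appears to follow from the two attribute limits in the same table: if every item carries the maximum number of attributes, the per-table attribute cap is reached after a fixed number of items. A quick check of that arithmetic, with constant names chosen here for illustration:

```python
# Limits taken from the SimpleDB column of the table above.
ATTRIBUTES_PER_TABLE = 1_000_000_000
ATTRIBUTES_PER_ITEM = 256

# Number of maximally sized items that fit before the attribute cap is hit.
max_full_items = ATTRIBUTES_PER_TABLE // ATTRIBUTES_PER_ITEM
print(f"{max_full_items:,}")
```

Items with fewer attributes would allow more items per domain, as long as the 10 GB storage limit is not reached first.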
A feature comparison of these two databases is shown in the following image, which makes the points clearer. SimpleDB is used for lightweight, easily managed databases.
All of these database engines are offered by Amazon, and the choice of platform depends on the level of flexibility you need and the computing resources available. A data warehouse, a business application, and an external index may each call for a different database engine, storage capacity, and performance level. You can also deploy a pre-configured database image on EC2, install whatever software you require, and get root access to the server directly.