Maybe you heard it in a punchline on the television show Silicon Valley, or maybe you heard it from a serious VC looking to invest in a company. Either way, you’ve probably heard the question: does it scale? All jokes aside, scalability is a key concept for not only Solutions Architects, but anyone in the DevOps world. So let’s talk a little bit about the concept of scalability in cloud computing.
What is scalability?
Scalability refers to the idea of a system in which every application or piece of infrastructure can be expanded to handle increased load.
For example, suppose your web application gets featured on a popular website like ProductHunt. Suddenly, thousands of visitors are using your app – can your infrastructure handle the traffic? Having a scalable web application ensures that it can scale up to handle the load and not crash. Crashing (or even just slow) pages leave your users unhappy and your app with a bad reputation.
Systems have four general areas that scalability can apply to:
- CPU
- Memory
- Disk I/O
- Network I/O
When talking about scalability in cloud computing, you will often hear about two main ways of scaling – horizontal or vertical. Let’s look deeper into these terms.
Vertical scaling is often thought of as the “easier” of the two methods. When scaling a system vertically, you add more power to an existing instance. This can mean more memory (RAM), faster storage such as Solid State Drives (SSDs), or more powerful processors (CPUs).
The reason this is thought to be the easier option is that hardware is often trivial to upgrade on cloud platforms like AWS, where servers are already virtualized. There is also very little (if any) additional configuration you are required to do at the software level.
Horizontal scaling is slightly more complex. When scaling your systems horizontally, you generally add more servers to spread the load across multiple machines.
This adds complexity to your system, however. Every server needs the usual administration tasks such as updates, security and monitoring, and you must now also keep your application, data and backups in sync across many instances.
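To make "spreading the load across multiple machines" concrete, here is a minimal sketch of round-robin distribution, one of the simplest strategies a load balancer can use. The class name and the server names are hypothetical, chosen for illustration; a production load balancer would also track server health and remove failed machines from the rotation.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the list forever: a, b, c, a, b, c, ...
        self._rotation = cycle(list(servers))

    def pick(self):
        # Return the server that should handle the next request.
        return next(self._rotation)


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
```

With three servers and six requests, each server receives two requests, so no single machine carries the whole load.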
So which is better?
Horizontal scaling is often considered the long-term solution, whereas vertical scaling is usually considered a short-term fix.
The reason for this is that you can typically keep adding servers to your infrastructure as you need them, but a single machine can only be upgraded so far. At some point, a larger instance is either unavailable or no longer cost-effective.
One of the primary reasons for scaling your system is to increase performance. Performance is only one aspect of scaling, though: scaling also ties in with other concepts such as elasticity and fault tolerance.
Performance of a system is measured by many different metrics, and one of the main ones is response time. Interestingly, scaling your system may increase response times, meaning individual requests get slower. Suppose you move from an architecture that keeps all of the components (database, application code, caching) on one server to one that puts each component on its own server. Calls that used to stay on one machine now cross the network, so network latency and similar overheads are added to every request. Let’s look at two popular system architecture types below.
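A rough back-of-the-envelope model shows the effect of network latency on response time. The sketch below assumes response time is simply processing time plus one network round trip per remote call; the function name and all of the numbers are illustrative assumptions, not measurements.

```python
def response_time_ms(processing_ms, round_trips, round_trip_ms):
    """Response time = time spent computing + time spent waiting on the network."""
    return processing_ms + round_trips * round_trip_ms


# Everything on one server: no network calls between components.
single_server = response_time_ms(processing_ms=40, round_trips=0, round_trip_ms=0)

# Database and cache on their own servers: one round trip to each per request.
split_servers = response_time_ms(processing_ms=40, round_trips=2, round_trip_ms=1.0)

print(single_server, split_servers)
```

In this model a single request gets slightly slower after the split, but each component can now be scaled on its own, which is usually the trade being made.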
A monolithic system architecture keeps many of your components in one place. For an application, this may mean that all of your services, such as the data layer, caching layer, file layer and business logic, are coupled together. For hardware and servers, it can mean that all of your processes, such as the database, web server and file system, run on one machine.
A microservices system architecture splits core services up into their own ecosystems. A key part of your application may be an image processing service that can save, delete, cache and manipulate images. This service could be set up on its own infrastructure, separate from the other application services. You’ll often hear the term separation of concerns when referring to microservices. Giving each core service its own infrastructure can make scalability easier, but it can also add a lot of complexity to your application. You’ll not only have to manage multiple servers but also change your application code to work with the separated services.
Scalability and databases
Each application is different, but the goal is to identify the services most likely to become a bottleneck and to be the first to buckle under increased load. One of the most common bottlenecks is the database.
The database is used to store data in an application. You may use a traditional relational database such as MySQL or a NoSQL database such as MongoDB. In simple terms, the database is used to write data (save it) and read it (view it). It is often one of the first components to fail under high load in an application environment.
Sharding a database for scalability means splitting your data across separate database servers. Instead of having all of your data on one database server, you split the data into “shards”. This can help with performance in a few ways:
- Data requests are spread across multiple servers instead of hitting the same database server each time
- Less data on each shard reduces index sizes, which can improve data seek time
- Less data on each shard means there are fewer rows, which can allow queries to run quicker since there is less data to traverse or calculate
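One common way to decide which shard holds a given record is to hash its key. The following is a minimal sketch, assuming a fixed number of shards and using in-memory dictionaries as stand-ins for real database servers; `shard_for`, `save_user` and `load_user` are hypothetical helper names.

```python
import hashlib

SHARD_COUNT = 4
# Each dict stands in for a separate database server.
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    # A stable hash is used so the same key always lands on the same shard,
    # even across processes (Python's built-in hash() is salted per process).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def save_user(user_id: str, record: dict) -> None:
    shards[shard_for(user_id, SHARD_COUNT)][user_id] = record


def load_user(user_id: str):
    # Only the one shard that owns this key is queried.
    return shards[shard_for(user_id, SHARD_COUNT)].get(user_id)


for name in ["alice", "bob", "carol", "dave"]:
    save_user(name, {"name": name})

print(load_user("alice"))
print([len(shard) for shard in shards])
```

One caveat of this simple modulo scheme is that changing `SHARD_COUNT` moves most keys to a different shard, so adding a shard later means migrating data. Techniques such as consistent hashing exist to reduce that cost.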
Database partitioning is similar to database sharding, but not exactly the same. Database partitioning separates the data into distinct parts. Common partitioning methods include:
- Splitting data by range (alphabetically or numerically)
- Row wise (horizontal partitioning)
- Column wise (vertical partitioning)
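The partitioning methods above can be sketched on a small in-memory table. This is only an illustration with made-up rows and hypothetical function names; a real database does this at the storage level. Note that splitting by range is itself one way of doing a row-wise split.

```python
from bisect import bisect_right


def partition_by_range(rows, key, boundaries):
    """Row-wise split: bucket each row by where row[key] falls among sorted boundaries."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        # bisect_right counts how many boundaries are <= the value.
        buckets[bisect_right(boundaries, row[key])].append(row)
    return buckets


def partition_by_columns(rows, key, column_groups):
    """Column-wise split: one narrower table per column group, each keeping the key."""
    return [
        [{key: row[key], **{col: row[col] for col in columns}} for row in rows]
        for columns in column_groups
    ]


# Made-up sample data.
users = [
    {"id": 1, "name": "Alice", "email": "alice@example.com", "bio": "Likes hiking"},
    {"id": 150, "name": "Bob", "email": "bob@example.com", "bio": "Plays chess"},
    {"id": 250, "name": "Carol", "email": "carol@example.com", "bio": "Writes Go"},
]

# Ids below 100, ids from 100 to 199, and ids of 200 and above.
by_range = partition_by_range(users, "id", [100, 200])

# Frequently read columns in one table, rarely read columns in another.
core, extra = partition_by_columns(users, "id", [["name", "email"], ["bio"]])

print([[row["id"] for row in bucket] for bucket in by_range])
print(core[0], extra[0])
```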
Application code database optimizations
You can also perform application-level database optimizations, such as:
- Using database indexes
- Table partitioning
- Caching database queries
- Running large queries/batch queries offline
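Two of these optimizations, indexes and query caching, can be tried out with Python's built-in `sqlite3` module. The sketch below uses an in-memory database with a made-up `orders` table; the table, index and function names are assumptions for illustration, and `functools.lru_cache` stands in for a real cache such as a dedicated caching server.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

# Index the column used in the WHERE clause so lookups avoid a full table scan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

QUERY = "SELECT SUM(total) FROM orders WHERE customer_id = ?"

# Ask SQLite how it plans to run the query; the last column describes each step.
plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (7,)).fetchall()
uses_index = any("idx_orders_customer" in row[-1] for row in plan)


@lru_cache(maxsize=256)
def order_total_for(customer_id):
    # Only runs on a cache miss; repeated calls with the same id skip the database.
    return conn.execute(QUERY, (customer_id,)).fetchone()[0]


first = order_total_for(7)   # cache miss: hits the database
second = order_total_for(7)  # cache hit: served from memory
print(uses_index, first, order_total_for.cache_info())
```

Keep in mind that this cache never expires on its own. If the underlying rows change, cached results go stale, so a real setup needs an invalidation or expiry strategy (here, `order_total_for.cache_clear()` would reset it).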
The main benefit of a scalable architecture is performance and the ability to handle bursts of traffic or heavy loads with little or no notice. A scalable system can help keep your application or online business running during peak times, so that you don’t lose money or damage your reputation. Organizing your system into services, as in a microservices architecture, can make monitoring, feature updates, debugging and scaling easier.
Scalability does have its caveats; it is certainly not a silver bullet. Creating a fully scalable system and infrastructure can be a large task that requires planning, testing and more testing. If you already have an application in place, splitting up that system can be a tedious process that may require code changes, software updates and more monitoring.
Scalability on AWS
Amazon Web Services as a platform has scalability built in. It offers many services that can help your application scale up or down depending on its resource requirements. One AWS product, Elastic Load Balancing, scales automatically with the traffic your application receives. It also integrates with Auto Scaling on your back end services (such as EC2 instances) to offer a full end-to-end scaling layer that can handle different levels of traffic.
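To give a feel for what scaling "up or down depending on the resource requirements" involves, here is a simplified sizing rule written in plain Python. It is not AWS's actual algorithm and makes no AWS API calls; it only assumes a proportional rule where the fleet is resized so that an average metric (for example CPU utilization) moves back toward a target value. The function name and all numbers are hypothetical.

```python
import math


def desired_instances(current, metric, target, minimum, maximum):
    """Pick a fleet size that would bring the average metric back toward the target."""
    # If each instance is at 90% and the target is 60%, we need 90/60 = 1.5x as many.
    wanted = math.ceil(current * metric / target)
    # Never go below the minimum or above the maximum fleet size.
    return max(minimum, min(maximum, wanted))


# Traffic spike: 4 instances averaging 90% CPU against a 60% target.
scale_out = desired_instances(current=4, metric=90, target=60, minimum=2, maximum=10)

# Quiet period: 6 instances averaging 20% CPU.
scale_in = desired_instances(current=6, metric=20, target=60, minimum=2, maximum=10)

# Extreme spike: the maximum caps the fleet size.
capped = desired_instances(current=4, metric=300, target=60, minimum=2, maximum=10)

print(scale_out, scale_in, capped)
```

The minimum and maximum bounds matter in practice: the minimum keeps enough capacity for fault tolerance, and the maximum protects you from an unexpectedly large bill.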
Companies that use the cloud don’t stay the same forever. They’re using the cloud to help them expand their business. Scalability is one of the core concepts that aspiring Solutions Architects need to understand in order to be as effective as possible.
That’s a wrap! In this post you’ve learned all about scalability, how it affects systems and applications, its benefits and caveats, optimizing your database for scalability and how scalability is used with Amazon Web Services.