Navigating traffic management for your app can be confusing. Do you need an API gateway or a load balancer? The answer is often both. However, this raises further questions: Where do they go? How many do you need? What kind? Let us help you make informed decisions for your application.
What are load balancers and API gateways?
A load balancer sits in front of multiple servers running the same application and distributes traffic among them, ensuring no single system becomes overloaded.
An API gateway sits in front of multiple backend systems and routes traffic to the appropriate service based on factors such as the URL, authentication status, and other request attributes.
Although these problems are different, the solutions often overlap. Sometimes an API gateway will do simple load balancing, and vice versa. If you choose to implement your own, you might even use the same library for both.
One term we won't cover is the Internet Gateway. In the context of a cloud provider, this connects a resource such as a Kubernetes cluster or a VM directly to the Internet. Internet gateways are less feature-rich than load balancers and API gateways, and they integrate less well with managed products such as containers and lambdas.
API Gateway
API gateways act as a single entry point for your application, managing and routing requests to different backend services. This is particularly beneficial in microservices architectures, where numerous services need to be exposed to clients. API gateways are also valuable for monolithic services, providing centralized control and additional features such as authentication, rate limiting, and analytics.
API gateways have access to your application’s logic and data, enabling them to handle authorization checks, subscriptions, rate limiting, and other business logic in a centralized location. This approach avoids spreading these responsibilities throughout your app or across multiple services, making your architecture cleaner and more maintainable.
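To make the idea concrete, here is a minimal sketch of a gateway's core loop: authenticate the caller, enforce a rate limit, then route by URL prefix. All names (the routes, tokens, and limit) are illustrative, not taken from any real gateway.

```python
# Toy API gateway: authenticate, rate-limit, then route by path prefix.
# Route table, tokens, and limits are hypothetical examples.
from collections import defaultdict

ROUTES = {"/users": "user-service", "/orders": "order-service"}
VALID_TOKENS = {"secret-token"}
RATE_LIMIT = 3  # max requests per client in this toy example

request_counts = defaultdict(int)

def handle(path, token, client_id):
    """Return the backend to forward to, or an error status."""
    if token not in VALID_TOKENS:
        return "401 Unauthorized"
    request_counts[client_id] += 1
    if request_counts[client_id] > RATE_LIMIT:
        return "429 Too Many Requests"
    for prefix, service in ROUTES.items():
        if path.startswith(prefix):
            return f"forward to {service}"
    return "404 Not Found"
```

A real gateway does the same checks with production-grade machinery (token validation against an identity provider, distributed rate-limit counters), but the control flow is essentially this.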
There are many different API gateway options available, ranging from DIY solutions using common libraries to fully managed services provided by cloud providers.
Load Balancer
At its core, a load balancer is a proxy that knows about multiple instances of an application or service and distributes traffic among them. This simple concept has evolved into various types of load balancers. Some are straightforward and can be implemented using familiar libraries, while others are highly complex, spanning the globe and requiring data centers on multiple continents, intricate networking, and worldwide CDNs.
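The simplest strategy is round-robin: hand each incoming request to the next backend in rotation. A sketch, with placeholder instance addresses:

```python
# Round-robin load balancing at its simplest: rotate requests across
# a fixed pool of backends. Addresses are placeholders.
from itertools import cycle

backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
next_backend = cycle(backends)

def route_request():
    """Pick the next backend in rotation; a real proxy would then
    forward the connection to it."""
    return next(next_backend)
```

Production load balancers layer more on top of this (weights, least-connections, health checks), but rotation through a pool is the starting point.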
Load balancers are usually categorized by the network layer at which they operate:
Understanding the network layers
When establishing a network connection, the process is divided into 7 layers, as defined by the OSI model. These layers range from the physical layer at the bottom to the application layer at the top.
• Layer 4 (Transport Layer): At this stage, the IP address and port are known, but the content of the communication remains opaque. This layer is focused on the delivery of packets across network connections.
• Layer 7 (Application Layer): This layer contains the entirety of the request, including the URL, headers, cookies, and other relevant data. It provides complete visibility into the content of the communication.
Choosing between these involves trade-offs. Layer 4 load balancers are much faster, but because they have no visibility into the content, they can only route based on IP and port, and they cannot terminate SSL/TLS. WebSockets once required a Layer 4 load balancer, but most modern Layer 7 load balancers handle them. Layer 7 load balancers have a richer feature set: they can route based on URL, cookies, headers, and more, and some even have a cache built in.
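The difference comes down to what each layer can "see." A sketch, with illustrative field names: a Layer 4 balancer keys only on connection details, while a Layer 7 balancer can inspect the parsed HTTP request.

```python
# What each layer "sees" when picking a backend.

def l4_pick(src_ip, src_port, pool):
    # Layer 4: only IP and port are visible. Hashing them is a common
    # strategy that also keeps a client pinned to the same backend.
    return pool[hash((src_ip, src_port)) % len(pool)]

def l7_pick(request, pools):
    # Layer 7: the full request is visible, so route on path or headers.
    if request["path"].startswith("/api/"):
        return pools["api"]
    if request["headers"].get("Upgrade") == "websocket":
        return pools["ws"]
    return pools["web"]
```

Everything `l7_pick` inspects (path prefixes, the `Upgrade` header) is simply unavailable to `l4_pick`, which is the entire trade-off in miniature.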
Integrating load balancers and API gateways in your architecture
Combining load balancers and API gateways can optimize your application’s performance and reliability. Here’s how to use them together effectively:
• External Load Balancers: Use these as the first point of contact for incoming traffic. They distribute traffic across multiple instances of your API gateway, ensuring high availability and fault tolerance.
• Internal Load Balancers: After a connection is already in your network, use these to distribute traffic among different instances of your backend services, ensuring they can handle the load and scale horizontally. In some cases internal load balancing can be handled by an API gateway.
• API Gateways: After the load balancer, the API gateway takes over to handle request routing, authentication, rate limiting, and other API management tasks.
Using external load balancers
An external load balancer receives connections from the outside internet. These can be global or regional, but almost all applications benefit from a global load balancer. Global load balancers provide a geographically distributed entry point and can provide a CDN/cache for your static assets. They also allow you to host your app in multiple regions to maintain low latency worldwide. Even if your app is in a single data center, a global cache will speed up your static assets.
Examples of global load balancers include AWS Global Accelerator, Google Cloud Global Load Balancer, and Azure Front Door. The latter two are Layer 7 load balancers and offer a rich feature set, including built-in caching, path and host-based routing, and more. AWS Global Accelerator, while primarily a Layer 4 load balancer, can be combined with other AWS products to achieve similar features.
Implementing a global load balancer yourself is not recommended, as it requires multiple data centers, complex networking with Anycast IPs, and very high uptime.
Using internal load balancers
In contrast to external load balancers, internal load balancers are used to manage traffic within a private network, typically for a single service. These load balancers can be standalone entities or built into an API gateway. Your cloud provider will offer options for internal load balancers, and you can also implement your own using tools like Nginx, Consul, and libraries for Node.js, Python, Java, and Go.
Cloud providers often handle load balancing automatically if you use managed containers, serverless functions (like AWS Lambda), or even VMs. This built-in management simplifies the process and ensures efficient traffic distribution within your internal network.
Using an API gateway
Whereas load balancers forward traffic with minimal intervention, API gateways offer a wide range of advanced functionalities. These include handling authorization, security, subscriptions, logging, correlation IDs/traces, and more. The major cloud providers each offer API gateway services, and numerous open-source libraries are available for building your own.
Many API gateways support configuration using OpenAPI, YAML, or custom formats. Open-source solutions such as KrakenD, Kong, and Tyk can be self-hosted and integrated into your Kubernetes pods.
API gateways are so feature-rich that nearly all support a plugin system. For example, with Kong, you can add plugins for JWT authentication, an AI Proxy for accessing your chosen AI service, caching, metrics collection, AWS Lambda integration, and create your own custom plugins to manage subscriptions. Additionally, many API gateways include built-in load-balancing features.
Decision Points
Consider your specific needs when deciding whether you require a load balancer, an API gateway, or both.
When to add a load balancer
Note that many cloud products, such as managed containers and lambdas, often come with built-in load balancing.
- High VM load: If your VM instances are running hot and errors are occurring due to CPU or memory shortages, add another instance and use an internal load balancer to route traffic appropriately.
- Global expansion: If you’re expanding globally and would benefit from multiple global entry points to your cloud provider, consider a global load balancer combined with a CDN.
- Traffic overload: If you already have an application (Layer 7) load balancer but are still struggling with traffic, add a second instance and place a network (Layer 4) load balancer in front.
- Service resilience and failover: If you need to ensure high availability and resilience, use a load balancer to automatically reroute traffic from failed instances to healthy ones.
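The failover case above can be sketched in a few lines: probe each instance and route only to healthy ones. Here `check` stands in for a real health probe (typically an HTTP GET to a health endpoint); the addresses are placeholders.

```python
# Failover sketch: skip unhealthy instances when picking a backend.

def pick_healthy(instances, check):
    """Return the first healthy instance, or None if every
    instance is down. `check` is a stand-in for a real health
    probe (e.g. an HTTP GET to a /health endpoint)."""
    for inst in instances:
        if check(inst):
            return inst
    return None
```

Real load balancers run these probes continuously in the background and remove failing instances from the pool, rather than checking on every request, but the routing decision is the same.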
When to add an API gateway
- Scattered authorization code: If your authorization logic is spread throughout your app, consider an API gateway with an authorization plugin to centralize it and prevent bugs.
- Proxying to services: If your main app is frequently proxying requests to Lambdas or other services, it’s effectively acting as an API gateway. Handle this properly with an actual API gateway.
- Rate limiting, throttling, and security: If rate limiting, throttling, and IP allowlisting are important, manage these with an API gateway.
- Multiple service requests: If your front end needs to hit multiple services, an API gateway can centralize this.
- Incomplete logging and metrics: If your logging and metrics are insufficient, an API gateway with a metrics plugin can provide comprehensive insights and tools for compliance.
- API versioning: If you’re developing a new version of your API, an API gateway can route between versions based on your chosen criteria.
- Service discovery: If you’re working in a microservices environment where services are frequently added or removed, an API gateway with built-in service discovery can help manage these changes dynamically.
- Centralized API management: If you need a centralized point to manage API documentation, versioning, and developer access, an API gateway can serve as an API management hub. Many API gateways accept OpenAPI.
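The API-versioning case above is a one-function routing decision at the gateway. A sketch, with a hypothetical version header and service names:

```python
# Version routing at the gateway: pick a backend from a version
# header, falling back to the stable version. Names are illustrative.
VERSIONS = {"v1": "billing-v1", "v2": "billing-v2"}

def route_by_version(headers, default="v1"):
    version = headers.get("X-API-Version", default)
    return VERSIONS.get(version, VERSIONS[default])
```

The same pattern works with other criteria from the list above, such as routing a percentage of traffic to the new version for a gradual rollout.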