Azure API Calls Queue for Performance and SLA Improvement

Queueing API calls is a smart strategy for prioritizing Azure API quota usage based on business logic, improving SLAs for your customers while optimizing usage within the provider's quota boundaries.

Lunar Flow | Queueing Requests with Azure API

Implementing an API call queue for Azure API can help you avoid hitting rate limits by delaying requests until slots are available, preventing Azure API "Too Many Requests" (HTTP 429) errors. It also allows you to prioritize critical API calls, such as production over staging, ensuring that key business processes run smoothly and efficiently. This approach not only optimizes API usage but also aligns with your business goals by ensuring that the most important requests are handled first.

  • [REQUEST] Queue processor - This processor allows you to define a consumption rate by configuring window size and quota, as well as priorities for different groups, ensuring that higher-priority requests are processed first. By using this plugin, requests will be held in the queue and processed sequentially based on the predefined capacity, or until the TTL is reached.
  • [RESPONSE] Send Response - If a request waits in the queue longer than the configured TTL, an error response is sent back to the consuming service. You can also customize the flow to instead pass the request to a different API, or handle it in any other way.

Configuring an API Call Queue for Azure API:

You can see a sample configuration below for setting up a queue policy for Azure API. For full details, please see the documentation for the queue policy.

strategy_based_queue:
  allowed_request_count: <allowed_request_count>
  window_size_in_seconds: <window_size_in_seconds>
  ttl_seconds: <ttl_seconds>
  queue_size: <queue_size>
  response_status_code: <response_status_code>
  prioritization:
    group_by:
      header_name: <header_name>
    groups:
      <group_name1>:
        priority: <priority1>
      <group_name2>:
        priority: <priority2>

  • allowed_request_count: Specifies the maximum number of requests allowed within each defined time window. Set this as a positive integer to control the rate of API calls.
  • window_size_in_seconds: Defines the duration of the time window in seconds during which the allowed requests can be made. This helps regulate the flow of requests.
  • ttl_seconds: Indicates the time-to-live for a request in the queue, measured in seconds. If a request stays in the queue longer than this time, it is discarded.
  • queue_size: Sets the maximum number of requests that can be held in the queue at any given time. This limits the total number of pending API calls.
  • response_status_code: Determines the HTTP status code returned to the client when a request is dropped due to exceeding the TTL. Commonly set to 429 for "Too Many Requests."
  • prioritization.group_by.header_name: Specifies the HTTP header name used to identify which priority group a request belongs to. This allows requests to be categorized based on the value of a specific header.
  • prioritization.groups.<group_name>: Defines the name of each priority group. This could be something like "production" or "staging."
  • prioritization.groups.*.priority: Assigns a priority level to each group. Lower numbers indicate higher priority, ensuring more critical requests are processed first.
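As an illustration, a filled-in configuration might look like the following sketch. All concrete values here (the 100-requests-per-second window, the `x-lunar-consumer-tag` header name, and the group names) are hypothetical examples chosen for this article, not defaults:

```yaml
strategy_based_queue:
  # Allow at most 100 requests per 1-second window
  allowed_request_count: 100
  window_size_in_seconds: 1
  # Drop a request that has waited in the queue longer than 10 seconds
  ttl_seconds: 10
  # Hold at most 500 pending requests
  queue_size: 500
  # Status code returned when a request is dropped
  response_status_code: 429
  prioritization:
    group_by:
      # Hypothetical header used to tag the calling environment
      header_name: x-lunar-consumer-tag
    groups:
      production:
        priority: 1   # lower number = higher priority
      staging:
        priority: 2
```

Under this example configuration, a request carrying `x-lunar-consumer-tag: production` would be dequeued ahead of one tagged `staging`, and any request queued for more than 10 seconds would be rejected with a 429 response.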

Why use API call queueing for Azure API:

Queue Requests to Azure API to Avoid Hitting Rate Limits

Implementing an API calls queue allows you to manage the rate of requests sent to Azure API, effectively preventing the risk of hitting rate limits. By delaying requests until there are available slots within the allowed request count, you can avoid receiving "Too Many Requests" (HTTP 429) responses. This ensures that your API interactions remain within the quotas set by the provider, allowing for continuous operation without unnecessary interruptions caused by rate limiting.

Queue Requests to Azure API to Increase Reliability

Queuing requests to Azure API enhances system reliability by providing resilience against sudden spikes in demand. By decoupling the arrival of tasks from their processing, the queue helps to manage traffic surges without overwhelming the system. Additionally, this approach isolates potential malfunctions in queue processing from affecting the intake of requests, ensuring that even if processing slows down, the system remains stable and responsive to incoming requests.

Queue Requests to Azure API for Cost Optimization

Using an API calls queue enables cost optimization by reducing the need to overprovision resources to handle peak loads. Since the processing of requests is decoupled from their intake, you can manage workloads more efficiently, processing them at a pace that matches your available resources rather than in real-time. This approach allows you to scale resources according to actual processing needs, avoiding the costs associated with maintaining excess capacity for rare traffic spikes.

Queue Requests to Azure API for Performance Efficiency

Queuing requests to Azure API enhances performance efficiency by allowing you to design throughput intentionally. Since the intake of requests is not directly tied to the rate at which they are processed, you can optimize processing to maximize performance. This decoupling enables better control over the flow of API traffic, ensuring that resources are used effectively and that the system can handle varying loads without sacrificing performance.

When to use an API queue for Azure API:

Implementing a queue for API calls is suitable in the following scenarios:

  • Prioritize Quota Usage: When an API provider quota must be shared among consumers or customers, so that the most important customers are served first.
  • Handling Peak Traffic: When a surge in API consumption is expected, deciding in advance which customers will be served first becomes crucial.
  • Services Prone to Overloading: Useful for applications that rely on services prone to overloading, providing protection against potential system failures.
  • Async API Calls: Ideal for situations where API calls need to be handled asynchronously, allowing efficient processing without the constraints of synchronous operations.

About Azure API:

The Azure API is a robust cloud-based platform offered by Microsoft that provides a comprehensive suite of tools and services to build, manage, and deploy applications at scale. It enables developers to leverage an extensive array of API solutions that integrate seamlessly with the Azure ecosystem, facilitating tasks such as data analytics, machine learning, and IoT management. Whether you're looking to create RESTful APIs, manage microservices, or iterate with continuous deployment, Azure API offers the flexibility and resilience required for modern application development. With features like API Management, developers can efficiently publish, secure, transform, maintain, and monitor APIs. This includes a built-in gateway for API traffic governance, and integration with Azure Active Directory for streamlined authentication and authorization processes. Additionally, extensive documentation and support make it accessible for developers of all levels, enabling them to harness cloud capabilities to innovate and optimize business processes.

About Lunar.dev:

Lunar.dev is your go-to solution for egress API controls and API consumption management at scale.

With Lunar.dev, engineering teams of any size gain instant, unified controls to effortlessly manage, orchestrate, and scale API egress traffic across environments, all without code changes.

Lunar.dev is agnostic to any API provider and enables full egress traffic observability and real-time controls for cost spikes or production issues, all through an egress proxy, an SDK installation, and a user-friendly UI management layer.

Lunar.dev offers solutions for quota management across environments, prioritizing API calls, centralizing API credential management, and mitigating rate limit issues.
