Quota and Rate Limit Management for GitHub API

⚠ Important: Before implementing, consult with your legal team to ensure compliance with providers' Terms of Service. Lunar.dev provides a platform for building API middleware and tools to work within API provider boundaries, but it is the user's responsibility to enforce compliance correctly.

Use Cases

Handling Multiple Rate Limits – GitHub enforces different limits for authenticated, unauthenticated, and GraphQL API calls. A structured quota system ensures compliance with these varying limits.
Preventing Request Throttling – By shaping API traffic dynamically, this solution prevents GitHub from rejecting requests due to quota exhaustion.
Optimizing API Consumption – Properly managing quotas allows fair and predictable API usage across different consumers.
Mitigating Empty Responses – Some GitHub API requests return empty responses under high-load scenarios. Implementing retry logic for such cases improves reliability.

Solution Structure

Step-by-Step Breakdown

Quota Definition – Define API consumption quotas aligned with GitHub’s rate limits:
- REST API Rate Limits (e.g., 5,000 requests per hour per authenticated user).
- GraphQL API Limits (e.g., a points-based system in which each query's cost depends on its complexity).
- Secondary Rate Limits (additional limits GitHub applies to bursts of traffic, such as many concurrent requests or rapid content creation).
Rate Limiting Policy Enforcement – Implement real-time traffic control:
- Restrict request rates based on GitHub's tiered limits.
- Prioritize essential operations over non-critical requests.
Quota Segmentation – Apply different quotas for various API consumers (e.g., CI/CD jobs vs. interactive users).
Retry Mechanism for Empty Responses – Detect and retry GitHub API calls that return empty responses.
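
The quota segmentation step above can be sketched in a few lines of Python. This is a minimal illustration, not Lunar.dev's implementation: the `SegmentedQuota` class, the consumer names, and the 4,000/1,000 split of the hourly budget are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentedQuota:
    """Fixed-window quota split across named consumers (hypothetical helper)."""

    window_seconds: int
    allocations: dict[str, int]  # consumer name -> requests allowed per window
    _window_start: float = 0.0
    _used: dict[str, int] = field(default_factory=dict)

    def allow(self, consumer: str, now: float) -> bool:
        # Start a fresh window once the current one has elapsed.
        if now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._used = {}
        used = self._used.get(consumer, 0)
        if used >= self.allocations.get(consumer, 0):
            return False  # this consumer's share is spent; others are unaffected
        self._used[consumer] = used + 1
        return True


# Illustrative split of a 5,000 requests/hour budget; the ratio is an assumption.
quota = SegmentedQuota(
    window_seconds=3600,
    allocations={"ci_cd": 4000, "interactive": 1000},
)
```

Because each consumer draws only from its own allocation, a burst of CI/CD jobs cannot exhaust the share reserved for interactive users within the same window.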

How It Works Together

By enforcing quotas for each GitHub API limit, the middleware dynamically controls traffic to stay within GitHub’s restrictions. A rate-limiting policy ensures that different consumers don’t exceed their allocated quota. Additionally, retry logic mitigates GitHub’s occasional empty response issue, ensuring reliable API interactions.
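
One way a client or middleware can adjust traffic dynamically is to read the rate limit headers GitHub returns with REST responses (`x-ratelimit-remaining`, `x-ratelimit-reset`, and `retry-after`). The sketch below is an illustration under that assumption; the function name `seconds_until_allowed` is hypothetical and is not part of any Lunar.dev or GitHub library.

```python
def seconds_until_allowed(headers: dict[str, str], now: float) -> float:
    """Return how long to pause before the next request, based on response headers."""
    h = {k.lower(): v for k, v in headers.items()}
    # A retry-after header takes precedence; GitHub may send it when a
    # secondary rate limit is hit.
    if "retry-after" in h:
        return float(h["retry-after"])
    # If requests remain in the current window, no pause is needed.
    remaining = int(h.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # Otherwise wait until the window resets (UTC epoch seconds).
    reset_at = float(h.get("x-ratelimit-reset", now))
    return max(0.0, reset_at - now)
```

For example, a response with `x-ratelimit-remaining: 0` and a reset time 600 seconds in the future yields a pause of 600 seconds, while any response with remaining quota yields zero.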

Concrete Example

Problem:
A CI/CD pipeline heavily uses GitHub’s API for repository management and pull request operations. As multiple jobs run simultaneously, they quickly hit GitHub’s 5,000 requests/hour limit per authenticated user. Additionally, GraphQL queries consume points unpredictably based on query complexity, leading to sudden API throttling. During peak usage, GitHub sometimes returns empty responses, causing failures in automation scripts.

Solution:
Introduce a quota management system that tracks API usage against GitHub’s limits. Requests are rate-limited based on predefined thresholds, preventing overuse. A retry policy is applied to handle cases where GitHub responds with an empty payload.
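
The retry policy described above might look like the following in client code. This is a sketch under stated assumptions: `fetch_with_retry`, its parameters, and the injected `fetch` callable are hypothetical names for illustration, and "empty" is taken to mean a missing or whitespace-only body.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], bytes],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call fetch() until it returns a non-empty body or retries run out."""
    for attempt in range(max_retries + 1):
        body = fetch()
        if body and body.strip():
            return body  # non-empty payload: done
        if attempt < max_retries:
            # Exponential backoff: base_delay, then 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
    return None  # every attempt came back empty
```

Passing `sleep` as a parameter keeps the function easy to test and lets a caller substitute a scheduler-aware delay. Each retry also consumes quota, so the retry count should be counted against the same budget.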

Quota and Rate Limit Flow YAML Configuration

quotas:
  - name: github_api_quota
    provider: github
    limits:
      - type: hourly
        value: 5000  # GitHub API allows 5,000 requests per authenticated user per hour
      - type: graphql
        value: 5000  # GitHub's GraphQL API allows 5,000 points per hour per user
      - type: secondary
        value: dynamic  # Adjusted based on GitHub’s unpredictable secondary rate limits

endpoints:
  - url: "https://api.github.com/*"
    method: ANY
    remedies:
      - name: Rate Limiting
        enabled: true
        config:
          strategy: token-bucket  # Enforces rate limits smoothly over time
          max_requests_per_second: 10  # Ensures even distribution of API requests
          priority:
            - "high-priority-endpoints"  # Allocate more tokens to critical API calls
            - "low-priority-endpoints"

      - name: Retry
        enabled: true
        config:
          max_retries: 3
          retry_on_status_codes: [200]  # Targets empty responses, which arrive with status 200; pair with a body check so non-empty 200s are not retried
          backoff_strategy: exponential  # Increases delay between retries
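
For readers unfamiliar with the token-bucket strategy named in the configuration, here is a minimal Python sketch of the general technique. It is not how Lunar.dev implements the remedy; the `TokenBucket` class and its parameters are assumptions for illustration, with the rate of 10 tokens per second mirroring `max_requests_per_second: 10`.

```python
class TokenBucket:
    """Generic token bucket: tokens refill at a steady rate up to a fixed capacity."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.updated = now        # timestamp of the last refill

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True   # request may proceed
        return False      # request should wait or be queued


# Roughly 10 requests per second with bursts of up to 10.
bucket = TokenBucket(rate=10, capacity=10)
```

The `cost` parameter is one way to express priority: charging low-priority calls more tokens per request leaves more of the budget for critical ones.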

Final Notes

By structuring quotas and rate limits, API consumption remains predictable and efficient. The retry mechanism ensures stability when handling GitHub’s empty response issue. Developers can tune these configurations to match their API consumption needs while staying within GitHub’s policies.

About Lunar.dev:

Lunar.dev is your go-to solution for egress API controls and API consumption management at scale.
With Lunar.dev, engineering teams of any size gain instant, unified controls to manage, orchestrate, and scale API egress traffic across environments, all without the need for code changes.
Lunar.dev is agnostic to any API provider and enables full egress traffic observability and real-time controls for cost spikes or production issues, all through an egress proxy, an SDK installation, and a user-friendly UI management layer.
Lunar.dev offers solutions for quota management across environments, prioritizing API calls, centralizing API credentials management, and mitigating rate limit issues.
