SP-API throttling is one of the most common causes of broken Amazon seller integrations. Your tool stops pulling order data. Inventory levels freeze. Reports don’t generate. And often there’s no clear error message — just a 429 Too Many Requests response that your code silently swallows.
Understanding Amazon’s rate limiting model and building integrations that respect it is the difference between a reliable data pipeline and one that breaks under load.
## How Amazon SP-API Rate Limiting Works
Amazon uses a token bucket algorithm for rate limiting. Each API operation has a bucket that:
- Holds a maximum number of tokens (the burst limit)
- Refills at a steady rate over time (the rate in requests per second)
Each API call consumes one token. When the bucket is empty, requests are throttled (HTTP 429). Tokens refill continuously, so waiting before retrying will usually unblock you.
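As a rough illustration of the token bucket math (plain arithmetic, not part of any SDK — the function names here are ours):

```python
def tokens_available(burst, rate, seconds_since_empty):
    """Tokens accumulated after refilling from an empty bucket."""
    return min(burst, rate * seconds_since_empty)

def seconds_until_next_token(rate, tokens):
    """Wait needed before one more request is allowed."""
    return 0.0 if tokens >= 1 else (1 - tokens) / rate

# getOrders: rate 0.0167 req/sec, burst 20
print(seconds_until_next_token(0.0167, 0))   # ~60 s after draining the bucket
print(tokens_available(20, 0.0167, 1200))    # full burst restored after ~20 min
```

Note the asymmetry: a drained getOrders bucket yields one token per minute, but refilling the whole 20-token burst takes about 20 minutes.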
Key rate limit parameters for common operations:
| API Operation | Rate (req/sec) | Burst |
|---|---|---|
| getOrders (Orders API) | 0.0167 (1/min) | 20 |
| getOrder (single order) | 0.5 | 30 |
| getOrderItems | 0.5 | 30 |
| getInventorySummaries | 2 | 2 |
| searchCatalogItems | 2 | 2 |
| getCatalogItem | 2 | 2 |
| getListingsItem | 5 | 10 |
| createReport | 0.0167 (1/min) | 15 |
| getReport | 2 | 15 |
| getReportDocument | 0.0167 (1/min) | 15 |
Official reference: SP-API rate limits documentation
## The Rate Limits That Surprise Most Developers
### Orders API: getOrders Is Only 1 Request Per Minute
This catches nearly every developer building order syncing for the first time. You see a burst limit of 20 and assume you can make 20 rapid requests. You can — but only once. After the burst is exhausted, getOrders allows just 1 call per minute (0.0167 req/sec).
If you try to poll getOrders every few seconds looking for new orders, you’ll be throttled almost immediately.
Fix: Use the Orders API notifications via Amazon EventBridge instead of polling, or limit your polling to once per minute maximum.
### Inventory: getInventorySummaries at 2 req/sec with Burst of 2
Checking inventory for 10,000 SKUs one at a time would take 83+ minutes at 2 req/sec — and that assumes each call covers a single SKU. In practice the summaries endpoint returns paginated results, so the call count is lower, but still significant.
Fix: Use the Reports API to download full inventory snapshots rather than querying the Inventory Summaries endpoint per product.
### Reports API: Highly Variable Processing Time
The Reports API pattern (create report → poll for status → download) is efficient on call volume but introduces latency. A report can take anywhere from 2 minutes to 45 minutes to process depending on date range and data volume.
Polling for report status too aggressively also consumes rate limit tokens. Use exponential backoff when polling.
## Identifying Throttling in Your Integration
### The HTTP 429 Response
A throttled request returns:
```json
{
  "errors": [
    {
      "code": "QuotaExceeded",
      "message": "You exceeded your quota for the requested resource.",
      "details": ""
    }
  ]
}
```
With response headers:
```http
HTTP/1.1 429 Too Many Requests
x-amzn-RequestId: abc123
x-amzn-RateLimit-Limit: 0.0167
Retry-After: 60
```
The x-amzn-RateLimit-Limit header tells you the current rate limit for that endpoint. The Retry-After header (when present) tells you how many seconds to wait before retrying.
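If you work with raw HTTP responses rather than a client library, a minimal throttle check might look like this (the function name is ours, and it matches the documented QuotaExceeded body above):

```python
import json

def is_throttled(status_code, body):
    """Return True when a raw SP-API response indicates throttling."""
    if status_code != 429:
        return False
    # A 429 body lists errors; throttling reports the code "QuotaExceeded"
    errors = json.loads(body or '{}').get('errors', [])
    return any(e.get('code') == 'QuotaExceeded' for e in errors)
```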
### Silent Throttling
The more insidious problem: some integrations catch all exceptions and continue without retrying, silently dropping data. If your order sync shows gaps during high-volume periods, check your application logs for 429 responses that were swallowed.
```python
# Bad — silently drops throttled requests
try:
    response = sp_api.orders.get_orders(CreatedAfter=start_date)
    process_orders(response.payload)
except Exception:
    pass  # This swallows 429 errors
```

```python
# Better — check specifically for throttling
# (SellingApiRequestThrottledException comes from sp_api.base)
try:
    response = sp_api.orders.get_orders(CreatedAfter=start_date)
    process_orders(response.payload)
except SellingApiRequestThrottledException:
    logger.warning("Throttled — will retry after backoff")
    raise  # Re-raise for the retry handler to catch
```
## Implementing Exponential Backoff
Every SP-API integration should implement exponential backoff with jitter for rate limit handling. This is the standard approach for API calls that might be throttled.
### Python Implementation
```python
import time
import random
import logging

from sp_api.api import Orders
from sp_api.base import Marketplaces, SellingApiException

logger = logging.getLogger(__name__)


def api_call_with_backoff(func, *args, max_retries=5, base_delay=1.0, **kwargs):
    """
    Execute an SP-API call with exponential backoff on rate limit errors.

    Args:
        func: The SP-API function to call
        max_retries: Maximum number of retry attempts
        base_delay: Initial delay in seconds (doubles each retry)
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except SellingApiException as e:
            if e.code == 429:  # Rate limited
                if attempt == max_retries:
                    logger.error(f"Max retries reached after {max_retries} attempts")
                    raise
                # Exponential backoff with random jitter
                delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                delay = min(delay, 300)  # Cap at 5 minutes
                logger.warning(
                    f"Rate limited (attempt {attempt + 1}/{max_retries}). "
                    f"Retrying in {delay:.1f}s"
                )
                time.sleep(delay)
            else:
                # Not a rate limit error — don't retry
                raise


# Usage
orders_api = Orders(credentials=credentials, marketplace=Marketplaces.UK)
response = api_call_with_backoff(
    orders_api.get_orders,
    CreatedAfter='2025-09-01T00:00:00Z',
    MarketplaceIds=['A1F83G8C2ARO7P'],
    OrderStatuses=['Unshipped', 'PartiallyShipped']
)
```
### Backoff Timing Reference
With base_delay=1.0 and exponential doubling plus jitter:
| Attempt | Wait Before Retry |
|---|---|
| 1st retry | ~1–2 seconds |
| 2nd retry | ~2–3 seconds |
| 3rd retry | ~4–5 seconds |
| 4th retry | ~8–9 seconds |
| 5th retry | ~16–17 seconds |
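These ranges follow directly from the delay formula in the helper above — doubling from `base_delay` plus up to one second of jitter:

```python
base_delay = 1.0
# (deterministic part, deterministic part + max jitter) for each retry
schedule = [(base_delay * 2 ** a, base_delay * 2 ** a + 1.0) for a in range(5)]
for i, (low, high) in enumerate(schedule, 1):
    print(f"retry {i}: {low:.0f}-{high:.0f} s")
```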
For getOrders, which refills at one token per minute, you may need longer waits on heavily throttled calls. Respect the Retry-After header when present:
```python
def get_retry_after(exception):
    """Extract Retry-After seconds from throttle exception headers."""
    if hasattr(exception, 'headers'):
        retry_after = exception.headers.get('Retry-After')
        if retry_after:
            return float(retry_after)
    return None


# In your backoff handler:
retry_after = get_retry_after(e)
if retry_after:
    delay = retry_after + random.uniform(0, 2)  # add jitter
else:
    delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
```
## Restructuring Integrations to Avoid Throttling
Backoff handles throttling when it happens. Architectural changes prevent it from happening in the first place.
### Strategy 1: Replace Polling with Notifications
Before (polling — high call volume):
```python
# Runs every 5 minutes — 288 calls/day just to check for new orders
def sync_orders():
    response = orders_api.get_orders(
        CreatedAfter=five_minutes_ago,
        MarketplaceIds=[marketplace]
    )
    for order in response.payload.get('Orders', []):
        process_order(order)
```
After (event-driven — near-zero polling):
```python
# Amazon sends an event when an order is created — you receive it immediately
# Set up via: Notifications API + Amazon EventBridge
@app.route('/webhook/order-created', methods=['POST'])
def handle_order_notification():
    notification = request.json
    order_id = notification['Payload']['OrderChangeNotification']['AmazonOrderId']
    # Fetch just this single order (uses getOrder, not getOrders — different rate limit)
    order = orders_api.get_order(order_id)
    process_order(order.payload)
    return '', 200
```
The notification-based approach uses getOrder (rate: 0.5 req/sec, burst: 30) rather than getOrders (rate: 0.0167 req/sec) — a 30× improvement in rate limit headroom.
### Strategy 2: Bulk Reports Instead of Item Queries
Before (individual inventory lookups — 10,000 calls for 10,000 SKUs):
```python
for sku in sku_list:
    inventory = inventory_api.get_inventory_summaries(
        marketplaceIds=[marketplace],
        sellerSkus=[sku]
    )
    update_local_inventory(sku, inventory)
```
After (one report = all inventory data):
```python
# Creates a single report with all inventory data
report_response = reports_api.create_report(
    reportType='GET_FBA_MYI_UNSUPPRESSED_INVENTORY_DATA',
    marketplaceIds=[marketplace]
)
report_id = report_response.payload['reportId']

# Poll with exponential backoff until done
report_data = wait_for_report(report_id)

# Process all 10,000 SKUs from one file download
process_inventory_report(report_data)
```
Rate limit cost: 3 API calls instead of 10,000+.
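The wait_for_report helper above is not part of any SDK. One possible sketch, with the status-fetching call injected so the polling logic stays testable (DONE, CANCELLED, and FATAL are the terminal processing statuses documented for the Reports API; downloading the document via getReportDocument is a separate step):

```python
import time

def wait_for_report(report_id, fetch_status, max_wait=3600, base_delay=30):
    """Poll a report's processing status with exponential backoff.

    fetch_status(report_id) should return the processingStatus string, e.g.
    lambda rid: reports_api.get_report(rid).payload['processingStatus']
    """
    delay = base_delay
    waited = 0
    while True:
        status = fetch_status(report_id)
        if status == 'DONE':
            return report_id
        if status in ('CANCELLED', 'FATAL'):
            raise RuntimeError(f"Report {report_id} ended with status {status}")
        if waited >= max_wait:
            raise TimeoutError(f"Report {report_id} not ready after {max_wait}s")
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 300)  # cap the poll interval at 5 minutes
```

Starting at 30 seconds and capping at 5 minutes keeps the poll count low even for reports that take 45 minutes to process.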
### Strategy 3: Request Queue with Rate Limiting
If you must make many individual API calls, use a queue with a built-in rate limiter:
```python
import time


class RateLimitedQueue:
    """Processes API calls respecting SP-API rate limits."""

    def __init__(self, rate_per_second=0.5, burst=30):
        self.rate = rate_per_second
        self.burst = burst
        self.tokens = burst
        self.last_refill = time.time()

    def refill_tokens(self):
        now = time.time()
        elapsed = now - self.last_refill
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def wait_for_token(self):
        self.refill_tokens()
        if self.tokens >= 1:
            self.tokens -= 1
            return
        # Wait until a token is available, then refill again so the
        # sleep period isn't double-counted on the next call
        wait_time = (1 - self.tokens) / self.rate
        time.sleep(wait_time)
        self.refill_tokens()
        self.tokens -= 1

    def execute(self, func, *args, **kwargs):
        self.wait_for_token()
        return api_call_with_backoff(func, *args, **kwargs)


# Usage
queue = RateLimitedQueue(rate_per_second=0.5, burst=30)
for order_id in order_ids:
    result = queue.execute(orders_api.get_order, order_id)
    process_order(result.payload)
```
## Monitoring Your Rate Limit Usage
Build logging that tracks your API consumption:
```python
import datetime
from collections import defaultdict


class ApiUsageMonitor:
    def __init__(self):
        self.call_log = defaultdict(list)

    def record_call(self, endpoint, status_code):
        self.call_log[endpoint].append({
            'timestamp': datetime.datetime.now(),
            'status': status_code
        })

    def get_throttle_rate(self, endpoint, window_minutes=60):
        cutoff = datetime.datetime.now() - datetime.timedelta(minutes=window_minutes)
        recent = [c for c in self.call_log[endpoint] if c['timestamp'] > cutoff]
        total = len(recent)
        throttled = sum(1 for c in recent if c['status'] == 429)
        return throttled / total if total > 0 else 0

    def report(self):
        for endpoint, calls in self.call_log.items():
            throttled = sum(1 for c in calls if c['status'] == 429)
            print(f"{endpoint}: {len(calls)} calls, {throttled} throttled "
                  f"({throttled / len(calls) * 100:.1f}%)")
```
A throttle rate above 5% on any endpoint indicates you need architectural changes — backoff alone won’t fix a fundamentally over-polling integration.
Rate limit issues are solvable but require changing how you think about data retrieval — from “fetch when you need it” to “design for the API’s capacity.” With notifications, bulk reports, and a proper retry strategy, you can build SP-API integrations that handle scale reliably.
We build SP-API integrations with rate limiting and monitoring built in from the start. Book a free consultation to discuss your Amazon data integration requirements.