SP-API throttling is one of the most common causes of broken Amazon seller integrations. Your tool stops pulling order data. Inventory levels freeze. Reports don’t generate. And often there’s no clear error message — just a 429 Too Many Requests response that your code silently swallows.
Understanding Amazon’s rate limiting model and building integrations that respect it is the difference between a reliable data pipeline and one that breaks under load.
## How Amazon SP-API Rate Limiting Works
Amazon uses a token bucket algorithm for rate limiting. Each API operation has a bucket that:
- Holds a maximum number of tokens (the burst limit)
- Refills at a steady rate over time (the rate in requests per second)
Each API call consumes one token. When the bucket is empty, requests are throttled (HTTP 429). Tokens refill continuously, so waiting before retrying will usually unblock you.
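As a rough illustration of the token bucket math (plain arithmetic, not part of any SDK — the function names here are ours):

```python
def tokens_available(burst, rate, seconds_since_empty):
    """Tokens accumulated after refilling from an empty bucket."""
    return min(burst, rate * seconds_since_empty)

def seconds_until_next_token(rate, tokens):
    """Wait needed before one more request is allowed."""
    return 0.0 if tokens >= 1 else (1 - tokens) / rate

# getOrders: rate 0.0167 req/sec, burst 20
print(seconds_until_next_token(0.0167, 0))   # ~60 s after draining the bucket
print(tokens_available(20, 0.0167, 1200))    # full burst restored after ~20 min
```

Note the asymmetry: a drained getOrders bucket yields one token per minute, but refilling the whole 20-token burst takes about 20 minutes.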
Key rate limit parameters for common operations:
| API Operation | Rate (req/sec) | Burst |
|---|---|---|
| getOrders (Orders API) | 0.0167 (1/min) | 20 |
| getOrder (single order) | 0.5 | 30 |
| getOrderItems | 0.5 | 30 |
| getInventorySummaries | 2 | 2 |
| searchCatalogItems | 2 | 2 |
| getCatalogItem | 2 | 2 |
| getListingsItem | 5 | 10 |
| createReport | 0.0167 (1/min) | 15 |
| getReport | 2 | 15 |
| getReportDocument | 0.0167 (1/min) | 15 |
Official reference: SP-API rate limits documentation
## The Rate Limits That Surprise Most Developers
### Orders API: getOrders Is Only 1 Request Per Minute
This catches nearly every developer building order syncing for the first time. You see a burst limit of 20 and assume you can make 20 rapid requests. You can — but only once. After the burst is exhausted, getOrders allows just 1 call per minute (0.0167 req/sec).
If you try to poll getOrders every few seconds looking for new orders, you’ll be throttled almost immediately.
Fix: Use the Orders API notifications via Amazon EventBridge instead of polling, or limit your polling to once per minute maximum.
### Inventory: getInventorySummaries at 2 req/sec with Burst of 2
Checking inventory for 10,000 SKUs one at a time would take 83+ minutes at 2 req/sec — and that assumes each call covers a single SKU. In practice the summaries endpoint returns paginated results, so the call count is lower, but still significant.
Fix: Use the Reports API to download full inventory snapshots rather than querying the Inventory Summaries endpoint per product.
### Reports API: Highly Variable Processing Time
The Reports API pattern (create report → poll for status → download) is efficient on call volume but introduces latency. A report can take anywhere from 2 minutes to 45 minutes to process depending on date range and data volume.
Polling for report status too aggressively also consumes rate limit tokens. Use exponential backoff when polling.
## Identifying Throttling in Your Integration
### The HTTP 429 Response
A throttled request returns:
```json
{
  "errors": [
    {
      "code": "QuotaExceeded",
      "message": "You exceeded your quota for the requested resource.",
      "details": ""
    }
  ]
}
```
With response headers:
```http
HTTP/1.1 429 Too Many Requests
x-amzn-RequestId: abc123
x-amzn-RateLimit-Limit: 0.0167
Retry-After: 60
```
The x-amzn-RateLimit-Limit header tells you the current rate limit for that endpoint. The Retry-After header (when present) tells you how many seconds to wait before retrying.
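If you work with raw HTTP responses rather than a client library, a minimal throttle check might look like this (the function name is ours, and it matches the documented QuotaExceeded body above):

```python
import json

def is_throttled(status_code, body):
    """Return True when a raw SP-API response indicates throttling."""
    if status_code != 429:
        return False
    # A 429 body lists errors; throttling reports the code "QuotaExceeded"
    errors = json.loads(body or '{}').get('errors', [])
    return any(e.get('code') == 'QuotaExceeded' for e in errors)
```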
### Silent Throttling
The more insidious problem: some integrations catch all exceptions and continue without retrying, silently dropping data. If your order sync shows gaps during high-volume periods, check your application logs for 429 responses that were swallowed.
```python
# Bad — silently drops throttled requests
try:
    response = sp_api.orders.get_orders(CreatedAfter=start_date)
    process_orders(response.payload)
except Exception:
    pass  # This swallows 429 errors
```

```python
# Better — check specifically for throttling
# (SellingApiRequestThrottledException comes from sp_api.base)
try:
    response = sp_api.orders.get_orders(CreatedAfter=start_date)
    process_orders(response.payload)
except SellingApiRequestThrottledException:
    logger.warning("Throttled — will retry after backoff")
    raise  # Re-raise for the retry handler to catch
```
## Implementing Exponential Backoff
Every SP-API integration should implement exponential backoff with jitter for rate limit handling. This is the standard approach for API calls that might be throttled.
### Python Implementation
```python
import time
import random
import logging

from sp_api.api import Orders
from sp_api.base import Marketplaces, SellingApiException

logger = logging.getLogger(__name__)


def api_call_with_backoff(func, *args, max_retries=5, base_delay=1.0, **kwargs):
    """
    Execute an SP-API call with exponential backoff on rate limit errors.

    Args:
        func: The SP-API function to call
        max_retries: Maximum number of retry attempts
        base_delay: Initial delay in seconds (doubles each retry)
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except SellingApiException as e:
            if e.code == 429:  # Rate limited
                if attempt == max_retries:
                    logger.error(f"Max retries reached after {max_retries} attempts")
                    raise
                # Exponential backoff with random jitter
                delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                delay = min(delay, 300)  # Cap at 5 minutes
                logger.warning(
                    f"Rate limited (attempt {attempt + 1}/{max_retries}). "
                    f"Retrying in {delay:.1f}s"
                )
                time.sleep(delay)
            else:
                # Not a rate limit error — don't retry
                raise


# Usage
orders_api = Orders(credentials=credentials, marketplace=Marketplaces.UK)
response = api_call_with_backoff(
    orders_api.get_orders,
    CreatedAfter='2025-09-01T00:00:00Z',
    MarketplaceIds=['A1F83G8C2ARO7P'],
    OrderStatuses=['Unshipped', 'PartiallyShipped']
)
```
### Backoff Timing Reference
With base_delay=1.0 and exponential doubling plus jitter:
| Attempt | Wait Before Retry |
|---|---|
| 1st retry | ~1–2 seconds |
| 2nd retry | ~2–3 seconds |
| 3rd retry | ~4–5 seconds |
| 4th retry | ~8–9 seconds |
| 5th retry | ~16–17 seconds |
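These ranges follow directly from the delay formula in the helper above — doubling from `base_delay` plus up to one second of jitter:

```python
base_delay = 1.0
# (deterministic part, deterministic part + max jitter) for each retry
schedule = [(base_delay * 2 ** a, base_delay * 2 ** a + 1.0) for a in range(5)]
for i, (low, high) in enumerate(schedule, 1):
    print(f"retry {i}: {low:.0f}-{high:.0f} s")
```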
For getOrders, which refills at one token per minute, you may need longer waits on heavily throttled calls. Respect the Retry-After header when present:
```python
def get_retry_after(exception):
    """Extract Retry-After seconds from throttle exception headers."""
    if hasattr(exception, 'headers'):
        retry_after = exception.headers.get('Retry-After')
        if retry_after:
            return float(retry_after)
    return None


# In your backoff handler:
retry_after = get_retry_after(e)
if retry_after:
    delay = retry_after + random.uniform(0, 2)  # add jitter
else:
    delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
```
## Restructuring Integrations to Avoid Throttling
Backoff handles throttling when it happens. Architectural changes prevent it from happening in the first place.
### Strategy 1: Replace Polling with Notifications
Before (polling — high call volume):
```python
# Runs every 5 minutes — 288 calls/day just to check for new orders
def sync_orders():
    response = orders_api.get_orders(
        CreatedAfter=five_minutes_ago,
        MarketplaceIds=[marketplace]
    )
    for order in response.payload.get('Orders', []):
        process_order(order)
```
After (event-driven — near-zero polling):
```python
# Amazon sends an event when an order is created — you receive it immediately
# Set up via: Notifications API + Amazon EventBridge
@app.route('/webhook/order-created', methods=['POST'])
def handle_order_notification():
    notification = request.json
    order_id = notification['Payload']['OrderChangeNotification']['AmazonOrderId']
    # Fetch just this single order (uses getOrder, not getOrders — different rate limit)
    order = orders_api.get_order(order_id)
    process_order(order.payload)
    return '', 200
```
The notification-based approach uses getOrder (rate: 0.5 req/sec, burst: 30) rather than getOrders (rate: 0.0167 req/sec) — a 30× improvement in rate limit headroom.
### Strategy 2: Bulk Reports Instead of Item Queries
Before (individual inventory lookups — 10,000 calls for 10,000 SKUs):
```python
for sku in sku_list:
    inventory = inventory_api.get_inventory_summaries(
        marketplaceIds=[marketplace],
        sellerSkus=[sku]
    )
    update_local_inventory(sku, inventory)
```
After (one report = all inventory data):
```python
# Creates a single report with all inventory data
report_response = reports_api.create_report(
    reportType='GET_FBA_MYI_UNSUPPRESSED_INVENTORY_DATA',
    marketplaceIds=[marketplace]
)
report_id = report_response.payload['reportId']

# Poll with exponential backoff until done
report_data = wait_for_report(report_id)

# Process all 10,000 SKUs from one file download
process_inventory_report(report_data)
```
Rate limit cost: 3 API calls instead of 10,000+.
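The wait_for_report helper above is not part of any SDK. One possible sketch, with the status-fetching call injected so the polling logic stays testable (DONE, CANCELLED, and FATAL are the terminal processing statuses documented for the Reports API; downloading the document via getReportDocument is a separate step):

```python
import time

def wait_for_report(report_id, fetch_status, max_wait=3600, base_delay=30):
    """Poll a report's processing status with exponential backoff.

    fetch_status(report_id) should return the processingStatus string, e.g.
    lambda rid: reports_api.get_report(rid).payload['processingStatus']
    """
    delay = base_delay
    waited = 0
    while True:
        status = fetch_status(report_id)
        if status == 'DONE':
            return report_id
        if status in ('CANCELLED', 'FATAL'):
            raise RuntimeError(f"Report {report_id} ended with status {status}")
        if waited >= max_wait:
            raise TimeoutError(f"Report {report_id} not ready after {max_wait}s")
        time.sleep(delay)
        waited += delay
        delay = min(delay * 2, 300)  # cap the poll interval at 5 minutes
```

Starting at 30 seconds and capping at 5 minutes keeps the poll count low even for reports that take 45 minutes to process.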
### Strategy 3: Request Queue with Rate Limiting
If you must make many individual API calls, use a queue with a built-in rate limiter:
```python
import time


class RateLimitedQueue:
    """Processes API calls respecting SP-API rate limits."""

    def __init__(self, rate_per_second=0.5, burst=30):
        self.rate = rate_per_second
        self.burst = burst
        self.tokens = burst
        self.last_refill = time.time()

    def refill_tokens(self):
        now = time.time()
        elapsed = now - self.last_refill
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def wait_for_token(self):
        self.refill_tokens()
        if self.tokens >= 1:
            self.tokens -= 1
            return
        # Wait until a token is available, then refill again so the
        # sleep period isn't double-counted on the next call
        wait_time = (1 - self.tokens) / self.rate
        time.sleep(wait_time)
        self.refill_tokens()
        self.tokens -= 1

    def execute(self, func, *args, **kwargs):
        self.wait_for_token()
        return api_call_with_backoff(func, *args, **kwargs)


# Usage
queue = RateLimitedQueue(rate_per_second=0.5, burst=30)
for order_id in order_ids:
    result = queue.execute(orders_api.get_order, order_id)
    process_order(result.payload)
```
## Monitoring Your Rate Limit Usage
Build logging that tracks your API consumption:
```python
import datetime
from collections import defaultdict


class ApiUsageMonitor:
    def __init__(self):
        self.call_log = defaultdict(list)

    def record_call(self, endpoint, status_code):
        self.call_log[endpoint].append({
            'timestamp': datetime.datetime.now(),
            'status': status_code
        })

    def get_throttle_rate(self, endpoint, window_minutes=60):
        cutoff = datetime.datetime.now() - datetime.timedelta(minutes=window_minutes)
        recent = [c for c in self.call_log[endpoint] if c['timestamp'] > cutoff]
        total = len(recent)
        throttled = sum(1 for c in recent if c['status'] == 429)
        return throttled / total if total > 0 else 0

    def report(self):
        for endpoint, calls in self.call_log.items():
            throttled = sum(1 for c in calls if c['status'] == 429)
            print(f"{endpoint}: {len(calls)} calls, {throttled} throttled "
                  f"({throttled / len(calls) * 100:.1f}%)")
```
A throttle rate above 5% on any endpoint indicates you need architectural changes — backoff alone won’t fix a fundamentally over-polling integration.
Rate limit issues are solvable but require changing how you think about data retrieval — from “fetch when you need it” to “design for the API’s capacity.” With notifications, bulk reports, and a proper retry strategy, you can build SP-API integrations that handle scale reliably.
We build SP-API integrations with rate limiting and monitoring built in from the start. Book a free consultation to discuss your Amazon data integration requirements.