Why Bulk APIs?
Importing 10,000 records one at a time over a REST API is brutally slow. Each request has HTTP overhead, a TCP round trip, authentication checks, and a database write. At 50ms per request, 10,000 records take over 8 minutes.
Bulk APIs solve this by batching many operations into a single HTTP request:
# Without bulk: 10,000 requests
POST /api/users { "name": "Alice" } → 200
POST /api/users { "name": "Bob" } → 200
# ... 9,998 more requests
# With bulk: 1 request
POST /api/users/batch
[ {"name": "Alice"}, {"name": "Bob"}, ... 9,998 more ]
→ 207 Multi-Status
Beyond performance, bulk APIs enable atomic operations (all or nothing), reduce pressure on API gateway rate limits, and can simplify client retry logic: one request to retry instead of thousands.
Request Design
Array-of-Objects Payload
The simplest and most common pattern: send an array of resource objects:
POST /api/v1/users/batch
Content-Type: application/json
{
"operations": [
{ "name": "Alice Chen", "email": "[email protected]", "role": "admin" },
{ "name": "Bob Smith", "email": "[email protected]", "role": "viewer" },
{ "name": "Carol Doe", "email": "[email protected]", "role": "editor" }
]
}
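On the client side, the envelope above is easy to assemble with a small helper. A minimal sketch (the `operations` key matches this example's schema; adjust the key name to whatever your API documents):

```python
import json

def build_batch_payload(records, key="operations"):
    """Wrap a list of resource dicts in the batch envelope shown above."""
    return json.dumps({key: list(records)})

payload = build_batch_payload([
    {"name": "Alice Chen", "email": "[email protected]", "role": "admin"},
    {"name": "Bob Smith", "email": "[email protected]", "role": "viewer"},
])
```

The resulting string is the request body for a single POST to the batch endpoint.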
Mixed Operation Batches
Some APIs support mixed creates/updates/deletes in a single batch:
POST /api/batch
{
"operations": [
{ "method": "POST", "path": "/users", "body": {"name": "Alice"} },
{ "method": "PATCH", "path": "/users/42", "body": {"role": "admin"} },
{ "method": "DELETE", "path": "/users/99" }
]
}
This pattern (popularized by Microsoft Graph's $batch endpoint and Google's batch HTTP APIs) requires robust handling on both sides, since each operation can independently succeed or fail.
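Server-side, a mixed batch is essentially a mini-router. A framework-free sketch (the handler signature and result shape are illustrative, not any particular library's API):

```python
def execute_batch(operations, handlers):
    """Route each sub-operation to a handler keyed by HTTP method.

    handlers maps a method name to a callable(path, body) -> (status, result).
    Each operation succeeds or fails independently, as in Graph-style batching.
    """
    results = []
    for i, op in enumerate(operations):
        try:
            status, result = handlers[op["method"]](op["path"], op.get("body"))
            results.append({"index": i, "status": status,
                            "success": status < 400, "result": result})
        except Exception as e:
            # An unhandled error in one operation must not abort the batch
            results.append({"index": i, "status": 500,
                            "success": False, "error": str(e)})
    return results
```

Real implementations add per-operation auth checks and dependency ordering, but the shape stays the same: iterate, dispatch, record a per-item result.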
Payload Size Limits
Always document and enforce limits:
# views.py
from rest_framework.response import Response

MAX_BATCH_SIZE = 1000

def batch_create_users(request):
    operations = request.data.get('operations', [])
    if len(operations) > MAX_BATCH_SIZE:
        return Response(
            {'error': f'Batch size exceeds limit of {MAX_BATCH_SIZE}'},
            status=400,
        )
    # ...
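A server-side cap means large imports must be split before sending. A client-side sketch (the default of 1,000 mirrors MAX_BATCH_SIZE above):

```python
def chunk_operations(operations, size=1000):
    """Split a large operation list into server-acceptable batches."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [operations[i:i + size] for i in range(0, len(operations), size)]
```

Each chunk then becomes one POST to the batch endpoint.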
Response Design: 207 Multi-Status
207 Multi-Status (RFC 4918, originally WebDAV) is now widely used for batch APIs. It signals that the response body contains per-item status information:
HTTP/1.1 207 Multi-Status
Content-Type: application/json
{
"results": [
{
"index": 0,
"status": 201,
"id": "usr_abc123",
"success": true
},
{
"index": 1,
"status": 422,
"success": false,
"error": {
"code": "validation_error",
"field": "email",
"message": "Email '[email protected]' already exists"
}
},
{
"index": 2,
"status": 201,
"id": "usr_def456",
"success": true
}
],
"summary": {
"total": 3,
"succeeded": 2,
"failed": 1
}
}
Key design decisions:
- Include the original index so clients can correlate results to inputs
- Include a per-item status code (not just a boolean)
- Include a summary for quick overview without iterating all results
- Use consistent error objects matching your single-resource error format
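The index field is what makes retries practical: a client can correlate a 207 body back to its request array and re-submit only the failures. A minimal sketch:

```python
def collect_failures(operations, results):
    """Return (input, error) pairs for every failed item in a 207 response,
    correlated back to the request array via the per-item index field."""
    return [
        (operations[r["index"]], r.get("error"))
        for r in results
        if not r["success"]
    ]
```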
Transactional vs Best-Effort Semantics
This is the most important decision in bulk API design:
All-or-Nothing (Transactional)
If any item fails, the entire batch rolls back. Return 400 with error details:
from django.db import transaction
from rest_framework.response import Response

@transaction.atomic
def batch_create_transactional(request):
    operations = request.data['operations']
    # Validate everything up front so no rows are written for a bad batch
    serializers = [UserSerializer(data=op) for op in operations]
    errors = [
        {'index': i, 'errors': s.errors}
        for i, s in enumerate(serializers)
        if not s.is_valid()
    ]
    if errors:
        return Response({'errors': errors}, status=400)
    # Any database error below raises and rolls the whole batch back
    created = [
        {'index': i, 'id': s.save().id}
        for i, s in enumerate(serializers)
    ]
    return Response({'created': created}, status=201)
Use transactional semantics for: financial operations, related entity creation (order + line items), data imports where consistency is critical.
Best-Effort (Partial Success)
Process each item independently. Some succeed, some fail — return 207:
from django.db import IntegrityError, transaction
from rest_framework.response import Response

def batch_create_best_effort(request):
    results = []
    for i, op in enumerate(request.data['operations']):
        try:
            # Isolate each item so one failed INSERT can't poison an
            # enclosing transaction (e.g. with ATOMIC_REQUESTS enabled)
            with transaction.atomic():
                user = User.objects.create(**op)
            results.append({'index': i, 'status': 201, 'id': user.id,
                            'success': True})
        except (IntegrityError, ValueError) as e:
            # Catch expected per-item failures; let unexpected errors propagate
            results.append({'index': i, 'status': 422, 'success': False,
                            'error': str(e)})
    return Response({'results': results}, status=207)
Use best-effort for: bulk sends (email campaigns, notifications), analytics events, data sync where individual failures are acceptable.
Performance Considerations
Database Efficiency
Process batch items in bulk at the database level, not one-by-one:
# Slow: N individual INSERT statements, one round trip each
for op in operations:
    User.objects.create(**op)

# Fast: a single multi-row INSERT
users = [User(**op) for op in operations]
User.objects.bulk_create(users)  # pass ignore_conflicts=True to skip duplicates
bulk_create can insert thousands of rows in milliseconds rather than seconds. Be aware that it bypasses the model's save() method and skips pre_save/post_save signals, so any logic hooked there will not run.
Async Processing with 202
For very large batches (millions of records), accept the batch and process asynchronously:
POST /api/imports
→ 202 Accepted
{
"import_id": "imp_xyz789",
"status": "queued",
"status_url": "/api/imports/imp_xyz789",
"estimated_completion": "2026-02-26T14:30:00Z"
}
# Poll for status:
GET /api/imports/imp_xyz789
→ { "status": "completed", "succeeded": 9847, "failed": 153 }
Return 202 Accepted when the batch is queued but not yet processed. Provide a status_url for polling and consider webhook delivery on completion.
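A transport-agnostic polling sketch with exponential backoff (fetch_status stands in for whatever GETs the status_url and parses the body; it is not a real library call):

```python
import time

def poll_until_done(fetch_status, base_delay=1.0, max_attempts=10,
                    sleep=time.sleep):
    """Poll an import's status document until it reaches a terminal state.

    fetch_status: callable returning the parsed status body, e.g. the
    GET /api/imports/{id} response shown above.
    """
    for attempt in range(max_attempts):
        doc = fetch_status()
        if doc["status"] in ("completed", "failed"):
            return doc
        sleep(base_delay * (2 ** attempt))  # exponential backoff between polls
    raise TimeoutError("import did not reach a terminal state")
```

Injecting the sleep function keeps the helper testable; in production you would also cap the backoff and honor any Retry-After header the server sends.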