#7957 npx wrangler deploy randomly fails on GitHub Runner
524 timeout on KV bulk uploads for very large Workers Sites (~210K assets). The retry logic added in wrangler 3.79.0 is insufficient at this scale. Internal tracking ticket WC-4097 exists. Related to #10459 and #2794. This is an API-side issue, not a bug in wrangler code.
Link to #10459 as potential duplicate; monitor WC-4097 progress; add api-limitation label
Analysis Report
Issue #7957: npx wrangler deploy randomly fails on GitHub Runner
Summary
| Field | Value |
|---|---|
| Issue | #7957 |
| Title | npx wrangler deploy randomly fails on GitHub Runner |
| Created | 2025-01-29 |
| Updated | 2025-10-30 |
| State | OPEN |
| Labels | bug, kv-asset-handler |
| Reporter Version | wrangler 3.106.0 |
| Current Version | wrangler 4.60.0 |
Problem Description
The reporter experiences intermittent failures (30-50% of the time) when deploying Workers Sites with a very large number of static assets (~210,427 assets) to KV from GitHub Actions runners. The error is a 524 timeout from the Cloudflare API during KV bulk upload operations.
Error Message:
PUT /accounts/***/storage/kv/namespaces/.../bulk -> 524
Received a malformed response from the API
<!DOCTYPE html>... (length = 7180)
The 524 error indicates a server-side timeout at Cloudflare's edge, meaning the KV API request took too long to complete.
Analysis
Retry Logic Already Present
PR #6801 added retry logic to wrangler deploy and wrangler versions upload in wrangler 3.79.0 (merged 2024-10-01). The reporter is using version 3.106.0, so the retry feature is already available but is apparently insufficient for this scale of uploads.
Related Issues
- #2794 (OPEN) - "Wrangler randomly throws 'Received a malformed response from the API' when publishing pages" - Same error pattern for Pages deployments
- #10459 (OPEN) - "Issue uploading large number of small, static assets" - Very similar issue with ~10,000 assets timing out. The reporter of #7957 (@Maxastuart) has also commented on #10459, confirming they still see a failure rate of about 50%.
Historical Improvements
Several PRs have attempted to improve upload reliability:
- #1195 (2022-06-13) - Batch sites uploads under 100MB
- #3098 (2023-04-28) - Improve Workers Sites asset sync reliability (limit in-flight requests, avoid OOM)
- #5813 (2024-05-14) - Add gateway failure retries for Pages uploads
- #6801 (2024-10-01) - Retry deployments for spotty network/service flakes
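The batching idea behind #1195 can be illustrated with a minimal sketch. The 100 MB byte budget comes from the PR title above. The `Asset` type, the `batchAssets` function, and the per-batch entry cap are hypothetical names and parameters chosen for illustration, and this is not wrangler's actual implementation.

```typescript
// Hypothetical shape for one asset to upload; not wrangler's internal type.
interface Asset {
  key: string;
  sizeBytes: number;
}

// Greedily pack assets into batches that stay under both a byte budget and
// an entry-count budget. A single asset larger than maxBytes still gets a
// batch of its own rather than being dropped.
function batchAssets(
  assets: Asset[],
  maxBytes: number,
  maxEntries: number
): Asset[][] {
  const batches: Asset[][] = [];
  let current: Asset[] = [];
  let currentBytes = 0;

  for (const asset of assets) {
    const overBytes = currentBytes + asset.sizeBytes > maxBytes;
    const overCount = current.length >= maxEntries;
    // Close the current batch before it would exceed either budget.
    if (current.length > 0 && (overBytes || overCount)) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(asset);
    currentBytes += asset.sizeBytes;
  }

  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```

Smaller batches mean more requests but less work per request, which is one plausible way to keep an individual bulk upload below a server-side timeout. Whether that helps in this case depends on where the API spends its time.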
Ongoing Investigation
Per comments on issue #10459:
- Cloudflare has created internal ticket WC-4097 to investigate
- API-side changes were made in September 2024 that helped some users
- The reporter has provided account ID and worker names to Cloudflare for investigation
- Multiple users continue to report issues with large asset uploads
Root Cause
This appears to be a Cloudflare API/infrastructure limitation when handling very large KV bulk uploads:
- 210,427 assets is an unusually large number for Workers Sites
- The 524 timeout occurs server-side, not in wrangler
- GitHub Actions runners' ephemeral network characteristics may exacerbate timing issues
- The issue is intermittent, suggesting rate limiting or resource contention on the API side
Recommendation: KEEP OPEN
Reason: This is a valid, actively investigated bug affecting real deployments.
- Not a wrangler-only fix - The 524 timeout is a server-side issue; wrangler already has retry logic
- Active Cloudflare investigation - Internal ticket WC-4097 exists
- Multiple affected users - Issue #10459 and #2794 show this affects others
- Not resolved - Reporter confirmed in comments (2025-02-04) the issue persists
Suggested Actions for Maintainers
- Consider linking this issue to #10459 as a potential duplicate (same root cause)
- Monitor WC-4097 progress
- Consider adding a waiting-on-cloudflare or api-limitation label
- May need to document recommended maximum asset counts for Workers Sites
Workarounds for Users
- Re-run failed deployments (usually succeeds on retry)
- Consider reducing asset count if possible
- Use Workers with static assets instead of Workers Sites (different upload mechanism)
- Implement CI retry logic at the workflow level
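The workflow-level retry workaround can be sketched as a small wrapper with exponential backoff. This is a sketch under assumptions, not an official recommendation. The names `isTransientFailure`, `backoffMs`, and `deployWithRetry`, the `run` callback, and the default delays are all hypothetical. The match strings are taken from the error message quoted in this report.

```typescript
// Heuristic: treat the 524 / malformed-response failure quoted in this
// report as transient and worth retrying.
function isTransientFailure(output: string): boolean {
  return (
    /->\s*524\b/.test(output) ||
    output.includes("Received a malformed response from the API")
  );
}

// Delay before the next try after failed attempt `attempt` (1-based):
// baseMs, 2 * baseMs, 4 * baseMs, ... capped at capMs.
function backoffMs(attempt: number, baseMs: number, capMs: number): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

// `run` is a hypothetical callback that performs one deploy attempt (for
// example by spawning the deploy command) and reports success plus output.
// Resolves with the number of the attempt that succeeded.
async function deployWithRetry(
  run: () => Promise<{ ok: boolean; output: string }>,
  maxAttempts: number,
  sleep: (ms: number) => Promise<void>,
  baseMs: number = 5000,
  capMs: number = 60000
): Promise<number> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await run();
    if (result.ok) {
      return attempt;
    }
    // Fail fast on errors that a retry is unlikely to fix.
    if (!isTransientFailure(result.output)) {
      throw new Error("deploy failed with a non-transient error");
    }
    if (attempt < maxAttempts) {
      await sleep(backoffMs(attempt, baseMs, capMs));
    }
  }
  throw new Error("deploy failed after " + maxAttempts + " attempts");
}
```

Retrying only on the 524 pattern keeps genuine configuration or authentication errors from being masked by repeated attempts. Passing `sleep` in as a parameter makes the wrapper easy to exercise without real delays.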