At 10:47 PM on a Friday, one of our support engineers dropped a message I never enjoy reading: “Uploads are timing out again, and a customer says an image became a downloadable PHP file.” Two different failures, one root problem. We had treated file upload like a single HTTP request instead of a security-critical pipeline.
That weekend, we rebuilt the flow in layers. The browser uploaded directly to object storage using short-lived signed URLs. PHP handled only intent, policy, and finalization. Every file stayed quarantined until async malware scanning finished. The UI showed processing instead of pretending success too early.
If you are modernizing upload flows this year, this pattern is a practical middle ground between “simple but risky” and “enterprise overkill.”
The shape of a safer upload pipeline
The architecture:
- Client asks your PHP API for an upload ticket (not the final file URL).
- PHP creates a short-lived presigned PUT URL for a tightly scoped object key.
- Client uploads directly to S3 (or equivalent object storage).
- Client calls a `/finalize-upload` endpoint with object key + declared metadata.
- Server runs validation checks, enqueues scanning, and marks status `pending_scan`.
- Worker scans with ClamAV (or managed AV), then marks the file `ready` or `rejected`.
This aligns with OWASP guidance on defense in depth for uploads, including extension allowlists, content validation, safe storage, and malware checks.
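To track that lifecycle, one option is a single uploads table. This is a sketch only; the table and column names are assumptions chosen to match the finalize code later in the article, so adapt them to your schema.

```sql
-- Hypothetical uploads table; names are illustrative, not prescriptive.
CREATE TABLE uploads (
    id            INTEGER PRIMARY KEY AUTO_INCREMENT,
    user_id       INTEGER      NOT NULL,
    object_key    VARCHAR(512) NOT NULL,
    status        VARCHAR(32)  NOT NULL DEFAULT 'pending_scan',
    detected_mime VARCHAR(255) NULL,
    created_at    TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
```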
Why direct-to-storage plus server-side finalization works
Direct upload removes the heavy byte stream from PHP-FPM workers. That alone improves stability under burst traffic. But the bigger win is control boundaries:
- PHP controls policy: allowed types, max size, tenant path, ownership.
- Storage handles transport: cheap, durable, parallelizable uploads.
- Workers handle expensive checks: malware scanning and deeper content validation.
The tradeoff: you add eventual consistency. A file may exist in storage before it is approved for application use. Accept that state explicitly in your product UX, and you avoid dangerous shortcuts.
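One way to keep that explicit state honest is to encode the legal transitions in code rather than scattering string comparisons around the app. A minimal sketch, with state names assumed to match the pipeline above:

```php
<?php
// Hypothetical state machine for upload lifecycle. State names are
// assumptions matching the pipeline described in this article.
const UPLOAD_TRANSITIONS = [
    'pending_upload' => ['pending_scan'],
    'pending_scan'   => ['ready', 'rejected'],
    'ready'          => [],
    'rejected'       => [],
];

// Returns true only if moving from $from to $to is a legal transition.
function canTransition(string $from, string $to): bool {
    return in_array($to, UPLOAD_TRANSITIONS[$from] ?? [], true);
}
```

Rejecting illegal transitions at this layer means a buggy worker cannot quietly flip a rejected file back to ready.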
Code block 1, issue a constrained presigned upload ticket in PHP
The key idea is to never let the client choose an arbitrary object key and never trust browser-provided content type as your only check.
```php
<?php

use Aws\S3\S3Client;
use Ramsey\Uuid\Uuid;

function createUploadTicket(int $userId, string $originalName, string $declaredMime, int $bytes): array
{
    // Allowlist of MIME types mapped to safe extensions. Deriving the
    // extension from the declared type, not the filename, stops a
    // "declared image" from landing as evil.php in storage.
    $extByMime = [
        'image/jpeg'      => 'jpg',
        'image/png'       => 'png',
        'application/pdf' => 'pdf',
    ];
    if (!isset($extByMime[$declaredMime])) {
        throw new RuntimeException('Unsupported file type.');
    }
    if ($bytes <= 0 || $bytes > 10 * 1024 * 1024) {
        throw new RuntimeException('Invalid file size.');
    }

    // Server-generated key under the quarantine prefix; the client never picks it.
    $key = sprintf('quarantine/u%d/%s.%s', $userId, Uuid::uuid7()->toString(), $extByMime[$declaredMime]);

    $s3 = new S3Client([
        'version' => 'latest',
        'region'  => getenv('AWS_REGION'),
    ]);
    $cmd = $s3->getCommand('PutObject', [
        'Bucket'      => getenv('UPLOAD_BUCKET'),
        'Key'         => $key,
        'ContentType' => $declaredMime,
        'Metadata'    => [
            'uploader-id'   => (string) $userId,
            'original-name' => $originalName,
            'upload-intent' => 'profile-doc',
        ],
    ]);
    $request = $s3->createPresignedRequest($cmd, '+10 minutes');

    return [
        'object_key'         => $key,
        'upload_url'         => (string) $request->getUri(),
        'expires_in_seconds' => 600,
    ];
}
```
Important details:
- Use a server-generated key under a quarantine prefix.
- Keep URL TTL short (for example 5-10 minutes).
- Scope IAM so this signer can only put objects in the upload prefix.
- Treat `ContentType` as a hint, not proof.
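As a sketch of that IAM scoping, the signing role can be limited to putting objects under the quarantine prefix only. Bucket name and prefix here are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::YOUR_UPLOAD_BUCKET/quarantine/*"
    }
  ]
}
```

With this in place, even a leaked presigned URL or compromised signer cannot overwrite already-approved objects outside the quarantine prefix.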
Code block 2, finalize endpoint with MIME sniffing and scan queue
After the browser upload succeeds, you still need a trusted server-side step. Here we read object bytes into a temp file, verify MIME via PHP fileinfo, and queue scanning.
```php
<?php

use Aws\S3\S3Client;

function finalizeUpload(PDO $db, S3Client $s3, int $userId, string $objectKey): array
{
    // Reject any key outside this user's quarantine prefix.
    if (!str_starts_with($objectKey, "quarantine/u{$userId}/")) {
        throw new RuntimeException('Invalid object key scope.');
    }

    $bucket = getenv('UPLOAD_BUCKET');
    $tmpPath = tempnam(sys_get_temp_dir(), 'up_');

    // Stream the object to a local temp file for validation.
    $s3->getObject([
        'Bucket' => $bucket,
        'Key'    => $objectKey,
        'SaveAs' => $tmpPath,
    ]);

    // Sniff the real MIME type from file content, not the declared header.
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    $actualMime = $finfo->file($tmpPath) ?: 'application/octet-stream';
    $allowed = ['image/jpeg', 'image/png', 'application/pdf'];
    if (!in_array($actualMime, $allowed, true)) {
        @unlink($tmpPath);
        throw new RuntimeException('File content type not allowed.');
    }

    $stmt = $db->prepare('INSERT INTO uploads (user_id, object_key, status, detected_mime) VALUES (?, ?, ?, ?)');
    $stmt->execute([$userId, $objectKey, 'pending_scan', $actualMime]);
    $uploadId = (int) $db->lastInsertId();

    // Note: passing tmp_path assumes the scan worker shares this filesystem.
    // If workers run on separate hosts, re-fetch the object from storage instead.
    enqueueScanJob([
        'upload_id' => $uploadId,
        'bucket'    => $bucket,
        'key'       => $objectKey,
        'tmp_path'  => $tmpPath,
    ]);

    return ['upload_id' => $uploadId, 'status' => 'pending_scan'];
}
```
At this point, the product should show “processing” and prevent downstream features from consuming this file until scan status becomes ready.
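A simple way to drive that "processing" UI is a polling endpoint over the uploads table. This is a hypothetical sketch; the table and column names are assumptions matching the INSERT above:

```php
<?php
// Hypothetical status endpoint: the UI polls until status leaves pending_scan.
function getUploadStatus(PDO $db, int $userId, int $uploadId): array
{
    $stmt = $db->prepare(
        'SELECT status, detected_mime FROM uploads WHERE id = ? AND user_id = ?'
    );
    $stmt->execute([$uploadId, $userId]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        throw new RuntimeException('Upload not found.');
    }

    // Only "ready" files may be consumed by downstream features.
    return [
        'status'     => $row['status'],
        'consumable' => $row['status'] === 'ready',
    ];
}
```

The `consumable` flag gives downstream code one boolean to check instead of re-implementing the status comparison everywhere.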
Code block 3, async scanner worker (ClamAV example)
ClamAV is a practical baseline in many stacks. Keep the scanner isolated, and do not expose clamd sockets to the internet.
```bash
#!/usr/bin/env bash
set -euo pipefail

UPLOAD_ID="$1"
TMP_PATH="$2"

# clamdscan exits non-zero on infection; capture output instead of aborting.
SCAN_OUTPUT=$(clamdscan --no-summary "$TMP_PATH" || true)

if echo "$SCAN_OUTPUT" | grep -q "OK$"; then
  php artisan upload:mark-ready "$UPLOAD_ID"
else
  php artisan upload:mark-rejected "$UPLOAD_ID" "malware_detected"
fi

rm -f "$TMP_PATH"
```
In production, you would usually move approved files from quarantine/ to public/ or private/ prefixes after a clean verdict, then attach immutable metadata in your DB.
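That promotion step can be a server-side copy plus delete. A sketch, assuming one bucket holds both prefixes; the function names and `private/` destination are assumptions, not a prescribed layout:

```php
<?php

use Aws\S3\S3Client;

// Map a quarantine key to its post-approval location. Pure function, easy to test.
function promotedKey(string $quarantineKey): string
{
    if (!str_starts_with($quarantineKey, 'quarantine/')) {
        throw new RuntimeException('Key is not in quarantine.');
    }
    return 'private/' . substr($quarantineKey, strlen('quarantine/'));
}

// Server-side copy to the approved prefix, then remove the quarantined original.
function promoteCleanFile(S3Client $s3, string $bucket, string $quarantineKey): string
{
    $destKey = promotedKey($quarantineKey);
    $s3->copyObject([
        'Bucket'     => $bucket,
        'Key'        => $destKey,
        'CopySource' => "{$bucket}/{$quarantineKey}",
    ]);
    $s3->deleteObject(['Bucket' => $bucket, 'Key' => $quarantineKey]);
    return $destKey;
}
```

Keeping the key mapping in its own function means the path policy can be unit tested without touching storage.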
Common failure modes and how to handle them
1) “Works in dev, fails in prod” because presigned PUT headers do not match
If the client sends a Content-Type header different from the one that was signed, object storage rejects the request. Keep the client's request headers deterministic and aligned with the presigned command.
2) Users can upload, but files never leave pending_scan
Usually queue worker lag, dead-letter growth, or ClamAV definition update problems. Add metrics for queue age and scan latency, and alert on backlog thresholds.
3) False positives block legitimate business documents
This happens. Build an audited manual-review flow instead of silently dropping files. Security and support teams both need visibility.
4) MIME checks pass, but parsing later still crashes
MIME detection is one layer only. If you process PDFs or images deeply, sandbox that processing step and enforce CPU/memory/time limits.
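One inexpensive version of that sandboxing is to run the parser as a child process under a wall-clock limit. This is a sketch assuming the coreutils `timeout` binary is available; for real isolation you would add memory and CPU caps via cgroups or ulimit, and ideally a separate container:

```php
<?php
// Hypothetical helper: run a parser command in a child process so a
// malicious PDF or image burns a killable child, not your PHP worker.
function runSandboxed(array $cmd, int $wallSeconds = 10): array
{
    // "timeout" enforces a hard wall-clock limit on the child process.
    $full = array_merge(['timeout', (string) $wallSeconds], $cmd);
    $proc = proc_open($full, [1 => ['pipe', 'w'], 2 => ['pipe', 'w']], $pipes);
    if (!is_resource($proc)) {
        throw new RuntimeException('Failed to start sandboxed process.');
    }
    $stdout = stream_get_contents($pipes[1]);
    $stderr = stream_get_contents($pipes[2]);
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exit = proc_close($proc);

    return ['exit' => $exit, 'stdout' => $stdout, 'stderr' => $stderr];
}
```

A non-zero exit (including 124, which `timeout` uses for expirations) should mark the file for manual review rather than crash the worker.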
Where this fits in your broader engineering stack
If this topic is part of a larger reliability push, these 7tech guides pair well:
- Node.js webhook idempotency patterns
- GitHub Actions with AWS OIDC hardening
- Zero-trust cloud API design
- Practical CSP rollout lessons on WordPress
FAQ
Do presigned URLs remove the need for server-side validation?
No. Presigned URLs solve transport and credentials exposure, not trust. You still need extension allowlists, server-side MIME checks, and malware scanning before a file becomes usable.
Should I scan before upload or after upload?
For most web apps, scan after upload in an async pipeline. Scanning inline on the request path hurts latency and reliability. Quarantine + async scanning keeps UX responsive while staying safe.
Is ClamAV enough by itself?
Usually no. It is one control, not the whole strategy. Pair it with strict upload policy, safe storage layout, access control, and downstream parser sandboxing.
Actionable takeaways for this week
- Move browser uploads to presigned URLs with a short TTL and server-generated object keys.
- Keep every new file in a quarantine prefix until async scanning returns a clean status.
- Add upload states in your DB (`pending_scan`, `ready`, `rejected`) and reflect them honestly in the UI.
- Track queue age, scanner latency, and reject reasons as first-class operational metrics.
- Document a manual-review path for false positives instead of bypassing controls in production.
Sources reviewed
- OWASP File Upload Cheat Sheet
- AWS S3: Uploading objects with presigned URLs
- PHP Manual: finfo_file
- ClamAV Documentation: Scanning
When upload incidents happen, teams often respond with one more regex or one more Nginx rule. The more durable fix is architectural: treat upload as a staged pipeline with explicit trust boundaries. Once we did that, timeout tickets dropped, and, more importantly, our security conversations became calmer and evidence-based.
