JustAppSec

Secure File Handling

Uploads, storage, and serving files without opening the door to attackers.

File uploads are one of the most dangerous features you can add to a web application. If handled poorly, they can lead to remote code execution, data exfiltration, denial of service, and stored XSS. This lesson covers how to upload, store, and serve files safely.

The risks

When a user uploads a file, they are sending arbitrary binary data to your server. The risks include:

  • Remote code execution — uploading a web shell (e.g., a .php or .jsp file) that the server then executes
  • Stored XSS — uploading an HTML or SVG file containing JavaScript that is served to other users
  • Path traversal — a filename like ../../etc/passwd that writes to unintended locations
  • Denial of service — uploading extremely large files or zip bombs that exhaust disk or memory
  • Malware distribution — using your server as a hosting platform for malicious files

Never trust file metadata

The filename, MIME type, and file extension are all user-controlled. An attacker can:

  • Send a file named photo.jpg that is actually an executable
  • Set the Content-Type header to image/jpeg for an HTML file
  • Use double extensions like malware.php.jpg

Do not rely on any of these for security decisions. Validate the actual file content.
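As a small illustration of why the metadata cannot be trusted, the sketch below uses a hypothetical upload whose name and declared type say JPEG while the body is HTML. The variable names and the payload are made up for the example; the only real check shown is the JPEG signature (the bytes FF D8 FF at the start of the file).

```python
import mimetypes

# Hypothetical upload: everything the client tells us says "JPEG".
claimed_filename = "photo.jpg"
claimed_content_type = "image/jpeg"
body = b"<html><script>alert(1)</script></html>"

# Guessing from the filename only looks at the extension, never at the bytes.
guessed_from_name, _ = mimetypes.guess_type(claimed_filename)

# JPEG data starts with the bytes FF D8 FF; this body does not.
looks_like_jpeg = body.startswith(b"\xff\xd8\xff")

print(guessed_from_name, claimed_content_type, looks_like_jpeg)
```

Both metadata-based answers agree with the attacker, and only the content check disagrees.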

Validation strategies

Check file content, not just the extension

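Use magic bytes (file signatures) to identify the actual file type. This is a stronger signal than the extension, but an attacker can still place a valid signature in front of a malicious payload, so treat it as one check among several: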

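import magic  # provided by the python-magic package

ALLOWED_MIME_TYPES = {"image/jpeg", "image/png", "image/gif"}

# Sniff the type from the first bytes, then rewind so later code
# (re-encoding, saving) reads the file from the start.
header = file.read(2048)
file.seek(0)
mime = magic.from_buffer(header, mime=True)
if mime not in ALLOWED_MIME_TYPES:
    raise ValueError("Invalid file type")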

Restrict allowed types to a strict allowlist

Only accept the specific file types your feature requires. If the feature is profile photos, accept JPEG, PNG, and WebP. Reject everything else.

Enforce file size limits

Set limits at multiple layers:

  • Web server level (e.g., Nginx client_max_body_size)
  • Application level (reject early on Content-Length, but also count the bytes actually read, since the header can be absent or wrong)
  • Storage level (per-user quotas)
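The application-level check can be sketched as a read loop that counts bytes as they arrive. This is a minimal sketch, assuming the upload is available as a binary stream; `MAX_UPLOAD_BYTES`, `UploadTooLarge`, and `read_limited` are hypothetical names, and the 5 MiB limit is only an example value.

```python
import io

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # hypothetical limit: 5 MiB


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured limit."""


def read_limited(stream, max_bytes=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Read a binary stream to the end, aborting once max_bytes is exceeded."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        # Enforce the limit on bytes actually received, not on a header value.
        if total > max_bytes:
            raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for the request body.
data = read_limited(io.BytesIO(b"hello"), max_bytes=10)
```

Reading in chunks means an oversized upload is rejected shortly after it crosses the limit, rather than after the whole body has been buffered.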

Re-encode images

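The safest approach for image uploads is to decode the image and re-encode it using a library like Pillow (Python), Sharp (Node.js), or ImageMagick. Re-encoding writes a new file from the decoded pixels, so malformed files fail to decode and data appended to or hidden in the original is typically not carried over. Metadata handling varies by library and settings (ImageMagick, for instance, keeps EXIF unless you pass -strip), so check that EXIF data, which can contain sensitive information such as GPS coordinates, is actually removed.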

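from PIL import Image
import io

# uploaded_file is assumed to be a seekable binary file object.
# First pass: check that the file parses as an image.
# verify() leaves the Image object unusable, so it must be reopened.
img = Image.open(uploaded_file)
img.verify()

# Second pass: rewind, decode, and re-encode into a fresh buffer
uploaded_file.seek(0)
img = Image.open(uploaded_file)
output = io.BytesIO()
img.save(output, format="PNG")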

Sanitise filenames

Never use the user-provided filename directly. Generate a new filename:

import uuid

ext = ".png"  # Determined from validated content type, not from user input
safe_filename = f"{uuid.uuid4()}{ext}"

This prevents path traversal, special character issues, and filename collisions.
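One way to make sure the extension really comes from the validated type is to look it up in a server-side table. The sketch below assumes the MIME type has already been validated from the file content; `EXTENSION_FOR_MIME` and `generate_storage_name` are hypothetical names.

```python
import uuid

# The server chooses the extension from the validated type.
# Nothing from the client's filename is reused.
EXTENSION_FOR_MIME = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/webp": ".webp",
}


def generate_storage_name(validated_mime):
    """Return a random filename for a type on the allowlist."""
    try:
        ext = EXTENSION_FOR_MIME[validated_mime]
    except KeyError:
        raise ValueError(f"unsupported type: {validated_mime}") from None
    return f"{uuid.uuid4()}{ext}"
```

If the original filename is needed for display, it can be stored as plain data (for example in a database column) and never used as a path.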

Storage

Store outside the web root

If uploaded files are stored within the web server's document root, the server may execute them. For example, if Apache or Nginx is configured to pass .php requests to a PHP handler, an uploaded .php file in the web root runs as code when someone requests its URL.

Store uploaded files in a location that the web server does not serve directly. Use a separate endpoint to serve files, with explicit content-type headers.
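As a defence-in-depth measure, the code that writes the file can also confirm that the final path stays inside the upload directory. This is a minimal sketch: `UPLOAD_ROOT` is a hypothetical directory assumed to sit outside anything the web server serves, and `storage_path` is a hypothetical helper. It needs Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical directory that the web server is not configured to serve.
UPLOAD_ROOT = Path("/var/app-data/uploads")


def storage_path(safe_filename, root=UPLOAD_ROOT):
    """Resolve the destination and refuse anything that escapes the root."""
    root = Path(root).resolve()
    candidate = (root / safe_filename).resolve()
    # Reject paths outside the root, and the root directory itself.
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError("destination escapes the upload directory")
    return candidate
```

With server-generated filenames this check should never fail, which is the point: if it does, something upstream is passing user input where it should not.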

Use object storage

Cloud object storage (S3, GCS, Azure Blob) is the preferred approach:

  • Files are stored outside your application server
  • The storage service serves objects as data and does not execute them server-side
  • Built-in access control, encryption, and CDN integration
  • Pre-signed URLs with expiry for time-limited access

import boto3

s3_client = boto3.client("s3")

# Generate a pre-signed URL (AWS S3) that expires after one hour
url = s3_client.generate_presigned_url(
    "get_object",
    Params={"Bucket": "uploads", "Key": safe_filename},
    ExpiresIn=3600,
)

Encrypt at rest

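Enable server-side encryption on your storage (S3 SSE, GCS default encryption, etc.). This protects the data if the underlying storage media or backups are exposed. It does not protect against stolen application credentials, because requests made with valid credentials are decrypted transparently.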

Serving files safely

Set Content-Type explicitly

When serving uploaded files, set the Content-Type header from the type you validated at upload time, not from the Content-Type or filename the client supplied:

Content-Type: image/png

Set Content-Disposition for downloads

Force the browser to download files rather than rendering them inline:

Content-Disposition: attachment; filename="photo.png"

This prevents HTML or SVG files from being rendered in the browser context.

Use a separate domain

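Serve user-uploaded content from a separate domain, ideally a different registrable domain rather than a subdomain (GitHub, for example, serves user content from githubusercontent.com rather than github.com). A subdomain such as uploads.example.com is a different origin from example.com, so scripts in an uploaded HTML file cannot read the main application's pages or storage. It is still the same site, though: cookies set with Domain=example.com are sent to it, and SameSite protections do not apply between the two. A fully separate domain avoids both problems.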

Set restrictive headers

X-Content-Type-Options: nosniff
Content-Security-Policy: default-src 'none'

nosniff prevents the browser from guessing the content type. A restrictive CSP prevents any scripts from executing in the served content.
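The serving headers from this section can be assembled in one place. The sketch below is framework-agnostic and returns a plain dict; `download_headers` is a hypothetical helper, `validated_mime` is assumed to come from your own content validation, and `display_name` is filtered to a conservative character set so it cannot break out of the quoted header value.

```python
def download_headers(validated_mime, display_name):
    """Build response headers for serving an uploaded file as a download."""
    # Keep only ASCII letters, digits, dot, underscore, and hyphen.
    cleaned = "".join(
        c for c in display_name
        if (c.isascii() and c.isalnum()) or c in "._-"
    )
    # Fall back to a fixed name if nothing usable is left.
    if not cleaned.strip("."):
        cleaned = "download"
    return {
        "Content-Type": validated_mime,
        "Content-Disposition": f'attachment; filename="{cleaned}"',
        "X-Content-Type-Options": "nosniff",
        "Content-Security-Policy": "default-src 'none'",
    }
```

Dropping quotes and control characters matters here because a raw carriage return or line feed in a header value could otherwise be used to inject additional headers.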

Specific file type risks

File type | Risk | Mitigation
SVG | Can contain embedded JavaScript | Sanitise with a library, or reject SVGs entirely
HTML | Executes scripts when rendered | Never serve as text/html; use Content-Disposition: attachment
PDF | Can contain JavaScript and external links | Serve with Content-Disposition: attachment
ZIP / archive | Zip bombs (expand to enormous size), path traversal in entries | Set extraction size limits, sanitise entry paths
Office documents | Macro execution, external data connections | Scan with antivirus, convert to PDF for viewing
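For the archive case, extraction limits and entry-path checks can be sketched with the standard library's zipfile module. This is a sketch under assumptions, not a complete solution: `safe_extract` and both limits are hypothetical, and the code needs Python 3.9 or later.

```python
import zipfile
from pathlib import Path

MAX_TOTAL_BYTES = 50 * 1024 * 1024  # hypothetical cap on extracted size
MAX_ENTRIES = 1000                  # hypothetical cap on entry count


def safe_extract(archive, dest_dir, max_total=MAX_TOTAL_BYTES, max_entries=MAX_ENTRIES):
    """Extract a ZIP, rejecting traversal paths and oversized output."""
    dest = Path(dest_dir).resolve()
    written = 0
    with zipfile.ZipFile(archive) as zf:
        entries = zf.infolist()
        if len(entries) > max_entries:
            raise ValueError("too many entries in archive")
        for info in entries:
            # Resolve each entry path and make sure it stays under dest.
            target = (dest / info.filename).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe entry path: {info.filename!r}")
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            # Count bytes as they are decompressed and stop at the limit.
            with zf.open(info) as src, open(target, "wb") as out:
                while chunk := src.read(64 * 1024):
                    written += len(chunk)
                    if written > max_total:
                        raise ValueError("archive expands beyond the size limit")
                    out.write(chunk)
    return written
```

A failed extraction can leave partial files behind, so in practice it makes sense to extract into a temporary directory and delete it on error.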

Antivirus scanning

For applications that accept arbitrary file types, scan uploads with an antivirus engine (ClamAV is a common open-source option). This is not foolproof — novel malware can bypass signatures — but it catches known threats.

Process scans asynchronously. Do not block the upload response while scanning. Store the file in a quarantine area, scan it, and move it to the final location only if it passes.
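The quarantine step could look roughly like the following, run from a background worker rather than the upload request. This is a minimal sketch: `process_quarantined` is a hypothetical function, and `scan` is a hypothetical callable that returns True for a clean file (for example, a wrapper around a ClamAV client). No real scanner API is shown.

```python
import shutil
from pathlib import Path


def process_quarantined(path, final_dir, scan):
    """Scan one quarantined file; promote it if clean, delete it otherwise.

    Returns the new path for a clean file, or None if it was rejected.
    """
    path = Path(path)
    if scan(path):
        final_dir = Path(final_dir)
        final_dir.mkdir(parents=True, exist_ok=True)
        destination = final_dir / path.name
        # Only files that passed the scan ever reach the final location.
        shutil.move(str(path), str(destination))
        return destination
    # Rejected files are removed from quarantine and never served.
    path.unlink()
    return None
```

Until the worker has promoted a file, the application should treat it as unavailable and not serve it from the quarantine area.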

Summary

File uploads are a high-risk feature. Never trust the filename, extension, or MIME type. Validate actual file content with magic bytes. Re-encode images to strip embedded payloads. Store files outside the web root (preferably in object storage). Serve files with explicit content types, Content-Disposition: attachment, and from a separate domain. Combine these controls to reduce a dangerous feature to a manageable one.


This training content is AI-assisted and reviewed by our team, but issues may be missed and best practices evolve rapidly. Send corrections to [email protected].