The risks
- RCE - web shell (.php, .jsp) the server executes
- Stored XSS - HTML/SVG with JavaScript served to users
- Path traversal - filename ../../etc/passwd writes elsewhere
- DoS - enormous files or zip bombs
- Malware hosting - your server as free distribution
Never trust metadata
Filename, MIME type, extension - all user-controlled. Validate actual content.
Validation
Check content, not extension:
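import magic  # python-magic, a libmagic wrapper
mime = magic.from_buffer(file.read(2048), mime=True)  # sniff the leading bytes
file.seek(0)  # rewind so later steps read the whole file
if mime not in ["image/jpeg", "image/png"]:
    raise ValueError("Invalid file type")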
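Where libmagic is not available, the same idea can be sketched by checking the leading bytes by hand. The names sniff_image_type, require_allowed_image and SIGNATURES are illustrative, not part of any library, and only JPEG and PNG are covered. A matching signature says what the file claims to be at byte level, not that the rest of it is well-formed.

```python
from typing import Optional

# Leading-byte signatures for the two allowed formats.
SIGNATURES = {
    b"\xff\xd8\xff": "image/jpeg",      # JPEG start-of-image marker
    b"\x89PNG\r\n\x1a\n": "image/png",  # 8-byte PNG signature
}


def sniff_image_type(head: bytes) -> Optional[str]:
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for signature, mime in SIGNATURES.items():
        if head.startswith(signature):
            return mime
    return None


def require_allowed_image(head: bytes) -> str:
    """Raise ValueError unless the content starts like a JPEG or PNG."""
    mime = sniff_image_type(head)
    if mime is None:
        raise ValueError("Invalid file type")
    return mime
```

A PHP web shell renamed to photo.png fails this check because its first bytes are not a PNG signature, whatever the filename says.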
Size limits: Enforce them at every layer: web server, application, and storage quotas.
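At the application layer, a size cap can be sketched as a bounded read that gives up as soon as the limit is crossed, rather than buffering the whole body first. The function name read_limited and the 5 MiB figure are assumptions for illustration.

```python
from typing import BinaryIO

MAX_UPLOAD_BYTES = 5 * 1024 * 1024  # assumed 5 MiB cap; tune per application


def read_limited(stream: BinaryIO, limit: int = MAX_UPLOAD_BYTES) -> bytes:
    """Read the whole stream, raising ValueError once it exceeds `limit` bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(64 * 1024)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            # Stop early instead of buffering an arbitrarily large body.
            raise ValueError("Upload exceeds size limit")
        chunks.append(chunk)
    return b"".join(chunks)
```

This complements, and does not replace, the web server's own request-size limit, which rejects oversized bodies before they reach the application.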
Re-encode images: Strips embedded scripts, EXIF, malformed content.
from PIL import Image
img = Image.open(uploaded_file)
img.save(output, format="PNG")
Sanitise filenames: Don't try to clean the user's name. Generate a new one.
import uuid
safe_filename = f"{uuid.uuid4()}.png"  # nothing from the original filename survives
Storage
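Outside web root: A file in the document root can be requested directly and may be executed. A .php file in the web root is run by Apache (when PHP handling is enabled) as soon as someone requests it.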
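A minimal sketch of writing an upload to a directory outside the document root under a generated name. The function store_upload and the EXTENSIONS table are hypothetical, and upload_dir is assumed to be a location the web server does not serve directly.

```python
import uuid
from pathlib import Path

# Maps the validated content type to an extension chosen by the server.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png"}


def store_upload(data: bytes, mime: str, upload_dir: Path) -> Path:
    """Write `data` under `upload_dir` using a generated name; return the path."""
    extension = EXTENSIONS[mime]  # KeyError for any type not on the allowlist
    upload_dir = upload_dir.resolve()
    upload_dir.mkdir(parents=True, exist_ok=True)
    target = (upload_dir / f"{uuid.uuid4()}{extension}").resolve()
    # Defence in depth: the final path must still sit inside upload_dir.
    if upload_dir not in target.parents:
        raise ValueError("Refusing to write outside the upload directory")
    # Mode "xb" fails if the file already exists, so nothing is overwritten.
    with open(target, "xb") as handle:
        handle.write(data)
    return target
```

Because the user's filename never reaches the path, a name such as ../../etc/passwd has nothing to act on. The original name can be kept in a database column if it needs to be shown later.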
Object storage preferred: S3, GCS - outside your server, so nothing uploaded can execute there, with built-in access control. Hand out pre-signed URLs with an expiry rather than public links.
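To show what an expiring signed URL does, here is a generic HMAC sketch. It is not the S3 or GCS signing scheme, which the provider's SDK produces for you. SECRET_KEY, sign_url and verify are hypothetical names, and the key is assumed to be a server-generated object name that needs no URL escaping.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from config


def _signature(key: str, expires: int) -> str:
    """HMAC over the object key and its expiry timestamp."""
    message = f"{key}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, key: str, ttl_seconds: int,
             now: Optional[float] = None) -> str:
    """Build a URL for `key` that stops verifying after `ttl_seconds`."""
    expires = int(time.time() if now is None else now) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(key, expires)})
    return f"{base_url}/{key}?{query}"


def verify(key: str, expires: int, sig: str,
           now: Optional[float] = None) -> bool:
    """Accept only an unexpired link whose signature matches key and expiry."""
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(key, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

The expiry is part of the signed message, so a client cannot extend a link by editing the expires parameter.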
Serving safely
- Explicit Content-Type from validated type
- Force download: Content-Disposition: attachment
- Separate domain: uploads.example.com - different origin, injected scripts can't access main app
- Headers: X-Content-Type-Options: nosniff
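The header rules above can be collected in one framework-neutral helper. This is a sketch: download_headers and ALLOWED_TYPES are illustrative names, and stored_name is assumed to be the server-generated filename, not the one the user supplied.

```python
ALLOWED_TYPES = {"image/jpeg", "image/png"}


def download_headers(validated_mime: str, stored_name: str) -> dict:
    """Build response headers for serving a stored upload as a download."""
    # Reject characters that could break out of the quoted filename or header.
    if any(ch in stored_name for ch in '"\\\r\n'):
        raise ValueError("Unsafe character in filename")
    if validated_mime not in ALLOWED_TYPES:
        # Anything unexpected is served as opaque bytes, never as text/html.
        validated_mime = "application/octet-stream"
    return {
        "Content-Type": validated_mime,
        "Content-Disposition": f'attachment; filename="{stored_name}"',
        "X-Content-Type-Options": "nosniff",
    }
```

The returned dictionary can be passed to whatever response object the web framework provides. Serving from the separate uploads domain is a deployment setting and is not shown here.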
File type risks
| Type | Risk | Fix |
|---|---|---|
| SVG | Embedded JS | Sanitise or reject |
| HTML | Script execution | Attachment, never text/html |
| ZIP | Bombs, traversal | Size limits, sanitise paths |
| Office | Macros | AV scan, convert to PDF |
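For the ZIP row, a pre-extraction check might look like the sketch below, using the standard zipfile module. The limits and the name check_zip are assumptions. The sizes inspected are the ones the archive declares, which can be falsified, so extraction should still cap the bytes it actually writes. Data that is not a ZIP at all raises zipfile.BadZipFile rather than ValueError.

```python
import io
import zipfile
from pathlib import PurePosixPath

MAX_MEMBERS = 1000                  # assumed limits; tune per application
MAX_TOTAL_BYTES = 50 * 1024 * 1024


def check_zip(data: bytes) -> list:
    """Return the member names if the archive passes the checks, else raise ValueError."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = archive.infolist()
        if len(members) > MAX_MEMBERS:
            raise ValueError("Too many entries")
        # file_size is the declared uncompressed size of each member.
        if sum(m.file_size for m in members) > MAX_TOTAL_BYTES:
            raise ValueError("Archive expands beyond the size limit")
        for member in members:
            # Treat backslashes as separators so "..\\x" is caught as well.
            path = PurePosixPath(member.filename.replace("\\", "/"))
            if path.is_absolute() or ".." in path.parts:
                raise ValueError(f"Unsafe path in archive: {member.filename!r}")
        return [m.filename for m in members]
```

Nothing is extracted here. The function only decides whether the archive is worth opening further.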
The takeaway
Never trust filename, extension, MIME type. Validate content. Re-encode images. Store outside web root. Serve with attachment header, separate domain.
