Input Validation and Schema Enforcement | Training | JustAppSec

Let's be clear about what input validation is and isn't, because misunderstanding this gets people breached. Validation is not your defence against injection or XSS. Parameterised queries defend against injection; output encoding defends against XSS. Validation is the supporting layer underneath both: it rejects obviously bad data at the boundary, shrinks your attack surface, and catches mistakes early. Treat it as the only thing standing between you and an attacker, and you'll have a bad day.

One rule before anything else: client-side validation is for user experience, server-side validation is for security. The browser checks are a convenience the attacker simply skips.

The core strategies

Allowlist, don't denylist. Define what's permitted and reject the rest, because you can enumerate the good values but you'll never finish enumerating the bad ones.

ALLOWED_STATUS = {"active", "inactive", "pending"}
if status not in ALLOWED_STATUS:
    raise ValidationError("Invalid status")

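Enforce types. A value that has to be an integer can't carry a SQL injection string, provided you treat a failed conversion as a rejection rather than letting the exception crash the handler: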

order_id = int(request.params.get("order_id"))
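The bare int() call raises TypeError when the parameter is missing and ValueError when it isn't numeric, so in practice the conversion usually sits behind a small helper. A minimal sketch, assuming params is a plain dict of strings; parse_int_param, MAX_INT_CHARS and this ValidationError class are hypothetical names, not part of any framework:

```python
class ValidationError(Exception):
    """Hypothetical error type; substitute whatever your framework raises."""


MAX_INT_CHARS = 18  # assumption: comfortably fits a 64-bit ID


def parse_int_param(params: dict, name: str) -> int:
    """Return params[name] as an int, or raise ValidationError."""
    raw = params.get(name)
    if not isinstance(raw, str) or not raw:
        raise ValidationError(f"{name} is required")
    if len(raw) > MAX_INT_CHARS:
        raise ValidationError(f"{name} is too long")
    # int() alone also accepts surrounding whitespace, underscores ("1_000")
    # and non-ASCII digits, so check the exact shape first.
    digits = raw[1:] if raw.startswith("-") else raw
    if not (digits.isascii() and digits.isdigit()):
        raise ValidationError(f"{name} must be an integer")
    return int(raw)
```

With this helper, parse_int_param({"order_id": "42"}, "order_id") returns 42, while a value such as "42; DROP TABLE orders" raises ValidationError instead of reaching the query layer.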

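Cap lengths. Every string field needs a maximum. Unbounded input is how you get denial of service through memory and CPU exhaustion, plus overflows in whatever fixed-size buffer or database column sits further down the line.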
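A length cap can be as small as the sketch below. The limit of 2000 and the names check_length and ValidationError are illustrative assumptions; pick limits that match what your storage and downstream systems actually accept.

```python
class ValidationError(Exception):
    """Hypothetical error type for this sketch."""


MAX_COMMENT_CHARS = 2000  # illustrative limit, not a recommendation


def check_length(value: str, max_chars: int, field: str = "value") -> str:
    """Return value unchanged if it fits, otherwise raise ValidationError."""
    # Reject rather than truncate: silent truncation changes what the user sent.
    # Note that len() counts code points; the encoded byte length may be larger.
    if len(value) > max_chars:
        raise ValidationError(f"{field} exceeds {max_chars} characters")
    return value
```

Calling check_length(comment, MAX_COMMENT_CHARS, "comment") either hands back the original string or stops the request at the boundary.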

Check ranges on anything numeric:

if quantity < 1 or quantity > 1000:
    raise ValidationError("Quantity out of range")

Let a schema do the work

Rather than scatter these checks by hand, declare the whole shape of the input once and enforce it in a single step. The big win, beyond tidiness, is rejecting unexpected fields, which shuts down mass assignment.

JSON Schema:

{
  "type": "object",
  "required": ["name", "email"],
  "properties": {
    "name": { "type": "string", "maxLength": 100 },
    "email": { "type": "string", "format": "email" }
  },
  "additionalProperties": false
}

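That additionalProperties: false is the line that matters most: it makes validation fail for any payload containing a field you didn't ask for, rather than letting the extra field through. One caveat on the email line: many JSON Schema validators treat format as an annotation only unless format checking is switched on, so confirm that yours enforces it. The same idea in Zod and Pydantic: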

import { z } from "zod";

// .strict() makes parsing fail on unknown keys instead of stripping them
const CreateUserSchema = z.object({
  name: z.string().max(100),
  email: z.string().email(),
}).strict();

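from pydantic import BaseModel, ConfigDict, EmailStr, Field

class CreateUser(BaseModel):
    # Pydantic v2 style; in v1 this was `class Config: extra = "forbid"`
    model_config = ConfigDict(extra="forbid")

    name: str = Field(max_length=100)
    email: EmailStr  # requires the email-validator package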
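To make the mass-assignment point concrete, here is roughly what a strict schema does for you, written by hand with the standard library only. The field names match the schemas above; validate_create_user and ValidationError are hypothetical names, and the email check is deliberately crude. Treat it as an illustration of the behaviour, not a replacement for a schema library.

```python
class ValidationError(Exception):
    """Hypothetical error type for this sketch."""


ALLOWED_FIELDS = {"name", "email"}  # both are also required here


def validate_create_user(payload: dict) -> dict:
    """Return a clean copy of payload, or raise ValidationError."""
    # Reject unknown fields outright: this is what blocks mass assignment.
    unknown = set(payload) - ALLOWED_FIELDS
    if unknown:
        raise ValidationError(f"Unexpected fields: {sorted(unknown)}")
    missing = ALLOWED_FIELDS - set(payload)
    if missing:
        raise ValidationError(f"Missing fields: {sorted(missing)}")

    name, email = payload["name"], payload["email"]
    if not isinstance(name, str) or len(name) > 100:
        raise ValidationError("name must be a string of at most 100 characters")
    # Crude placeholder check; a real schema library validates email properly.
    if not isinstance(email, str) or "@" not in email:
        raise ValidationError("email must be an email address")
    return {"name": name, "email": email}
```

A payload such as {"name": "Ada", "email": "ada@example.com", "is_admin": True} is rejected as a whole, so the extra is_admin field never gets near the model that would have stored it.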

Normalise before you validate

This is the step people forget, and it quietly defeats otherwise-correct validation. Canonicalise the input first, so that you're checking the same form the system will actually act on.

Unicode. é might be a single code point or an e with a combining accent. They look identical and compare differently.
URL encoding. %2e%2e%2f is just ../ wearing a disguise.
Case. If your check compares against admin exactly, an attacker will try ADMIN, and any downstream component that ignores case will treat the two as the same.

Validate after normalising, never before.
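The three cases above can be sketched with the standard library. This assumes the raw value has not already been URL-decoded by your framework (decoding twice is its own bug), and the names canonicalise, validate_role, ALLOWED_ROLES and ValidationError are all hypothetical.

```python
import unicodedata
from urllib.parse import unquote


class ValidationError(Exception):
    """Hypothetical error type for this sketch."""


ALLOWED_ROLES = {"admin", "editor", "viewer"}  # illustrative allowlist


def canonicalise(raw: str) -> str:
    """Reduce raw input to the single form the rest of the system will see."""
    # Decode percent-encoding exactly once, as many times as the system will.
    decoded = unquote(raw)
    # NFC folds "e" + combining accent into the single code point for "é".
    composed = unicodedata.normalize("NFC", decoded)
    # casefold() is a more aggressive lower() intended for caseless comparison.
    return composed.casefold()


def validate_role(raw: str) -> str:
    """Canonicalise first, then check the allowlist."""
    role = canonicalise(raw)
    if role not in ALLOWED_ROLES:
        raise ValidationError("Invalid role")
    # Pass on the canonical form, never the raw input that was checked.
    return role
```

The design point is the return value: the caller receives the canonical string that was validated, so nothing later in the request can act on a different form from the one the check saw.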

So: allowlists over denylists, types and lengths on everything, a schema library to enforce it in one place, unknown fields rejected, canonicalisation first, and all of it on the server. None of that replaces your real defences. It just stops a lot of nonsense reaching them.