Direct prompt injection
An adversarial user crafts an input designed to override the system prompt — getting the agent to ignore instructions, reveal internal context, or take unintended actions.
Strict instruction hierarchy with a privileged system context that is never co-mingled with user input. Inputs are parsed against a structured schema; outputs are validated before being released. Detected override attempts are logged and surfaced to the customer audit trail.