Gandalf

Generative AI security II

6 minute read

A practical follow-up on deploying LLM features: treat the model as untrusted, control both prompts and outputs, and assume cost is a product constraint. I map concrete mitigations to the OWASP GenAI Top 10 (2025) and discuss what to look for in an AI gateway.

Read the post

Recreating Gandalf by Lakera

6 minute read

I played Lakera’s Gandalf prompt-injection game and tried to recreate several levels with a plain LLM and a system prompt. The exercise makes prompt-injection failures concrete and highlights why you still need input/output controls beyond prompting.

Read the post

Generative AI security

11 minute read

An overview of how LLMs handle prompts, why prompt injection works, and why multimodal systems widen the attack surface. I use Lakera’s Gandalf game as a concrete thread and end with the defensive patterns that actually matter.

Read the post