How a seemingly harmless image can jailbreak AI
Researchers at Florida International University have developed a technique called JaiLIP that uses nearly invisible image modifications to bypass AI safety guardrails and trigger responses that the models would normally block.