Hidden Prompt Injection Attacks in External Texts for LLMs: An Exploratory Analysis of a HARO Corpus and the Implications for AI Agents
Abstract
This article examines a class of indirect prompt injection attacks in which a malicious instruction is delivered to a large language model not as an explicit user command, but as a hidden fragment embedded in external text supplied to the system as data for analysis. Using a corpus of 20 HARO queries, the paper shows that, in a real information flow, obfuscated instructions may appear in hex, Base64, and invisible Unicode forms, including zero-width characters. The case is interpreted within the broader literature on indirect prompt injection, AI agent security, and retrieval-augmented systems. The findings suggest that any external text routed into an LLM should be treated as potentially compromised until it has been normalized, sanitized, and provenance-scoped.
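The normalization-and-sanitization step recommended above can be sketched as a small pre-processing pass. The following Python example is a minimal, illustrative sketch (not the paper's implementation): the `sanitize` function, the `INVISIBLE` code-point set, and the length thresholds in the regexes are assumptions chosen for demonstration, and the patterns for Base64-like and hex-like runs are heuristic flags rather than decoders.

```python
import re
import unicodedata

# A non-exhaustive, illustrative set of invisible code points
# commonly used to hide text (all are Unicode category Cf).
INVISIBLE = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (BOM)
    "\u00ad",  # SOFT HYPHEN
}

# Heuristic patterns for long encoded runs; the minimum lengths
# (24 chars) are arbitrary demo thresholds, not recommended values.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")
HEX_RUN = re.compile(r"(?:[0-9a-fA-F]{2}){12,}")


def sanitize(text: str) -> tuple[str, list[str]]:
    """Strip invisible code points and flag suspicious encoded runs."""
    findings: list[str] = []
    kept: list[str] = []
    for ch in unicodedata.normalize("NFKC", text):
        # Drop explicit invisibles and any other format (Cf) characters.
        if ch in INVISIBLE or unicodedata.category(ch) == "Cf":
            findings.append(f"invisible code point U+{ord(ch):04X}")
            continue
        kept.append(ch)
    cleaned = "".join(kept)
    for pattern, label in ((BASE64_RUN, "base64-like run"), (HEX_RUN, "hex-like run")):
        for match in pattern.finditer(cleaned):
            findings.append(f"{label}: {match.group(0)[:16]}")
    return cleaned, findings
```

In a pipeline, the `findings` list would feed a policy decision (quarantine, strip, or annotate the text with its provenance) before the content ever reaches the model's context window.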