RAG pipelines are great, but they can still retrieve "toxic" chunks:– prompt injection attempts– leaked API keys/secrets– stale or conflicting content– unapproved external URLsWe built an open-source "retrieval firewall" that scans chunks before they reach the LLM:– denies injection & secrets– flags/reranks PII, encoded blobs, untrusted URLs– audit log (JSONL) of all decisions– drop-in wrappers for LangChain and LlamaIndex retrieversInstall: pip install rag-firewallRepo: https://github.com/taladari/rag-firewallCurious if others here handle retrieval-time risks, or just ingest/output filtering.Would love feedback and red-team payloads.Comments URL: https://news.ycombinator.com/item?id=45068582Points: 1# Comments: 0