Show HN: SafeKey – PII redaction for LLM inputs (text, image, audio, video)

safekeylab.com

4 points by safekeylab a day ago

Hey HN, I built SafeKey because I was handling patient data as an Army medic, then doing AI research at Cornell. Every time we tried to use LLMs with sensitive data, something leaked. Existing tools only covered text at ~85% accuracy; nothing worked across modalities.

SafeKey is an AI input firewall. It sits between your app and the model, redacting PII before data leaves your environment. What we built:

- PII Guard: 99%+ accuracy across text, images, audio, and video

- AI Guard: blocks prompt injection and jailbreaks (95%+ F1, zero false positives)

- Agent Security: protects autonomous AI workflows

- RAG Security: secures retrieval-augmented generation pipelines

Sub-30ms latency. Drop-in SDK for OpenAI, Anthropic, Azure, AWS Bedrock. Runs in your VPC or our cloud.
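
Integration is meant to be a thin wrapper around the client you already use. Roughly, the pattern looks like this (illustrative names only, not our actual SDK surface):

    # Sketch of the drop-in pattern; redact() stands in for the SDK call.
    from openai import OpenAI

    client = OpenAI()

    def redact(text: str) -> str:
        # Hypothetical placeholder: the real redaction runs inside your
        # VPC before the request leaves your environment.
        return text

    user_input = "Patient John Doe, DOB 01/02/1980, reports chest pain."
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": redact(user_input)}],
    )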

Would love feedback on the approach. Happy to answer questions.

Thanks, Sukin

itake 13 hours ago

I’ve spent the past five years working in content moderation.

In my opinion, the real gap in the market isn't “better safety models”; it's turn-key orchestration platforms that provide:

- A web portal for manual moderation and data-labeling workflows

- Multi-tier moderation checks (e.g., if a keyword is detected, escalate to an LLM; sketched after this list)

- Simple integration of custom, business-specific models (e.g., blocking competitor mentions)

- A rules engine that combines all model outputs and issues the appropriate treatments
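
For the multi-tier point, I mean something like this (toy sketch, not any particular vendor's API):

    # Tier 1: cheap keyword screen; escalate hits to a slower, smarter tier 2.
    BLOCKLIST = {"buy followers", "free crypto"}

    def escalate_to_llm(text: str) -> str:
        # Placeholder for an LLM or custom-model call in the real pipeline.
        return "review"

    def moderate(text: str) -> str:
        if any(term in text.lower() for term in BLOCKLIST):
            return escalate_to_llm(text)  # tier 2: expensive, more accurate
        return "approve"                  # tier 1 pass: nothing suspicious

The rules engine then maps whatever the models return onto a treatment: reject, shadow-ban, or queue for manual review.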

Two Hat and Azure kinda had this, but they didn't support custom models or a rules engine.

While I love the idea of redacting/auto-correcting media, e-commerce and social media companies are structurally set up against this. They'd rather stick with the status quo of rejecting content than use nano-banana to remove non-compliant features (like PII) from the images.

Once, I had to anonymize student data so we could have a prod copy on staging. So maybe there is a use case there...

tonetegeatinst a day ago

Awesome tool and team you have.

A few questions I thought of, and I apologize if they seem stupid, as ML is not my focus of study.

1. Has your team ever considered formal verification of the code to demonstrate how reliable your process is?

2. If data has been removed via your pipeline, is it possible to still infer the type of data from its position or format? (Names of people tend to appear in certain places in a sentence, or the fact that data is formatted a certain way could reveal it's a date or timestamp.)

3. You mentioned clients can deploy in their own VPC; does that mean this is a FedRAMP-ready product? (Do you see this tool being offered to public institutions?)

4. Do you have any internship openings for college students in the summer of 2026?

  • safekeylab a day ago

    Thanks! Great questions.

    Formal verification: Not yet; so far we've validated through pilot deployments and CS/DevOps teams who've stress-tested the pipeline in production.

    Positional inference: Good catch. We replace PII with type-consistent tokens (e.g., [NAME], [DATE]) so format is preserved for downstream tasks, but the actual value is gone. For higher security, we offer synthetic replacement (fake but realistic values) so position and format don't leak information.
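
    To make that concrete, here is just the substitution step in miniature (toy regexes for illustration; not our production detection):

        import re

        # Simplified: swap detected spans for type-consistent tokens so the
        # sentence keeps its shape but the value is gone.
        PATTERNS = {
            "[DATE]": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
            "[SSN]": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        }

        def redact(text: str) -> str:
            for token, pattern in PATTERNS.items():
                text = pattern.sub(token, text)
            return text

        print(redact("DOB 01/02/1980, SSN 123-45-6789"))
        # -> DOB [DATE], SSN [SSN]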

    FedRAMP: Not yet certified, but the architecture supports it — runs inside customer VPC, no data leaves the environment, full audit logging. FedRAMP and StateRAMP are on our compliance roadmap after SOC2 and HIPAA. Yes, public sector is a major target market.

    Internships: Not formally open yet, but email me at sukin@safekeylab.com — always interested in students working on AI security.

freakynit 11 hours ago

Notable Angel Investors: Sam Altman (CEO, OpenAI), Dario Amodei (CEO, Anthropic), Jensen Huang (CEO, NVIDIA), Satya Nadella (CEO, Microsoft), Marc Benioff (CEO, Salesforce), Sundar Pichai (CEO, Google)

is this real? damn!!

vunderba a day ago

The only link on the site to a source repository is a GitHub repository that 404s.

https://github.com/safekeylab

EDIT: Manually searching GitHub leads to https://github.com/sukincornell/safekeylab (assuming that is the correct one).

  • safekeylab a day ago

    Thanks for flagging. We're not open source; the GitHub link shouldn't have been on the site, and we're removing it now. We offer a private SDK for customers. If you want to test it, create an account on the website or ping me at sukin@safekeylab.com