The ngrok for PHI

One Import Change.
Automatic PHI Redaction.

Drop-in replacement for OpenAI, Anthropic, and Gemini SDKs. PHI is automatically redacted before reaching any LLM provider. All processing happens locally: no Redact API keys, no signup. Uses your existing LLM provider keys.

your_app.py
# BEFORE - PHI goes directly to OpenAI
from openai import OpenAI

# AFTER - PHI is automatically redacted
from redact_proxy import OpenAI

# Everything else stays exactly the same
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content":
        "Patient John Smith, DOB 01/15/1980, has diabetes."}]
)

# What OpenAI actually receives:
# "Patient [NAME_a1b2c3], DOB [DATE_d4e5f6], has diabetes."
97.3% Precision · 91.1% Recall · 94.1% F1 Score · ~8ms Latency · 18 PHI Types
Benchmarked on gold-labeled EMR notes from Meditech, Cerner, and eCW

Need PHI back in responses?

DE-ID removes PHI permanently. Our RE-ID SDK creates reversible tokens: the LLM never sees PHI, but you can restore it in responses. pip install redact-proxy[reid]

Learn About RE-ID →
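For intuition, reversible tokenization boils down to keeping a local placeholder-to-PHI map and applying it in reverse to the model's response. The sketch below is purely conceptual; the function names are illustrative and are not the RE-ID SDK's actual API (the real SDK does this automatically and keeps its token store encrypted).

# Conceptual sketch of reversible tokenization (illustrative only).
import secrets

token_map: dict[str, str] = {}  # placeholder -> original PHI, never leaves your machine

def tokenize(original: str, phi_type: str) -> str:
    """Swap a detected PHI value for a reversible placeholder."""
    placeholder = f"[{phi_type}_{secrets.token_hex(3)}]"
    token_map[placeholder] = original
    return placeholder

def restore(text: str) -> str:
    """Put original PHI back into text returned by the LLM."""
    for placeholder, original in token_map.items():
        text = text.replace(placeholder, original)
    return text

name = tokenize("John Smith", "NAME")
prompt = f"Summarize the visit for {name}."               # what the LLM sees
llm_response = f"Here is the visit summary for {name}."   # stand-in for an LLM reply
print(restore(llm_response))                              # "...for John Smith."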

Intercept, Redact, Forward

Redact-Proxy wraps your LLM SDK and intercepts all requests before they leave your machine.

📝 Your Code
Contains PHI
→
🛡️ Redact-Proxy
Local processing
→
🤖 LLM Provider
PHI-free request
What gets sent to OpenAI
# Your original message:
"Patient John Smith, SSN 123-45-6789, was seen on 01/15/2024
at Springfield Medical Center. Contact: (555) 123-4567"

# What OpenAI receives:
"Patient [NAME_a1b2c3], SSN [SSN_d4e5f6], was seen on [DATE_g7h8i9]
at [FACILITY_j0k1l2]. Contact: [PHONE_m3n4o5]"
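As a mental model (not the actual Redact-Proxy source), the intercept-and-forward step can be pictured as a thin wrapper around the vendor client: scrub the message content, then delegate. redact_text below is a placeholder for the real local detection engine.

# Minimal sketch of the intercept-and-forward idea (illustrative only;
# not the actual Redact-Proxy source).
from types import SimpleNamespace
from openai import OpenAI as _VendorOpenAI

def redact_text(text: str) -> str:
    return text  # a real engine would substitute [NAME_...], [SSN_...], etc.

class OpenAI:
    """Wraps the vendor client and scrubs chat messages before forwarding."""

    def __init__(self, **kwargs):
        self._client = _VendorOpenAI(**kwargs)
        # Mirror the client.chat.completions.create call shape so existing
        # code keeps working unchanged.
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, *, messages, **kwargs):
        clean = [
            {**m, "content": redact_text(m["content"])}
            if isinstance(m.get("content"), str) else m
            for m in messages
        ]
        return self._client.chat.completions.create(messages=clean, **kwargs)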

Why Redact-Proxy?

Zero infrastructure. Zero signup. Just change your import.

1

One-Line Migration

Change one import statement. All your existing code works exactly the same, but now PHI is protected.

0

Zero Infrastructure

No API keys, no cloud services, no Docker containers. Everything runs locally in your Python environment.

🔒

PHI Never Leaves

All redaction happens locally before any network request. Your PHI never touches third-party servers.

⚡

Fast Detection

Pattern-based detection adds ~8ms latency. Transformer mode available for higher accuracy.

🔄

Drop-In Compatible

Same API, same methods, same parameters. Works with OpenAI, Anthropic, and Google Gemini.

📖

Open Source

MIT licensed core. Inspect the code, contribute, or fork it. No vendor lock-in.

Where Your Data Goes

Understand exactly what happens to PHI at each step.

🏠

In-Process Redaction

All detection and redaction happens in your application's memory. No external service calls.

🚫

PHI Stays Local

Only redacted text goes to the LLM provider. Original PHI never leaves your machine.

🗑️

No Persistence

PHI↔placeholder mappings are stored in memory only and cleared after each request completes (see the sketch after this list).

📝

Logging Off by Default

Debug logging is disabled by default. If enabled, ensure your logs are stored securely.
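A rough picture of that per-request lifecycle, assuming nothing about the library's internals: the mapping exists only inside a request scope and is wiped on exit, so nothing is ever written to disk.

# Illustrative sketch of the no-persistence model.
import secrets
from contextlib import contextmanager

@contextmanager
def request_scope():
    mapping: dict[str, str] = {}   # placeholder -> original PHI
    try:
        yield mapping              # available while the request is in flight
    finally:
        mapping.clear()            # wiped on completion; never persisted

with request_scope() as phi_map:
    placeholder = f"[NAME_{secrets.token_hex(3)}]"
    phi_map[placeholder] = "John Smith"
    # ... the redacted request goes out to the LLM provider here ...

print(len(phi_map))  # 0 -- nothing survives once the request scope exits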

Common Pitfalls

PHI can still leak through other parts of your application.

⚡ Streaming Responses
PHI in streamed chunks may bypass redaction. Redact after the full response is assembled.
🔧 Tool/Function Calling
Function arguments may contain PHI. Redact tool inputs before passing to the LLM.
🔄 Retries & Error Handling
Stack traces can expose PHI in variables. Scrub exceptions before logging.
⏰ Background Jobs
Async workers may bypass the proxy. Use Redact-Proxy in your worker code too.
💾 Prompt Caching
Cached prompts aren't re-redacted on retrieval. Cache only already-redacted prompts.
📊 App Logs & Analytics
Your logging framework and APM tools may capture request bodies. Review your full data flow.
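One concrete mitigation that covers the tool-calling and logging rows above: route anything you control through the same redaction step before it reaches a logger or a tool definition. redact_text in the sketch below is a hypothetical stand-in, not part of the library's documented API.

# Defensive scrubbing sketch for the logging and tool-calling pitfalls.
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("my_app")

def redact_text(text: str) -> str:
    return text  # a real engine would replace PHI with [NAME_...]-style placeholders

def safe_log(message: str) -> None:
    """Scrub before the logging framework or an APM agent can capture PHI."""
    logger.info(redact_text(message))

def safe_tool_arguments(arguments: dict) -> str:
    """Redact every string value before attaching it to a tool/function call."""
    clean = {k: redact_text(v) if isinstance(v, str) else v for k, v in arguments.items()}
    return json.dumps(clean)

safe_log("Scheduling follow-up for John Smith, MRN 0012345")
print(safe_tool_arguments({"patient": "John Smith", "visit_id": 42}))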

18 PHI Types Detected

Covers all HIPAA Safe Harbor identifiers plus clinical extensions.

👤 Names
📅 Dates
🔢 SSN
🏥 MRN
📞 Phone
📧 Email
📍 Address
🏙️ City/State
📮 ZIP Code
🎂 Age (90+)
🏢 Facilities
👨‍⚕️ Providers
💳 Account #
🪪 License #
🚗 VIN/Plates
💊 DEA #
🌐 URLs/IPs
🔐 Biometrics
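To make the pattern-based side of detection concrete, a few of these types can be caught with ordinary regular expressions, as in the simplified sketch below. These are illustrative patterns, not the library's production rules, which are more thorough.

# Simplified examples of the kind of patterns behind a few PHI types.
import re

PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\(\d{3}\)\s*\d{3}-\d{4}"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "DATE":  re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
}

note = "Seen 01/15/2024. SSN 123-45-6789, phone (555) 123-4567."
for phi_type, pattern in PATTERNS.items():
    for match in pattern.finditer(note):
        print(f"{phi_type}: {match.group(0)}")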

Choose Your Speed/Accuracy Tradeoff

Three detection engines for different use cases.

Fast
~8ms per request

Pattern-based detection. Good for structured EMR data.

  • Regex-based pattern matching
  • 97.3% precision / 91.1% recall on EMR notes
  • Lowest latency

Balanced
~400ms per request

spaCy NER + patterns. Good for mixed content types.

  • Named entity recognition
  • Context-aware detection
  • Requires spaCy model
  • Better name detection

Transformer
~800ms per request

Clinical NER model. Best for free-text narratives without structured labels.

  • Clinical language model
  • Higher recall on edge cases
  • GPU recommended
  • Continuously fine-tuned

# Configure detection mode
from redact_proxy import OpenAI

client = OpenAI(
    redact_mode="fast"      # or "balanced" or "accurate"
)
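If you want to check the latency figures on your own hardware, a quick harness like the one below works. Note that it measures the full round trip (redaction plus the provider call), so compare modes against each other rather than reading the numbers as pure redaction overhead; it assumes only the redact_mode parameter shown above.

# Rough timing harness for comparing detection modes.
import time
from redact_proxy import OpenAI

MESSAGE = [{"role": "user", "content": "Patient John Smith, DOB 01/15/1980."}]

for mode in ("fast", "balanced", "accurate"):
    client = OpenAI(redact_mode=mode)
    start = time.perf_counter()
    client.chat.completions.create(model="gpt-4", messages=MESSAGE)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{mode}: {elapsed_ms:.0f} ms total (includes the provider round trip)")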

Works With Your LLM

Same API you're already using. Just change the import.

OpenAI
from redact_proxy import OpenAI

client = OpenAI()
response = client.chat.completions.create(...)
Anthropic Claude
from redact_proxy import Anthropic

client = Anthropic()
response = client.messages.create(...)
Google Gemini
from redact_proxy import Gemini

client = Gemini()
response = client.generate_content(...)

Built for Healthcare AI

🩺

Clinical Documentation AI

Build AI scribes and documentation assistants that process patient encounters without sending PHI to cloud LLMs. Perfect for ambient listening apps.

🔬

Research & Analytics

Analyze clinical notes with GPT-4 or Claude on de-identified data, which can simplify IRB review. Extract insights from medical records while maintaining patient privacy.

💬

Patient-Facing Chatbots

Build symptom checkers and health assistants. Patients can describe conditions freely knowing their information stays private.

📊

EHR Integration

Add AI features to your EHR or practice management system. Process clinical data with LLMs while keeping PHI out of third-party systems.

vs. Building It Yourself

Task              DIY Approach              Redact-Proxy
Setup time        Days to weeks             5 minutes
Code changes      Wrap every API call       Change 1 import
PHI patterns      Write your own regex      18 types included
Maintenance       Ongoing updates needed    pip install --upgrade
Testing           Build test suite          Validated on clinical data
Multi-provider    Implement per provider    OpenAI, Anthropic, Gemini

Get Running in 60 Seconds

1

Install the package

No additional dependencies beyond your existing LLM SDK.

pip install redact-proxy
2

Change your import

Just change where you import from. That's it.

# Change this:
from openai import OpenAI

# To this:
from redact_proxy import OpenAI
3

That's it - you're protected!

Your existing code now automatically redacts PHI before sending to any LLM.

# Your existing code works unchanged
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content":
         "Patient John Smith, SSN 123-45-6789, has diabetes."}
    ]
)

# OpenAI receives: "Patient [NAME_a1b2c3], SSN [SSN_d4e5f6], has diabetes."
# PHI never leaves your machine

DE-ID SDK is Free Forever

Run de-identification locally with no limits. Need to restore original PHI? Add RE-ID.

RE-ID SDK
$99/mo

DE-ID + Re-identification

  • Everything in DE-ID, plus:
  • Automatic re-identification
  • Unlimited RE-ID calls
  • Multi-turn support
  • Encrypted token storage
  • Priority email support

Need Cloud Workspace, HIPAA Chat, or BAA? See full platform pricing →

Frequently Asked Questions

Does this make me HIPAA compliant?
Redact-Proxy helps you avoid sending PHI to third-party LLMs, which is one aspect of HIPAA compliance. Full HIPAA compliance requires additional measures including BAAs, access controls, and security policies. Consult a compliance expert for your specific situation.
Is PHI ever sent to your servers?
No. All PHI detection and redaction happens locally in your Python environment. The open source version has zero network calls except to your chosen LLM provider. We never see your data.
What if it misses some PHI?
No de-identification system is 100% perfect. Our fast mode achieves 94.1% F1 on structured EMR data (97.3% precision, 91.1% recall). Transformer mode captures more edge cases in free-text narratives. For maximum safety, consider manual review for high-risk data.
Can I get the original PHI back?
The DE-ID-only version permanently redacts PHI; there's no way to restore it. Need re-identification? Our DE-ID/RE-ID SDK automatically restores PHI in LLM responses using encrypted token storage.
Does it work with streaming?
Yes. Redact-Proxy supports streaming responses from all providers. PHI is redacted before the request, and streaming works normally for the response.
Can I use my own PHI patterns?
Yes. Pro users can add custom regex patterns for organization-specific identifiers like custom MRN formats or internal ID schemes.
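The configuration surface for custom patterns isn't shown on this page, so treat the parameter name below as a hypothetical placeholder and consult the Pro documentation for the real API; the point is that organization-specific identifiers reduce to ordinary regular expressions.

# Hypothetical sketch of registering custom patterns (parameter name assumed).
from redact_proxy import OpenAI

client = OpenAI(
    redact_mode="fast",
    custom_patterns={                       # hypothetical parameter
        "INTERNAL_MRN": r"\bMRN-\d{7}\b",   # e.g. MRN-0012345
        "STUDY_ID":     r"\bSTUDY-[A-Z]{2}\d{4}\b",
    },
)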

Ready to protect your PHI?

Get started in 60 seconds. No signup required.