Ways to share less personal data with AI assistants

Six options that fit different situations. None of them is the answer for every case. Pick the lightest one that works for the message in front of you.

Independent studies of real chat logs find that more than 70% of LLM queries contain personally identifiable information, and the largest providers retain raw conversations for months or years. The numbers are not great. The good news is that real, low-effort changes cut most of the exposure.

1. Send less in the first place

The cheapest fix is also the most underused. A surprising number of prompts include details the model does not actually need. “Help me draft a reply to my boss about the Q3 budget” works just as well as “Help me draft a reply to my boss Sarah Patel about the Q3 budget”. The version with the name puts a real person at your real company onto someone else’s server. The version without it does not.

A useful habit: write the prompt first, then read it back and remove every detail the model does not need to give a useful answer.

2. Substitute placeholders

When you do need specifics, you can often use stand-ins such as “Person A”, “Company X”, or “Project Foo”. Most modern models handle this well and produce answers you can mentally translate back. The same goes for documents: paste the structure and replace the names with letters.
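For anything longer than a sentence or two, the swap and the translation back can be scripted. The following is a minimal sketch, not a finished tool: the helper name `swap`, the company name "Northwind Ltd", and the sample reply are all made up for illustration, and the mapping is assumed to be one you write yourself and keep on your own machine.

```python
import re

def swap(text: str, table: dict[str, str]) -> str:
    """Replace every key of `table` found in `text` with its value, in one pass."""
    if not table:
        return text
    # Longest keys first, so "Sarah Patel" is matched before a shorter "Sarah".
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: table[m.group(0)], text)

# Hypothetical mapping; the real values never leave this script.
stand_ins = {"Sarah Patel": "Person A", "Northwind Ltd": "Company X"}

prompt = "Draft a reply to Sarah Patel at Northwind Ltd about the Q3 budget."
safe_prompt = swap(prompt, stand_ins)

# Pretend this text came back from the model.
reply = "Hi Person A, thanks for sending the Company X figures."
restored = swap(reply, {v: k for k, v in stand_ins.items()})

print(safe_prompt)  # Draft a reply to Person A at Company X about the Q3 budget.
print(restored)     # Hi Sarah Patel, thanks for sending the Northwind Ltd figures.
```

Matching here is exact and case-sensitive, so nicknames, misspellings, and initials slip through. That is the same brittleness you get when doing the substitution by hand, only faster.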

Caveat: tax forms, medical records, and contracts have so many entangled identifiers that doing this by hand is brittle. For those, the next options are stronger.

3. Pick a tool with stronger privacy defaults

Different tiers of the same product have different rules. The free consumer chat is usually the most permissive about retention and training. The paid API is usually the strictest. In between are options like ChatGPT’s temporary chat mode, Claude’s training opt-out, and the various enterprise plans.

Concrete things to check, by name, on the provider’s privacy page:

Whether your data is used to train models by default, and where the toggle is.
Whether the plan includes zero retention beyond the request.
Whether human reviewers can read your messages, and under what conditions.
Whether the plan supports compliance requirements such as HIPAA or SOC 2, if your work needs them.

For most consumer use, switching from the free tier to a paid plan and turning off training closes the largest gap. It is not perfect, but it is a noticeable step.

4. Run a model on your own machine

If a request truly does not need a frontier-scale model, a local one is the only setup where the data physically does not leave your computer. Options as of today:

Ollama and LM Studio. Download a model, run it locally, point a chat UI at it. Works on Mac, Windows, and Linux.
llama.cpp and similar runtimes. Lighter weight, command line oriented.
Apple Intelligence on recent Macs and iPhones. Some requests run on device. Apple documents which requests stay on device and which go to its Private Cloud Compute servers.
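To make the first option concrete, here is a rough sketch of sending a prompt to a locally running Ollama server from Python, using only the standard library. It assumes Ollama is installed and listening on its default address (`localhost:11434`) and that a model has already been pulled. The model name `llama3.2` is only a placeholder for whichever model you have, and the helper names `build_request` and `ask_local` are invented for this example.

```python
import json
import urllib.request

# Assumption: Ollama is running on this machine on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "llama3.2") -> urllib.request.Request:
    """Build the HTTP request without sending it. `model` is a placeholder name."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON reply instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask_local(prompt: str, model: str = "llama3.2") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local("Summarize this paragraph: ...")` should return the model's answer as a string, provided the server is up. The only network hop is to `localhost`, so the prompt stays on the machine.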

Trade-off: local models are smaller and slower than the hosted ones. They are usually fine for summarization, rewriting, casual coding help, and routine drafting. They struggle with the hardest reasoning tasks. If your prompt would only work on a frontier model, a local model will not save the day.

5. Redact the file before pasting it

Sometimes the document itself has to go in. A contract, a tax form, a customer list, a transcript. The whole point is that the model needs the structure of the document, but the model does not need to know whose document it is.

This is where redaction helps. The idea is simple: before pasting, replace the names, addresses, account numbers, and dates of birth with blanks or stand-ins. The model still reads a contract. It just reads a contract about no one in particular.

This is what Total Redact is built for. It runs entirely on your Mac, scans documents for names, SSNs, phone numbers, emails, addresses, dates of birth, and credit cards, and lets you keep a personal watchlist of values you always want caught. Other tools cover similar ground; some are free, some are server-based. The right tool is the one that runs where you want your data to stay.
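For a sense of what pattern-based redaction involves, here is a deliberately small Python sketch. It is not how Total Redact or any other product works. It covers only three US-style patterns plus a watchlist of exact values, and the regular expressions, the `redact` function, and the sample text are all illustrative assumptions. Names, addresses, and dates of birth need more than regular expressions, which is part of why dedicated tools exist.

```python
import re

# Illustrative patterns only; real documents need broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\+1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    ),
}

def redact(text: str, watchlist: tuple[str, ...] = ()) -> str:
    """Blank out watchlist values first, then anything matching a pattern."""
    # Watchlist: exact values you always want caught, longest first.
    for value in sorted(watchlist, key=len, reverse=True):
        text = re.sub(re.escape(value), "[REDACTED]", text, flags=re.IGNORECASE)
    # Patterns run in the order listed above.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "Contact Sarah Patel at sarah.patel@example.com or 415-555-0134. SSN 123-45-6789."
print(redact(sample, watchlist=("Sarah Patel",)))
# Contact [REDACTED] at [EMAIL] or [PHONE]. SSN [SSN].
```

Labelled blanks such as `[EMAIL]` keep the structure of the document readable to the model, which is the point of redacting instead of deleting.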

Redaction is the right answer when you have a finished document. It is overkill for a one line question.

6. Keep work data on work tools

If your employer pays for Copilot, Claude Enterprise, ChatGPT Enterprise, or a similar plan, the privacy story is usually meaningfully better than the consumer free tier. Enterprise plans typically come with no training on inputs, stricter retention, audit logs, and a contract that imposes liability on the provider.

The cost of using your personal account at work is that you bypass all of that. The cost of using your work account at home is that your employer may be able to read what you typed. Pick the right account for the situation. They are different products with different guarantees.

The bigger picture

LLM conversations are one corner of a larger problem. Browsers leak data through trackers and fingerprinting. Phones leak data through advertising IDs and third-party SDKs. Cloud sync turns your local files into someone else’s database. Smart speakers, fitness trackers, connected cars, and identity providers each contribute their own slice. None of these is solved by being careful with chat prompts.

What chat prompts have over those other channels is that you can see exactly what is going out, in plain text, before you press send. That visibility is the most useful tool you have. Use it.

One-sentence summary: the cheapest privacy is the data that never leaves your device, and the next cheapest is the data you decided not to type.

Total Redact is a free Mac app that handles the case where the document has to go but the names do not.