What does the AI see when you press 'generate'?

If you have spent time evaluating AI scribes for psychology, you will have felt this. Every product page carries the same set of reassurances: the service is UK-hosted, ICO-registered, UK GDPR compliant, and contractually opted out of training on your content. All of these are mostly true, and all of them will be repeated to you in slightly different orders on every sales call you take. Almost without exception, they miss the question that decides whether the thing in front of you is something you can responsibly use.

The question is small, and it lives behind all the others. When the AI receives the request, what is actually inside it?

Most of the AI tools you will have looked at, including the popular ones, work like this:

Audio (or text) goes to the vendor's servers.
The vendor transcribes it.
The vendor sends the transcript and any structured fields to an AI provider.
The AI provider generates the draft.
The draft comes back, the vendor stores it, and you see it on your screen.

The transcript and the form fields contain everything you said about the client: their name, their NHS number, where they live, the names of family members, what they disclosed, and what you are working on. The vendor holds all of it, and the AI provider sees all of it for the duration of the request. The contractual no-training agreements (which you should absolutely have) do the heavy lifting on what happens next.

I want to be fair about this. It is a defensible architecture, and it is the most common one in this industry and in software generally. Encryption, residency, audit logs, and contracts are real protections, and the vendors building serious products in this space are not careless about them. For most consumer software this set of safeguards is exactly the right amount. For clinical work it has a particular shape: the safety of your clients' names depends on every link in the chain holding. Vendor breaches, AI-provider breaches, court orders against either, misconfigured employee access, and bugs that log the wrong thing into the wrong system can each rearrange what you thought you had agreed to.

Each of those is a real category of incident rather than a hypothetical one. Capita's 2023 cyber incident exposed personal data, including from the NHS pension scheme, and in the same year OpenAI publicly disclosed a caching bug that briefly exposed ChatGPT conversation titles between users. Both companies had serious security teams, and both had encryption, contracts, and audit logs. The trouble in each case was not that the safeguards failed but that there was data to protect in the first place.

A different shape

I built Cogent around a simpler intuition, which is that the safest thing you can do with a client's name is never send it to the AI in the first place, and once I had sat with that thought long enough the rest of it mostly followed.

In practice it comes down to three commitments, each of which a DPO can verify in the product itself rather than take on trust.

The names are removed before the request reaches the AI. Cogent spots the identifying details in what you have typed (the kinds of things a regulator would expect to be spotted), shows you what it found, and replaces them with placeholders in your browser. Sara becomes [PERSON_1], and the model that drafts the note never sees real names, because Cogent never sends them on the drafting path. Live session audio is the one exception: it streams to our EU transcription provider before any masking is possible, so the raw audio may contain identifiable speech, and only the returned text is then de-identified.
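To make the placeholder step concrete, here is a minimal sketch in Python. It is an illustration of the idea, not Cogent's code: the real masking runs in the browser and its detection method is not described in this post. The function name `pseudonymise`, the supplied list of known names, the `[NHS_NUMBER_n]` placeholder, and the 10-digit pattern are all assumptions made for the example, and the note text is made up.

```python
import re

# Assumption: a 10-digit shape (optionally spaced 3-3-4) stands in for NHS numbers.
NHS_NUMBER_SHAPE = re.compile(r"\b\d{3} ?\d{3} ?\d{4}\b")


def pseudonymise(text, known_names):
    """Mask known names and NHS-number-shaped digits.

    Returns (masked_text, mapping), where mapping is placeholder -> original.
    """
    mapping = {}
    masked = text

    # Longest names first, so a full name is masked before any shorter name inside it.
    ordered = sorted(set(known_names), key=lambda name: (-len(name), name))
    person_count = 0
    for name in ordered:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        if pattern.search(masked):
            person_count += 1
            placeholder = f"[PERSON_{person_count}]"
            masked = pattern.sub(placeholder, masked)
            mapping[placeholder] = name

    numbers_seen = {}

    def mask_number(match):
        # Reuse the same placeholder when the same number appears twice.
        value = match.group(0)
        if value not in numbers_seen:
            placeholder = f"[NHS_NUMBER_{len(numbers_seen) + 1}]"
            numbers_seen[value] = placeholder
            mapping[placeholder] = value
        return numbers_seen[value]

    masked = NHS_NUMBER_SHAPE.sub(mask_number, masked)
    return masked, mapping


# Hypothetical note text and identifiers, for illustration only.
note = "Sara said her brother Tom drove her. NHS number 943 476 5919."
masked_note, name_map = pseudonymise(note, ["Sara", "Tom"])
print(masked_note)
```

Only `masked_note` would go out for drafting in this sketch; `name_map` is the piece that has to be stored and protected separately.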

The mapping back to the real names is held encrypted at rest on Cogent's UK servers. Rather than living on a single device, the mapping is stored under an encryption key on UK infrastructure with access tightly controlled and logged, so it follows you across your devices and a forgotten password is an ordinary email reset rather than something that loses you the names.

What the AI receives is the placeholder version. The text that goes out for drafting carries placeholders rather than client names, and when you open the returned draft your browser puts the names back in for display, so the drafting model and its sub-processors only ever work with pseudonymised content.
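The display step is the mirror image of the masking step. The sketch below assumes placeholders of the form `[PERSON_1]` and a placeholder-to-name mapping held in a plain dictionary; `restore_names` and the sample draft are hypothetical names and data, not Cogent's implementation.

```python
import re

# Assumption: placeholders look like [PERSON_1] or [NHS_NUMBER_2].
PLACEHOLDER = re.compile(r"\[[A-Z_]+_\d+\]")


def restore_names(draft, mapping):
    """Swap placeholders in a returned draft for the originals, for display only.

    Placeholders with no entry in the mapping are left as they are.
    """
    return PLACEHOLDER.sub(lambda m: mapping.get(m.group(0), m.group(0)), draft)


# Hypothetical draft as it might come back from the drafting model.
draft = "[PERSON_1] reported improved sleep; [PERSON_2] attended."
print(restore_names(draft, {"[PERSON_1]": "Sara", "[PERSON_2]": "Tom"}))
```

Doing the substitution in a single pass means restored text is never re-scanned, so a name that happens to resemble a placeholder cannot trigger a second replacement.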

Run the same incident catalogue against an architecture shaped this way and the failure modes change. They do not go away, because nothing does, but the most exposed link, the AI provider and its sub-processors, only ever holds pseudonymised content, so the place a client's name is most likely to travel and be copied is the place it is never sent.

An AI-provider breach reveals: placeholder content only, processed for the draft and never the real names.
A court order to the AI provider returns: the same placeholder content, because the names were never sent.
A breach of Cogent's own servers reaches: the mapping and the records, which sit encrypted at rest under a managed key with access tightly controlled and logged, rather than client names in plain text.
An employee access incident faces: least-privilege controls and an audit log, with access to stored content limited and recorded rather than open.

What we trade for this

I am not going to pretend this comes free. Some things are harder under this architecture, and one of the things it rules out is genuinely useful, so let me name them honestly.

Cross-client analysis is impossible. The AI cannot see what you wrote about anyone else, which means a "summarise themes across my caseload" feature is not something I can build without breaking the wall. I have deliberately chosen not to build it, because the wall is the point of the product.

Detection has to be near-perfect. A missed identifier is a real one going to the AI, and while Cogent catches the categories that matter and the confirm-before-send step is the safety net underneath, any missed identifier is still treated as the most serious kind of bug, and DPOs reviewing Cogent for procurement get the detail under NDA.

You are trusting Cogent's encryption at rest, not pure on-device secrecy. Because the mapping and the records are held encrypted on Cogent's UK servers rather than locked to one machine, you can move between devices and a forgotten password is just an email reset. The honest cost of that convenience is that the protected names sit on infrastructure Cogent runs. You are relying on encryption at rest, least-privilege access, and an audit log rather than on the names simply not existing anywhere Cogent can reach, and Cogent can technically decrypt stored content under those controls. The line that does not move is the one that matters most: the AI still never sees the names, so the part of the chain most prone to copying, training, and onward transfer is the part the names are never handed to.

Some compute happens in the browser. A few practitioners on slower laptops notice a half-second pause when a long note is being scanned, and while I tune for that, it is not free.

Why we made this trade

When I started thinking about how a tool like this should work, the first question kept arriving and re-arriving, and it was not about features but about what the thing being shipped would look like under the worst day of the year.

For most of the architectures already in the market, the worst day looks like a vendor incident or a court order, and the response is "we did everything reasonable, the contract held up, the encryption held up, no harm done", which can be honestly true and still leave you, the practitioner, explaining to a client why their name was sitting at a vendor whose name the client does not know.

For Cogent, the worst day looks like the same incident, but the response is that the names were never sent to the AI in the first place, so the part of the chain most likely to copy or transfer them never received them, and what Cogent itself holds sits encrypted at rest under a managed key with access tightly controlled and logged, which the audit trail records, the DPO can confirm, and the HCPC, if asked to look, can see for themselves.

That is the reason for the whole architecture. Everything else, including the modality-specific drafting, the living formulation, and the pre-supervision briefs, is what gets built on top of it.

How to verify any of this

You should not have to take Cogent's word for it, so here is how to check independently.

Open the developer tools in your browser and watch the Network tab when you press 'generate'. The text body of the request will contain placeholders rather than names; if it does not, that is a bug worth hearing about.
Read the privacy policy and the Data Processing Agreement, which set out the same point in legal language.
Read the AI principles, which set it out in plain language.
Ask the DPO at your organisation to get in touch; they will know what to ask for, and the answers will be supplied.
Check the Trust Centre, which lists every outside service Cogent uses, where each one lives, and what each one receives.
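If you would rather not eyeball the Network tab, the same check can be scripted. The sketch below assumes you have copied a request body out of the developer tools as text and that you know which identifiers you typed; `find_leaks` and the sample bodies are hypothetical and are not part of any Cogent tooling.

```python
import re


def find_leaks(request_body, identifiers):
    """Return the identifiers that appear as whole words in a captured request body.

    Matching is case-insensitive, so "sara" counts as a leak of "Sara".
    """
    leaks = []
    for identifier in identifiers:
        pattern = re.compile(r"\b" + re.escape(identifier) + r"\b", re.IGNORECASE)
        if pattern.search(request_body):
            leaks.append(identifier)
    return leaks


# Hypothetical request bodies, pasted from the Network tab for illustration.
clean_body = '{"text": "[PERSON_1] said her brother [PERSON_2] drove her."}'
leaky_body = '{"text": "[PERSON_1] will see Sara tomorrow."}'

print(find_leaks(clean_body, ["Sara", "Tom"]))
print(find_leaks(leaky_body, ["Sara", "Tom"]))
```

An empty result only shows that the identifiers you listed are absent from that one request, so it is a spot check rather than proof. Whole-word matching keeps "Tom" from being flagged inside "tomorrow".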

Cogent does not name competitors in posts like this, because several products in this category do good work under different threat models and the contractual safeguards they apply are not nothing. Wherever possible, the case worth making is the one for this architecture rather than the one against other people's products.

Try it on this week's work.