Friday, December 12, 2025

AI Assisted Patient Appointment Traceability

The following scenario is just an example of AI use and of AI Transparency impact. The intent of the use-case is to show that wherever AI gets engaged in patient care, attribution to the AI needs to be clearly indicated. I am not endorsing AI use specifically in patient appointments; I am using it as a representative interaction for the purpose of showing Provenance and thus accountability for AI use.

  1. Patient provides lab test specimens prior to appointment.
  2. AI analyzes lab test results along with patient history.
  3. Patient appointment with Doctor considering AI report.
  4. Patient care improved by AI.

Detailed Steps

  1. Patient is scheduled for a routine check-up appointment.
  2. Patient had provided specimens for lab tests prior to the appointment.
  3. On the day of the appointment, an AI is called to analyze the lab test results.
  4. The AI considers the lab test results in relation to prior lab test results, current conditions, current medications, and family medical history.
  5. The AI generates a summary report highlighting any abnormalities or areas of concern.
  6. The AI summary report includes various actions that could be recommended based on the analysis.
  7. During the appointment, the healthcare provider reviews the AI-generated report with the patient.
  8. The healthcare provider discusses any abnormalities or concerns identified in the report.
  9. The healthcare provider considers the recommendations from the AI generated report and recommends further tests or lifestyle changes if necessary.
  10. The patient is given an opportunity to ask questions and discuss their health.
  11. The appointment concludes with a follow-up plan, if needed, and scheduling of the next routine check-up.
  12. The AI-generated report is stored in the patient's medical records for future reference (see Patient AI Summary below).
  13. The healthcare provider documents the appointment details and any recommendations made.
  14. The patient receives a summary of the appointment and any next steps via their patient portal.

Patient AI Summary

This document outlines the steps involved in a typical patient appointment for a routine check-up, including the integration of AI analysis for lab test results and AI recommendations.

In this case, since the Patient AI Summary is generated by the AI, the author of the document is the AI system itself. The document may also be tagged with metadata indicating that it was AI-generated.

The summary would itemize the history, conditions, medications, lab results, and family history that the AI considered in its analysis. It would indicate the new lab test results that were analyzed in the context of prior lab test results and the patient's overall medical history. It would include citations to medical knowledge bases or guidelines that the AI used to inform its analysis and recommendations.

The recommendations would each include a rationale, linking to evidence from the patient's data and relevant medical literature. There would be discussion of benefits, risks, and side effects.

AI Provenance

Provenance information about the AI analysis is recorded to ensure transparency and accountability. This includes details such as the AI model version, data sources used for analysis, and any relevant parameters or settings applied during the analysis process.
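As a sketch of how that might look in FHIR (expressed here as a Python dict mirroring the JSON), a Provenance could name the AI Device as the author and list the data and prompt it used. All ids, references, and version values below are hypothetical placeholders, not anything the IG mandates.

```python
import json

# Hypothetical Provenance for the AI-generated lab summary (FHIR R4).
# All ids, references, and display values are made-up placeholders.
ai_provenance = {
    "resourceType": "Provenance",
    "target": [{"reference": "DocumentReference/patient-ai-summary-123"}],
    "recorded": "2025-12-12T09:30:00Z",
    "agent": [{
        # the AI system that authored the summary
        "type": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/provenance-participant-type",
            "code": "author"}]},
        "who": {"reference": "Device/ai-lab-analyzer-v2",
                "display": "LabAnalyzer model 2.3.1"}
    }],
    "entity": [
        # data the AI actually used, and the prompt/settings applied
        {"role": "source", "what": {"reference": "Observation/lab-result-456"}},
        {"role": "source", "what": {"reference": "DocumentReference/ai-prompt-template-7"}}
    ]
}
print(json.dumps(ai_provenance, indent=2))
```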


Audit the AI

An audit trail is maintained to track the AI's analysis process, ensuring that all steps taken by the AI are documented for future reference. This includes logging the input data, analysis steps, and output results. This is different from Provenance in that it records the searches the AI made into the patient medical record to gather information for its analysis. The audit record of a search typically includes the search request parameters, but not the response to the search request; an audit analysis would re-run the search to determine what was returned. For example, a broad search on a patient record would return all medical history. The AI would likely not process the parts of that history it determines to be not relevant, such as resolved conditions, healed broken bones, or prior medications no longer being taken, and that data would not appear in the AI Provenance as data used by the AI analysis. The AI may appropriately pull all historic medical data, as there may be some relevant data in the historic record, and the AI can quickly determine what is relevant and what is not. The Audit would include the search of the full medical history, while the Provenance would only include the relevant data used by the AI.
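A sketch of one such audit entry for the broad patient-record search, using FHIR R4 AuditEvent, where the search parameters are carried base64-encoded in entity.query and the response is not recorded. The specific references and query are illustrative assumptions.

```python
import base64
import json

# Hypothetical AuditEvent for the AI's broad search of the patient record.
# The search response is NOT recorded; an auditor would re-run the query.
query = b"Observation?patient=Patient/789&category=laboratory&_sort=-date"
audit_event = {
    "resourceType": "AuditEvent",
    "type": {"system": "http://terminology.hl7.org/CodeSystem/audit-event-type",
             "code": "rest"},
    "subtype": [{"system": "http://hl7.org/fhir/restful-interaction",
                 "code": "search-type"}],
    "action": "E",
    "recorded": "2025-12-12T09:25:00Z",
    "outcome": "0",
    "agent": [{"who": {"reference": "Device/ai-lab-analyzer-v2"},
               "requestor": True}],
    "source": {"observer": {"reference": "Device/ehr-fhir-server"}},
    "entity": [
        {"what": {"reference": "Patient/789"}},
        # the search parameters, base64-encoded per AuditEvent.entity.query
        {"query": base64.b64encode(query).decode("ascii")}
    ]
}
print(json.dumps(audit_event, indent=2))
```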

The Audit would include an independent Audit entry for the creation of the Patient AI Summary document itself. This might include the data used, depending on the configuration of the audit system.

If there is some business rule, or privacy consent restriction, that would prevent the AI from accessing certain data in the patient record, the Audit would include the access control denial.

The Audit log would cover everything found in the Provenance, but would be less succinct.

Encounter Documentation

The healthcare provider documents the appointment details, including any findings from the AI report and recommendations made during the consultation.

The writing of this documentation may also be assisted by AI, which can help summarize the key points discussed during the appointment and ensure that all relevant information is accurately recorded in the patient's medical record. This is a different use of AI from the above, and has different inputs and outputs. This documentation would be authored by the Doctor, with assistance from the AI. Thus there is another Provenance indicating the AI assistance in documentation, with authorship attribution to the Doctor.
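A sketch of that second Provenance, with the Doctor as author and the AI Device as an assembler agent; the references are hypothetical.

```python
import json

# Hypothetical Provenance for AI-assisted encounter documentation:
# the Doctor is the author, the AI is only an assisting/assembling agent.
note_provenance = {
    "resourceType": "Provenance",
    "target": [{"reference": "Composition/encounter-note-321"}],
    "recorded": "2025-12-12T11:05:00Z",
    "agent": [
        {"type": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/provenance-participant-type",
            "code": "author"}]},
         "who": {"reference": "Practitioner/dr-smith"}},
        {"type": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/provenance-participant-type",
            "code": "assembler"}]},
         "who": {"reference": "Device/ai-scribe-v1"}}
    ]
}
print(json.dumps(note_provenance, indent=2))
```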

Patient Summary

The patient receives the summary of the appointment, including any next steps or recommendations, via their patient portal for easy access and reference.

AI slop remediation

Now imagine that the healthcare providing organization has learned that the AI model they were using makes specific mistakes with specific kinds of lab results. The organization can find all of the Provenance attributed to that AI Model, thus the subset of outputs that that AI Model influenced. They could further find those Provenance that have a .entity relationship with a given AI Prompt known to have produced poor results, so they now have the subset of instances where the AI was used with the defective AI Prompt. They can then review those outputs and determine if any patient care was negatively impacted. If so, they can reach out to those patients to remediate the situation. This is an example of how Provenance enables accountability for AI use in healthcare.
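As a sketch, assuming the suspect model and prompt are represented by the hypothetical Device and DocumentReference ids used earlier, the standard Provenance agent and entity search parameters could pull back the affected outputs:

```python
import requests

FHIR_BASE = "https://ehr.example.org/fhir"  # hypothetical endpoint

# Find every Provenance in which the suspect AI model was an agent AND the
# defective prompt was an entity; the referenced targets are the outputs
# that need clinical review.
resp = requests.get(
    f"{FHIR_BASE}/Provenance",
    params={
        "agent": "Device/ai-lab-analyzer-v2",
        "entity": "DocumentReference/ai-prompt-template-7",
        "_include": "Provenance:target",
    },
    headers={"Accept": "application/fhir+json"},
)
bundle = resp.json()
targets = [e["resource"] for e in bundle.get("entry", [])]
print(f"{len(targets)} resources to review")
```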

New AI software, Models, and Prompts

When new AI software, models, or prompts are introduced, the healthcare organization can track their adoption and usage through Provenance records. This allows them to monitor the performance and impact of the new AI tools on patient care. If any issues arise, they can quickly identify which AI tools were involved and take appropriate action to address any concerns. This ongoing monitoring and accountability help ensure that AI integration in healthcare continues to benefit patients while minimizing risks.

The change would be represented in a new Device resource representing the new AI software or model, and if there is a configured prompt, it would also be represented in the Device resource.

The Provenance records for AI analyses would then reference the new Device resource as the .agent, allowing for clear tracking of which AI tools were used in each analysis.
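A sketch of such a Device; how the configured prompt is carried (here as a note) is an assumption on my part, since the IG would define the actual element or extension.

```python
import json

# Hypothetical Device representing a newly deployed AI model. Provenance
# for new analyses would reference Device/ai-lab-analyzer-v3 in agent.who.
ai_device = {
    "resourceType": "Device",
    "id": "ai-lab-analyzer-v3",
    "status": "active",
    "deviceName": [{"name": "LabAnalyzer", "type": "model-name"}],
    "version": [{"value": "3.0.0"}],
    # Carrying the configured prompt as a note is an assumption, not
    # something the IG necessarily prescribes.
    "note": [{"text": "Configured prompt: summarize new lab results against "
                      "prior results, conditions, medications, family history."}]
}
print(json.dumps(ai_device, indent=2))
```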

Conclusion

The AI Transparency IG includes standards for recording that data was influenced by AI. The IG does not try to control how AI is used or restrict how the AI Transparency records are used. The examples given in the guide are very focused on minimal expression for illustrative purposes. I try here to express a more realistic use-case so as to drive a clearer understanding of the benefit of AI Transparency.


Wednesday, December 10, 2025

Controlling AI in Healthcare

AI must be controlled. That is to say that AI accessing data and making data is a privileged activity. It is not uncommon during the early days of a new technology for that technology to be uncontrolled, and it is not uncommon for Security to be seen as an afterthought. There are three specific moments when AI needs to be controlled:

  1. when the AI is trained on a dataset, 
  2. when the AI is used to make treatment decisions (e.g. on a given Patient),
  3. when the AI is used to make payment decisions (e.g., on a given Patient)

Teaching

Teaching an AI/ML/LLM with a dataset needs to be controlled to prevent ingestion of data that is not authorized to be used for this purpose. With this use-case, HL7 has identified a specific PurposeOfUse that would be used to indicate this teaching/training purpose - MLTRAINING. With this code a few things can be done:


When training is performed, the authorization request is for the MLTRAINING PurposeOfUse. Thus, the access control will either permit or deny such a PurposeOfUse, and the authorization would be audited as such. This PurposeOfUse would not be given to an Agent that is not authorized to use this PurposeOfUse. Thus, this PurposeOfUse can't be used by other actors.

A Dataset can be marked as forbidden for the MLTRAINING PurposeOfUse, which would make that Dataset unavailable for training. This, in theory, could be done down to the individual data artifact.
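As a rough sketch of the enforcement logic (not anything the standards dictate), an access-control gate would compare the requested PurposeOfUse against both the agent's authorization and any restriction on the dataset; the "forbid" semantics here are an illustrative assumption.

```python
# Hypothetical policy-enforcement sketch for the MLTRAINING PurposeOfUse.
# The label semantics and the gate itself are illustrative assumptions.
MLTRAINING = "MLTRAINING"

def is_access_permitted(agent_purposes, requested_purpose, dataset_forbidden_purposes):
    """Permit only if the agent holds the purpose and the dataset allows it."""
    if requested_purpose not in agent_purposes:
        return False            # agent was never authorized for this purpose
    if requested_purpose in dataset_forbidden_purposes:
        return False            # dataset explicitly excluded from this purpose
    return True

# An AI training pipeline authorized for MLTRAINING, against a dataset
# whose custodian has marked it as off-limits for training:
print(is_access_permitted({MLTRAINING}, MLTRAINING, {MLTRAINING}))   # False
# The same request against an unrestricted dataset:
print(is_access_permitted({MLTRAINING}, MLTRAINING, set()))          # True
```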

There is a standard in the general AI world that I helped create to tag datasets with Provenance and Authorizations including the license that would need to be followed if the data are to be ingested by an AI/ML/LLM. The Data & Trust Alliance has published this Data Provenance Standard, that is elaborated on here.

Patient based Consent on Teaching

This MLTRAINING PurposeOfUse could be leveraged in a Patient specific Consent. This would enable a Patient to indicate that they do not want THEIR data used to teach an AI. This would mean that the Access Control is more fine-grained, in that each datum pulled from the database must be checked to see whether the subject of the data (the Patient) has authorized, or has not denied, AI learning from their data.
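A sketch of such a Consent, denying the MLTRAINING PurposeOfUse; the code system for MLTRAINING is assumed to be v3-ActReason and should be confirmed against the published terminology, and the category and references are placeholders.

```python
import json

# Hypothetical FHIR R4 Consent in which the patient denies use of their
# data for AI/ML training.
consent = {
    "resourceType": "Consent",
    "status": "active",
    "scope": {"coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/consentscope",
        "code": "patient-privacy"}]},
    "category": [{"coding": [{
        "system": "http://loinc.org", "code": "59284-0"}]}],
    "patient": {"reference": "Patient/789"},
    "provision": {
        "type": "deny",
        # Code system for MLTRAINING assumed to be v3-ActReason.
        "purpose": [{
            "system": "http://terminology.hl7.org/CodeSystem/v3-ActReason",
            "code": "MLTRAINING"}]
    }
}
print(json.dumps(consent, indent=2))
```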

Treatment Decisions

There are other PurposeOfUse codes for when the AI is used during treatment (TREATDS) or payment (PMTDS) decisions. These PurposeOfUse codes are specific to the outcome, and are therefore distinct so that business rules or Patient Consent can allow one but not the other. They would otherwise work rather similarly.

The most likely use-case is one where Patients get to indicate that they do or do-not want AI used in making Clinical Decisions (or Payment Decisions). This is diagrammed below, where each Patient has a Consent with a term around the TREATDS PurposeOfUse of go or no-go; that Consent is used by the AI System authorization to decide whether the AI may make decisions, and thus look at historic patient data.
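The treatment-decision case would look much the same as the Consent sketched above; here is just the provision, with the same code-system assumption, permitting rather than denying:

```python
# Hypothetical Consent.provision granting (go) AI participation in clinical
# decisions; use "deny" for the no-go case. Code system is assumed.
treatds_provision = {
    "type": "permit",
    "purpose": [{
        "system": "http://terminology.hl7.org/CodeSystem/v3-ActReason",
        "code": "TREATDS"}]
}
print(treatds_provision)
```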

Conclusion

These PurposeOfUse codes are already defined for these purposes. There may be other PurposeOfUse codes that need to be defined; this is a good exercise for discussion. The above scenarios are also not the only ones, and indeed these scenarios might not be the most likely or most useful ones. My point in this article is to show that we (Security WG) have done some thinking and developed some standards codes.



Healthcare AI Transparency ballot

Healthcare use of AI needs to be Transparent, clearly labeling and attributing when patient data was created or influenced by AI. This is the goal of a new Implementation Guide going to HL7 Ballot really soon. This Implementation Guide will also be the focus of an HL7 FHIR Connectathon testing track in January.

The guide is designed for health IT developers, clinicians and institutions that use AI (including generative AI or large language models) to generate or process health data. It provides a common format so downstream systems and human users can see what data came from AI — when, how, and by which algorithm. This helps them judge whether AI-derived data are reliable, appropriate, or need further review.

Key features include:
  • Tags or flags on FHIR resources (or individual data elements) to mark AI involvement.
  • Metadata about the AI tool: model name and version, timestamps, confidence or uncertainty scores.
  • Documentation of human oversight (for example, whether a clinician reviewed or modified AI outputs).
  • Traceability: which inputs (e.g., clinical note, image, lab result) were fed to the AI, and how outputs were used to produce or update health data.

For stakeholders — such as patients, clinicians, and health-system administrators — the main benefit is transparency. Users can tell whether data was AI-generated or human-authored, which supports trust, safety, and informed use of AI in care.
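As a sketch of what such tagging might look like on a resource instance, here is a lab-summary DocumentReference carrying an AI security label and some model metadata in the title; the label system and code are placeholders, since the actual codes are whatever the IG defines.

```python
import json

# Hypothetical DocumentReference fragment tagged as AI-generated.
# The security-label system/code below are placeholders, not the IG's codes.
ai_tagged = {
    "resourceType": "DocumentReference",
    "meta": {
        "security": [{
            "system": "http://example.org/CodeSystem/ai-transparency",  # placeholder
            "code": "AI-GENERATED",
            "display": "Content generated by an AI system"}]
    },
    "status": "current",
    "subject": {"reference": "Patient/789"},
    "content": [{"attachment": {
        "contentType": "text/plain",
        "title": "AI lab summary (model 2.3.1, 2025-12-12)"}}]
}
print(json.dumps(ai_tagged, indent=2))
```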

And when the AI model or prompt is found to produce unsafe recommendations, these transparency indications can be used to find potentially affected data that can then be reexamined.

AI will be used, and attribution to that use will help us deal with the data in the future.

Monday, October 20, 2025

Age Verification is much more important than porn

There is much talk nowadays, driven by some regulations around the globe, of a need on the internet for services to know a user's age. The main one that comes up in the discussion is protecting children from accidentally seeing porn. This use-case is hiding a much more important problem that must be solved at the same time. The porn problem is rather easy to argue as a universal "good" use-case. Not many will be able to argue against this use-case from any perspective. Thus, it is used to hammer a solution into existence. But once that solution exists, it will be used for many use-cases that are not as universally "good". Meaning it will be used by some governments against small groups that have much less leverage than the porn industry has.

Parent solution:

Many solutions that are being proposed today have 'the parent' indicate their children's 'age'. This seems like a good solution for a while, but who proves that that individual is 'a parent' and specifically 'the parent of that child'? These solutions are trying to build a sound logic upon ground that is not solid.

What is Age limited

Porn is easy to identify as a problem, and as I have said above it is easy to agree on. One might add some topics like online gambling as easy to identify and universally agreed to.

In the physical world we have access to Alcohol, Tobacco, Vaping, and other drugs; along with Driving, Voting, Military Service, Credit Cards, Car Rental, and even solo travel. In the physical world these are controlled at the source, where the item or service is dispensed.

In the mixed physical and virtual world, we somewhat have a history (mostly failed) with Movies, Music, and Video Games. It can be argued that these were early efforts, and that with age verification they would be more effectively controlled. These are all, like porn, rather universally agreed to.

Problematic Age Limited

Less clear are other information (internet) topics that "some" people consider should be "age limited". Who are these "some" people, and what criteria are they using to determine what is "age limited"? I am sure many of the things beyond porn will NOT be universally agreed to. Which means that in one location topic ABC is age limited, and in another area it is not. Some of these topics are deep/heavy topics, like abortion, while others are stigmatizing topics that appear to be simply embarrassing. But all of them can be leveraged to great harm by governments, parents, spouses, peers, and bullies.

- Abortion (information, consulting, or services)
- Sexual Health
- Self-harm
- Addiction
- Trauma
- Telehealth
- Weight advice
- LGBTQ+
- sex education and reproductive health
- domestic violence, sexual assault
- emotional abuse
- child abuse or neglect
- homelessness
- poverty
- ADHD
- chronic pain
- autoimmune disorders
- emancipation or foster care
- etc...

The problem is not that these information topics exist, but rather that anyone seeking this information must provide age verification; and the government must NOT be able to determine who has tried to gain access to this information.

Note that someone might be simply intellectually curious, or doing research for school, or helping out a friend. But because they search a topic, they will be vulnerable to being discovered as having been interested. Being interested should not be a crime, even in government regions where the act is a crime.

Age Verification Service

There is good discussion going on about the design and standardization of these services. The discussion more broadly is mostly about how those that provide an "age limited" service want to use an "age verification" service so that they don't have to do this difficult task. This is a good topic to discuss, as doing this wrong is easy and exposing individual privacy is common.

What is not discussed broadly, but I have confidence that in the standards this is discussed, is how the "age verification" service must also be isolated from knowing WHY the age assertion was requested. This is to say that the "age verification" service can't become the thing that a government can subpoena to turn over records so that the government can know the individuals that have been seeking "abortion" information (for example). 

The governments will want to be able to do this subpoena, so they are not going to be pointing out this privacy problem. Much like they want encryption backdoors, they want backdoors to age verification.

Thus, the solution must be blinded BOTH directions; this is what makes it so much harder.

The Age Verification Service must not have an audit trail. None at all. It is far better for it to have failed "open" (allowing access when it should have been forbidden) than for the whole service to expose the whole population that it serves. Privacy Principles must be prime.

Age Verification Service problem

The App stores, like Apple and Google, are being challenged to provide these Age-Verification services. If they focus on the easy use-cases they will not see the hard problems. I hope that they are not blind. Once we have a solution, however flawed it is, it will be used everywhere.



Monday, October 13, 2025

Modern view on Pseudonymization

For years, the terms 'anonymization' and 'pseudonymization' described distinct technical methods for de-identifying data. But if you're still thinking of them that way, you might be behind the times. Driven by regulations like GDPR and court decisions, the focus has shifted from pseudonymization as a method to pseudonymized as a description of the dataset itself. The key is who possesses the re-identification method. This subtle change has profound implications.

Ten years ago, I worked on the De-Identification Handbook with IHE and also on the Health Informatics Pseudonymization standard within ISO. At that time the concept of de-identification was broken down into two kinds: there was "anonymization" and there was "pseudonymization".

Anonymization had no way to reverse, while pseudonymization had some mechanism for reversing the pseudonymization. At the time these were seen as methods, not as descriptions of the resulting dataset. These methods would be used to define how data would be de-identified. The resulting dataset would then be analyzed for its risk of re-identification, a risk that would be inclusive of risks relative to the pseudonymization methodology.

Today IHE is working on updating the De-Identification handbook. I'm no longer working on that project due to my employment situation. But while I was still working on it, the other subject matter experts were insisting on a very different meaning behind the words "pseudonymization" and "anonymization".

The following podcast by Ulrich Baumgartner really opened my eyes to how these words got a different meaning. They got a different meaning because they are used in a different contextual way. Whereas before the words were used purely as explanations of methodologies, they are today more dominantly used as words to describe a dataset that has either been pseudonymized or fully anonymized.

[The Privacy Advisor Podcast] Personal data defined? Ulrich Baumgartner on the implications of the CJEU's SRB ruling #thePrivacyAdvisorPodcast https://podcastaddict.com/the-privacy-advisor-podcast/episode/208363881




Today, because of GDPR, there is a bigger focus on the dataset than on the methodology. GDPR sees "pseudonymized" as a word describing a dataset that has only been pseudonymized but is still in the hands of the organization that possesses the methodology to re-identify. This is contextual: that dataset is in the hands of an organization that has the ability to undo the pseudonymization, and therefore the data are NOT de-identified. The data become de-identified when the re-identification mechanism is broken, that is to say when the dataset is passed to another party while the re-identification mechanism is NOT passed to that party.

This is the key point that is adding clarity to me. To me, the organization that is using pseudonymization is preparing a dataset to give to someone else; the first party organization already has the fully identified data, thus the pseudonymized data is not something they intend to operate on. It is the NEXT party, the data processor, that gets the dataset and does NOT get the re-identification mechanism. It is this NEXT party that now has de-identified data. 

I now do understand the new diagram, as there was a diagram that was drawing a distinction between Identified data and Anonymized data, with the transition of data from Fully-Identified -> Pseudonymized -> Anonymized. I saw this diagram and it did not align with the original methodology perspective, but it does follow from this contextual/relative perspective.

Overall, this understanding is consistent with the original "methodology" meaning of the words, but for some reason the GDPR courts needed to say it out loud that the FIRST organization doesn't get the benefit of de-identification until they pass the data to the NEXT organization.

There are some arguments within the GDPR community as to whether it is ever possible to make anonymous data out of pseudonymous data. This is because there is SOME organization that does have access to the re-identification mechanism. As long as someone has that ability, then some courts see the data as potentially re-identifiable. That conclusion is not wrong on the blunt fact, but it does not recognize the controls in place to prevent inappropriate use of the re-identification mechanism. The current courts do see that there is a perception of a pathway from pseudonymization to anonymization.

Pseudonymization is more like Encryption than Anonymization

The interesting emphasis at this point is that within Europe under GDPR, pseudonymization of a dataset is much like encryption of a dataset. Both encryption and pseudonymization are seen as purely methodologies of protecting data; neither is a clear methodology to gain anonymization.

Conclusion

GDPR has placed a different emphasis on pseudonymization, with the default meaning being the state where the data holder has used pseudonymization methods but still holds the re-identification key. This intermediate state was rarely emphasized in the past, as ultimately the goal of pseudonymization is to produce a dataset that can be passed to another organization that does NOT get the re-identification keys. Whereas in the past we would have said that the other organization got a pseudonymized dataset without the ability to re-identify, GDPR would now say that the other organization got an anonymized dataset.

Friday, October 10, 2025

How are complex trust networks handled in http/REST/OAuth?

 > How are http/REST authorized in complex trust networks handled? 

I don't have all the answers. This has not been worked out. I am not holding back "the" answer just waiting for someone to ask.

In XCA today we use a network of trust (SAML signer certificate authorities and TLS certificate authorities), and the network communication also goes through "trusted intermediaries".

In OAuth there are no "trusted intermediaries". The search parameters and responses are always point to point between the one requesting and the one responding. The OAuth token used in that point-to-point request/response has been the hard thing to create. OAuth has a mechanism to "discover" whom that responding service trusts; this is advertised as well-known metadata at that responding service endpoint. So the Requester queries that well-known metadata, and from that data it then needs to figure out a trust arrangement between the requesting OAuth authorities and the responder's trusted OAuth issuers.

A. Where no trusted third party is needed

The majority case, used very often today, is that the well-known OAuth metadata can be directly used by the client. The client asks that OAuth authority to create a new token, given the requester token, for authorization to access the responder system.

THIS is what everyone is doing today with client/server FHIR REST. This is the model everyone targets when getting their system to work with OAuth.

The token has some lifetime and scope, and is used for multiple request/response exchanges. Again, this is normal, and it is normal for all uses of OAuth.
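A minimal sketch of that common case, assuming a SMART-on-FHIR style well-known document and a simple client-credentials token request; the base URL, scope, and client credentials are hypothetical, and a real deployment would likely use signed client assertions rather than a shared secret.

```python
import requests

FHIR_BASE = "https://responder.example.org/fhir"   # hypothetical responder

# 1. Discover which OAuth authority the responder trusts (SMART well-known).
meta = requests.get(f"{FHIR_BASE}/.well-known/smart-configuration").json()
token_endpoint = meta["token_endpoint"]

# 2. Ask that authority for a token scoped to the responder's resources.
token = requests.post(
    token_endpoint,
    data={
        "grant_type": "client_credentials",
        "scope": "system/Patient.read",
        "client_id": "requesting-system",          # hypothetical client
        "client_secret": "not-a-real-secret",      # placeholder
    },
).json()["access_token"]

# 3. Reuse that token for many point-to-point requests until it expires.
patients = requests.get(
    f"{FHIR_BASE}/Patient",
    params={"family": "Example"},
    headers={"Authorization": f"Bearer {token}"},
).json()
```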

B. Where a trusted third party is needed

The case where the requester does not have a trust relationship with the responder-defined OAuth authority is where the hard work comes in, such as in our use-cases where the requester and responder are in different communities. As with XCA, some trust authority is needed, and as with XCA, discovering who that trust authority is becomes the job of directory services.


Ultimately the requesting system finds a trusted OAuth issuer and asks that a new token, given the requesting system's token, be generated targeting the responding system. Once this token is issued, the requester can do http/REST/FHIR directly to the responding service endpoint, using the internet for routing, with that last OAuth token. The responding system can verify that the OAuth token is valid.

In the healthcare scenario we might want to force an unusual nesting of prior tokens. In this way the responding service can record who made the request, why, and from where it came. This nesting is not typical and is considered complex to implement and parse.

see: OAuth 2.0 Token Exchange (RFC 8693)
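A minimal sketch of that exchange using the RFC 8693 request parameters; the trusted issuer's token endpoint, the audience, and the tokens themselves are hypothetical placeholders.

```python
import requests

# Hypothetical RFC 8693 token exchange: trade the requester's local token
# for one that the responding community's service will accept.
TRUSTED_ISSUER_TOKEN_ENDPOINT = "https://trust-broker.example.net/oauth/token"
local_token = "eyJ...local"    # token issued by the requester's own authority

exchanged = requests.post(
    TRUSTED_ISSUER_TOKEN_ENDPOINT,
    data={
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": local_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "audience": "https://responder-community.example.org/fhir",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
    },
).json()

# Use the exchanged token directly against the responder's FHIR endpoint.
headers = {"Authorization": f"Bearer {exchanged['access_token']}"}
```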

C. Where multiple trusted third parties are needed

I think that the (B) solution can be iterated or recursed on infinitely. 

SO:

The main point of OAuth is that you get a new OAuth token issued for a given target/scope based on the OAuth token that you have. EACH OAuth authority makes a permit or deny decision; hence an issued OAuth token is always a statement of authorization. If you were not authorized, you would not be issued a token.

In this way the authorization is established up-front, and the data transactions reuse that token until it expires. Thus, the up-front authorization may be expensive, but that token is reused 1000 times in the 60 seconds it is good for (simplified for illustration's sake).

Caveat Emptor

I have no idea if the above is right. I think it is close, but I don't know.

I welcome commentors to correct me, especially if they can point at standards profiles that have been established. Especially if these standards profiles are established in general IT, not specific to healthcare. I am suspicious of healthcare experts who invent healthcare specific standards profiles.

Monday, September 29, 2025

FHIR RLS - Record Location Service

I was asked

> Does an IG for such a thing exist (FHIR RLS)? I was wondering if IHE did this? Part of MHD?

 
Not fully. IHE has PDQm, which has most of what is needed, but no one has brought the federation problem to IHE to solve. PDQm supports a FHIR way to do Patient Identity resolution. It supports a few models:

  • Demographics to identity
  • Identifier to identity 
  • Fuzzy match to identity 
  • Search to identity 
The result is one or more Patient identities. Some of them might already be correlated to the same individual; some may be alternatives. This is common support for an RLS.
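As a sketch, the first two of those models are plain FHIR Patient searches; the base URL and identifier system here are hypothetical.

```python
import requests

PDQM_BASE = "https://community.example.org/fhir"   # hypothetical PDQm supplier

# Demographics-to-identity: ordinary FHIR Patient search by name/birthdate.
demographics = requests.get(
    f"{PDQM_BASE}/Patient",
    params={"family": "Example", "given": "Jane", "birthdate": "1980-01-01"},
    headers={"Accept": "application/fhir+json"},
).json()

# Identifier-to-identity: search by a known identifier (system is made up).
by_identifier = requests.get(
    f"{PDQM_BASE}/Patient",
    params={"identifier": "urn:oid:1.2.3.4.5|MRN-12345"},
    headers={"Accept": "application/fhir+json"},
).json()
```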

What is missing is an indication of the community that the given identity exists within. When using MHD the assumption is that your MHD Document Responder can figure this out on the backend, thus the PDQm + MHD client doesn't need to know. This gap is being discussed now.

The second thing that is missing is some mechanism for the PDQm server to seek out partners that might have identity matches. This mechanism is not defined today in IHE XCPD, so it might not need to be said for FHIR. I expect some may want that.

The third thing that is needed is a way to translate a community identifier to a network communication mechanism. This is available in mCSD. This mechanism can work like it would for XCA, listing XCA gateways, or it could be more Internet based, simply listing FHIR endpoints.
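A sketch of that mCSD translation, assuming the community identifier is carried as an Organization identifier and the Endpoints hang off that Organization; the directory base URL and the community OID are made up.

```python
import requests

MCSD_BASE = "https://directory.example.net/fhir"        # hypothetical mCSD directory
HOME_COMMUNITY_ID = "urn:oid:2.16.840.1.113883.3.999"   # made-up community id

# Find the Organization that represents the community, pulling in its Endpoints.
bundle = requests.get(
    f"{MCSD_BASE}/Organization",
    params={"identifier": HOME_COMMUNITY_ID, "_include": "Organization:endpoint"},
    headers={"Accept": "application/fhir+json"},
).json()

endpoints = [e["resource"]["address"]
             for e in bundle.get("entry", [])
             if e["resource"]["resourceType"] == "Endpoint"]
print(endpoints)   # e.g. the community's XCA gateway or FHIR base URL
```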

There is a very good white paper from Grahame in HL7 on Intermediaries. This multiple-levels-of-services idea is a vision like what IHE has with XCPD+XCA, but for full access to FHIR services. There are some solutions proposed, but no further solution defined. HL7 didn't want to work on it as it is not core, so the plan was to have IHE work on it with backing from HL7. The problem is that although the problem was presented to IHE IT-Infrastructure, not enough interest in working on it came forward. Thus, gridlock.

Despite these struggles, there is XCPD, which is not FHIR but would work to find the identity at a community, followed by a lookup in mCSD to find the FHIR servers.