Saturday, March 2, 2024

Give me a cyber security check list for FHIR RESTful API Systems

So I asked Gemini, Google's latest AI...

Cyber Security Checklist for FHIR RESTful API Systems

FHIR (Fast Healthcare Interoperability Resources) is a standard for healthcare data exchange. It is important to secure FHIR RESTful API systems to protect sensitive patient data. Here is a checklist to help you secure your FHIR API system:

Authentication and Authorization

  • Implement strong authentication: Use a standard like OAuth 2.0 or OpenID Connect to authenticate users and clients. This will help to ensure that only authorized users and applications can access the API.
  • Implement authorization: Use a fine-grained authorization system to control what users and applications can do with the API. This includes what data they can access, what operations they can perform, and what resources they can create, read, update, and delete.
  • Use strong passwords: Enforce strong password policies for all users. This includes requiring a minimum password length, complexity requirements, and regular password changes.

Data Security

  • Encrypt data at rest and in transit: Use strong encryption algorithms to protect patient data at rest and in transit. This will help to prevent unauthorized access to the data, even if it is intercepted.
  • Mask sensitive data: Mask sensitive data, such as Social Security numbers and credit card numbers, before returning it from the API. This will help to prevent unauthorized disclosure of the data.
  • Implement logging and auditing: Log all API access attempts, including successful and failed attempts. This will help you to track who is accessing the API and what they are doing.
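
The masking item above can be sketched in Python. This is a minimal illustration, not part of the checklist itself: it assumes the Social Security number is carried in `Patient.identifier` under the `http://hl7.org/fhir/sid/us-ssn` system, and the function name `mask_ssn` is made up for this example.

```python
import copy

# Identifier system commonly used for US Social Security numbers in FHIR.
SSN_SYSTEM = "http://hl7.org/fhir/sid/us-ssn"

def mask_ssn(patient: dict) -> dict:
    """Return a copy of a Patient resource with SSN values masked.

    Only the last four characters of the SSN are kept; the stored
    resource passed in is left unchanged.
    """
    masked = copy.deepcopy(patient)
    for identifier in masked.get("identifier", []):
        value = identifier.get("value", "")
        if identifier.get("system") == SSN_SYSTEM and value:
            identifier["value"] = "***-**-" + value[-4:]
    return masked
```

Masking on the way out of the API, rather than in storage, keeps the full value available to the few workflows that are authorized to see it.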

Other Security Measures

  • Keep software up to date: Regularly update your FHIR server software and any other software that is used to access the API. This will help to fix security vulnerabilities that could be exploited by attackers.
  • Perform regular security assessments: Regularly assess the security of your FHIR API system to identify and fix any vulnerabilities.
  • Monitor for suspicious activity: Monitor your FHIR API system for suspicious activity, such as unauthorized access attempts or unusual data access patterns.

By following these recommendations, you can help to secure your FHIR RESTful API system and protect sensitive patient data.

Additional points to consider:

  • Rate limiting: Implement rate limiting to prevent denial-of-service attacks.
  • Input validation: Validate all input to the API to prevent malicious attacks, such as SQL injection attacks.
  • Error handling: Handle errors gracefully and avoid returning sensitive information in error messages.
  • Security awareness: Train your staff on the importance of security and how to protect patient data.
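
As an illustration of the error handling item above, here is a minimal Python sketch. It assumes the server logs the full exception internally and returns only a generic FHIR OperationOutcome carrying a correlation id; the function name and the logger name are invented for this example.

```python
import json
import logging
import uuid

logger = logging.getLogger("fhir.api")  # hypothetical logger name

def safe_operation_outcome(exc: Exception) -> dict:
    """Log details server-side; return a generic OperationOutcome.

    The response carries only an incident id, so stack traces, SQL text,
    or patient identifiers in the exception never reach the client.
    """
    incident_id = str(uuid.uuid4())
    logger.error("incident %s: %r", incident_id, exc)
    return {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": "error",
            "code": "exception",
            "diagnostics": f"Internal error. Reference: {incident_id}",
        }],
    }
```

Support staff can use the incident id to find the detailed log entry without the detail ever being exposed in the API response.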

By following these recommendations, you can help to ensure the security of your FHIR RESTful API system and protect sensitive patient data.

Monday, February 19, 2024

IHE IT-Infrastructure Winter 2024

Updated with final announcement: IHE IT-Infrastructure with two major new profiles (DSUBm, and PDQm match), and three minor updates (BALP, PIXm, and PCF), all on #FHIR. https://mailchi.mp/ihe/ihe-iti-tf-documents-published-2024-02-29

IHE just completed our winter quarter face-to-face meeting, held in Oak Brook IL at the RSNA headquarters. We primarily focused on two IHE-Profiles, and a set of other tasks. The update to PDQm and DSUBm will soon be formally published for Trial-Implementation.
  1. PDQm 3.0.0 - adding support for $match operation.
  2. DSUBm 1.0.0 - subscription to document sharing.
Other tasks worked on:
  1. Scheduling for Mobile - incremental development of transactions.
  2. Downgrade PCF 1.1.0 to R4 for greater usability.
  3. Minor Bug fixes in BALP 1.1.3 profiles.
  4. Minor Bug fixes in PIXm 3.0.4 profiles.
  5. FormatCode 1.1.1 vocabulary inclusion of status active.
  6. Plan for support of sex and gender profiling.
Next quarter's work:
  1. Continue developing Scheduling for Mobile.
  2. Add to DSG, support for JSON Web Signature (JWS).
  3. Develop Finance and Insurance workflow.
  4. Update HIE Whitepaper and MHDS with newest Profiles.
  5. Continue to work with IPA for suitability for QEDm use-cases.

PDQm support for $match operation

Patient Demographics Query for mobile (PDQm) version 3.0.0 now has support that is more in alignment with the original use-cases with the addition of support for $match operation. The $match operation allows a client to provide all that it knows about the subject and enables the server to utilize algorithms to get the best match. Previously PDQm supported only the FHIR query, which does not give the server the power to utilize algorithms. 

Both query and $match are now available in PDQm, as each has clear use-cases. Query suits settings such as a registration desk, where a human reviews the results and makes the final match. The $match operation may be more beneficial where some information is known about the patient and a best match is needed.

The PDQm server (Patient Demographics Supplier Actor) must support the search transaction and has an option to declare support for $match. The PDQm client (Patient Demographics Consumer) must declare at least one option indicating that it will use query, $match, or both. In operational environments, clients detect server support through the normal FHIR dynamic discovery, where the metadata endpoint returns a CapabilityStatement.
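
To make the $match flow concrete, here is a minimal Python sketch. It follows the base FHIR Patient $match operation (a Parameters resource with `resource`, `onlyCertainMatches`, and `count`), and checks a CapabilityStatement for the operation. The function names are made up for this example, and PDQm may place additional constraints that are not shown here.

```python
def build_match_request(patient: dict, only_certain: bool = False,
                        count: int = 5) -> dict:
    """Build the Parameters body to POST to [base]/Patient/$match."""
    return {
        "resourceType": "Parameters",
        "parameter": [
            # Everything the client knows about the subject.
            {"name": "resource", "resource": patient},
            {"name": "onlyCertainMatches", "valueBoolean": only_certain},
            {"name": "count", "valueInteger": count},
        ],
    }

def supports_match(capability_statement: dict) -> bool:
    """Return True if the CapabilityStatement lists $match on Patient."""
    for rest in capability_statement.get("rest", []):
        for resource in rest.get("resource", []):
            if resource.get("type") != "Patient":
                continue
            for op in resource.get("operation", []):
                definition = op.get("definition", "")
                if op.get("name") == "match" or definition.endswith("Patient-match"):
                    return True
    return False
```

A client that declared both options could call `supports_match` on the result of `GET [base]/metadata` and fall back to a normal query when it returns False.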

Document Subscription for Mobile (DSUBm)

The Document Subscription for Mobile (DSUBm) version 1.0.0 profile describes the use of document subscription and notification mechanisms for RESTful applications. In a similar way to the DSUB profile, a subscription is made in order to receive a notification when a document publication event matches the criteria expressed in the subscription. 


DSUBm allows clients to subscribe to specific kinds of changes in the Document Sharing system, such as new documents, updates to folders, and replaced documents; for example, subscribing to new lab reports for a given patient. DSUBm also lets MHD clients receive notifications about these activities in an XDS environment, and adds the ability to search on existing subscriptions, which DSUB does not support. A significant portion of the documentation is guidance on assembling these various subscription capabilities across different Document Sharing methods. These are the documented use-cases that drive the solution and show the breadth of support:
  1. Document Subscription for Mobile application in MHDS Environment
  2. Document Subscription for Mobile application in MHDS Environment using Folder Subscription
  3. Document Subscription for Mobile Device in XDS on FHIR Environment
  4. Document Subscription for Mobile Device in XDS on FHIR Environment extending DSUB with DSUBm
  5. Document Subscription for Mobile Alert System
  6. Document Subscription for Mobile Device in XDS on FHIR Environment with availabilityStatus update
  7. Document Subscription for Mobile Device in XDS on FHIR Environment with document metadata update
  8. SubmissionSet Subscription in a XDS Environment where DSUBm is Grouped with DSUB
This Profile uses FHIR R4 to be consistent with the other IHE Document Sharing infrastructure, and uses the HL7 Subscriptions R5 Backport. This enables support in R4 for the newer subscription methodology developed in FHIR R5. Support for FHIR R4 is stronger in the marketplace, with very little interest in FHIR R5. Both IHE and HL7 have received strong feedback from the implementer communities that FHIR R4 will continue to be the focus until FHIR R6 is readily available. Note that there is some trickery involved in using this newer subscription methodology in FHIR R4.

Clients can discover the kinds of subscriptions available [ITI-114], search on current subscriptions [ITI-113], and subscribe [ITI-110]. When a subscription is triggered, a notification is sent [ITI-112]. The other transaction [ITI-111] enables backend infrastructure between the Document Sharing environment and the notification environment.
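
As a rough illustration of what a backported, topic-based subscription looks like in FHIR R4, here is a Python sketch that builds a Subscription resource using the extension URLs from the HL7 Subscriptions R5 Backport. The topic URL, the filter string, and the endpoint are placeholders that I made up for this example; the DSUBm specification defines the actual topics, filters, and profiles, and should be consulted for real use.

```python
BACKPORT = "http://hl7.org/fhir/uv/subscriptions-backport/StructureDefinition/"

def build_document_subscription(topic_url: str, patient_ref: str,
                                endpoint: str) -> dict:
    """Build an R4 Subscription shaped per the R5 Backport.

    In the backport, `criteria` holds the topic canonical URL, and the
    actual filter travels in an extension on `_criteria`.
    """
    return {
        "resourceType": "Subscription",
        "meta": {"profile": [BACKPORT + "backport-subscription"]},
        "status": "requested",
        "reason": "Notify on new documents for one patient",
        "criteria": topic_url,
        "_criteria": {
            "extension": [{
                "url": BACKPORT + "backport-filter-criteria",
                # Placeholder filter; DSUBm defines the allowed filters.
                "valueString": f"DocumentReference?patient={patient_ref}",
            }]
        },
        "channel": {
            "type": "rest-hook",
            "endpoint": endpoint,
            "payload": "application/fhir+json",
        },
    }
```

The `_criteria` extension is an example of the trickery mentioned above: R4 has no native place for a topic plus filters, so the backport repurposes `criteria` and hangs the filter on it.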

Notable

Downgrade PCF to R4 for greater usability. We have found that a FHIR R4 implementation guide can't depend upon, and further profile, a FHIR R4B implementation guide. Given that PCF is intended to be further refined by more advanced use-cases or regional needs, this was creating problems. The only reason we started with R4B was that there was a bug in the FHIR R4 build around an example in Consent that caused an IG build error. This is no longer a problem, so we downgraded PCF to R4.

Bug fixes in BALP profiles. Some profiling bugs were reported by the community (a big thank you to all of you in the community who report these things). The changes bring the profiles in line with the intended constraints, making them more specific, correct, and better for further profiling.

Bug fixes in PIXm profiles. A helpful community comment observed that the PIXm specification was improperly profiling how an error would be returned.

FormatCode vocabulary inclusion of status active. Previously only the codes that were deprecated had a status on them. The current vocabulary infrastructure was thus not seeing any active codes. So, all the codes that are not deprecated now have a status of active. This will result in proper valueSet expansions. I am in the process of updating the HL7 valueSet that includes the IHE codes.

Spring Quarter

We looked at FHIR R5 this last quarter. We were not looking to move to FHIR R5, but rather to evaluate the impact if it was needed. We found that some of our Profiles will convert easily, but we also found significant problems with FHIR R5, significant enough that we created HL7 Jira tickets requesting that these be fixed before FHIR R6. We will continue this effort, with the intent that the work we do now will provide insights that improve the FHIR R6 release. We have strong marketplace indications that FHIR R4 will continue to dominate until FHIR R6, and specifically that FHIR R5 will only be used for specific isolated use-cases.

We have been looking at the possible impact of the HL7-led cross-SDO effort on Gender Harmony. IHE, especially ITI, has many Profiles that touch upon the concept of the patient (HL7 v2, v3, CDA, and FHIR), so it is important that we assess the impact. At this time our approach is to look for any problems where our profiling is contrary to the Gender Harmony recommendations. We are finding that the impact is mostly an opportunity to inform our readers that when they need to communicate the concepts developed in the Gender Harmony specifications, they MUST use the approaches developed there. This approach helps encourage the correct behavior.

Scheduling -- a vendor agnostic specification providing FHIR APIs and guidance for access to and booking of appointments for patients by both patient and practitioner end users, including cross-organizational workflows. First developed in Argonaut on STU3, the further development has been cooperatively transferred to IHE. The IHE specification is based on FHIR Version 4.0.1, and references the Schedule, Slot, and Appointment resources. This workflow profile defines transactions that allow a scheduling client to obtain information about possible appointment opportunities based on specific parameters, and based on that information, allow the client to book an appointment. 

Add to DSG, support for JSON Web Signature (JWS). There is market interest in using JSON Web Signatures (JWS) rather than XML-Signature. The use-cases will be the same as in the current Document Digital Signature (DSG) profile, which supports whole-document signature. The likely impact will be a new option to support JWS signatures, which will predominantly be a new MIME type for the DSG document object. The JWS will likely use the JAdES profile on JWS for long-term signatures, much as DSG today uses XAdES for XML-Signature.

Develop Finance and Insurance workflow. Outside the USA there is interest in a general use Finance and Insurance workflow. This is mostly needed in developing countries where there isn't an existing model, or where the existing model needs radical changes. The model will take inspiration from the work of the OpenHIE Finance and Insurance Services Workflow.

For those that want to participate, please contact me. IHE always welcomes new participants.

Wednesday, January 31, 2024

Provenance use in AI

I have been engaged in a few initiatives around AI/ML, both inside healthcare and broader. I have been engaged to work on a variety of different needs, that all use a variation of Provenance. The following is not a tutorial, but rather an outline of the various ways that Provenance is useful in AI. Useful is not to say that these are currently used.

  1. Provenance on dataset that is available for various uses, including being used as a learning dataset.
  2. Provenance on the learning dataset showing where each data item came from.
  3. Provenance on a ML model node showing which data influenced this node.
  4. Provenance on an AI output showing which nodes influenced this AI output (decision, observation, derivation, etc)
  5. Provenance on some action taken because of some AI output.
These steps are simplified and generalized. Especially inside of various architectures of AI/ML the concept of a node is not always identifiable. There is a push to use Provenance to enable explainable and trustworthy AI that would be able to explain why an AI output came to be. So, the above presumes that some node(s) in the knowledge model is identifiable.

These Provenance artifacts are also illustrated here purely as provenance details. That is to say that the Provenance does not carry the inputs or outputs, but it certainly points at them. Thus, one can't look to Provenance to embody the "AI Output"; that AI output would be encoded in some other artifact.


I also speak of Provenance broadly. Within FHIR, the FHIR Provenance works fine. Outside of FHIR, the W3C PROV model works fine. But it is also possible that one has some other metadata structure that carries the artifacts of Provenance.

Provenance on dataset

This use of Provenance addresses the situation where those looking to teach an AI/ML need data. The data may already be known, but there may be cases where one searches a library of data for appropriate data. Here, appropriate may include quality indicators, fit-for-use indicators, and authorization rights. These are typical "Provenance What" attributes. They sit alongside the classic provenance attributes: Who owns the data, Where is the data, When was the data collected, and Why was the data collected.

The key here is to identify all the useful attributes that might be needed, and thus profile how that is expressed as part of Provenance. Some use-case needs:
  • How was this data collected? User questionnaire, Survey, Synthetic, Combination, Subset, etc
  • Is there a regulation covering this data?  Indicate the regulation
  • What region was this data collected within?
    • Is the data region locked?
  • Is the data about human subjects?
    • Is there subject authorization? 
    • Is the data de-identified? To what risk level?
  • Use obligations? Must be used in aggregation, must be de-identified, must get individual authorization, must be encrypted, etc
  • Allowed uses vs Forbidden uses?


Note that a source dataset may be derived from other source datasets. This is key to Provenance: being able to say that this data is derived from that data using a given methodology (the how). In this way a Provenance can indicate that a dataset imports three other datasets. That said, the above What attributes would also need to be combined in appropriate ways. For example, I pull in three EHR datasets with de-identification that supports longitudinal consistency; because the data are de-identified the original HIPAA regulation requirement is eliminated, yet the region covered is expanded. As such, there needs to be the ability to navigate back to the source of this derivation, but that pathway is likely privileged, so not all users will be able to navigate it.
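
The derivation idea can be sketched with a small Python data structure. This is not FHIR Provenance or W3C PROV; the class `DatasetProvenance` and its fields are invented purely to illustrate how a derived dataset keeps pointers to its sources and how one What attribute (region) might be combined.

```python
from __future__ import annotations

from dataclasses import dataclass, field

@dataclass
class DatasetProvenance:
    """Hypothetical, simplified provenance record for one dataset."""
    dataset: str
    regions: set[str]
    how: str = "collected"
    derived_from: list[DatasetProvenance] = field(default_factory=list)

def derive(dataset: str, how: str,
           sources: list[DatasetProvenance]) -> DatasetProvenance:
    """Create provenance for a derived dataset.

    The region covered is the union of the source regions, so it can
    only stay the same or expand.
    """
    regions: set[str] = set().union(*(s.regions for s in sources))
    return DatasetProvenance(dataset, regions, how, list(sources))

def lineage(prov: DatasetProvenance) -> list[str]:
    """Walk back through the derivation chain, listing source datasets."""
    names: list[str] = []
    for source in prov.derived_from:
        names.append(source.dataset)
        names.extend(lineage(source))
    return names
```

In a real deployment the walk performed by `lineage` would be subject to access control, since the pathway back to identifiable source data is likely privileged.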

Provenance on Learning dataset

This is closely related to the Provenance on the source dataset, but the distinction is that the source dataset doesn't always come with Provenance, whereas the learning dataset should know where all its data came from. Thus, the use-case need here is more classic Provenance, holding simply where the data came from. This is not to say that one can't include the full details, but that would be unnecessary if one can navigate from the learning dataset provenance to the source dataset provenance. Being able to navigate from one kind of provenance to the other is a key feature of provenance.

If there is a specific obligation that comes with some source data, this might be traceable using Provenance as well. I would think a simplifying methodology would be to have the obligations managed independently, so that the obligations have their own Provenance back to the source of that obligation. In this way a learning dataset may have a functional obligation that is sourced from more than one source dataset. This is simply one obligation (rule) with many Provenance. 

Similar to the source dataset discussion around derivation from multiple sources, the Learning dataset would have a holistic Provenance that expresses the derived state, in addition to Provenance on each of the datasets that were imported.

Provenance on ML node

I will use the concept of an ML node as an identifiable portion of an ML knowledge model. If there is a very specific ML model concept of a node, this works for me, but I didn't intend only that. I also know that some ML models don't have identifiable sub-divisions of the model; in that case Provenance can only point back to the Provenance on the Learning dataset. Thus, identifying an ML node is not always possible, but it certainly is important to explainable and trustworthy AI.

The details of how the node was derived from the identifiable data is likely to be less describable. But where it can be explained, that explanation can be recorded in the Provenance as a how attribute.

Provenance on AI output

An AI model will take some input, apply the current model, and produce some output. This input, the current model, and the output are clearly attributes for Provenance of that output. The key use-case here is to track that some output is attributable to AI, and attributable to a given model. One would then also be able to track these outputs by model; thus if the model is found to be defective, those outputs can be re-evaluated or put into question.

Here I put some emphasis on the output being a subject of Provenance, so let me be clear that Provenance itself is not a way to encode the output. As with all artifacts, Provenance presumes that inputs, outputs, agents, algorithms, etc. are all encoded in some relevant and good standard and are able to be referenced by the Provenance.
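
As a minimal sketch of this use-case, the following Python builds a FHIR R4 Provenance that points at an AI output, the model, and the inputs, and then finds all outputs attributed to a given model. Representing the model as a Device reference, and the `author` agent type, are my assumptions for illustration only; the function names are invented.

```python
from datetime import datetime, timezone

PARTICIPANT_TYPE = "http://terminology.hl7.org/CodeSystem/provenance-participant-type"

def ai_output_provenance(output_ref: str, model_ref: str,
                         input_refs: list[str]) -> dict:
    """Build a Provenance that references (not embeds) output, model, inputs."""
    return {
        "resourceType": "Provenance",
        "target": [{"reference": output_ref}],
        "recorded": datetime.now(timezone.utc).isoformat(),
        "agent": [{
            "type": {"coding": [{"system": PARTICIPANT_TYPE, "code": "author"}]},
            # Assumption: the model version is represented as a Device.
            "who": {"reference": model_ref},
        }],
        "entity": [{"role": "source", "what": {"reference": ref}}
                   for ref in input_refs],
    }

def outputs_of_model(provenances: list[dict], model_ref: str) -> list[str]:
    """List the outputs attributed to one model, e.g. for re-evaluation."""
    found: list[str] = []
    for prov in provenances:
        agents = [a.get("who", {}).get("reference") for a in prov.get("agent", [])]
        if model_ref in agents:
            found.extend(t["reference"] for t in prov.get("target", []))
    return found
```

If a model version is later found to be defective, `outputs_of_model` gives the list of outputs to re-evaluate, which is the tracking use-case described above.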

Provenance on Actions taken because of AI output

This is getting a bit beyond AI/ML, but one uses an AI/ML to do something, and that something is what I am referring to here. I simply indicate that Provenance is applicable here too. So that one can indicate that some action was taken because of some output from an AI.

Summary

Provenance is not the core of AI/ML, but the general concept of Provenance is very valuable to the use of AI/ML.


Tuesday, January 30, 2024

VIP Patients in #FHIR

The FHIR security tag `VIP` is used to indicate that a patient's health information is considered to be highly confidential and requires heightened security measures. This may be due to the patient's public profile, occupation, or other factors. VIP is a designation of a person, not a designation of the data. 

To use the VIP security tag, simply add it to the security tag of any FHIR resource that contains the patient's health information. For example, the following code shows how to add the VIP security tag to a Patient resource:

{
  "resourceType": "Patient",
  "id": "1234567890",
  "meta": {
    "security": [
      {
        "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode",
        "code": "VIP"
      }
    ]
  },
  ... other content ...
}

This is an example of tagging the Patient resource to indicate that the patient is a VIP, and thus implies that all the data associated with this Patient needs to be treated as VIP patient data. Once the VIP security tag is added to the Patient, the patient's health information should be treated with heightened security measures. This may include restricting access to the information, encrypting the information, or auditing access to the information.

Here are some examples of how the VIP security tag might be used:
  • A hospital might use the VIP security tag to protect the health information of famous patients or patients who are in the public eye.
  • A government agency might use the VIP security tag to protect the health information of high-ranking officials or other sensitive individuals.
  • A research institution might use the VIP security tag to protect the health information of participants in sensitive clinical trials.
It is important to note that the VIP security tag is just one way to indicate that a patient's health information is considered to be highly confidential. There are other security tags that can be used, such as the Confidentiality or Sensitivity security tag codes. The specific security tags that are used will depend on the organization's policies and procedures.

Typically, access to VIP patient data is limited to a subset of the clinical staff, identified by a clearance or role. This might be implemented purely in the security infrastructure or might leverage FHIR CarePlan or PractitionerRole. Accesses to VIP patient data often trigger stricter scrutiny. On a regular basis (e.g. daily) all accesses to VIP patient data are reviewed, and inappropriate accesses are investigated with potential corrective actions against the user.
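
A minimal Python sketch of such an access check follows. It looks for the same security tag used in the example above, and returns both a permit decision and a flag that the access should go into the regular VIP review. The clearance name `vip-care-team` and the function names are hypothetical; a real system would take these from its own access control policy.

```python
VIP_TAG = ("http://terminology.hl7.org/CodeSystem/v3-ActCode", "VIP")

def is_vip(patient: dict) -> bool:
    """True if the Patient resource carries the VIP security tag."""
    tags = patient.get("meta", {}).get("security", [])
    return any((t.get("system"), t.get("code")) == VIP_TAG for t in tags)

def access_decision(patient: dict, clearances: set[str]) -> tuple[bool, bool]:
    """Return (permit, flag_for_review) for one access attempt.

    Non-VIP patients follow normal rules (permitted here for simplicity).
    Every attempt on a VIP patient is flagged for the periodic review,
    whether or not it is permitted.
    """
    if not is_vip(patient):
        return True, False
    return "vip-care-team" in clearances, True
```

Flagging denied attempts as well as permitted ones matters, since an attempt by someone without clearance is exactly what the daily review is meant to catch.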


Standards for Accounting of Disclosures

I was asked lately if there are standards that support "Accounting of Disclosures". The use-case of Accounting of Disclosures is specific to the USA, but the broader concept is an expected Privacy Principle. The broader concept of an Access Report, or a Report of Data Uses, would inform a data subject of any use of their data, both those that were authorized by the patient (e.g. Consent) and those that were against that authorization. The USA concept of Accounting of Disclosures is a much smaller subset, and in my view a useless subset, as it is made up of only those uses of the data that the patient explicitly authorized outside the normal Treatment, Payment, and healthcare Operations.

So, are there standards? YES. The standards don't produce a human readable report, but rather would provide the raw material that is used to fill out a human readable report. This is an important distinction, although it is a common distinction between technical standard and User Experience. For example, the technical standards for encoding a lab result are not fit for patient consumption, but they are key contributors to the human readable report that is given to the patient. The report includes context setting, and assistance with understanding the details.

Are there interoperability standards?

Yes, there is a long history of Healthcare and general standards that are designed to support Accounting of Disclosures, Access Log, and many other use cases.

  • ASTM E2147 - Set up the concept of security audit logs for healthcare including accounting of disclosures
  • IETF RFC 3881 - Defined the Information Model (IETF rule forced this to be informative)
  • DICOM Audit Log Message - Made the information model Normative, defined Vocabulary, Transport Binding, and Schema
  • IHE ATNA - Defines the grouping with secure transport and access controls; and defined specific audit log records for specific IHE transactions.
  • NIST SP800-92 - Shows how to do audit log management and reporting - consistent with our model
  • HL7 PASS - Defined an Audit Service with responsibilities and a query interface for reporting use
  • ISO 27789 - Defined the subset of audit events that an EHR would need
  • ISO/HL7 10781 EHR System Functional Model Release 2
  • ISO 21089 Trusted End-to-End Information Flows

More specifically does FHIR have this?

Yes, the AuditEvent resource has as a use-case to provide support for Accounting of Disclosures. The AuditEvent resource is a collaboration between HL7, DICOM, and IHE.

In FHIR R4 - http://hl7.org/fhir/R4/auditevent.html

IHE has a relevant Implementation Guide – Basic Audit Log Patterns (BALP)
    https://profiles.ihe.net/ITI/BALP/index.html

Within the BALP IG, which is all relevant to Security/Privacy audit log recording and access to that recording using FHIR, there is a specific profile of the AuditEvent resource for recording a known disclosure.
    https://profiles.ihe.net/ITI/BALP/content.html#3577-privacy-disclosure-audit-message

IHE has a supplement on ATNA that brings in FHIR AuditEvent
    https://www.ihe.net/uploadedFiles/Documents/ITI/IHE_ITI_Suppl_RESTful-ATNA.pdf

With this linkage between FHIR and ATNA, the events can be recorded using FHIR restful create, and can be accessed using FHIR search. 

Which brings up ATNA (Audit Trails and Node Authentication) which is the long-standing solution in IHE. 

Further, IHE governance expects that each Profile that IHE writes describes how that Profile's transactions would be logged in the audit log. These descriptions are in Volume 2, in the Security Considerations section.

Must I record using ATNA or FHIR AuditEvent?

No, one of the benefits of the supplement adding FHIR AuditEvent to ATNA is to provide a search mechanism that produces a FHIR Bundle of AuditEvent records. These records do not need to be originally stored in ATNA or FHIR AuditEvent, just made available in FHIR AuditEvent format. Much like clinical APIs to EHRs that expose the clinical data in FHIR clinical resources, while not mandating the format of the database to be FHIR.

Thus a system can record the event using whatever mechanism it wants to, which might be native database and web-server formats. 
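
To illustrate, here is a minimal Python sketch that maps a native log record into a FHIR R4 AuditEvent and wraps the results in a searchset Bundle. The native record fields (`timestamp`, `status`, `user`, `server`, `patient`) are hypothetical, every record is assumed to be a RESTful read, and the output is not claimed to conform to any BALP profile.

```python
def to_audit_event(record: dict) -> dict:
    """Map one hypothetical native log record to an R4 AuditEvent."""
    return {
        "resourceType": "AuditEvent",
        "type": {
            "system": "http://terminology.hl7.org/CodeSystem/audit-event-type",
            "code": "rest",
            "display": "RESTful Operation",
        },
        "action": "R",  # assumption: every record here is a read
        "recorded": record["timestamp"],
        # "0" is success and "4" is minor failure in the R4 outcome codes.
        "outcome": "0" if record["status"] < 400 else "4",
        "agent": [{"who": {"display": record["user"]}, "requestor": True}],
        "source": {"observer": {"display": record["server"]}},
        "entity": [{"what": {"reference": record["patient"]}}],
    }

def to_searchset(records: list[dict]) -> dict:
    """Expose native records as a FHIR search result Bundle."""
    events = [to_audit_event(r) for r in records]
    return {
        "resourceType": "Bundle",
        "type": "searchset",
        "total": len(events),
        "entry": [{"resource": e} for e in events],
    }
```

The storage format stays whatever the system already uses; only the search response is shaped as FHIR AuditEvent.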

Are there implementations of BALP?

Yes: The following commonly used FHIR Servers have BALP implemented within them. You just need to turn it on. For more details:

PS

IHE is a recognized standards organization focusing on profiling standards. The use of AuditEvent is recognized broadly for support of Security and Privacy audit log requirements.




Tuesday, November 7, 2023

IHE IT-Infrastructure Fall 2023

The IHE IT-Infrastructure committee has approved four milestones; sIPS, NPFS, DSUBm, and PDQm match alternative. This winter quarter will be a lighter load, recognizing the holidays: Patient Scheduling, prospective look at FHIR R5/6, and evaluating impact of Gender Harmony.  

This article is published before these are formally published, so I include a (will be at) link that likely won't be proper until later in November.

(updated to clarify the links and add YouTube presentation links)

Sharing IPS (sIPS)

Formal Publication -- https://profiles.ihe.net/ITI/sIPS

This Implementation Guide was out for Public-Comment and is now ready for Trial-Implementation.

The Sharing of IPS (sIPS) IHE Profile provides for methods of exchanging the HL7 International Patient Summary (IPS), using IHE Document Sharing Health Information Exchange but does not modify the HL7 IPS specification. 

The International Patient Summary (IPS) content, as defined in the ISO 27269 data model specification, utilizes IHE’s document sharing infrastructure including cross-community, HIE, direct exchange models, and more. It has been designed specifically to remove barriers to adoption, by leveraging architectures that are currently implemented, well-established, and robust.

The sIPS Profile provides implementation guidance to vendors and implementers and joins a growing suite of IPS standards artefacts contributed by a variety of Standards Development Organizations (SDOs) and coordinated by the Joint Initiative Council for Global Health Informatics Standardization (JIC).

YouTube presentation, long, and short.

Non-Patient File Sharing (NPFS)

Formal Publication -- https://profiles.ihe.net/ITI/NPFS

This Implementation Guide was converted from PDF form into the Implementation Guide form and is now transitioning to Trial-Implementation again.

The Non-Patient File Sharing (NPFS) Profile defines how to share non-patient files such as clinical workflow definitions, domain policies, and stylesheets. Those files can be created and consumed by many different systems involved in a wide variety of data sharing workflows.

YouTube presentation.

Document Subscription for Mobile (DSUBm)


This Implementation Guide is going to Public-Comment as a new specification that provides for subscriptions and notification mechanisms to Document Sharing publications.

The Document Subscription for Mobile (DSUBm) profile describes the use of document subscription and notification mechanisms for RESTful applications. In a similar way to the DSUB profile, a subscription is made in order to receive a notification when a document publication event matches the criteria expressed in the subscription.

This profile can be applied in a RESTful-only environment such as MHDS, but it can also be used with non-mobile profiles such as XDS.b and DSUB. This profile intends to provide the same document subscription functionality as the DSUB profile and its supplements, while also adding other functionality (e.g. Subscription Search).

YouTube presentation.

Patient Demographics Query for Mobile (PDQm)


This Implementation Guide is going to Public-Comment with a new alternative for looking up a Patient using the FHIR $match operation. 

Patient Demographics Match is used by the Patient Demographics Consumer to request that the Patient Demographics Supplier identify Patient records that match the demographics supplied in the request message. The request is received by the Patient Demographics Supplier. The Patient Demographics Supplier processes the request according to its internal matching algorithm and returns a response in the form of demographics information for the matching patients.


YouTube presentation.

Winter Quarter

In the winter quarter we will continue working on:
  • Scheduling -- a vendor agnostic specification providing FHIR APIs and guidance for access to and booking of appointments for patients by both patient and practitioner end users, including cross-organizational workflows. This specification is based on FHIR Version 4.0.1, and references the Schedule, Slot, and Appointment resources. This workflow profile defines transactions that allow a scheduling client to obtain information about possible appointment opportunities based on specific parameters, and, based on that information, allow the client to book an appointment.
  • Evaluating FHIR R5 to improve FHIR core for R6 -- The workgroup will look at what it might take to convert the current set of FHIR R4 implementation guides to R5, with the goal to uncover concerns with FHIR core that IHE should recommend be remediated in FHIR core R6. There is no intention by IHE to publish these FHIR R5 implementation guides as market demand is very low.
  • Evaluating the impact of Gender Harmony on IHE Profiles -- HL7 has published a set of implementation guides covering the Gender Harmony use-cases including HL7 v2, CDA, and FHIR. The workgroup will be evaluating the potential to update existing IHE Profiles to add these capabilities. 


Thursday, October 19, 2023

Teaching an AI/ML/LLM should be a distinct PurposeOfUse

I have been thinking about a specific need around AI/ML. That is, when data are being requested/downloaded with the intent of feeding them to Machine Learning, this action should be distinguished from a request for Treatment.



This came up on a TEFCA/QTE call this week, where a question was posed as to how a patient could express that they wanted to forbid their data from being used to teach Machine Learning.

This use-case would need the above ability to understand when a data request could result in the data being used for Machine Learning. Note that data requests are encouraged to include ALL purposeOfUse values for which the data would be used. So in the USA, this would include Treatment, Payment, and Operations. (Note that it is known in the existing nationwide health exchange that many participants can't handle more than one, and thus in that exchange Treatment is presumed to be TPO. I don't like this, but reality is often less than perfect).

Thus, I think we need a specific PurposeOfUse to indicate these requests intend to be used for Machine Learning. I think that this PurposeOfUse would logically be a sub-concept of the existing Healthcare Operations. I argue this because it clearly is not about Treatment, or Payment; that is not to say that the resulting algorithms may not be used for Treatment or Payment; but the reason to ask/get data at this point in the data flow is to feed the Machine Learning. It might be argued that the Machine Learning Training PurposeOfUse would possibly be a new top level PurposeOfUse, but I don't think that is correct either as much of the data captured already today is presumed to be available for Machine Learning (best-practice is that it is consumed in de-identified form, but this topic is not about de-identification or not).

It is possible that we might need a new Obligation/Refrain code as well (thanks to Kathleen for pointing this out). Thus data could be communicated with an attached Obligation to not use it for Machine Learning Training (seems like a refrain). I don't mind putting this code in, but at this time Obligation/Refrain codes are not used, where PurposeOfUse is emerging as being used.

So a PurposeOfUse code specific to Machine Learning
  1. can be used in a response (bundle) to indicate positively the intended purposeOfUse allowed
  2. can be used in a request to indicate desired purposeOfUse -- which could be rejected if the responder disagrees
  3. can be expressed in a security token to indicate authorized PurposeOfUse
  4. can be used in policy rules to indicate permit/deny of that specific policy. In this way a data-use-agreement could state that the high level operations purpose of use is intended to enable all sub-concepts; and it could be used to indicate that the high level operations purpose of use is intended to ONLY speak to some sub-concepts such as eliminating the Machine Learning as being allowed or requested.
  5. can be used in a Consent, where allowed, to allow an individual patient to express rules specific to that purposeOfUse.
  6. can be placed on a dataset that has been properly gathered with that purposeOfUse
  7. can be placed on a data item within a dataset to indicate that the data has been properly gathered with that purposeOfUse 
    • Note that tagging the dataset is more common, as replicating the tag millions of times over at the data resource level does not add value. I include this item because a dataset might be a mixture of some data that were collected with authorization and some that were not, which would require tagging each data resource.
I would like to get wider consensus on this concept (or these concepts) before we add a code. This consensus would also help inform what it is called, what it is described as, where it is placed, etc. I am confident that we have healthcare standards infrastructure to support this new use-case.
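
To show how a sub-concept could work in policy rules, here is a minimal Python sketch. `TREAT`, `HPAYMT`, and `HOPERAT` are existing PurposeOfUse codes; `MLTRAIN` is a hypothetical code that I made up as a stand-in for the proposed Machine Learning purpose, placed under `HOPERAT` as argued above. The function names are also invented.

```python
# Child -> parent. MLTRAIN is HYPOTHETICAL; no such code exists today.
PARENT = {"MLTRAIN": "HOPERAT"}

def with_ancestors(code: str) -> list[str]:
    """Return the code followed by its ancestors, nearest first."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def is_permitted(requested: set[str], permitted: set[str],
                 denied: set[str]) -> bool:
    """Evaluate requested purposes against permit and deny rules.

    A permit on a parent covers its sub-concepts unless the sub-concept
    (or one of its ancestors) is explicitly denied. All requested
    purposes must pass, since requests should list every intended use.
    """
    for code in requested:
        chain = with_ancestors(code)
        if any(c in denied for c in chain):
            return False
        if not any(c in permitted for c in chain):
            return False
    return True
```

With this shape, a data-use-agreement that permits `HOPERAT` enables Machine Learning by default, while a patient Consent that denies `MLTRAIN` removes just that sub-concept without touching the rest of Operations.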