Wednesday, June 22, 2022

RESTful search using POST vs GET on #FHIR

I received this question:
Can you address a specific example of the intersection of FHIR standards and OWASP guidance?  
The FHIR spec allows for sensitive ids such as patient identifier to be used on the query string when searching for a patient.  See the following:
https://try.smilecdr.com:8000/baseR4/Patient?identifier=47
However, the folks at OWASP consider this practice a vulnerability:
https://owasp.org/www-community/vulnerabilities/Information_exposure_through_query_strings_in_url

Do you have any thoughts or guidance on this topic?  Break the standard (and REST) and implement these GETs as POSTs?  Create a proxy table that maps sensitive ids to external ids and require the usage of the external id on the query parameter? 

OWASP is a fantastic resource. Everyone should use it.

However on the topic of GET vs POST for search in FHIR, I do have some further emphasis and guidance:

The FHIR core specification addresses this concern, which is why Search is supported on POST as well as GET. All the examples use GET, but that is just because GET is easier to show in examples.

Secure Communication is a must

The OWASP article does recognize that using TLS helps against untrusted infrastructure; in fact, TLS fully protects against it. Communications about patient data had better be protected using TLS, as any request (GET or POST) will be returning patient data; the query parameters are just as vulnerable as the query response. So, using TLS will prevent all of the Internet infrastructure from grabbing any patient identity or sensitive data.

The problem, as the OWASP article points out, is that logging or inspection might happen on the Client or on the Server at either end of the TLS communication. What the OWASP article does not say is that the vulnerability is due to a failure to secure the endpoints. Again, patient data will be flowing, so to be worried about query parameters and not the data is silly. Either an endpoint is designed and secured properly, or it should not be trusted with patient data. So the distinction between POST and GET is really odd. You either have control and can be trusted, or you don't have control and can't be trusted. If you can't be trusted with GET then you can't be trusted with POST.

Body logging

If your logging is out-of-control, then I assert you must assume your audit logs are recording the content of the body. Thus the POST body is logged, as is the Response body. 

Protect the whole System

Servers can certainly protect themselves fully. Any logging can be controlled to log high quality logs and protect the log storage and access fully, or to not log anything. We are trusting these servers to have secured patient data, so certainly they should be expected to be able to securely store log files too. Even cloud based servers that have scale functionalities can be properly secured. If you just secure your database engine, then you have not secured your server. You must secure everything, not just the easy stuff.

So, clients are the biggest exposure point. Applications (aka not browser hosted code) have full control of their environment and thus can also be designed to NOT log things. 

Browser apps that leverage the browser for display and networking are not securable, at all! They should not be trusted with patient data, to say nothing of allowing them to use GET vs POST. The exception that is often allowed is when the whole client computer is controlled, such as for a Clinician application. So, are browser apps forbidden? No. Like a Server, a whole Client can be secured. If the whole Client is secured then there is no problem; if anything in the Client isn't secured then the patient data is just as much at risk as the logs.

Any service, intermediary, or client that can't be trusted to maintain secure logs should not be trusted at all. Anyone that puts patient data on a service that can't maintain logs securely is the actual problem. Anyone that allows an insecure client to gain access to patient data is the actual problem.

Patients have rights

Note one exception that, as a Patient advocate, I must remind everyone of... Patients are empowered to make stupid decisions for themselves. It is useful to explain that the patient has chosen an unsecurable client, but it is not proper to forbid using the application the patient has chosen. In this case the patient has the right. Privacy Principles favor giving the data to the patient over using security as an excuse to not give them the data. Warn the patient first, but if they say they understand and really want it to happen, then do it.

Not all of FHIR is patient data

That said, there are many uses of FHIR that are not about patient identifiable data. Infrastructure resources, Directory resources, vocabulary resources, definitional resources, etc.

And there are uses of FHIR on synthetic data, or properly de-identified data.

Conclusion 

I have railed against this security theatre around GET.  POST is not more secure than GET. It is not. In fact GET enables more, for example cache control with trusted infrastructure. GET is also expressly idempotent, where POST is not explicitly so (although a POST search very likely is).

Comment with arguments for/against this position. I feel confident, but I also know that I don't know everything.

Updates

  • Grahame Grieve June 22, 2022 at 3:44 PM
I agree, but the way people are fixed to this speaks to a lack of confidence in managing log access. Which is typically true for general web servers, but cannot be true for servers handling PHI / Clinical data
  • Note that FHIR Paging forces the use of GET for next / previous page. So, one needs to address securing GET

  • Note that URL parameters on a POST are allowed by the HTTP specification. So just changing to POST changes nothing. One must change to a POST that does NOT use URL parameters, but instead uses content type application/x-www-form-urlencoded with the parameters in the body.
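
To make the last note concrete, here is a minimal sketch of the two request shapes, built with the Python standard library and never actually sent. The server base URL is a made-up placeholder; the `[base]/[type]/_search` form for POST is the one defined by the FHIR search specification.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://fhir.example.org/baseR4"  # hypothetical server base URL


def get_search(resource_type: str, params: dict) -> Request:
    """Search with the parameters on the query string (they appear in URL logs)."""
    return Request(f"{BASE}/{resource_type}?{urlencode(params)}", method="GET")


def post_search(resource_type: str, params: dict) -> Request:
    """Search via POST [base]/[type]/_search with the parameters in the body."""
    return Request(
        f"{BASE}/{resource_type}/_search",
        data=urlencode(params).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Either request returns the same Bundle of patient data, so whatever protects the response must also protect the request, whichever form is used.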

Friday, June 17, 2022

RelatedPerson Consent - how to record the #FHIR Consent that authorizes a #FHIR RelatedPerson

This article summarizes a concept that came from a reader of my blog. This is actually published in a personal Implementation Guide at -- https://johnmoehrke.github.io/RelatedPersonConsent/ . This concept has not been proposed as a formal work item, but I think it would fit nicely in IHE Basic Patient Privacy Consents for Mobile that I have proposed (more on that soon).

This IG focuses on a use-case where the existence of a representative (e.g. guardian) is backed by a rationale and agreement from the Patient. Specifically some cases:

  1. When the Patient is a minor and the representative is a parent.
  2. When an adult Patient is physically or mentally competent, but still wants to appoint a representative to manage his/her medical records (e.g., a Lawyer).
  3. When the Patient does not have competency to manage their medical records, thus some representative is assigned.
  4. When the courts appoint a representative.

There may be more, but this list gives us a set of perspectives upon the reason why there is a need for a Consent to back the representative.

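[Diagram: a Patient, a RelatedPerson (guardian), and a Consent. The Consent points at the Patient (Consent.patient) and at the RelatedPerson (Consent.agent.reference); the RelatedPerson points at the Patient (RelatedPerson.patient) and, through an extension, at the Consent.]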


Thus

  • Patient resource is used to identify the Patient
  • RelatedPerson resource is used to identify the representative
  • Consent resource is used to document the Patient agreement with the representative. This might further be used in advanced cases to define what the RelatedPerson is allowed to do, and thus express a division of responsibilities among multiple RelatedPerson resources.

The RelatedPerson resource would be the way that most will document a relationship between a patient and a representative (e.g., guardian). It is a clear link between the Patient and the other person. However the RelatedPerson does not have anywhere to explain the details of why the relationship exists, or any conditions on the relationship. There is a RelatedPerson.relationship that can be used to differentiate some roles, but this is very coarse.

  • RelatedPerson.relationship has a clear code for Father, Mother, etc.

It is not clear to me that the RelatedPerson needs to have some indication that there is a Consent explaining the rationale. One would determine this by searching for Consents that point at the RelatedPerson instance. It is possible that the RelatedPerson.relationship could hold normal codes explaining the relationship, and one more that indicates that a rationale is available. Not clear that is proper or needed. It is also possible that there should be an element in RelatedPerson to point at the Consent, but I am not sure yet about that either.

Thus for any given RelatedPerson, one can look for Consent.provision.actor.reference values that include the RelatedPerson.id value. This can be done by searching on Consent using the actor parameter:

GET [path]/Consent?actor=RelatedPerson/1234

It might be good to also make sure the Consent is for that patient, that the Consent is PERMITing that RelatedPerson, etc.

There are other rules that might be possible to do with invariants, but I just itemize them:

  • The RelatedPerson.patient must be the same as the Consent.patient
  • The Consent.provision.agent.reference must be the same as the RelatedPerson.id
  • The Consent is authorizing (permit) the RelatedPerson, and is not expired.
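
As a rough illustration of these three rules, the sketch below checks them against resources held as plain Python dicts. It is not part of the IG. It reads the element under its FHIR R4 name, `provision.actor` (the element the list above calls agent), and it assumes that "not expired" means `provision.period.end` is absent or in the future; both the function name and that interpretation are my assumptions.

```python
from datetime import datetime


def consent_authorizes(consent: dict, related_person: dict, now: datetime) -> bool:
    """Return True when the Consent satisfies the three rules for this RelatedPerson."""
    provision = consent.get("provision", {})
    rp_ref = f"RelatedPerson/{related_person['id']}"

    # Rule 1: RelatedPerson.patient must be the same as Consent.patient.
    same_patient = (
        consent.get("patient", {}).get("reference")
        == related_person.get("patient", {}).get("reference")
    )
    # Rule 2: the provision must name this RelatedPerson.
    names_related_person = any(
        actor.get("reference", {}).get("reference") == rp_ref
        for actor in provision.get("actor", [])
    )
    # Rule 3: the Consent is active, permits, and has not expired.
    permits = consent.get("status") == "active" and provision.get("type") == "permit"
    end = provision.get("period", {}).get("end")
    # Assumption: timestamps carry an explicit offset such as +00:00.
    not_expired = end is None or datetime.fromisoformat(end) > now

    return same_patient and names_related_person and permits and not_expired
```

A server would run this over the Bundle returned by the actor search shown earlier, keeping only the Consents for which it returns True.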

This may seem cumbersome, so I was thinking that an extension in RelatedPerson that explicitly points at the Consent would be more appropriate.

Thus in this IG there is a minimal profile on RelatedPerson that simply indicates that this extension is needed.

Note that this extension does make the creation of the Consent and RelatedPerson resources difficult, as they both reference each other. Thus from a purely REST perspective one needs to create the RelatedPerson resource, then create the Consent that points at the RelatedPerson, then UPDATE the RelatedPerson to add the extension that points back at the Consent. This kind of double pointer is discouraged in REST and in FHIR.

As with any Consent, often there is paperwork that ultimately holds the legal details. This legal paperwork is critical to overall legal precedent, and represents the ceremony of the act of consent from the patient. These details should be captured by a DocumentReference and Binary. The Consent.sourceReference would then point at that DocumentReference. (Could use Consent.sourceAttachment, but I am not a fan of bloating the Consent with that detail).

The Consent then would need to be profiled. The main difference from the FHIR core Consent I outlined in my Consent article is that this might be a specific kind of Privacy Consent delegating authority, and the RelatedPerson instance would be indicated specifically in the .provision.agent.

  • status - would indicate active
  • category - would indicate patient consent, specifically a delegation of authority
  • patient - would indicate the Patient resource reference for the given patient
  • dateTime - would indicate when the privacy policy was presented
  • performer - would indicate the Patient resource if the patient was presented, a RelatedPerson for parent or guardian
  • organization - would indicate the Organization that presented the privacy policy, and that is going to enforce that privacy policy
  • source - would point at the specific signed consent by the patient
  • policy.uri - would indicate the privacy policy that was presented. Usually, the url to the version-specific policy
  • provision.type - permit - given there is no way to deny, this would be fixed at permit.
  • provision.agent.reference - would indicate the RelatedPerson resource
  • provision.agent.role - would indicate this agent is delegated authority
  • provision.purpose - would indicate some set of authorized purposeOfUse
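
To show how those elements might fit together, here is a sketch that assembles such a Consent as a plain dict in FHIR R4 shape. It is an illustration, not the profile from the IG. The function name, the `http://example.org/...` code system, and the two codes drawn from it are hypothetical placeholders. R4 names the agent element `provision.actor` and also requires `scope`, so both appear here, and TREAT is just one example purposeOfUse.

```python
EXAMPLE_SYSTEM = "http://example.org/CodeSystem/delegation"  # hypothetical code system


def build_delegation_consent(patient_id, related_person_id, organization_id,
                             document_reference_id, policy_uri, presented_at):
    """Assemble a delegation Consent (plain dict) following the element list above."""
    return {
        "resourceType": "Consent",
        "status": "active",
        # R4 requires scope; patient-privacy is one of the R4 scope codes.
        "scope": {"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/consentscope",
            "code": "patient-privacy"}]},
        # Hypothetical category code for a delegation of authority.
        "category": [{"coding": [{"system": EXAMPLE_SYSTEM,
                                  "code": "delegation-of-authority"}]}],
        "patient": {"reference": f"Patient/{patient_id}"},
        "dateTime": presented_at,
        # Here the Patient grants the consent; a court or guardian could appear instead.
        "performer": [{"reference": f"Patient/{patient_id}"}],
        "organization": [{"reference": f"Organization/{organization_id}"}],
        "sourceReference": {"reference": f"DocumentReference/{document_reference_id}"},
        "policy": [{"uri": policy_uri}],
        "provision": {
            "type": "permit",
            "actor": [{
                # Hypothetical role code meaning "delegated authority".
                "role": {"coding": [{"system": EXAMPLE_SYSTEM,
                                     "code": "delegated-authority"}]},
                "reference": {"reference": f"RelatedPerson/{related_person_id}"},
            }],
            "purpose": [{
                "system": "http://terminology.hl7.org/CodeSystem/v3-ActReason",
                "code": "TREAT"}],
        },
    }
```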

In the case where the court or some actor that is not the Patient is compelling the RelatedPerson relationship, then the Consent.performer would indicate that the Patient is not the one granting the relationship, but rather the guardian or the courts.

One advantage of using a Consent resource as defined here is that there would be a natural set of provisions in a Consent that would be processable by an Access Control engine that understands Consent. This Access Control engine would not need to understand RelatedPerson, other than to know that a given user is a RelatedPerson (vs Patient, Person, Practitioner, etc). Thus the Consent.permit rules are used to mediate access to that Patient’s data by that given user.

Consideration

Given this setup, a newborn would need a Consent drafted as soon as that newborn has a Patient resource to enable the parents’ access. This could be done by the system creating the newborn Patient resource. This could also be done using Implied Consent mechanisms, which is a default policy that is used when no Consent exists for a given Patient->agent relationship.

Same is true for any new Patient for which there is some precedent for implied consent representative.

Forcing a Consent to exist does prove that the representative relationship is explicit, and is thus more transparent. Implied representative relationships are common, but not very transparent.

The Consent resource is not intended to be used to drive the workflow of capturing the Consent. The Consent follows the “Event Pattern”, which means that it is the output of an event. The workflow that preceded this event would need to be managed by other resources in the Request pattern.

The Task resource is generic and can do this work. There are some specializations of Task, so we could end up at some kind of a Task derivative that is specific to the workflow leading up to a Consent. However it is first best to see if Task can be profiled to address the workflow.

For example a use-case where the Patient nominates a potential Person to become their RelatedPerson; that triggering a GP to review and approve it; that triggering some legal review and approval; resulting in a Consent instance and the creation of the RelatedPerson. This workflow could be profiled into an ActivityDefinition… I like the power of this modeling concept, but have not done it formally so am not sure of all the possible issues.

Note we have tried to keep workflow states out of the Consent.status; but some states have gotten in that I don’t think are proper. But at this time we allow them in until there is a more formal task flow.

Examples

There is a basic example of a Patient delegating their father as their RelatedPerson. The resource objects are clickable to their examples.


Thursday, June 16, 2022

IHE-Connectathon around the world and back

IHE-Connectathon is scheduled for September 12-16, 2022.

Many FHIR-based IHE Profiles (Implementation Guides) will be tested, in addition to the other popular Interoperability specifications from IHE. There will even be some testing of HL7-published Implementation Guides.

I will be present in Atlanta, as hard as I have tried to be sent to Switzerland.

Wednesday, June 15, 2022

IHE Most Salient - based on specification use analytics

IHE, especially the IT-Infrastructure domain, has been publishing specifications in HTML format and Implementation Guide format on a new web site -- https://profiles.ihe.net.  

This web site is enabled with Google Analytics. Thus there is some data available that indicates which parts of the IT-Infrastructure specifications are of interest, presuming they are interesting because they are used. Google Analytics data is not perfect; there will be no analytics from people using script-blocking browsers. So, there is likely 25% more activity than the analytics show.


Just looking at the data for May 2022. The first view is that the IHE specifications are of global interest. This is dominated by the USA, but is quickly followed by Germany, Italy, France, India, Netherlands, Austria, China, Canada, and Switzerland.  From the picture, there is interest almost everywhere. Actually, I wonder why not everywhere? 

The next perspective is just growth over time. Over the month of May the number of users (as defined by Google Analytics, don't ask me to explain that) is growing. The fun part of this diagram is the purple line that shows that the readership does indeed, mostly, take the weekends off.
Last perspective is to simply look at the traffic per page, viewed as traffic interest per Profile/Specification. So the following is the list of IT-Infrastructure Profiles in ranked order. I have included some fun graphics. No, I am not going to explain the graphics.  Note that 40% of these are #FHIR 🔥 based.
  1. Cross-Enterprise Document Sharing (XDS.b) 🧼
  2. Audit Trail and Node Authentication (ATNA) Profile 🔒
  3. Mobile Access to Health Documents (MHD) 🔥
  4. Patient Identifier Cross-referencing (PIX) 😊
  5. Cross-Community Access (XCA) 🧼
  6. Patient Demographics Query (PDQ) 😊
  7. Patient Demographics Query for Mobile (PDQm) 🔥
  8. Internet User Authorization (IUA) 🔥🔒
  9. Patient Administration Management (PAM) 😊
  10. Cross-Community Patient Discovery (XCPD) 🧼😊
  11. Cross Enterprise User Assertion (XUA) 😊🔒
  12. Basic Patient Privacy Consents (BPPC) 🤝🏽📜🔒
  13. Consistent Time (CT) ⏰
  14. Patient Identifier Cross-reference for Mobile (PIXm) 🔥😊
  15. Cross-Enterprise Document Reliable Interchange (XDR) 🧼
  16. Basic Audit Log Patterns (BALP) 🔥🤝🏽
  17. Mobile Health Document Sharing (MHDS) 🔥
  18. Patient Master Identity Registry (PMIR) 🔥😊
  19. Cross-Enterprise Document Media Interchange (XDM) 🗜️📧
  20. Mobile Care Services Discovery (mCSD) 🔥
  21. Comprehensive FormatCode Vocabulary 🗂️




Tuesday, June 14, 2022

HL7 Security & Privacy Tutorial: July 12-14

HL7 FHIR Security & Privacy

The HL7 FHIR Security & Privacy online class describes how to protect a FHIR server (through access control and authorization), how to document what permissions a user has granted (consent), how to enable appropriate access by apps and users and how to keep records about what events have been performed (audit logging and provenance).

Virtual Classroom : July 12-14

This will be a refreshed version of the Tutorial I have given annually to HL7 in 2021, 2020, and 2019. Each year I do update and enrich the content. More, if you ask questions.

My slides are freely available on Google Slides at this easy-to-type address http://bit.ly/FHIR-SecPriv. Each time I give the tutorial I update these master slides. So each time you go there you will see the latest set of slides. Some slides do have notes, and there is additional detail in slides that I don't cover during the tutorial.

In the past, I have had to compress these into two parts, but this time I will be able to give them in the natural three parts.

Part 1 - Basics

  • Security Principles
  • Privacy Principles
  • Basic Security and Privacy Considerations
    • Anonymous Read
    • Business Sensitive
    • Individual Sensitive
    • Patient Sensitive
    • Not Classified
  • HTTP[S] - TLS
  • Authentication & Authorization
    • SMART on FHIR
    • IUA
    • Mutual-Authenticated TLS
  • Access Denied Responses

Part 2 - FHIR capability

  • Provenance
    • Basic
    • Digital Signature
  • Audit Logging
    • Audit Reporting
    • Audit Purging
  • Consent - for Privacy
    • HEART
    • Permission 
  • Attribute Based Access Control
    • Security Tags
    • Compartments / Clearance
    • Obligations
  • Break-Glass
  • De-Identification

Part 3 - Practical application

  • Multiple Organization Provider Directory
    • using relational linking
  • Multiple Organization Profile Directory
    • using security tags as compartments with clearance
  • Extra-Sensitive Treatment
    • Share with Protections
  • De-Identified Research

Note that ALL of these topics have been covered in this blog. See Security Topics, Consent/Privacy, and FHIR for an index to these articles.
   

Wednesday, May 18, 2022

Patient data embargo management

There are legitimate reasons for data to be embargoed for some timeframe. I am not a fan of these reasons, but as a Privacy and Security subject matter expert, I get asked how to solve these business needs. Many think this is an easy problem to solve, just slap a security tag on the data, but it is a much bigger systems-engineering problem.

Embargo Use-case

The clearest embargo reason is a patient safety reason, preventing a Patient from seeing a particularly damaging observation, until their primary care physician can have a one-on-one discussion. For example, a lab result that clearly indicates cancer. The clinical expectation is that the primary care physician can break the news more carefully or provide a more complete explanation.

The timeframe is not always clear, and most timeframes get cut short when some activity happens. Such as the above example, once the primary care physician has had the conversation then the embargo should stop.

These embargo timeframes can be just a few hours but can also be many months. I understand from a discussion on FHIR chat that there are countries with regulations that allow the timeframe to be up to six months.

Oh, and data embargoed from the Patient would also need to be embargoed from their delegates (parent, guardian, etc).  

I don't know of embargo use-cases for Clinicians, no idea either on Payers. I could imagine that authorized research would or should be embargoed with the same rules as for Patients.

Meta Data

The security label (.meta.security) is not a great place to handle this, due to the variability. The security label could be used to tag data as falling into the category where an embargo might apply, subject-to-embargo. This would not indicate that an embargo does apply, just that the data qualifies for a potential embargo. It is not clear what initial data analysis would be able to set this tag, but it is possible that there might be some set of codes and conditions that would be detectable. It might also be that all data is presumed to be subject to an embargo until data analysis or a clinician explicitly marks it as not-subject-to-embargo.

So, at this point we have a tag that says the data is either subject-to-embargo, or not-subject-to-embargo. Either method gives us the same state, that of a method to determine which data needs to have some timeframe applied vs which data does not.

When a clinician has a discussion with the Patient, and thus the embargo timeframe should be cut-short, then the clinician can just remove the subject-to-embargo or change it to no-longer-subject-to-embargo. Thus, the mechanism for counting-down the timeframe no longer applies.
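
The tagging described in this section can be sketched as two small helpers working on a resource held as a plain dict. The code system URL is a hypothetical placeholder, since no standard system for these codes is named here; the codes themselves are the ones used in the text.

```python
EMBARGO_SYSTEM = "http://example.org/CodeSystem/embargo"  # hypothetical code system


def set_embargo_tag(resource: dict, code: str) -> dict:
    """Replace any existing embargo label with `code`, leaving other labels alone."""
    meta = resource.setdefault("meta", {})
    labels = [c for c in meta.get("security", []) if c.get("system") != EMBARGO_SYSTEM]
    labels.append({"system": EMBARGO_SYSTEM, "code": code})
    meta["security"] = labels
    return resource


def is_subject_to_embargo(resource: dict) -> bool:
    """True when the resource carries the subject-to-embargo label."""
    return any(
        c.get("system") == EMBARGO_SYSTEM and c.get("code") == "subject-to-embargo"
        for c in resource.get("meta", {}).get("security", [])
    )
```

After the clinician has the conversation, a call such as `set_embargo_tag(obs, "no-longer-subject-to-embargo")` would cut the embargo short. Because only `.meta.security` is touched, the same helpers work for any resource type.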

Preliminary data status

An alternative that does not use the .meta.security tagging is to just use the .status element on the data. Most, likely all, data that would be potentially subject to an embargo has a .status element, and the vocabulary available for the .status element has a preliminary code. This alternative has all data first published as preliminary, and only after some data analysis does it get set to final. This data analysis might be automated or might be clinician driven. In this case any data marked as preliminary would be embargoed from Patient access. One might expect that a Patient might not be given any data that is not in the final status.

The benefit of this method is that it is leveraging elements that have other clinical uses, but the drawback is that the security infrastructure must be aware of specific FHIR Resources like Observation. Further, this method will only work for FHIR Resources that do have a .status element, whereas the .meta.security is in the exact same place in all FHIR Resources, so the security infrastructure only needs to understand the most basic of FHIR Resource.

Another potential drawback is that this dual purpose of the .status element may interfere with appropriate lifecycle management of the data.
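
A minimal sketch of the status-based alternative follows, as a filter applied before results are returned to a Patient. The function name and the `final_only` switch are my own; the switch covers the stricter reading in which the Patient sees nothing that is not final. Resources without a .status element pass through, which is exactly the limitation noted above.

```python
def visible_to_patient(resources: list, final_only: bool = False) -> list:
    """Drop resources that the status rule would embargo from Patient access."""
    visible = []
    for resource in resources:
        status = resource.get("status")
        if status is None:
            # No .status element: this method cannot embargo the resource.
            visible.append(resource)
        elif final_only:
            if status == "final":
                visible.append(resource)
        elif status != "preliminary":
            visible.append(resource)
    return visible
```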

Timeframe Management

As indicated in the use-case, the timeframe for automatic expiration of the embargo often varies by setting, data type, and clinician assessment. Where any timeframe exists there needs to be some mechanism to address the timeframe, but where the length of the timeframe is not fixed this makes the problem more difficult.

Count Down Clock

First solution, come up with some set of timeframes that fit the need, and assign them a code. Use that code on the .meta.security.
  • embargo-2-hours
  • embargo-2-days
  • embargo-1-week
  • embargo-2-weeks
  • embargo-1-month
  • embargo-2-months
  • embargo-6-months
As you can see, this would be possible if the number of quanta is small, but it gets out of control really quickly.

Another alternative is to add an extension with an integer. The integer would be similar to the above in that it would identify some time that would need to elapse.

Both a set of codes and an integer present the problem of time-elapsed-relative-to-what? The .meta.lastUpdated element (the _lastUpdated search parameter) is available, but it gets updated whenever the data changes. Thus any update resets the count down clock.

You could use a Resource specific element, like Observation.issued. Like above with using the Observation.status, using the Observation.issued is elegant but does mean the security layer does need to know about Observation rather than just Resource.
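
The count down clock can be sketched by mapping each code to a duration and anchoring it on Observation.issued. The month lengths (30 days each) are my simplification, and the lookup ignores the code system for brevity; a real implementation would match on both system and code.

```python
from datetime import datetime, timedelta

# Durations for the example codes; a "month" is assumed to be 30 days.
EMBARGO_DURATIONS = {
    "embargo-2-hours": timedelta(hours=2),
    "embargo-2-days": timedelta(days=2),
    "embargo-1-week": timedelta(weeks=1),
    "embargo-2-weeks": timedelta(weeks=2),
    "embargo-1-month": timedelta(days=30),
    "embargo-2-months": timedelta(days=60),
    "embargo-6-months": timedelta(days=180),
}


def embargo_ends(observation: dict):
    """Return when the embargo lifts, or None if no embargo code or issued time."""
    issued = observation.get("issued")
    if issued is None:
        return None
    for label in observation.get("meta", {}).get("security", []):
        duration = EMBARGO_DURATIONS.get(label.get("code"))
        if duration is not None:
            # Anchored on issued, so later updates do not reset the clock.
            return datetime.fromisoformat(issued) + duration
    return None
```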

End Time

I would recommend that if an extension is being added, that it rather be a datatype of dateTime, or Period. The meaning of the value would be the date/time after which the embargo is lifted. In this case there is no need to look elsewhere. 

For efficiency's sake, once the time has expired, the .meta.security should be set back to not embargoed. Thus the date comparison only needs to be done on data with an active, or about to expire, embargo.
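
Here is a sketch of the end-time approach, assuming a dateTime extension carried at the top level of the resource. The extension URL and the code system URL are hypothetical placeholders, and the stored timestamps are assumed to carry an explicit offset.

```python
from datetime import datetime, timezone

EMBARGO_SYSTEM = "http://example.org/CodeSystem/embargo"  # hypothetical
EMBARGO_END_URL = "http://example.org/StructureDefinition/embargo-end"  # hypothetical


def embargo_end(resource: dict):
    """Read the embargo end time from the (hypothetical) extension, if present."""
    for ext in resource.get("extension", []):
        if ext.get("url") == EMBARGO_END_URL:
            return datetime.fromisoformat(ext["valueDateTime"])
    return None


def is_embargoed(resource: dict, now=None) -> bool:
    """True while the end time exists and has not yet passed."""
    now = now or datetime.now(timezone.utc)
    end = embargo_end(resource)
    return end is not None and now < end


def release_if_expired(resource: dict, now=None) -> bool:
    """After the end time, relabel the resource so later checks can skip it."""
    now = now or datetime.now(timezone.utc)
    end = embargo_end(resource)
    if end is None or now < end:
        return False
    for label in resource.get("meta", {}).get("security", []):
        if label.get("system") == EMBARGO_SYSTEM and label.get("code") == "subject-to-embargo":
            label["code"] = "no-longer-subject-to-embargo"
    return True
```

A periodic job could call `release_if_expired` only on resources that still carry the subject-to-embargo label, which keeps the date comparison off the vast majority of data.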

Permission

The security wg is working on a Permission resource. It is very drafty at this point, not worth looking too closely at, although we are welcoming use-cases to help drive our design. Note that in this case I am going to use Permission in a negative way, that is that the Patient is Denied access while the Permission is valid. For efficiency, the use of Permission would seem tied to data with a .meta.security with subject-to-embargo. The flag would tell the security layer to go look for a Permission resource instance that applies. That Permission resource has a .validity element that would indicate when the Permission expires.

Note that although Permission has a .validity element, it does not have a way to express Deny.

Note, like the FHIR Permission, the IT Security infrastructure might be able to do everything with no evidence in the FHIR world. That is, an XACML or other access control engine could be given the embargo information for a given resource identifier, it would enforce that rule, and it would flush that rule out when it expires. Thus there would be no FHIR evidence of this rule. 

Expiring the embargo

Some of these mechanisms will automatically expire the embargo, some can have automated expiration of the embargo, but other mechanisms would require a human to disable the embargo. There should be mechanisms in place to assure that the embargo does eventually expire. For example, when the status of preliminary is used, some mechanism should detect that the data has been in the preliminary state for too long. This detection might simply alert the primary care physician, or might automatically disable the embargo.
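
The safety net for the preliminary-status method could look like the sketch below: a periodic scan that reports Observations left in preliminary for longer than some limit. The function name and the choice to measure age from Observation.issued are assumptions; what to do with the result (alert the physician, or flip the status) is left to the caller.

```python
from datetime import datetime, timedelta


def overdue_preliminary(observations: list, now: datetime, max_age: timedelta) -> list:
    """Return ids of Observations that have been preliminary for longer than max_age."""
    overdue = []
    for obs in observations:
        issued = obs.get("issued")
        if obs.get("status") != "preliminary" or issued is None:
            continue
        # Assumption: issued carries an explicit offset, and now is timezone-aware.
        if now - datetime.fromisoformat(issued) > max_age:
            overdue.append(obs["id"])
    return overdue
```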

Abuse for Illegitimate reasons

The method used for these use-cases can certainly be used for illegitimate reasons. I suspect that many "data blocking" activities are using these legitimate excuses when there is not a legitimate reason. The concern of all Privacy professionals is this abuse. Many would prefer we have no mechanism for legitimate embargo, but that is not reasonable. Thus my approach is to have mechanisms that are clearly designed, and transparent. 

Transparency is key to Privacy. When a Patient is allowed to know how their data are used, and why restrictions are in place, the Patient is better informed. Thus I would prefer the Patient has access to an Audit Event log of all uses, or attempts to use, their data.  See the IHE Basic Audit Implementation Guide.

Conclusion

Given all of this, I would first look at the use-cases and see if they always apply to Observation. If so, then I would use the Observation.status and Observation.issued. I next would ask if there is a fixed, or small number of fixed, increments. If there are, then a code could be used for that fixed time. I am familiar with a fixed 2 days timeout, after which the embargo automatically expires. I would then have a security label code for subject-to-embargo, and no-longer-subject-to-embargo. I would have the second code so that it was clear that an embargo was enforced. I would always want the subject-to-embargo code to get removed at some point so as to limit the overhead for the vast majority of data, data that has never been subject to an embargo or data that has an expired embargo.