Sunday, July 29, 2018

Privacy and Security Considerations for the use of Open APIs for Patient Directed Exchange.

I have the great honor to be hosting a panel discussion in Washington DC as part of the Office of the National Coordinator's 2nd Interoperability Forum. This event is next week, August 6-8. My panel is scheduled for the afternoon of August 7th, from 1:30 to 3:00 pm. My panel title is "Privacy and Security Considerations for the use of Open APIs for Patient Directed Exchange."

Here is the main vision for this panel:
Assuming that we agree that the patient advocates’ and privacy advocates’ vision is our goal, what lack of standards is getting in the way? This is not a discussion about what the goal is; rather, it is meant to focus on what is still preventing the big stakeholders from embracing this vision.

General flow

Before my panel are a "Blockchain" topic and an "Identity and Trust" topic; after my panel is a Lightning round where some new tech will be showcased.

Where Blockchain and the Lightning round will clearly be looking to new and shiny tech, I am hopeful that Identity/Trust and my segment will be more grounded in reality.

I have invited to my panel individuals representing very large, very well established organizations. I did this because we don't get to hear publicly from this perspective. Mostly that is because when this size of an organization makes a change, it affects MANY. However, this size of an organization can't make quick changes. This size of an organization can't make risky changes. This size of an organization NEEDS some standard to guide them. This standard must be mature and have partner acceptance.

Yes, sometimes a large organization can lead. This does happen. But it happens toward a standard. An example is Apple adopting FHIR.

What's a Standard?

In this panel, "standards" is taken in the broadest view, inclusive of:
  • Interoperability Standards -- from the likes of HL7, FHIR, and IHE;
  • Vocabulary Standards -- from the likes of HL7, SNOMED, LOINC, IEEE, ISO, etc;
  • Implementation Guides -- specific use-case analysis with specific solutions -- From Argonaut, IHE, or ONC;
  • Standards of Practice -- professional society guidance from HIMSS, AMA, and other medical professional societies;
  • Standard policy framework -- Legal framework that encompasses many regulations and defines appropriate use and responsibilities;
  • Trust Framework -- Multi-party trust agreement that binds the parties to a set of rules and mitigations, backed by technology (like Certificate Authority). For example Sequoia DURSA or DirectTrust;
  • Reference Implementation -- software provided in open-source by a consensus body as an implementation of a standard. Such as the many FHIR open source projects;
  • Standard Interpretation of Regulation -- like HHS/ONC has done, for example, with the use of email with patients; and
  • Laws and Regulations -- we all hope for as few regulations as possible, but sometimes they are needed.

Ideal Patient Centered Privacy

Here are my notes extracted from my blog on Privacy Principles. I hope not to cover these in this detail, but am ready to if need be. I think it is important to recognize ALL of these Principles, not just "Consent".
  1. Collection Limitation Principle -- There should be limits to the collection of personal data and any such data should be obtained by lawful and fair means and, where appropriate, with the knowledge or consent of the data subject.
  2. Data Quality Principle -- Personal data should be relevant to the purposes for which they are to be used, and, to the extent necessary for those purposes, should be accurate, complete and kept up-to-date.
  3. Purpose Specification Principle -- The purposes for which personal data are collected should be specified not later than at the time of data collection and the subsequent use limited to the fulfilment of those purposes or such others as are not incompatible with those purposes and as are specified on each occasion of change of purpose.
  4. Use Limitation Principle -- Personal data should not be disclosed, made available or otherwise used for purposes other than those specified in accordance with Paragraph 9 except: a) with the consent of the data subject; or b) by the authority of law.
  5. Security Safeguards Principle -- Personal data should be protected by reasonable security safeguards against such risks as loss or unauthorised access, destruction, use, modification or disclosure of data.
  6. Openness Principle -- There should be a general policy of openness about developments, practices and policies with respect to personal data. Means should be readily available of establishing the existence and nature of personal data, and the main purposes of their use, as well as the identity and usual residence of the data controller.
  7. Individual Participation Principle -- An individual should have the right: a) to obtain from a data controller, or otherwise, confirmation of whether or not the data controller has data relating to him; b) to have communicated to him, data relating to him within a reasonable time; at a charge, if any, that is not excessive; in a reasonable manner; and in a form that is readily intelligible to him; c) to be given reasons if a request made under subparagraphs (a) and (b) is denied, and to be able to challenge such denial; and d) to challenge data relating to him and, if the challenge is successful, to have the data erased, rectified, completed or amended.
  8. Accountability Principle -- A data controller should be accountable for complying with measures which give effect to the principles stated above.

Modes of Communication

I am not constraining the mode of healthcare data communication. I want us to be inclusive of "Mediated" exchange, "Directed" exchange, "Controlled" exchange, and "Negotiated" exchange. I have not seen these formally defined, so here is my informal definition. Let me know if you know of another mode of communication.
  • Mediated Exchange -- where the Patient themselves is an active part of the communication pathway, such as carrying the data in their possession using a personal device and application. For example, using a phone-resident App that uses FHIR to download their data, then upload that data to some recipient.
  • Directed Exchange -- where the Patient actively requests that the information flow to a selected destination. For example, a patient using Direct Secure Messaging, or a patient requesting that the data be pushed.
  • Controlled Exchange -- where the Patient does not get directly involved in the communication, but should understand the communication and possibly have control over it. For example, using a Health Exchange between Provider organizations.
  • Negotiated Exchange -- where the Patient themselves connects two parties and authorizes the flow between those two parties. This might use the HEART standard for authorization, and FHIR bulk data access.

Are we there yet?

So, standards in the broadest definition are important to the large organization. So I want to hear from them what standard is still needed. What lack of a standard is preventing them from achieving the vision of the Privacy and Patient advocates?

I would love to hear: "Nothing is needed, we are already there." I think we are closer than many think. I know that my efforts within the VA on their Patient Portal -- My HealtheVet -- show that they are really close.

I do expect there are still some standards needed. Identity? Authentication? Consent? Care-Team? Provenance? Data-Tagging? Obligations? App-Validation? App-Store?

I certainly have blog articles on many of these topics: FHIR, Privacy/Consent, Health Exchange, Blockchain in Healthcare, De-Identification, Patient Identity, Direct, and even GDPR.

Saturday, July 21, 2018

Timebound XDS queries done right

As the author of the soon to be published IHE "Document Sharing Metadata Handbook", I have been involved in some very deep and disturbing discussions on how to do timebound queries in XDS/XCA. I say very deep because this discussion included almost a dozen of the best minds on the XDS Metadata and Query models. Disturbing because the discussion showed that the simple concept of timebound queries in XDS/XCA is not understood well. Perplexing because we have figured this out many times. If it takes us this long to re-invent this understanding, then it must be much harder for others.

Mistakes have been made

I wrote an article on our first attempt. I thought it was good. It was not wrong. But it would have resulted in false-positives and false-negatives that can be avoided. See "Basics of doing Document Sharing Query right".

For the version of the Metadata Handbook that we sent to public comment, we flipped the logic, thus making a very bad mistake.

Returning to the mostly right logic I had in my article, we determined a couple of optimizations this week in the solution we came up with.

One adjustment, to add wiggle-room on the query parameters, helps because although we want everyone to have well synchronized clocks, many of these times are based on human statements of start and stop. Thus adding wiggle-room extends the times you are looking for, putting your start a bit earlier and your stop a bit later.

The other adjustment is to use the other two service time query parameters to eliminate document entries that have only one of the service times (start or stop, but not both). Clearly if something stopped before the time you are interested in, then it is not what you are looking for; the same is true of something that didn't start until after the timeframe you are looking for.

When we got to the final understanding, it became clear that it is possible our readers don't understand this either. Some form of this might end up in the Technical Framework, as this handbook (and blog) have very limited audiences. We also felt we had done this before, and had written up changes to the Technical Framework. We had, but had only discussed the CreationTime, which is a point in time. Service Time is a range, which brings more complexity...

serviceStartTime - serviceStopTime 

When there is a timerange of the service event that you are interested in, you will query against the serviceStartTime and serviceStopTime metadata elements to find documents that indicate they fit your timerange. The service times are specific to the time range of the treatment or episode. This is different than the document creation time, which is when the document was created. The query results will return any document whose “service time” falls within that range. It is important to note that these parameters work together to give a period of time.

Given you are interested in a specific time range (Start -> Stop).

The serviceStartTimeFrom and serviceStopTimeTo parameters are clear: they should bound that time, with a little slop to deal with poorly synchronized clocks:

  • serviceStartTimeFrom parameter in the query should be set to a few minutes before the time you are interested in being the Start of the service time range
  • serviceStopTimeTo parameter in the query should be set to a few minutes after the time you are interested in being the Stop of the service time range

When either or both service times are missing on a DocumentEntry, it will be included in the above query results. So we need to look for ways to eliminate these false-positives.

Some DocumentEntries will have a service start time but not have a service stop time. This is common in chronic care, radiology, and other circumstances where the end of the service has not happened or where the end is unknowable;  therefore you should include a query parameter that would eliminate DocumentEntries that have a declared start time well after the time range you are interested in:

  • serviceStartTimeTo parameter in the query should be set to a few minutes before the time you are interested in being the Stop of the service time range

Some DocumentEntries will have a service stop time but not a service start time. This is not common, but will happen where there is no clear start time to an observation; therefore you should include a query parameter that would eliminate DocumentEntries that have a declared stop time well before the time range you are interested in:

  • serviceStopTimeFrom parameter in the query should be set to a few minutes after the time you are interested in being the Start of the service time range
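
The four parameters above can be sketched in a few lines of Python. This is only an illustration, not text from the handbook: the function names, the five minute default slop, and the use of plain 14-digit YYYYMMDDhhmmss strings are my assumptions, and the datetimes are assumed to already be in the timezone your community uses for metadata. The short parameter names are the ones used in this article; an actual query would carry them under their full stored query names.

```python
from datetime import datetime, timedelta


def dtm(moment: datetime) -> str:
    """Format a datetime as a 14-digit YYYYMMDDhhmmss string (assumed format)."""
    return moment.strftime("%Y%m%d%H%M%S")


def service_time_params(start: datetime, stop: datetime,
                        slop: timedelta = timedelta(minutes=5)) -> dict:
    """Build the four service time parameters for a Start -> Stop range of interest."""
    return {
        # Bound the range, widened by the slop on both sides.
        "serviceStartTimeFrom": dtm(start - slop),
        "serviceStopTimeTo": dtm(stop + slop),
        # Upper bound on the declared start: a few minutes before Stop.
        "serviceStartTimeTo": dtm(stop - slop),
        # Lower bound on the declared stop: a few minutes after Start.
        "serviceStopTimeFrom": dtm(start + slop),
    }


params = service_time_params(datetime(2018, 7, 1, 8, 0), datetime(2018, 7, 1, 17, 0))
for name, value in params.items():
    print(name, value)
```

The size of the slop is a community decision; a few minutes is what is suggested above, and the default here is just a placeholder.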

Some DocumentEntries will have neither service start nor stop. These will be returned regardless of any timeframe query parameters. Your Community Metadata Specification should encourage all metadata publishers to populate the serviceStartTime and serviceStopTime elements as much as possible to avoid false-positive query results.
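
To see how the four parameters behave against the different kinds of DocumentEntry, here is a small local model of the matching described in this article. It is a sketch under assumptions, not a registry implementation: it assumes that a missing service time is never filtered out, that "From" is an inclusive lower bound and "To" an exclusive upper bound, and that all timestamps are full 14-digit strings so that plain string comparison orders them correctly. The function names and sample values are made up for illustration.

```python
from typing import Optional


def in_bounds(value: Optional[str], low: str, high: str) -> bool:
    """A missing value is not filtered out; a present one must satisfy low <= value < high."""
    return value is None or low <= value < high


def entry_matches(start: Optional[str], stop: Optional[str], params: dict) -> bool:
    """Apply all four service time parameters to one entry's start and stop."""
    return (in_bounds(start, params["serviceStartTimeFrom"], params["serviceStartTimeTo"])
            and in_bounds(stop, params["serviceStopTimeFrom"], params["serviceStopTimeTo"]))


# Range of interest: 2018-07-01 08:00 -> 17:00, with five minutes of slop.
params = {
    "serviceStartTimeFrom": "20180701075500",
    "serviceStartTimeTo": "20180701165500",
    "serviceStopTimeFrom": "20180701080500",
    "serviceStopTimeTo": "20180701170500",
}

entries = {
    "inside range": ("20180701090000", "20180701100000"),
    "start only, after range": ("20180801090000", None),
    "stop only, before range": (None, "20180601090000"),
    "no service times": (None, None),
}
results = {label: entry_matches(start, stop, params)
           for label, (start, stop) in entries.items()}
for label, matched in results.items():
    print(label, matched)
```

In this model the two one-sided entries are eliminated by serviceStartTimeTo and serviceStopTimeFrom, while the entry with no service times still comes back, which is the false-positive that has to be handled after the query.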

Post processing to eliminate false-positives

Ultimately one will get false-positive results from the query; the solution is to look at ALL of the metadata to find reasons to eliminate these false-positives.
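
One way to start that post-processing is to split the returned entries into those whose service times confirm the match and those that need a closer look at the rest of their metadata. The sketch below is only one possible approach; the Entry type, its field names, and the triage function are hypothetical, and timestamps are again assumed to be full 14-digit strings.

```python
from typing import List, NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    """Hypothetical, minimal view of a returned DocumentEntry."""
    entry_id: str
    service_start: Optional[str]
    service_stop: Optional[str]


def triage(entries: List[Entry], range_start: str,
           range_stop: str) -> Tuple[List[Entry], List[Entry]]:
    """Split results into confirmed matches and entries needing further review."""
    confirmed, review = [], []
    for entry in entries:
        if entry.service_start is None or entry.service_stop is None:
            # Service times alone cannot confirm this one; check other metadata.
            review.append(entry)
        elif entry.service_start <= range_stop and entry.service_stop >= range_start:
            confirmed.append(entry)
        # Entries with both times that do not overlap the range are dropped.
    return confirmed, review


returned = [
    Entry("a", "20180701090000", "20180701100000"),
    Entry("b", "20180701090000", None),
    Entry("c", None, None),
    Entry("d", "20180601090000", "20180601100000"),
]
confirmed, review = triage(returned, "20180701080000", "20180701170000")
print([e.entry_id for e in confirmed], [e.entry_id for e in review])
```

What to do with the review list depends on the community: other metadata elements, or a human, may be needed to decide.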


The only real way to avoid false-positives is to force all DocumentEntries in the Community to have at least ONE service time. Most of the time Service Start Time can be determined. It might be only to the accuracy of the day, or month, or year. But even that eliminates many false-positives. When neither of these times is available, one is sure to have false-positives. For the Metadata Handbook, this is a fantastic observation, as it can become a community rule that at least one, and preferably both, of the times must be filled in.
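
A publisher could check such a community rule before submitting metadata. The following is a rough sketch; the function name is made up, and it assumes timestamps are digit strings that may be truncated to year, month, day, hour, minute, or second precision.

```python
import re
from typing import Optional

# Assumption: YYYY, optionally followed by MM, DD, hh, mm, ss in that order.
DTM_PATTERN = re.compile(r"^\d{4}(\d{2}){0,5}$")


def meets_service_time_rule(service_start: Optional[str],
                            service_stop: Optional[str]) -> bool:
    """True when at least one service time is present and every present one is well formed."""
    present = [t for t in (service_start, service_stop) if t]
    return bool(present) and all(DTM_PATTERN.match(t) is not None for t in present)


print(meets_service_time_rule("2018", None))   # a year-only start satisfies the rule
print(meets_service_time_rule(None, None))     # neither time present violates it
```

Even a year-only start time passes this check, which matches the point above that coarse accuracy is still far better than nothing.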

Yes, we know that there are MANY other query parameters. Yes, we know that one could design better query language support. The point though is that this is the system we have, AND when used properly it will work just fine. Any data publisher that doesn't follow the rules will mess up any well designed system. This is not a case of a poorly designed query system; it is simply a fact. Query is at the mercy of the data: bad data gives bad query results.