Monday, June 25, 2018

FHIR patient extensible data portability

Discussions at FHIR DevDays raised the question of whether there is a basic way to leverage FHIR API capability to empower the Patient beyond the limited set of Apps their provider has approved.

If a healthcare provider offers a FHIR API (e.g. Argonaut) that meets the Meaningful Use "API" requirement, would it be reasonable to expect that the Meaningful Use "Download" capability (from View/Download/Transmit) would be offered in FHIR format?

Proposed quick-and-dirty solution?

This could be something like a zipped Bundle containing the results of $everything for that Patient, given that patient as the user.  The zipping would help with the potential huge size.
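
As a rough sketch of the idea, the following Python zips a Bundle and reads it back. The example Bundle is a hand-written stand-in for what a server might return from Patient/{id}/$everything; the function names and the member file name are assumptions made for illustration, and no real server is contacted.

```python
import io
import json
import zipfile


def zip_bundle(bundle: dict, member_name: str = "everything.json") -> bytes:
    """Serialize a FHIR Bundle (held as a dict) to JSON and return zip bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, json.dumps(bundle))
    return buffer.getvalue()


def unzip_bundle(data: bytes, member_name: str = "everything.json") -> dict:
    """Read the Bundle back out of the zip bytes produced by zip_bundle."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return json.loads(archive.read(member_name))


# Hypothetical stand-in for the result of GET [base]/Patient/example/$everything
example_bundle = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "example"}},
        {"resource": {"resourceType": "Observation", "id": "obs-1", "status": "final"}},
    ],
}

download = zip_bundle(example_bundle)
restored = unzip_bundle(download)
```

A real download would stream the server response rather than hold the whole Bundle in memory, which matters given the sizes discussed here.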

The amount of data would be limited to that which is available on the FHIR API offered, which is often not all possible data.

A concern comes up when this export of $everything includes other data such as documents and images. In theory they are all necessary to export, but the resulting size might be overwhelming. (Unfortunately compression is less helpful with documents and images: even where they are slight variations of mostly the same content, the base64 encoding makes them all look very different.)

Should this be a FHIR Document?

It might be more well-formed to have this be a FHIR Document, but it is not clear to me that adds much benefit over simply an export of all data that is known (aka $everything). 

There would be a benefit in that the Bundle would then be far more identifiable as coming from a specific Organization at a specific time for a specific purpose. This could also be achieved simply with a Provenance resource on the Bundle. This might be very important to a downstream recipient of this blob, so that it can be authenticated and proven complete (Principles of a Document).

Essentially the CCD provides this today, and is often the solution for "download", so this is just a different mime-type (FHIR).

I think just getting an export is more important than making sure it is a FHIR Document.

Why would anyone do this or want this?

The reason why this is useful is for patients that want to use Apps that are not (yet) approved by that provider.  Waiting for each of their providers to approve a useful App can be limiting on the Patient and on the Marketplace.

Counter this with the potential stupid patient that doesn't realize what they are doing with their data... This is going to happen, and when it does there will be an outcry. This outcry will complain that the provider was not acting as a parent to that child, but this kind of outcry should not be a reason not to do this. Better that the patient be warned, but not forbidden from getting all their data in a download of FHIR format.

Who does this?

If this is reasonable, who does this today? 


Note that this would also be helpful in support of GDPR Article 20 - Right to data Portability

Wednesday, June 13, 2018

IHE ITI Document Sharing Metadata Handbook Published for Public Comment

Those implementing or improving their XDS or XCA community are the intended audience of this Handbook. As a Handbook, it strives to inform on critical aspects of Metadata management. Please review and comment. Please forward to those with experience.

IHE IT Infrastructure Handbook Published for Public Comment

The IHE IT Infrastructure Technical Committee has published the following new handbook for public comment in the period from June 13 through July 13, 2018:
  • Document Sharing Metadata - Rev. 1.0 
The document is available for download. Comments submitted by July 13, 2018 will be considered by the IHE IT Infrastructure Technical Committee in developing the subsequent version of the handbook. Comments can be submitted at ITI Public Comments.

Saturday, June 9, 2018

Presenting IHE on FHIR at FHIR-DevDays

I am getting excited to give two IHE on FHIR tutorials at FHIR DevDays in Boston. The FHIR DevDays event is focused on educating IT professionals in the newest standard in healthcare, FHIR – Fast Healthcare Interoperability Resources. The FHIR specification may be the newest standard in healthcare, but it is backed by all those that have developed the previous standards (e.g. HL7, CDA, IHE, XDS,  DICOM), taking the best from experience and using the latest in RESTful API technology. For more on FHIR Dev Days 


Both tutorials will focus on IHE use of FHIR. The first tutorial will focus on a high-level view of how IHE and FHIR interact, and a broad view of the current 22 IHE Profiles that leverage FHIR. This tutorial will be most interesting to a general audience interested in Standards organizations and evolution of IHE Profiles. This tutorial is a shortened version of the IHE-on-FHIR tutorial I gave at the HL7 Workgroup meeting in Cologne.


The second tutorial will go deeper into the IHE use of FHIR to enable API access to Document Sharing infrastructures based on XDS and/or XCA. These Document Sharing environments exist within USA, Canada, Europe, and elsewhere. The Document Sharing environment provides a way for Clinical Practitioners to publish documents (e.g. CDA, C-CDA, DICOM, PDF, etc) such as medical summary, episode of care, discharge summary, laboratory results, x-ray, CT, MRI, and others. The IHE MHD Profile defines how FHIR enabled apps can discover these documents, retrieve them, and publish new documents.

Advanced IHE Profiles that will be discussed show how these documents could be decomposed into FHIR Resources that are more naturally made available, with Provenance linkage back to the source document from which the FHIR Resource information comes.


The emergence of HL7 FHIR is very exciting, and the collaboration with IHE shows a strong endorsement. I am always excited to work with this new platform to accelerate the advancement of Healthcare.

My blog articles on the topic of FHIR, Document Sharing, and Privacy

Monday, May 21, 2018

Erasure Receipt

During the GDPR discussions at the HL7 workgroup meeting in Cologne, we uncovered a potential 'nice to have' in the general information technology space, an 'Erasure Receipt'. The idea is that GDPR includes Article 17 the Right to Erasure (Recital 65 - Right of rectification and erasure), which is similar to the 'Right to be Forgotten' (Recital 66 - Right to be forgotten). In GDPR there are requirements that the data controller must pass on the Erasure request to other downstream Controllers that they have disclosed the data to; AND they must inform the Individual of each of these downstream Controllers (Article 19 - Notification obligation regarding rectification or erasure of personal data or restriction of processing). The Erasure Receipt would focus on making statements about the act of Erasure. 

I think this would be good to get as domain independent, not something that Healthcare does alone.

Like a Consent Receipt

This is much like the "Consent Receipt" work that Kantara has developed, where the Consent Receipt is a consistent concept that states the facts about a Consent that an individual has agreed to. The first versions of this Consent Receipt were not structured or coded, but had some requirements on the text, and would be delivered to the Individual. A "Consent Receipt", much like any cash register receipt, has very little use when everything works as expected, but is there as evidence in the case where things do not progress as expected. Specifically, when the terms of the Consent are not enforced, the Individual can leverage their Consent Receipt against the violating custodian.

Erasure Receipt

So an "Erasure Receipt" would be given to the Individual after they have asked for data to be Erased. When that Erasure works as expected, the Erasure Receipt has very little usage. However if at a later time it is found that the data was not properly erased, then the Erasure Receipt can be used against the violating custodian. We also envisioned that the Erasure Receipt might be useful to probe the custodian to check that there is no current evidence of the data that was erased. So the Erasure Receipt is an artifact that shows due diligence, transparency, and trustworthiness. 

One reason why an Individual might request Erasure is when they withdraw their consent. In this case the Erasure Receipt and Consent Receipt might be the same.
"... where a data subject has withdrawn his or her consent or objects to the processing of personal data concerning him or her, ..."

Requirements of an Erasure Receipt

I am not a lawyer, so this is not legal advice... So the overall requirements that I think an Erasure Receipt has are:
  • Date of Erasure Request
  • Date of Erasure Receipt (GDPR Article 12(3) requires a response within one month of the request, extendable by two further months)
  • Jurisdiction
  • Human Language
  • Identification of the individual
  • Identification of the data controller
  • Description of data to be Erased
    • Purpose of Use the data was collected under
    • Type of data that was collected
    • Identifier of previously captured Consent Receipt
  • Exceptions
    • Reason why data could not be Erased (e.g. Medical Records Retention, Obligation to Report)
    • Identification of Purpose and Type of data not deleted
  • Success
    • Identification of Purpose and Type of data deleted
    • Method used to Erase (e.g. Deleted, De-Identification, etc)
  • Downstream Recipients
    • For every downstream Recipient of the data being asked to be Erased.
    • Identification of downstream Processor
    • Response if any received from request made to downstream Recipient
  • Pseudonym -- given the Individual has been Erased, a pseudonym (i.e. GUID) can be assigned to the remaining data, proof of erasure, and the Erasure Receipt.
    • This might be useful by the individual in the future to probe the erasure facts
    • This might be most useful where the data are de-identified and maintained for other required purposes. A probe of the pseudonym would show integrity of that data, while assuring the Controller no longer knows who the individual is.
Once the Erasure has happened, and the Erasure Receipt has been delivered, the Custodian must erase the individual details around the Erasure Request. Thus the power of the Erasure Receipt is that it is placed into the Individual's hands, and only that Individual's. Thus the Erasure Receipt likely does need to be Digitally Signed by the Custodian.
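
To make the list above more concrete, here is a minimal sketch of an Erasure Receipt as a structured record with a tamper-evident seal. Everything in it is an assumption for illustration: the field names, the canonical JSON form, and especially the use of an HMAC. An HMAC is a symmetric stand-in only; a real Digital Signature by the Custodian would use an asymmetric key so that the Individual can verify the receipt without holding the Custodian's secret.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class ErasureReceipt:
    # Hypothetical field names, loosely following the requirements list above.
    request_date: str
    receipt_date: str
    jurisdiction: str
    language: str
    individual: str
    data_controller: str
    pseudonym: str
    erased: List[str] = field(default_factory=list)
    exceptions: List[str] = field(default_factory=list)
    downstream_recipients: List[str] = field(default_factory=list)


def canonical_bytes(receipt: ErasureReceipt) -> bytes:
    """Stable serialization so the same receipt always yields the same bytes."""
    return json.dumps(asdict(receipt), sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(receipt: ErasureReceipt, key: bytes) -> str:
    """HMAC-SHA256 over the canonical form (stand-in for a real signature)."""
    return hmac.new(key, canonical_bytes(receipt), hashlib.sha256).hexdigest()


def verify(receipt: ErasureReceipt, key: bytes, seal_hex: str) -> bool:
    return hmac.compare_digest(seal(receipt, key), seal_hex)


demo_key = b"demo-key-not-for-production"
receipt = ErasureReceipt(
    request_date="2018-05-21",
    receipt_date="2018-06-01",
    jurisdiction="EU",
    language="en",
    individual="Example Individual",
    data_controller="Example Clinic",
    pseudonym="3f0c9a1e-0000-4000-8000-000000000000",
    erased=["marketing contact details"],
    exceptions=["medical record (retention regulation)"],
    downstream_recipients=["Example Processor"],
)
receipt_seal = seal(receipt, demo_key)
tampered = replace(receipt, exceptions=[])
```

Any change to the receipt content, such as removing a listed exception, produces a different seal, which is what gives the receipt its value as evidence.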

Erasure Exceptions

Given my discussion is mostly around Healthcare, the first clarification that I always express is that the GDPR Erasure Request does not override a Healthcare organization's regulated requirements (Article 23 Restrictions), such as Medical Record Retention regulations. Thus an Erasure Request in these cases might be completely denied. The likelihood is that there might be some data held by the Healthcare organization that is not protected by a regulated responsibility, such as social contacts and interactions. This exception puts many in Healthcare at ease. I then remind them that under GDPR, once that regulated reason expires they MUST erase the data. Thus if your country has a requirement to maintain Healthcare data for 30 years, once that has expired the data must be erased.

Other similar use-cases

This similar concept might also be applied to Article 16 - Right to rectification, Article 18 - Right to restriction of processing, and Article 21 - Right to object.  I simply did not look further into this.


Erasure is different than Consent, but the receipt processing and overall use as a token when things do not progress as expected is similar. The big advantage of an Erasure Receipt comes when it can be Digitally Signed and include structured content. 

I am not a Lawyer, this is not legal advice...

Friday, May 18, 2018


I just finished a very long week in Cologne at the HL7 workgroup meeting and FHIR Connectathon. I had the idea that I could host a discussion of how to use FHIR in a GDPR compliant organization. So I created a FHIR Connectathon track. The hope for this track was that we could do testing with various parts of the FHIR specification. We ended up talking more generally about how to use the various capabilities that exist in FHIR to meet the various Articles in GDPR.

FHIR is GDPR enabled

The Security Workgroup and the Privacy (CBCC or CBCP) are global workgroups, so we have been aware of GDPR for many years. As concrete needs came up we would add that capability. Thus FHIR includes many capabilities that can be leveraged to meet GDPR needs. From a purely geek perspective the GDPR is not technically unusual, it simply places some higher emphasis on Privacy and Security capabilities. Thus overall the GDPR is a good thing to me, as it validates and will leverage the work I have spent 20 years developing in HL7, IHE, and DICOM.

More on this later in this article

GDPR is more than Privacy

I have heard a few people express that GDPR is more than Privacy and Security. I think that these people are mistaken about the extent of the real definition of Privacy. The Privacy Principles that are generally included in many standards and guidance are inclusive of giving the Individual the right of Access, the right of Correction, and the right to control how their data are used.  Too often people think Privacy is only about restricting access. The HL7, IHE, and DICOM workgroups I have participated in have always used the more expansive definition of Privacy Principles.

Even the GDPR right to Erasure, related to the EU Right to be Forgotten, is an extension of the Privacy Principles rights to control data about the individual and the rights of the individual to correct improper information.

Adding emphasis: GDPR is very much patterned after "Privacy by Design", indeed it requires that "Privacy by Design" is used.  I have lots of experience with Privacy by Design, and like it. In my GE days, I made it the backbone of the Security and Privacy guidance for all products at GE Healthcare.

Thus GDPR is primarily about Privacy...

GDPR is more than technology

This is a fundamental truth. GDPR will drive far more work in the space of writing Policies, Procedures, Communications, and such. There are many publications, consultants, and lawyers that can help with this. There are many publications that will explain GDPR to you. I am not going to try to explain GDPR.  The law itself is not that hard to read.

So at this point I assume the reader is knowledgeable in GDPR. If not, then go learn that much first....

How is FHIR GDPR enabled?

This week we agreed to write a whitepaper that will explain this. It has the following general outline:

  1. Introduction and Scope -- that will explain we are only addressing FHIR specific topics
  2. Mapping of GDPR Articles to the existing Security or Privacy capability in FHIR -- this section will provide terse guidance on how we visualize the use. 
  3. Identification of some gaps we identified -- nothing critical, mostly nice-to-have operations
  4. Conclusion -- FHIR is GDPR enabling

We are targeting the audience that is aware of FHIR and GDPR, that just needs some help extracting out the specifics. This paper should be about 8-10 pages.

The expectation is that once published, the external community may ask for clarification, or for specific actions. We might revise the paper. We might make it more visible through balloting as informative. We might convert it into an Implementation Guide.

Which parts of FHIR are useful?

Not to get ahead of myself. The following are the capabilities that we have already in the FHIR Specification today:

  • Provenance Resource
    • Includes a signature mechanism that can be leveraged in many comprehensive ways.
  • AuditEvent Resource, and Guidance
  • Consent Resource
  • any Resource can be tagged with Security/Privacy tags
  • De-Identification guidance
  • Secure Communications
    • Common recommended use of TLS (HTTPS)
    • May use Client Authentication
    • Recommend following good TLS principles such as BCP195
  • Authentication and Authorization
  • Identity -- various FHIR resources are tied to identities that can be used in Policy (e.g. Consent), and would be used in AuditEvent and Provenance to record Who did some action.
    • Patient
    • RelatedPerson
    • Practitioner
    • PractitionerRole
    • Group
    • Organization
    • Location
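
As one illustration of the tagging capability listed above, this sketch adds a confidentiality code to a resource's meta.security and checks for it. The helper names are made up for the example, and the code system URL is the one I believe FHIR R4 uses for v3 Confidentiality; treat both as assumptions to verify against the FHIR version in use.

```python
from typing import Any, Dict

# Assumed R4 code system URL for v3 Confidentiality codes (e.g. "N", "R", "V").
CONFIDENTIALITY_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"


def tag_resource(resource: Dict[str, Any], code: str) -> Dict[str, Any]:
    """Return a copy of the resource with a confidentiality tag in meta.security."""
    tagged = dict(resource)
    meta = dict(tagged.get("meta", {}))
    security = list(meta.get("security", []))
    security.append({"system": CONFIDENTIALITY_SYSTEM, "code": code})
    meta["security"] = security
    tagged["meta"] = meta
    return tagged


def has_tag(resource: Dict[str, Any], code: str) -> bool:
    """True when the resource carries the given confidentiality code."""
    return any(
        coding.get("system") == CONFIDENTIALITY_SYSTEM and coding.get("code") == code
        for coding in resource.get("meta", {}).get("security", [])
    )


observation = {"resourceType": "Observation", "id": "obs-1", "status": "final"}
restricted = tag_resource(observation, "R")
```

An access control layer could then use a check like has_tag(resource, "R") as one input to a policy decision; the tag alone enforces nothing.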

Gaps -- those that we found this week.

We did identify some gaps. But all of these gaps are 'nice to have'. They all are new functionality that helps automate an Organization's actions in response to the Individual's requests:
  • Operation that would provide response to an Individual "Right of Access" Article 15
    • assemble all the PurposeOfUse the organization utilizes, which would be beyond those used in the FHIR infrastructure
    • For each PurposeOfUse assemble the types of data utilized
    • And how the data are processed
    • etc...
  • Operation that would provide response to an Individual "Right to Data Portability" Article 20
    • assemble ALL the data in the best encoded format. 
    • This includes data not found in FHIR format
      • Possibly that data can be just described in meta terms
      • Possibly that data can be encapsulated in DocumentReference + Binary
    • etc...
  • Operation that would provide capability and response to Individual "Right to Erasure" Article 17
    • NOTE that Erasure does not override other regulated reasons to keep the data. Thus Medical Records are not subject to Erasure.
    • Erasure affects all data, not just FHIR data
    • Identity must be confirmed. 
    • Action must be confirmed
    • Various Erasure methods might be used
      • simple delete
      • de-identification
      • etc
    • An Erasure Receipt might be useful to define as a standard.
  • Possibly others...
Although these were discussed, it is very unclear how realistic their use would be.
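
For the Data Portability gap, the "DocumentReference + Binary" idea could look roughly like the following. This is a sketch, not a defined operation: the function name and ids are hypothetical, and the element names follow my reading of FHIR R4 (where Binary carries its payload in `data`; STU3 named that element `content`).

```python
import base64
from typing import Any, Dict, List


def wrap_non_fhir(content: bytes, content_type: str, patient_id: str, binary_id: str) -> List[Dict[str, Any]]:
    """Encapsulate non-FHIR data as a Binary plus a DocumentReference that points at it."""
    binary = {
        "resourceType": "Binary",
        "id": binary_id,
        "contentType": content_type,
        "data": base64.b64encode(content).decode("ascii"),
    }
    document_reference = {
        "resourceType": "DocumentReference",
        "status": "current",
        "subject": {"reference": f"Patient/{patient_id}"},
        "content": [
            {"attachment": {"contentType": content_type, "url": f"Binary/{binary_id}"}}
        ],
    }
    return [binary, document_reference]


legacy_note = b"Free-text note held outside the FHIR store"
binary, doc_ref = wrap_non_fhir(legacy_note, "text/plain", "example", "note-1")
```

The DocumentReference gives the export a place to describe the data in meta terms, while the Binary carries the bytes themselves.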


I assert that FHIR is ready for GDPR use. I welcome engagement that helps enlighten me.

Monday, April 16, 2018

IHE Perspective on EU GDPR

I just became aware of a Whitepaper published by IHE Europe in January on "IHE perspective on EU GDPR".

I did not have a hand in writing this whitepaper. It looks good to me. My evaluation is only of the Security & Privacy capabilities IHE offers, not of the GDPR interpretation. All of the IHE profiles available to support security and privacy are outlined on this IHE page. Their whitepaper does not mention the Document Digital Signature (DSG) profile, or the Document Encryption (DEN) profile. Both would only have a supporting role in GDPR compliance. I mention them only for completeness.

Other IHE Europe publications

Their Conclusion

The examples discussed [above] highlight the complexity of applying the GDPR to processes in health care and how the requirements are interwoven with IHE Profiles. The good news is that even today IHE Profiles provide solutions by combining security and privacy specific IHE Profiles such as ATNA, IUA, XUA, BPPC and APPC with the Profiles focused on information exchange in cross-border, national or regional ehealth deployments.

In conclusion the GDPR can be an effective catalyst to significantly extend the reach and use of IHE Profiles. Some Profiles or combinations of Profiles already meet GDPR’s security and privacy requirements. Others enable the portability of health information which will become a topic for any vendor providing solutions. 

The users of IHE Profiles can be assured that the IHE community will work on evaluating and enhancing the Profiles to meet the GDPR requirements.

GDPR impact beyond EU

I look forward to GDPR. I think that it will bring a focus to Security and Privacy topics. I hope that enforcement drives adoption, while reasonable enforcement drives reasonable reaction. I fear that an overly strict interpretation of GDPR could drive away some very important advancements in healthcare, and social networking. I welcome the extensive and painful penalties for noncompliance.

Thursday, April 12, 2018

De-Duplicating the received duplicate data

Everyone is frustrated by duplicate data. In the Healthcare space there is a fresh cry from Clinicians around their frustration at seeing duplicate data. On the bright side, this means that they are now getting data. So we in the Interoperability space MUST be succeeding with all the efforts to create Health Information Exchanges, and to enable Patients to access their data.

We standards geeks are quickly put in our chair because we failed to prevent this duplicate data problem... Well, yes and no. Each standard we created included mechanisms that are there specifically to prevent duplication. However, when those standards are used, shortcuts are taken. It might be a shortcut in the software development. It might be shortcuts in deploying a network. It might be shortcuts in deploying a network of networks. It might be a shortcut when the data was created. It might be a shortcut when the data was exported. It might be a shortcut when the data was 'Used'... But it is the shortcuts, where the standard was not used the way it was intended to be used, that cause the duplication.  

Are these shortcuts bad???? Not necessarily. Many times a shortcut is taken to get a solution working quickly. If no shortcuts were taken, then we would not be where we are today. Thus shortcuts are good, in the short-term. Shortcuts are only bad when they are not fixed once that shortcut is determined to be presenting a problem. Some shortcuts never present a problem.

Standards solutions to Duplicate Data

Let me explain the things that are in the standards we use today (XDS, XCA, CDA, and Direct) that can be used to prevent duplicate data:
  • Patient Identity -- the protocols used to create the virtual identity out of the many identities given to a Patient by many different organizations. (XCPD, PIX, PDQ, etc)
  • Home Community ID – unique identifier of a community of organization(s) 
  • Patient ID Assigning Authority (AA) – uniquely identifies the authority issuing patient identifiers. Usually one per healthcare organization, although can be assigned at a higher level.
  • Document unique ID – uniquely identifies a document regardless of how it was received (Including when received through Direct or Patient portals)
  • Document Entry Unique ID -- A document entry is metadata about a document, including the document uniqueID. A document entry has a unique ID.
  • Element ID – unlikely to be used today, but the standards support it. Fundamental to FHIR core
  • Provenance - unlikely to be used today, but would uniquely identify the source

Elaboration of these points

This is a complex problem, and many layers are used to solve various parts of it, where each layer addresses a specific portion of the complexity.

Discovering the virtual Patient Identity

Protocols like IHE PIX, PDQ, and XCPD are designed to discover the various identifiers that the patient is known by. Multiple identifiers are a reality, even in a case where government dictates a national identifier.

Duplicate network pathways

The broadest reason for duplicate data is that there are multiple pathways to the same repository of data (documents), such as HealtheWay, CommonWell, or CareQuality. Use just one of these and you don't have multiple pathways; use more than one and you might. The reason to use more than one is that each network has a subset of the overall healthcare providers. The duplication arises because some participate in more than one network... just like you do... Thus if no one participates in multiple networks, there are no duplicate pathways. 
Heat-Map for CareQuality Network

You might end up finding that you have two or three pathways to the same healthcare organization. You could just disable specific endpoints through specific networks. Pick to talk to a partner only through one of the networks. I would argue that this method of avoidance will be low tech, and initially effective. However as the network matures and expands we need a method to recognize when a new duplicate pathway happens.

I would argue that having multiple pathways is possibly useful to address major disasters that take out one of the networks, or one of the pathways.

Duplicate pathways are detectable, and when detectable, can be automatically prevented.

Detecting duplicates by homeCommunityID. This is the most reliable, but not perfectly foolproof. This however does require that the participants in these networks use the homeCommunityId as it was intended, as an identifier of a community that uniquely holds data.

Special case of hiding communities: Most configurations of XCA behave, but there are some communities that hide many sub-communities behind them. If these sub-communities are only attached through the one community interface, then there is no problem. This is the likely case for these configurations. These configurations are done this way as a convenience to the sub-communities. That is to say, the sub-communities like that the larger community adds value and connects them to the world. If one of these sub-communities ever decides to connect to another network, then they must become a full community everywhere, else they knowingly become a duplicate data source.

Preventing duplicate data using the homeCommunityID: So the point is that homeCommunityID is a strong indicator of a duplicate pathway that would result in duplicate data. Given that in Patient Discovery (XCPD) you target the patient discovery question to specific homeCommunityID(s), you are in control of which communities you target. Where you have already gotten a response back from a homeCommunity, you can skip the potentially duplicative Patient Discovery (XCPD), or can ignore the secondary results if you already sent out the question. Having a secondary pathway choice allows you to dynamically detect that the primary pathway is failing. Yes, you would need to identify primary vs secondary preferences, logic for delayed attempts, and handling of delayed responses.
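
A minimal sketch of that pathway selection, assuming each known endpoint can be described as a (network, homeCommunityId) pair and that you keep an ordered list of preferred networks. The network names and OIDs are placeholders, and the retry and delayed-response handling mentioned above is left out.

```python
from typing import Dict, Iterable, List, Tuple


def choose_pathways(endpoints: Iterable[Tuple[str, str]], preferred: List[str]) -> Dict[str, str]:
    """Pick one network per homeCommunityId, favoring earlier entries in `preferred`."""
    rank = {network: position for position, network in enumerate(preferred)}
    unranked = len(rank)  # networks not in the preference list sort last
    chosen: Dict[str, str] = {}
    for network, home_community_id in endpoints:
        current = chosen.get(home_community_id)
        if current is None or rank.get(network, unranked) < rank.get(current, unranked):
            chosen[home_community_id] = network
    return chosen


# Placeholder networks and homeCommunityId values.
known_endpoints = [
    ("NetworkB", "urn:oid:1.2.3"),
    ("NetworkA", "urn:oid:1.2.3"),  # same community reachable through a second network
    ("NetworkB", "urn:oid:4.5.6"),
]
primary = choose_pathways(known_endpoints, preferred=["NetworkA", "NetworkB"])
```

The endpoints that were not chosen are the secondary pathways; they remain available if the primary one fails.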

Duplicate Patient Id Assigning Authority (AA)

I first mentioned that there are protocols used to discover all the identifiers that a single patient has. This is made up of a Patient ID and the Assigning Authority (AA) that issued that patient ID.  The patient ID assigning authority (AA) is the second level indicator of a unique organization. This can be used today, because everyone does indeed manage their own patient identities, and thus must have a globally unique AA.

A special case is where a community aggregates patient identities into a community patient identity, such as will happen in an XDS Affinity Domain. Like the sub-community issue above, this is likely not a problem, as those that participate in an XDS Affinity Domain tend to be small and only want one connection.

Where a nation issues a patient identifier, the Assigning Authority (AA) becomes just the national Assigning Authority and would no longer be useful for de-duplicating. In this case many organizations and communities would use the same assigning authority and patient identity. This does not cause duplicate data, but does make the Assigning Authority less helpful at detecting duplicate data.

Duplicate Document UniqueID

The Document UniqueID is an absolute proof of duplicate documents. The Document UniqueId is readily available in the Document Sharing (XDS/XCA) metadata, so it can be used at that level to keep from pulling a document unnecessarily. With other networks, like Direct or Patient apps, the Document UniqueId can be found within document types like CDA or FHIR. If a case is ever found where this can’t be used as an absolute proof of duplicate document, then the source of that document must be fixed.

This solution will work regardless of the network. This will work with XDS/XCA based networks, but will also work with FHIR based networks, or where the Patient uses an app of any kind. 
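
In code, the metadata-level check can be as simple as the following sketch. It assumes each document entry has been reduced to a dict with a `uniqueId` key; that shape and the sample identifiers are illustrative, not taken from any specific XDS toolkit.

```python
from typing import Dict, Iterable, List, Set


def new_documents(entries: Iterable[Dict[str, str]], already_retrieved: Set[str]) -> List[Dict[str, str]]:
    """Return only entries whose document uniqueId has not been seen before."""
    seen = set(already_retrieved)
    fresh = []
    for entry in entries:
        unique_id = entry["uniqueId"]
        if unique_id in seen:
            continue  # same document, regardless of which network advertised it
        seen.add(unique_id)
        fresh.append(entry)
    return fresh


query_results = [
    {"uniqueId": "1.2.840.1.1", "network": "NetworkA"},
    {"uniqueId": "1.2.840.1.1", "network": "NetworkB"},  # duplicate pathway
    {"uniqueId": "1.2.840.1.2", "network": "NetworkB"},
    {"uniqueId": "1.2.840.1.3", "network": "NetworkA"},
]
to_retrieve = new_documents(query_results, already_retrieved={"1.2.840.1.3"})
```

Only the entries in to_retrieve need to be pulled; the rest are documents already held or already queued.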

On-demand documents deserve a special mention, but I will address them below.

Duplicate data element identifiers

The solution that would work absolutely the best, happens to be the one least likely to be available today. 

The standards (CDA and FHIR) include the capability to uniquely identify data elements (resources). However, like a good standard, they allow you to not uniquely identify the data element. Yes, I said this was a good thing. It is a good thing for low-end scale. It is a really bad thing for a mature market. This is where Implementation Guides and Profiles come in. In the case of CDA there are implementation guides that do require each data element be uniquely identified, and that Provenance proof always accompany data. 

However, uniquely identifying at the data element level is very expensive. That is, it is hard to code, makes the database bigger, adds validation steps, and such. When that data is only used within the EHR, there is no value to all this extra overhead. Thus it is often never designed into an EHR.

Duplicate data thru Provenance

Special mention of Provenance... This is supported by the standards, but very poorly implemented. It is especially important when a unique piece of data is used beyond the initial use. For example, where a lab result was taken for one condition, but it also was found to be helpful in a second diagnosis. Both for the same patient, different conditions or different episodes. This is especially true when that original data was exported from one system and imported into another. So a historic CDA was used at a different treatment encounter. That second use needs to give credit to the first: Provenance. How this factors into duplicate data is that a CDA document from the second encounter will include the very same data from the first. Now two different documents from two different organizations carry the same data, but that data has a different element identifier as it exists in two places. The solution is that Provenance can show the second instance is a copy of the first.

I have worked with an EHR that could tell you where the data came from. If it was imported from a CDA received from some other organization, this was noted. Most of the time these Provenance were empty, thus you assume the data was internally generated. But the capability was there on Import; the database had support for Provenance. Using this data on export is another task, thus an opportunity for shortcut...
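
The effect of Provenance on de-duplication can be sketched as follows. The mapping from an element identifier to the identifier it was copied from is an assumed simplification of what Provenance records would tell you; the identifiers are invented for the example.

```python
from typing import Dict, Iterable, List


def resolve_origin(element_id: str, copied_from: Dict[str, str]) -> str:
    """Follow 'copied from' links back to the earliest known instance."""
    visited = set()
    current = element_id
    while current in copied_from and current not in visited:
        visited.add(current)  # guards against a malformed, circular chain
        current = copied_from[current]
    return current


def group_by_origin(element_ids: Iterable[str], copied_from: Dict[str, str]) -> Dict[str, List[str]]:
    """Group element identifiers that Provenance shows to be the same original data."""
    groups: Dict[str, List[str]] = {}
    for element_id in element_ids:
        groups.setdefault(resolve_origin(element_id, copied_from), []).append(element_id)
    return groups


# Organization B imported a lab result from Organization A and re-exported it.
provenance_links = {"orgB-lab-77": "orgA-lab-12"}
received = ["orgA-lab-12", "orgB-lab-77", "orgB-lab-78"]
groups = group_by_origin(received, provenance_links)
```

Here the two differently identified lab results fall into one group, while the third element, with no Provenance link, stays on its own.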

I also am the owner of the Provenance resource in FHIR.

Clinically same

This is what most deduplication engines work on, they detect that the data found is already known and presume the data is duplicate.  They leverage any identifiers in the data. But ultimately they are looking at the clinical value and determining that they have the same clinical value.

This works except for longitudinal repetition that is clinically significant (an observation presents and resolves over and over)
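
One way to picture this is a key built from the clinical content. In the sketch below the key includes the effective time, which is my assumption about how to keep a recurring observation from being collapsed; the field names are illustrative and real engines use far richer matching.

```python
from typing import Dict, List, Tuple


def clinical_key(observation: Dict[str, str]) -> Tuple[str, str, str, str]:
    """The clinical value of an observation, ignoring where it came from."""
    return (
        observation["code"],
        observation["value"],
        observation["unit"],
        observation["effective"],
    )


def dedupe_clinically(observations: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first observation seen for each distinct clinical key."""
    seen = set()
    unique = []
    for observation in observations:
        key = clinical_key(observation)
        if key not in seen:
            seen.add(key)
            unique.append(observation)
    return unique


observations = [
    {"source": "orgA", "code": "8480-6", "value": "150", "unit": "mmHg", "effective": "2018-01-10"},
    {"source": "orgB", "code": "8480-6", "value": "150", "unit": "mmHg", "effective": "2018-01-10"},
    {"source": "orgA", "code": "8480-6", "value": "150", "unit": "mmHg", "effective": "2018-03-02"},
]
unique_observations = dedupe_clinically(observations)
```

The first two entries collapse into one because they describe the same measurement; the third survives because the same value on a later date is a new clinical fact.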

Duplicate On-Demand Documents

On-Demand documents present the hardest case of duplicate data to deal with. These are also detectable if they follow the IHE on-demand profile, in that the document entry that advertises the availability of on-demand data has a globally unique and stable identity. Thus you can know that you should NOT request that a new on-demand instance be made, because you already know about the data. The problem is that you don't know that the new instance would not contain new data. 

So using the unique ID of this on-demand document entry would need to be carefully handled. Never pulling a new on-demand document will prevent you from ever learning of new data. However, pulling a new on-demand document unnecessarily will cause you to spend energy determining that all the data it contains is data you already knew. Thus both false-positives and false-negatives are possible.

There are poorly implemented on-demand solutions that don't follow the IHE specification. They create a new on-demand document entry each time they are queried. This is not correct. There should be one uniquely identified document entry, and everyone should get that same entry. The on-demand generation of the specific document is done only when that document is requested. And that generated document should be stored as a 'snapshot'.  These poorly implemented on-demand solutions will present two totally different document entries each time you query, so if you are querying via duplicate pathways, you will think you have found two totally different sources of unique data.

The good news is that if the generated document is of the highest quality, then the content can quickly be separated into data you know and data that is new. That is to say that the element level identity and/or Provenance can prevent unnecessary duplication.

Detecting a Duplicate

As you can see, there are many identifiers that, when found to be EQUAL, tell you that you have duplicates. I present them from largest scope to smallest scope. The larger the scope you can use, the less energy it takes to stop processing duplicate data. This solution breaks down when the identifiers are not equal; in that case you are not assured that you do not have duplicate data. Thus the whole spectrum must be used; one level is not enough. Ultimately there will be false-positives and false-negatives.

Organizational Policy driving Maturity

Now that we have Interoperability, we need to address over-Interoperability.  I think this means identifying the need for HealtheWay, CareQuality, CommonWell, DirectTrust, and any other networks to have reasonable and good control of their identifiers. There is already a strong push to move to more coded documents like C-CDA R2.1. There are efforts around Provider Directories.

I don’t think this is a big effort, most do the right thing already today. What is needed is governance that says that the right behavior is expected, and when improper behavior is found it must be fixed. The current Sequoia specifications do not address this level of detail.

Improvement is always good, but we must recognize that much of the health data is longitudinal, and it is very possible a document was created 10 years ago according to the best possible guidance at that time. That historic document likely contains good data, but does not conform to current best-practice. Postel’s law must guide: Be conservative in what you send, liberal in what you accept from others.

IHE Mobile Cross-Enterprise Document Data Element Extraction

I have worked on projects within both IHE and HL7 on these topics. I can’t claim they have solved the issue, but they have raised up the common set of issues to be resolved and gathered good practice as I outline above. The most recent project is one in IHE that starts with the Document Sharing infrastructures (XCA, XDS, and CDA) much like above, and presents the de-duplicated data using a FHIR API (QEDm). This solution builds upon the family of Document Sharing profiles and FHIR profiles IHE has.