Tuesday, March 8, 2016

Provenance vs Audit -- it is not a competition.

Provenance and Audit seem to record the same thing, so why do they both exist, and why are they different?

Much of my response is focused on FHIR, but the concept is broader too.


In HL7 we are trying to harmonize the historic healthcare use of Provenance and Audit with the model of Provenance put forth by W3C. We must understand that W3C is primarily focused on a theoretical Provenance concept. Yes, it has implementation models, and I am not saying it is incomplete; however, it doesn't have an HTTP REST model. So we need to build one. And we have.

Provenance is a record that describes entities and processes involved in producing, updating, delivering, or otherwise influencing resources. Provenance provides a critical foundation for assessing authenticity, trustability, and traceability. Who authored and why, who updated and why, where and when the changes were made, and what influenced them. When data is moved, Provenance tracks where the data came from. In W3C it also tracks who accessed the data, or where it was sent.

Audit is broader, including any Privacy or Security relevant event, not just actions upon data (Create, Read, Update, Delete). Audit is used to capture all actions upon data, but also actions upon other protected resources. The Audit log exists to provide evidence that a system is working properly, and thus is used to detect when it is not. It is used to detect failures in Confidentiality, Integrity, and Availability, and to provide reports such as an Accounting of Disclosures. See Guest Post: Use-Case - Security Audit Prompts Investigation


We must understand that in FHIR, Provenance theory has been distributed differently. First, this is because of the momentum of healthcare, meaning we must recognize our own legacy. Second, healthcare is so very tied to Provenance as a concept: it is not new to healthcare, and it is found in all our standards going back many decades. Historical systems take the view that Provenance and Audit log are simply part of the database. So we are not new to this concept domain.

There is a 'view' in FHIR, the W5 (Who, What, Where, When, Why) report, which shows that much of the purpose of Provenance (in medicine) is handled by elements in each of the clinical and financial resources. This doesn't mean an additional Provenance record can't exist, but in those cases creating a Provenance record duplicates the essential elements, which few will choose to do. BUT THEY CAN. The FHIR Provenance resource is there for those cases where an explicit, additional record of provenance is desired.

The FHIR Provenance is only recording the provenance of a Create or Update; okay that isn't fully true as during a Transfer we record where it came from, etc. But Provenance is not there to record READ operations. The most likely use of FHIR Provenance is for import use-cases, to indicate where the data came from which might be different than the provenance information contained within the resource. FHIR Provenance also covers things like putting a Signature across the targeted resources. It covers explaining how prior-evidence (entities) were used. It allows for explaining who (agents) was involved in the activity that produced the update. Mostly it fits well within the FHIR model, leveraging FHIR representations of the target, agent, entity, location, etc.
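To make the import use-case concrete, here is a minimal sketch in Python that builds a Provenance-like record as a plain dictionary. The element names (`target`, `recorded`, `agent`, `entity`) loosely follow the FHIR Provenance resource, but the exact structure differs between FHIR versions, so treat the shape, the helper name `make_import_provenance`, the role values, and all identifiers as illustrative assumptions rather than a conformant resource.

```python
import json
from datetime import datetime, timezone


def make_import_provenance(target_refs, source_org_ref, importer_ref, source_doc_ref):
    """Build a simplified, Provenance-like dict for an import activity.

    Hypothetical helper: element names loosely follow FHIR Provenance,
    but the result is not validated against any FHIR version.
    """
    return {
        "resourceType": "Provenance",
        # One Provenance record can point at many pieces of data.
        "target": [{"reference": ref} for ref in target_refs],
        "recorded": datetime.now(timezone.utc).isoformat(),
        # Agents: who was involved in the activity that produced the data.
        "agent": [
            {"role": "source", "who": {"reference": source_org_ref}},
            {"role": "importer", "who": {"reference": importer_ref}},
        ],
        # Entities: prior evidence that the activity used.
        "entity": [{"role": "source", "what": {"reference": source_doc_ref}}],
    }


provenance = make_import_provenance(
    target_refs=["Observation/obs-1", "Observation/obs-2"],
    source_org_ref="Organization/external-lab",
    importer_ref="Device/import-service",
    source_doc_ref="DocumentReference/lab-report-7",
)
print(json.dumps(provenance, indent=2))
```

The point of the sketch is that the record names where the imported data came from, independent of whatever provenance elements the imported resources carry themselves.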

The FHIR AuditEvent resource is there to cover more than W3C provenance covers when they say they cover audit. The AuditEvent can record any event, not just provenance relevant events. AuditEvent can cover any security or privacy relevant event, not just operations on data. The big one that we point out, just to prove the point, is that AuditEvent can record logon events, both successful and failed. This is helpful in healthcare where, unfortunately, many healthcare systems still do their own user authentication rather than using an enterprise class authentication service (e.g. SAML, OAuth, etc). AuditEvent is also there to record Query operations, which are not something Provenance covers. There are other events too. And like Provenance it fits well within the FHIR model, leveraging FHIR representations of agent, entity, location, organization, etc.
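The logon example can be sketched in Python as follows. The helper `make_login_audit_event` and its field names are hypothetical and only loosely modeled on FHIR AuditEvent; they are not a validated resource in any FHIR version.

```python
from datetime import datetime, timezone


def make_login_audit_event(user_id, source_system, success):
    """Build a simplified, AuditEvent-like dict for a logon attempt.

    Hypothetical helper; field names are illustrative only.
    """
    return {
        "resourceType": "AuditEvent",
        "type": "user-authentication",
        "subtype": "login",
        "recorded": datetime.now(timezone.utc).isoformat(),
        "outcome": "success" if success else "failure",
        "agent": [{"who": user_id, "requestor": True}],
        "source": {"observer": source_system},
        # No 'entity' list: a logon touches no clinical data, which is
        # why a Provenance record has nothing to say about it.
    }


audit_log = [
    make_login_audit_event("alice", "ehr-frontend", success=False),
    make_login_audit_event("alice", "ehr-frontend", success=False),
    make_login_audit_event("alice", "ehr-frontend", success=True),
]

# A security office might start by filtering for failures.
failed = [event for event in audit_log if event["outcome"] == "failure"]
print(f"{len(failed)} failed logon(s) recorded for review")
```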

Breadth of the event is different.

A Provenance record might be on a larger operation (activity), whereas there might be many Auditable 'events' that happened during that 'activity'. For example, to create an Order for some procedure, the Clinician would have reviewed many parts of the record, would need to align with the Encounter, and might have needed to create a few ancillary resources that go along with the Order (like Specimen). There would be one Provenance record on the DiagnosticOrder (actually unusual, as DiagnosticOrder contains all the Provenance they desire, but for argument's sake). It would point at the evidence that the Clinician determined important from the whole record she reviewed. This is the difference between Provenance.entity and all the AuditEvents recorded.

The Audit log is designed to include many redundant recording actors making redundant records, so any action like a Create of a resource instance might result in many (usually at least 2) Audit log entries. This redundancy is part of the pattern used to see that a system is working properly. As such, the Audit log tends to need to be analyzed and pruned through various filtering, reporting, alerting, and offloading.

So, for every piece of Data there might be a Provenance record, or it might be contained within the Data. One Provenance record can point at many pieces of Data. The graph shown is just a visualization of the 'number of' entries of data vs entries of audit log vs entries of provenance records. It is not intended to be complete, but representative.

But for every access to data there will be an AuditEvent, and accesses to the AuditEvent records themselves will produce further AuditEvent entries. There are also AuditEvents for login, system-startup, system-shutdown, etc. Thus making AuditEvent the most popular Resource in FHIR... by definition.
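The difference in volume can be illustrated with a small Python simulation. Everything here is an assumption made for illustration: two observers (client and server) per data event, one Provenance record per create, and hypothetical function names.

```python
from collections import Counter

counts = Counter()
audit_log = []


def record_audit(observer, action, what):
    """Append one audit entry from a single observer's perspective."""
    audit_log.append({"observer": observer, "action": action, "what": what})
    counts["audit"] += 1


def create_resource(ref):
    counts["data"] += 1
    counts["provenance"] += 1  # optional in practice; one per create here
    # Both sides of the exchange record the same event independently.
    record_audit("client", "create", ref)
    record_audit("server", "create", ref)


def read_resource(ref):
    # Reads produce audit entries but no Provenance.
    record_audit("client", "read", ref)
    record_audit("server", "read", ref)


def review_audit_log(reviewer):
    # Accessing the audit log is itself an auditable event.
    record_audit("audit-repository", "read", f"AuditEvent log (by {reviewer})")


record_audit("server", "login", "alice")
create_resource("Observation/obs-1")
for _ in range(3):
    read_resource("Observation/obs-1")
review_audit_log("security-office")

print(dict(counts))
```

In this run one piece of data and one Provenance record sit next to ten audit entries: one login, two for the create, six for the reads, and one for the review of the log itself.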

Utility driven by Use

The biggest difference is in use-cases over time.

* Audit

  • is accessed by support staff such as Security Office, Privacy Office, and IT Office
  • is analyzed quickly, within minutes/hours/days.
  • it might get filtered and forwarded, reports might get made,
  • some of it might get purged, other moved to offline storage.

* Provenance

  • is accessed by clinicians, billing, quality, safety
  • it might never be referenced, as it is an unusual situation where the resource itself doesn't have the data
  • is part of the medical record, it lives as long as the data, and
  • it goes with the data (should).

Exceptions to that general pattern do exist. But they are exceptional cases.

Provenance and Audit are both critical

I hope I have explained WHY we are not just adopting the W3C Provenance model. It is a model, we are looking toward it as a good model. But we are building something for Healthcare, and using FHIR. So there will be differences.

"All models are wrong, some are useful."

Updated: March 9th, 2016 -- fix the last graphic to be a bar-chart to better show this as a visualization of the number of entries, not to be confused with a Venn diagram concept of the other charts. Also swapped the first two diagrams so that a diagram of my work was first, the second is from the FHIR specification.

Friday, March 4, 2016

FHIR Security and Privacy - tutorial outline

Rene asked for an outline of Security topics for FHIR for an upcoming tutorial he is giving.

The easy answer is go read all my blog articles under the #FHIR topic

The second easy answer is to point at the FHIR security pages.

I find it interesting that I answered this same question back in January 2013. I didn't notice this until after I completed the list below and was confirming I hit all my blog articles. Not much has changed in 3 years. http://healthcaresecprivacy.blogspot.com/2013/01/security-considerations-healthcare.html

The outline of the main topics to be covered:
  1. With its HTTP REST interaction model, FHIR is designed to leverage any security model that HTTP includes. That is to say, the HTTP interaction model has a set of security models that are transparent to the data-model contained in the HTTP transaction.
    • With messaging, you should be able to use http security, it is just not as obvious.
  2. We encourage use of HTTPS. Servers should enforce this as appropriate to their environment.
  3. We encourage the use of Federated Identity for Authentication
    • Most likely OAuth 2.
    • Profiles of OAuth 2 exist from IHE (IUA), HEART, and SMART. They are all about the same.
    • Equally useful is SAML, which might be more friendly to the Enterprise use-cases.
    • The most important part is to recognize that this is totally independent but totally supporting of the FHIR specification.
  4. We encourage use of the AuditEvent for recording whenever a security/privacy relevant event happens. (This is different than a provenance record). All actors should record AuditEvents from their perspective, it is through these various perspectives that security audit log analysis sees unusual events and thus starts an investigation.
  5. We encourage use of the Provenance for recording persistent record of provenance of any create or update transaction. There is also provenance built right into some FHIR Resources when it is so fundamental to the operation of that Resource. (This is different than an audit log) This is important to Security and Privacy; but also to Medical Records integrity.
  6. All resources have a 'meta' element that can hold security-labels (inclusive of privacy labels). These tags are used in an "Attribute Based Access Control" scheme. That is to say that an Access Control engine will use these meta tags to inform the decision that it makes; and can place tags into meta to inform any downstream Access Control engine (decision or enforcement).
  7. Some meta tags are 'obligations'; when in a trust relationship one party that trusts another party can communicate obligations which are constraints or actions the receiving party is obliged to carry out. When no trust relationship exists, obligations are of no value.
  8. Access Denied must, like in any standard, be carefully managed so as to give appropriate information but not give away important information. Sometimes it is best to tell the client that their query was perfectly accepted but that no results are available, sometimes one tells the client 403 or 404, etc..
  9. There are efforts underway to create a Privacy Consent Directive modeled in FHIR. This is a profile on Contract resource. This is intended to record the facts of a consent. This includes the various rules that would need to be enforced by the Access Control engine.
  10. There are efforts underway to show how to use UMA to enable Patients to control access to their managed data. This is an extension on OAuth for the purposes of "User Managed Access" (UMA). This should complement the Privacy Consent Directive.
  11. There are efforts underway to define OAuth 'scope' values. This is not an obvious science, as the FHIR data-model is not defined as a logical set of access control restrictions.
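Items 6 and 8 of the outline can be sketched together in Python: an access control decision driven by the security labels in `meta`, and a search response that returns only what is permitted without revealing what was withheld. The subset-check policy, the function names, and the simplified Bundle shape are assumptions for illustration. The codes `N` and `R` stand for normal and restricted confidentiality.

```python
def labels_of(resource):
    """Return the set of security label codes found in resource.meta.security."""
    security = resource.get("meta", {}).get("security", [])
    return {label["code"] for label in security}


def is_permitted(resource, user_clearances):
    """Permit access only if the user holds every label on the resource.

    Illustrative policy only; real policies are richer than a subset check.
    """
    return labels_of(resource) <= set(user_clearances)


def search_response(resources, user_clearances):
    """Return only permitted resources, without revealing what was withheld."""
    visible = [r for r in resources if is_permitted(r, user_clearances)]
    return {
        "resourceType": "Bundle",
        "total": len(visible),
        "entry": [{"resource": r} for r in visible],
    }


records = [
    {"resourceType": "Observation", "id": "obs-1", "meta": {"security": [{"code": "N"}]}},
    {"resourceType": "Observation", "id": "obs-2", "meta": {"security": [{"code": "R"}]}},
    {"resourceType": "Observation", "id": "obs-3"},  # no labels at all
]

bundle = search_response(records, user_clearances={"N"})
print([entry["resource"]["id"] for entry in bundle["entry"]])  # ['obs-1', 'obs-3']
```

Whether an unlabeled resource such as `obs-3` should be visible by default is itself a policy decision; this sketch permits it only because an empty label set is a subset of any clearance set.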

There are a few other topics on my blog, but not much.

Wednesday, March 2, 2016

electronic Privacy Consent -- Patient choice

There are three efforts going on right now to develop Privacy Consent systems. It might seem that three efforts are two too many, but I don’t think so. It might seem that three efforts are thus in competition, but that is also not the case. These three efforts are focused on different parts of the Privacy Consent space, the proverbial ‘elephant’. Let me explain why all three exist, and why that is not a problem.

The efforts are:
  • IHE – Privacy Consent for Document Sharing that supports patient specific exceptions, an advancement over their Basic Patient Privacy Consent (BPPC)
  • HL7 FHIR – Privacy Consent Directive Implementation Guide that supports capturing a Privacy Consent using FHIR friendly mechanisms and leveraging the FHIR model
  • HEART – Cross functional effort of the communities from OpenID/OAuth/UMA and Healthcare to leverage the User Managed Access (UMA) infrastructure to put the controls at the Patient’s finger tips.

Patient Privacy Consent Domains of Influence

The main difference is the domain of the data each of these is looking to control, and thus the infrastructure they have to leverage.
  • IHE – they are focused on Document Sharing (XDS, XDR, XDM, MHD), where the atoms being controlled are documents, the metadata that can be leveraged to do that control is the XDS metadata, and the metadata describing the requester is the XUA (SAML) assertion. In the case of an XDS (MHD) there is an “XDS Affinity Domain” concept to also bound the sharing.
  • HL7 FHIR – they are focused on FHIR, where the atoms being controlled are the FHIR Resources, the metadata that can be leveraged to do that control is the Resource header and related resources such as Provenance, and the metadata describing the requester is less defined but is likely an OAuth token. The big problem is that FHIR as a core standard has no boundary, so there is no theoretical limiter on the domain of control.
  • HEART – they have just the opposite problem from FHIR. Although they are mostly talking about FHIR as the data-model and access-model, they can’t necessarily rely on the FHIR Resource model, metadata model, or supporting Resources. However, HEART has a strong set of Profiles on OpenID Connect and OAuth. Their greatest strength is User Managed Access, a profile on OAuth that allows a user to identify access rules for the data that user controls (hence the name – User Managed Access). This means they have a control domain: that which is in the control of the Patient. Thus their system will work great once it is fed with the data, but once it is fed, the original author must not share through other means, or it thwarts the Patient’s user managed access.

The similarity of all these efforts is that they are all looking to add fine-grain controls so that the Patient is enabled to control where and when their data is Created, Accessed, and Disclosed. They are working on the same set of use-cases, mostly because the people involved are involved in all three efforts. Why write use-cases multiple times when they have already been written once? Now, their use-case descriptions are not perfect copies; we do customize them with flavor.

Privacy Consent Basics:

Basics – The foundation that all of these are starting with is that there is some known set of policies that define basic behavior. This is all that BPPC could deal with, the patient picked the policy they agreed with from those that were offered. No deviation allowed. We now need to evolve past this.

Advanced Privacy Consent

So the policy can leverage the following kinds of metadata from the user and data. Rules can specifically exclude data with specific values in these metadata elements, or specifically include it in sharing.

Data metadata
  • Facility/Hospital/Organization 
  • Episode of care 
  • Location of care 
  • Diagnostic Order and results 
  • Publication date/time 
  • Identity of a Document/Resource 
  • Privacy/Security tags 
Related to the User Context
  • User Identity
  • User Authentication strength
  • User Role
  • PurposeOfUse
  • Organization user is from
Thus one could say: Allow All users at Organization W, to have information from Document X and Episode Y, but not information related to Order Z.
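The example rule above can be sketched in Python. The `ConsentRule` shape, the function name `is_disclosure_allowed`, and the deny-by-default and exclusions-win choices are all assumptions for illustration; none of the three efforts defines consent this way.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRule:
    """Hypothetical rule shape: who may see what, with exclusions."""
    permit_organizations: set
    include_documents: set = field(default_factory=set)
    include_episodes: set = field(default_factory=set)
    exclude_orders: set = field(default_factory=set)


def is_disclosure_allowed(rule, user, data):
    """Evaluate one request against one rule; deny by default."""
    if user["organization"] not in rule.permit_organizations:
        return False
    # Exclusions win over inclusions.
    if data.get("order") in rule.exclude_orders:
        return False
    return (
        data.get("document") in rule.include_documents
        or data.get("episode") in rule.include_episodes
    )


# "Allow all users at Organization W to have information from Document X
# and Episode Y, but not information related to Order Z."
rule = ConsentRule(
    permit_organizations={"Organization W"},
    include_documents={"Document X"},
    include_episodes={"Episode Y"},
    exclude_orders={"Order Z"},
)
insider = {"id": "dr-1", "organization": "Organization W"}
outsider = {"id": "dr-2", "organization": "Organization Q"}

print(is_disclosure_allowed(rule, insider, {"document": "Document X"}))  # True
print(is_disclosure_allowed(rule, insider, {"episode": "Episode Y", "order": "Order Z"}))  # False
print(is_disclosure_allowed(rule, outsider, {"document": "Document X"}))  # False
```

A fuller rule would also consult the user context listed above, such as User Role, PurposeOfUse, and authentication strength.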

Privacy Consent Enabling Use-cases

Thus we support (this is not an exhaustive list)
  • Organization focused Treatment scenarios (payment)
  • Patient centric data management
  • Research scenarios – Patient directed and Organization requested
  • Clinical Trials
  • Precision Medicine

Blog articles