Wednesday, January 5, 2011

Data Objects and the Policies that Control them

In reading the President's Council of Advisors on Science and Technology (PCAST) report, "Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward", there are many comments I could make, but I want to focus this blog article on one very specific point: the directionality of the relationship between Data Objects and the Policies that control them. This point is not specific to PCAST, and has been discussed many times over the years.

In Section "V. Privacy and Security Considerations" of the PCAST report, on page 49, the sub-section quoted below contains the part I highlight in yellow:
A Health IT Architecture for 21st-Century Privacy and Security
We believe that a universal exchange language based on tagged data elements will allow the design of much better privacy and security protection than currently exists for either paper or electronic systems, for two principal reasons. First, the ability to tag an individual piece of data with privacy-related information, as part of its metadata, enhances privacy safeguards. Second, because tagged data element exchange protocols are designed to be efficient for the rapid exchange of small pieces of data, it is feasible to use security protocols that involve multiple exchanges of challenge and response. We illustrate these points in this and the next subsection.
I will note that I am reading the PCAST report as a set of principles brought together by the small group invited by the White House. I will also note that this was a small group, that it was a closed group, that the process was not very transparent, and that the group had some suspicious membership. But this is not what I want to focus on.

I have no problem with 'tagged data elements'; I speak highly of them in "Data Classification - a key vector enabling rich Security and Privacy controls". The actual value of these tags is part of the discussion. In the future I would like to discuss the practical realities of the report's implied definition of an 'element'; I would recommend that a 'Document' is a reasonable size to manage today.

Data Objects pointing at the Policies that control them:

What I want to focus on is the directionality of the model being propagated by the PCAST report. They indicate that the individual piece of data has a tag (metadata) with the privacy-related information. I read this as meaning that the metadata points at the privacy policies that describe how to control the individual piece of data (the 'element').

Data Object  --> Privacy Policy

Policies point at the Data Objects that they control
This seems like a logical approach, and indeed it was the approach that BPPC took in its first version, where the confidentialityCode of any document could hold the OID of the BPPC policy. This seemed logical to us at the time too. But we found that it doesn't scale with the age of the data set: as a patient changes their mind regarding sharing (opt-in, opt-out, etc.), one needs to go and change the metadata tags on all of the data, rather than just change the policies.

So BPPC now has the confidentialityCode as a sensitivity/confidentiality classification, and the BPPC policy is self-contained. If the BPPC policy needs to have special rules about a specific Document, then the BPPC policy points at the document. For example, when a patient wants a specific episode summary hidden, the privacy policy identifies that document by its unique ID.

Privacy Policy --> Data Object

Indeed, as the patient changes their privacy policy, or the organization changes its privacy or security policies, these changes should be reflected in the privacy/security policies, not in changes to the data. The data hasn't changed; the policy has changed.
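To make the scaling argument concrete, here is a minimal sketch of the two directions. All of the names and data structures (the dictionaries, `policy_oid`, `hidden_documents`, the two functions) are hypothetical and only for illustration; they are not BPPC or IHE definitions. It simply counts how many records must be rewritten when a patient with 1000 documents switches from opt-in to opt-out.

```python
# Hypothetical sketch: structures and names are illustrative, not BPPC/IHE definitions.

def opt_out_data_points_at_policy(documents, patient_id, new_policy_oid):
    """Data Object --> Privacy Policy: every document's tag must be rewritten."""
    touched = 0
    for doc in documents:
        if doc["patient_id"] == patient_id:
            doc["policy_oid"] = new_policy_oid
            touched += 1
    return touched


def opt_out_policy_points_at_data(policies, patient_id, new_policy_oid):
    """Privacy Policy --> Data Object: only the patient's policy record changes."""
    policies[patient_id] = {"policy_oid": new_policy_oid, "hidden_documents": set()}
    return 1


documents = [
    {"id": f"doc-{i}", "patient_id": "patient-1", "policy_oid": "opt-in"}
    for i in range(1000)
]
policies = {"patient-1": {"policy_oid": "opt-in", "hidden_documents": set()}}

writes_a = opt_out_data_points_at_policy(documents, "patient-1", "opt-out")
writes_b = opt_out_policy_points_at_data(policies, "patient-1", "opt-out")
print(writes_a, writes_b)  # prints: 1000 1
```

The first count grows with the age of the data set; the second stays at one no matter how many documents the patient has.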

Metadata describes the Data as facts, and Policies have the specifics about how to control the data 
In fact the Data Object does have metadata that is used in the Access Control decision: a confidentialityCode, classCode, formatCode, Patient Identifier, Facility Identifier, Author Identifier, etc. But I don't see these as 'pointers'; rather, they are 'security context attributes'. See the IHE white paper "Access Control", published 2009-09-28.
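A rough sketch of how such a decision might look: the metadata only states facts about the document, while the policy holds the rules, including any exception that names a specific document by its unique ID. The class names, field names, and the deny-by-default behavior are my assumptions for illustration; this is not the algorithm from the IHE white paper. The codes "N" (Normal) and "R" (Restricted) are used as example confidentialityCode values.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentMetadata:
    """Facts about the document; these never change when consent changes."""
    unique_id: str
    patient_id: str
    confidentiality_code: str  # e.g. "N" (Normal) or "R" (Restricted)


@dataclass
class PrivacyPolicy:
    """Hypothetical policy record: the rules live here, not in the data."""
    allowed_codes: set = field(default_factory=lambda: {"N"})
    hidden_document_ids: set = field(default_factory=set)


def is_access_permitted(meta, current_policies):
    """Evaluate the policies in effect at the moment of access."""
    policy = current_policies.get(meta.patient_id)
    if policy is None:
        return False  # assumption: deny when no policy is on file
    if meta.unique_id in policy.hidden_document_ids:
        return False  # the policy points at this specific document
    return meta.confidentiality_code in policy.allowed_codes


summary = DocumentMetadata("doc-42", "patient-1", "N")
policies = {"patient-1": PrivacyPolicy()}

before = is_access_permitted(summary, policies)
policies["patient-1"].hidden_document_ids.add("doc-42")  # patient hides one summary
after = is_access_permitted(summary, policies)
print(before, after)  # prints: True False
```

Note that the document's metadata is frozen and untouched throughout; only the policy changed between the two checks.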

I am optimistic that the authors were not trying to impose an architecture, but rather identifying a principle. As a principle the directionality should not matter.

PS. Software engineers might find this useful, or might be confused by it -- this is like the first-year school assignment on linked lists, stacks, and FIFO queues... the obvious direction of the pointers is not the right direction to implement. OK, that might not help at all...


  1. John,

    1) So are you saying that you always need to go to BPPC to get the "security" tag for each data element?
    2) Does this mean the data needs to be reclassified into two streams, one that needs to be protected and one that does not, before transmission?
    3) What happens to the data that is already shared (after the patient changes their mind about the consent)?


  2. Madjid,

    I think you might still be conflating confidentialityCode and consent. They are very different things. The confidentialityCode is ONLY a labeling mechanism that indicates the group of sensitivity/confidentiality that the object belongs to. This grouping doesn't change over time. This grouping doesn't change based on patient consent-policy.

    There is one use-case where this grouping changes, and that is a case where a specific object is controlled independently. For example, where the patient asks that a specific report be blinded. These are 'exceptions' that are handled by rules that reference that specific object by its unique ID.

    So: responding to your numbered items:
    1) I am not sure what you mean by the 'security' tag... if you mean confidentialityCode, then no. The confidentialityCode is metadata. This is true of the IHE XD* profiles, HL7, and DICOM. All objects have a confidentialityCode value associated with them. This is totally independent of consent.
    2) I don't understand this question. Data does need to be classified. I am confused by your use of 'reclassified'.
    3) The new consent that would get published in your use-case would be 'enforced' as soon as the new consent is published. The publication of a new consent is totally independent of the publication of any clinical data. At any time an object is accessed, it is the responsibility of the access control engine to execute the CURRENT policies.

  3. Well, the PCAST report talks about tags, so I was mixing my terminology. I just read your other blog on data classification and understand confidentialityCode slightly better. From that blog I get the impression that confidentialityCode is for a whole document, and then an individual decision has to be made regarding how to apply this code to each individual object within the document. Is this the correct understanding? Or does the same code apply to all objects within the document?
    My question 2 above was based on the assumption that different objects within a document get different sensitivity levels (I should not have used the term reclassification and should have just called it classification).
    So if that is the case, would different objects of the same document be stored or transported differently, or do you simply use a redacted version versus an "original" version and transport/store them according to their sensitivity level?
    Regarding 3, again from reading the other blog, I understand that the patient gives a consent and maybe sets up some policy for different categories of data, and that this could have some effect on the confidentialityCode in some fashion. My concern was mainly about the scenario where a patient decides at some later point that she now wants to classify as sensitive some data that she previously did not think was sensitive. Now that the cat is out of the bag, how is this new policy distributed to systems that previously received this data through an exchange?

    thanks a lot,

  4. Madjid,

    There is confidentialityCode metadata at many levels in the HL7 RIM and CDA. Thus there can be objects that contain other objects, with each level of the object having a different confidentialityCode. But it seems to me that the exterior confidentialityCode would have to be that of the most sensitive object within, as only a system that can handle the most sensitive confidentialityCode can be trusted with the whole object.

    The blog post that follows this one does offer a redaction method; this is a possible way to handle this problem. I try to take a realistic approach, and recognize that documents are likely the smallest objects that one can manage on a cross-enterprise health information exchange. Much smaller and one loses track of the context of the object.

    On your third issue, the patient has control over the access control rules that apply to the different sensitivity classifications (aka confidentialityCode). The rules that are used to decide what sensitivity classification should be used are fixed. That is, the rules used to decide if an object is Normal or Restricted are not in the control of the patient. These rules typically fall directly from legislation, like the substance-abuse regulations. This does not eliminate any control that the patient has; it just moves the control purely into the privacy policy enforcement space.

    The typical use-case that people worry about here is where a patient wants to retroactively blind a historical document. This would be done as a privacy policy that points exactly at that document. Thus the result is that when accesses are attempted to that object, the explicit privacy policy against that specific object would block access. Thus there is no need to update that historic document, just the privacy policies in effect.

  5. John,

    Thanks for taking the time to respond. I will look for the following blog for the redaction mechanism.

    Yes, your view makes sense: the sensitivity level of the entire document should be equal to the level of the most sensitive object, until a redaction is done. And this means that until the redaction is done nobody can touch the document, even to get the less sensitive objects. And this means my second question is moot, since you have to store/transport different versions separately, rather than dissect the document and store/transport different elements separately.

    On the third issue, I understand that a change of heart by the patient should translate into a change in privacy policy, and this is fine if everybody always comes to one storage location for the document. If, on the other hand, another party has already downloaded the document and you now have copies lying around in different locations, how do you enforce the new policy on those copies that are already out? Can you, based on an audit log, find out where the doc went and change the policies for those copies as well?


  6. Madjid,

    There certainly could be systems that can take apart the object and manage the differently labeled objects inside. I just think that starting a discussion of a nationwide health information network with this fine-grained control is not reasonable. So I would like to start with document-level control.

    On the third issue: For the most part once a document has been copied, control has been lost. So one must be very careful to not allow a copy if one is not sure they should trust that destination. Part of this trust is clear rules around the PurposeOfUse, meaning that the copy is given for a specific purpose and only that purpose. If a new purpose comes up, a new request needs to be made.

    But if one looks at the system as a whole, these changes to the consent really can only affect future disclosures, not those that have happened in the past. Anyone who has used clinical documents for clinical decisions needs to maintain a copy of that original evidence for the decision, so you see they really can't have the document retroactively blinded.
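
The rule discussed in the comments above, that a container's exterior confidentialityCode should be that of the most sensitive object within it, can be sketched as a simple "high-water mark" calculation. The function name is hypothetical, and the ordering below follows the HL7 confidentiality codes from least to most sensitive (U, L, M, N, R, V); treat it as an assumption for illustration rather than a normative definition.

```python
# Assumed ordering, least to most sensitive (HL7-style confidentiality codes).
SENSITIVITY_ORDER = ["U", "L", "M", "N", "R", "V"]


def outer_confidentiality_code(inner_codes):
    """Return the most sensitive code among the contained objects."""
    if not inner_codes:
        raise ValueError("a container needs at least one inner code")
    return max(inner_codes, key=SENSITIVITY_ORDER.index)


# A document holding two Normal entries and one Restricted entry
# must itself be labeled Restricted.
print(outer_confidentiality_code(["N", "R", "N"]))  # prints: R
```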