Wednesday, January 5, 2011

Data Objects and the Policies that Control them

While reading the President's Council of Advisors on Science and Technology (PCAST) report, "Realizing the Full Potential of Health Information Technology to Improve Healthcare for Americans: The Path Forward", I found many things I could comment on. In this blog article I want to focus on one very specific point: the directionality of the relationship between Data Objects and the Policies that control them. This point is not specific to PCAST, and has been discussed many times over the years.

The passage I want to highlight is in the PCAST report, Section "V. Privacy and Security Considerations", on page 49, in the sub-section quoted below:
A Health IT Architecture for 21st-Century Privacy and Security
We believe that a universal exchange language based on tagged data elements will allow the design of much better privacy and security protection than currently exists for either paper or electronic systems, for two principal reasons. First, the ability to tag an individual piece of data with privacy-related information, as part of its metadata, enhances privacy safeguards. Second, because tagged data element exchange protocols are designed to be efficient for the rapid exchange of small pieces of data, it is feasible to use security protocols that involve multiple exchanges of challenge and response. We illustrate these points in this and the next subsection.
I will note that I am reading the PCAST report as a set of principles brought together by the small group invited by the White House. I will also note that this was a small group, that it was a closed group, that the process was not very transparent, and that the group had some suspicious membership. But this is not what I want to focus on.

I have no problem with 'tagged data elements'; I speak highly of them in "Data Classification - a key vector enabling rich Security and Privacy controls". The actual value of these tags is part of the discussion. In the future I would like to have a discussion about the practical realities of their implied definition of an 'element', where I would recommend that a 'Document' is a reasonable size to manage today.

Data Objects pointing at the Policies that control them:


What I want to focus on is the directionality of the model being propagated by the PCAST report. They indicate that the individual piece of data has a tag (metadata) with the privacy-related information. I read this as saying that the metadata points at the privacy policies that describe how to control the individual piece of data (the 'element').


Data Object  --> Privacy Policy

Policies point at the Data Objects that they control
Having the data point at its policy seems like a logical approach, and indeed it was the approach that BPPC took in its first version. In the first version of BPPC the confidentialityCode of any document could hold the OID of the BPPC policy. This seemed logical to us at the time too. But what we found out was that this doesn't scale with the age of the data-set. What I mean is that as a patient changes their mind regarding sharing (opt-in, opt-out, etc.), one needs to go and change all the metadata tags on all the data, rather than just change the policies.

So BPPC now has the confidentialityCode as a sensitivity/confidentiality classification, and the BPPC policy is self-contained. If the BPPC policy needs to have special rules about a specific Document, then the BPPC policy points at the document. For example, when a patient wants a specific episode summary hidden, that document is identified by its unique number in the privacy policy.

Privacy Policy --> Data Object

Indeed, as the patient changes their privacy policy, or the organization changes its privacy or security policies, these changes should be reflected in the privacy/security policies, not in changes to the data. The data hasn't changed, the policy has changed.
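
To make the scaling argument concrete, here is a minimal Python sketch comparing the two directions. All of the names (TaggedDocument, PrivacyPolicy, retag_documents, replace_policy) are hypothetical illustrations and are not part of BPPC or any IHE profile; the sketch only counts how many records must be rewritten when a patient changes their mind.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedDocument:
    """Data Object --> Privacy Policy: the document names its policy."""
    doc_id: str
    policy_id: str  # hypothetical pointer stored in the document metadata


@dataclass
class PrivacyPolicy:
    """Privacy Policy --> Data Object: the policy names any special documents."""
    policy_id: str
    hidden_doc_ids: set = field(default_factory=set)


def retag_documents(documents, new_policy_id):
    """Apply a change of mind in the data-points-at-policy model.

    Every document's metadata is rewritten; returns how many were touched.
    """
    touched = 0
    for doc in documents:
        if doc.policy_id != new_policy_id:
            doc.policy_id = new_policy_id
            touched += 1
    return touched


def replace_policy(policy_store, patient_id, new_policy):
    """Apply a change of mind in the policy-points-at-data model.

    Only the patient's policy record is rewritten; returns how many were touched.
    """
    policy_store[patient_id] = new_policy
    return 1


if __name__ == "__main__":
    # A patient with 1000 historical documents changes from opt-in to opt-out.
    history = [TaggedDocument(f"doc-{i}", "opt-in") for i in range(1000)]
    print(retag_documents(history, "opt-out"))

    # Same change of mind, plus one specifically hidden document.
    store = {"patient-1": PrivacyPolicy("opt-in")}
    print(replace_policy(store, "patient-1", PrivacyPolicy("opt-out", {"doc-42"})))
```

In this toy run the first model rewrites all 1000 documents, while the second rewrites a single policy record no matter how old or large the data-set is.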

Metadata describes the Data as facts, and Policies have the specifics about how to control the data 
In fact the Data Object does have metadata that is used in the Access Control decision: a confidentialityCode, classCode, formatCode, Patient Identifier, Facility Identifier, Author Identifier, etc. But I don't see these as 'pointers'; I see them as 'security context attributes'. See the IHE white paper Access Control - Published 2009-09-28
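
As a rough illustration of metadata acting as 'security context attributes', the following Python sketch shows an access decision that reads facts from the document and rules from the policy. The names DocumentMetadata, AccessPolicy, and is_access_permitted are hypothetical, the rule set is deliberately simplistic, and the code values "N" and "R" are only illustrative classifications; none of this is taken from the IHE white paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentMetadata:
    """Facts about the document; frozen because policy changes never edit them."""
    doc_id: str
    patient_id: str
    confidentiality_code: str  # illustrative values: "N" normal, "R" restricted
    class_code: str


@dataclass
class AccessPolicy:
    """Self-contained rules for one patient (hypothetical structure)."""
    patient_id: str
    allowed_confidentiality: set = field(default_factory=lambda: {"N"})
    hidden_doc_ids: set = field(default_factory=set)


def is_access_permitted(meta, policy):
    """Decide access from the document's attributes and the patient's policy."""
    if meta.patient_id != policy.patient_id:
        return False  # the policy does not govern this patient's data
    if meta.doc_id in policy.hidden_doc_ids:
        return False  # the policy points at this specific document
    return meta.confidentiality_code in policy.allowed_confidentiality


if __name__ == "__main__":
    summary = DocumentMetadata("doc-42", "patient-1", "N", "episode-summary")
    policy = AccessPolicy("patient-1")
    print(is_access_permitted(summary, policy))

    # The patient asks to hide this episode summary: only the policy changes.
    policy.hidden_doc_ids.add("doc-42")
    print(is_access_permitted(summary, policy))
```

The document metadata is never modified between the two decisions; the outcome changes only because the policy did.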

I am optimistic that the authors were not trying to impose an architecture, but rather identifying a principle. As a principle the directionality should not matter.

PS. Software engineers might find this useful, or might be confused by it -- this is like the first-year school assignment on linked lists, stacks, and FIFOs... the obvious direction of the pointers is not the right direction to implement. Ok, that might not help at all...

Most popular blog entries of 2010

The blog tools make this post easy, but it is still useful to record an annual perspective on what seemed important at the time. Clearly the popularity contest is won by 'ranting'.

  1. Meaningful Use Security Capabilities for Engineers This is where I describe the Meaningful Use security capabilities and provide recommendations on what they mean and how to implement them
  2. Meaningful Use clearly does not mean Secure Use I am amazed at how many hits this got and continues to get. It is a rant on the MU draft; yes, they did fix some of the things I rant about. See item 1.
  3. Meaningful Use Certification issue with Encryption of data-at-rest This is where I rant about how the Meaningful Use rules messed up and defined tight requirements for encryption and integrity controls but failed to say anything about key management, content packaging, or portability. 
  4. Meaningful Use Security Capabilities Lacking, Privacy Capabilities NON-existent  Another rant...Yup, privacy is still missing... 
  5. Meaningful Use - Security Plan This is where I ranted less, but gave advice on how to read the Meaningful Use draft.
  6. Accountability using ATNA Audit Controls This is where I explain how to achieve the requirements of Accountability with simply an Audit Control. (Watching what people do is very important. It is sometimes the only way to detect users misbehaving, like looking at VIP patients or downloading thousands of documents)
  7. Data Classification - a key vector enabling rich Security and Privacy controls This is where I demystify the confidentialityCode as a part of segmentation, and explain how this is metadata to be used by access control engines as one of the factors used to determine if a specific use of data should be allowed or not. (For those reading PCAST, read this as if it is the PCAST concept of 'tagged data element approach'. It is part of the whole picture but not completely... )
  8. Meaningful Use Encryption - passing the tests This is where I explain just how bad the data-at-rest requirements are and how screwed up the testing is.   
  9. IT security problems continue (Designing a Secure HIE) This is where I explain that point-to-point security doesn't scale and that a walled-garden approach using TLS may be a better starting point. (Yes, this is an old article that still is true today. We see in NHIN Direct something closer to the unconstrained point-to-point, or end-to-end. The solution being discussed is to restrict NHIN Direct endpoints to 'organizations', thus ending up with a smaller map but still quite the spider web)
  10. Meaningful Use takes Security Audit Logging back a decade It is unfortunate that we work hard to advance security and privacy only to have regulation take us back to the dark ages.

I have some articles that I think are more long-term explanations of a concept. They are not especially popular at any given time, but they serve as references:

IHE Security/Privacy primer
Meaningful Use Security Capabilities for Engineers
User Identity
Access Controls - Including enforcing Privacy
Consent Management
Audit Controls
Other Controls