Tuesday, April 7, 2015

NIST seeks comments on De-Identification

NIST is seeking comment on De-Identification. The good news is that they have used the Healthcare ISO 25237 specification. The bad news is that they didn't reference the IHE De-Identification Handbook. Guess I have my first comment ready. De-Identification is a Process used to lower 'risk' of re-identification.

NIST IR 8053
DRAFT De-Identification of Personally Identifiable Information
NIST requests comments on an initial public draft report on NISTIR 8053, De-identification of Personally Identifiable Information. This document describes terminology, process and procedures for the removal of personally identifiable information (PII) from a variety of electronic document types.

Background:
This draft results from a NIST-initiated review of techniques that have been developed for the removal of personally identifiable information from digital documents. De-identification techniques are widely used to remove personal information from data sets to protect the privacy of the individual data subjects. In recent years many concerns have been raised that de-identification techniques are themselves not sufficient to protect personal privacy, because information remains in the data set that makes it possible to re-identify data subjects.

We are soliciting public comment for this initial draft to obtain feedback from experts in industry, academia and government that are familiar with de-identification techniques and their limitations.

Comments will be reviewed and posted on the CSRC website. We expect to publish a final report based on this round of feedback. The publication will serve as a basis for future work in de-identification and privacy in general.

Note to Reviewers:
NIST requests comments especially on the following:

    • Is the terminology that is provided consistent with current usage?
    • Since this document is about de-identification techniques, to what extent should it discuss differential privacy?
    • To what extent should this document be broadened to include a discussion of statistical disclosure limitation techniques?
    • Should the glossary be expanded? If so, please suggest words, definitions, and appropriate citations.

Please send comments to draft-nistir-deidentify@nist.gov by May 15, 2015.
Draft NISTIR 8053
Comment Template Form for Draft NISTIR 8053


References to articles I have written on De-Identification
 

Monday, March 16, 2015

What is MHD beyond XDS-on-FHIR?

I have been working on a Profile in IHE now for three years. It normally doesn’t take this long, but in my case I had the good luck of being in the right place at the right time. I saw the tidal wave of “HTTP RESTful” coming, and felt it strongly back when I was on “The Direct Project” creating a sub-optimal solution. At that time, IHE only had the XDS solution, which is based on Web-Services using SOAP, SAML, and ebRegistry. This XDS solution was and is still the best solution for business-to-business. However, this solution is very hard to use with the programming tools more common on lightweight systems such as Mobile.

So back in 2011 I wrote the first profile in IHE that targeted “ease of use by lightweight application platforms such as Mobile Health Applications”. Thus it targeted use of HTTP RESTful, using JSON encoding. The Mobile Health Documents (MHD) profile was born to provide a simpler API to an XDS environment. This happened in the same timeframe that Grahame was fanning the FHIR flames. So we joined forces and brought the concepts needed for XDS into FHIR®. So now, I take those FHIR-based Resources and re-write the profile.

Note there will be yet another re-write (hopefully just tweaks) this summer after HL7 completes their DSTU2 ballot process. There is a set of gaps identified this winter that we have fixed in the proposed content for DSTU2.

The Mobile Health Documents (MHD) is the result.
I am not going to go into deep details, but take the perspective here that the reader is a FHIR expert, and wants to understand this MHD profile. I will assume you also have some understanding of XDS, but only as an overall concept.

The basics are shown here.

The MHD abstract actors are:

  • Document Source - the producer and publisher of documents and metadata
  • Document Recipient - receives documents and metadata
  • Document Consumer - queries for document metadata, and requests to retrieve documents
  • Document Responder - responds to requests for document metadata entries and documents.

The MHD abstract transactions are:

  • Provide Document Bundle - This transaction is used to transfer documents and metadata, and is analogous to a Provide and Register Document Set-b transaction.
  • Find Document Manifests – This transaction is used to provide parameterized queries that result in a list of Document Manifest resources.
  • Find Document References – This transaction is used to provide parameterized queries that result in a list of Document Reference resources.
  • Retrieve Document – This transaction is used to get documents.
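
As a rough illustration of what a Find Document References request could look like on the wire, the sketch below builds a FHIR search URL. The base URL, the patient identifier, and the choice of the `patient` search parameter are assumptions made for illustration; they are not taken from the MHD profile text.

```python
from urllib.parse import urlencode


def find_document_references_url(base_url: str, patient_id: str, fmt: str = "json") -> str:
    """Build a FHIR search URL for DocumentReference resources.

    The parameter names here are illustrative assumptions; check the
    profile and the server's conformance statement for the real ones.
    """
    query = urlencode({"patient": patient_id, "_format": fmt})
    return f"{base_url.rstrip('/')}/DocumentReference?{query}"


# Hypothetical server and patient id, used only to show the shape of the URL.
url = find_document_references_url("http://fhir.example.org/base/", "123")
print(url)
```

A Find Document Manifests request would presumably have the same shape against a different resource type.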

MHD uses few FHIR Resources, chiefly the Document Reference and Document Manifest resources named in the transactions above.

The MHD Profile enables many deployment models:

As an API to XDS environment

This is what is mostly talked about, but this was just the master pattern. The functionality provided is a simplified API to a backbone that is fundamentally based on XDS. This simplified API is based on the HL7 FHIR RESTful API. It is therefore available in simple XML or JSON. The elements of the metadata are thus more accessible to a JavaScript application.
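
To give a feel for why JSON metadata is easy for a lightweight client, here is a minimal sketch that parses a hand-written, heavily trimmed stand-in for a Document Reference. The field names and values are assumptions for illustration and are not a complete or validated resource. It is shown in Python; the same few lines would apply in JavaScript.

```python
import json

# A hand-written, heavily trimmed stand-in for a DocumentReference resource.
# The field names and values are assumptions for illustration only.
sample = """
{
  "resourceType": "DocumentReference",
  "description": "Discharge summary",
  "mimeType": "text/plain"
}
"""

doc_ref = json.loads(sample)
summary = f'{doc_ref["resourceType"]}: {doc_ref["description"]} ({doc_ref["mimeType"]})'
print(summary)
```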

As an API to XCA environment

Just like with XDS, this is a more simplified API to a federated set of Document Sharing infrastructure. The interactions of the Document Source and Document Consumer MHD actors are just the same as with XDS. The implementation of the Document Recipient and Document Responder MHD actors might be more specialized.

As a standalone Document Sharing infrastructure

Similar to XDS or XCA, but without the need for XDS or XCA on the backend.

As an API to XDR environment

Either end of an XDR could be implemented

As a standalone PUSH environment 

Similar to XDR without XDR. Use your imagination that everywhere XDR might be used, the MHD Document Source to Document Recipient could be used.

As an API to the Direct Project HISP  

Either as PUSH based API, or including support for Query side interaction. The Direct Project is a secure email protocol for pushing documents from one place to another. There are value-add service providers that provide a hosted environment for this. They offer a few different APIs to their hosted service. Some are the secure email, some are based on XDR, some have their own HTTP REST API. These could be augmented through the addition of the MHD API as the front-end of the HISP. 

As an API to any document based system

The backend just needs to have a document concept

As simply a profiled FHIR service 

At the IHE Connectathon we showed that Document Sources and Document Consumers could just direct their API toward FHIR Servers and it would simply work.

Security and Privacy 

As with any Interoperability API dealing with Healthcare information, Security and Privacy are important. IHE doesn’t mandate a specific Security or Privacy model, as that would be Policy. But IHE does encourage the use of ATNA and IUA. This is also described on the FHIR site on the Security page.

For More information

  • User Identity and Authentication
  • Patient Privacy Controls
  • Access Control (including Consent Enforcement)
  • Audit Control
  • Secure Communications
  • Document Sharing Management (Health Information Exchange - HIE)
  • Patient Identity
  • mHealth
  • Meaningful Use
  • The Direct Project

    Searching for an ATNA Audit Record Repository

    I have received more than a few requests lately for a listing of the implementations (vendor or open-source) of the ATNA "Audit Record Repository". I was hoping that a simple internet search on "IHE Integration Statement ATNA Audit Record Repository" would be enough, but that seemed to fail, mostly digging up articles about ATNA rather than available implementations. This search should have worked, so either there are very few implementations of the IHE Audit Record Repository, or the ones that are out there are not publishing their IHE Integration Statement on the internet where it can be indexed.

    Most worrying is that there is a lack of standalone implementations, that is vendors that specialize in Audit Record Repository functionality. I know these exist, but they don't show up on the search.

    It is very important for the success of the IHE process that vendors publish their "IHE Integration Statement" on the internet. This is not just true for ATNA, but for all Profiles.

    I expect that my blog followers are likely to include these implementers, or at least people who know some. So I ask that if you have an implementation of the IHE ATNA Audit Record Repository, please post a comment on this article that points at your IHE Integration Statement for that product. I am fine with you explaining your implementation capabilities in a few sentences, but please don't make me regret this offer.

    If you know of an ATNA Audit Record Repository and don't see it show up, please let me know. Or better, let THEM know.

    Thursday, February 12, 2015

    Is it really possible to anonymize data?

    De-Identification is a Process, and one that can be done right or WRONG!

    The argument 'for' or 'against' use of de-identification is a Red Herring. What the arguments are actually about is the point that a Process can be done badly. There should be no doubt that any Process can be done badly. Even a simple process like filling a glass of water can be done badly, even resulting in human harm.

    The big misunderstanding is that De-Identification is an absolute. It is not, it is a Process used to lower 'risk' of re-identification. As a process it can be done badly. As a domain of 'risk' it can't achieve zero-risk, except to end up at the null-set. 
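
    To make "lowering risk" concrete, one common yardstick (not named in this post, so treat it as my assumption) is the size of the smallest group of records that share the same quasi-identifiers, the k in k-anonymity. The sketch below uses made-up records and a hypothetical generalization step to show the risk going down without reaching zero.

```python
from collections import Counter

# Made-up records: (birth year, postal prefix, sex). No real data.
records = [
    (1970, "532", "F"),
    (1970, "532", "F"),
    (1970, "532", "F"),
    (1985, "606", "M"),
    (1985, "606", "M"),
    (1978, "533", "F"),
]


def smallest_group(rows):
    """Return the size of the smallest group of identical rows (the 'k')."""
    return min(Counter(rows).values())


def generalize(row):
    """Coarsen a row: decade instead of year, drop the postal prefix."""
    year, _postal, sex = row
    return (year // 10 * 10, sex)


k_before = smallest_group(records)
k_after = smallest_group([generalize(r) for r in records])
print(k_before, k_after)
```

    In this toy data one record starts out unique; after generalization every record shares its values with at least one other. That is better, but a group of two is still far from zero risk, which is why the result may still need protection.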

    The standards in this space are clear about this risk factor. It is the absolutists, who insist on viewing de-identification as an absolute, that are causing the argument. This oversimplification is just as alarmist.

    As Yogi Berra is said to have said: "In theory, there is no difference between theory and practice. But, in practice, there is." The Practice of applying de-identification has occasional failures, like all 'risk' domains. No one hears about the times when de-identification is done successfully. All the failures are held up to the light and used to show that the solution fails.

    This doesn't mean I am an absolutist who holds that De-Identification is the solution. My perspective is that it is a "Tool". As with all tools and processes, it must be used properly.

    UPDATED
    -----------------
    It was pointed out to me, by the awesome Gila Pyke, that I failed to remind the reader that De-Identification is just ONE tool in a mature risk management process. As a risk management tool, and as stated above, the risk will not be brought to zero; as such the resulting data-set might still require protection. It is true that too often one presumes that a data-set that has been de-identified can be globally published. This is true only if that was the target of the risk management, and the risk of re-identification has truly been reduced to the level necessary for global publication. This is one of the misunderstandings that also results in the outlined failures. This is also a fundamental misunderstanding, and failing, of the HIPAA de-identification clause.

    De-Identification, Anonymization, Pseudonymization

    Friday, January 30, 2015

    MHD Connectathon Results

    IHE Connectathon did informal testing of Mobile access to Health Documents (aka #MHD - #XDS on #FHIR). We had 9 different implementations testing at the Connectathon: Qvera, Caradigm, Relay Health, CareEvolution, CRG Medical, Allscripts, EMC, ONC – Dragon, and an individual effort.
     
    MHD has a wide variety of deployment models, leveraging the simplicity of HTTP REST and the data model and interaction model defined in FHIR.  One use of MHD is as a front API to XDS, but the profile can also be used for point-to-point PUSH (similar to XDR), or access to a Community (like XCA), or just as a simplified API to an EHR or a Direct HISP.
    Among this august group they succeeded in cross-testing 25 different pathways. That might sound large or small depending on how you do the math, but I assure you this is just shy of all possible combinations of the Actors implemented.

    There was a much larger use of XML than JSON. Everything on the publication side was done using XML.

    Most gratifying for me was that we were able to point the Document Source and Document Consumer implementations at the publicly available FHIR servers, using Grahame’s server and Ewout’s server. This was a fantastic proof that the IHE profile is truly a profile of FHIR, and not a deviation or special case. IHE still has some work to make this formal, in the eyes of HL7, but we now have the joint workgroup to help make this happen.

    Overall the group identified a small number of open issues that will be fed into the Public Comment. Most of these are typos, most of them leftover from the first revision of MHD. We discussed how to resolve each of these. The biggest issue is how to handle Minimal Metadata vs Full Metadata; we will leave it as is for now, but after HL7 FHIR DSTU2 we will pursue creation of two formal profiles to identify the differences.

    The conclusion is that the MHD profile, once the issues are resolved, should move to Trial Implementation. Everyone understands that this is based on HL7 FHIR DSTU1; which means we will need to adjust again this summer when DSTU2 gets finalized.

    This conclusion was so definitive that the MHD testers have been invited to join the HIMSS Interoperability Showcase, as participants within the normal workflows. AND the next IHE Connectathon in Luxembourg will be testing MHD more formally.

    For more on mHealth

    Wednesday, January 28, 2015

    TLS (not SSL) Connectathon trials and tribulations

    IHE Connectathon this year had a major disruption caused by TLS/SSL. In hindsight this was very predictable, but those that were feeling the pain were not those that understood what was going on.

    The problem is that platforms (Operating Systems, Web Servers, Application Servers) have decided to forbid the use of SSL, as it contains too many vulnerabilities and has too many successful attacks against it. Successful attacks with scary names like "POODLE". This simple policy, forced by the platforms, has caused a large audience at IHE Connectathon to grind to a halt.

    The solution is to go back to the IHE-ATNA profile, which long ago mandated the use of TLS 1; so Connectathon should never have allowed SSL anyway.
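
    On most platforms, refusing SSL is a configuration choice rather than custom code. As a minimal sketch using Python's standard `ssl` module, the function below builds a client context that will not fall back to SSL. The TLS 1.2 floor is my assumption of a reasonable modern setting, not the "TLS 1" wording from ATNA.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client-side context that will not fall back to SSL."""
    ctx = ssl.create_default_context()
    # Assumption: TLS 1.2 as the floor, a modern choice rather than the
    # "TLS 1" wording from ATNA. SSLv3 is excluded either way.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name)
```

    The default context also verifies the server certificate and hostname, which is the other half of securing the connection.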

    The short term solution is to recognize this problem, and get back to testing healthcare specific things.

    The Following is the Notice written up last night:

    There have been problems with the NIST XDS tools regarding the use of TLS. This note documents the issues and describes the changes to testing.

    The ATNA profile requires the use of TLS v1 for protecting message exchange between systems. In the past we have seen TLS on-the-wire negotiations where one party requests TLS V1 and the other party requests SSL V3 and the connection is established at the SSL V3 level. This has been accepted in the past but it has been poorly understood. We now understand the following details.
    • In general this is controlled mostly by implementation platform and not vendor/tool code.
    • On Windows platforms, older versions allow the back off to SSL V3 and the newest absolutely requires TLS V1. I am not a Windows developer so I cannot point to specific versions based on my knowledge.
    • For Java based systems it is the JDK/JRE version that matters. Java 6 (1.6) will only negotiate to SSL V3. Java 7 (1.7) is able to negotiate TLS v1.
    So what is important when dealing with our tools is to understand what Java version is being used in which tool on the test floor.
    • The Internet Public Registry and the three copies here in Cleveland (RED/GREEN/BLUE) all are running on Java 6.
    • Toolkit for Connectathon is running on Java 6
    The immediate goal is to have vendors and monitors stop having to deal with this problem and get back to productive testing. So,
    Vendors and monitors: any test that cannot easily be validated via TLS (TLS selection in toolkit) should be validated without TLS. Do not spend any more time dealing with TLS issues unless it relates directly to the interaction between two vendor systems!

    This is not a statement of 'security is not important', it is a statement that we know how to secure the communications, so let's get back to the healthcare specific stuff. Much like my statement on FHIR and security.

    Sunday, January 18, 2015

    FHIR Security: Do (Not) Worry

    I have been asked quite often to explain how to secure FHIR. I have two different answers, depending on what you are trying to do. These two different answers are driving confusion so I need to explain them.
    1. Don’t worry about security. It is simply a use of the existing Security layers built into HTTP.
    2. Worry about Privacy and Security first thing in your design; bolting it on later will not work well and will cause issues.
    These two answers seem very confused, but let me explain the context of them. They should not be taken as absolutes, meaning sometimes you do need to worry about security and sometimes you don’t. They are actually very related. The basics of Security for FHIR are written up on the FHIR Specification.

    Don’t worry about security

    The first answer is the one I give to those in HL7 that are working on developing FHIR, or just getting their feet wet learning FHIR.

    Developing FHIR Resource definition:

    The committee members developing FHIR need to focus on developing well thought-out Resource Definitions. So I assure them that there is already security design built into the fundamental general-purpose standards that we are basing FHIR on. We have enough confidence in this that they can go ahead and design their Resources ignorant of how security and privacy will be enforced.

    Resources should be ‘just right’ sized. Not too big.

    There are exceptions for committee members that are developing FHIR. One exception that I am trying to get more recognition for is related to the size of Resources. Not really size, but the number of elements included in the Resource definition. The security models that we are thinking about for protecting FHIR work best when they can make a YES/NO decision on a whole Resource content. If they have to hide some of the elements in a Resource, then you end up with a far more complex solution. It can be done, but not easily. And when things get hard, they generally don’t get implemented or get implemented wrong. So I recommend that anytime a Resource is getting too big, it should be evaluated for whether it mixes sensitivities.

    Note that 'too big' is a general statement. For example, the "Family History" is really not big at all, but is full of things that are all likely to need different types of protection. I would rather see this broken up into a bunch of resources so that each family member in your family history is documented in an independent Resource, so that Privacy can be more easily enforced.
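
    A small sketch may help show why whole-Resource decisions are simpler. The `Resource` class, the sensitivity labels, and the `permitted` function below are all hypothetical names invented for illustration; they are not FHIR definitions or any real access control API.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """A stand-in for a FHIR Resource carrying one sensitivity label."""
    resource_id: str
    sensitivity: str  # hypothetical labels: "normal" or "restricted"


def permitted(resource: Resource, cleared_for: set) -> bool:
    """Whole-Resource YES/NO decision: no element-level redaction."""
    return resource.sensitivity in cleared_for


# One Resource per family member, so each can be decided independently.
family_history = [
    Resource("mother", "normal"),
    Resource("father", "restricted"),
    Resource("sibling", "normal"),
]

visible = [r.resource_id for r in family_history if permitted(r, {"normal"})]
print(visible)
```

    With one big Resource holding all three family members, the same policy would need to reach inside the Resource and hide individual elements, which is the harder case described above.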

    Prototypes and Learning

    Those just learning FHIR also need similar assurances. In fact I am comforted that this is the group that tends to question my statements about not worrying about security. These people get started using FHIR, using HTTP to an open test-server. No security at all. No HTTPS, no User Authentication, no Access Control, no evidence that Privacy Consents will be honored. I am glad that these people start to really worry, they should. But they also need assurance that it is okay to learn FHIR on a test server that has no security. They however soon become part of the camp that needs to hear the Second answer.

    What is disturbing is that these people clearly haven't looked for the FHIR Security Page.

    Privacy and Security by Design

    Those who are writing applications that are going to use FHIR really need a totally different mindset. They must focus first on how Privacy will be respected, and how Security will be protected. This is the mantra of the “Privacy by Design” efforts.

    Basics of FHIR Security

    The basics of FHIR Security are documented on the FHIR Security page. This page will get improved over time. It includes ‘motherhood and apple pie’ of HTTP RESTful security. It however doesn’t include much detail for the implementer. So, how does one execute on this “Privacy and Security by Design” directive? The first answer is to look at your programming toolkit and application platform documentation. It likely has instructions on how to turn on HTTPS, and forbid access to your server hosted Resources. Follow those instructions; there is nothing special for healthcare.

    User Authentication (AuthN)

    User Authentication is a bit more complex, but start again with your programming toolkit and application platform documentation. I would recommend that you look to supporting Federated Identity Management. This is an approach that treats the User Authentication step as a ‘Service’. That is, your application doesn’t get involved in the act of authenticating the user. Your application, through your web-server / application-server, forces a user to be authenticated by one of a sub-set of ‘trusted third party identity providers’. Yes, you need to manage the list of ‘identity providers’ that you trust. This management is within your web platform, but you likely need to expose the configuration to someone who is authorized as the administrator.

    In this way, your application will redirect the browser session or application session over to a web user interface hosted by one of the ‘trusted third party identity providers’. This is where the user would authenticate, using whatever technology that ‘identity provider’ supports. This means you are not involved in the decision on how the user gets authenticated, but this is why you want to be careful about the management of the list of ‘trusted third parties’; as you really need to be able to ‘trust’ them. The advantage for you is that you never need to know their password, nor even that a password is what is used. The ‘trusted third party identity provider’ can be using very fancy biometrics, smart-cards, anything…

    The result of this is that you receive back from this ‘trusted third party’ a ‘claim’. This claim is often encoded using SAML, if the identity provider is a business organization or government body. This claim is often encoded using “OAuth 2.0” and “OpenID Connect”, if the identity provider is a consumer focused Internet service provider (e.g. Facebook, Google, LinkedIn, Twitter, etc). Remember YOU get to determine who you put ‘trust’ in; so don’t worry that you will be forced to trust ‘Facebook’.

    With HTTP REST, the use of OAuth is going to be easier; so quickest path is through OAuth. Note that there are bridging services that can convert a SAML claim into an OAuth claim.
    • There were efforts in the past to profile this: IHE IUA and Bluebutton++.
    • There are efforts now starting to do a better job: HEART and such.
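
    The redirect to the identity provider described above can be sketched, in its OAuth 2.0 authorization-code form, as building a URL to send the user to. The query parameter names are the standard OAuth 2.0 ones; the endpoint, client id, redirect URI, and scope values are hypothetical placeholders, and a real deployment would normally rely on the platform's OAuth library rather than hand-built URLs.

```python
import secrets
from urllib.parse import urlencode


def authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Build an OAuth 2.0 authorization-code request URL and its state value."""
    state = secrets.token_urlsafe(16)  # random value to tie the response to this request
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state


# All values below are hypothetical placeholders.
url, state = authorization_url(
    "https://idp.example.org/authorize",
    "my-fhir-app",
    "https://app.example.org/callback",
    "openid profile",
)
print(url)
```

    The application keeps `state` and checks that the same value comes back on the callback before trusting the response.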

    User Authentication in an Application

    The above is really about the overall architecture of user authentication in a web environment, and how to apply this to a web-server. If you however are developing a web application that will use a FHIR Server as the backend, then you need to look into your programming platform and application platform for how to support “OAuth 2.0”; also look for “OpenID Connect”.
    In this case you will need to handle a special OAuth 'token', that indicates that the user has authorized your application (the Authorization part of OAuth).
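
    Handling that token usually comes down to sending it with every request to the FHIR Server. As a minimal sketch, assuming the token is used as a standard OAuth 2.0 bearer token, the function below prepares (but does not send) such a request. The server URL, the token value, and the plain JSON `Accept` header are placeholders of mine.

```python
from urllib.request import Request


def fhir_request(url: str, access_token: str) -> Request:
    """Prepare (but do not send) a FHIR read carrying an OAuth bearer token."""
    return Request(
        url,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )


# Hypothetical server URL and token, for illustration only.
req = fhir_request("https://fhir.example.org/base/DocumentReference/1", "example-token")
print(req.get_header("Authorization"))
```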

    User Authorization (AuthZ)

    This is a much more complex topic than I want to discuss in this article. The main problem is highlighted on the FHIR Security page. The solution is often a bunch of layers of Access Control decision and enforcement.

    There will be many more blog articles, updates to FHIR, and new profiles that appear this year.

    Conclusion

    So, the key is to understand if you are just learning FHIR, developing FHIR Resource definitions, or developing software that will be used on patient data. Sometimes you should be confident that security is not something you should worry about; sometimes it should be the first thing to worry about. In all cases, it is important to understand why you are not worried or deeply worried.

    Resources mHealth Security