Monday, August 29, 2016

Blockchain and Smart-Contracts applied to Evidence Notebook

There is a need where an individual or team must record chronological facts privately and, in the future, make these facts public in a way that lets the public prove their integrity and chronology. The time of each fact needs to be known to within some timeframe, typically within a day. The sequence of the facts needs to be provable. A missing recorded fact can be detected. An inserted fact can be detected. All facts can be verified as being whole and unchanged from the date recorded. All facts are attributable to an individual or team of authors.


These proofs are used to resolve disputes and prevent fraud, in areas such as intellectual property management, clinical research, or other places where knowing who and when in a retrospective way is important. Also known as: Lab Notebook, Lab Journal, Lab Book, Patent Notebook. Here is an image from the Laboratory Notebook of Alexander Graham Bell, 1876.

File:AGBell Notebook.jpg

Historically, tamper-evident notebooks provided assurance of data provenance with clear chronology. Sewn bindings and numbered pages were the foundation which the user annotated with name & date inscriptions in indelible ink. While not infallible, the notebooks were good enough for many important evidentiary functions.

Blockchain technology can bring this historical practice into the digital age. In particular, blockchain can be used to allow for work to be conducted in private yet be revealed, either by choice or circumstance, at a future date.

There are three variations on the use case:

  1. Bob is doing research that may eventually be presented publicly. When it is presented publicly there is a need to have historic evidence of all the steps and data used. This is today done with a tamper-evident notebook. The authors of these notebooks are also careful to include date/time as they progressively record their work. In this way an inspection of the notebook can determine that it is whole and unmodified, and thus establish trust in the contents, when they were recorded, and by whom.

  2. Prior to 2013, the US Patent and Trademark Office (USPTO) used First-To-Invent to determine priority. While the tamper-evident notebook was essential in that model, it is still valuable supporting evidence even after the switch to First-To-File. In particular, intellectual property disputes benefit from tamper-evident records.

  3. Funders of public research (e.g. NIH, NSF, DARPA) increasingly mandate the release of underlying data at a future date. There is also a trend on the part of regulatory bodies for full data access, especially in light of concerns over negative results from clinical trials not being reported.


The following are the various steps in the overall process.
  • As entries are added to an Evidence Notebook
    • The evidence is recorded in a private notebook, and an Author Signature is submitted to a purpose specific blockchain.
    • The Author may choose to also archive the evidence onto the blockchain.
    • Members of the community, as part of their support of that community, will counter-sign these Author Signature blocks
  • At some time in the future when the Evidence Notebook needs to be disclosed, the Author will declare to the community their identity
  • In support of a disclosure, any member of the community with access to the Evidence Notebook may validate the notebook.

Use-Case Keeping Records

Bob at some periodic point, or based on some procedural point, submits the new Evidence Notebook pages. This is done using a Digital Signature across the new evidence pages, creating an Author Signature. This Author Signature is then placed onto the Evidence Notebook Blockchain, signed by an identity in the control of Bob. This Author Signature does not expose the content of the evidence notebook, but can be used by someone, like Edna, who has access to the Evidence Notebook to prove that the pages submitted have not changed.
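A minimal sketch of how a submission could commit to new pages without exposing them. The text leaves the Author Signature format open, so this assumes a SHA-256 digest over each batch of new pages, chained to the previous digest so that a missing or inserted entry changes every later value. The names `page_digest` and `chain_entries` are hypothetical, and the step of signing each digest with Bob's blockchain identity is left out.

```python
import hashlib

GENESIS = "0" * 64  # assumed placeholder "previous digest" for the first entry


def page_digest(previous_digest: str, pages: bytes) -> str:
    """Digest that commits to the new pages and to the prior entry."""
    h = hashlib.sha256()
    h.update(bytes.fromhex(previous_digest))
    h.update(pages)
    return h.hexdigest()


def chain_entries(entries: list) -> list:
    """Return one digest per submission, each linked to the one before."""
    digests = []
    previous = GENESIS
    for pages in entries:
        previous = page_digest(previous, pages)
        digests.append(previous)
    return digests


# Two illustrative notebook submissions (made-up content).
notebook = [b"2016-08-01: mixed reagent A", b"2016-08-02: observed result B"]
digests = chain_entries(notebook)
```

Only the digests would go onto the blockchain. Someone like Edna, holding the pages, can rerun `chain_entries` and compare.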

  • ? Is there a need to define the Author Signature other than to say it is an XML-Signature format, with signature from the blockchain rather than from PKI?   Advantage the blockchain gives is the identities, algorithm choice, and public ledger.

Use-Case Escrow of Notebook

Bob can optionally put onto the blockchain the updated evidence notebook pages or any evidence (e.g. data) in encrypted form, with a smart-contract holding the key in escrow until one or more terms come true to release the content. The smart-contract can assure that the keys are appropriately disclosed upon trigger events such as time-period, inactivity by Bob, or other typical contract  terms. This escrow also preserves the content across the blockchain redundancy.
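The release terms mentioned above can be pictured with a small sketch. It is plain Python rather than code for any particular smart-contract platform, and the names `EscrowTerms` and `should_release` are hypothetical. It covers only two of the triggers named in the text: a fixed time period and inactivity by Bob.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class EscrowTerms:
    release_after: datetime    # fixed time-period trigger
    max_inactivity: timedelta  # trigger if Bob is silent for this long


def should_release(terms: EscrowTerms, last_activity: datetime, now: datetime) -> bool:
    """True when any one of the agreed terms has come true."""
    time_elapsed = now >= terms.release_after
    inactive = (now - last_activity) >= terms.max_inactivity
    return time_elapsed or inactive


# Illustrative terms: release in 2020, or after a year without activity.
terms = EscrowTerms(release_after=datetime(2020, 1, 1),
                    max_inactivity=timedelta(days=365))
```

A real contract would also need a trusted source for `now` and for Bob's last activity. On a blockchain that could be the block timestamp and Bob's most recent Author Signature, but that is an assumption, not something the text specifies.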

  • ? Should the encrypted notebook pages be also cross-signed by the community? The signature would be of the encrypted blob, which would be proof that the encrypted blob appeared on the blockchain at that time.

There is no way to confirm that Bob has placed complete evidence into this encrypted evidence package without also having access to the evidence. Thus there still is the risk that Bob has done an incomplete job of preserving evidence.

Support Use-Case Counter-Signature

Peers from the community will counter-sign these Author Signatures. This blockchain signature by peers simply indicates that the Author Signature block was observed on the Evidence Notebook Blockchain at the stated time. Through multiple counter-signatures by peers, trust in the veracity of the Author Signature is built.

Automated timestamp peers could also be used, that do nothing but apply a verifiable timestamp signature across any new Author Signatures. These are indistinguishable from Peers, except that Peer identities would also be submitting their own Author Signatures, expecting peer counter-signatures.

Peers are compelled to counter-sign as an act of community. By counter-signing Author Signatures, a peer identity gains peers of its own who will counter-sign any Author Signatures that identity might post. (You scratch my back, I’ll scratch yours). Thus, a new identity on the blockchain that has not yet counter-signed others’ Author Signatures would not find peers willing to sign that new identity’s Author Signatures.

Use-Case Public Knowledge

The system to this point does not require identities to be known. Neither Bob nor the Peer identities need be publicly known. They are simply identities in the Evidence Notebook Blockchain. An identity owner is free to explicitly make their identity known.

Bob needs to make public claims backed by an Evidence Notebook, proven through Author Signatures made by a specific blockchain identity or identities. That is, Bob needs to publicly prove that he is the holder of the private key associated with one or more identities, thus binding Bob’s identity to all historic uses of that identity.

Once Bob makes identities public knowledge, others can monitor new Author Signatures created by that identity. This may be seen as exposing activity, so might cause identities that have been made public to not be used for new Author Signatures. The public knowledge of an identity may be seen as beneficial, so the identity may be made public early.

Use-Case Verifying Records

Edna needs to confirm the content of an Evidence Notebook. Edna has been given access to the Evidence Notebook content. Edna knows the Evidence Notebook Blockchain Identity that is claiming to have made Author Signatures corroborating the specific pages from the Evidence Notebook. The Evidence Notebook may be in any electronic form, as long as the Digital Signature process is repeatable. This is often done with the XML-Signature mechanism.

Edna verifies Author Signatures of each submission (page). Edna verifies counter-signatures to gain assurances that the Author Signature has not been tampered with, and occurred during the time indicated.

Edna may choose to discount specific identities that have been determined to be fraudulent, or where the control of that identity private key has been compromised. Edna may choose to discount identities that have not yet made themselves public, holding public identities higher. Noting that the movement of an identity from anonymous to public has value to the community as a whole.
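Edna's checks can be sketched roughly as follows. This assumes the value recorded on the blockchain is a plain SHA-256 digest of the pages, and it stands in for real signature verification, which depends on the blockchain and signature format chosen. `CounterSignature`, `verify_entry`, and the `required` threshold are hypothetical names for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSignature:
    peer_id: str
    observed_at: int  # seconds since epoch, as stated by the peer


def verify_entry(pages: bytes, recorded_digest: str, counter_sigs: list,
                 discounted: set, required: int):
    """Return (ok, earliest trusted observation time) for one submission."""
    # 1. The pages Edna was given must match what was recorded.
    if hashlib.sha256(pages).hexdigest() != recorded_digest:
        return False, None
    # 2. Keep the earliest observation per peer, skipping discounted identities.
    earliest_by_peer = {}
    for sig in counter_sigs:
        if sig.peer_id in discounted:
            continue
        seen = earliest_by_peer.get(sig.peer_id)
        if seen is None or sig.observed_at < seen:
            earliest_by_peer[sig.peer_id] = sig.observed_at
    # 3. Require enough distinct trusted peers.
    if not earliest_by_peer or len(earliest_by_peer) < required:
        return False, None
    return True, min(earliest_by_peer.values())


# Illustrative data: one page, three counter-signatures, one discounted peer.
pages = b"page 1: assay results"
digest = hashlib.sha256(pages).hexdigest()
sigs = [CounterSignature("paul", 1472428800),
        CounterSignature("peer-2", 1472432400),
        CounterSignature("peer-x", 1472400000)]
result = verify_entry(pages, digest, sigs, discounted={"peer-x"}, required=2)
```

The returned time is the earliest moment a trusted peer reported seeing the entry, so the pages existed no later than that.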


(brought in whole list from here. Figured we should re-use actors if they fit.)

Role in the use case
  • Bob: The person or entity that submits Author Signatures. They are assumed to be an investigator or worker in a research team.
  • Edna: An authenticated and authorized individual that has been granted access to the Evidence Notebook. This may be a staff researcher for the Study Sponsor doing cross-study correlations, or an external researcher with a different study question that can be answered with previously collected data.
  • Paul: A peer on the blockchain. The identity may be known or not known.
  • Generic bad actor
  • Research Sponsor: The organisation that receives research data. These individuals or systems need access to the evidence. They may receive this evidence directly, or through the Escrow Evidence. For the purpose of diagrams and data flows, any member of the study team will be represented as "Dan".
  • Research Team: The individuals and systems who are performing some research or other activity for which an Evidence Notebook is necessary. Bob is a member of the research team. For the purpose of diagrams and data flows, any member of the research team will be represented as "Bob".
  • Peers: The individuals and systems who counter-sign Author Signatures to help provide veracity. It is expected that peers will not be part of the same research team as Bob.

Prerequisites / Assumptions

  • Bob needs to keep the research confidential until some future time.
  • The format of the notebook need not be constrained, as long as digital signature can be validated once the notebook is made public.
    • Presume use of XML-Signature schema can mediate this
    • If Evidence data is disclosed it must be properly handled or de-identified
  • There is no need to publish the content of the notebook on the blockchain.
    • There is an option for encrypted notebook on the blockchain, and use of smart-contracts to unlock as appropriate
  • Bob may have many notebooks, or may have many research projects interleaved within one notebook. This is similar to paper notebooks today.
  • Bob may need to hide his current activities, meaning new activity can’t be associated with Bob.

Use Case Diagrams

Use Case steps

  1. New Author Signature
    1. Bob updates his evidence notebook
    2. Bob submits an Author Signature block to the blockchain
    3. Bob optionally submits Evidence blobs to the blockchain
    4. Paul notices a new Author Signature block
    5. Paul counter-signs the Author Signature block
  2. Evidence Notebook validation
    1. Edna is asked to confirm an Evidence Notebook
    2. Edna is given access to the Evidence Notebook (may not be public disclosure)
    3. Edna validates signatures from the blockchain
    4. Edna validates counter-signatures from the blockchain
    5. Edna extracts timestamps from set of signatures
    6. Edna may validate Public Signatures as necessary
  3. Evidence disclosed
    1. Smart-Contract triggers
    2. Smart-Contract may include notification mechanisms to Dan
    3. Dan receives Evidence and decryption keys given trigger on Smart-Contract

Sequence Diagrams

(drafting, not yet done)

End State

The use case ends when Bob stops submitting Author Signatures under a given identity. There is no expectation that identities must be publicly unknown, or can’t be used once publicly known.


  • Author Signatures are validated
  • Modified Author Signatures are detected as not valid
  • Participation sufficient to achieve (n) counter-signatures
  • Funding by organizations relying on output (research, clinical trials, etc)


  • Participants collusion to revise history
  • Is insufficient number of peers, and therefore insufficient number of prompt counter-signatures, a distinct failure mode?


Champion / Stakeholder

John Moehrke (self)
Scott Bolte (Niss Consulting)

Related Material

Common Accord: CommonAccord is an initiative to create global codes of legal transacting by codifying and automating legal documents, including contracts, permits, organisational documents, and consents. We anticipate that there will be codes for each jurisdiction, in each language. For international dealings and coordination, there will be at least one "global" code. Center for Collaborative Law

IP Handbook - “Inventors and Inventions” - Chapter 8: “How to Start - and Keep - a Laboratory Notebook: Policy and Practical Guidelines”

MIT - Instructions for Using Your Laboratory Notebook May, 2007

NIH - “Keeping a Lab Notebook” - Presentation by Philip Ryan

FDA - Pharmaceutical Quality Control Labs

Cornell - LabArchives - an electronic lab notebook

Howard Kanare - Writing the Laboratory Notebook, American Chemical Society Publications, 1985, ISBN 978-0841209336

Astroblocks - Lab Journal on Blockchain, experimental use of bitcoin chain, April 2015

Saturday, August 27, 2016

Privacy Constraints in Controlling Big-Data Feeding Frenzy

This article covers the constraints often placed on an approved use of healthcare data. These are the conditions, restrictions, obligations, or handling caveats.

When a Patient allows use of their data, there are almost always restrictions. Some restrictions are supported in access control rules, which I have already covered in Vectors through Consent to Control Big-Data Feeding frenzy. I am not going to re-describe "Vectors". The Vectors are used in rules to determine if an access is allowed or denied.

Some of those Vectors are similar to constraints, such as the discussion about "Treatment", "Payment", or "Operations" that I covered in Consent Basis in Controlling Big-Data Feeding frenzy. An important message from that specific example is "Purpose Of Use". This is both a "Vector" and a "Constraint". That is, a rule can be based upon a user request in which the user asserts that they will only use the data for a specific Purpose Of Use (e.g. "Treatment"). In this case the "Purpose Of Use" is satisfied at the "Vector" stage.

Purpose Of Use can also be a "Constraint", in that data might get released with a constraint attached indicating a set of Purpose Of Use that are allowable.  This might be done when the user indicates too many Purpose Of Use, where the Privacy Consent only allows a subset. Or it might be used when the User request context didn't have a clear Purpose Of Use declared.
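The two cases in the paragraph above can be sketched as a set intersection. This is only an illustration; the function name `allowed_purposes` is hypothetical, and the purpose strings are the plain labels used in this article rather than codes from any standard vocabulary.

```python
def allowed_purposes(requested: set, consented: set) -> set:
    """Purposes to attach as a constraint on release; empty means deny."""
    if not requested:
        # No clear Purpose Of Use declared: constrain to what was consented.
        return set(consented)
    # Too many purposes requested: release only the consented subset.
    return requested & consented


consent = {"Treatment", "Payment"}
constraint = allowed_purposes({"Treatment", "Marketing"}, consent)
```

Here the user asked for "Treatment" and "Marketing", the consent allows "Treatment" and "Payment", so the data would go out constrained to "Treatment" only.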

A Constraint is called an Obligation in some technology; in other technology it is just part of an Authorization Decision. What I am focused on here is some constraint that goes along with the data that will further restrict use or cause specific action.

Some Constraints are not explicitly said in the technology layer, but are part of the Policy that enabled communication. Such is the case with a "Data Use" agreement. Here is where Purpose Of Use is seen again: often a communications "Data Use" agreement authorizes only specific kinds of uses. Some Health Information Exchanges have a restriction on "Treatment". Some Health Insurance Exchanges have a restriction on "Payment". Some Research networks have a restriction on "Research".

Up to now I have mostly talked about Purpose Of Use; which is relatively easy to understand and enforce. The following are more specific constraints. These too might be in the Data Use policy, might be represented in Vectors, or might be communicated with the data.

The following are some of the ideas in the space of constraints. Many of these have specific Obligation or PurposeOfUse codes.

  • Purpose Of Use
    • Treatment
    • EmergencyTreatment
    • Payment
    • Operations
    • Research
    • PublicHealth
    • Marketing
    • Donation
  • Access
    • no access beyond the given user
  • Persistence
    • do not persist -- delete after use
    • do not print
    • persist only in encrypted form
  • De-Identification
    • declassify
    • mask
    • redact
    • minor
  • Auditing
    • audit trail
    • notification of subject on use
  • Future Consent
    • re-use requires new consent
    • restrict to specific users


It is very unusual for a Privacy Consent to allow access without Constraints. Most of the time the constraint is built into the Policy that enabled communications, so it doesn't need to be said in the communication. Much of the time the constraint can be handled as part of the Access Control decision. Sometimes the constraint needs to be communicated along with the data, often referred to as an Obligation. This is only done when there are assurances that the residual constraint, the Obligation, will be enforced.
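A minimal sketch of that last point: data leaves with residual Obligations only when the recipient is known to enforce them. The `Decision` class, `may_release` function, and the obligation labels are hypothetical, taken from the plain-language list above rather than from any standard code system.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    permit: bool
    obligations: list = field(default_factory=list)  # residual constraints


def may_release(decision: Decision, recipient_enforces: set) -> bool:
    """Release only if the recipient can enforce every residual obligation."""
    if not decision.permit:
        return False
    return all(ob in recipient_enforces for ob in decision.obligations)


# Illustrative decision carrying two residual constraints.
decision = Decision(permit=True, obligations=["do-not-print", "delete-after-use"])
```

How the sender learns what a recipient enforces is out of scope here. In practice that assurance usually comes from the Policy that enabled the communication.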

Other articles on Privacy Controls and Privacy Enforcement

Friday, August 26, 2016

Consent Basis in Controlling Big-Data Feeding frenzy

In the last article I wrote about all the Vectors through the healthcare data access control space that are commonly needed by Patient Privacy Consent Authorizations. In this article I will describe the residual policy rules and Obligations.

When a Patient says YES to authorize access to their data, they are saying it within some context. This authorization comes with metaphoric strings.

Overall Policy context

A Consent Policy is a multi-layered thing. Let me illustrate this by looking at the simplest and most common Privacy Consent in healthcare:

  • The Patient says YES to authorize use of their data for Treatment, Payment, and normal hospital Operations.

One might think that this is a very simple Consent. Simply "YES". Others might notice that there are some restrictions to "Treatment/Payment/Operations". Both are very important attributes of the consent, and would be seen clearly in the consent. 

The Consent that would be on file will likely just say these simple truths. You have all seen a Consent form; they are not all-encompassing.

What is implied is
  • This consent is only for the one organization. Likely implied by the author of the consent.
  • This consent has a start date, of today. 
  • This consent names the patient
  • This consent names the purpose-of-use of Treatment, Payment, and normal hospital Operations
What is unclear is
  • This consent doesn't appear to have an end date. 
    • So we need to look into the Organization's policies to see what their data retention policy is. Do they retain beyond receiving payment for services? Do they retain until death? Can I ask that they discard?
    • What control is there if the Organization is merged with another organization? Or goes out of business?
  • This consent relies on an agreed definition of "Treatment"
    • Does treatment mean all at the Organization can access the data regardless of treatment relationship?
    • Is there a formal treatment relationship system at this Organization?
    • Who is allowed to declare they are treating?
    • What actions are considered treatment, vs payment, vs operations?
    • One can imagine Treatment is restricted to licensed clinicians; but who is checking that?
    • Are any third parties used for any Treatment actions?
    • Are dietitians involved as part of Treatment, or Operations?
  • This consent relies on agreement of definition of "Payment"
    • Can I pay with cash and thus not expose this episode to any insurance?
    • Who are the people involved in Payment?
    • Are these accesses part of the access report?
    • Are third parties used for any of these Payment activities?
    • is involved?
  • This consent relies on agreement of definition of "Operations"
    • What is operations?
    • Who is authorized to do operations?
    • Who authorizes those that are authorized?
    • Are these operations actions also included in an audit?
    • Does this include government reporting?
    • Is there any way I can control what operations is?
    • Are third parties used for any of these Operations?
  • This consent doesn't say anything about things that are not mentioned. Does this mean that these other things are forbidden?
    • Often there is a statement hidden somewhere that indicates that there are times when Marketing may happen. Often this is considered part of normal Operations.
    • Often the organization is under government mandate to participate in quality reporting, immunization reporting, drug-abuse reporting, physical-abuse reporting, etc.
    • Often the organization is required to assist with law enforcement. Does this require a court order? 
    • Often the organization has a clinical-research function. Are the data used in clinical research? Are the data de-identified? If de-identified, what assurances that the de-identification is sufficient? What remedy is available if the de-identification is not sufficient?
    • Are third parties used for these unsaid things?
Further away and 
  • How do I get an accounting of access?
  • How do I dispute that someone got access that should not have?
  • How do I request a correction?
  • If I terminate the Consent, then what is still allowed to be done with my data?
  • What remedy is available?
Within HIPAA there is a requirement that the Notice of Privacy Practices be posted. Although HIPAA is a very minimalist regulation and specific to the USA, similar practices are found elsewhere in the world. Some of the above questions might be answered by that document. However, I am sure some of the above is not stated in that document.

An important point is that the details needed are not found in any Regulation, they are specific to the Organization. The Organization must look at regulations and their goals and come up with their specific Policy. This concept of Layers of Policy was first introduced in my Healthcare Privacy & Security Bloginar, based on the IHE presentation.

This preparation is also the first step in my discussion on the overall Consent Process. Shown in this infographic.

So a Consent record will indicate who the Patient is, what the start date is, what organization it is with, etc... The Consent record needs to also be very clear what rules apply at that organization. This is what I am referring to as Base Policy. As in the basis of the Consent. That which this specific Patient specific Consent is built upon.
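As a rough illustration, such a Consent record could be modeled like this. It is a sketch only, not the HL7, IHE, or HEART representation; the class name, field names, and the `base_policy` identifier value are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    patient: str
    organization: str
    start: date
    purposes: frozenset
    base_policy: str            # identifier of the organization's Base Policy
    end: Optional[date] = None  # often unstated on the form

    def in_effect(self, on: date) -> bool:
        """True if the consent covers the given day."""
        if on < self.start:
            return False
        return self.end is None or on <= self.end


# Illustrative record with made-up identifiers.
consent = ConsentRecord(
    patient="patient-123",
    organization="St. Michael hospital",
    start=date(2016, 8, 26),
    purposes=frozenset({"Treatment", "Payment", "Operations"}),
    base_policy="st-michael-base-policy-v1",
)
```

The record itself stays small. Everything listed above as unclear lives behind the `base_policy` reference.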

That Base Policy is defined to be a set of definitions and rules intended to meet some Goals and Regulations. Shown in blue in the following figure. That Base Policy informs and controls a bunch of IT Systems including a User Directory, Patient Directory, Role assignment, etc. That Base Policy fulfills a set of regulations. So the Base Policy is fulfilling the Organization's responsibility to Regulations (like HIPAA), and to the Goals of the Organization.


The Base Policy of a Consent is just as important as the Consent. The Base Policy is not the regulations; regulations are the basis of the Base Policy. The Base Policy includes a huge number of rules and commitments that are specific to that organization. The Consent is the proverbial tip-of-the-iceberg.

Other articles on Privacy Controls and Privacy Enforcement

Wednesday, August 24, 2016

Vectors through Consent to Control Big-Data Feeding frenzy

This is part of a series of articles on the various Privacy Consent mechanisms that are being developed in HL7, IHE, and HEART. This article will detail the various vectors that Patients desire to control. This discussion will not be on any of the specific solutions, but rather the overall requirement.

For some background, please see my prior article Controlling Big-Data feeding frenzy with Privacy Consent Authorization

First step is to recognize that Privacy Consent must enable the Patient to define Rules and Obligations. This is abstractly represented by their Policy -- My Policy -- which follows their Data.

Thus when someone or something tries to access their Data, there is an authorization (AuthZ) check done. This authorization check assures that the Patient would be happy allowing their data to be used in the way that the someone or something is going to use their data. I am speaking abstractly, so no specific authentication, context, method, obligations, etc. Just that the Patient would be happy, not upset.

I have taken from my other article, the five elements. 
  1. The Patient -- Smiling
  2. The Patient's policy - "My Policy"
  3. The Patient's data - "My Data"
  4. The someone or something that wants to gain access
  5. The Authorization decision that is based on the request, and the patient's policy

The solutions being created by HL7, IHE, and HEART are specifying these things in different ways. The differences are important, although the goals are the same.

In all models they look to address the Privacy control. They all are careful to make it clear that businesses holding data must and can still control their data according to their rules. So this is not a replacement or hindrance of Role-Based-Access-Control (RBAC), or other mechanisms that manage workflow centric authorization.

My Policy -- rules

The contents of "My Policy" are potentially very complex. I have covered the space back in 2011 in "Access Controls: Policies --> Attributes --> Implementation", an article still valid. The policy might be very simple, or might be very complex. The solutions might be able to manage the simple, and only some of the complex. 

The most simple is


That is right, the simplest is when the patient has either explicitly said "no", or has not consented through inaction in an explicit-consent environment.

My Policy - unknown

Right away I had to add a different situation. That is, what is the Policy when the Patient has not yet expressed their choice or agreement or 'consent'? The harder part about this is that the policy that is in place when the Patient has not yet expressed consent is often driven by Law/Regulation, or Medical-Ethical standards.

There is a concept of "Implied Consent" that is related, but not exactly the same. Implied Consent is the policy that is in place when the Patient has taken an action to engage, but has not yet expressed an explicit consent. That is, through their action of going to a Doctor, they are implicitly consenting to some default form of "YES". This default form of "YES" might not be obvious. HIPAA requires that the Doctor post a Notice of Privacy Practices.

Medical-Ethics is the moniker where a treating clinician, under their Hippocratic oath, can determine that it is better for your safety, or the safety of others, that you be treated regardless of your explicit denial of authorization, or absence of authorization. Most medical-ethics will keep this 'invasion of privacy' to a minimum. There may be a technical override used, often called Break-Glass.

A patient that doesn't even want break-glass used must be really insistent. Most "NO!" indications have many exceptions. Not just medical-ethics, but required government reporting, required medical records retention, etc. Many of these simply can't be wished away. This is why the "My Policy" space in all models expects to control the data only to the level of control given to the Patient, which sadly is sometimes very little, otherwise known today as "Data Blocking".

My Policy - Vectors

I am going to use the word "Vectors" because each of the following is an independent attribute or control. Most of them can be combined in various ways. This is best modeled mathematically using "Vectors".

The category of things to control:
  • User -- the context of the request for data
    • User Identity -- who is this specific user (e.g. userId, Provider-ID, nationally issued ID)
    • User Relationship -- what is the relationship between the user and the patient (e.g. care-team, mother, son, guardian, lawyer, law-enforcement)
    • User Role -- what is this user functionally doing that requires access to the data
    • User Organization -- what organization is the user working within
    • User Purpose -- what is the user going to do with the data (e.g. Treatment, Payment, Public-Health, Research, Disclose)
    • User Location -- where is the user, and thus where might the data go
    • User timeframe -- when is the user access happening
  • Application -- what is going to process the data
    • Application Identity -- what application will get access to the data
    • Application Security -- what is the security of the application
    • Application timeframe -- how and for how long will the application persist the data
    • Application promise -- what assurances can the application give on how it will treat the data
  • Resource -- the data
    • Data Identity -- unique identifier of the data
    • Folder Identity -- the folder this data sits within
    • When was the data created
    • When was the data last updated
    • Who authored the data
    • Who verified the data
    • Where was the data authored
    • Availability, has the data been replaced or refuted
    • What kind of treating facility authored the data
    • What kind of care practice setting authored the data
    • Predecessor data that was used in the authoring of this data (e.g. Order)
    • Successor data that was created based on this data (e.g. Discharge Summary)
    • Relationships to other data (e.g. folder identifier)
    • Type of data object
    • Type of clinical content implied by the data (e.g. Pregnant, Cancer, Addict)
  • etc
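One way to picture how independent vectors combine is to treat a policy statement as a set of allowed values per vector and a request as one value per vector. This is a sketch under that assumption, not how HL7, IHE, or HEART encode policy; the function name and the vector keys such as `user_role` are hypothetical.

```python
def statement_matches(statement: dict, request: dict) -> bool:
    """A statement applies when every vector it names matches the request.

    Vectors the statement does not mention are left unconstrained.
    """
    return all(request.get(vector) in allowed
               for vector, allowed in statement.items())


# A statement combining three vectors: User Role, User Purpose, type of data.
statement = {
    "user_role": {"care-team"},
    "user_purpose": {"Treatment"},
    "resource_type": {"Discharge Summary"},
}

# An illustrative request carrying one value per vector.
request = {
    "user_identity": "dr-bob",
    "user_role": "care-team",
    "user_purpose": "Treatment",
    "resource_type": "Discharge Summary",
}
```

With these values the statement matches the request. Changing any one constrained vector in the request, such as the purpose, makes it stop matching.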

My Policy Examples

As I indicated, the above set of Vectors are cross-cutting, and most policy statements will be made up of many vectors, and any "My policy" is made up of many policy statements.

Here are a few reasonable examples:
  1. Authorize the release of the documents authored by St. Michael hospital to a named rehabilitation facility for the purpose of recovery.
  2. Authorize the release of the documents authored by St. Michael hospital, related to a specific surgery episode, to a post-surgery care-team.
  3. Authorize a treatment facility to gather all historic medical records from four other treating organizations.
  4. Authorize Dr Bob to have access to all records at St. Michael hospital except for those documents created during the fall of 1998.  
My goal is not to show examples for everything. I simply want to show how what seems clear in English text might be very hard to encode in a set of rules, and possibly very hard to enforce. Consider this last example:

  5. Authorize my Parents access to all records except those related to drug abuse.

Especially hard since various tests might be given for drug abuse, but those same tests have many non-drug-abuse purposes. Many medications might be given to alleviate the effects of drug-abuse, but are also used for other conditions. Many locations (e.g. Betty Ford Clinic) are clear indications of a drug-abuse case, but not all locations are so obvious.

Also note that it is not always obvious who "Parents" are. Especially when a Parent might be a Clinician who might have other User accounts to use. 
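By contrast, example 4 is one of the easier ones to encode, because it depends only on who the user is, where the record is held, and when it was created. A minimal sketch follows; the date boundaries for "fall of 1998" are an assumption (September through November), and the function name and identifiers are hypothetical.

```python
from datetime import date

# Assumed boundaries for "the fall of 1998"; the English text does not say.
FALL_1998 = (date(1998, 9, 1), date(1998, 11, 30))


def dr_bob_may_access(user_id: str, custodian: str, created: date) -> bool:
    """Permit Dr Bob at St. Michael hospital, except documents from fall 1998."""
    if user_id != "dr-bob" or custodian != "St. Michael hospital":
        return False
    start, end = FALL_1998
    return not (start <= created <= end)
```

Even this simple rule forced a guess about what "fall" means. Example 5 has no equivalent shortcut, since "related to drug abuse" is not a single attribute of the data.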


What I have expressed is the Vectors that Consent Policies (authorization policies) potentially need to be able to encode in a computable way. That is, in a way that an Access Control decision can be made that would make the Patient happy, not upset. This set of vectors is the set that I recall from the various use cases. I am happy to hear about others, and will gladly update this article.

HL7, IHE, and HEART are each building parts of the system to do this. The differences are important, although the goals are the same: enable the patient to be happy, not upset.

Other articles on Privacy Controls