Friday, July 14, 2017

E-mail addresses -- Remedial and realistic

The difference between reality and theory for an e-mail address is huge. I want to drive a bit of open discussion as implementation of e-mail software, directories, and operational environments; fall somewhere along that gap. That is to say that there is much software and services that support e-mail that don't quite implement everything that the standards allow. There are also really good reasons to not support everything that the standards allow

e-mail address according to standards

I am not going to replicate the full definition of what can be in an email address from a standards perspective as there are fantastic write-up that is very readable on the wikipedia, and a nice graphic available from Jochen Topf . I will just pull some examples of just how ugly an email address can be. I am sure you all didn't know these are possible:
  • ""
  • "very.(),:;<>[]\".VERY.\"very@\\ \"very\".unusual"
  • #!$%&'*+-/=?^_`{}|
  • "()<>[]:,;@\\\"!#$%&'-/=?^_`{}| ~.a"
  • user@[IPv6:2001:DB8::1]
Internationalization examples
  • Latin alphabet (with diacritics): Pelé
  • Greek alphabet: δοκιμή@παράδειγμα.δοκιμή
  • Traditional Chinese characters: 我買@屋企.香港
  • Japanese characters: 甲斐@黒川.日本
  • Cyrillic characters: чебурашка@ящик-с-апельсинами.рф
  • Hindi email address: संपर्क@डाटामेल.भारत

Little bobby drop tables is possible

Direct impact

I have been working with developers on an implementation of "Direct". For those that don't know about "Direct", it is simply a profile of e-mail for use within USA healthcare. I was one of those that was involved in "The Direct Project", and wrote the security section and risk assessments. I am sorry that FHIR didn't exist at that time, as it would have been quickly selected over this e-mail solution. The e-mail solution is sub-optimal at best. 

The diagram shows the various parts. It is just the sending side, but it shows various parts that all will be impacted by the e-mail address. Most can just process the e-mail address as a string. But some need to do more with it than that.

Direct uses secure email (S/MIME), uses X.500 certificates to protect endpoints authentication, authorization, and confidentiality of communication. It has two mechanisms for discovering a certificate given an email address. And there are trust organizations, like, that provide governance and certificate services.

Hidden Practice - Remedial e-mail address

Most would not fully support all the capability of an e-mail address as defined in the standard. But most would also not tell others of their self-imposed restrictions. These restrictions might be because of their technology, but might be because of some organizational policy. Such as email systems that use a file-system architecture would limit the email address to what can be represented 'safely' in an file-system directory. 

Most likely today email address are restricted to alpha, number, period, underscore, and hyphen. 

Thus not allowed are ampersand, asterisk, plus, slash, equal, question, carrot, curly-brackets, or tilde. This restriction is not too worrisome. 


The Direct Project has not endorsed RFC 6530, which extended e-mail addressing to International Characters. So, they don't need to worry (or support) the international characters

Some worry about the International characters because of the fact that there are some characters in that set that 'look' exactly like ASCII characters, thus easy to fool a human. This is less of a concern with Direct as all addresses are looked-up to find their Certificate, and that Certificate must chain to a Trusted Certificate Authority. Thus this attack would not work unless a Certificate Authority has accepted a deviant email address and issued a Certificate. And if a CA did this, then that CA should not be trusted. So within Direct there is protection against this attack.

Directory vs e-mail address

Also, we are speaking specifically about the technical part of an email address, not what gets displayed to a user. It is this 'displayed to a user' that is a important separation. What gets displayed to the user might be far more relaxed, especially if a Directory is available to fully represent in full feature, the name the individual wants to be called.

In fact this is where I get specifically worried that some are demanding deviation from reasonable specification because of user-experience expectations. Specifically, most users expect that email addresses are NOT specific to the case of the characters typed, but technically at the e-mail protocol they are allowed to be case-sensitive. Thus the protocols are all defined as "case preserving" while allowing a server side determination if the server-side wants to treat e-mail addresses as case-sensitive or not.  

More specifically it is easy to give a user the experience of case insensitivity, while being case specific at the technical level. One can look through a Directory in a case-insensitive way, when only one entry is found then use that entry, but use the case of the e-mail address one found in that Directory (or certificate) at the protocol level. This is case-preserving, and thus does not require a deviation from the e-mail standards.

First step beyond remedial

The first challenge to remedial address is the need to include a single-quote such as is needed for "Fred O’Donnell".  The single-quote character is not often supported by email technology, as it can cause encoding issues in various technologies like file-systems, directories (LDAP), databases, and APIs. These are not impossibilities, but where a single-quote exists, it must be handled special. Where as all the other characters in the remedial list require no special handling.

This single-quote problem is what brought me to this whole topic. As someone with a single-quote in their email couldn't use Direct. There was a bug that was fixed. But as discussed above if all partners within a Direct trust domain don't also support single-quote equally, than this individual will only be able to send-to or receive-from those that do support single-quote. So is fixing this bug really helpful? Is the single-quote needed in the e-mail address, or just what gets displayed (Directory)?

Controlled Advancement

So I have addressed the DirectTrust community with this topic. From what I could tell, this issue is as big as I predict. Remedial email addresses are okay, but beyond that and major 'interoperability' issues would happen. Inducing O'Donnell, single-quote. 

I did hear some interest in adding the plus character. Plus is a special case, not just a character. Especially in light of the way that Direct supports individual addresses vs domain addresses.

I would very much recommend DirectTrust come up with a policy. They are an operational environment, an as such can make operational decisions that can't be made in an Interoperability specification like Direct. Thus any operational policy that DirectTrust comes up with does not need to be represented in the Direct specification. This policy might not be a ‘forever forbidden’ policy, but rather a ‘not allowed at this time’. This keeps open to future needs that are use-case and demand driven. This policy should be very clear about the fact that International characters are not required, and thus DirectTrust does not allow them.

I would recommend against the really special processing such as comments (), and quoted strings. These serve very little value, and are a place where trouble can hide.


It is likely that remedial e-mail address is sufficient, so this might actually not be a big issue. But it does require a Policy so that everyone can appropriately TEST to assure Interoperability.

Wednesday, July 5, 2017

Beyond Basic Privacy – The IHE APPC Profile

New IHE profile allows patients more flexibility in expressing their privacy preferences. 

This is a re-publication of an article that Tarik Idris from InterComponentWare and I co-authored. Originally posted on the ICW blog at 20.04.2017
The average size of electronic medical records grows each year. Some of the drivers of this growth are the increased utilization of EMR systems, scanning of paper records, and improved access to health information exchanges. As a consequence, patients need more sophisticated tools to adequately express their privacy preferences. When your medical record contains only a few x-rays and is only ever shared between your family doctor and your radiologist, a simple “yes” or “no” to data sharing may be sufficient to express your privacy preferences. But if your record contains family and social history, photos of skin conditions, STD panels, genetic information, and psychiatric evaluations and different parts of that record need to be accessed by a small army of physicians, surgeons, therapists and dental hygienists, you might need a more detailed method of defining your privacy preferences than “yes” or “no”.

Privacy preferences are commonly documented in a patient privacy consent document. The IHE profile BPPC (“Basic Patient Privacy Consents”) defined a common format for these consent documents. It is widely used, especially in projects sharing medical documents between different healthcare enterprises. BPPC was designed to cover cases where the patient has a simple choice between a handful of possible privacy policies. For example, the patient might choose between allowing all data sharing, only allowing sharing of summaries or only allowing sharing in case of emergency. The BPPC profile doesn’t determine what the choices are, it only requires that each choice that a patient is given has a unique identifier (“Privacy Policy Identifier”) and that there is some kind of access control system that knows what to do for each of those identifiers. The consent document only references the privacy policies (they are not expressed in a machine readable format as part of the consent document) and are just assumed to be understood by the recipient. BPPC was not designed for expressing fine-grained access rules, e.g. that a specific lab panel might only be viewed by a specific healthcare provider.

In light of this growing need for more sophisticated consent documents, the IHE IT Infrastructure Domain decided to define a new profile, IHE APPC (Advanced Patient Privacy Consents), that compliments BPPC by addressing additional use cases. Development of this profile started in late 2015 and was supported by an international group of stakeholders. The profile was published as a new “Supplement for Trial Implementation” in August 2016.

APPC’s focus is to enable automatic enforcement of consent documents. If a patient’s consent document states that facility X may not access his longitudinal record in an HIE, then the HIE should automatically deny access requests for this patient’s records coming from facility X. To enable this automatic enforcement, APPC includes a detailed, machine-readable, structured representation of the privacy policy. Whereas BPPC only included a reference to the privacy policy, APPC uses OASIS XACML (eXtensible Access Control Markup Language) to fully spell out the access control rules implied by the privacy policy. XACML is an XML-based domain specific language to unambiguously define access rules. This allows systems to implement an enforcement mechanism for these privacy policies by using one of several commercial or open source rules engines that can interpret XACML access rules.

We hope to empower both healthcare providers AND patients with the IHE APPC profile. Without a flexible language for defining access control rules for each project, vendors will force the same one-size-fits-all access control approach onto all healthcare providers, regardless of their specific needs. Using the IHE APPC profile, healthcare providers will be able to define an access control approach that fits their processes and their patient collective. E.g. pediatric oncology patients need their data available for regular follow-ups for a long time, whereas a surgery ER patient’s data could potentially be more ephemeral.

Patients will benefit by having a less monolithic approach to privacy preferences. While it is unrealistic that there will be completely different rules for each patient, there is a finite list of common customizations that can easily be implemented in an enforcement system based on IHE APPC. A common type of patient-specific customization is to blacklist specific providers (colleagues, relatives, former lovers, …) to deny them access to your patient record. Another common customization is hiding specific documents (e.g. drug testing results, psychiatric evaluations).

“The advantage for patients is the greater consideration of their individual needs.”

Of course in the grand scheme of things, technology, standards and products are just minor elements in arriving at a patient-friendly and efficient privacy scheme. Privacy laws and regulations, healthcare provider’s competitive landscape and association politics, liability laws, etc. usually have a greater impact on what the actual privacy choices for patients are. But when it is time to establish an agreed upon set of privacy policies in real world IT systems, it is important to have specifications like IHE APPC ready to enable an easy and cost-efficient implementation.